cyanea-chem

v0.2.0 I/O

Cheminformatics in the browser, from SMILES to fingerprints.

I/O layer · Apache-2.0 · 7 functions · Interactive playground

Chemical informatics — SMILES parsing, molecular fingerprints, Tanimoto similarity, substructure search, SDF parsing, and molecular property calculation.

Overview

cyanea-chem brings cheminformatics to the Cyanea ecosystem. It parses SMILES strings into molecular graphs, computes physicochemical properties (molecular weight, LogP, polar surface area, hydrogen bond donors/acceptors), generates circular fingerprints (Morgan/ECFP) and MACCS keys, and supports Tanimoto similarity search and SMARTS substructure matching.

All operations run in-browser via WASM, enabling interactive molecular exploration without server-side chemistry toolkits.

Key Concepts

SMILES

SMILES (Simplified Molecular-Input Line-Entry System) is a compact string notation for molecules. For example, c1ccccc1 is benzene and CC(=O)Oc1ccccc1C(=O)O is aspirin. The same molecule can be written as many different SMILES strings, so the canonical function maps any valid input to a single deterministic SMILES. Two inputs then describe the same molecule exactly when their canonical strings are equal.
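Because canonical forms can be compared as plain strings, deduplication reduces to grouping by canonical SMILES. The Python sketch below illustrates the idea. The helper name dedupe_by_canonical is hypothetical, and the lookup-table canonicalizer is only a stand-in so the example runs without the library; in practice one would presumably pass the binding's own canonicalization function, assuming the Python package exposes one.

```python
from typing import Callable, Iterable, List


def dedupe_by_canonical(
    smiles_list: Iterable[str],
    canonicalize: Callable[[str], str],
) -> List[str]:
    """Return canonical SMILES in first-seen order, dropping duplicates."""
    seen = set()
    unique = []
    for smi in smiles_list:
        key = canonicalize(smi)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Stand-in canonicalizer (hypothetical): a fixed lookup table that mirrors
# the phenol example from the Rust snippet on this page.
STAND_IN_TABLE = {
    "OC1=CC=CC=C1": "Oc1ccccc1",
    "Oc1ccccc1": "Oc1ccccc1",
    "c1ccccc1": "c1ccccc1",
}


def stand_in_canonicalize(smiles: str) -> str:
    return STAND_IN_TABLE[smiles]


unique = dedupe_by_canonical(
    ["OC1=CC=CC=C1", "c1ccccc1", "Oc1ccccc1"],
    stand_in_canonicalize,
)
print(unique)  # ['Oc1ccccc1', 'c1ccccc1']
```

Passing the canonicalizer as an argument keeps the grouping logic independent of any particular toolkit.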

Molecular Fingerprints

Fingerprints encode molecular substructure into fixed-length bit vectors. Morgan fingerprints (ECFP) enumerate circular substructures at each atom up to a given radius; MACCS keys check for 166 predefined structural patterns. Fingerprints enable fast similarity search over large compound libraries.
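To make the circular-substructure idea concrete, here is a deliberately simplified Python sketch. It is not the ECFP algorithm and not this library's implementation: atom identifiers are built only from element symbol and degree, bond orders are ignored, and the function name and input format (a list of element symbols plus a list of bond index pairs) are assumptions made for illustration.

```python
import zlib


def toy_circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Return the set of on-bit indices for a toy circular fingerprint.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def stable_hash(text):
        # crc32 is stable across runs, unlike Python's built-in str hash
        return zlib.crc32(text.encode("utf-8"))

    # Radius 0: each atom's identifier encodes its element and degree
    ids = [
        stable_hash(f"{symbol}:{len(neighbors[i])}")
        for i, symbol in enumerate(atoms)
    ]
    features = set(ids)

    # Each round mixes in the neighbors' identifiers, growing the
    # environment described by each atom's identifier by one bond
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            environment = sorted(ids[j] for j in neighbors[i])
            new_ids.append(stable_hash(f"{ids[i]}|{environment}"))
        ids = new_ids
        features.update(ids)

    # Fold the feature hashes into a fixed-length bit vector
    return {feature % n_bits for feature in features}


# Ethanol heavy atoms written in two different atom orders
fp_a = toy_circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
fp_b = toy_circular_fingerprint(["O", "C", "C"], [(0, 1), (1, 2)])
print(fp_a == fp_b)  # True: the result does not depend on atom order
```

Sorting the neighbor identifiers before hashing is what makes the result independent of atom numbering. Folding with a modulo can map distinct features to the same bit, which is why identical fingerprints do not prove identical molecules.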

Tanimoto Similarity

The Tanimoto coefficient measures the overlap between two fingerprints, treated as the sets A and B of their on-bits: |A ∩ B| / |A ∪ B|. A value of 1.0 means both fingerprints set exactly the same bits; 0.0 means they share none. It is a standard metric for virtual screening and compound clustering.
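The formula can be sketched in a few lines of Python over sets of on-bit indices. The function names and the choice to return 0.0 for two empty fingerprints are assumptions of this sketch, not documented behavior of the library's tanimoto function, which takes SMILES strings rather than bit sets.

```python
def tanimoto_bits(a, b):
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0
        return 0.0
    return len(a & b) / union


def rank_by_similarity(query, library):
    """Sort (name, score) pairs from most to least similar to the query."""
    scored = [(name, tanimoto_bits(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up bit sets for illustration, not real fingerprints
query = {1, 2, 3, 4}
library = {
    "half_overlap": {1, 2},
    "no_overlap": {7, 8},
    "identical": {1, 2, 3, 4},
}
ranking = rank_by_similarity(query, library)
print(ranking)
# [('identical', 1.0), ('half_overlap', 0.5), ('no_overlap', 0.0)]
```

Ranking a library this way is the core loop of similarity-based virtual screening; real fingerprints would replace the made-up bit sets.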

SMARTS Substructure Search

SMARTS patterns extend SMILES with wildcards and logical operators for substructure matching. For example, smiles_substructure("CC(=O)Oc1ccccc1C(=O)O", "[CX3](=O)[OX2H1]") checks whether aspirin contains a carboxylic acid group.

Code Examples

Rust

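use cyanea_chem::{smiles_properties, tanimoto, canonical};

// Snippet: `?` assumes an enclosing function that returns a Result.
// Molecular properties of aspirin (MW, LogP, HBA, HBD, TPSA)
let props = smiles_properties("CC(=O)Oc1ccccc1C(=O)O")?;
// Tanimoto similarity between benzene and phenol
let sim = tanimoto("c1ccccc1", "c1ccc(O)cc1")?;
// Kekulé phenol rewritten in canonical aromatic form
let canon = canonical("OC1=CC=CC=C1")?; // → "Oc1ccccc1"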

Python

import cyanea

props = cyanea.smiles_properties("CC(=O)Oc1ccccc1C(=O)O")
sim = cyanea.tanimoto("c1ccccc1", "c1ccc(O)cc1")

JavaScript (WASM)

import { smiles_properties, tanimoto } from '/wasm/cyanea_wasm.js';

// Results are parsed with JSON.parse before use
const props = JSON.parse(smiles_properties("CC(=O)Oc1ccccc1C(=O)O")); // aspirin
const sim = JSON.parse(tanimoto("c1ccccc1", "c1ccc(O)cc1")); // benzene vs. phenol

Use Cases

  • Virtual screening — Rank a compound library by Tanimoto similarity to a lead molecule.
  • Property filters — Apply Lipinski’s Rule of Five using computed MW, LogP, HBA, HBD.
  • SAR analysis — Check which analogs contain a pharmacophore via substructure search.
  • Data curation — Canonicalize SMILES to deduplicate compound databases.
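As an illustration of the property-filter use case, the following Python sketch applies Lipinski's Rule of Five to a dictionary of computed properties. The key names (mw, logp, hbd, hba) are assumptions, since the JSON shape returned by smiles_properties is not documented here, and the property values are approximate or made up for illustration rather than library output.

```python
# Upper limits from Lipinski's Rule of Five
LIPINSKI_LIMITS = [
    ("mw", 500.0),   # molecular weight in Da
    ("logp", 5.0),   # octanol-water partition coefficient
    ("hbd", 5),      # hydrogen bond donors
    ("hba", 10),     # hydrogen bond acceptors
]


def lipinski_violations(props):
    """Return the names of the properties that exceed their limit."""
    return [key for key, limit in LIPINSKI_LIMITS if props[key] > limit]


def passes_rule_of_five(props, max_violations=1):
    """Lipinski's rule tolerates at most one violation by default."""
    return len(lipinski_violations(props)) <= max_violations


# Approximate values for aspirin (hypothetical key names)
aspirin = {"mw": 180.16, "logp": 1.2, "hbd": 1, "hba": 4}
# A made-up large, lipophilic compound
heavy = {"mw": 720.0, "logp": 6.3, "hbd": 2, "hba": 8}

print(passes_rule_of_five(aspirin))  # True
print(lipinski_violations(heavy))    # ['mw', 'logp']
print(passes_rule_of_five(heavy))    # False
```

Returning the list of violated properties, not just a boolean, makes it easier to report why a compound was filtered out.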

API Surface

Function             Signature                       Description
smiles_properties    (smiles: &str) -> JSON          Compute molecular properties (MW, LogP, HBA, HBD, TPSA)
canonical            (smiles: &str) -> String        Canonicalize a SMILES string
smiles_fingerprint   (smiles, radius, bits) -> JSON  Compute Morgan/ECFP circular fingerprint
tanimoto             (smi1, smi2: &str) -> f64       Tanimoto similarity between two molecules
smiles_substructure  (mol, pattern: &str) -> bool    SMARTS substructure search
parse_sdf            (text: &str) -> JSON            Parse SDF/MOL file into structured records
maccs_fingerprint    (smiles: &str) -> JSON          Compute 166-bit MACCS structural keys
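For readers unfamiliar with the SDF layout that parse_sdf consumes, the Python sketch below splits an SDF text into records and extracts the data fields. It is an assumption-laden illustration, not the library's parser: the function names and the output shape (title, molblock, fields) are hypothetical, and the atom and bond blocks are kept as raw text rather than parsed.

```python
def parse_sdf_sketch(text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    # Tolerate a final record that lacks the '$$$$' terminator
    if any(line.strip() for line in current):
        records.append(_parse_record(current))
    return records


def _parse_record(lines):
    # The molblock runs from the title line through "M  END"
    end = next(
        (i for i, line in enumerate(lines) if line.startswith("M  END")),
        len(lines) - 1,
    )
    molblock = lines[: end + 1]
    fields, name = {}, None
    for line in lines[end + 1:]:
        if line.startswith(">"):
            # Data header such as "> <NAME>": the field name sits in <...>
            start, stop = line.find("<"), line.rfind(">")
            name = line[start + 1:stop] if 0 <= start < stop else None
            if name is not None:
                fields[name] = []
        elif not line.strip():
            name = None  # a blank line closes the current data item
        elif name is not None:
            fields[name].append(line)
    return {
        "title": molblock[0].strip() if molblock else "",
        "molblock": "\n".join(molblock),
        "fields": {key: "\n".join(value) for key, value in fields.items()},
    }


SAMPLE = "\n".join([
    "water",
    "  sketch",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <NAME>",
    "water",
    "",
    "> <FORMULA>",
    "H2O",
    "",
    "$$$$",
])

records = parse_sdf_sketch(SAMPLE)
print(records[0]["title"], records[0]["fields"])
```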

Tags

Chemistry, SMILES, Fingerprints, Tanimoto, Substructure