Paste a peptide SMILES — get the ProForma sequence and the monoisotopic mass.
Free, instant, and runs entirely in your browser. The output is a fully qualified
ProForma string:
the 20 natural amino acids as single letters,
side-chain modifications as [Formula:ΔCxHy],
Aib and N-methyl backbones as X[Formula:…],
detected N- and C-terminal caps ([Acetyl]-…-[Amide]),
and support for cyclic peptides and disulfide bridges.
Validated on liraglutide, semaglutide, vasopressin and other clinical-grade peptides.
What this tool does
Given a peptide drawn as SMILES, this converter returns the canonical
ProForma sequence (the PSI-MS standard string format used in proteomics),
plus the monoisotopic mass. It works for linear peptides
(e.g. Leu-enkephalin → YGGFL, mass 555.27),
for cyclic peptides (e.g. vasopressin),
for capped peptides ([Acetyl]-MK-[Amide]),
and for clinical peptide drugs where side-chain lipidation on Lys produces a
K[Formula:C28H46N2O9]-style flag with the explicit atomic delta.
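To make the "explicit atomic delta" concrete, here is a minimal Python sketch that turns a formula such as C28H46N2O9 into a monoisotopic mass shift. The helper name `formula_mass` and the small element table are illustrative assumptions, not the tool's own code, and only plain element-count formulas are handled (no isotope labels, no spaces).

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes; standard values.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}

# One element symbol followed by an optional (possibly negative) integer count.
TOKEN = re.compile(r"([A-Z][a-z]?)(-?\d+)?")


def formula_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C28H46N2O9' (hypothetical helper)."""
    total, pos = 0.0, 0
    while pos < len(formula):
        m = TOKEN.match(formula, pos)
        if m is None or m.group(1) not in MONO:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        count = int(m.group(2)) if m.group(2) else 1
        total += MONO[m.group(1)] * count
        pos = m.end()
    return total


print(round(formula_mass("C28H46N2O9"), 4))  # 554.3203
```

Under these assumptions, a K[Formula:C28H46N2O9] flag would add about 554.32 Da to the unmodified lysine residue mass.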
How it works
The SMILES is parsed by RDKit MinimalLib running in your browser. The tool then runs SMARTS substructure matches for each of the 20 natural amino-acid backbones, with strict aromatic-ring constraints to prevent false positives on substituted Phe/Tyr. It walks the peptide bonds (atom-index continuity) to recover the N→C ordering, walks each residue's atoms breadth-first to extract its formula, and detects the N- and C-terminal caps. The monoisotopic mass is then computed from a residue-mass table, plus the modification deltas, minus 2 H per disulfide bridge, plus a side-chain protonation correction. Nothing is uploaded.
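The final mass step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the tool's implementation: it covers only linear peptides, takes the modification and cap deltas as a single pre-summed number, and omits the side-chain protonation correction. The names `peptide_mass`, `mod_delta` and `disulfides` are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 natural amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # H2O for the free N-terminal H and C-terminal OH
HYDROGEN = 1.00783   # one H atom


def peptide_mass(sequence: str, mod_delta: float = 0.0, disulfides: int = 0) -> float:
    """Monoisotopic mass of a linear peptide (hypothetical helper).

    mod_delta  -- summed mass of all modification and cap deltas, in Da
    disulfides -- number of disulfide bridges; each one removes 2 H
    """
    backbone = sum(RESIDUE_MASS[aa] for aa in sequence)
    return backbone + WATER + mod_delta - 2 * HYDROGEN * disulfides


print(round(peptide_mass("YGGFL"), 2))  # 555.27 (Leu-enkephalin)
```

Keeping the disulfide term separate from the modification deltas mirrors the description above, where the −2 H per bridge is applied to the mass rather than written into the sequence string.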
Output format (ProForma)
- Pattern: [Ncap]-(residue[mod]?)+-[Ccap]
- 20 natural amino acids → single letters (A C D E F G H I K L M N P Q R S T V W Y).
- Side-chain-modified naturals → letter[Formula:ΔCxHy]. The formula is the net delta, i.e. atoms added minus the 1 H replaced, so it's directly readable by pyteomics and other ProForma-aware tools.
- Non-natural backbones (Aib, Sar, etc.) → X[Formula:CxHy] (full residue atomic composition).
- N-terminal caps: free amine = no prefix; CH₃CO- → [Acetyl]-; HCO- → [Formyl]-; tBu-OCO- → [Boc]-; other → [Formula:Δ]-.
- C-terminal caps: free acid = no suffix; -CONH₂ → -[Amide]; methyl ester → -[Formula:CH2]; other → -[Formula:Δ].
- Cyclic peptides → cyclic[…]. Disulfide bridges contribute −2 H to the mass automatically.
- Stereochemistry (D/L) is not distinguished.
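As an illustration of the pattern above, the following Python sketch splits an output string into its N-terminal cap, residues with optional bracketed modifications, and C-terminal cap. It is a minimal reader for linear strings only, written under the assumption that caps are always joined with a hyphen as shown; the cyclic[…] form is not handled, and `split_proforma` is a hypothetical name rather than part of the tool or of any library.

```python
import re

N_CAP = re.compile(r"^\[([^\]]+)\]-")           # e.g. "[Acetyl]-" at the start
C_CAP = re.compile(r"-\[([^\]]+)\]$")           # e.g. "-[Amide]" at the end
RESIDUE = re.compile(r"([A-Z])(?:\[([^\]]+)\])?")  # letter plus optional [mod]


def split_proforma(text: str):
    """Return (n_cap, [(letter, mod_or_None), ...], c_cap) for a linear string."""
    n_cap = c_cap = None
    m = N_CAP.match(text)
    if m:
        n_cap, text = m.group(1), text[m.end():]
    m = C_CAP.search(text)
    if m:
        c_cap, text = m.group(1), text[:m.start()]

    residues, pos = [], 0
    while pos < len(text):
        m = RESIDUE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}: {text!r}")
        residues.append((m.group(1), m.group(2)))
        pos = m.end()
    return n_cap, residues, c_cap


print(split_proforma("[Acetyl]-MK-[Amide]"))
# ('Acetyl', [('M', None), ('K', None)], 'Amide')
```

For real workflows a ProForma-aware library is the better choice; the sketch only shows how the cap and residue positions in the pattern map onto the string.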
Mass accuracy
The reported mass agrees with RDKit's ExactMolWt to within about 0.001 Da.
Charged species carry an electron-mass offset of ~0.0005 Da (negligible for most
applications). Validated on:
di- / tri-peptides with all cap combinations,
Leu-enkephalin (555.27),
vasopressin (1084.45, with disulfide),
liraglutide (3751.26),
semaglutide (4113.58).
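A check of this kind can be expressed as a small tolerance comparison. The Python sketch below is an assumption about how one might test a reported mass against a reference value, widening the 0.001 Da window by one electron mass per unit charge; `masses_agree` and its parameters are hypothetical names, and the reference value would come from whatever tool you trust.

```python
ELECTRON_MASS = 0.00054858  # Da, mass of one electron


def masses_agree(reported: float, reference: float,
                 charge: int = 0, tol: float = 0.001) -> bool:
    """True if the two masses match within tol plus the electron-mass allowance."""
    allowance = tol + abs(charge) * ELECTRON_MASS
    return abs(reported - reference) <= allowance


print(masses_agree(555.2693, 555.2690))            # True
print(masses_agree(555.2693, 555.2710))            # False
print(masses_agree(100.0, 100.0013, charge=1))     # True
```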
Privacy
Your SMILES never leaves your browser. The conversion happens locally via RDKit compiled to WebAssembly. This page makes no network requests once RDKit has loaded, so it's safe for proprietary structures.