Welcome! This page explains the different criteria you can use to compare molecules in the Cheminformatics Similarity Explorer. Each criterion is a different "lens" through which we can measure how "similar" two molecules are.
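What is it? Topological polar surface area (TPSA) is an estimate of how much of a molecule's surface belongs to its polar atoms (usually oxygen and nitrogen, together with the hydrogens attached to them). It is typically reported in square angstroms (Å²).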
How to Interpret It: Think of TPSA as a measure of a molecule's "stickiness" for other polar molecules, like water. A higher TPSA means the molecule has a larger polar surface, making it better at forming hydrogen bonds. This often leads to higher boiling points and better solubility in water.
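What is it? XLogP is a calculated value that predicts how a molecule would distribute itself between two immiscible liquids: water (polar) and an "oil" (largely nonpolar, specifically n-octanol). Formally, it estimates logP, the base-10 logarithm of the molecule's concentration in octanol divided by its concentration in water.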
How to Interpret It: This is a "tug-of-war" for the molecule between oil and water.
A high positive XLogP (e.g., +3.0) means the molecule prefers the oil. It is lipophilic ("fat-loving") or hydrophobic ("water-fearing").
A low or negative XLogP (e.g., -1.0) means the molecule prefers water. It is hydrophilic ("water-loving").
This balance between oil and water is one of the key predictors of how well a drug will be absorbed by the body.
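Because logP is a base-10 logarithm, each unit corresponds to a factor of ten in the octanol-to-water concentration ratio. The short sketch below illustrates that arithmetic only; the function name is made up for this example, and it does not show how XLogP itself is predicted from a structure.

```python
def octanol_water_ratio(log_p: float) -> float:
    """Convert a logP value into the octanol:water concentration ratio it implies.

    Hypothetical helper for illustration; it only undoes the base-10 logarithm.
    """
    return 10.0 ** log_p


# Show what a few XLogP values mean as concentration ratios.
for log_p in (3.0, 0.0, -1.0):
    ratio = octanol_water_ratio(log_p)
    print(f"XLogP {log_p:+.1f} -> octanol:water ratio of about {ratio:g} : 1")
```

So an XLogP of +3.0 corresponds to roughly a thousand-fold preference for the oil, while -1.0 corresponds to a ten-fold preference for water.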
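What is it? Molecular weight is the mass of one molecule, measured in atomic mass units (amu, also called daltons). It is the sum of the atomic weights of all atoms in the molecule. Each atomic weight is roughly the number of protons plus neutrons in that atom, averaged over the element's natural isotopes.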
How to Interpret It: This is the simplest measure of a molecule's size: it tells you how heavy the molecule is.
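As a rough illustration, molecular weight can be computed by adding up standard atomic weights for every atom in the formula. The sketch below assumes the formula is already given as element counts and covers only four elements; the table and function name are illustrative, not part of the Explorer.

```python
# Standard atomic weights in amu, rounded to three decimals.
# Only a few elements are listed here to keep the example short.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(atom_counts: dict[str, int]) -> float:
    """Sum the atomic weights of all atoms, given a mapping of element -> count."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in atom_counts.items())


ethanol = {"C": 2, "H": 6, "O": 1}  # C2H6O
print(round(molecular_weight(ethanol), 2))
```

For ethanol this gives a value close to 46.07 amu.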
What is it? This is an estimate of the molecule's 3D size. It works by first determining the molecule's overall shape (long and thin, flat, or spherical) and then calculating the volume of the form-fitting 'ellipsoid' (think of a 3D oval or a stretched-out sphere) that represents that shape.
How to Interpret It: This tells you how big a molecule is in three dimensions. Unlike molecular weight (which is about mass), this volume is about the space the molecule occupies. A large, spread-out molecule will have a higher volume than a compact, dense one, even if they weigh the same. This method is especially good at showing the size difference between a long, 'rod-shaped' molecule and a tightly-packed, 'ball-shaped' one.
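The volume of an ellipsoid follows from its three semi-axes (half-lengths) a, b, and c as V = 4/3 · π · a · b · c. The sketch below applies that formula to two made-up sets of semi-axes; how the Explorer actually derives the semi-axes from a molecule's shape is not described here, so the numbers are purely illustrative.

```python
import math


def ellipsoid_volume(a: float, b: float, c: float) -> float:
    """Volume of an ellipsoid with semi-axes a, b, c (all in the same length unit)."""
    return 4.0 / 3.0 * math.pi * a * b * c


# Hypothetical semi-axes in angstroms, chosen only to contrast two shapes.
rod_like = ellipsoid_volume(6.0, 1.5, 1.5)   # long and thin
ball_like = ellipsoid_volume(2.0, 2.0, 2.0)  # compact sphere

print(f"rod-like:  {rod_like:.1f} cubic angstroms")
print(f"ball-like: {ball_like:.1f} cubic angstroms")
```

With these example values the stretched-out ellipsoid encloses more space than the compact one, which is the kind of difference this criterion is meant to capture.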
What is it? A sophisticated metric that describes a molecule's overall 3D shape.
How to Interpret It: This metric classifies molecules by how closely they resemble one of three basic shapes:
Sphere (a ball): A compact molecule like methane.
Rod (a cigar): A long, unbranched chain molecule like butane.
Disc (a pancake): A flat, planar molecule like benzene.
Molecules with similar shapes will have a high similarity score for this criterion.
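One common way to place a molecule between these three shapes, which may or may not be what the Explorer does internally, uses the molecule's three principal moments of inertia. Dividing the two smaller moments by the largest gives a pair of ratios; an ideal rod sits at (0, 1), an ideal disc at (0.5, 0.5), and an ideal sphere at (1, 1). The sketch below assumes the moments are already known and simply picks the nearest ideal shape. The function name and the example values are hypothetical.

```python
import math

# Ideal positions in the normalised principal-moment plot.
# Assumption: this is one widely used convention, not necessarily the Explorer's.
IDEAL_SHAPES = {"rod": (0.0, 1.0), "disc": (0.5, 0.5), "sphere": (1.0, 1.0)}


def classify_shape(i1: float, i2: float, i3: float) -> str:
    """Return the ideal shape closest to the given principal moments of inertia."""
    smallest, middle, largest = sorted((i1, i2, i3))
    ratios = (smallest / largest, middle / largest)
    # Pick the ideal shape with the smallest Euclidean distance to the ratios.
    return min(IDEAL_SHAPES, key=lambda name: math.dist(ratios, IDEAL_SHAPES[name]))


# Made-up moments of inertia, in arbitrary units.
print(classify_shape(1.0, 1.0, 1.0))   # all three equal
print(classify_shape(0.05, 1.0, 1.0))  # one very small moment
print(classify_shape(1.0, 1.0, 2.0))   # largest is the sum of the other two
```

Two molecules whose ratio pairs lie close together would then be considered similar in shape, regardless of how large each molecule is.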
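What is it? A count of the single bonds in a molecule that can rotate freely. Bonds inside rings and bonds to terminal atoms (such as the end carbon of a chain) are typically not counted.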
How to Interpret It: This is a measure of a molecule's "floppiness" or flexibility.
A low number (like in cyclohexane) means the molecule is rigid.
A high number (like in a long alkane chain) means the molecule is very flexible and can twist into many different conformations.
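A simplified version of this count can be sketched on a molecule's heavy-atom skeleton (hydrogens left out). The rule assumed here is: count a bond if it is a single bond, is not part of a ring, and neither of its atoms is a chain end. Real cheminformatics toolkits apply stricter, more detailed definitions, so treat this as an illustration rather than the Explorer's exact rule; the function name and input format are made up for the example.

```python
from collections import defaultdict


def count_rotatable_bonds(bonds):
    """Count rotatable bonds in a heavy-atom graph.

    bonds: list of (atom_a, atom_b, bond_order) tuples, hydrogens omitted.
    Simplified rule (an assumption): single, non-ring, non-terminal bonds.
    """
    neighbours = defaultdict(set)
    for a, b, _ in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def in_ring(a, b):
        # A bond is in a ring if b can still be reached from a
        # without walking along the a-b bond itself.
        seen, stack = {a}, [a]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if {node, nxt} == {a, b} or nxt in seen:
                    continue
                seen.add(nxt)
                stack.append(nxt)
        return b in seen

    count = 0
    for a, b, order in bonds:
        terminal = len(neighbours[a]) == 1 or len(neighbours[b]) == 1
        if order == 1 and not terminal and not in_ring(a, b):
            count += 1
    return count


butane = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]              # C-C-C-C chain
cyclohexane = [(i, (i + 1) % 6, 1) for i in range(6)]   # six-membered ring
print(count_rotatable_bonds(butane), count_rotatable_bonds(cyclohexane))
```

Under this rule only the central bond of butane counts, and cyclohexane has no rotatable bonds because every bond is part of the ring.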
What is it? A count of the sites on a molecule that can participate in hydrogen bonding.
Donors are hydrogen atoms bonded to a highly electronegative atom (N, O, F).
Acceptors are electronegative atoms (N, O, F) with lone pairs of electrons.
How to Interpret It: These counts are a direct measure of a molecule's potential to form hydrogen bonds, which are among the most important intermolecular forces. Molecules with high donor and acceptor counts tend to be more soluble in water, especially when they are small.
What is it? Tanimoto similarity is a popular method in cheminformatics for measuring the overall similarity of two molecules' 2D structures.
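How to Interpret It: Think of this as a "fingerprint comparison." The computer converts each molecule's structure into a digital fingerprint (a series of 1s and 0s, where each 1 marks a structural feature that is present). The Tanimoto score, which ranges from 0 to 1, measures how much the two fingerprints overlap: the number of features both molecules share, divided by the number of features found in either one. A score of 1.0 means the fingerprints are identical, which almost always means the structures are the same or extremely close, while a score near 0 means the molecules have very little in common. It's a useful holistic measure of structural similarity.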
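The overlap calculation itself is short. The sketch below represents each fingerprint as the set of positions where the bit is 1 and applies the standard Tanimoto formula, shared / (bits in A + bits in B - shared). The fingerprints are invented for illustration, and treating two empty fingerprints as identical is an assumption of this example.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 1.0  # assumption: two empty fingerprints are treated as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Made-up fingerprints: the numbers are the positions of the 1 bits.
molecule_1 = {1, 4, 7, 9}
molecule_2 = {1, 4, 8, 9, 12}

print(tanimoto(molecule_1, molecule_2))
```

Here the two fingerprints share three bits out of six distinct bits in total, giving a score of 0.5.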