Aggregating and indexing chemical structures and associated information


Royal Society of Chemistry  

Reviewed by Sean Ekins, 
Collaborations in Chemistry, Jenkintown, US 


ChemSpider is a free chemistry search engine that aggregates and indexes chemical structures and their associated information into a single searchable repository.

In my experience it has always been more than a chemistry database (it holds over 21 million molecules, including drugs and drug-like structures). It represents a new mindset for information provision and embodies the spirit of collaborative science. Users are encouraged to upload molecules and spectra too. From the beginning it has provided calculated molecular properties, which I am sure has pleased the countless medicinal chemists who focus on drug-likeness.

I have used the database for finding structures of interest, downloading structure data (SDF) files for quantitative structure-activity relationship (QSAR) model development [1], and generally learning more about molecules that interest me. Searches based on similarity to molecules derived from pharmacophore tools have also turned up compounds that collaborators then purchased and found to be biologically active against specific targets. More recently I have seen the database used to search for particular isosteres of interest.

What I like about ChemSpider is its unpretentiousness: it evolves over time, is accessible to all, and provides a significant new knowledge base and resource for chemists working in different fields. What I would like to see change in the future is easier downloading of larger numbers of molecules in SDF or other formats, perhaps capped at several thousand. Adding and linking to more biological resources would greatly expand its audience. Perhaps most people will use it to answer questions and solve problems; for others it offers the opportunity to create solutions from the abundance of data and services available via ChemSpider. It has also been used with text mining as the basis of a chemistry document markup system (called ChemMantis) that converts chemical names to chemical structures, and as the basis of a new chemistry-centric journal, the ChemSpider Journal of Chemistry. I imagine it as part of a growing free public source of chemistry and biology information. The US National Institutes of Health and other funding bodies could encourage the scientists they fund to deposit their structures and data in it in the same way they must deposit their publications.

By providing a platform for public crowd-sourced curation, ChemSpider has also facilitated the cleansing of public domain chemistry data, and this in itself is a huge contribution to public knowledge. By providing a platform for the deposition of new chemical data, ChemSpider has given chemists a way to expose their molecules and associated data to the public, thereby facilitating rapid communication of chemistry outside the classical publication model. Mainstream scientific publishers have taken note and used it as a repository.

For more information on ChemSpider contact