As the Cambridge Structural Database reaches its 50th year, Colin Groom gives us some of its greatest ‘hits’

Over its 50 years, the CSD has accumulated a great many unusual structures – enough to keep busy even the most inquisitive minds. The following give some idea of the enormous diversity within the collection. Their six-letter CSD reference codes (refcodes) can be used with the ‘get structures’ feature on the CCDC website to look them up.

Sam Falconer/Debut Art

The crystal community

For chemists, the Cambridge Structural Database (CSD) is part of the furniture. It contains data for every small-molecule crystal structure ever determined – over 750,000 of them – and is a mine of useful information for anyone researching or teaching chemistry. It can be accessed online by anyone and is delivered locally to more than 1200 universities and 200 industrial sites, including all of the world’s leading pharmaceutical enterprises. Knowledge derived from the CSD underpins the world of computational chemistry and molecular modelling. For most of us, it has always been there, but there was a time before it existed.

Back in 1965, there were only around 1500 published structures. Inspired by John Bernal, Olga Kennard created the CSD to facilitate data sharing. She had the foresight to understand how this information might drive the creation of new chemical knowledge. In those days, the structures were collated manually – a painstaking process. The first editions of the ‘database’ were printed volumes of geometries, but it wasn’t long before one of the world’s first numeric scientific databases started to take shape.

Establishing the Cambridge Crystallographic Data Centre (CCDC) was essential to the sustainability of this resource. Furthermore, enormous credit must go to the crystallographic community for its exemplary data-sharing practices. From the inception of the technique, the results of every published crystal structure have been shared with the CCDC – allowing it, in turn, to share the data with the world. This enduring global cooperation, paired with continuing technological developments, has brought us to the point where referees can see crystal structures that, seconds after publication, become available to anyone who needs them, anywhere in the world, on computers and mobile devices.

Structures are now deposited from around the world, with the most rapid growth in India, China, Japan, Germany, France, the UK, Australia and the US. It seems unlikely that the CSD’s growth will continue at its current rate forever – but the database will certainly continue to develop in both size and value. The database team at the CCDC is confidently predicting that it will hit 1 million structures within three years. The growth curve suggests that their confidence is well founded.

Colin Groom is executive director of the CCDC