Protein x-ray crystallography has come a long way from a 12 year search for the structure of a single protein. Philip Ball reports

When John Kendrew walked into Max Perutz’s room at the Cavendish laboratory in Cambridge and said he wanted to be his research student, Perutz was flattered but also flustered. For one thing, Kendrew was almost Perutz’s age, and he looked rather grand in his Wing Commander’s uniform. This was 1945, and Kendrew had just finished serving as an advisor to Lord Mountbatten in Ceylon. 



For another thing, Perutz had never had a research student before. But what troubled him most was the suspicion that working in his lab would be a very slow and laborious way for Kendrew to get a PhD. For Perutz was trying to use x-ray crystallography to deduce the structures of proteins. The arrangements of atoms in these biological molecules were evidently fearsomely complicated, and not even Perutz was sure they could be truly deduced with beams of x-rays. It is hard to imagine, Perutz said in 1997, ’how courageous Kendrew’s decision was then to take up protein crystallography’. The reason he had no students was that ’responsible dons advised graduates against joining such a forlorn undertaking’.

In the event it proved anything but forlorn, although the challenge was every bit as difficult as Perutz anticipated. It was not until 12 years later that Kendrew finally worked out the structure of myoglobin, the protein that stores oxygen in muscle cells after receiving it from haemoglobin, the oxygen-ferrying molecule in the bloodstream. Kendrew’s picture of myoglobin was detailed enough to allow him to figure out how the polypeptide chain is folded up into a compact ball. He published this model 50 years ago this month,1 laying down a landmark in the histories both of molecular-structure determination by x-ray crystallography and of structural biology.

Two years later Kendrew and his colleagues had refined their techniques sufficiently to see myoglobin’s shape at a resolution of 2 Å (ångströms) - enough to let them work out where some of the individual atoms are. This work, and the joint efforts of Kendrew and Perutz on the structure of haemoglobin, as well as the pioneering technical advances that entailed, earned the two men the Nobel prize in chemistry in 1962.

Without achievements like these that revealed the three-dimensional atomic structures of proteins and other complex biomolecules, molecular biology could hardly exist. A protein’s structure holds the key to understanding how it does its often extremely precise and selective job: binding and transforming its substrate, for instance, or sticking to nucleic acids, inducing motion, or supplying a robust fabric in tissues. And the rational design of drugs that can bind to and modify the behaviour of enzymes typically needs a detailed picture of the molecular structure of the cavity into which it must fit with lock-and-key precision.  

Even though the structures of protein molecules with the complexity of myoglobin can now be solved in a matter of days rather than years, structure determination is still a limiting factor in understanding how cells work. Today many of the really difficult structures are those of vast multi-molecule complexes such as the ribosome (the cell’s protein-synthesis machine), which stretch the techniques as far as they can go. The sheer number of different proteins in the cell is prompting the development of automated methods that can churn out protein structures like production plants; but many proteins refuse to form crystals good enough for diffraction techniques to work well. Meanwhile, the emphasis is starting to shift from structure to dynamics: to understand protein function, we need to know not just what the molecules look like in a static crystal, but how they change shape, in the blink of an eye, as they perform their roles in the cell. 

Whale crystals 



John Kendrew (left) and Max Perutz with their model in progress

When Kendrew arrived in his lab, Perutz was attempting to use x-ray crystallography to solve the structure of haemoglobin. The technique, devised by the German physicist Max von Laue, had been developed in the 1910s by William Bragg and his son Lawrence, who used it to work out the atomic structures of many inorganic crystals. It relied on interference between x-rays scattered from different planes of regularly stacked atoms, which created a pattern of bright spots that could be recorded on photographic film. From the positions of these spots, one could calculate the positions of the crystal planes, and thus where the atoms were situated.
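The link between spot position and plane spacing described here is usually written as Bragg's law, n λ = 2 d sin θ, where λ is the x-ray wavelength, d the spacing between planes, θ the angle between the beam and the planes, and n a whole number. The short Python sketch below is only an illustration of that relation: the function names are invented for this example, and the 1.54 Å wavelength in the usage lines is an assumed value typical of a copper x-ray source, not a figure from the experiments described in this article.

```python
import math


def bragg_spacing(wavelength, theta_deg, order=1):
    """Plane spacing d from Bragg's law, n * wavelength = 2 * d * sin(theta).

    wavelength and the returned spacing share the same length unit
    (for example angstroms); theta_deg is the angle between the incoming
    beam and the crystal planes, in degrees.
    """
    if not 0.0 < theta_deg <= 90.0:
        raise ValueError("theta_deg must lie in the interval (0, 90]")
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def bragg_angle(wavelength, spacing, order=1):
    """Angle (degrees) at which planes with the given spacing reflect strongly."""
    sine = order * wavelength / (2.0 * spacing)
    if sine > 1.0:
        raise ValueError("no reflection: wavelength too long for this spacing")
    return math.degrees(math.asin(sine))


if __name__ == "__main__":
    assumed_wavelength = 1.54  # angstroms; an assumption for illustration only
    # A spot recorded at theta = 30 degrees implies planes about 1.54 angstroms apart.
    print(bragg_spacing(assumed_wavelength, 30.0))
    # Widely spaced planes reflect at small angles, close to the direct beam.
    print(bragg_angle(assumed_wavelength, 20.0))
```

Because closely spaced planes reflect at large angles, fine detail in a structure shows up in the spots furthest from the centre of the pattern.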

In the 1930s, J Desmond Bernal and Dorothy Hodgkin in the Cavendish lab showed that protein crystals could yield sharp diffraction patterns, raising the prospect that their structures could be decoded. But it was already clear by then that proteins are large molecules containing hundreds or thousands of atoms, and figuring out where all the atoms sit would be daunting. Perutz began to work as Bernal’s student in 1937, and a year later Lawrence Bragg became director of the Cavendish. Perutz said that Bragg’s enthusiasm for applying x-ray crystallography to proteins was vital to the success of an enterprise that needed patient support over many years. ’He was fascinated by the idea that the powers of x-ray analysis might be extended to the giant molecules which form the catalysts of living cells,’ Perutz wrote. 

At first, Perutz suggested that Kendrew compare the diffraction patterns of adult and foetal sheep haemoglobin. Kendrew soldiered on with this, but it just wasn’t possible at that time to convert the diffraction data to anything more than a rough outline of the protein chains. (Perutz finally published the detailed structure, still at less than atomic resolution after 22 years of effort, in 1960).2 Kendrew decided that he might have more success with myoglobin, which is only a quarter the size of haemoglobin. The problem was to make crystals large enough to diffract well. Kendrew couldn’t get them from horse myoglobin, but then he realised that diving mammals such as whales have much more myoglobin in their muscles, because they need a big oxygen store. Perutz got hold of a large chunk of sperm whale meat from Peru, and Kendrew was able to extract enough myoglobin from this to make beautiful, ’sapphire-like’ crystals. 

But there was a problem. The brightness of the spots in the pattern reveals the amplitude of the scattered x-rays: the intensity of each spot is proportional to the square of the amplitude. But to calculate back from the diffraction pattern to the location of the atomic planes, it’s also necessary to know the phases of the waves: how their undulations are in or out of step with one another. This information just isn’t there in the diffraction photos. In 1953, Perutz saw how this ’phase problem’ could be cracked. It involved making crystals that contain very heavy metal atoms, such as uranium, which scatter x-rays strongly and thus provide a set of reference points in the diffraction pattern. If the pattern from heavy-atom crystals is compared to that from native crystals, the phase problem can be solved.
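A toy calculation can make the phase problem concrete. The Python sketch below is an illustration under simplifying assumptions, not a reconstruction of Perutz's method: it uses a made-up one-dimensional 'electron density' of eight points, and a discrete Fourier transform stands in for the diffraction experiment. All names and numbers are hypothetical.

```python
import cmath
import math


def dft(values):
    """Discrete Fourier transform: a stand-in for the scattered waves."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
        for h in range(n)
    ]


def inverse_dft(coeffs):
    """Rebuild a real-valued density from complex Fourier coefficients."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
        for x in range(n)
    ]


# Hypothetical one-dimensional 'crystal': a heavy atom, a light neighbour, a third atom.
density = [0.0, 0.0, 5.0, 1.0, 0.0, 0.0, 2.0, 0.0]

waves = dft(density)
amplitudes = [abs(w) for w in waves]  # all that the spot brightness can provide

# With amplitudes and phases together, the density is recovered.
with_phases = inverse_dft(waves)

# With every phase set to zero, the same amplitudes give a different map.
without_phases = inverse_dft(amplitudes)

# Sliding the atoms along the cell changes only the phases, so the
# 'diffraction pattern' cannot tell the two arrangements apart.
shifted = density[1:] + density[:1]
shifted_amplitudes = [abs(w) for w in dft(shifted)]

if __name__ == "__main__":
    print([round(v, 3) for v in with_phases])
    print([round(v, 3) for v in without_phases])
    print(all(abs(a - b) < 1e-9 for a, b in zip(amplitudes, shifted_amplitudes)))
```

The last comparison shows why extra information is needed: a strongly scattering atom at a known position supplies a reference against which the missing phases can be estimated.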

Haemoglobin contains sulfur atoms to which mercury ions could be conveniently attached. But myoglobin lacks convenient attachment sites for metal atoms, and Kendrew and his colleagues had to resort to throwing all sorts of metals at the molecule until some stuck. This allowed them to obtain three-dimensional structures. Their 1958 paper showed pictures of the electron density in slices of the crystals - it’s the electrons that actually scatter the x-rays - from which the shapes of the folded chains could be inferred. 

A brighter view 



Protein crystal

It is a testament to the abiding importance of crystallography in the sciences that at least eight Nobel prizes have been awarded for work directly relating to its development or use. Von Laue and the Braggs were awarded the physics prize in successive years, and just two years after Perutz and Kendrew’s prize, Dorothy Hodgkin was given the chemistry Nobel for her crystallographic studies of biomolecules, including insulin and vitamin B12.

For large molecules like proteins, however, the difficulty is not just that there are so many atoms to locate. Crystallography needs crystals, but not all proteins will easily form them. That’s particularly a problem for membrane-bound proteins, which have ’greasy’ surfaces compatible with the fatty-acid lipids of membranes. This makes them rather insoluble in water. But such proteins typically also have water-soluble parts that stick out into the cytoplasm - and so they don’t generally dissolve well in organic solvents either. It’s therefore hard to grow good crystals from them. 

A milestone in tackling this issue was the x-ray structure determination of the bacterial photosynthetic reaction centre (PRC), which won the 1988 Nobel in chemistry for Johann Deisenhofer, Robert Huber and Hartmut Michel. Michel used small, soap-like surfactant molecules to solubilise PRCs from photosynthetic purple bacteria. The surfactants are surrogates for membrane lipids, but instead of packing into continuous sheets they just cluster around the fatty parts of the PRCs, creating micelle-like structures with their water-soluble heads at the surface. Michel managed to get PRC-surfactant complexes to form nice crystals, and he then collaborated with Deisenhofer and Huber to deduce the structure using Perutz’s heavy-atom substitution to solve the phase problem.3

This work benefited from the availability of very intense x-ray beams, in that case produced at the DESY synchrotron in Hamburg. Such sources have been a boon for protein crystallography, because more intense beams allow good diffraction patterns to be obtained from smaller samples, reducing the need to make large, perfect crystals. In 1999, researchers at the University of California at Santa Cruz used the bright x-ray beam of the Advanced Light Source at the Lawrence Berkeley Laboratory in California to obtain the first crystallographic structure of a complete bacterial ribosome.4 But with a resolution of just 7.8 Å, it was a fuzzy picture. By 2005, Jamie Cate and colleagues at Berkeley, again working at the ALS, had sharpened this to 3.5 Å - which, coupled with atomic-resolution data of individual ribosomal subunits, offers something like an atom-by-atom view of how this protein-synthesis machine works.

Quick snaps 

Protein structures are now routinely solved at a rate of more than one a week. Many are bacterial proteins, since these tend to be simpler and thus easier to purify and crystallise than human proteins. But proteins relevant to human disease are gradually giving up their secrets - in 2005, for example, a project called the Structural Genomics Consortium, based in Toronto, Oxford and Stockholm, added 50 such structures to the Protein Data Bank. Last October the consortium announced its 500th protein structure, an RNA helicase linked to viral immunity. Even with projects like this, however, mapping the protein landscape is a huge challenge. To understand protein function and to plan drug design, it is often necessary to know the structures not only of native proteins but of many mutant forms too.  

But because the basic task is always the same - make good crystals, collect the diffraction pattern, and crunch the numbers until a ’stable’ structure emerges - it lends itself to automation. There are now highly automated methods for ’high throughput’ protein crystallography that produce and analyse hundreds of crystals a week. Some of these operate on an industrial scale using robotic preparation procedures, and they rely on synchrotron sources to gather enough data for structure determination in a very short time, using crystals that can be as small as ten micrometres across.6 The diffraction patterns can be analysed by computers, but it is hard to leave everything to machines: choosing good crystals is an art that generally depends on human judgement. The small crystals are also often fragile, making them difficult to manipulate with robotics.  

In his 1988 Nobel lecture, Robert Huber pointed out that one of the biggest mysteries of protein structure is how they fold from an extended polypeptide chain into a functional, compact three-dimensional shape. ’An ultimate goal for which we all struggle is the solution of the folding problem,’ he said. If we could understand the rules that lead from protein sequence to shape, we might no longer need protein crystallography at all: in principle, a protein’s shape could then be deduced simply from the sequence of the gene that encodes it. 

But to understand protein folding we need to follow the process from moment to moment: it is a question not of structure but of dynamics. At face value, diffraction might appear to be useless for that, because it relies on the existence of static structures in a crystal. But not necessarily. If we could obtain a diffraction pattern fast enough, we might be able to work out the instantaneous structure of an evolving molecular shape, and thus follow the trajectory it takes. That means collecting lots of data literally in a flash. But this is now becoming feasible.7 Late last year, for instance, Ahmed Zewail’s group at the California Institute of Technology in Pasadena reported a ’four-dimensional’ structure of a vanadium dioxide crystal as it undergoes a phase transition. The fourth dimension is time: using ultrashort pulses of electrons, Zewail’s team could follow atomic rearrangements by electron diffraction on timescales of just a few picoseconds.8 In principle, such methods might also reveal the changes that occur as proteins bind their targets: in contrast to the classic ’lock-and-key’ idea, these events rely on a lot of floppiness and flexibility in the binding sites, which could supply key clues to making protein-binding drugs.9 Researchers at the ALS are now aiming to create ultrashort x-ray pulses that might enable the mapping of protein motions on timescales of attoseconds - billion-billionths of a second. 

Another approach to the protein-folding problem is to use computer simulations. But current simulation methods just aren’t fast or accurate enough to follow folding all the way from start to end. David Baker at the University of Washington in Seattle has developed a method in which simulations draw on the knowledge gained from previous protein-structure determinations, using sequence matches between the protein being studied and those in data banks, to provide good guesses about how parts of the chain will fold. Last year, Baker’s team reported that their so-called Rosetta technique could correctly predict the structure of a 112-residue bacterial protein from its sequence.10 This is still a small protein, however, and even then the task was computationally expensive, relying on distributed computing that scavenges spare time from the machines of volunteers.

Even if a solution to the folding problem is going to render protein crystallography redundant, then, it looks set to rely on the achievements of structure determinations past. As Huber puts it, ’it is certain that the end of protein crystallography will only come through protein crystallography.’ 

Philip Ball is a freelance science writer based in London, UK