The longest synthetic genome shows us life is more complicated than just learning your AGCs

The geneticist Theodosius Dobzhansky famously made it a kind of axiom that ‘nothing in biology makes sense except in the light of evolution’. But if there were to be a First Law of Biology, I’d suggest a different one: ‘I think you’ll find it’s more complicated than that.’

Evolution is driven by natural selection? I think you’ll find it’s more complicated than that. The genome is a blueprint for the organism? Proteins are readouts of genes? I think you’ll find…

It’s complicated

No wonder that the structure of DNA seemed a blessed relief when it was revealed in 1953. Here at least, in the elegant double helix, was something we could understand according to simple rules. DNA sequences become protein sequences via the genetic code: codons – triplets of DNA (or RNA) bases – each correspond directly to one of the 20 amino acids found in proteins. Sure, how proteins fold, how intron editing occurs, all the rest of that, is more complicated. But the code is fixed and transparent.

That code, we’re taught, is also redundant – because there are 64 possible codons (4³ triplet permutations of the four nucleotide bases) but just 20 amino acids for them to represent. Some amino acids are thus represented by more than one codon – by synonyms; some codons are also used for the ‘stop’ instruction in translation. But… it’s more complicated than that.
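To make that arithmetic concrete, here is a minimal Python sketch that builds the standard genetic code and counts its synonyms. The compact 64-letter string is the conventional way of writing out the standard code with the bases ordered T, C, A, G; the variable names are mine, chosen purely for illustration.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# T, C, A, G order (third base varying fastest). '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4 x 4 x 4 triplets to its amino acid (or stop).
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by what they encode, so that synonyms sit together.
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    synonyms[aa].append(codon)

n_codons = len(CODON_TABLE)            # 4 ** 3
n_amino_acids = len(synonyms) - 1      # exclude the '*' (stop) group
stop_codons = sorted(synonyms["*"])
serine_codons = sorted(synonyms["S"])

print(f"{n_codons} codons encode {n_amino_acids} amino acids")
print("stop codons:", stop_codons)
print("serine synonyms:", serine_codons)
```

Serine is one of the amino acids with the most synonyms in this table, which is part of what makes it a natural candidate for compression.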

‘Codon bias’ – a non-random preference for certain codons among apparent synonyms – is well documented, and can influence rates of protein synthesis and cell growth in ways that control cell processes and serve adaptive purposes.1,2 Perhaps we might suggest a Second Law of Biology: ‘Evolution finds a use for almost anything.’

To my mind, this shows that chemistry cannot be coded out of biology. The genetic code that seems to make codons mere proxies for amino acids, and which thereby reduces particular chemical structures to logical symbols and bits, is not enough.

There’s a small reminder of this nuance to the genetic code in a spectacular tour de force of synthetic biology recently reported by Jason Chin at Cambridge University and his colleagues.3 Using DNA synthesis, they have rebuilt the entire 4-million-base-pair genome of Escherichia coli in which two of the six codons for serine and one of the three stop codons were replaced with synonyms, so that the genome uses just 61 codons rather than 64. They introduced this synthetic genome, with about 18,000 instances of altered codons, into E. coli cells to create a new strain that they call Syn61.
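The logic of such a recoding can be sketched in a few lines of Python. This is a toy illustration, not the group’s actual pipeline: the particular swaps in RECODING (TCG to AGC, TCA to AGT, TAG to TAA) are assumed here for the sake of the example, the function names are hypothetical, and the input is a single made-up reading frame rather than a genome.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed swaps for illustration: two serine codons and one stop codon,
# each replaced by a synonym that encodes the same thing.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def split_codons(seq):
    """Split an in-frame coding sequence into consecutive triplets."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(seq))


def recode(seq):
    """Replace every targeted codon with its chosen synonym."""
    return "".join(RECODING.get(codon, codon) for codon in split_codons(seq))


# A made-up reading frame: Met, Ser, Ser, Ser, stop.
original = "ATGTCGTCAAGCTAG"
recoded = recode(original)

remaining_codons = set(CODON_TABLE) - set(RECODING)

print(original, "->", recoded)
print(translate(original), "->", translate(recoded))
print(len(remaining_codons), "codons still in use")
```

On paper the protein sequence is untouched, which is exactly why any change in growth rate or cell shape is interesting. A real genome is far less tidy than this single frame: genes can overlap, and the same stretch of DNA may carry regulatory signals that a codon-by-codon swap would disturb.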

While the Syn61 bacteria were viable, they replicated a little more slowly than the wild type, and were slightly longer on average. To these cells, nominally synonymous codons are evidently not entirely equivalent, although the differences are small.

Yet this is no surprise. It takes nothing away from the stunning achievement to say that the result is pretty much what one would have expected from previous, more modest efforts to recode codons,4–6 namely that it only slightly impairs fitness, if at all. Such genome-compressed organisms may be entirely viable, and there are few differences in the complement of proteins they produce. It’s like compressing files or audio: there’s a slight degradation in quality, but not so that it really matters.

On the one hand, Chin’s work has a similar goal to earlier efforts to rewrite and simplify genomes:7 to find a minimal basis for viability, a stripped-down and simplified chassis on which synthetic biologists can more readily exercise principles of rational design. At the same time, the work takes another step towards making ‘orthogonal’ living systems that, with a slightly non-natural chemical basis, might coexist with other organisms without interfering with them and which are more amenable to independent manipulation and control. Among other things, that could offer a potential safety measure for these attempts to rewrite life. The space that codon compression ‘frees up’ in Syn61 might be used to code for non-natural amino acids, giving these organisms a different biochemical basis to natural ones. Already Chin’s group has shown that replacing the transfer RNA of one of the edited-out codons with one that carries a non-natural amino acid is toxic for wild-type E. coli (which incorporates the amino acid in the resulting proteins) but harmless to Syn61 (which doesn’t).

On the other hand, the work is asking a more fundamental question: what are the ultimate constraints on life? To what extent are seemingly foundational principles like the genetic code or the choice of DNA’s nucleotide bases essential or contingent? And that again comes down to the question: how much do the chemical specifics matter? The implications for the origin of life on Earth, and possibly on other worlds, could be profound.