Discussion and news about the modern effort to understand the nature of life on Earth, finding planets around other stars, and the search for life elsewhere in the universe

Saturday, August 21, 2010

Protein universe

A couple months ago a rather stunning paper slipped into the journal Nature. It presents a sophisticated investigation of how quickly the genetic codes for proteins are evolving - across the 3.5 billion years or so of life on Earth.

The idea is that the specific genetic codes, the sequences of amino acids, that describe proteins - the workhorses of molecular biology - cannot withstand big changes, otherwise the complex structures formed will just not do their job properly. Swap in a different amino acid somewhere in a chain of a thousand and you'll no longer fold this big molecule up into the right shape, and you've gone from a 1/2 inch wrench to a pair of tweezers.

Over time though small changes can, and do, occur. As long as the final outcome permits the same job to be done by the protein, all is well. Now, let's suppose that a whole clutch of modern organisms share a common ancestor (something we've touched on before in these pages). We should be able to see just how different the protein coding has become since that time, and we should be able to tell whether this type of gentle evolution has stopped or not.

Povolotskaya and Kondrashov apply to proteins exactly the same methodology that Edwin Hubble did to the measurement of the expansion of the universe. They look to see how fast the coding is changing as a function of how different those proteins are - just as Hubble looked at recession velocities versus the physical distance between galaxies. What they find is that after about 3.5 billion years the protein universe here on Earth is still, slowly, diverging and expanding - it's not yet reached a true optimal state. They also point out that while 98% of locations in a protein sequence can't deal with quick tampering (change an amino acid there and the whole thing ceases to work), over billions of years you could more or less re-write the code for a given protein and get the same function. That's a bit like changing Hamlet by one word every new print run, until you have a totally different script, but the same outcome.

So, what does all this say about the nature of life in the universe? These proteins play roles in things like metabolic processes that have remained unaltered for billions of years - solutions for how life extracts energy from its environment that are pretty close to optimal. Yet here we see a universe of slowly diverging, expanding, molecular structures - the very fabric of the biological cosmos on Earth. To my mind this might present a huge challenge to the notion of convergent evolution - the idea that there are a limited number of molecular or physiological solutions that life can use. Take a different planet, with a biosphere a couple billion years old. The stately evolution of its protein universe would almost certainly have taken a path unlike that here, exploring this vast multi-parameter space of molecular structures along alien paths. It supports the notion of life both as a potentially extraordinarily robust phenomenon and as a hugely diverse one.


kurt9 said...

This is quite the opposite of Simon Conway Morris's argument that the functional design range of proteins is quite limited and that the diversity of Earth's life likely represents the maximum range of possible diversity of life.

Caleb Scharf said...

Right. Now it may be that, although protein coding is still diverging on Earth, all the functional structures have been 'found' - i.e. at this point the changes are just meanderings around the various solutions. On the other hand I think Conway Morris's idea rests on the protein universe reaching a steady state (to use that analogy) - which it doesn't seem to have done on Earth after 3.5 Gyr, and that seems to allow for the possibility of further diversity.

Eniac said...

There are roughly 20^N possible amino acid sequences of length N (or smaller). This is a VERY large number, and a random walk through them will for all practical purposes never cover an appreciable fraction. Even if you/they were right and 98% of changes are forbidden in all proteins (doesn't sound right to me), it would still be (20^N)/50, with identical conclusions. 3.5 Gyr is nothing in comparison. Not even billions of billions of organisms on billions of billions of planets can take a bite out of that number.
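To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It takes the reading above that only 1 in 50 sequences is allowed, takes "billions of billions" to mean 10^18, and assumes (purely for illustration) that every organism tries one new sequence per year for 3.5 Gyr. The function names and the trial rate are hypothetical, not anything from the Nature paper.

```python
import math

AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def log10_sequences(length, allowed_fraction=1.0):
    """log10 of the number of sequences of exactly `length` residues,
    scaled by the fraction of sequences assumed to be allowed."""
    return length * math.log10(AMINO_ACIDS) + math.log10(allowed_fraction)


def log10_trials(organisms=1e18, planets=1e18, years=3.5e9,
                 trials_per_organism_per_year=1.0):
    """log10 of the total number of sequences tried.
    All four defaults are illustrative assumptions."""
    total = organisms * planets * years * trials_per_organism_per_year
    return math.log10(total)


if __name__ == "__main__":
    n = 1000  # chain length, matching the "chain of a thousand" in the post
    print(f"all sequences:     10^{log10_sequences(n):.0f}")
    print(f"allowed (1 in 50): 10^{log10_sequences(n, 1 / 50):.0f}")
    print(f"sequences tried:   10^{log10_trials():.0f}")
```

Working in logarithms keeps the numbers printable. Dividing by 50 only takes about 1.7 off an exponent of roughly 1301, which is why the conclusion does not change.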

Thus, I think the "finding" you describe could not have come out any different, and I wonder if it is really Nature-worthy. Perhaps I misunderstand?

Also, convergent evolution is more of a phenomenon than a "notion". It is real and has been observed many times (see http://en.wikipedia.org/wiki/List_of_examples_of_convergent_evolution). I did not find your definition as "the idea that there are a limited number of molecular or physiological solutions that life can use" in the Wikipedia page you reference, and it strikes me as not completely accurate.

Caleb Scharf said...

Right, as you point out the possible combinations are huge. However I think the point is that natural selection does much, much better than a random walk at finding 'solutions' for sequences that end up folding into the right type of protein structure to accomplish useful/critical biological tasks. The assumption has been that 'ancient' protein codes - shared by organisms - get locked down after a couple billion years of evolution, so this finding in the Nature paper that they are in fact still 'diverging' (i.e. changing between diverse species) is surprising.

I guess my interpretation of convergent evolution is along the Conway Morris lines - i.e. that the *reason* many entirely different species end up looking similar, or using similar biochemical or physiological strategies, is that there really is a deep limit to how many workable solutions there are for biological molecular functions. This is therefore predicated on the idea that these solutions are stable and found - protein divergence (as in the Nature paper) indicates that even on the Earth this point hasn't yet been reached, hence I would say that evolution has *not* yet 'converged' here - which raises the question of how much longer it will take...

Eniac said...

You have many good points. I disagree, though, that protein _sequence_ divergence (as in the Nature paper) says anything about how completely functional space has been explored. In my opinion it only shows that there are (practically infinitely) many sequences for a given function, and we knew that already.

Evidence that the functional space has not been fully explored is the absence in biology of wheels, electric motors, nuclear power, and many other things that required our intellect rather than evolution to come into being.

protein said...

Have any of you ever read anything by Richard Dawkins?