Tuesday, April 25, 2006

Big science: Regulating the master regulator genes

When I was a child of 19 or so, I helped write Fortran code to simulate the braking system of an ancient freight train. These pneumatic systems were not entirely designed; they evolved over time. A large amount of lost expertise was used to create a kludged system that would reliably activate at roughly the same time over a long train despite large signal latencies. Analysis showed that some of the pneumatic subsystems merely counteracted others, but the emergent behavior was reliable.

I don't recall, at the time, realizing that I was working on a metaphor that was widely applicable to biology, economics, software, and all other complex evolved systems. I think of those lessons again as I read about the very reliable and fantastically arcane bio-nano-machinery that controls the master regulator genes:
Studies Find Elusive Key to Cell Fate in Embryo - New York Times

A question of interest for biologists studying cell identity is what regulates the master regulator genes. The answer has long been assumed to lie in the chromatin, which determines which genes are accessible to the cell and which are excluded. The chromatin consists essentially of millions of miniature protein spools around each of which the DNA strand is looped some one and a half times.

The spools, however, are not mere packaging. They can lock up the DNA they are carrying so that it is inaccessible.

Or they can unwind a little, so that the strand becomes accessible to the transcription factors seeking to copy a gene on the DNA and generate the protein it specifies.

... there are protein complexes — essentially sophisticated cellular machines — that travel along the chromosome and mark the spools with chemical tags placed at various sites on the spool.

A complex known as polycomb ... tags spools at a site called K27.

This is a signal for another set of proteins to make the spools wrap DNA tight and keep it inaccessible.

Another complex tags spools at their K4 site, which has the opposite effect of making them loosen their hold on the DNA.

The chromosomes of the body's mature cells are known to have long stretches of K27-tagged spools, where genes are off limits, and other regions where the spools are tagged on K4, allowing the cell to activate the local genes.

... In the current issue of Cell, a team led by Bradley E. Bernstein and Eric S. Lander reports that they looked at the chromatin covering the regions where the master regulator genes are sited.

They found to their surprise that these stretches of chromatin carried both kinds of tags, as if the underlying genes were being simultaneously silenced and readied for action.

... Each cell must avoid being committed to any particular fate for the time being, so all its master regulator genes must be repressed by tight winding of the spools that hold their DNA. But the cell must be ready at any moment to activate one specific master regulator as soon as its fate is determined.

The Broad team then looked at the chromatin state of the master regulator genes in several kinds of mature cell.

... they found that the bivalent domains had resolved into carrying just one type of mark, mostly the K27 tag, indicating the master genes there were permanently repressed.

But in each kind of mature cell one or more of the domains had switched over to carrying just the K4 tags, within which genes would be active.

... Dr. Bernstein's team worked with mouse cells, but its findings have been confirmed in human embryonic stem cells by Tong Ihn Lee and Richard A. Young of the Whitehead Institute...

The new findings raise the question of how the embryonic cell knows where on its chromosomes the bivalent domains should be established. Dr. Bernstein and Dr. Lander believe that the answer lies in the structure of the DNA itself.

The bivalent domains occur at regions on the chromosome where some of the DNA sequence is highly conserved ...

... These particular sequences, however, do not contain genes, so must be conserved for some other reason.

The highly conserved non-gene sequences were first detected in the dog genome, which was decoded last year. It was in trying to figure out what these regions did that the Broad team stumbled across the bivalent domains.

Although only half of the highly conserved regions contain master regulator genes, something in their DNA structure may be the signal that tells the cell where to create the bivalent domains.

... Dr. Young's team has studied another aspect of embryonic stem cells which ties into the new finding about bivalent domains. Three genes, known as oct4, sox2 and nanog, are known to be particularly active in the cells and are regarded as a hallmark of the embryonic state.

Dr. Young showed last year that the genes make transcription factors that act on each other's control sites in ways that in effect form a circuitry for controlling the master regulator genes.

He has now found that these transcription factors bind at many of the bivalent domains created by the polycomb complex.

... a working definition of cell status may be almost at hand, in Dr. Lander's view, in terms of a cell's chromatin state and the transcription factors that can bind to its available genes.

So, how many of these researchers will end up making the trip to Stockholm? There are a lot of moving parts that are coming together here. The Nobel committee will have its hands full sorting things out. I do think it's cool that one of the fundamental breakthroughs appears to have been an unanticipated side-effect of decoding the dog genome.

Fame aside, note the shape of things. The "master signal" that says "control me here" is the shape of the DNA. In other words, this control signal is topological. Signaling by shape is fundamental to DNA and protein alike; in the world of the cell, topology is about creating a distinctive signature of electromagnetic fields (and perhaps quantum signals too). On the other hand, there is a binary system of "wrap" and "expose" that controls what DNA is read. On the third hand, there's a set of 3 genes for transcription factors that regulate each other's activity -- a configuration that will be familiar to anyone who's studied simple transistors.
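To make the "wrap" and "expose" abstraction concrete, here is a toy sketch in Python of the tagging scheme the article describes. It is only an illustration of the logic as reported (K4 loosens, K27 tightens, both together mean "silenced but poised"), not a biological model. The names ChromatinState, classify, is_expressed, and differentiate are my own inventions, as are the placeholder gene names.

```python
from enum import Enum


class ChromatinState(Enum):
    BIVALENT = "bivalent"    # both tags: silenced for now, but poised
    ACTIVE = "active"        # K4 only: spools loosen, gene can be read
    REPRESSED = "repressed"  # K27 only: spools wrap tight
    UNMARKED = "unmarked"    # neither tag (not discussed in the article)


def classify(tags):
    """Map a set of spool tags ("K4", "K27") to a chromatin state."""
    has_k4 = "K4" in tags
    has_k27 = "K27" in tags
    if has_k4 and has_k27:
        return ChromatinState.BIVALENT
    if has_k4:
        return ChromatinState.ACTIVE
    if has_k27:
        return ChromatinState.REPRESSED
    return ChromatinState.UNMARKED


def is_expressed(tags):
    """Assumption: only a K4-only domain counts as readable in this toy."""
    return classify(tags) is ChromatinState.ACTIVE


def differentiate(domains, chosen):
    """Resolve each bivalent domain to a single tag.

    The chosen master regulator keeps K4; every other bivalent domain
    keeps K27. Domains that are not bivalent are copied unchanged.
    """
    resolved = {}
    for gene, tags in domains.items():
        if classify(tags) is not ChromatinState.BIVALENT:
            resolved[gene] = set(tags)
        elif gene == chosen:
            resolved[gene] = {"K4"}
        else:
            resolved[gene] = {"K27"}
    return resolved


if __name__ == "__main__":
    # Hypothetical stem cell: every master regulator starts out bivalent.
    stem_cell = {
        "regulator_A": {"K4", "K27"},
        "regulator_B": {"K4", "K27"},
        "regulator_C": {"K4", "K27"},
    }
    mature_cell = differentiate(stem_cell, chosen="regulator_B")
    for gene, tags in mature_cell.items():
        print(gene, classify(tags).value)
```

The point of the sketch is how little state is needed: two bits per domain give you "off", "on", and "undecided but ready", which is about what you'd want from a switch that must not commit early.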

If one abstracts the control systems as shape and charge, simple circuits and binary actions, it becomes possible to see how such an emergently complex nano-world could evolve a little bit at a time. I am still waiting, however, for the announcement that biologists have uncovered the DNA equivalent of the LZW-compression algorithm. [Note to software designers -- if you're looking for new compression algorithms, look at how DNA solves this puzzle.]
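For readers who haven't met LZW, here is a minimal textbook-style sketch in Python of what the algorithm does: it builds a dictionary of repeated substrings as it reads, and emits dictionary codes in place of the repeats. It assumes single-byte characters (code points below 256), and the function names lzw_compress and lzw_decompress are just my labels. Nothing here is a claim about how DNA actually stores information.

```python
def lzw_compress(text):
    """Return a list of integer codes for text (characters below 256 only)."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the longest match seen so far.
            current = candidate
        else:
            # Emit the longest known prefix and learn the new string.
            output.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output


def lzw_decompress(codes):
    """Rebuild the original text from a list of LZW codes."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the string being defined.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


if __name__ == "__main__":
    sequence = "GATTACAGATTACAGATTACA"
    codes = lzw_compress(sequence)
    print(len(sequence), "characters ->", len(codes), "codes")
    print(lzw_decompress(codes) == sequence)
```

The appeal of the analogy is that the dictionary is never transmitted; the decoder rebuilds it from the stream itself, one small step at a time.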

This is huge news. The biggest thing I've read for a while.
