I am very happy that our
paper on agent-based modeling of global ocean microbe evolution is now
published. This study shows that neutral evolution and dispersal limitation can
lead to substantial biogeography in ocean microbe populations. In a nutshell: Microbes
evolve faster than the ocean circulation can mix them. I expect these results
will receive a lot of attention, and I look forward to the discussion. In this
post I want to highlight another aspect of the paper I am excited about,
focused more on modeling technology.
You know, when I started
my research program here at Northeastern about 10 years ago, I set out to
develop models that are consistent with the quantity and types of observations
we generate today. So I went to the literature and started looking at what
people are measuring. I ran across papers by Ed DeLong and Craig Venter that
presented metagenomics observations. My initial reaction of excitement quickly
turned to despair when I realized that the quantity of information generated by
modern observational
tools now greatly surpasses what we can get out of models. In this model we
simulate individual microbes, each with a full 1 Mbp genome. That approach
constitutes one possible direction towards closing this gap. In some ways, our
model application turns the tables again, at least in terms of information
quantity. For example, one of our simulations (Fig. 1B in the paper, “start
uniform” simulation) includes 2.9e8 mutations and thus unique genomes, for a
total of 290 Tbp, which is far larger than metagenomics datasets (e.g. GOS has
6.3 Gbp) or what is currently in GenBank (160 Gbp, Feb. 2014). I have an opening for an undergraduate research
assistant to upload this data to GenBank (joke).
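For readers who want to check the arithmetic behind that comparison, here is a minimal sketch in Python. It assumes that every unique genome is counted at its full 1 Mbp length (no compression and no storing of differences only); the variable names are made up for illustration and are not taken from the model code.

```python
# Back-of-the-envelope check of the data volumes quoted above.
# Assumption: each unique genome is counted at full length (1 Mbp).

GENOME_LENGTH_BP = 1_000_000        # 1 Mbp per simulated genome
UNIQUE_GENOMES = 290_000_000        # 2.9e8 mutations, each giving a unique genome
GOS_BP = 6_300_000_000              # GOS dataset, 6.3 Gbp
GENBANK_BP = 160_000_000_000        # GenBank, 160 Gbp (Feb. 2014)

total_bp = GENOME_LENGTH_BP * UNIQUE_GENOMES
total_tbp = total_bp / 10**12       # convert bp to Tbp

ratio_gos = total_bp / GOS_BP
ratio_genbank = total_bp / GENBANK_BP

print(f"Simulated sequence: {total_tbp:.0f} Tbp")
print(f"Relative to GOS: about {ratio_gos:,.0f} times larger")
print(f"Relative to GenBank: about {ratio_genbank:,.0f} times larger")
```

Under that assumption the simulation holds roughly 46,000 times the sequence in GOS and roughly 1,800 times what was in GenBank at the time.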
Here are links to the
paper, a perspective article, some news features and an animation (which has
been approved for going viral).