I am very happy that our paper on agent-based modeling of global ocean microbe evolution is now published. This study shows that neutral evolution and dispersal limitation can lead to substantial biogeography in ocean microbe populations. In a nutshell: Microbes evolve faster than the ocean circulation can mix them. I expect these results will receive a lot of attention, and I look forward to the discussion. In this post I want to highlight another aspect of the paper I am excited about, focused more on modeling technology.
You know, when I started my research program here at Northeastern about 10 years ago, I set out to develop models that are consistent with the quantity and types of observations we generate today. So I went to the literature and started looking at what people were measuring. I ran across papers by Ed DeLong and Craig Venter that presented metagenomics observations. My initial excitement quickly turned to despair when I realized that the quantity of information generated by modern observational tools now greatly surpasses what we can get out of models. In this model we simulate individual microbes, each with a full 1 Mbp genome. That approach is one possible way to close this gap. In some ways, our model application turns the tables again, at least in terms of information quantity. For example, one of our simulations (Fig. 1B in the paper, “start uniform” simulation) includes 2.9e8 mutations and thus 2.9e8 unique genomes. At 1 Mbp each, that adds up to 290 Tbp, which is far larger than metagenomics datasets (e.g. GOS has 6.3 Gbp) or what is currently in GenBank (160 Gbp, Feb. 2014). I have an opening for an undergraduate research assistant to upload this data to GenBank (joke).
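For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The constant and function names are made up for illustration and are not from the model code. It assumes, as the numbers above do, that every mutation yields one unique genome counted at the full 1 Mbp length.

```python
# Back-of-the-envelope check of the data volumes quoted above.
# All names here are hypothetical; the figures come from the post.

GENOME_LENGTH_BP = 1_000_000   # 1 Mbp per simulated microbe
UNIQUE_GENOMES = 290_000_000   # 2.9e8 mutations, one unique genome each
GOS_BP = 6.3e9                 # GOS metagenomics dataset, 6.3 Gbp
GENBANK_BP = 160e9             # GenBank, 160 Gbp (Feb. 2014)


def total_sequence_bp(n_genomes: int, genome_length_bp: int) -> int:
    """Total base pairs if every unique genome is counted at full length."""
    return n_genomes * genome_length_bp


total_bp = total_sequence_bp(UNIQUE_GENOMES, GENOME_LENGTH_BP)

print(f"model output:     {total_bp / 1e12:.0f} Tbp")
print(f"ratio to GOS:     {total_bp / GOS_BP:,.0f}x")
print(f"ratio to GenBank: {total_bp / GENBANK_BP:,.0f}x")
```

This is a naive count: it treats each genome as stored in full, rather than as a small difference from its parent, so it is an upper bound on the information content rather than a statement about how the model stores genomes.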
Here are links to the paper, a perspective article, some news features and an animation (which has been approved for going viral).