I’ve now been a part of the publication of three genomes, all grasses. One as a grad student (Brachypodium distachyon). One as a postdoc (Dichanthelium oligosanthes). And now one as a PI (Panicum miliaceum). Each species was sequenced for a different reason. Brachypodium was intended to be a genetic model, selected because it belongs to the same part of the grass family as wheat, barley, rye, and oats, but has a genome that is 1-2 orders of magnitude smaller. Dichanthelium was a comparative grade genome, picked because it sits between two groups of C4 grasses with sequenced genomes (maize and sorghum on one side, foxtail millet and pearl millet on the other) yet still uses C3 photosynthesis, the ancestral state. Panicum miliaceum (proso millet or broomcorn millet) was sequenced because it’s an actual crop people grow on some of the driest cultivated land in the world (like Inner Mongolia and western Nebraska), and having a reference genome sequence really does help with things like genomic selection, marker assisted selection, and QTL mapping. And each was sequenced using completely different technologies: Sanger sequencing (Brachypodium), Illumina short reads and mate pairs, or “next gen sequencing” (Dichanthelium), and PacBio long reads combined with Hi-C, or “third gen sequencing” (proso millet). PacBio assemblies are SO MUCH BETTER than what we could manage with Illumina + mate pairs (I realize this is not news to most of you, but it’s one thing to hear it and another to see it for yourself).
If I’ve learned one thing from these three experiences, it is that it makes sense to work together with a whole team of people to put together a genome. On the Dichanthelium genome project I was mostly working with a single other postdoc who also thought the potential for comparative genomics/biology of the species was cool. In retrospect we bit off way more than we could chew, and were lucky to make it across the finish line to a paper. For both proso millet and Brachypodium, I had the joy of working with big teams of people, including folks whose whole job was genome assembly and annotation, and they were really REALLY good at it.
So what can I tell you about proso millet? It produces grain more efficiently per unit of water transpired than any other grain crop studied. It can produce grain in fewer days than any other crop I’ve worked with (some varieties are ready for harvest 50-60 days after planting!). It’s an allotetraploid, although so far we’ve only found a diploid lineage related to one of its subgenomes, not the other. One early approach we tried (see Ott et al. below) was to take a technology designed to separate and phase the haplotypes of a diploid human and use it to separate and phase the two subgenomes of an inbred tetraploid individual of proso millet. I’ve actually met farmers in both China and the USA who grow the crop, which is a really nice feeling. With one of my private sector hats on, I’ll get to use this genome to try to make higher yielding varieties of proso millet for those exact farmers. With my main public sector hat on, I’m excited to have a model for NAD-ME C4 photosynthesis that is easier to germinate, grow, and propagate than Panicum hallii or Panicum virgatum. There is nothing like working with wild grasses to make you appreciate the work all of our ancestors did to select against seed dormancy and photoperiod sensitivity while they were domesticating crops from wild species over dozens or hundreds of generations.
Zou C, Miki D, Li D, Tang Q, Xiao L, Rajput S, Deng P, Peng L, Huang R, Zhang M, Sun Y, Hu J, Fu X, Schnable PS, Li F, Zhang H, Feng B, Zhu X, Liu R, Schnable JC, Zhu JK, Zhang H. (2019) “The genome of broomcorn millet.” Nature Communications doi: 10.1038/s41467-019-08409-5
Ott A, Schnable JC, Yeh CT, Wu L, Liu C, Hu HC, Dolgard CL, Sarkar S, Schnable PS. (2018) “Linked read technology for assembling large complex and polyploid genomes.” BMC Genomics doi: 10.1186/s12864-018-5040-z