John Timmer has started a cool series of posts on sequencing technology over at Ars Technica. The writing seems like it would be quite accessible to someone without much biology or chemistry background. Sunday’s post is focused on Sanger sequencing, which is the classical technique and is still used by people working with one or a few genes at a time. There’s a whole new set of technologies that are now (or soon will be) used to tackle large-scale projects like sequencing whole genomes, and I assume that’s what he will talk about in part 2.
The principles behind Sanger sequencing are quite old, but it’s a great example of the huge difference optimization and specialization can make. Back in the day, grad students and technicians poured their own gels, ran their own reactions with radioactive reagents, and then ran the reactions out on those gels and interpreted a couple of hundred base pairs of DNA sequence from the pattern of bands that appeared.
By the time I first needed to sequence something, I put my DNA in a little test tube (microfuge tube) with a little bit of the same DNA primer I’d used to amplify my sequence and walked it down two stories and across the hallway to drop it off in the sequencing lab. The technology had come so far that a single technician loaded dozens of samples, dropped off from all over campus, into a giant machine, and a few hours later sequence data, with reads 4-5 times longer than back in the bad old days, was e-mailed back to all the researchers whose samples had been in that run. No radioactivity. No problems with gels and, most importantly, far fewer hours spent by researchers.
The ABI 3700 was a sequencing machine that represented the peak of those trends. It could sequence 96 samples at once and run up to eight times per day. Assuming 1 kb of sequence per sample, which is about the maximum for Sanger sequencing, that means each machine could produce ~750 kilobases* of sequence data per day.
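For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The numbers are just the round figures quoted in this post (96 samples, eight runs, 1 kb per sample), not an official spec for the machine, and the function and variable names are made up for illustration.

```python
# Back-of-the-envelope daily throughput for one ABI 3700.
# All three figures are the post's round numbers, not vendor specifications.
SAMPLES_PER_RUN = 96      # samples sequenced at once
RUNS_PER_DAY = 8          # upper bound on runs per day
BASES_PER_READ = 1_000    # assumed 1 kb of sequence per sample


def daily_bases(samples_per_run, runs_per_day, bases_per_read):
    """Return the number of bases one machine produces per day."""
    return samples_per_run * runs_per_day * bases_per_read


abi_3700_bases = daily_bases(SAMPLES_PER_RUN, RUNS_PER_DAY, BASES_PER_READ)
print(f"{abi_3700_bases / 1_000:.0f} kb per day")  # prints "768 kb per day"
```

The exact product is 768 kb, which is where the rounded ~750 kilobases figure comes from.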
Two of the new sequencing technologies you may have heard about are 454 sequencing and Solexa sequencing. A single machine using the 454 sequencing technique can generate as much sequence per day as 1,300 of the ABI 3700 machines. A Solexa sequencer can generate four times as much sequence as a 454 sequencer: four billion individual A’s, T’s, C’s or G’s. The only downsides are shorter read lengths (somewhat shorter for 454, and much shorter for Solexa), and the fact that Sanger sequencing is the only technology that can start at a specified point on the DNA molecule (specified by a primer).
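The comparison can be worked through the same way. This is a rough sketch that simply chains the ratios quoted in this post (1,300 ABI 3700s per 454 machine, and four times that for Solexa); the constant names are hypothetical and the results are only as good as those round numbers.

```python
# Rough daily throughput comparison, chaining the ratios quoted in the post.
# These are the post's round numbers, not measured or vendor figures.
ABI_3700_BASES_PER_DAY = 96 * 8 * 1_000   # 96 samples x 8 runs x 1 kb each
ABI_MACHINES_PER_454 = 1_300              # one 454 machine ~ 1,300 ABI 3700s
SOLEXA_VS_454 = 4                         # one Solexa machine ~ 4x a 454

bases_454 = ABI_3700_BASES_PER_DAY * ABI_MACHINES_PER_454
bases_solexa = bases_454 * SOLEXA_VS_454

print(f"454:    {bases_454 / 1e9:.2f} billion bases per day")     # 1.00
print(f"Solexa: {bases_solexa / 1e9:.2f} billion bases per day")  # 3.99
```

So a 454 machine comes out at roughly one billion bases per day and a Solexa machine at roughly four billion, which matches the figure above.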
Very cool technologies, and when Dr. Timmer posts part two, which addresses these new technologies, I’ll be sure to link to that one as well.
If you’re interested in how the new sequencing technologies stack up against each other, PolITgenomics has a great reference chart.
*A kilobase is one thousand A’s, T’s, C’s or G’s of DNA.