Saturday, October 19, 2013

Ripples from 454's Shutdown Announcement

Roche's announcement this week that they plan to shut down the 454 sequencing business in mid-2016 was not completely unexpected, as a number of rumors of a shutdown had shown up on Twitter. Most tweets on the subject fell into two categories: either just-the-facts-ma'am or jokes about the dominant error profile (which I guess you could call just the facts maaa'aaam). But I certainly wouldn't have thought Roche on the verge of this decision when I went to AGBT 2013 in February, as 454 had a huge suite in a prime location (just by the main conference hall entrance) and many expensive events. Now Roche's presence in the genomics space looks like it will amount to little more than the recently announced deal with PacBio to market human diagnostics on that platform.

Regular readers of this space will have picked up that I wasn't fond of the 454 platform, perhaps jaded by the misfire of my first experiment on it. But I do appreciate that 454 was a trailblazer, the first commercially successful post-Sanger DNA sequencing system. In my view, 454 followed a nearly classic disruptive technology track. When first introduced, 454 was radically different from the dominant (heck, the only available) sequencing technology: a vastly more expensive instrument that required a huge financial commitment for each run, and each run generated short (at most 100 bp) reads of inferior quality. The difference, of course, is that 454 delivered piles and piles of these reads. One memorable early 454 publication for me was the application to sequencing Neanderthal DNA; since ancient DNA is already heavily fragmented, the short reads of 454 were at no disadvantage in this setting. That was part of the thinking it took for 454 to succeed: find the applications in which the new technology's advantages were key.

It is also worth emphasizing the "first successful commercial" bit; at least two different possible competitors to 454 failed to get to commercialization. I was approached in 2000 about a nascent start-up to build a sequencer based on George Church's polony technology; around the same time Manteia was trying to develop a similar approach. Helicos demonstrated that getting a machine on the market was not synonymous with commercial success, folding after selling only a small number of machines.

However, success for 454 was not to last. The inheritor of both the Church and Manteia intellectual property was Solexa, which would in turn be acquired by Illumina. By seeding the new thinking around sequencing, 454 helped clear the obstacles for those that followed. Illumina had very different operating characteristics, so many programs written for one platform weren't terribly useful on the other. Still, the new platforms stimulated a panoply of new software tools and experimental approaches that cross-fertilized.

I've been predicting the compression of 454's market for some time. Every time Illumina or Ion Torrent increased their read length, a bunch of 454's markets were nibbled away. 454 had two key advantages, fast sequencing and relatively long read lengths, but against these were a high per-base cost and a high up-front purchase price. Nearly all of 454's performance improvements -- which also translate into cost-per-base improvements -- came from increasing read lengths, taking the original 100bp reads to 1Kb; a 10X improvement ain't shabby. But Solexa/Illumina originally thought they would be stuck with 25 basepair reads, and the platform now supports 300 basepairs (per read, in a paired-end arrangement). MiSeq isn't as fast as 454, but it is quite quick, and that is good enough for many applications. Furthermore, Illumina pushed the density of their system further and further, dropping the cost per base by astounding amounts. Lex Nederbragt has a nifty plot of read length vs. output for the various platforms which illustrates this: MiSeq at launch exceeded the output of the original Solexa box, despite costing far less time and money for a single run.

Ion Torrent hasn't achieved the same read lengths as 454, but its fast turnaround means it can compete with 454 in many markets in which speed is a priority. Vastly cheaper, Ion Torrent was probably a key reason why 454's Jr model didn't take off; the little system's price was significantly higher, and it didn't support the very long read chemistry of its big brother. Sure, there were some applications that Ion couldn't support, but particularly once reads got over 200 basepairs, the important FFPE market (FFPE DNA, like ancient DNA, is highly fragmented) could be addressed.

A question I've never seen explored much, and don't personally feel competent to comment on, is why 454 never achieved higher densities. Was this an inherent problem with the 454 system -- further miniaturization causing signal-to-noise problems -- or was it simply a case of Roche failing to invest in the necessary technology development? It's a pity; higher density might have made 454 Jr. a serious competitor to other desktop systems.

In the other direction, Pacific Biosciences has been squeezing 454 in applications requiring long reads. My second (of two) 454 experiments was in de novo bacterial genome assembly, and it did yield some gain over pure Illumina -- but just after I got that dataset I got my first PacBio dataset with HGAP correction, and it was game over for any other current platform in that application. Amplicon sequencing of ribosomal RNA genes was a strong application for 454, and one on which PacBio is now making inroads -- especially since 454 reads were never long enough to fully read Sanger-era rRNA amplicons.

Now that 454 is bailing out, I think there are some important opportunities or decisions facing several groups, which I will lay out below.

First, the remaining platform vendors can be expected to scramble for 454's remaining business. PacBio has the most to gain, since they are still clawing out successful niches to propel their business forward. PacBio can also potentially support very rapid turnaround. The catch is getting customers comfortable with a sequencing-as-a-service model, since the upfront cost of the machine is so high. There is no shortage of labs offering good PacBio services; the challenge is getting people out of the "I need my own box" mentality.

A second group with an opportunity are software developers. Many specialized programs were developed for 454 to support the fields in which it found favor, particularly tools to clean up rRNA amplicon reads. Similar tools have been slowly appearing for Illumina, but getting them for all the platforms and building user confidence in them is key. I'll repeat advice I've given in this space before: it would behoove the companies, particularly underdogs such as PacBio, to publicly release large datasets for these applications, and to do so with great haste.

The most obvious group faced with a decision is anyone who was relying on 454 for their research program. In a conversation at AGBT, I asked some other sequencing aficionados why 454 still had legs, and the consensus was that many PIs were reluctant to switch platforms in mid-study. Unless you are truly planning to end a study in 2016 or earlier, that stark choice is now forced. If I were in that situation, I wouldn't be thinking "how do I switch in 2016", but rather "how soon can I bail out". Switching platforms -- and concomitantly switching error profiles, and perhaps even having to change amplicon designs -- will require development time and generating comparative data. But biting the bullet now will mean an earlier transition to the many advantages the surviving platforms offer, and it will minimize the risk that the switch-over period encroaches on the shutdown.

If investigators start switching earlier, could 454's business plummet? At the risk of sounding like a blockbusting real estate agent, I think that is a very real possibility. If that should happen, would Roche stick out the business to the announced date, or end up moving the shutdown earlier? Closing early would further damage Roche's already poor reputation in the genomics world, but would they be willing to stomach the costs of staying open? Or perhaps the costs of just continuing to manufacture kits and maintain existing boxes, without building any new ones, would be minimal enough not to be an issue? Stay tuned.

6 comments:

Rick said...

Leaving aside the Ion Torrent and MiSeq -- both of which can be lab-level instruments -- I suspect most PIs are already subscribing to a "sequencing-as-a-service" model, but at a local level. The customers using our HiSeq, for example, aren't allowed to "touch the machine". We provide a service to them, but they could be sending their samples to anyone anywhere.

There is a mentality of "the university must have a sequencing center" (sort of like your point behind the "I need my own box" mentality), similar to "the university must have a supercomputing center". There is a lot to be said for having local people you can talk to doing your sequencing work. However, this mentality may also fade over time.

James@cancer said...

That was a great summary Keith. A big change for me was Illumina's announcement at AGBT 2011 or '12 of the 600bp reads on MiSeq. It was clear that from then on Illumina's read length would keep increasing. I suspect 1000bp is achievable and longer may be possible. Once we get 1B 1000bp reads, is there any need for anything else?

Keith Robison said...

James:

1000bp reads are great -- but for some de novo assembly problems they are shockingly short. For the good stuff in the bugs I work on, 4Kb sometimes isn't long enough!

Anonymous said...

The quantum yield of bioluminescence is quite high, 40-90% depending on whom you believe (http://www.nature.com/nphoton/journal/v2/n1/full/nphoton.2007.251.html). However, the problem is likely that of a 1-photon emitter, i.e., every strand produces ~0.5 photons and, unlike photoluminescence, cannot be regenerated. With a typical Illumina cluster size of ~1000 templates you would get ~500 photons emitted in all directions. A high-NA optical system needed to resolve the clusters would collect at most a few percent of that, so you are looking at tens of photons, not enough to overcome shot noise. This is just speculation, possibly ill-informed.
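
To put that hand-waving into numbers, here is a minimal sketch of the photon budget, taking every figure above as an assumption rather than a measurement:

    # Back-of-envelope photon budget for a bioluminescent 1-photon emitter
    # (all numbers are assumptions from the comment above, not measurements)
    templates_per_cluster = 1000    # assumed typical cluster size
    photons_per_strand = 0.5        # ~50% quantum yield, one photon per strand
    collection_efficiency = 0.03    # "at most a few percent" for a high-NA objective

    emitted = templates_per_cluster * photons_per_strand    # ~500 photons, all directions
    collected = emitted * collection_efficiency             # ~15 photons at the detector
    snr = collected ** 0.5          # shot-noise-limited SNR is sqrt(N), here ~4
    print(f"~{collected:.0f} photons collected, shot-noise SNR ~{snr:.1f}")

An SNR of ~4 from a single, unrepeatable flash is marginal at best, consistent with the speculation that further miniaturization would run into signal-to-noise trouble.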

Anonymous said...

Quick 454 runs are a fallacy. For one scientist to go through the whole 454 workflow takes about 2 days of full-on bench work (emPCR setup & dispense, emPCR, emulsion breaking, bead enrichment, 454 pre-wash, sequencing reagent prep, PTP loading), plus a 10 hour run. Alternatively, a MiSeq run is 30 minutes of hands-on and then 27 hours of furiously refreshing BaseSpace until the run has finished.
Of course, the 454 run is quicker than the HiSeq, but the HiSeq produces significantly more data at a faster rate (HiSeq >1Gb per hour, 454 ~50Mb per hour).
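
As a rough sketch of that arithmetic (treating the "2 days" of 454 bench work as ~16 working hours, which is an assumption):

    # Sample-to-data turnaround using the figures quoted above
    turnaround_hours = {
        "454":   16 + 10,    # ~2 days of bench work plus a 10 hour run
        "MiSeq": 0.5 + 27,   # ~30 min hands-on plus a ~27 hour run
    }
    for platform, hours in turnaround_hours.items():
        print(f"{platform}: ~{hours:.0f} h sample-to-data")

Both land in the same ~26-28 hour range, which is the point: the 454 run itself is short, but the workflow around it is not.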

Brian Krueger said...

This is in reply to Rick's comment about every university wanting to build a supercomputing center. I have been pushing to get our center to look at cloud-based solutions for data analysis, but there are two huge hurdles. One is the speed of data transfer to the cloud, which is a huge hindrance if you don't have a 10GigE line. The other big problem we've encountered is that a lot of the work we do is now considered protected health information, and a lot of the big guns in the offsite cloud storage space won't sign a Business Associates Agreement (BAA), which basically says they're liable for any breaches that result in the release of PHI. So for those two reasons we're FORCED into keeping all of the computing analysis local.