To keep in tune with yesterday's theme of bioinformatics and big data, today I was looking at the growth of our lab, in particular the amount of sequencing performed at our facility.

[Figure: number of sequencing runs at our facility per year]

The HiSeq2500 is Illumina's major workhorse, and perhaps the most versatile and widely used next-generation sequencing platform in the industry. Running this machine can cost between $10,000 and $30,000 per use, depending on the type of sequencing being performed and the "depth" of sequencing we are looking to do. Depth of sequencing typically refers to how many times we look at a particular part of the genome. For most experiments, we like to look at each region about 30 times, which gives us some degree of confidence about the underlying sequence at a given spot.
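To make "depth" a bit more concrete, here is a minimal Python sketch of the usual back-of-the-envelope estimate: average depth is roughly the total number of sequenced bases divided by the size of the genome. The function names, the 100-base read length, and the roughly 3.1 billion base human genome size are illustrative assumptions on my part, not numbers from our runs.

```python
def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Rough average depth: total bases sequenced / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth: int, read_length: int, genome_size: int) -> int:
    """Reads required to reach a target average depth (rounded up)."""
    total_bases = target_depth * genome_size
    # Integer ceiling division, so we never come up one read short.
    return -(-total_bases // read_length)


# Assumed, illustrative values (not taken from our facility's data).
HUMAN_GENOME_BASES = 3_100_000_000  # approximate human genome size
READ_LENGTH = 100                   # bases per read

if __name__ == "__main__":
    n = reads_needed(30, READ_LENGTH, HUMAN_GENOME_BASES)
    print(f"Reads needed for ~30x: {n:,}")
    print(f"Depth from those reads: {mean_depth(n, READ_LENGTH, HUMAN_GENOME_BASES):.1f}x")
```

This is only an average. Real coverage is uneven across the genome, so some regions end up well above the target and others well below it.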

The number of times we run the sequencer has steadily increased over the past few years, though the 2009 time point is a little bit misleading, since we received the instrument in November and were still getting trained (and technically it was the GAIIx sequencer). This steady production of data has led to the data management issue that I brought up yesterday. It has also led us to many cool projects to work on.

More and more, newspapers and articles use catchy headlines about personalized medicine: using a person's underlying DNA sequence to inform a physician about the best treatment path. The more I've worked in the field, the more I truly believe this is rapidly approaching reality. Most articles condense a whole lot of really difficult and technical things into a quick summary, but I believe we are getting closer and closer to widespread adoption of NGS-based treatment paths.

In our lab alone, we are initiating projects to profile hundreds of Western New Yorkers for all sorts of different things, looking at their genome, epigenome (DNA structure), and microbiome (bacterial profile). This information, while insanely difficult to process and understand, will give us fantastic insights into disease development.

The more we sequence, the more we learn and the more we can improve our algorithms and data analysis pipelines. The growth has been exceptional, and I am excited to see the continued adoption of this technology.