Multiplex microbiota sampling

Multiplex microbiota profiling allows hundreds of samples to be combined and run on a single Illumina lane. Multiplexing dramatically reduces the per-sample cost, and Illumina sequencing gives much more information per sample, at a lower per-base error rate, than the commonly used pyrosequencing method.

We are offering microbiota profiling as a cost-recovery service because hundreds of samples must be run to make the protocol cost effective.

[Figure: Q score graph showing that the Q score is higher in the overlapped region]

We use paired-end sequencing of PCR products derived from the eubacterial 16S V3, V5 or V6 regions to identify and enumerate the bacteria in the sample. The paired-end sequencing reads are overlapped. As shown in the adjacent figure, overlapping the reads circumvents the higher per-base error rate toward the ends of Illumina reads. In fact, the confidence in the base call, as given by the Q score, is actually higher in the overlapped region. This methodology has two additional advantages. First, chimeric sequences are rare because we amplify and sequence only one variable region. Second, unlike pyrosequencing, the Illumina method rarely introduces indels.
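To make the overlap idea concrete, here is a minimal Python sketch of merging a read pair and of why agreement between two reads raises confidence. It is not the pipeline used for this service: the function names, the exact-match requirement, and the `min_overlap` default are illustrative assumptions, and the agreement formula assumes that the two reads err independently and that a wrong call is equally likely to be any of the three other bases. The only standard fact used is the Phred definition, P = 10^(-Q/10).

```python
# Illustrative sketch only; names and defaults are assumptions, not the
# service's actual software.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def phred_to_error(q: float) -> float:
    """Convert a Phred Q score to an error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def agreed_error(p1: float, p2: float) -> float:
    """Error probability of a base on which both reads agree.

    Assumes independent errors, uniformly spread over the 3 wrong bases.
    """
    both_wrong_same_base = p1 * p2 / 3
    both_right = (1 - p1) * (1 - p2)
    return both_wrong_same_base / (both_right + both_wrong_same_base)


def merge_pair(fwd: str, rev: str, min_overlap: int = 10):
    """Merge a forward read with its paired reverse read.

    The reverse read is reverse-complemented, then the longest exact
    overlap between the end of the forward read and the start of the
    reverse-complemented read is used. Returns None if none is found.
    """
    rc = reverse_complement(rev)
    for k in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        if fwd[-k:] == rc[:k]:
            return fwd + rc[k:]
    return None


# Hypothetical 35-base amplicon read from both ends with 25-base reads.
amplicon = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGA"
forward_read = amplicon[:25]
reverse_read = reverse_complement(amplicon[10:])
merged = merge_pair(forward_read, reverse_read)
```

In this toy case the two reads share 15 bases, and `merged` reconstructs the full amplicon. Under the stated error model, two agreeing Q20 calls give a combined error probability far below that of either call, which is consistent with the higher Q scores seen in the overlapped region.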

Our service

[Figure: Conceptual workflow given in the text]

Multiplexing is a blessing and a curse: hundreds of samples are needed to keep the per-sample cost reasonable. We offer multiplex microbiota amplification suitable for sequencing for $20 per sample. This covers the cost of determining the mix of barcodes required for the proper Illumina run (the exact mix varies with the number of samples; see our papers for the reasoning), amplifying a set of sequences suitable for Illumina library preparation, QC and quantitation of the PCR reactions, and the initial data analysis. If desired, we will include the amplifications on our next available run, or provide you with the purified, amplified sequences for analysis on the Illumina run of your choice.

If done with our provider, library preparation and sequencing on an Illumina HiSeq at TCAG will be invoiced at our direct cost, apportioned according to your samples' contribution to the cost. As an example, for a minimum of 10,000 reads per sample in a run of 350 samples, your cost would be about $10 per sample for sequencing. A V3 run is substantially more expensive because a longer read length is required and because the spot density on the Illumina lanes is lower, so we cannot multiplex as many samples.
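The cost arithmetic can be sketched as follows. This is a rough illustration, not a quote: the $20 amplification fee comes from the text, but the run cost of $3,500 is only inferred from the example of about $10 per sample in a 350-sample run, and splitting the run cost equally per sample is an assumption about how costs are apportioned. The function names are hypothetical.

```python
# Rough cost sketch; figures other than the $20 fee are assumptions.

AMPLIFICATION_FEE = 20.00  # dollars per sample, as stated in the text


def sequencing_share(run_cost: float, your_samples: int, total_samples: int) -> float:
    """Your share of the run cost, assuming an equal split per sample."""
    return run_cost * your_samples / total_samples


def total_cost(your_samples: int, run_cost: float, total_samples: int) -> float:
    """Amplification fee plus your share of library prep and sequencing."""
    amplification = your_samples * AMPLIFICATION_FEE
    return amplification + sequencing_share(run_cost, your_samples, total_samples)


# Assumed run cost, inferred from "about $10/sample" in a 350-sample run.
ASSUMED_RUN_COST = 350 * 10.00

# Example: 48 of the 350 samples on the run are yours.
estimate = total_cost(48, ASSUMED_RUN_COST, 350)
```

With these assumed figures, 48 samples would come to $960 for amplification plus $480 for sequencing. Actual invoices depend on the direct cost of the run and on how many samples share it.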

We will process the reads by overlapping the paired ends into one virtual read that encompasses the entire variable region. The reads will then be grouped into identical sequence units (ISUs), and then into Operational Taxonomic Units (OTUs) at the percent identity of your choice. We will provide you with the reads for your samples along with two tables giving the number of reads per ISU and per OTU in each sample. We will not perform any further downstream analyses; i.e., we will not classify the reads, perform richness estimates, etc. This is to respect your right to analyze your own data.
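As a rough picture of what the two count tables contain, the sketch below groups merged reads into identical sequences and then clusters those into OTUs at a chosen percent identity. It is a simplified stand-in, not the clustering software used for the service: the function names are hypothetical, identity is computed position by position on equal-length sequences without alignment, and the greedy, most-abundant-first seeding is an assumption.

```python
from collections import Counter, defaultdict


def identical_sequence_table(reads_by_sample):
    """Count reads per identical sequence: {sequence: Counter({sample: n})}."""
    table = defaultdict(Counter)
    for sample, reads in reads_by_sample.items():
        for read in reads:
            table[read][sample] += 1
    return dict(table)


def percent_identity(a: str, b: str) -> float:
    """Positional identity for equal-length sequences (no alignment)."""
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def otu_table(seq_table, threshold: float = 97.0):
    """Greedy clustering: the most abundant sequences seed the OTUs."""
    ordered = sorted(seq_table, key=lambda s: (-sum(seq_table[s].values()), s))
    otus = {}
    for seq in ordered:
        for seed in otus:
            if percent_identity(seq, seed) >= threshold:
                otus[seed].update(seq_table[seq])  # add counts to the seed
                break
        else:
            otus[seq] = Counter(seq_table[seq])  # start a new OTU
    return otus


# Toy data: two samples, 10-base "reads", clustered at 90% identity.
reads = {
    "sampleA": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TTTTTTTTTT"],
    "sampleB": ["ACGTACGTAC", "TTTTTTTTTT", "TTTTTTTTTT"],
}
sequences = identical_sequence_table(reads)
otus = otu_table(sequences, threshold=90.0)
```

Here the three distinct sequences collapse into two OTUs, because `ACGTACGTAT` differs from the more abundant `ACGTACGTAC` at one of ten positions. Real variable-region reads differ in length, so production tools cluster on aligned identity rather than on this positional comparison.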

We would be happy to discuss sequence classification and other analyses of your data as a collaboration, or as a further value-added service.

Sample Submission

To submit samples, send an email to the address below.

Data generation and release policy

DNA will be sent for sequencing when enough samples have been accumulated: 500 samples for the V6 region and 300 samples for V3. Please be patient; we need access to a machine that is performing a paired-end sequencing run of the appropriate length. Each run requires at least 3 weeks on the sequencing machine. Data will be released when payment is received.

Questions? Please send email to: g g l o o r at u w o . c a.