EST sequencing and gene expression profiling in Scutellaria baicalensis

Abstract

Scutellaria baicalensis is an important medicinal plant but, as with other non-model plants, few genomic resources are available for it. One of the major new directions in genome research is to discover the full spectrum of genes transcribed from the whole genome. Here, we report extensive transcriptome data for the early growth stage of S. baicalensis. The transcriptome consensus sequence was constructed by de novo assembly of shotgun sequencing data obtained using multiple next-generation DNA sequencing (NGS) platforms (Roche/454 GS_FLX+ and Illumina/Solexa HiSeq2000). We show that this new approach to obtaining extensive mRNA sequence data is an efficient strategy for genome-wide transcriptome analysis. We obtained 1,226,938 and 161,417,646 reads using the GS_FLX and the Illumina/Solexa HiSeq2000, respectively. De novo assembly of the high-quality GS_FLX and Illumina reads (95% and 75%, respectively) resulted in more than 82 Mb of mRNA consensus sequence, which we assembled into 51,188 contigs of at least 500 bp each. Of these contigs, 39,581 contained known genes, as determined by BLASTX searches against the NCBI non-redundant database. Of these, 20,498 different genes were expressed during the early growth stage of S. baicalensis. We have made the expressed sequences available in a public database. Our results demonstrate the utility of combining NGS technologies as a basis for developing genomic tools in non-model medicinal plant species. Knowledge of all described genes and quantitation of the expressed genes, including the transcription factors involved, will be useful in studies of the biology of S. baicalensis gene regulation.
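The contig summary reported in the abstract (number of contigs at or above a 500 bp cutoff and their combined length) can be illustrated with a short sketch. This is only a minimal illustration in Python, not the pipeline used in the study: the function names `read_fasta` and `summarize_contigs` and the toy sequences are hypothetical, and the only value taken from the text is the 500 bp minimum contig length.

```python
from io import StringIO

MIN_CONTIG_LEN = 500  # minimum contig length stated in the abstract


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize_contigs(handle, min_len=MIN_CONTIG_LEN):
    """Count contigs of at least min_len bases and total their lengths."""
    lengths = [len(seq) for _, seq in read_fasta(handle) if len(seq) >= min_len]
    total = sum(lengths)
    mean = total / len(lengths) if lengths else 0.0
    return {"n_contigs": len(lengths), "total_bp": total, "mean_bp": mean}


if __name__ == "__main__":
    # Toy input (hypothetical): three contigs of 600, 499 and 500 bases.
    toy = StringIO(
        ">contig1\n" + "A" * 600 + "\n"
        ">contig2\n" + "C" * 499 + "\n"
        ">contig3\n" + "G" * 500 + "\n"
    )
    print(summarize_contigs(toy))
```

On a real assembly, the same function could be given an open FASTA file handle in place of the in-memory toy input; the figures in the abstract (51,188 contigs, more than 82 Mb) correspond to the `n_contigs` and `total_bp` fields of such a summary.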

Keywords

de novo assembly, expression profiling, next generation sequencing, Scutellaria baicalensis

Citation