Long Non-Coding RNA landscape in cancer
MCTP researchers analyzed the global landscape of a portion of the genome that had not previously been well explored: long non-coding RNAs (lncRNAs). lncRNAs are emerging as important regulators of tissue physiology and disease processes, including cancer. To delineate genome-wide lncRNA expression, the researchers used 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines, comprising over 43 Tb of sequence from 25 independent studies. Applying ab initio assembly methodology to this data set yielded a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 7% (3,900) of the lncRNAs overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, non-parametric differential expression testing was employed, which nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.
In addition, the complete dataset, named the MiTranscriptome compendium, has been made available on a public website, www.mitranscriptome.org, for the scientific community to explore.
The results of this study were published in Nature Genetics.