Genome Biology - Most accessed articles
http://genomebiology.com
The most accessed research articles published by Genome Biology
2010-06-02T00:00:00Z
This is an RSS newsfeed from BioMed Central
It is intended to be used with an RSS reader. For more information about RSS newsfeeds from BioMed Central, visit
http://www.biomedcentral.com/info/about/rss/
Screening the human exome: a comparison of whole genome and whole transcriptome sequencing
Background:
There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost effective alternatives are important.
Results:
Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage.
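The comparison described in these results can be illustrated with a minimal sketch. It assumes that the whole-genome calls are treated as the reference set and that each variant is keyed by chromosome, position and alternate allele. The function name `compare_calls` and the toy variants are hypothetical and are not taken from the article.

```python
def compare_calls(wgs_variants, rnaseq_variants):
    """Compare RNA-Seq variant calls against whole-genome calls used as the reference.

    Both arguments are sets of (chromosome, position, alternate_allele) tuples.
    Returns (sensitivity, unconfirmed_fraction).
    """
    shared = wgs_variants & rnaseq_variants
    # Sensitivity: share of whole-genome variants that RNA-Seq also recovered.
    sensitivity = len(shared) / len(wgs_variants) if wgs_variants else 0.0
    # Unconfirmed fraction: share of RNA-Seq calls absent from the whole-genome set.
    # Assumption: such calls are counted as false positives in this sketch.
    unconfirmed = rnaseq_variants - wgs_variants
    unconfirmed_fraction = (
        len(unconfirmed) / len(rnaseq_variants) if rnaseq_variants else 0.0
    )
    return sensitivity, unconfirmed_fraction


# Hypothetical toy data, not real variant calls.
wgs = {("chr1", 100, "A"), ("chr1", 250, "T"), ("chr2", 75, "G"),
       ("chr3", 40, "C"), ("chr3", 90, "T")}
rna = {("chr1", 100, "A"), ("chr2", 75, "G"), ("chr4", 10, "A")}

sens, unconf = compare_calls(wgs, rna)
print(f"sensitivity={sens:.2f} unconfirmed={unconf:.2f}")
```

Restricting both sets to variants in well-expressed genes before calling the function would mirror the stratified comparison reported above.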
Conclusions:
We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
http://genomebiology.com/2010/11/5/R57
Elizabeth Cirulli, Abanish Singh, Kevin Shianna, Dongliang Ge, Jason Smith, Jessica Maia, Erin Heinzen, James Goedert, David Goldstein, The Center for HIV/AIDS Vaccine Immunology (CHAVI)
Genome Biology 2010, 11:R57 (2010-05-28), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-r57

The case for cloud computing in genome informatics
With DNA sequencing now getting cheaper more quickly than data storage or computation, the time may have come for genome informatics to migrate to the cloud.
http://genomebiology.com/2010/11/5/207
Genome Biology 2010, 11:207 (2010-05-05), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-207

A human functional protein interaction network and its application to cancer data analysis
Background:
One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes into protein functional relationship networks. We are building such a pathway-based analysis system.
Results:
We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information including protein-protein interactions, gene coexpression, protein domain interaction, gene ontology annotations and text mined protein interactions, which covers close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers.
Conclusions:
We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases.
http://genomebiology.com/2010/11/5/R53
Guanming Wu, Xin Feng, Lincoln Stein
Genome Biology 2010, 11:R53 (2010-05-19), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-r53

Evidence for natural antisense transcript-mediated inhibition of microRNA function
Background:
MicroRNAs (miRNAs) have the potential to regulate diverse sets of protein targets. In addition, mammalian genomes contain numerous natural antisense transcripts, most of which also appear to be non-protein-coding RNAs (ncRNAs). We have recently identified and characterized a highly conserved non-coding antisense transcript for beta-secretase-1 (BACE1), a critical enzyme in Alzheimer's disease pathophysiology. The BACE1-antisense transcript is markedly up-regulated in brain samples from Alzheimer's disease patients and promotes the stability of the (sense) BACE1 transcript.
Results:
We report here that BACE1-antisense prevents miRNA-induced translational repression of BACE1 mRNA by masking the binding site for miR-485-5p. Indeed, miR-485-5p and BACE1-antisense compete for binding within the same region in the open reading frame of the BACE1 mRNA. We observed opposing effects of BACE1-antisense and miR-485-5p on BACE1 protein in vitro and showed that Locked Nucleic Acid-antimiR mediated knockdown of miR-485-5p as well as BACE1-antisense over-expression can prevent the miRNA-induced translational suppression. The expression of BACE1-antisense as well as miR-485-5p was shown to be dysregulated in RNA samples from Alzheimer's disease subjects as compared to control individuals.
Conclusion:
Our data demonstrate an interface between two distinct groups of regulatory RNAs in the computation of BACE1 gene expression. Moreover, bioinformatics analyses revealed a theoretical basis for many other potential interactions between natural antisense transcripts and miRNAs at the binding sites of the latter.
http://genomebiology.com/2010/11/5/R56
Mohammad Faghihi, Ming Zhang, Jia Huang, Farzaneh Modarresi, Marcel Van der Brug, Michael Nalls, Mark Cookson, Georges St-Laurent, Claes Wahlestedt
Genome Biology 2010, 11:R56 (2010-05-27), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-r56

Towards a comprehensive structural variation map of an individual human genome
Background:
Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and analyses of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions.
Results:
We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detected 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association.
Conclusions:
Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. The significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate that they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.
http://genomebiology.com/2010/11/5/R52
Andy Pang, Jeffrey MacDonald, Dalila Pinto, John Wei, Muhammad Rafiq, Donald Conrad, Hansoo Park, Matthew Hurles, Charles Lee, J Craig Venter, Ewen Kirkness, Samuel Levy, Lars Feuk, Stephen Scherer
Genome Biology 2010, 11:R52 (2010-05-19), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-r52

The role of transposable elements in the evolution of non-mammalian vertebrates and invertebrates
Background:
Transposable elements (TEs) have played an important role in the diversification and enrichment of mammalian transcriptomes through various mechanisms such as exonization and intronization (the birth of new exons/introns from previously intronic/exonic sequences, respectively), and insertion into first and last exons. However, no extensive analysis has compared the effects of TEs on the transcriptomes of mammals, non-mammalian vertebrates and invertebrates.
Results:
We analyzed the influence of TEs on the transcriptomes of five species, three invertebrates and two non-mammalian vertebrates. Compared to previously analyzed mammals, there were lower levels of TE introduction into introns, significantly lower numbers of exonizations originating from TEs and a lower percentage of TE insertion within the first and last exons. Although the transcriptomes of vertebrates exhibit significant levels of exonization of TEs, only anecdotal cases were found in invertebrates. In vertebrates, as in mammals, the exonized TEs are mostly alternatively spliced, indicating that selective pressure maintains the original mRNA product generated from such genes.
Conclusions:
Exonization of TEs is widespread in mammals, less so in non-mammalian vertebrates, and very low in invertebrates. We assume that the exonization process depends on the length of introns. Vertebrates, unlike invertebrates, are characterized by long introns and short internal exons. Our results suggest that there is a direct link between the length of introns and exonization of TEs and that this process became more prevalent following the appearance of mammals.
http://genomebiology.com/2010/11/6/R59
Noa Sela, Eddo Kim, Gil Ast
Genome Biology 2010, 11:R59 (2010-06-02), ISSN 1465-6906, doi:10.1186/gb-2010-11-6-r59

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source: http://bowtie.cbcb.umd.edu.
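To give a feel for what Burrows-Wheeler indexing involves, here is a minimal sketch of a Burrows-Wheeler index with exact-match backward search. It is not Bowtie's implementation: it handles exact matches only, omits the quality-aware backtracking that permits mismatches, and builds the index naively, which is only practical for toy input. The names `bwt_index` and `find_exact` are hypothetical.

```python
from collections import Counter


def bwt_index(text):
    """Build a naive Burrows-Wheeler index (toy-scale only)."""
    text += "$"  # sentinel; sorts before A, C, G and T
    # Suffix array built by direct sorting of suffixes; quadratic memory, toy use only.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix (wrapping at index 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    counts = Counter(text)
    # first[c]: number of characters in the text that sort strictly before c.
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return suffix_array, first, occ


def find_exact(read, index):
    """Return sorted start positions of exact occurrences of read."""
    suffix_array, first, occ = index
    lo, hi = 0, len(suffix_array)
    # Backward search: extend the match one character at a time, right to left.
    for c in reversed(read):
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


index = bwt_index("ACGTACGTTACG")
hits = find_exact("ACG", index)
print(hits)
```

Each search step costs a constant number of table lookups regardless of genome size, which is the property that makes this family of indexes attractive for short-read alignment.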
http://genomebiology.com/2009/10/3/R25
Ben Langmead, Cole Trapnell, Mihai Pop, Steven Salzberg
Genome Biology 2009, 10:R25 (2009-03-04), ISSN 1465-6906, doi:10.1186/gb-2009-10-3-r25

Between a chicken and a grape: estimating the number of human genes
Many people expected the question 'How many genes in the human genome?' to be resolved with the publication of the genome sequence in 2001, but estimates continue to fluctuate.
http://genomebiology.com/2010/11/5/206
Genome Biology 2010, 11:206 (2010-05-05), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-206

Modeling non-uniformity in short-read rates in RNA-Seq data
After mapping, RNA-Seq data can be summarized by a sequence of read counts commonly modeled as Poisson variables with constant rates along each transcript, which actually fit data poorly. We suggest using variable rates for different positions, and propose two models to predict these rates based on local sequences. These models explain more than 50% of the variations and can lead to improved estimates of gene and isoform expressions for both Illumina and Applied Biosystems data.
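The contrast between constant and position-specific Poisson rates can be sketched as a log-likelihood comparison. This is only an illustration under stated assumptions: the read counts and the variable rates below are hypothetical numbers, and the variable rates stand in for rates that a model would predict from local sequence. The article's two prediction models are not reproduced here.

```python
import math


def poisson_loglik(counts, rates):
    """Log-likelihood of independent Poisson counts with one rate per position."""
    return sum(
        k * math.log(r) - r - math.lgamma(k + 1)
        for k, r in zip(counts, rates)
    )


# Hypothetical read-start counts at six positions along one transcript.
counts = [2, 30, 4, 25, 3, 28]

# Constant-rate model: every position shares the transcript-wide mean.
mean_rate = sum(counts) / len(counts)
constant_rates = [mean_rate] * len(counts)

# Variable-rate model: hypothetical position-specific rates (assumed, not fitted).
variable_rates = [3.0, 28.0, 3.5, 27.0, 3.0, 27.5]

ll_const = poisson_loglik(counts, constant_rates)
ll_var = poisson_loglik(counts, variable_rates)
print(f"constant: {ll_const:.1f}  variable: {ll_var:.1f}")
```

When counts vary strongly between positions, as in this toy example, the constant-rate model receives a much lower log-likelihood, which is the kind of poor fit the abstract refers to.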
http://genomebiology.com/2010/11/5/R50
Jun Li, Hui Jiang, Wing Hung Wong
Genome Biology 2010, 11:R50 (2010-05-11), ISSN 1465-6906, doi:10.1186/gb-2010-11-5-r50

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes
Background:
Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.
Results:
We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data.
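The geometric averaging step can be sketched as follows. This is a minimal illustration that assumes relative expression quantities have already been computed for each control gene in each sample. The gene names, values and the function name `normalization_factors` are hypothetical, and the procedure for selecting the most stable control genes is not shown.

```python
import math


def normalization_factors(control_expression):
    """Return one normalization factor per sample.

    control_expression maps each control gene to a list of positive relative
    quantities, one per sample. The factor for a sample is the geometric mean
    of the control genes' quantities in that sample.
    """
    genes = list(control_expression)
    n_samples = len(control_expression[genes[0]])
    factors = []
    for s in range(n_samples):
        # Average in log space, then exponentiate, to get the geometric mean.
        log_sum = sum(math.log(control_expression[g][s]) for g in genes)
        factors.append(math.exp(log_sum / len(genes)))
    return factors


# Hypothetical relative quantities for three control genes in two samples.
controls = {
    "CONTROL_A": [1.0, 2.0],
    "CONTROL_B": [4.0, 8.0],
    "CONTROL_C": [2.0, 4.0],
}
factors = normalization_factors(controls)

# Normalize a hypothetical gene of interest by dividing by each sample's factor.
target = [10.0, 20.0]
normalized = [t / f for t, f in zip(target, factors)]
print(factors, normalized)
```

Compared with an arithmetic mean, the geometric mean is less affected by a single control gene with outlying abundance, which is one reason it suits combining genes from different abundance classes.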
Conclusions:
The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
http://genomebiology.com/2002/3/7/research/0034
Jo Vandesompele, Katleen De Preter, Filip Pattyn, Bruce Poppe, Nadine Van Roy, Anne De Paepe, Frank Speleman
Genome Biology 2002, 3:research0034 (2002-06-18), ISSN 1465-6906, doi:10.1186/gb-2002-3-7-research0034