Year: 2016

Members of the Extracellular RNA Communication Consortium have recently elucidated a mechanism through which hypoxia leads to increased tumor aggressiveness and metastasis. Anil Sood and his group at University of Texas MD Anderson Cancer Center have identified a miRNA that downregulates the very pathway responsible for miRNA biogenesis, a finding that should generate excitement due to the identification of a potential new way of treating cancers: through miRNA or RNA interference-based gene targeting.

The development of tumors is a complex process that involves the cooperation of cancer cells with non-cancer cells, together making up the tumor microenvironment that is vital for tumor survival. As tumors grow and become dense, they begin signaling for the formation of new blood vessels to bring oxygen and nutrients to their core, a process called angiogenesis. Inhibiting angiogenesis is thus a highly attractive therapeutic avenue and is typically accomplished through the targeting of vascular endothelial growth factor (VEGF) and the subsequent induction of hypoxia, or a condition of low oxygen, in the tumor. Hypoxia, however, comes with its own set of problems.

The recent study suggests that hypoxia itself can lead to an increase in the aggressiveness and metastatic potential of a tumor (Rupaimoole et al, 2016). Previous studies have found that hypoxic conditions lead to the downregulation of Drosha and Dicer, two components of the miRNA biogenesis pathway, and that this decrease in expression is associated with poor clinical outcomes through a decrease in the pool of miRNA present in the tumor. In investigating the root cause of Dicer downregulation, Sood and his team identified a miRNA, miR-630, that is upregulated during hypoxia and that targets the 3’ UTR of Dicer. The study validated this targeting by monitoring Dicer mRNA and protein levels both in cells and in vivo in mouse models. Mice that had miR-630 delivered to them via nanoliposomes developed larger tumors and metastases in more places than control mice.

Of particular importance, the researchers treated mice with a combination of anti-miR-630 and anti-VEGF therapy (bevacizumab). Mice that were treated with both bevacizumab and anti-miR-630 developed smaller tumors and fewer metastatic nodules compared with mice treated with bevacizumab alone; Dicer expression was rescued upon treatment with anti-miR-630.

Apart from furthering our understanding of Dicer regulation during cancer, this research demonstrates the potential of treating cancers with anti-miRNA therapies, which would be particularly useful in situations where antibodies or chemical agents are not able to reach a target.

The potential of extracellular vesicle (EV) RNA as biomarkers of disease is increasingly being recognized. Circulating extracellular RNAs can potentially indicate the presence of disease without the need for an invasive biopsy of diseased tissue. A recent study by Yuan et al (Scientific Reports, Jan 2016) performed a systematic analysis of circulating exRNA in plasma obtained from 50 healthy individuals and 142 cancer patients. The authors conducted the largest RNA sequencing study reported to date for profiling circulating extracellular RNA (exRNA) species in order to provide useful insights into their baseline expression level. High-throughput multivariate statistical analysis identified a set of RNA candidates that were associated with age, sex, and cancer type.

The study directly addressed a major challenge in the field of exRNA, namely the lack of a baseline reference to accurately determine RNA abundance. Currently, qRT-PCR is the most common method used to quantitate gene expression, but it relies on well-established reference controls for normalization. However, such reference controls have not been established for exRNA. Control RNAs used for normalization in qRT-PCR experiments of cellular RNA cannot be reliably used for exRNA. Exogenous spike-in RNAs such as those derived from species like C. elegans cannot be used to normalize across biological or pathological states. Here the authors surveyed a pool of almost 200 exRNA profiles to identify a few potential reference candidates for exRNA quantification. Most candidates were miRNAs, such as miR-99a-5p. The study also provided convincing evidence for the presence in plasma of species of RNA other than miRNA, such as piwi-interacting RNA (piRNA) (see figure).

The dataset generated by this study is available in the exRNA Atlas for other researchers to explore. Future studies will be needed to validate these findings; however, with the current study we have taken a big leap towards the goal of determining the biomarker potential of exRNA for human diseases.

RNA species detected in the plasma of 142 cancer patients and 50 healthy controls
Source: Scientific Reports / CC BY 4.0

In previous studies in humans, measurements of extracellular RNAs (exRNAs) have primarily focused on microRNAs (miRNAs) or studied a small handful of subjects. The specific question of how large numbers of exRNAs are expressed in broader, non-diseased populations has remained. To examine this question, several groups from multiple institutions collaborated to measure exRNAs in the blood plasma of participants from the Framingham Heart Study, an observational cohort study based in Framingham, MA. In their recent publication (1), they first analyzed RNA sequencing data from the plasma of 40 individuals and identified over a thousand human exRNAs including miRNAs, piwi-interacting RNAs (piRNAs), and small nucleolar RNAs (snoRNAs).

Study Design

Although miRNAs have been commonly observed in the circulation and plasma, little is known about the presence of other common varieties of small human RNAs such as piRNAs and snoRNAs, known to be key components of molecular interactions and gene regulation in eukaryotes. Using a targeted RT-qPCR approach in an additional 2,763 individuals, the groups then characterized almost 500 of the most abundant extracellular RNA transcripts. The presence in plasma of many non-microRNA small RNAs was confirmed in this independent cohort. The findings show that diverse classes of circulating non-cellular small RNAs, beyond miRNAs, are consistently present in plasma from multiple human populations. Further work will determine how the presence of these exRNAs in the circulation correlates with the presence and progression of a broad number of human traits and diseases.

1. Freedman JE, Gerstein M, Mick E, Rozowsky J, Levy D, Kitchen R, Das S, Shah R, Danielson K, Beaulieu L, Navarro FCP, Wang Y, Galeev TR, Holman A, Kwong RY, Murthy V, Tanriverdi SE, Koupenova-Zamor M, Mikalev E, Tanriverdi K. Diverse Human Extracellular RNAs are Widely Detected in Plasma. Nature Communications. Published online 26 April 2016.

Overly active KRAS leads to increased serine phosphorylation of Ago2 downstream of MEK and ERK. Phosphorylated Ago2 associates more with P-bodies than with multivesicular endosomes, which reduces the sorting of Ago2 and miRNAs into exosomes bound for export from the cell.

Source: Cell Reports


miRNA release into extracellular vesicles (EVs) is a mechanism to control the gene expression and cellular phenotypes of neighboring cells. A key question is how specific miRNAs are sorted into EVs. Active sorting of RNAs to extracellular carriers such as EVs likely depends on binding to specific RNA binding proteins. As a key member of the RNA-induced silencing complex (RISC) machinery that directly binds miRNA, Argonaute 2 (Ago2) has been a strong candidate as a miRNA carrier in EVs. However, the presence of Ago2 in EVs has been controversial.

In a new paper, we show that Ago2 is carried in both microvesicles and exosomes. Using isogenic cell lines for mutant oncogenic KRAS, we show that Ago2 sorting to exosomes is specifically down-regulated by KRAS-MEK-ERK signaling at late endosomes. Tests of three candidate miRNAs showed that this mechanism can regulate sorting of miRNAs to exosomes. Overall, these data indicate that Ago2 sorting to exosomes is a regulated event and may control miRNA sorting. Furthermore, previous studies that were performed in the presence of serum or growth factors in the media may have detected little Ago2 in exosomes due to growth factor activation of KRAS-MEK-ERK signaling. We hypothesize that this may be a mechanism for cells to sense the growth factor milieu and send that information to other cells via alterations in Ago2 and miRNA secretion.

McKenzie et al. “KRAS-MEK signaling controls Ago2 sorting into exosomes.” Cell Reports AOP 21 April 2016.

Sample output from Target Interaction Finder

We recently developed two tools for the Genboree Workbench: Target Interaction Finder and Pathway Finder. Given a set of miRNAs, these tools find miRNA-target interactions and pathway targets from public databases. To support consortium scientists in using these tools, we have created two training videos which describe the necessary input file, how to use the tool, and downstream network visualization in Cytoscape for Target Interaction Finder and Pathway Finder. The videos are available in the collection of ERC consortium videos at YouTube and in the Resources/Presentations section of this website.

The tools are open to anyone with a (free) Genboree account and can be used with any arbitrary input list of miRNA identifiers or with public datasets available in the exRNA Atlas. Target Interaction Finder generates a network of miRNA and protein target interactions, which is returned as a tabular summary and an XGMML formatted network file. The network file can be imported into network visualization and analysis tools like Cytoscape. Pathway Finder generates a table of pathways containing the miRNA and/or their protein targets based on information from WikiPathways. Embedded in the results window of Pathway Finder is an interactive pathway viewer.

Sample Pathway Finder results file


Interactive Pathway Finder viewer on Genboree Workbench


This blog post is adapted from a TGen press release, found here.

Uncovering the genetic makeup of patients using DNA sequencing has in recent years provided physicians and their patients with a greater understanding of how best to diagnose and treat the diseases that plague humanity. This is the essence of precision medicine.

Now, researchers at the Translational Genomics Research Institute (TGen) are showing how an even more detailed genetic analysis using RNA sequencing can vastly enhance that understanding, providing doctors and their patients with more precise tools to target the underlying causes of disease and help recommend the best course of action.

In their review, published recently in the journal Nature Reviews Genetics, TGen scientists highlight the many advantages of using RNA-sequencing in the detection and management of everything from cancer to infectious diseases such as Ebola and the rapidly spreading Zika virus.

RNA’s principal role is to act as a messenger carrying instructions from DNA for the synthesis of proteins. Building on the insights provided by DNA profiling, the analysis of RNA provides an even more precise look at how cells behave and how medicine can intervene when things go wrong.

Dr. Sara Byron

“RNA is a dynamic and diverse biomolecule with an essential role in numerous biological processes,” said Dr. Sara Byron, Research Assistant Professor in TGen’s Center for Translational Innovation and the review’s lead author. “From a molecular diagnostic standpoint, RNA-based measurements have the potential for broad application across diverse areas of human health, including disease diagnosis, prognosis, and therapeutic selection.”

DNA (deoxyribonucleic acid) sequencing spells out — in order — the billions of chemical letters that make up the genes that drive all of our biologic make-up and functions, from hair and eye color to whether an individual may be predisposed to cancer or other diseases.

RNA (ribonucleic acid) sequencing provides information on the genes that are actively being made into RNA in a cell and are important for cell function. While more complex, RNA holds the promise of more precise measurement of the human physical condition.

There are more forms of RNA than of DNA present in the body, explains Dr. Byron. “RNA sequencing provides a deeper view of a patient’s genome, revealing detailed information on the diverse spectrum of RNAs being expressed.”

One of the most promising aspects of RNA-based measurements is the potential of using extracellular RNA (exRNAs) as a non-invasive diagnostic indicator of disease. Monitoring exRNA simply takes a blood sample, as opposed to doing a tumor biopsy, which is essentially a minor surgery with greater risks and costs.

“The investigation of exRNAs in biofluids to monitor disease is an area of diagnostic research that is growing rapidly,” said Dr. Kendall Van Keuren-Jensen, TGen Associate Professor of Neurogenomics, Co-Director of TGen’s Center for Noninvasive Diagnostics, and one of the review’s authors. “Measurement of exRNA is appealing as a non-invasive method for monitoring disease. With increased access to biofluids, more frequent sampling can occur over time.”

The first clinical test to measure exRNA was released earlier this year, the review said. The test is for use in evaluating lung cancer progression, and the potential for using RNA-seq in other cancers is expanding rapidly. Commercial RNA-seq tests are now available, providing the opportunity for clinicians to more comprehensively profile cancer and use this information to guide treatment selection for their patients, the review said.

In addition, the authors reported on several recent applications for RNA-seq in the diagnosis and management of infectious diseases, such as monitoring for drug-resistant populations during therapy and tracking the origin and spread of the Ebola virus.

Using examples from discovery and clinical research, the authors also describe how RNA-seq can guide interpretation of genomic DNA sequencing results. The use of integrative sequencing strategies in research studies is growing across a broad range of health applications, which promises to drive the incorporation of RNA-seq into clinical medicine as well, the review said.

The paper, Translating RNA-sequencing into Clinical Diagnostics: Opportunities and Challenges, was published online recently in the journal Nature Reviews Genetics. Authors Kendall Van Keuren-Jensen and David W. Craig are participants in the Extracellular RNA Communication consortium.

Source: Translational Genomics Research Institute

The exRNA pathway portal at WikiPathways was created in 2014 and now includes 50 exRNA-related pathways, including 19 from publications by the Extracellular RNA Communication consortium. We are committed to capturing every published pathway figure from the consortium as a properly modeled pathway at WikiPathways. If your work involves pathways, especially if you are publishing it, look into contributing to WikiPathways.

The latest pathways to be curated from consortium publications are:

  1. ApoE and miR-146 in inflammation and atherosclerosis
  2. miR-148a/miR-31/FIH1/HIF1α-Notch signaling in glioblastoma
  3. mir-124 predicted interactions with cell cycle and differentiation
  4. miR-517 relationship with ARCN1 and USP1

The goal of the WikiPathways exRNA portal is to build a collection of pathway models for exRNA researchers to use for illustration, data visualization, and analysis. Each pathway is a self-contained data model that connects to identifier and annotation databases. In addition to providing static images for figures and presentations, these pathways can also be used by bioinformatics and network analysis packages such as Cytoscape and PathVisio. Furthermore, as a wiki, anyone can sign up to improve and grow the content. We invite you all to edit, fix, and add to the pathway models in the exRNA pathway portal at WikiPathways.


This blog post is adapted from an article by Earl Lane. The original can be found here.

Over the past decade, David Wong of UCLA and his colleagues have been developing a method for detecting circulating tumor DNA in bodily fluids such as blood and saliva. The approach, known as a liquid biopsy, holds the promise of quicker, less invasive identification of cancers and easier tracking of disease status during the course of treatment.

In a news briefing at the 2016 annual meeting of the American Association for the Advancement of Science (AAAS), Wong described a prototype device — called electric field-induced release and measurement (EFIRM) — that can detect biomarkers in saliva for a malignancy called non-small cell lung cancer (Wei et al., 2014). The device has high accuracy compared to current sequencing technology and is entering clinical testing in lung cancer patients in Asia this year.

The test requires just one drop of saliva and can be completed in 10 minutes in a physician’s office. Dr. Wong envisions using it in conjunction with other diagnostic tools. If a lung X-ray were to show a suspicious nodule, for example, a doctor could do the saliva-based test to help quickly determine whether a cancer is likely.

The test can reliably find genetic mutations involving epidermal growth factor receptor (EGFR), a protein on the surface of cells. It normally helps the cells grow and divide, but some cells in non-small cell lung cancer have too much EGFR, which makes them grow faster. Several drugs can block the growth signal from EGFR, and their use could be ordered promptly by a clinician.

Saliva-based tests might someday allow screening for a variety of cancers in a doctor’s office or laboratory, says Wong, director of UCLA’s Center for Oral/Head and Neck Oncology Research. He and his colleagues have also been exploring the use of saliva-based liquid biopsy technology to detect mutations linked to cancers of the mouth and the back of the throat (called oropharyngeal cancers). Dr. Wong’s work with the ERC consortium explores the use of liquid biopsy of extracellular RNA to detect gastric cancer.

Source: AAAS

The ability to perform small RNA sequencing on extracellular RNA samples allows us to measure, at least in a semi-quantitative manner, the abundance of miRNAs, piRNAs, fragments of tRNAs, long non-coding RNAs, and coding RNAs. One of the most significant advantages of RNA-Seq technology is that it can detect and measure any RNA that is present, whether or not it is a known sequence. Nonetheless, there are important unanswered questions about the accuracy of RNA-Seq and the optimal approach for processing the data obtained. All RNA-seq experiments are subject to sources of systematic variation such as library size, transcript length, and G-C content (Dillies et al., 2013). Small RNA-seq experiments are further impacted by the highly non-normal distribution of expression of different small RNAs, particularly miRNAs. Often a few miRNAs account for a very large fraction of total reads while the vast majority of miRNAs each contribute a small percentage of reads. Moreover, sample input amounts of extracellular RNA are often extremely limited, increasing the potential for both sampling error and experimental bias.

To address these issues, a number of normalization methods have been developed, which can be assigned to three basic categories: 1) scaling; 2) normalizing in order to achieve similar data distributions; and 3) Reads Per Kilobase per Million mapped reads (RPKM). There appears to be no consensus regarding the optimal normalization method, but it is known that different methods can yield different results in downstream analysis, particularly differential expression (DE) analysis.

Two studies highlighted here, Garmire and Subramaniam, 2012 and Dillies et al., 2013, do not agree on the best way to do normalization for small RNA-seq, but point to a number of methods analyzed and elucidate some of the key problems. We review these studies briefly here and point out the importance of rigorous comparisons of normalization methods for the future of differential comparisons of data sets.

The first class of methods, scaling methods, involves application of a standard linear mathematical operation to each sample. Scaling generally means changing the size of something, and normalizing by simply dividing by the total read number (going from read numbers to read fractions) is probably the simplest kind of scaling. Scaling approaches to normalization include global scaling, Lowess, and the Trimmed Mean of M-values (TMM) method (Robinson and Oshlack, 2010). These approaches each use a different method to calculate a scaling factor. Global scaling uses a factor that is based on the difference in the means of the data sets to be compared. Lowess normalizes based on a locally weighted regression model. TMM determines a scaling factor that is the weighted trimmed mean of the log expression ratios. This factor is calculated after double trimming values at the two extremes based on log-intensity ratios (M-values) and log-intensity averages (A-values). According to Garmire and Subramaniam, variability in estimated RNA content may be even more pronounced for microRNA-seq datasets after application of this method.
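To make the trimming idea concrete, here is a minimal sketch of a TMM-style scaling factor. It is a simplification under stated assumptions, not the published implementation: the mean is unweighted (the original method weights each gene by an estimate of its variance), the trim fractions of 30% on M-values and 5% on A-values are taken as defaults, and the function name `tmm_factor` is ours.

```python
import math

def tmm_factor(sample, reference, m_trim=0.30, a_trim=0.05):
    """Simplified, UNWEIGHTED TMM-style factor of `sample` relative to `reference`.

    Both arguments are lists of raw read counts in the same gene order.
    """
    n_s, n_r = sum(sample), sum(reference)
    pairs = []
    for s, r in zip(sample, reference):
        if s > 0 and r > 0:  # log ratios are undefined for zero counts
            ps, pr = s / n_s, r / n_r
            m = math.log2(ps / pr)        # M-value: log ratio of read fractions
            a = 0.5 * math.log2(ps * pr)  # A-value: average log read fraction
            pairs.append((m, a))
    n = len(pairs)
    by_m = sorted(range(n), key=lambda i: pairs[i][0])
    by_a = sorted(range(n), key=lambda i: pairs[i][1])
    cut_m, cut_a = int(n * m_trim), int(n * a_trim)
    # "Double trimming": keep genes that survive both the M and the A trim.
    keep = set(by_m[cut_m:n - cut_m]) & set(by_a[cut_a:n - cut_a])
    if not keep:
        raise ValueError("no genes left after trimming")
    mean_m = sum(pairs[i][0] for i in keep) / len(keep)
    return 2 ** mean_m

# One dominant miRNA inflates the library size of `sample` about elevenfold.
reference = [100] * 10
sample = [100] * 9 + [10000]
factor = tmm_factor(sample, reference)
effective_size = sum(sample) * factor  # close to sum(reference)
```

In this toy case the dominant miRNA is trimmed away, so the effective library size of the sample comes back to roughly that of the reference, which plain total-count scaling would not achieve.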

Another general approach to normalization is to preserve aspects of the distribution of the data among different data sets. As with scaling, there are a number of different approaches to achieving matched data distributions, including quantile, variance stabilization (VSN), the invariant method (INV), and DESeq. Quantile normalization has been extensively used for microarray data, with the goal of making the distribution of expression levels across samples similar. Conditional quantile normalization is a modification of quantile normalization that combines robust generalized regression and quantile normalization (Hansen, Irizarry, and Wu, 2012). The goal of the VSN method is to make the distribution of variance across different levels of expression similar. In INV normalization, a set of invariant miRNAs is selected, which is then used with one of the other methods (such as Lowess or VSN). In DESeq, a scaling factor for each sample in the dataset is obtained by computing the median of the ratios of each gene in one sample over the geometric mean of that gene across all samples. The same scaling factor is then applied to the read counts for all of the genes in that sample.

RPKM is a method that has been widely used for long RNA-seq datasets. RPKM is performed on each sample separately, and consists of taking the ratio of the number of counts for a given gene and the product of the total counts for all genes and the mature transcript length for that gene. Methods such as DESeq and TMM rely on the assumption that most genes are not differentially expressed (Dillies et al., 2013), which may not hold true for all microRNA-seq data sets.
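The two calculations described in words above can be sketched as follows. This is an illustrative sketch only: the function names `size_factors` and `rpkm` are ours, and skipping genes with a zero count in any sample is an assumption made so that the geometric mean is defined.

```python
import math
from statistics import median

def size_factors(counts):
    """DESeq-style median-of-ratios factors.

    `counts` is a list of samples; each sample is a list of per-gene
    read counts in the same gene order.
    """
    n_genes = len(counts[0])
    log_geo_means = []
    for g in range(n_genes):
        vals = [sample[g] for sample in counts]
        if all(v > 0 for v in vals):
            # log of the geometric mean of this gene across all samples
            log_geo_means.append(sum(math.log(v) for v in vals) / len(vals))
        else:
            log_geo_means.append(None)  # assumption: skip genes with any zero
    factors = []
    for sample in counts:
        log_ratios = [math.log(sample[g]) - lgm
                      for g, lgm in enumerate(log_geo_means) if lgm is not None]
        factors.append(math.exp(median(log_ratios)))
    return factors

def rpkm(count, total_counts, length_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (total_counts * length_bp)

counts = [[10, 20, 30, 0],
          [20, 40, 60, 5]]   # second sample sequenced twice as deeply
factors = size_factors(counts)
normalized = [[c / f for c in sample] for sample, f in zip(counts, factors)]
```

Dividing each sample by its factor puts the two samples on a common scale. Note that for mature miRNAs, which are all roughly the same length, the length term in RPKM changes little between genes.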

Garmire and Subramaniam compared a number of normalization methods applied to mammalian microRNA-Seq data using two publicly available datasets that were chosen by the authors due to the availability of matched PCR data. They assessed the performance of the normalization methods by calculating the mean square error (MSE) and the Kolmogorov-Smirnov statistic (which is a measure of the difference of two distributions), as well as comparison with PCR data on the same samples and inspection of the results of differential expression. The authors show that Lowess, quantile and VSN normalization resulted in a smaller MSE, while TMM and VSN produced a higher MSE. Similarly, TMM and INV resulted in a larger K-S statistic, indicating a bigger change in the distribution. Quantile and Lowess normalization also had the best concordance with qPCR data. The authors thus concluded that Lowess and quantile normalization performed better than other methods studied. The primary limitations of this study include the fact that it did not incorporate strategies to compare the results of the normalization method to any “gold standard” for which the real distribution was known, and that the number of data sets used for the analysis was very small. In addition, other methods have been implemented since the publication of their study.
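For readers unfamiliar with the two summary metrics, the sketch below shows the generic textbook definitions of the mean square error and the two-sample Kolmogorov-Smirnov statistic. How the paper chose the pairs of vectors to compare is not reproduced here, and the function names are ours.

```python
from bisect import bisect_right

def mse(xs, ys):
    """Mean square error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def ks_statistic(xs, ys):
    """Two-sample K-S statistic: the largest gap between the two
    empirical cumulative distribution functions."""
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs_sorted + ys_sorted:
        fx = bisect_right(xs_sorted, v) / len(xs_sorted)  # fraction of xs <= v
        fy = bisect_right(ys_sorted, v) / len(ys_sorted)  # fraction of ys <= v
        d = max(d, abs(fx - fy))
    return d

same = ks_statistic([1, 2, 3], [1, 2, 3])      # identical distributions
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])  # no overlap at all
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap), so a larger value after normalization indicates a bigger change in the distribution.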

Dillies et al. published another comparison of normalization methods for RNA-seq data in about the same time frame. Most of the datasets used were long RNA-seq data, but a murine microRNA dataset was included. The seven methods compared in this study were TC (total counts), UQ (Upper Quartile), Median, DESeq, TMM, quantile and RPKM. TC, UQ and Median are scaling approaches that are quite similar, involving the calculation of a scaling factor based on either the ratio of total counts (TC), upper quartile of counts (UQ) or median of counts. The authors assessed these normalization methods on real as well as simulated data. Boxplots of counts before and after normalization show that the methods differ most where there are large differences in library size between samples. These differences do not improve after TC and RPKM normalization of the microRNA-seq data. Other features of microRNA-seq data that influence the results of the normalization, according to the authors, are the presence of high-count genes and a large number of zero counts. Quantile normalization increases the intra-condition variability in the murine miRNA data. The differential expression results in this study indicate that the final results depend on the normalization method and not on the model chosen to assess differential expression (DESeq or TSPM). In the simulated data, which contain high-count genes and may resemble microRNA-seq data more closely than the other datasets, only DESeq and TMM were successful in achieving a low false positive rate and high power. The authors conclude that DESeq and TMM are the methods of choice based on their ability to perform in the presence of different library sizes and composition.
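The three simple scaling approaches differ only in the per-sample statistic they use, which a short sketch makes clear. Some details here are assumptions rather than a reproduction of the paper: UQ and Median are computed on nonzero counts only, each sample is rescaled to the mean of the statistic across samples, and the function names are ours.

```python
from statistics import mean, median, quantiles

def sample_statistic(sample, method):
    """Per-sample statistic that drives the scaling factor."""
    nonzero = [c for c in sample if c > 0]  # assumption: ignore zero counts
    if method == "TC":
        return sum(sample)
    if method == "UQ":
        return quantiles(nonzero, n=4, method="inclusive")[2]  # 75th percentile
    if method == "Median":
        return median(nonzero)
    raise ValueError(f"unknown method: {method}")

def scale(counts, method="TC"):
    """Rescale every sample so its statistic equals the across-sample mean."""
    stats = [sample_statistic(s, method) for s in counts]
    target = mean(stats)
    return [[c * target / st for c in s] for s, st in zip(counts, stats)]

counts = [[0, 10, 20, 30, 40],
          [0, 20, 40, 60, 80]]  # same composition, twice the depth
results = {m: scale(counts, m) for m in ("TC", "UQ", "Median")}
```

When two samples differ only in sequencing depth, as in this toy input, all three methods agree. They diverge once a few high-count genes dominate one library, because the total count is far more sensitive to those genes than the median or upper quartile.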

The differences in the conclusions from these two studies are likely due to the different normalization methods and specific datasets each used, and to the parameters they used to evaluate the performance of the normalization methods. Since the Garmire and Subramaniam paper looks only at miRNA data and the Dillies et al. paper looks at a mix of data, including miRNA and mRNA data, it is difficult to compare the results. We suggest that the additional complexities of compiling mRNA levels from multiple read sequences may confuse the latter analysis. However, both studies agree that the use of different normalization approaches can result in significant differences in downstream differential expression results. One of the key weaknesses of both of these papers is the lack of data from “gold standard” datasets for which the quantities of the different RNAs were definitively known.

For this reason, it would be valuable to develop the tools necessary for rigorous comparison of the available normalization methods. Such tools may include the generation of standardized data sets, such as small RNA libraries constructed from purely synthetic miRNAs, for which the content can be completely controlled, or RNAs from biological specimens that contain synthetic miRNAs spiked in at controlled concentrations. Corresponding qPCR results, used with calibration curves, should be generated for such datasets, which will serve as an orthogonal measurement technology for developing and evaluating normalization methods. Overall, this important topic needs careful attention for the establishment of reference exRNA profiles, and for the realization of the full potential of the powerful technology of high throughput RNA-seq.
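As an illustration of how a qPCR calibration curve could supply the orthogonal measurement, the sketch below fits the usual straight line of Ct against log10 of input quantity and inverts it. The dilution series and Ct values are invented for illustration, and the function names are ours.

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantity_from_ct(ct, slope, intercept):
    """Invert the calibration curve to estimate the input quantity."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1 / slope) - 1

# Hypothetical tenfold dilution series of a synthetic miRNA (copies per reaction).
quantities = [1e2, 1e3, 1e4, 1e5]
ct_values = [30.0, 26.68, 23.36, 20.04]
slope, intercept = fit_standard_curve(quantities, ct_values)
estimate = quantity_from_ct(23.36, slope, intercept)
```

Absolute quantities estimated this way for spiked-in synthetic miRNAs could then be compared with normalized read counts from the same samples to judge each normalization method.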

* * *

Note that one of us recently presented a web seminar on a related topic, Understanding and using small RNA-seq, that is available for viewing. Also, members of the ERCC will be jointly presenting a workshop on Data Normalization Challenges and Solutions as part of the CHI conference Extracellular RNA in Drug and Diagnostic Development in Cambridge, MA, 3-6 April 2016. See the Event page for more details.

Exosome Diagnostics

Exosome Diagnostics, Inc. has announced the launch of ExoDx Lung(ALK), the first ever CLIA-validated exosome based blood test. This test detects EML4-ALK fusion transcripts in the plasma of lung cancer patients whose primary tumors carry this mutation. Although these patients make up a small minority of cases, identifying them is important because their tumors are particularly sensitive to ALK inhibitors. Current clinical tests are performed on biopsied tumor tissue. Major drawbacks include not only the risks associated with the invasive biopsy procedure but also low test performance for some of the commonly used methods (especially immunocytochemistry-based tests). In contrast, ExoDx Lung(ALK) requires only a standard blood draw and has 88% sensitivity and 100% specificity (as reported by the manufacturer). This is a major milestone in the application of exosome-based biomarkers to precision medicine, especially in the areas of companion diagnostics and targeted therapies.

The key discovery that led to this test was made eight years ago when Dr. Johan Skog, the Chief Scientific Officer at Exosome Diagnostics, demonstrated that a mutation present in the tumors of patients with glioblastoma could be detected in their blood (Skog et al., 2008). The publication reporting this finding has been cited over 1600 times, indicating its significance and potential for broad application. Now lung cancer patients can directly benefit from the first clinical test based on this discovery, in the form of a non-invasive test to determine the EML4/ALK mutation status of their tumors. We are sure that many more biofluid-based tests, targeting not only cancers but also other conditions where non-invasive testing is desired, will soon follow.