Overly active KRAS leads to increased serine phosphorylation of Ago2 downstream of MEK and ERK. Phosphorylated Ago2 associates more with P-bodies than with multivesicular endosomes, which reduces the sorting of Ago2 and miRNAs into exosomes bound for export from the cell.

Source: Cell Reports

 

miRNA release into extracellular vesicles (EVs) is a mechanism to control the gene expression and cellular phenotypes of neighboring cells. A key question is how specific miRNAs are sorted into EVs. Active sorting of RNAs to extracellular carriers such as EVs likely depends on binding to specific RNA binding proteins. As a key member of the RNA-induced silencing complex (RISC) machinery that directly binds miRNA, Argonaute 2 (Ago2) has been a strong candidate as a miRNA carrier in EVs. However, the presence of Ago2 in EVs has been controversial.

In a new paper, we show that Ago2 is carried in both microvesicles and exosomes. Using isogenic cell lines for mutant oncogenic KRAS, we show that Ago2 sorting to exosomes is specifically down-regulated by KRAS-MEK-ERK signaling at late endosomes. Tests of three candidate miRNAs showed that this mechanism can regulate sorting of miRNAs to exosomes. Overall, these data indicate that Ago2 sorting to exosomes is a regulated event and may control miRNA sorting. Furthermore, previous studies that were performed in the presence of serum or growth factors in the media may have detected little Ago2 in exosomes due to growth factor activation of KRAS-MEK-ERK signaling. We hypothesize that this may be a mechanism for cells to sense the growth factor milieu and send that information to other cells via alterations in Ago2 and miRNA secretion.

McKenzie et al. “KRAS-MEK signaling controls Ago2 sorting into exosomes.” Cell Reports AOP 21 April 2016.

Sample output from Target Interaction Finder

We recently developed two tools for the Genboree Workbench: Target Interaction Finder and Pathway Finder. Given a set of miRNAs, these tools find miRNA-target interactions and pathway targets from public databases. To support consortium scientists in using these tools, we have created two training videos which describe the necessary input file, how to use the tool, and downstream network visualization in Cytoscape for Target Interaction Finder and Pathway Finder. The videos are available in the collection of ERC consortium videos at YouTube and in the Resources/Presentations section of this website.

The tools are open to anyone with a (free) Genboree account and can be used with any arbitrary input list of miRNA identifiers or with public datasets available in the exRNA Atlas. Target Interaction Finder generates a network of miRNA and protein target interactions, which is returned as a tabular summary and an XGMML formatted network file. The network file can be imported into network visualization and analysis tools like Cytoscape. Pathway Finder generates a table of pathways containing the miRNA and/or their protein targets based on information from WikiPathways. Embedded in the results window of Pathway Finder is an interactive pathway viewer.
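For readers who want to inspect the network file outside Cytoscape, here is a minimal sketch of reading miRNA-target pairs from an XGMML file with Python's standard library. The sample document is hand-written for illustration: the node labels, the edge label, and the `read_interactions` helper are assumptions, and the actual Target Interaction Finder output may use different attributes and carry additional annotations.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written XGMML snippet for illustration only; it is not
# actual Target Interaction Finder output.
SAMPLE_XGMML = """<graph label="example" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="1" label="hsa-miR-21-5p"/>
  <node id="2" label="PTEN"/>
  <node id="3" label="PDCD4"/>
  <edge source="1" target="2" label="miRNA-target"/>
  <edge source="1" target="3" label="miRNA-target"/>
</graph>
"""

# XGMML elements live in this XML namespace.
NS = {"x": "http://www.cs.rpi.edu/XGMML"}


def read_interactions(xgmml_text):
    """Return (source label, target label) pairs for every edge in the graph."""
    root = ET.fromstring(xgmml_text)
    # Map node ids to their human-readable labels.
    labels = {n.get("id"): n.get("label") for n in root.findall("x:node", NS)}
    return [
        (labels[e.get("source")], labels[e.get("target")])
        for e in root.findall("x:edge", NS)
    ]


interactions = read_interactions(SAMPLE_XGMML)
for mirna, target in interactions:
    print(mirna, "->", target)
```

The same pairs can be seen graphically by importing the file into Cytoscape; the sketch is only meant to show that the network file is ordinary XML.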

Sample Pathway Finder results file

Interactive Pathway Finder viewer on Genboree Workbench

This blog post is adapted from a TGen press release, found here.

Uncovering the genetic makeup of patients using DNA sequencing has in recent years provided physicians and their patients with a greater understanding of how best to diagnose and treat the diseases that plague humanity. This is the essence of precision medicine.

Now, researchers at the Translational Genomics Research Institute (TGen) are showing how an even more detailed genetic analysis using RNA sequencing can vastly enhance that understanding, providing doctors and their patients with more precise tools to target the underlying causes of disease and help recommend the best course of action.

In their review, published recently in the journal Nature Reviews Genetics, TGen scientists highlight the many advantages of using RNA-sequencing in the detection and management of everything from cancer to infectious diseases such as Ebola and the rapidly spreading Zika virus.

RNA’s principal role is to act as a messenger carrying instructions from DNA for the synthesis of proteins. Building on the insights provided by DNA profiling, the analysis of RNA provides an even more precise look at how cells behave and how medicine can intervene when things go wrong.

Dr. Sara Byron

“RNA is a dynamic and diverse biomolecule with an essential role in numerous biological processes,” said Dr. Sara Byron, Research Assistant Professor in TGen’s Center for Translational Innovation and the review’s lead author. “From a molecular diagnostic standpoint, RNA-based measurements have the potential for broad application across diverse areas of human health, including disease diagnosis, prognosis, and therapeutic selection.”

DNA (deoxyribonucleic acid) sequencing spells out — in order — the billions of chemical letters that make up the genes that drive all of our biologic make-up and functions, from hair and eye color to whether an individual may be predisposed to cancer or other diseases.

RNA (ribonucleic acid) sequencing provides information on the genes that are actively being made into RNA in a cell and are important for cell function. While more complex, RNA holds the promise of more precise measurement of the human physical condition.

There are more forms of RNA than of DNA present in the body, explains Dr. Byron. “RNA sequencing provides a deeper view of a patient’s genome, revealing detailed information on the diverse spectrum of RNAs being expressed.”

One of the most promising aspects of RNA-based measurements is the potential of using extracellular RNA (exRNAs) as a non-invasive diagnostic indicator of disease. Monitoring exRNA simply takes a blood sample, as opposed to doing a tumor biopsy, which is essentially a minor surgery with greater risks and costs.

“The investigation of exRNAs in biofluids to monitor disease is an area of diagnostic research that is growing rapidly,” said Dr. Kendall Van Keuren-Jensen, TGen Associate Professor of Neurogenomics, Co-Director of TGen’s Center for Noninvasive Diagnostics, and one of the review’s authors. “Measurement of exRNA is appealing as a non-invasive method for monitoring disease. With increased access to biofluids, more frequent sampling can occur over time.”

The first clinical test to measure exRNA was released earlier this year, the review said. The test is for use in evaluating lung cancer progression, and the potential for using RNA-seq in other cancers is expanding rapidly. Commercial RNA-seq tests are now available, providing the opportunity for clinicians to more comprehensively profile cancer and use this information to guide treatment selection for their patients, the review said.

In addition, the authors reported on several recent applications for RNA-seq in the diagnosis and management of infectious diseases, such as monitoring for drug-resistant populations during therapy and tracking the origin and spread of the Ebola virus.

Using examples from discovery and clinical research, the authors also describe how RNA-seq can guide interpretation of genomic DNA sequencing results. The use of integrative sequencing strategies in research studies is growing across a broad range of health applications, which promises to drive the incorporation of RNA-seq into clinical medicine as well, the review said.

The paper, Translating RNA-sequencing into Clinical Diagnostics: Opportunities and Challenges, was published online recently in the journal Nature Reviews Genetics. Authors Kendall Van Keuren-Jensen and David W. Craig are participants in the Extracellular RNA Communication consortium.

Source: Translational Genomics Research Institute

The exRNA pathway portal at WikiPathways was created in 2014 and now includes 50 exRNA-related pathways, including 19 from publications by the Extracellular RNA Communication consortium. We are committed to capturing every published pathway figure from the consortium as a properly modeled pathway at WikiPathways. If your work involves pathways, especially if you are publishing it, look into contributing to WikiPathways.

The latest pathways to be curated from consortium publications are:

  1. ApoE and miR-146 in inflammation and atherosclerosis
  2. miR-148a/miR-31/FIH1/HIF1α-Notch signaling in glioblastoma
  3. mir-124 predicted interactions with cell cycle and differentiation
  4. miR-517 relationship with ARCN1 and USP1

The goal of the WikiPathways exRNA portal is to build a collection of pathway models for exRNA researchers to use for illustration, data visualization, and analysis. Each pathway is a self-contained data model that connects to identifier and annotation databases. In addition to providing static images for figures and presentations, these pathways can also be used by bioinformatics and network analysis packages such as Cytoscape and PathVisio. Furthermore, as a wiki, anyone can sign up to improve and grow the content. We invite you all to edit, fix, and add to the pathway models in the exRNA pathway portal at WikiPathways.

This blog post is adapted from an article at AAAS.org by Earl Lane. The original can be found here.

Over the past decade, David Wong of UCLA and his colleagues have been developing a method for detecting circulating tumor DNA in bodily fluids such as blood and saliva. The approach, known as a liquid biopsy, holds the promise of quicker, less invasive identification of cancers and easier tracking of disease status during the course of treatment.

In a news briefing at the 2016 annual meeting of the American Association for the Advancement of Science (AAAS), Wong described a prototype device — called electric field-induced release and measurement (EFIRM) — that can detect biomarkers in saliva for a malignancy called non-small cell lung cancer (Wei et al., 2014). The device has high accuracy compared to current sequencing technology and is entering clinical testing in lung cancer patients in Asia this year.

The test requires just one drop of saliva and can be completed in 10 minutes in a physician’s office. Dr. Wong envisions using it in conjunction with other diagnostic tools. If a lung X-ray were to show a suspicious nodule, for example, a doctor could do the saliva-based test to help quickly determine whether a cancer is likely.

The test can reliably find genetic mutations involving epidermal growth factor receptor (EGFR), a protein on the surface of cells. It normally helps the cells grow and divide, but some cells in non-small cell lung cancer have too much EGFR, which makes them grow faster. Several drugs can block the growth signal from EGFR, and their use could be ordered promptly by a clinician.

Saliva-based tests might someday allow screening for a variety of cancers in a doctor’s office or laboratory, says Wong, director of UCLA’s Center for Oral/Head and Neck Oncology Research. He and his colleagues have also been exploring the use of saliva-based liquid biopsy technology to detect mutations linked to cancers of the mouth and the back of the throat (called oropharyngeal cancers). Dr. Wong’s work with the ERC consortium explores the use of liquid biopsy of extracellular RNA to detect gastric cancer.

Source: AAAS

The ability to perform small RNA sequencing on extracellular RNA samples allows us to measure, at least in a semi-quantitative manner, the abundance of miRNAs, piRNAs, fragments of tRNAs, long non-coding RNAs, and coding RNAs. One of the most significant advantages of RNA-Seq technology is that it can detect and measure any RNA that is present, whether or not it is a known sequence. Nonetheless, there are important unanswered questions about the accuracy of RNA-Seq and the optimal approach for processing the data obtained. All RNA-seq experiments are subject to sources of systematic variation such as library size, transcript length, and G-C content (Dillies et al., 2013). Small RNA-seq experiments are further impacted by the highly non-normal distribution of expression of different small RNAs, particularly miRNAs. Often a few miRNAs account for a very large fraction of total reads while the vast majority of miRNAs each contribute a small percentage of reads. Moreover, sample input amounts of extracellular RNA are often extremely limited, increasing the potential for both sampling error and experimental bias.

To address these issues, a number of normalization methods have been developed, which fall into three basic categories: 1) scaling; 2) normalizing in order to achieve similar data distributions; and 3) Reads Per Kilobase per Million mapped reads (RPKM). There appears to be no consensus on the optimal normalization method, but it is known that different methods can yield different results in downstream analysis, particularly differential expression (DE) analysis.

Two studies highlighted here, Garmire and Subramaniam, 2012 and Dillies et al., 2013, do not agree on the best way to do normalization for small RNA-seq, but point to a number of methods analyzed and elucidate some of the key problems. We review these studies briefly here and point out the importance of rigorous comparisons of normalization methods for the future of differential comparisons of data sets.

The first class of methods, scaling methods, involves application of a standard linear mathematical operation to each sample. Scaling generally means changing the size of something, and normalizing by simply dividing by the total read number (going from read numbers to read fractions) is probably the simplest kind of scaling. Scaling approaches to normalization include global scaling, Lowess, and the Trimmed Mean of M-values (TMM) method (Robinson and Oshlack, 2010). These approaches each use a different method to calculate a scaling factor. Global scaling uses a factor that is based on the difference in the means of the data sets to be compared. Lowess normalizes based on a locally weighted regression model. TMM determines a scaling factor that is the weighted trimmed mean of log expression ratios. This factor is calculated after double trimming values at the two extremes based on log-intensity ratios (M-values) and log-intensity averages (A-values). According to Garmire and Subramaniam, variability in estimated RNA content may be even more pronounced for microRNA-seq datasets after application of this method.
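As an illustration of the simplest scaling described above (going from read numbers to read fractions), here is a minimal sketch in Python. The miRNA names and counts are made up, and the `total_count_scale` helper and its `per` parameter are our own naming for this example rather than part of any published tool.

```python
def total_count_scale(counts, per=1_000_000):
    """Scale raw read counts by the sample's library size.

    counts: dict mapping RNA name -> raw read count for one sample.
    per:    size of the common scale (1,000,000 gives counts per million).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {name: per * c / total for name, c in counts.items()}


# Made-up counts in which one miRNA dominates the library.
sample = {"miR-A": 900, "miR-B": 90, "miR-C": 10}
scaled = total_count_scale(sample)
print(scaled)
```

Because one scaling factor is applied to every RNA in a sample, a single highly abundant miRNA can shift the scaled values of all the others, which is one reason more robust factors such as TMM were proposed.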

Another general approach to normalization is to preserve aspects of the distribution of the data among different data sets. As with scaling, there are a number of different approaches to achieving matched data distributions, including quantile, variance stabilization normalization (VSN), the invariant method (INV), and DESeq. Quantile normalization has been extensively used for microarray data, with the goal of making the distribution of expression levels across samples similar. Conditional quantile normalization is a modification of quantile normalization that combines robust generalized regression and quantile normalization (Hansen, Irizarry, and Wu, 2012). The goal of the VSN method is to make the distribution of variance across different levels of expression similar. In INV normalization, a set of invariant miRNAs is selected, which is then used with one of the other methods (such as Lowess or VSN). In DESeq, a scaling factor for each sample in the dataset is obtained by computing the median of the ratios of each gene in one sample over the geometric mean of that gene across all samples. The same scaling factor is then applied to the read counts for all of the genes in that sample. RPKM is a method that has been widely used for long RNA-seq datasets. RPKM is performed on each sample separately, and consists of taking the ratio of the number of counts for a given gene to the product of the total counts for all genes and the mature transcript length for that gene, multiplied by a constant so that the result is expressed per kilobase and per million reads. Methods such as DESeq and TMM rely on the assumption that most genes are not differentially expressed (Dillies et al., 2013), which may not hold true for all microRNA-seq data sets.
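The DESeq-style scaling factor and the RPKM ratio described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the DESeq package itself: the function names, the toy counts, and the choice to skip genes with a zero count in any sample are assumptions of this sketch.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios scaling factor per sample.

    counts: dict mapping sample name -> list of read counts, with genes in the
    same order in every sample. Genes with a zero count in any sample are
    skipped here (an assumption of this sketch), because their geometric mean
    is zero and the ratio is undefined.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_mean = sum(math.log(v) for v in values) / len(values)
            geo_means.append(math.exp(log_mean))
        else:
            geo_means.append(None)
    factors = {}
    for s in samples:
        ratios = [
            counts[s][g] / geo_means[g]
            for g in range(n_genes)
            if geo_means[g] is not None
        ]
        factors[s] = median(ratios)  # raises if no gene is usable
    return factors


def rpkm(count, total_counts, length_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (total_counts * length_bp)


# Toy data: sample s2 is sequenced exactly twice as deeply as s1.
toy = {"s1": [10, 100, 1000], "s2": [20, 200, 2000]}
factors = size_factors(toy)
normalized = {s: [c / factors[s] for c in toy[s]] for s in toy}
print(factors)
print(rpkm(500, 1_000_000, 2000))
```

In the toy data the two samples differ only in depth, so dividing each sample by its factor brings them onto the same scale. With real microRNA-seq data, many zero counts can leave few genes for the median, which is one practical reason these assumptions deserve scrutiny.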

Garmire and Subramaniam compared a number of normalization methods applied to mammalian microRNA-Seq data using two publicly available datasets that were chosen by the authors because matched PCR data were available. They assessed the performance of the normalization methods by calculating the mean square error (MSE) and the Kolmogorov-Smirnov (K-S) statistic (which is a measure of the difference between two distributions), as well as by comparison with PCR data on the same samples and inspection of the results of differential expression. The authors show that Lowess, quantile, and VSN normalization resulted in a smaller MSE, while TMM and INV produced a higher MSE. Similarly, TMM and INV resulted in a larger K-S statistic, indicating a bigger change in the distribution. Quantile and Lowess normalization also had the best concordance with qPCR data. The authors thus concluded that Lowess and quantile normalization performed better than the other methods studied. The primary limitations of this study are that it did not compare the results of the normalization methods to any “gold standard” for which the real distribution was known, and that the number of data sets used for the analysis was very small. In addition, other methods have been implemented since the publication of their study.
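For readers unfamiliar with the two summary measures, the sketch below gives their generic textbook definitions in Python. It is not necessarily the exact formulation used by Garmire and Subramaniam, and the function names and example values are ours.

```python
from bisect import bisect_right


def mean_square_error(xs, ys):
    """Mean of squared differences between paired values."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest vertical distance between the empirical cumulative
    distribution functions of the two samples. Both functions only change at
    observed values, so checking the pooled observations is sufficient.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    distance = 0.0
    for v in xs_sorted + ys_sorted:
        fx = bisect_right(xs_sorted, v) / len(xs_sorted)
        fy = bisect_right(ys_sorted, v) / len(ys_sorted)
        distance = max(distance, abs(fx - fy))
    return distance


# Made-up values: identical, partially overlapping, and disjoint samples.
same = ks_statistic([1, 2, 3], [1, 2, 3])
partial = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
mse = mean_square_error([1, 2, 3], [1, 2, 5])
print(same, partial, disjoint, mse)
```

A K-S statistic near 0 means the two distributions are nearly indistinguishable, while a value of 1 means they do not overlap at all.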

Dillies et al. published another comparison of normalization methods for RNA-seq data at about the same time. Most of the datasets used were long RNA-seq data, but a murine microRNA dataset was included. The seven methods compared in this study were TC (total counts), UQ (upper quartile), Median, DESeq, TMM, quantile, and RPKM. TC, UQ, and Median are closely related scaling approaches, each calculating a scaling factor from the total counts (TC), the upper quartile of counts (UQ), or the median of counts (Median). The authors assessed these normalization methods on real as well as simulated data. In their analysis, boxplots of counts before and after normalization show the most apparent differences where library sizes differ greatly between samples. These differences do not improve after TC and RPKM normalization of the microRNA-seq data. According to the authors, other features of microRNA-seq data that influence the results of normalization are the presence of high-count genes and a large number of zero counts. Quantile normalization increases the intra-condition variability in the murine miRNA data. The differential expression results in this study indicate that differences in the final results depend on the normalization method and not on the model used to test for differential expression (DESeq or TSPM). In the simulated data, which contain high-count genes and may resemble microRNA-seq data more closely than the other datasets, only DESeq and TMM achieved a low false positive rate and high power. The authors conclude that DESeq and TMM are the methods of choice based on their ability to perform in the presence of different library sizes and compositions.
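To make the similarity of TC, UQ, and Median concrete, here is a minimal Python sketch in which the three methods differ only in the per-sample statistic used to derive the scaling factor. Two details are assumptions of this sketch rather than statements about the authors' code: zero counts are ignored when computing the upper quartile and the median, and each sample is rescaled to the mean of the statistic across samples. The helper names and toy counts are ours.

```python
from statistics import mean, median, quantiles


def _upper_quartile(counts):
    """75th percentile of the non-zero counts (needs at least two of them)."""
    nonzero = [c for c in counts if c > 0]
    return quantiles(nonzero, n=4, method="inclusive")[2]


def _nonzero_median(counts):
    """Median of the non-zero counts."""
    return median(c for c in counts if c > 0)


# The only difference between the three methods is this summary statistic.
STATISTICS = {"TC": sum, "UQ": _upper_quartile, "Median": _nonzero_median}


def scale_samples(samples, method="TC"):
    """Rescale every sample so its chosen statistic equals the cross-sample mean.

    samples: dict mapping sample name -> list of read counts.
    """
    stat = STATISTICS[method]
    per_sample = {name: stat(counts) for name, counts in samples.items()}
    target = mean(list(per_sample.values()))
    return {
        name: [c * target / per_sample[name] for c in counts]
        for name, counts in samples.items()
    }


# Toy data: sample "b" is sample "a" sequenced twice as deeply.
toy_samples = {"a": [0, 10, 20, 70], "b": [0, 20, 40, 140]}
results = {m: scale_samples(toy_samples, m) for m in STATISTICS}
for m, scaled in results.items():
    print(m, scaled)
```

On this toy example all three methods agree, because the samples differ only in depth. They diverge when a few high-count RNAs dominate one library, which moves the total far more than it moves the upper quartile or the median.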

The differences in the conclusions from these two studies are likely due to the different normalization methods and specific datasets each used, and to the parameters they used to evaluate the performance of the normalization methods. Since the Garmire and Subramaniam paper looks only at miRNA data and the Dillies et al. paper looks at a mix of data, including miRNA and mRNA data, it is difficult to compare the results. We suggest that the additional complexities of compiling mRNA levels from multiple read sequences may confuse the latter analysis. However, both studies agree that the use of different normalization approaches can result in significant differences in downstream differential expression results. One of the key weaknesses of both of these papers is the lack of data from “gold standard” datasets for which the quantities of the different RNAs were definitively known.

For this reason, it would be valuable to develop the tools necessary for rigorous comparison of the available normalization methods. Such tools may include the generation of standardized data sets, such as small RNA libraries constructed from purely synthetic miRNAs, for which the content can be completely controlled, or RNAs from biological specimens that contain synthetic miRNAs spiked in at controlled concentrations. Corresponding qPCR results, used with calibration curves, should be generated for such datasets, which will serve as an orthogonal measurement technology for developing and evaluating normalization methods. Overall, this important topic needs careful attention for the establishment of reference exRNA profiles, and for the realization of the full potential of the powerful technology of high throughput RNA-seq.
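As a sketch of how a qPCR calibration curve could be used with spiked-in synthetic miRNAs, the Python below fits the usual straight line relating quantification cycle (Cq) to the log10 of the input amount, then inverts it to estimate an unknown amount. The dilution series and Cq values are invented for illustration, and the function names are ours; real curves would come from measured standards.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Invert the fitted curve to estimate the input quantity for a given Cq."""
    return 10 ** ((cq - intercept) / slope)


# Invented dilution series of a synthetic spike-in (copies per reaction)
# and invented Cq values lying exactly on a line, for illustration only.
spike_in_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
spike_in_cq = [33.0, 29.5, 26.0, 22.5, 19.0]

slope, intercept = fit_standard_curve(spike_in_copies, spike_in_cq)
estimate = quantity_from_cq(26.0, slope, intercept)
print(slope, intercept, estimate)
```

Absolute quantities estimated this way could then be compared with normalized read counts for the same spike-ins to judge how well each normalization method recovers the known inputs.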

* * *

Note that one of us recently presented a web seminar on a related topic, Understanding and using small RNA-seq, that is available for viewing. Also, members of the ERCC will be jointly presenting a workshop on Data Normalization Challenges and Solutions as part of the CHI conference Extracellular RNA in Drug and Diagnostic Development in Cambridge, MA, 3-6 April 2016. See the Event page for more details.

Exosome Diagnostics

Exosome Diagnostics, Inc. has announced the launch of ExoDx Lung(ALK), the first ever CLIA-validated exosome based blood test. This test detects EML4-ALK fusion transcripts in the plasma of lung cancer patients whose primary tumors carry this mutation. Although these patients make up a small minority of cases, identifying them is important because their tumors are particularly sensitive to ALK inhibitors. Current clinical tests are performed on biopsied tumor tissue. Major drawbacks include not only the risks associated with the invasive biopsy procedure but also low test performance for some of the commonly used methods (especially immunocytochemistry-based tests). In contrast, ExoDx Lung(ALK) requires only a standard blood draw and has 88% sensitivity and 100% specificity (as reported by the manufacturer). This is a major milestone in the application of exosome-based biomarkers to precision medicine, especially in the areas of companion diagnostics and targeted therapies.

The key discovery that led to this test was made eight years ago when Dr. Johan Skog, the Chief Scientific Officer at Exosome Diagnostics, demonstrated that a mutation present in the tumors of patients with glioblastoma could be detected in their blood (Skog et al., 2008). The publication reporting this finding has been cited over 1600 times, indicating its significance and potential for broad application. Now lung cancer patients can directly benefit from the first clinical test based on this discovery, in the form of a non-invasive test to determine the EML4-ALK mutation status of their tumors. We are sure that many more biofluid-based tests, targeting not only cancers but also other conditions where non-invasive testing is desired, will soon follow.

The first public release of the exRNA Atlas is now available via the ExRNA Atlas link in the Quick Links section of the exRNA Portal. The Atlas is produced by the NIH Common Fund’s Extracellular RNA Communication (ERC) Consortium and includes 519 small exRNA profiles from eight laboratories. Each profile in the exRNA Atlas acknowledges the contributing laboratory. The profiles were derived from about 6.4 billion reads uniformly processed using the exceRpt small RNA-seq pipeline. Faceted filtering and data navigation tools — hosted by GenboreeKB — are enabled by rich metadata standards developed by the consortium and metadata annotations contributed by the data producers. Uniform data quality metrics agreed by the consortium were applied to all datasets. On behalf of the Bioinformatics Research Lab at Baylor College of Medicine and the whole Data Management and Resource Repository (DMRR), I would like to thank the contributors and the consortium for the outstanding team effort required to reach this important milestone!

To balance open data sharing with the desire of data contributors for a protected period of time to analyze and publish the data they have produced, the data access policy for datasets in the exRNA Atlas provides for a 12-month embargo period. The embargo period expires on 1 July 2016 for the profiles in the current release. Researchers may analyze embargoed datasets from the Atlas but may not publish or make scientific presentations about them until the embargo period has ended. The Atlas will be updated regularly with new profiles, each new profile having its own 12-month embargo period per the data access policy. The read-level information for the profiles in the Atlas will be deposited in GEO (unrestricted access) or dbGaP (controlled access). The exRNA Atlas profiles will contain links to these archival records as the data are deposited.

The exRNA Atlas website is currently optimized for the Firefox browser. Extensive testing on other browsers has yet to be performed. We expect few problems on other browsers, but if you do encounter one, consider using Firefox. Optimization for mobile devices has also yet to be completed.

Sai Subramanian of the DMRR highlighted the features of this new release of the exRNA Atlas on an ERCC webinar on 4 Feb, 2016 at 1pm ET. If you missed the live talk, it will be available soon afterwards at exRNA.org/About.

Where do we go from here? Of course, we are just at the beginning. By the end of 2016, the amount of data from the consortium’s reference profile projects will likely dwarf this first release. Our next focus here at the DMRR will be to “test drive” the data by performing a number of integrative analyses and to deploy analysis tools that may be applied both to the Atlas profiles and to profiles that are not yet public. Stay tuned and Happy New exRNA Year!

Non-coding RNAs (ncRNAs), for example microRNAs (miRNAs), are frequently dysregulated in cancer and other diseases, and have shown great potential as tissue-based markers for cancer classification and prognostication. ncRNAs are present in membrane-bound vesicles, such as exosomes, in extracellular human body fluids. Circulating miRNAs are also present in human plasma and serum and cofractionate with the Argonaute2 (Ago2) protein and high-density lipoprotein (HDL). Since miRNAs and other ncRNAs circulate in the bloodstream in highly stable forms, they may be used as blood-based biomarkers for cancer and other diseases. A knowledge base of non-invasive biomarkers is a fundamental tool for biomedical research in this field.

In 2012, miRandola was developed as the first database of circulating extracellular miRNAs (Russo et al., 2012). miRandola is a comprehensive, manually curated collection and classification of circulating extracellular miRNAs. We recently updated miRandola with 271 papers, 2695 entries, 673 miRNAs and 12 long non-coding RNAs. The future direction of the database is to be a resource for all potential non-invasive circulating nucleic acid biomarkers.

miRandola_schema

miRandola is the first online resource which gathers all the available data on circulating RNAs into one environment (see Figure). It represents a useful reference tool for anyone investigating the role of extracellular RNAs as biomarkers, as well as their physiological function and their involvement in pathologies.

The database is updated as new data become available, and the online submission system is a crucial feature that helps keep it up to date. We are working on a second version of the database to increase the amount of data and to improve usability. miRandola is available online at https://mirandola.iit.cnr.it/.

The exRNA portal at WikiPathways continues to grow. It now includes 46 exRNA-related pathways, including 15 from publications by the Extracellular RNA Communication consortium. We are committed to capturing every published pathway figure from the consortium as a properly modeled pathway at WikiPathways. If your work involves pathways, especially if you are publishing it, then look into contributing to WikiPathways.

The latest pathways to be curated from consortium publications are:

  1. EV release from cardiac cells and their functional effects (WP3297)
  2. The Rac1/Pak1/p38/MMP-2 pathway (WP3303)
  3. eIF5A regulation in response to inhibition of the nuclear export system (WP3302)
  4. MFAP5-mediated ovarian cancer cell motility and invasiveness (WP3301)

The goal of the WikiPathways exRNA portal is to build a collection of pathway models for exRNA researchers to use for illustration, data visualization, and analysis. Each pathway is a self-contained data model that connects to identifier and annotation databases. In addition to providing static images for figures and presentations, these pathways can also be used by bioinformatics and network analysis packages such as Cytoscape and PathVisio. Furthermore, as a wiki, anyone can sign up to improve and grow the content. We invite you all to edit, fix, and add to the pathway models in the exRNA portal at WikiPathways.