I have been thinking a lot during the past year about biomarkers… actually, more specifically about the data needed to develop the best biomarker panels, the sample sizes needed, and how to convince patients – and even healthy individuals – to participate in research to help develop these biomarker panels. I have come to the conclusion that we might all be doing it wrong. In fact, I think we all KNOW we are doing it wrong deep down inside.
First, don’t we all believe that the best biomarker panels are going to be based on longitudinal analysis of biomolecules? In other words, the eventual successful biomarker approaches will mostly measure the “rate of change” of the biomarker panel and not just presence/absence or something similar like quantitative level differences. Are there examples of good biomarker panels that AREN’T rooted in longitudinal measures? Feel free to post your favorite example of a good clinical biomarker/panel in the comments section. Think of the PSA test as an example. A single PSA level measurement isn’t generally accepted as a good indicator for surgical intervention, as it can be biased by other things. The utility of PSA testing may primarily reside in an examination of the longitudinal measurements. Work in the NEJM suggested that the rate of rise of PSA during the year prior to a diagnosis of prostate cancer can indicate more severe disease (D’Amico et al, N Engl J Med 2004). So, why aren’t we all focused on collecting longitudinal samples? Is it because our biospecimen of choice isn’t really all that “non-invasive”, even though we like to try to convince ourselves and our colleagues that it is?
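To make the “rate of change” idea concrete, here is a minimal sketch of how a per-year slope could be estimated from a handful of dated measurements using an ordinary least-squares fit. The function name and the values are hypothetical and invented purely for illustration; this is not the method used in the NEJM paper or in any clinical guideline.

```python
from datetime import date


def rate_of_change_per_year(measurements):
    """Least-squares slope of value versus time, in units per year.

    `measurements` is a list of (date, value) pairs from one person.
    """
    if len(measurements) < 2:
        raise ValueError("need at least two time points to estimate a slope")
    first_day = min(d for d, _ in measurements)
    # Express time as years elapsed since the first sample.
    xs = [(d - first_day).days / 365.25 for d, _ in measurements]
    ys = [v for _, v in measurements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical values (ng/mL), invented purely for illustration.
stable = [(date(2010, 1, 1), 3.9), (date(2011, 1, 1), 4.0), (date(2012, 1, 1), 4.1)]
rising = [(date(2010, 1, 1), 1.0), (date(2011, 1, 1), 2.4), (date(2012, 1, 1), 4.1)]

print(round(rate_of_change_per_year(stable), 2))
print(round(rate_of_change_per_year(rising), 2))
```

Both made-up series end at exactly the same level, so a single measurement at the final visit could not tell them apart. Only the slope separates them, which is the whole argument for collecting longitudinal samples.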
Coming from a neuroscientist, this is crazy talk bordering on sacrilege. Everyone knows that cerebrospinal fluid (CSF) is the best place to go looking for CNS biomarkers. Come on, they say, a spinal tap isn’t that bad. We have the approach down – no pain at all. We can even get samples from age-matched healthy controls without too much issue. I believe all of this, of course. I know there are excellent clinical teams out there that can achieve precisely those things… the problem is that there aren’t enough people out there willing to undergo it just for the sake of research. We have what I would describe as a chicken-and-egg problem here. Of course folks would be willing to donate CSF immediately if they knew that the biomarker panel would tell them something about their personal risk for disease AND if that knowledge would help their treating physician “change their future”. But we aren’t there yet… we are still in the phase of asking folks to participate in research with the hope of developing those idealized biomarker panels. Couple that with the fact that CSF collection is costly – in both donor time and clinical costs – and we are left with underpowered studies. Look, I still think it is the best place to look for CNS-disease biomarkers, but I lose sleep at night worrying about our actual statistical power to detect biomarker panels with anything but a pretty large effect size.
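For a rough sense of why small cohorts only catch large effects, here is a back-of-the-envelope sketch using the standard normal approximation for a two-group comparison of a single marker. The function name and the defaults (two-sided alpha of 0.05, 80% power) are my own assumptions for illustration, and a real panel would need more participants still once multiple markers and multiple testing come into play.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {participants_per_group(d)} per group")
```

Under these assumptions a large effect (d = 0.8) needs about 25 donors per group, a medium one (d = 0.5) about 63, and a small one (d = 0.2) about 393. That last number is a lot of spinal taps.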
Secondly, are we really getting as rich a data collection on our research participants as we should be (or could be)? How many of us are using data in Excel sheets that was transcribed from medical notes? Or maybe we are “cutting edge” by getting all of our patient data directly from the clinic’s electronic medical records… achieving that level of integration is definitely to be applauded. But how much data are we leaving “on the table”? I suggest that we are leaving a whole lot of meaningful data out there. This came to the front of my mind recently when I was traveling with a colleague from our hotel to the site of our mutual meeting. She was trying to remember exactly what she had for dinner the night before so she could log it into a diet tracking app on her smartphone. As luck would have it, I was sitting next to her and we had dinner together the night before, so we essentially tag-teamed our recollections of what she ate… it probably would have been easier without the wine we had WITH dinner. Anyway, my point is not mind-bending, as researchers have known about it for a long time… self-report data becomes unreliable pretty quickly. So, if I wanted to include data about her dietary choices a few days after she made them, it would probably be a very incomplete picture. But the remedy for this is simple – we should interact more with our study subjects through the habits they already have for tracking data in real time. You are probably thinking that this daily tracking of diet information isn’t all that common… not so fast. MyFitnessPal, an app that tracks activity and dietary choices, has over 40 million users. The new iPhone has an activity tracking feature built into the phone. The whole area of “wearables” and the “internet of things” is gaining significant momentum. Jawbone’s UP sleep tracking bracelet is worn by millions of people each night as they drift off to sleep.
We are collectively leaving this data “on the table” and it could be really important when we start to attempt to identify the best biomarkers associated with our disease or condition of interest.
Last, do we need to wait for an official clinical visit to phenotype our patients/subjects? Couldn’t we collect data on a daily basis by interacting with them in their own home through the internet? Many useful pieces of data could be collected by simply asking a few questions or having them complete a few web-based tasks once a day or once a week or even once a month. We wouldn’t have to go for months without assessing them while we wait for the next clinical follow-up visit. And what happens during that clinical visit? Is that truly the “average” snapshot of the patient and their progress? Or are they changed – and their habits changed – simply by the need to go to the doctor that day? White coat hypertension is a great example of how a patient can respond differently phenotypically when at the doctor’s office. We can do better.
Thinking about these three points, my laboratory has started to do some work to change our path in our biomarker studies. We have started to reduce our RNA sequencing assays to practice in smaller and smaller samples and in biospecimen types that can be collected by subjects on their own in the comfort of their own home and mailed to our laboratory. The idea is simple – we need larger sample sets to develop the best biomarker panels; therefore we need to utilize biospecimens that are easier and less invasive to collect. I concede that our biomarker may never even end up in the fluid that we are assaying, but that is exactly what we need to explore. We have started to partner with companies producing wearable devices that measure various parameters. Take the sleep data for example… it may not be as rigorously collected as one can achieve during an overnight stay at a sleep study center, but we are betting it is “good enough” since we can get data for 30 days in a row or more. To me, it just seems logical that it will generate a much clearer picture of the study subject. Eventually we will have years and years of data from the same subject measured with the exact same device in the privacy of their own home. Additionally, we have started to explore the area of phenotypic assessment over the internet. This is easier to do in some fields than in others. My lab is interested in the genomics of cognition. We developed a short web-based visual reaction time and episodic memory task at our study site – www.mindcrowd.org. We have had almost 50,000 unique test takers during the past two years. The insights provided by this large dataset, and the many ways we can slice and dice the data across different demographic variables (even the rare combinations), are pretty staggering. Soon we will begin to collect two-year decline data on thousands of test takers who participated in our study during the first month we started.
I guess research is in a constant state of “doing it wrong” actually. We are always learning. However, I really feel strongly that the speed at which “personal technology” – like cell phones – changes today is so fast that we in the clinical translational research area have been left behind a little bit. We definitely need to catch up, because we are leaving a lot of data on the table.