NIH’s Ambitious Precision Medicine Research Program

Mr. John Wilbanks is the Chief Commons Officer at Sage Bionetworks. Previously, Wilbanks worked as a legislative aide to Congressman Fortney “Pete” Stark, served as the first assistant director at Harvard’s Berkman Center for Internet & Society, founded and led to acquisition the bioinformatics company Incellico, Inc., and was executive director of the Science Commons project at Creative Commons. In February 2013, in response to a We the People petition that was spearheaded by Wilbanks and signed by 65,000 people, the U.S. government announced a plan to open up taxpayer-funded research data and make it available for free.

Prostatepedia spoke with Mr. Wilbanks about Sage Bionetworks’ role in All of Us, the National Institutes of Health’s ambitious precision medicine research program.

How did you come to work at Sage Bionetworks?

Mr. John Wilbanks: I got involved with Sage when it was first beginning. Sage was an informatics unit of Merck, and in 2009, they began to explore what they could get for the unit. But we convinced them to spin it out into a nonprofit organization instead of selling it off.

I got involved then as a board member because I was able to help negotiate what the IP structure would look like, how we would get rid of some of the patent constraints and other kinds of intellectual property so that we could build a nonprofit. I have been involved ever since, at first as a board member, then as a consultant, and then in 2012, as a full-time employee.

I lead the Governance team at Sage, which means that my group works on things like informed consent, clinical protocol design, data-sharing and access policies. We work on strange and weird structures that enable collaboration in a variety of ways, and we have a pretty broad view across the organization as a result.

What is the All of Us program?

Mr. Wilbanks: All of Us is a longitudinal cohort study. It is fundamentally an attempt to enroll a million people and to characterize them as completely as we can. This means we collect and look at their health records, pharmacy records, their environment, biospecimens, metabolic data, their genomes, data that we collect from their devices and smartphones, surveys over a ten-year period—you name it. Then, we make that data liberally available so that we can run all sorts of interesting queries.

We’re trying to take the Framingham Heart Study model and reimagine it for the 21st Century. Framingham is a breakthrough study, but it studied one town in Massachusetts, and then its diaspora over time. That means that it’s fairly white, and it has all these biases in it. Also, it doesn’t study anything besides heart health.

All of Us aims to take the idea and the impact of a study like Framingham and reimagine it using a completely modern, digital approach to everything. What would happen if you made that data liberally available? What would happen if you made a point of ensuring that 700,000 of the 1,000,000 participants come from populations that are underrepresented in biomedical research?

That’s one of the reasons it’s been hard to talk about; it’s not a study of prostate cancer. It’s a study that will involve hundreds of thousands of people, some of whom may have prostate cancer, some of whom may have survived prostate cancer, and some of whom may develop prostate cancer. But that’s not the focus. The idea is that we’d be able to subdivide that cohort endlessly in ways that let us think about public health and identify populations for sub-studies as easily as possible.

So then, the goal is to pull in as much data about these people as you can and then make inquiries into the data in various ways?

Mr. Wilbanks: That’s right. And we also want to open up who gets access to the data. It’s one thing to say the people at Harvard can run analytics; it’s very different to say that the community being studied can run analytics. That is also part of the design.

A lot of the questions that will be asked will come from advocates who know what questions need to be asked, questions the scientists don’t know need to be asked. We’ve been trying to design the system to maximize the number of people who are allowed to be data analysts and not just data donors. In many cases, we hope that the donors and analysts are the same people. That level of engagement leads people to start asking questions, not just providing information.

Will people be getting their own information back? Obviously, wearables and devices would feed information to their own electronic records, but I know they’re going to be doing some genomic tests. Will people get the results from those kinds of tests?

Mr. Wilbanks: Yes. The study is guided by a set of core values and principles, and one is to prioritize the participant’s right to their data. All data provided by the participant will be provided back to the participant: nothing about me without me. We’re still figuring out how to do that because it’s really complicated.

Don’t you de-identify data first? Then, how do you re-identify it?

Mr. Wilbanks: That’s a little easier. You have to de-identify data before you get it to the data user. But, it’s easy to know for a given sample who that sample came from because that’s what allows us to connect it to the demographic data.

It’s relatively easy to get it back to the individual, but the question of what to return to them is difficult. If it’s their genome, do we give them their BAM files, which are massive? Or do we give them a VCF, which lists the differences between their genome and the reference genome and is tiny? Do we give them images? How many times do we let people download data, given that the cloud transfer cost would be high? How do we get consent for that? It’s complicated.

We still have to figure out exactly how we’re going to do all of those things, but it is a core principle of the study that nothing about you happens without you, and by the end of the study, you should have as much of your electronic health record as possible in one place, in one form. You should have your genome, all of the survey data you offered, all your wearable data, and you should have all the ancillary information we discovered about you. You should be able to take that with you and do what you want with it.
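Editor’s note: the idea described above, in which data is de-identified before it reaches a data user while the study can still trace a sample back to its donor, can be sketched roughly as follows. This is a minimal illustration, not the All of Us implementation. The `LinkageTable` class, the `deidentify` function, and the field names are all hypothetical, and real de-identification must also deal with indirect identifiers such as dates and locations.

```python
import secrets

# Hypothetical list of fields that directly identify a person.
DIRECT_IDENTIFIERS = {"participant_id", "name", "email"}


class LinkageTable:
    """Study-side lookup between participants and random pseudonyms.

    Assumption: this table is kept private by the study and is never
    shared with the people who analyze the released data.
    """

    def __init__(self):
        self._pseudonym_by_participant = {}
        self._participant_by_pseudonym = {}

    def pseudonym_for(self, participant_id):
        # Reuse the same pseudonym so one person's records stay linked.
        if participant_id not in self._pseudonym_by_participant:
            pseudonym = secrets.token_hex(8)
            self._pseudonym_by_participant[participant_id] = pseudonym
            self._participant_by_pseudonym[pseudonym] = participant_id
        return self._pseudonym_by_participant[participant_id]

    def participant_for(self, pseudonym):
        # Only the study can do this step, for example to return results.
        return self._participant_by_pseudonym[pseudonym]


def deidentify(record, table):
    """Return a copy of the record with direct identifiers removed."""
    released = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    released["pseudonym"] = table.pseudonym_for(record["participant_id"])
    return released
```

In this sketch, a data user sees only the pseudonym and the measurements, while the study can call `participant_for` on a pseudonym to find the person a result belongs to.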

What is Sage’s role in all this?

Mr. Wilbanks: We are a sub-awardee of what’s called the Participant Center, which is led by the Scripps Translational Science Institute in San Diego. We have two different lines of work inside the program, two core jobs. One is governance-based. We work on the clinical protocol, informed consent, and data-sharing systems. The other job is digital health technologies, and that’s a different team from mine. They work on building software modules that sit on smartphones and pull data off as measurements. They design them, figure out how to validate them, and figure out how to feed them into the technology system.

You’re basically trying to figure out how you can pull data from the apps or wearables that participants already use?

Mr. Wilbanks: That’s part of the work of the digital health technologies (DHT) group, and that’s led more by Scripps. We use the features of devices.

For example, we think we can get a tremor measure for neurodegeneration with a module that reads the accelerometer in a smartphone. We can measure a person’s gait by having them put their phone in their pocket and take 20 steps forward and 20 steps back. We can measure phonation through a microphone. We can measure memory and tapping through the touchscreen.

We want to design modules like these that are clinically validated to measure those things so that anyone who wants to measure gait, lung capacity, memory, or what have you can rapidly access that inside the All of Us app or a related app. And they should feel confident that the data is relatively consistent and valid.
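Editor’s note: as a rough illustration of the kind of signal processing a tremor module might involve, the sketch below estimates the strongest oscillation frequency in a short run of accelerometer readings using a plain discrete Fourier transform. This is an assumption about one possible approach, not a description of the modules Sage or Scripps built. The function name `dominant_frequency` and its parameters are hypothetical, and a clinically validated measure would need far more than this.

```python
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (in Hz) with the most power in the signal.

    samples: accelerometer readings along one axis, evenly spaced in time.
    sample_rate_hz: how many readings were taken per second.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Remove the constant offset (for example, gravity) before analysis.
    centered = [s - mean for s in samples]

    best_freq, best_power = 0.0, 0.0
    # Check each frequency bin up to the Nyquist limit.
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = k * sample_rate_hz / n, power
    return best_freq
```

Fed two seconds of a synthetic 5 Hz oscillation sampled 50 times per second, the function reports 5 Hz. Real phone data would be noisier and would combine three axes, so filtering and validation against clinical assessments would still be needed.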

