Big Data Triggers Revolution in Discovery

Oncology Business News®, November 2016

Atul Butte, MD, PhD

In the old days, not too long ago, doctors’ offices were full of paper records that filled shelf after shelf. This cumbersome way of keeping track of patients was effective in its own way, but with the rise of modern medicine and the power of computing, there is a need to move beyond it. Atul Butte, MD, PhD, believes in the growing power of computing to bring about the next stage of precision medicine: a world where a doctor can not only give you the standard advice not to smoke or overeat, but can also tell you, for example, that based on a scan of your entire personal genome, you had better stay away from all forms of pesticides because you are statistically more likely to develop cancer from this kind of exposure.

Butte, the keynote speaker at this year’s CFS™ Chemotherapy Foundation Symposium: Innovative Cancer Therapy for Tomorrow®, held at the Marriott Marquis in New York City on November 9-11, heads the Institute for Computational Health Sciences at the University of California, San Francisco, which has taken on the role of harnessing the power of “big data” to bring about new cures for patients. He has spoken to numerous audiences about the exponentially increasing power of computing to capture more and more data, as well as the phenomenal miniaturization of chips and computers that recently delivered a genome sequencer that can fit in the palm of your hand. However, he also emphasizes that it is the job of humans to interpret that data and turn it into something useful and productive for mankind. Butte’s talk is titled “Translating a Million Points of Data Into Therapies, Diagnostics, and New Insights Into Disease.”

Butte, a noted expert in pediatrics and medical informatics, also serves as executive director of clinical informatics for UC Health Sciences and Services, which is working to build a data warehousing and analytics platform that can function as a tool for researchers to make use of the huge amount of clinical data coming from the five medical centers in the University of California system. Better understanding of disease and better therapeutic approaches to handling disease are the goals.

Data aggregation is not new, but it is a growing phenomenon, and the opportunities in this field are so vast that a plethora of new companies are springing up to capitalize on the services and research aspects of clinical data, Butte said. These developments are breaking down the barriers to information flow and are making it possible to learn and achieve more than ever before, he said. “Pharma companies, payers, and electronic health records are sources of data, but each of these are siloed right now. A few companies have been visionary, because they see the potential for aggregating this sort of data—integrating it. But I think more and more data on patients is going to be generated electronically. Medical practice is becoming a digital field, instead of pen to paper like in the old days.”

Butte himself is an entrepreneur, having founded two biotechnology companies, Personalis and NuMedii. Personalis provides DNA sequencing and human exome and genome testing for researchers and clinicians. NuMedii uses proprietary methods developed by Butte at his Stanford University laboratory to mine big data for drug candidates and biomarkers likely to prove helpful in combatting disease.

These companies have a lot of competition. Butte himself noted that 90 different companies have formed recently to leverage the power of machine learning and artificial intelligence in medicine. These are heavily concentrated in Boston, Pittsburgh, and the San Francisco Bay Area. In tandem with the rise of these companies, it has become much cheaper to contract for trials and other forms of pre-clinical testing. Butte believes these services vastly improve options for pharma companies looking for validation of their drug candidates while also opening the door to less-well-funded concerns in need of confirmatory trials.

“These contract research organizations are all over the world, and there are now websites that aggregate these companies: Assay Depot, Science Exchange. There are many others. So, you could learn how to do all of this testing for yourself or you could find a company that might be able to do the work for you. The business world has been outsourcing for a long time, and now you can imagine this process also happening in drug discovery—in biomedicine, as well,” he said.

With such an explosion of big data entrepreneurship, such rapid growth, and so many new entrants, some companies may do sloppy work. Butte believes, however, that the high level of competition in this environment is itself a built-in safeguard, one that is enhanced by the reduction in prices brought about by the power of computing. “When the price is so low that you can order the same experiment from so many independent companies, the quality issue just goes away,” he said. The results from one organization can be cross-checked against those from another, and another.

He said that even the quality of work done by many traditional research institutions and laboratories has been questioned and, in some cases, found not to be reproducible, whereas newer, cheaper research groups can potentially do the job in less time and at a higher quality, equipped as they are with more advanced, more powerful technology.

Similarly, Butte believes that protection of patient records has been suitably addressed as the data aggregation movement continues. While more and more institutions are working to eliminate barriers to interoperability and create larger and larger pools of data, which contribute to the statistical validity of research findings, the Health Insurance Portability and Accountability Act continues to do its job, giving patients a say in how their information is used and restricting the latitude physicians and others have in sharing that information, Butte said. “The vast majority of institutions follow these practices.” In addition, data used at the pure research level is typically “de-identified” so that data on patient characteristics and sensitive financial or family histories are reduced or stripped from what is made available to second- or third-tier users of that information, he noted.

Restricted data sets are the answer to privacy concerns at the research level, he said, while acknowledging the rising incidence of patient data hacking that has plagued such giants as Anthem, which reported last year that unauthorized persons had gained access to 78.8 million patient records. That said, there’s plenty of evidence that physicians are working to head off this problem, as well they should, he said.

“I think physician practices need to take their information technology seriously. I think they do. I think escalating costs in implementing electronic health records—keeping them secure—might be a driving factor in these practices looking for aggregation relationships. It’s not just a physician with a computer anymore. I think we’re way past that.”

Ultimately, patients have long known the limitations of paper and siloed data storage systems, and they are among the strongest proponents of aggregation efforts that improve both data sharing and investigatory research into new drug candidates using warehoused data, Butte said.

“The unmet needs now are that patients don’t get enough access to their records,” he added. “That’s where health IT is going—toward patient engagement. Most people would want more access to their own records as patients, and more patients expect doctors to share information with each other at a level that is not happening. It is sad that many times information is shared by fax or by FedExing a printout of a record, whereas most patients expect that kind of interchange to be much smoother.”