Tuesday, October 11, 2011

Managing the Influx of Data

According to top experts in the field of data management, our society will need to deal with mammoth amounts of health and medical data in the coming years. Preparing for this influx means that professionals in the field will need updated management skills and specialized expertise to transform the medical and healthcare fields effectively. Discussion on this important topic took place at the October 5th HIT Lunch Briefing, held as part of the Capitol Hill Steering Committee on Telehealth and Healthcare Informatics series.

Neal Neuberger, Executive Director of the Institute for e-Health Policy and the event's coordinator and moderator, stressed the importance of making valuable research data accessible, but noted that a number of issues must still be resolved before this huge influx of data can be handled effectively.

He mentioned that the National Library of Medicine provides the world's largest collection of data in all areas of biomedicine and healthcare and delivers trillions of bytes of data to millions of users every day.

Speaking from a historical perspective, Dr. Gary Christoph, HHS Client Executive at Northrop Grumman, noted that Medicare is a large source of claims data, most of it stored on magnetic tape. The Integrated Data Repository (IDR) was planned to be sourced from Medicare claims data and to make that data available on a daily basis.

In 2003-2004, when Medicare began building the IDR, the aim was to keep the last three years of data on disk, on a Teradata platform optimized for Online Analytical Processing (OLAP). In 2011, GAO soundly criticized the IDR for still being incomplete: because claims can be submitted as long as 18 months after the date of service, the information is more than a year out of date.

Christoph described how the Center for Program Integrity (CPI) is in place to handle fraud, waste, and abuse issues and is building a new data warehouse sourced from claims processors. The HHS Office of Inspector General, however, uses a separate data warehouse for its own fraud, waste, and abuse investigations, and even that data is now about a year out of date.

Christoph emphasized that in the new health IT environment, the data collected will expand beyond purely administrative data to include vast amounts of clinical data. As Christoph sees it, the IDR will rapidly outgrow its current size as more state HIEs are established, as the NwHIN becomes more accepted, and as exchangeable EHRs become more prevalent.

Among Christoph's other thoughts: states, not the federal government, will become the hotbed of innovation; industry will have to settle on standards without federal government intervention; security and privacy will need to be better enforced; and a universal patient ID needs to be in place so people can be correctly matched to their records.

For clinical trials and the resulting data to contribute substantially to the development of lifesaving therapies and cures, more patients need access to clinical trials, providers need to be made more aware of clinical trials, and the costs of clinical trial recruitment and retention need to be reduced, according to Jim Bialick, Vice President, Technology and Public Policy, for the Health IT Now Coalition.

He reported that clinical trials consume nearly 40 percent of pharmaceutical R&D funding. It has also been shown that a significant share of pharmaceutical research studies are delayed by more than a month, partly because people are not informed about clinical trials, not because they are unwilling to participate. For every day that a drug is delayed, sponsors lose up to $8 million.

A survey of patient participation in trials found that 87 percent of patients are willing to share the medical information in their EHRs with researchers as long as it cannot be linked back to them personally. However, 91 percent of patients want to give approval before the information is shared among healthcare providers.

Cloud-based services, although new to healthcare, are in our future, according to Jonathan Bush, CEO and Chairman of Athena Health. As Bush explained, cloud-based services must operate with a viable business model and provide enough incentives to use the services.

According to Bush, individuals and organizations exchanging data should pay in some way for this service. As he indicated, health IT has not progressed as fast as it should, not because the appropriate technology is lacking, but because there is no financial market for healthcare information like the one that exists in other industries such as banking.

David Hartzband, D.Sc., Director for Technology Research at the RCHN Community Health Foundation and an MIT affiliate, pointed out that it is important not only to manage enormous amounts of data but also to use ultra-large data sets efficiently. Today the amount of data that needs to be analyzed is still limited, but it grows every year, and analysts will increasingly be forced to work with data on a huge scale.

A number of possibilities exist for analyzing vast amounts of data from many sources. In the future, for example, data could be used to determine how many patients in an EHR system had flu-like symptoms, to correlate the duration of an acute respiratory infection with the doses of specific medicines administered, and to determine how effectively seasonal respiratory infections respond to different drug therapies.
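Analyses like these amount to filtering and aggregating de-identified patient records. A minimal sketch of the first two examples, using hypothetical record fields (`symptoms`, `medication`, `infection_days`) invented for illustration:

```python
from statistics import mean

# Hypothetical de-identified EHR records; field names and values are
# illustrative only, not drawn from any real system.
records = [
    {"symptoms": ["fever", "cough"],   "medication": "oseltamivir", "infection_days": 5},
    {"symptoms": ["fever", "fatigue"], "medication": "supportive",  "infection_days": 9},
    {"symptoms": ["headache"],         "medication": None,          "infection_days": 0},
    {"symptoms": ["fever", "cough"],   "medication": "oseltamivir", "infection_days": 4},
]

FLU_LIKE = {"fever", "cough", "fatigue"}

# 1. How many patients presented with flu-like symptoms?
flu_patients = [r for r in records if FLU_LIKE & set(r["symptoms"])]
print(len(flu_patients))  # 3

# 2. Relate infection duration to the medication administered.
by_med = {}
for r in flu_patients:
    by_med.setdefault(r["medication"], []).append(r["infection_days"])
for med, days in by_med.items():
    print(med, mean(days))  # mean infection duration per medication
```

At scale the same logic would run as a distributed query over an analytic warehouse rather than an in-memory list, but the shape of the computation (filter, group, aggregate) is the same.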

As Hartzband said, “If we are going to be able to aggregate huge amounts of data, then new skills are needed to design the analyses and interpret the results. We are only at the beginning of using ultra-large data sets, but eventually outcomes will improve, costs will be better controlled, and public health information will be better analyzed.”

Sandeep Purao, Ph.D., Research Director for the Center for Enterprise Architecture at Penn State University, added his thoughts on handling the influx of data and on how important it is to add structure in order to deal effectively with the information explosion facing society today.

As he said, “Today, primary data comes from institutions and from public sources to support administrative and clinical tasks and systems. Since data is available everywhere, in different formats, the entire life cycle of data has to make sense for it to be useful in healthcare.”

Critical areas to think about when working with data are scale, interoperability, security, and making sense of the data. Specifically, the tasks involve:

• Scale—Working with healthcare data can involve moving from gigabytes to terabytes to petabytes, working with cloud-based data, and then extracting the data

• Interoperability—Means doing important work on voluntary and consensus standards including HL7

• Security—Means dealing with malicious users, working with measures such as Role-Based Access, and preventing unlawful access to EHRs

• Sense-making—Means measuring data quality in crowd-based forums, searching for patterns and user behaviors, and delivering data to be used for e-health
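The Role-Based Access measure mentioned under security can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea (roles, permission names, and the deny-by-default policy are assumptions for the example, not any system's actual access model):

```python
# Minimal role-based access control (RBAC) sketch.
# Role and permission names are hypothetical, for illustration only.
ROLE_PERMISSIONS = {
    "physician":     {"read_ehr", "write_ehr"},
    "billing_clerk": {"read_claims"},
    "researcher":    {"read_deidentified"},
}

def is_allowed(role: str, permission: str) -> bool:
    """Grant access only if the role explicitly holds the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("physician", "write_ehr"))      # True
print(is_allowed("billing_clerk", "write_ehr"))  # False: unknown grants are denied
```

The key property is that an unlisted role or permission is denied by default, which is what makes the model useful against the malicious or unlawful access the briefing raised.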

For more information, email asimmons@e-healthpolicy.org or neal@e-healthpolicy.org.