Researchers from the Pervasive Technology Institute at the Digital Science Center at Indiana University are developing and applying cloud computing techniques to support life science research. The National Institutes of Health (NIH) awarded the university $1.5 million in grant funding to construct an experimental supercomputing network called “FutureGrid.”
Technological advances have made medical and biological research increasingly data-rich in recent years, and scientists believe this trend will continue to accelerate. In the future, the demands of processing the extremely large sets of digital data produced by gene sequencing and other medical research technologies generally cannot be met by a single facility or supercomputer.
Cloud computing provides a way to outsource computing infrastructure and can create virtual supercomputers with greater computational power than any one facility can provide. Clouds also support new data-parallel technologies that can process massive data sets.
Cloud users can access nearly unlimited computational power, created by pooling distributed computing resources, through simple web interfaces. This eliminates the need for users or their institutions to own and maintain large, expensive computational equipment, and users do not need a detailed technical understanding of the computational resources supporting their research.
“Cloud computing approaches are likely to affect research in the coming years,” said Principal Investigator Geoffrey Fox, Director of the Digital Science Center. “These technologies hold significant promise in the life sciences and medical sciences as they offer the potential for greater computational power and faster speeds at a lower cost.”
The project team is developing a software infrastructure to make use of the substantial hardware and networking investment made by Indiana University and the National Science Foundation (NSF). The university and NSF have worked together to develop “FutureGrid,” a national experimental testbed, and TeraGrid, a national network of high-performance computing resources. In the future, the project will also harness commercial cloud computing infrastructure such as Amazon Web Services and Microsoft Azure, as well as open source software.
In addition to developing new cloud computing approaches, the research team will partner with several IU life science research teams to apply and test these techniques in their specific research areas. These areas include population genomics as well as the sequencing and assembly of gene fragments. Cloud technologies will also be applied to gene family clustering and to visualizing the structure of those gene families in three dimensions.