The University of Washington has announced two new research projects that will use cloud computing platforms from Internet companies such as Google, Microsoft, Amazon and IBM. According to a press release published on Genetic Engineering News, the university has won grants from the National Science Foundation to fund projects that simulate ocean climate and analyze astronomical images. Both projects will use cloud computing to examine and interact with “the massive datasets that are becoming more and more common in science.”
The University of Washington projects tie into a couple of major trends in the current era of the Web: there’s now much more data being created for the Web, or being transported to the Web; and we’re seeing Web technologies being used to analyze and make sense of that data.
This trend isn’t limited to science. We’re seeing it on the consumer Web too, as Marshall Kirkpatrick explained this morning in an article about social media monitoring tools. He wrote that data mining tools are being democratized and are more widely used, much as online publishing tools were democratized in Web 2.0. The cloud computing servers that the University of Washington will use are relatively cheap, easy-to-use Web platforms that will enable data mining on a scale not seen before. These projects will access a cloud datacenter established for educational use in 2007, through a partnership between Google, IBM and six academic institutions (including the University of Washington).
Oceans and Galaxies of Data
Bill Howe, a researcher at the UW’s eScience Institute, explained the impact of cloud computing on his ocean climate simulation project. Instead of running a simulation to test a single hypothesis, he said, climate scientists are now running long-term simulations and then sifting through tens of thousands of gigabytes of resulting data to discover trends.
Andrew Connolly, a UW associate professor of astronomy, explained that for his project analyzing astronomical images, cloud computing makes it easier to store and process information and to make it available over the Web. He said that whereas scientists once competed for time on telescopes, recorded data and then studied the individual images in detail, now “telescopes continuously record high-resolution images that are available to all, providing millions of times more information.” The shift, then, is that data gathering has been automated and the resulting data is available for scientists to analyze on a much larger scale than before.
Data Rich – And Useful
This current era of the Web, which some are calling ‘Web 3.0’ (but we frankly don’t know what it’s called yet), is increasingly data rich. The same could have been said about the Web 2.0 era, when oceans of ‘User Generated Content’ were created. However, sensors are now rapidly pouring even more data onto the Web. Ed Lazowska, a UW professor of computer science and engineering, noted that “the rapid evolution of sensors is transforming all sciences from data-poor to data-rich.” He said that “the challenge is to use modern cloud computing resources, such as Amazon Web Services, and modern computer science advances, such as data mining and machine learning, to explore these massive volumes of data.” He claimed that this new computational science will be pervasive and will have enormous impact.
We’re always pleased when the Web has a meaningful impact on the ‘real world’, particularly on science projects such as these, where the findings could be profound.