If The Graduate were remade today, the advice to young Benjamin Braddock might be “just one word… statistics.”
The explosion of digital data has generated a need for technology to store, serve, and analyze petabytes of data. But it’s also creating a lot of opportunities for people who are trained in the field of statistics. And more and more, that training involves learning R, the open source statistical programming language.
R was developed in the 1990s and has become the de facto standard for computational statistics and predictive analytics. Boasting over 2 million users, R has seen widespread adoption in part because it allows statisticians to do complex analyses without knowing other programming languages. Furthermore, as an open source project, R encourages users to add to the code, and more than 2,000 people regularly write packages that others can use to solve particular data analytics problems. “There is no statistical concept that cannot be rendered in R,” according to Norman Nie, inventor of SPSS (Statistical Package for the Social Sciences) and now CEO of Revolution Analytics, a company that offers a commercially supported, open core variant of R for its enterprise and academic customers.
Revolution Analytics Brings R to Big Data
Revolution Analytics released a new version of its Revolution R Enterprise package earlier this month, and it becomes generally available today. The new version contains an add-on package called RevoScaleR that is designed and optimized specifically to handle terabyte-class data sets without the RAM barriers of standard R. It also includes a collection of widely used statistical algorithms optimized for big data.
Currently, R is driven from the command line. But in early 2011, Revolution Analytics plans to release an enhanced graphical user interface to open up these statistical tools to more users.
Enterprise statistical software tools were once the purview of the financial and pharmaceutical industries. But data mining, business intelligence, and statistical analysis are becoming common business practices in many more industries, including retail, gaming, information services, and entertainment.
And while SAS claims it remains the leader in business intelligence, Nie says that the students who graduate today with advanced degrees in statistics are trained in R, something that creates a very strong ecosystem around the open source language. Revolution Analytics hopes it can bridge the academic and enterprise worlds of statisticians.