They say Big Data is going to be big business, big innovation - a big deal. But how is it going to go down? Applied math and decision science company Mu Sigma announced more than $100 million in new venture backing yesterday, including from previous investor Sequoia Capital, bringing the company's total investment to $150 million. Mu Sigma provides big data services to some of the biggest companies in the world.

How do they do it? With a combination of math, science, creative thinking and long hours of hard work. As democratized publishing, network connected devices and the instrumentation of everyday life combine to create a great blue ocean of big data all around us, the latest Mu Sigma funding is a valuable opportunity to get a taste of how one emerging leader in that market combines technology, math and art to engage with this big trend. Not everyone agrees that outsourcing Big Data work like this is the solution, though.

Can Big Data Be Outsourced?

Mu Sigma says it exists to "enable businesses to institutionalize data-driven decision making." Its 1,300 employees in Chicago and Bangalore help clients with marketing, supply chain and risk analytics. The firm says it "is arguably the world's largest pure-play decision sciences and analytics services company."

Employee reviews of the company on website GlassDoor paint a picture of hard-driving young employees working grueling hours for low pay, but learning a lot at a young company.

The seven-year-old firm helps clients in three areas: in marketing, customer segmentation and purchase likelihood analysis; in risk analysis, fraud detection and severity, and statistical analysis for FDA trials; and in supply chain work, trend plotting, due date quoting, expedition optimization and location allocation "decisioning." All of it is based on data.

How can Mu Sigma compete in each of those tasks with other firms that specialize in one or the other? That's unclear, but the company has developed momentum based on its broad approach. Mu Sigma says that it's profitable, though the company declined to provide any specific financial numbers.

Not everyone believes that solutions like Mu Sigma are the answer to Big Data problems and opportunities. "I'm skeptical of the idea of end to end 'analytics outsourcing' right now," says Peter Skomoroch, of DataWrangling.com.

"There is value in having external experts embedded with internal teams to help with big data, but to compete companies will also need to build up in-house talent.

"It is tough to find good data people, and even more difficult to find ones with business sense and domain knowledge. Insight and creativity are not likely to be commoditized any time soon. The competitive advantage in this space will go to companies that build up unique datasets and build teams that know how to leverage them. Most game changing analytics is going to come from a small set of talented individuals, not an army of contractors."

In-house data scientists are incredibly hard to find, though. Cathy O'Neil, data scientist at ad startup Intent Media, says this is in part because "It is far less sexy to try to honestly find the confidence interval of a prediction than it is to model behavior."

"Data scientists are considered magical when they forecast behavior that was hitherto unknown, and they are considered total downers when they tell their CEO, 'hey there's just not enough data to start that business you want to start,' or 'hey this data is actually really fat-tailed and our confidence intervals suck.'

"In other words, it's something like what the head of risk management had to face at a big bank taking risks in 2007. There's a responsibility to warn people that too much confidence in the models is bad, but then there's the political reality of the situation, where you just want to be liked and you don't actually have the power to stop the relevant decisions anyway."

Perhaps given that reality, Mu Sigma, an outside big data firm, clearly has some economic wind in its sails. Deborah Gage at the Wall St. Journal's Venture Wire provides a good look at the company's fast growth and interesting training program in her coverage this morning.

Mu Sigma and Innovation

Reading previous coverage of the company's work elsewhere, one name keeps coming up: Zubin Dowlaty, Vice President and head of innovation and development at Mu Sigma.

Dowlaty spent the 1990s doing statistical modeling at UPS. Then he joined the publicly traded InterContinental Hotels Group, where he was first the Director of Analytics in Consumer Insight and then the VP of Decision Sciences. He was featured prominently in a 2008 New York Times story about corporations using prediction markets to surface cost-saving and other ideas from inside their companies. Dowlaty was photographed for the story wearing a wizard's cap and holding a magical-looking walking staff.

He built an elaborate system that invited the hotel company's employees to submit and vote on ideas, rewarded those whose ideas were selected and used crowdsourcing to surface strategic initiatives the company could act on. "We wanted to tap the creative class that may not be able to voice their ideas," Dowlaty told the Times.

Since joining Mu Sigma, Dowlaty has become one of the company's most visible public figures. His statements, as the head of innovation and development at a firm so focused on innovation, are noteworthy.

In a January 2011 article from The Data Warehousing Institute on the rise of the data scientist, Dowlaty articulates the role of art and of science in big data.

"I'm not a big fan of the spaghetti method. It makes me nervous when people run a lot of analytic techniques just to get the answer they want, instead of being objective. Doing this job properly requires the rigor of a scientist. The scientist can see things that other people cannot see."

As a standalone statement, that doesn't sound particularly creative. It is important, though. "The 'spaghetti method,'" cautions Josh Wills, Chief Data Scientist at Cloudera, "frantically searching for a technique that gives you the answer you want (or potentially, the answer that someone higher up in the org wants), as opposed to using the scientific method. This is a big problem in the industry, and the theory is that using an external firm mitigates that habit to some extent. Being a good data scientist often means telling powerful people stuff that they don't want to hear."

Other statements from Dowlaty help put that sentiment about rigor in creative context. Mu Sigma itself uses a variety of analytic models to tackle the problems it takes on.

Dowlaty told Revolution Analytics, whose R statistics software Mu Sigma makes use of:

"We like to diversify our models...We have a portfolio of about 10 models that we'll run to assess the stability of the coefficient and the predictive capability of that particular model. By running all the models, you can see which ones are the best predictors."

Revolution says of Dowlaty's use of R at Mu Sigma, "The benefit of an 'ensemble' approach is that when new analytic techniques emerge, they can be brought into the mix without causing disruption. This makes R especially valuable to Dowlaty, since the R software library evolves continually as members of the worldwide R community contribute new packages and programs."
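For readers who want a feel for what running a "portfolio" of models looks like, here is a minimal sketch in Python rather than R. It is not Mu Sigma's code: the three toy models, the synthetic data and every function name below are assumptions made up for illustration. It scores each model on held-out data with cross-validation and ranks them by prediction error, which is one common way to see "which ones are the best predictors."

```python
import random
import statistics


def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares with a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys, k=3):
    """Average the targets of the k training points closest to x."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)

    return predict


def cross_val_rmse(fit, xs, ys, folds=5):
    """Mean held-out RMSE across `folds` interleaved folds."""
    data = list(zip(xs, ys))
    scores = []
    for f in range(folds):
        train = [p for i, p in enumerate(data) if i % folds != f]
        test = [p for i, p in enumerate(data) if i % folds == f]
        model = fit([x for x, _ in train], [y for _, y in train])
        mse = statistics.fmean((model(x) - y) ** 2 for x, y in test)
        scores.append(mse ** 0.5)
    return statistics.fmean(scores)


def rank_models(models, xs, ys):
    """Return (name, rmse) pairs sorted from best to worst predictor."""
    scores = {name: cross_val_rmse(fit, xs, ys) for name, fit in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic data (an assumption): a noisy straight line.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# The "portfolio": adding a technique means adding one more entry here.
MODELS = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

ranking = rank_models(MODELS, xs, ys)
for name, rmse in ranking:
    print(f"{name}: held-out RMSE {rmse:.3f}")
```

The dictionary of models is the point of the design: a new technique can join the comparison without touching the evaluation code, which is roughly the flexibility Revolution describes.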

In fact, both rigor and flexibility are key to the paradigm Dowlaty advocates. "The trend is toward a multi-disciplinary approach to extracting value from data," he told The Data Warehousing Institute early this year. "It's not just about math anymore. You also need technology skills, but what ultimately separates the analyst from the scientist is the dimension of artistic creativity. It's the soft skills that make the big difference."

That combination of skills is what enables the firm to tackle the incredibly complex work it does. Dhiraj Rajaram, Mu Sigma's CEO and the man who founded the company in 2004, spoke at the 2010 Predictive Analytics World conference on a panel with Mu Sigma customer Walmart.

Walmart Financial Services, which named Mu Sigma its Supplier of the Year in 2011, works with the big data company to analyze and optimize the marketing of its financial products.

Decision Management analyst James Taylor blogged the following summary of the conference presentation about the collaboration between Walmart Financial Services and Mu Sigma. This sounds like very complicated work.

"WFS uses transaction life analysis around run rate and growth, price / mix analysis, financial returns and qualitative analysis of the creative. Marketing Mix modeling, optimization, lets them see the effect of individual campaigns (there's a lot of Walmart stuff going on in the market), account for seasonality and manage at the store level. The idea is to make sure the marketing investment is optimized, targeted and repeatable.

"Marketing mix optimization uses weekly sales, store traits and demographics, event information and macro-economic data to see how effective specific events were and what was the contribution to the overall effect. What was the value or contribution of each element, did they cannibalize each other, did they resonate in specific areas etc."
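To make the idea of "the value or contribution of each element" concrete, here is a rough sketch of the simplest possible version: a linear regression of weekly sales on flags for what was running that week. Nothing here comes from Walmart or Mu Sigma. The three marketing elements, the sales figures and the function names are all invented for illustration, and the data is built from known effects so the fit can be checked.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, target):
    """Least-squares coefficients; the first one is the intercept."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, target)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical weeks: flags for (tv_campaign, in_store_event, holiday_week).
weeks = [(t, e, h) for t in (0, 1) for e in (0, 1) for h in (0, 1)]
# Made-up weekly sales built from known effects: 100 baseline, +20, +10, +30.
sales = [100.0 + 20.0 * t + 10.0 * e + 30.0 * h for t, e, h in weeks]

coefs = fit_ols(weeks, sales)
names = ["tv_campaign", "in_store_event", "holiday_week"]

# Total contribution of an element = its weekly effect * weeks it was active.
contributions = {
    name: coef * sum(week[i] for week in weeks)
    for i, (name, coef) in enumerate(zip(names, coefs[1:]))
}

print("baseline weekly sales:", round(coefs[0], 2))
for name, total in contributions.items():
    print(name, "total contribution:", round(total, 2))
```

A real marketing mix model would presumably add store traits, demographics and macro-economic inputs, and would need something like interaction terms to ask whether campaigns cannibalize each other. This sketch covers only the additive case.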

That sounds like a potent combination of math, science and creative thinking. It's probably more a picture of the sector than of one company alone. Forrester analyst James Kobielus specializes in big data and says he's done one briefing with Mu Sigma but didn't detect any particular unique flavor to the firm's work relative to others in the sector. Mu Sigma hasn't yet responded to our request for comment on this article.

Perhaps this company is typical of the sector and the questions to ask about it are more general.

"My caveat with services like MuSigma is that they can analyze your data, but they can't change your business," says Cloudera's Wills.

"You are free to ignore what they tell you, and it is often the case that the answers they can give you are limited by your business practices and the data that you currently collect. The advantage of having an in-house data scientist, especially one with some programming skill, is that they can develop systems that collect better data so that they can come up with better answers.