Make room on the bleachers: the robot reporter wants to sit down and watch the game. Sports statistics company StatSheet says it will have technology ready this summer to turn statistics for hundreds of small college basketball games into richly reported blow-by-blow coverage of how the contests unfold.
People have been talking about robot reporters for years. Sports coverage, with its highly structured data, is a logical field for them to arrive in, and StatSheet says it will soon bring a product to market.
A veteran sports reporter can recall from memory all kinds of stories and history, but that’s no match for the bulk number crunching that a computer can perform to discover patterns and context over the history of a sports season or a player’s career. Engineer and StatSheet founder Robbie Allen says his company will soon launch technology that produces sports narrative that 90% of readers won’t be able to distinguish from human reporting of college basketball. Then he’ll expand into NFL, NBA, NHL and MLB games.
StatSheet today offers embeddable statistics to sports media sites around the web. The company licenses bulk stats from a vendor and then analyzes that data to draw out higher-level insights. Allen says the next logical step is to build narrative prose around those insights. Once he’s got the tools built to narrate one game, there’s zero marginal cost to apply them to hundreds of college basketball teams around the country. Many of those teams are small enough that they don’t get much attention from human reporters, Allen contends.
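The stats-to-insight-to-prose pipeline Allen describes can be pictured with a toy sketch. This is not StatSheet's method, which the company has not published; the data layout, the function names `margin_phrase` and `recap`, the margin thresholds, and the team and player names are all invented here for illustration.

```python
def margin_phrase(margin):
    """Map a final score margin to a verb (thresholds are arbitrary assumptions)."""
    if margin >= 15:
        return "routed"
    if margin >= 6:
        return "beat"
    return "edged"


def recap(game):
    """Turn one game's raw stats into a one-sentence recap."""
    (team_a, pts_a), (team_b, pts_b) = game["score"].items()
    if pts_a >= pts_b:
        winner, loser, high, low = team_a, team_b, pts_a, pts_b
    else:
        winner, loser, high, low = team_b, team_a, pts_b, pts_a
    # A simple "higher-level insight": the top scorer on the winning side.
    star, star_pts = max(game["points"][winner].items(), key=lambda kv: kv[1])
    return (f"{winner} {margin_phrase(high - low)} {loser} {high}-{low}, "
            f"led by {star} with {star_pts} points.")


# Hypothetical box score; none of these names or numbers come from a real game.
game = {
    "score": {"Home U": 78, "Visitor State": 61},
    "points": {
        "Home U": {"Player A": 24, "Player B": 17},
        "Visitor State": {"Player C": 20},
    },
}
print(recap(game))
```

Once a function like `recap` exists, running it over hundreds of games is just a loop, which is the "zero marginal cost" point in miniature.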
Human reporters know a team and a season, but Allen says they also “have their scripts written.” “They already think they know what to look at as the most interesting things that have happened,” he says. “I’m talking about codifying that knowledge, to build a wider corpus of interesting facts to draw from.”
Scientists in Belgium have built software that automates live video coverage of basketball games. It balances tracking the ball against capturing as much player movement on the court as possible, and alternates between wide-angle and close-up shots. Might players someday “play to the robot camera”?
Qualitative events like defensive plays are often not made explicit in sports stats, but Allen says capturing them is the new frontier for stats companies and that they will become easier to incorporate in the future.
Even if at any given moment a human can generally beat a machine writer, “in many ways this [is] going to surpass a lot of the sports media that is out there,” Allen says. “There are going to be times that any writer can outperform a computer, but when you look at the breadth it’s going to be hard to beat a computer.”
Allen says he’s not trying to replace human sports writers, just to augment their coverage. Sports media organizations are currently limited by the number of people they can throw at a league and at statistical analysis. There’s no reason not to automate much of that work, he says.
Traditionalists might doubt that the writing could possibly be as good, but a look at excerpts generated from a competing academic project called StatsMonkey makes robot reporters look pretty capable of the basics.
An outstanding effort by Willie Argo carried the Illini to an 11-5 victory over the Nittany Lions on Saturday at Medlar Field.
Argo blasted two home runs for Illinois. He went 3-4 in the game with five RBIs and two runs scored.
Illini starter Will Strack struggled, allowing five runs in six innings, but the bullpen allowed only no runs and the offense banged out 17 hits to pick up the slack and secure the victory for the Illini.
The Illini turned the game into a rout with four in the ninth inning.
That’s not perfect, but it’s pretty good! It’s quite basic, too. It will be interesting to see how much more StatSheet can offer in its robot coverage. Allen says he’s having a lot of fun building out complex flow charts and tracking for statistical anomalies.
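Allen doesn't say how he tracks statistical anomalies. One common approach, offered here only as an assumption, is to flag a game value that sits far from a player's season average, measured in standard deviations. The function name `is_anomaly`, the threshold of 2.0 and the sample numbers below are all made up for illustration.

```python
from statistics import mean, stdev


def is_anomaly(history, value, threshold=2.0):
    """Return True if value is more than `threshold` standard deviations
    away from the mean of the earlier values in `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sd = stdev(history)
    if sd == 0:
        return value != history[0]  # any change from a constant history stands out
    return abs(value - mean(history)) / sd > threshold


# Hypothetical points per game for one player earlier in the season.
season_points = [12, 15, 11, 14, 13]
print(is_anomaly(season_points, 30))  # a 30-point night against a 13-point average
print(is_anomaly(season_points, 14))  # an ordinary night
```

A flagged value is the kind of fact a recap could lead with, rather than burying it in a box score.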
“It’s going to follow a standard disruptive curve,” StatSheet’s Allen told us. “Maybe version one will be a little rough but there will be plenty of opportunity and incentive for improvement. It’s not like an algorithm is going to have writer’s block.”
He says he plans to offer a variety of writing voices in which to interpret the facts of a game. Readers or publications could choose between “over the top” and “subtle” coverage, for example, depending on their tastes.
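One way to picture selectable voices is a set of sentence templates keyed by voice name, all filled from the same facts. This is only a sketch under that assumption; the `VOICES` table, the `render` function and the sample facts are hypothetical and not drawn from StatSheet.

```python
# Hypothetical templates: same facts, different tone.
VOICES = {
    "subtle": "{winner} defeated {loser} {high}-{low}.",
    "over_the_top": "{winner} absolutely demolished {loser} {high}-{low}!",
}


def render(facts, voice="subtle"):
    """Fill the chosen voice's template with the facts of one game."""
    return VOICES[voice].format(**facts)


# Made-up game facts for illustration.
facts = {"winner": "Home U", "loser": "Visitor State", "high": 78, "low": 61}
print(render(facts))
print(render(facts, voice="over_the_top"))
```

Keeping the facts separate from the templates means a publication could switch tone without touching the underlying statistics.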
There are all kinds of artistic and ethical angles from which to consider this vision. Will Google punish these non-human content creators? Should it? If a robot reporter breaks a news scoop, is its engineer responsible for making sure it protects its sources? How will athletes feel when they graduate from a minor sports league covered by machine media into the big time and a human sports reporter’s beat? “They may be disappointed,” Allen quips, “because the coverage may not be as good.”