One of the advantages of open source is that it can accelerate standards adoption on a level playing field. If there is a big enough problem to solve, smart people can attract the best minds to work together, investigate and share the solution.
That said, standards bodies often become little more than a parlor game for incumbent vendors seeking to position the standard to their market advantage.
In other words, there’s lots of talk, but not much code.
In such a scenario, it’s easy to end up with implementations of a standard that each work differently because the specification is unclear or ambiguous.
I recently sat down with Viktor Klang, Chief Architect at Typesafe and one of the lead organizers of reactivestreams.org, an open source attempt to standardize asynchronous stream-based processing on the Java Virtual Machine (JVM).
Klang and his group (along with developers from Twitter, Oracle, Pivotal, Red Hat, Applied Duality, Typesafe, Netflix, the spray.io team and Doug Lea) saw that the future of computing was increasingly about stream-based processing for real-time, data-intensive applications: those that stream video, handle transactions for millions of concurrent users, or otherwise combine large-scale usage with low-latency requirements.
The problem? Without backpressure on streaming data, a step that produces faster than the next step can consume will keep piling up unprocessed data until, eventually, the entire system crashes.
ReadWrite: What is driving this shift in computing to reactive streams today?
Viktor Klang: It’s not a new thing. Rather, it reached critical mass as more people started using Hadoop and other batch-based frameworks. They needed real-time online streaming. Once you need that, you don’t know up front how big your input is, because it’s continuous. With batch, you know up front how big your batch is.
Once you have potentially infinite streams of data flowing through your systems, you need a means to control the rate at which you consume that data. You need backpressure in your system to make sure the producer of data doesn’t overwhelm the consumer of data. It’s a problem that becomes visible once you move from batch processing to real-time streaming.
Users have been asking for more “reactive” streams for a long time, for building their own network protocols or for their specific application needs. Any time you need to talk to a network device, you want to use this abstraction. Anything that has an IP address.
With reactivestreams.org, we’re trying to address a fundamental issue in a compatible, inclusive way, so that all these different things can be hooked together and work. Long-term, the plan is to build an ecosystem of implementations that can be connected to one another, with developers building more things on top of it. For example, you could connect Twitter’s streaming libraries with RxJava’s streaming libraries, and pipe into Reactor, Akka Streams, or other implementations on the JVM.
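To make the backpressure idea Klang describes concrete, here is a minimal sketch in which the consumer, not the producer, decides how fast data flows. It uses the `java.util.concurrent.Flow` interfaces that ship with JDK 9 and later, which mirror the Reactive Streams interfaces; the interview itself shows no code, so the class names `BackpressureSketch`, `RangePublisher` and `SlowSubscriber` are hypothetical. The sketch is single-threaded and deliberately simplified, so treat it as an illustration of the request-driven handshake rather than a specification-compliant implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

class BackpressureSketch {

    /** Hypothetical publisher: has 0..count-1 available, but emits only what was requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;

        RangePublisher(int count) { this.count = count; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 0;             // next value to emit
                private long demand = 0;          // items requested but not yet delivered
                private boolean emitting = false; // guards against re-entrant request() calls
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("n must be positive"));
                        return;
                    }
                    demand = (demand + n < 0) ? Long.MAX_VALUE : demand + n; // cap on overflow
                    if (emitting) return; // called from inside onNext; the outer loop continues
                    emitting = true;
                    while (demand > 0 && next < count && !done) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    emitting = false;
                    if (next == count && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Hypothetical subscriber: asks for one item at a time and stops asking after `wanted` items. */
    static final class SlowSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        boolean completed = false;
        private final int wanted;
        private Flow.Subscription subscription;

        SlowSubscriber(int wanted) { this.wanted = wanted; }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            if (wanted > 0) s.request(1); // signal initial demand
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            if (received.size() < wanted) subscription.request(1); // ready for one more
        }

        @Override
        public void onError(Throwable t) { throw new RuntimeException(t); }

        @Override
        public void onComplete() { completed = true; }
    }

    /** Wires a publisher holding `available` items to a subscriber that takes at most `wanted`. */
    public static List<Integer> run(int available, int wanted) {
        SlowSubscriber subscriber = new SlowSubscriber(wanted);
        new RangePublisher(available).subscribe(subscriber);
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 3));
    }
}
```

Even though the publisher has a thousand items available, `run(1000, 3)` delivers only the three the subscriber asked for. The producer cannot outrun the consumer because nothing is emitted without a matching `request` call.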
RW: Who are key members today?
VK: Certainly Typesafe jumped in early, since we have an open-source software platform that deals with a lot of what the industry calls “reactive application challenges.” We were thrilled to have Twitter join, the Reactor guys from Pivotal, and