“Many modern web sites need fast access to an amount of information so large that it cannot be efficiently stored on a single computer,” Nick Kallen wrote on Twitter’s blog. “A good way to deal with this problem is to ‘shard’ that information; that is, store it across multiple computers instead of on just one.”
To make sharding easier, Twitter has developed a framework that can be used in lieu of either custom-building data-store systems or using untested open-source alternatives, and it is sharing the code with the public.
From a number of data-store building experiences, Twitter has “extracted Gizzard, a Scala framework that makes it easy to create custom fault-tolerant, distributed databases.”
As an example, Kallen provides “Rowz.”
“To get up-and-running with Gizzard quickly, clone Rowz and start customizing!”
The full code for Gizzard is also available.
He describes Gizzard as a middleware networking service that handles partitioning through a forwarding table, supports migration and aims for “eventual consistency.”
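To make the forwarding-table idea concrete, here is a minimal sketch in Python of how a table might map ranges of hashed keys to shards. It is an illustration only, not Gizzard's actual API (Gizzard itself is written in Scala). The names `ForwardingTable`, `shard_for`, `hash_key` and the shard labels are all hypothetical, as is the choice of a 32-bit hash space.

```python
import bisect
import hashlib


class ForwardingTable:
    """Hypothetical forwarding table: maps ranges of hashed keys to shard names."""

    def __init__(self, entries):
        # entries: iterable of (lower_bound, shard_name). Each shard owns the
        # hashes from its lower bound up to the next entry's lower bound.
        ordered = sorted(entries)
        if not ordered or ordered[0][0] != 0:
            raise ValueError("the first range must start at 0")
        self._bounds = [bound for bound, _ in ordered]
        self._shards = [shard for _, shard in ordered]

    def shard_for(self, hashed_key):
        # Find the last lower bound that is <= hashed_key.
        index = bisect.bisect_right(self._bounds, hashed_key) - 1
        return self._shards[index]


def hash_key(key, space=2**32):
    # Stable hash into [0, space). MD5 is used here only to spread keys
    # evenly, not for security. The hash space size is an assumption.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % space


# Two hypothetical shards, each owning half of the hash space.
table = ForwardingTable([
    (0, "shard_a"),
    (2**31, "shard_b"),
])

print(table.shard_for(hash_key("user:42")))
```

In a design like this, moving data amounts to copying a range to a new shard and then updating the table entry for that range, which is one plausible reason a forwarding table pairs naturally with migration support.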
The implication may be that startups and smaller companies will be better able to handle large amounts of data quickly, and thereby better serve the needs of their users with fewer resources expended.