Database Analytics Startup Aster Data Launches, Analyzes MySpace

Grid-computing startup Aster Data Systems will officially launch today, three years after it was founded. Aster, which began in the Ph.D. program at Stanford, is a provider of “massively parallel processing databases” for organizations that have mammoth quantities of data that need to be stored and analyzed quickly. The Redwood City, California-based company is backed by Sequoia Capital, Cambrian Ventures, and First Round Capital.

Aster’s nCluster software allows companies with large amounts of data to store it on commodity hardware and scale with one click, adding new servers as the data set grows. The company’s first major client is MySpace, which generates hundreds of terabytes of traffic data from its 110 million monthly unique users. Mining that data to understand how customers use and interact with the site requires some pretty robust architecture.

Aster’s solution for MySpace uses a 100-node cluster of off-the-shelf commodity servers that can capture and load 100% of the data and run complex queries quickly. “MySpace needed to analyze complete datasets – not just samples or summaries. Sampling would completely miss infrequently occurring but highly profitable patterns,” according to Aster, which says that nCluster has allowed MySpace to work with all of its terabytes of data and avoid the need to sample.
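To see why sampling can miss rare patterns, here is a rough back-of-the-envelope sketch in Python. It assumes simple independent row-level sampling, which is an illustrative assumption and not something Aster or MySpace describes; the function name and the numbers are hypothetical.

```python
def miss_probability(occurrences, sample_rate):
    """Chance that a random sample contains none of the rows for a pattern.

    Assumes each row is kept independently with probability `sample_rate`
    (an illustrative model, not taken from the Aster case study).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    if occurrences < 0:
        raise ValueError("occurrences must be non-negative")
    # Every one of the `occurrences` rows has to be dropped by the sampler.
    return (1.0 - sample_rate) ** occurrences


if __name__ == "__main__":
    # Hypothetical pattern that shows up in only 10 rows, with a 1% sample.
    print(miss_probability(occurrences=10, sample_rate=0.01))
```

Under those assumptions, a pattern that appears in only 10 rows has roughly a 90% chance of being absent from a 1% sample, while querying the full dataset always sees it.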

nCluster works by splitting up the cloud into smaller pieces that each have a specific task. “Loader” nodes load data from external sources (and export to them), while “worker” boxes keep data stored on local disks. A “queen” layer directs the entire operation, intelligently routing queries to the proper node. The “loader” tier can scale independently as needed, says Aster. “This enables query load-balancing to eliminate hot-spots and increase performance, returning results in seconds or minutes versus hours or ‘did not finish,’” writes the company in a case study.
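The division of labor among loader, worker, and queen nodes can be sketched as a toy Python model. This is not Aster's implementation: the class names, the hash-partitioning scheme, and the `count_by` query are all hypothetical, chosen only to illustrate the three roles described above.

```python
from collections import defaultdict
from zlib import crc32


class Worker:
    """Stores one partition of the data (a list stands in for local disk)."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    def count_by(self, field):
        # Run the query only against this worker's local rows.
        counts = defaultdict(int)
        for row in self.rows:
            counts[row[field]] += 1
        return counts


class Queen:
    """Coordinates the cluster: decides row placement and fans out queries."""

    def __init__(self, workers):
        self.workers = workers

    def owner(self, key):
        # Hash partitioning is an assumption, not Aster's documented scheme.
        return self.workers[crc32(key.encode("utf-8")) % len(self.workers)]

    def count_by(self, field):
        # Ask every worker for a partial result, then merge the partials.
        total = defaultdict(int)
        for worker in self.workers:
            for value, n in worker.count_by(field).items():
                total[value] += n
        return dict(total)


class Loader:
    """Pulls rows from an external source and hands them to the right worker."""

    def __init__(self, queen):
        self.queen = queen

    def load(self, rows, key_field):
        for row in rows:
            self.queen.owner(row[key_field]).rows.append(row)


if __name__ == "__main__":
    workers = [Worker(f"worker-{i}") for i in range(4)]
    queen = Queen(workers)
    Loader(queen).load(
        [
            {"user": "alice", "page": "home"},
            {"user": "bob", "page": "music"},
            {"user": "alice", "page": "music"},
        ],
        key_field="user",
    )
    print(queen.count_by("page"))
```

In this sketch, adding more `Loader` instances does not change the workers or the queen, which mirrors the claim that the loader tier can scale independently.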

The software reminds me of 3Tera’s AppLogic (our coverage), which is a grid computing operating system that makes it easier for companies to deploy their own compute cloud on commodity hardware. nCluster is essentially the same idea, but with an eye specifically toward managing and querying massive databases.
