<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
        <channel>
        <title>Profiles - ReadWrite</title>
        <link>http://readwrite.com</link>
        <description />
        <language>en</language>
        <copyright>Copyright 2012 SAY Media, Inc.</copyright>
        <managingEditor>readwriteweb@gmail.com</managingEditor>
        <docs>http://blogs.law.harvard.edu/tech/rss</docs> 
        <lastBuildDate>Thu, 03 May 2012 09:03:00 -0700</lastBuildDate>
        <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://rww.superfeedr.com/" />

                    <item>
                <title><![CDATA[Boutique Chic: Five Great Analysts Who Are Under the Radar]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/fields/shutterstock_analyst.jpg" style="" />
			</span>
There's a reason that IDC, Forrester, and Gartner are so big. They offer scale and coverage that small firms can't match, and they attract industry heavyweights who can make or break emerging technologies. But there's a downside to scale. Unless you're a corporate whale, it's easy to get lost in the shuffle, and getting that superstar on the phone in a pinch might take more time than you have.</p>
<p>I'm certainly not suggesting that you throw away your existing subscriptions, particularly if you're a vendor or solution provider. Put some effort into those relationships, and they'll pay themselves back several times over. But there's something to be said for the little guy, and there are hundreds of smaller analysis firms that can provide you with the kind of service and support you need to make informed decisions on a daily basis.</p>
<p>There's no way to provide a comprehensive list of analysts or coverage areas in small firms, but I've chosen five analysts who exemplify the kind of breadth in business model, coverage areas and perspective you can find when you look beyond the Big Three. Full disclosure: I've worked with some of these people before, but don't hold that against them.</p>
<p>Billy Pidgeon<br /><a href="http://www.m2research.com/" target="_blank">M2 Research<br /></a>Coverage Area: Gaming</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/billypidgeon.jpg" style="" />
			</span>
The gaming industry is a tough nut to crack. It's an art, a business and a unique exercise in supply-chain economics. Plenty of analysts cover financials ("300,000 units shipped!") and tech ("11 million polygons!"), but most leave the games themselves to the press.</p>
<p>M2's Billy Pidgeon understands all three worlds. While he's spent the last dozen years at various research houses, Pidgeon will always be a gamer at heart. He's produced <a href="http://www.mobygames.com/developer/sheet/view/developerId,805/" target="_blank">more than 20 games</a>, including major releases such as 1997's Turok: Dinosaur Hunter. This street cred gives him access to insights and talent that more buttoned-up analysts might miss. If you're looking for one-on-one practical advice about the gaming market from someone who's been there but also gets the big picture, check him out.</p>
<p>The Guys at RedMonk<br /><a href="http://www.redmonk.com/" target="_blank">RedMonk<br /></a>Coverage Area: Multiple (Tech-Related)</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/redmonk.png" style="" />
			</span>
If you're a <a href="http://en.wikipedia.org/wiki/Firefly_(TV_series)" target="_blank">Firefly</a>&nbsp;fan, think of RedMonk as the Browncoats of the analyst world. If you're not, their motto should tell you what you need to know. "Analysis by the people, for the people" says it all. I would have chosen just one of their four analysts, but that would have violated their whole "community" vibe.</p>
<p>RedMonk tips its hat to the open-source world it covers by giving away its research, believing that an open discussion provides the greatest benefit to everyone, including its paying customers. The firm makes its money from consulting services that start at a flat $5,000 per year, increasing with the size of your company or your consulting demands. For your money, you get access to very astute technical minds focused on helping vendors produce tools that developers will actually want to use. As the business model might suggest, it's a very populist approach in which the end user, IT manager, or systems analyst is a lot more important than the CIO, which is dramatically different from the coverage aims of most larger firms. If you're a software developer, $5,000 a year is a very small price to pay for a contrarian perspective.</p>
<p>David Schatsky<br /><a href="http://greenresearch.com/" target="_blank">Green Research<br /></a>Coverage Area: Sustainability</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/david_schatsky.jpg" style="" />
			</span>
Sustainability is no longer just hip; it's an essential (and sometimes mandated) part of doing business, sitting on a growing pile of hard science. It's a big industry, so hundreds of consultancies have bolted on an "eco-" to get your business. It's tough to weed out the pretenders.</p>
<p>David Schatsky has a background in technology, policy and finance. He also spent nearly 10 years at JupiterResearch as a Research Director and President (yet more disclosure: He was also my boss there for a while), so he understands the analyst gig. But what sets him apart from the rest of the eco-kids is his understanding that he shouldn't do it alone. When he founded Green Research, Schatsky brought in David Meyers, an environmental heavyweight, to build out the company's real-world expertise and complement his research experience, and they've further rounded out their expertise with associated content providers. The result is a small, personalized shop that should be able to address most of your environmental concerns directly, but has the connections to pull in other experts where needed.</p>
<p>Tony Byrne<br /><a href="http://www.realstorygroup.com/" target="_blank">Real Story Group<br /></a>Coverage Area: Content Management</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/tonybyrne.jpg" style="" />
			</span>
Real Story Group doesn't work with vendors they cover. At all. No consultations, white papers, or appearances at vendor events - nothing that could possibly influence their coverage. This independence irritates the industry and helps their clients (anyone working with content or knowledge management) trust what they read. While RSG has a number of top-notch analysts (<a href="http://www.realstorygroup.com/Who-We-Are/Analysts/15-Regli" target="_blank">Theresa Regli</a>&nbsp;deserves a shout-out, particularly regarding international content management issues), the man behind the business model is Tony Byrne, the company's founder.</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/consumerreports.png" style="" />
			</span>
RSG's Evaluation Reports are their most popular deliverable, largely because of their Consumer Reports-style comparison charts. They aren't cheap (running around $2,500 per report), but they can save you tens or hundreds of thousands during your evaluation process and give you the answers you need to ask the right questions of your vendors. Byrne believes that RSG's objectivity and laser focus will persuade most one-off purchasers to stick around as clients for further research, as well as for advisory services to help manage the tools and content with the software you've bought. So far, so good.</p>
<p>Laurie Orlov<br /><a href="http://www.ageinplacetech.com/" target="_blank">Aging in Place Technology Watch<br /></a>Coverage Area: Seniors, Health Technology</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/laurieorlov.jpg" style="" />
			</span>
Seniors are our fastest-growing demographic segment, and the technology required to help them age is of tremendous social and financial importance. So it's strange that until fairly recently, most major research firms treated the category like an afterthought. Laurie Orlov is one of the few experts in that space, and the foremost authority in the study of using technology to remain in the home as you age. In fact, she kind of created it.</p>
<p>Jeff Makowka, AARP's Senior Strategic Advisor, Thought Leadership, explains her impact: "She's a real visionary. She took her past life (as a Forrester analyst) and overlapped it with a caregiving experience and basically thought up the category. Solutions already existed, but she defined and legitimized Aging in Place Technology."</p>
<p>Like every boutique analyst, Orlov's journey is unique, and probably impossible at one of the largest firms. Small firms will never give you the coverage of the Big Three, and can't shout your voice as loudly to the world, but they do a great job of filling the gaps if you're willing to do some searching.</p>
<p>Have you had experiences with small research firms? Let us know who you've used and how it worked.</p>
<p><em>Lead image courtesy of <a href="http://www.shutterstock.com">Shutterstock</a>.</em></p>
                    ]]></description>
                <link>http://readwrite.com/2012/05/03/boutique-chic-five-great-analysts-who-are-under-the-radar</link>
                <guid>http://readwrite.com/2012/05/03/boutique-chic-five-great-analysts-who-are-under-the-radar</guid>
                <category>Analysis</category>
                <pubDate>Thu, 03 May 2012 09:03:00 -0700</pubDate>
                <author>Cormac Foster</author>
            </item>
                    <item>
                <title><![CDATA[OpenStack Leader: Open Source Needs to Rethink Its Priorities]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/Josh%252520McKenty%25252C%252520Piston%252520Cloud%252520%252528150%252520sq%252529.jpg" style="" />
			</span>
Philosophically, the open source concept borrows some selected elements from socialism.  It upholds a notion of the "common good," it eschews the appearance of authority or hierarchy, and it often frowns upon capitalizing on one's own work, insofar as being exclusive.  In practice, however, open source projects may look less like Big Brother from <i>1984</i> and more like Big Brother from reality TV.</p>

<p>Joshua McKenty's still-young career is, compared to those of other capitalist executives, surprisingly replete.  He's led development teams for the Netscape browser, and is intimately familiar with Netscape's successors at Mozilla.  His next stroke of luck was with the space program, helping to create and then lead one of the world's most successful cloud computing projects, NASA Nebula.  His work with NASA spawned the open source community's most successful - and perhaps most important - project in the last few years, the <a href="http://openstack.org">OpenStack cloud operating system</a> - and he sits on that project's governing body.  In between jobs, he just happened to pioneer <a href="http://secondmuse.com/portfolio/view/GEM/">an earthquake modeling system for the World Bank</a>.</p>
<p>What McKenty has learned, in a life that may not be one-third of the way over yet, are the kinds of lessons you'd expect to save for your grandkids.  <a href="http://www.readwriteweb.com/cloud/2011/09/the-dream-of-openstack-in-a-co.php">Now he's the CEO of Piston Cloud</a>, the first commercial vendor dedicated to OpenStack.  Already, he has a boatload of life lessons for the open source development community at large, and he's not about to wait for grandkids to come along to start sharing them.  With a notable degree of eagerness and enthusiasm, Josh McKenty shared his insights with RWW.</p>

<h2>Open source vs. customer focus</h2>

<p>McKenty's story begins with Netscape.  His time there began with the somewhat confused period late in the company's existence as a division of AOL.  Immediately he learned that "open" comes in many shades and colors.</p>

<p>He calls the late period of Netscape 8 and 9 "probably the most complicated and nuanced open source environment you can imagine.  Netscape was released as the Mozilla code base, and so everyone thinks of Firefox being to the benefit of Netscape.  But <a href="http://mpl.mozilla.org/scope/tri-license-and-the-gpl/">the code is a tri-license</a> that allows Netscape to close it and develop it for proprietary [purposes] afterwards.  Then you have the fact that Firefox, which everybody thinks of as being a Mozilla project, was actually a fork of Mozilla by a sole individual that Mozilla then reverse-forked back into their organization and turned into a $300 million-per-year business, which they did not share."</p>

<p>It's ownership and licensing schemes such as this, McKenty says, that make personal politics more prominent in open source projects than philosophies and governance models.  He believes one of the Google Chrome project's greatest strengths comes from the insistence by the Chromium development team - to which Google appears to be adhering - that the code base remain "fully open-sourced, in the open."</p>

<p>It was at about this time when Mozilla began mitigating what the community described as "the tooltip bug" (typically with an amalgam of punctuation attached), and what the organization officially recorded as dozens of related bugs (<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=218223">just one culmination was recorded here</a>).</p>

<p>"It was a bug that was recorded and watched and re-reported and duplicated 460 times over 7 years," McKenty relates.  "And the response from the Mozilla developer community was always the same:  'If you really cared, you would learn how to code and fix it yourself.'</p>

<div class="super-pullquote"><em>&ldquo;We are an open source company, and every open source company lives or dies by their ability to balance their interaction with the community with their interaction with their customers.&rdquo;</em><br /><span style="font-size:8px">Joshua McKenty<br />CEO, Piston Cloud</span></div>

<p>"It has always been the worst part of many open source projects, but I think Mozilla's more guilty of it than anyone else: the attitude that the developers tend to develop over time that <i>they</i> are the important users of the product," the <a href="http://www.pistoncloud.com/">Piston Cloud</a> CEO continues.  "And this has never really been true.  That's what's interesting about most open source software:  The developers who are really deeply engaged in building it may have started out solving their own pain.  But if they're really successful, there are usually two or three orders of magnitude more people using it than actually building it.  So the disconnect between what it is and what it needs to become, gets larger and larger.  This is why the open source projects that really survive in the long term figure out how to build that bridge back to the end user's requirements.  And those are often commercial entities."</p>

<h2>Stage fright</h2>

<p>After Netscape, McKenty went on to be a software architect and business developer for Flock, Inc., whose product was a socially-oriented Web browser built on the Mozilla code base.  While there, he tells us, "we would hire new developers, turn them loose, and say, 'Every commit that you make is going to be looked at by other people in the world.'  That's a terrifying experience, especially if you're a young coder or you're new to the code base... It has nothing to do with the philosophy of open source.  It has to do with a sense of embarrassment or nervousness to have your daily commits be scrutinized by folks that maybe you think of as being more experienced than you."</p>

<p>McKenty points to the very project he oversees now - OpenStack - as one example where important components are <i>not</i> produced in the open: for example, support for IPv6 contributed by NTT Data of Japan.  "[Of] the 260-odd folks who are actively contributing code, as opposed to design features or documentation or localization or whatever else... about a third of them are <i>not</i> building in the open.  They all <i>design</i> in the open; that's the community requirement.  But then they go home and they code for a couple of months until they have something that they can show, and then they make a big code drop, and then we attack it."</p>


<p><!--nextpage--></p>

<h2>To space and beyond</h2>

<p>Some of the earliest Mercury astronauts credit their good fortune to having been in the right place at the right time.  By virtue of being teamed with just the right group of consultants in 2009, Joshua McKenty found himself as a lead architect on a project launched at NASA.  <a href="http://www.readwriteweb.com/archives/from_a_basement_to_the_stars_how_the_openstack_clo.php">Called Nebula</a>, it was America's most significant government-funded computer research facility since the heyday of the supercomputer.</p>

<p>On paper, McKenty worked with a firm called Anso Labs.  He soon learned that doing any kind of government project involved a separation of what you see in reality from what's on paper.</p>

<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/Josh%252520McKenty%25252C%252520Piston%252520Cloud%252520%252528Chicago%252520garb%252529.jpg" style="" />
			</span>
"Despite whatever retrospective history has been applied to it, [Anso] was simply, at the time, a maneuver to deal with NASA internal politics around contracting," McKenty tells RWW.  "The Nebula Project itself was built inside a larger IT contract that employs a couple of hundred people, and we were two or three subcontractors down from NASA proper at the time.  So we were just trying to guarantee that we could keep our developer team together working on Nebula, while NASA went through a re-compete on the contract."</p>

<p>Anso made it possible for Nebula to continue for as long as it did, keeping the project open and sharing its benefits with other developers through OpenStack.  But that was only for so long.  Eventually Nebula was de-funded, and the operation was folded back into NASA's existing supercomputing projects.  There, McKenty says, outright opponents to cloud computing have taken full advantage of the opportunity to do next to nothing with Nebula.</p>

<div class="super-pullquote"><em>&ldquo;It has always been the worst part of many open source projects, but I think Mozilla's more guilty of it than anyone else: the attitude that the developers tend to develop over time that <i>they</i> are the important users of the product.  And this has never really been true.&rdquo;</em><br /><span style="font-size:8px">Joshua McKenty<br />CEO, Piston Cloud</span></div>

<p>Meanwhile, Anso Labs was acquired by cloud service provider Rackspace.  While some sources <a href="http://www.theregister.co.uk/2011/02/10/rackspace_buys_openstack_partner/">played the acquisition as though it were a conspiracy</a>, McKenty tells us that Rackspace's motives are purely genuine: to see OpenStack succeed, and to continue what their own people, essentially, started.<br />
"I think what Rackspace has done with the contribution of Swift to OpenStack is really unprecedented in their market segment," says Piston Cloud CEO Joshua McKenty.  "But Rackspace does not want to be a software company, and they don't really even want to be an open source software company.  They want to be a fanatical support company; that's who they are, that's their DNA.  Now they have this core open source project... around which they can provide their service.</p>

<p>"But to my mind, OpenStack isn't finished," he continues.  "We haven't finished changing the world yet, and I wanted to stay focused on building out OpenStack to what it really needs to be."</p>

<h2>Spinout from the Nebula</h2>

<p>While many of McKenty's closest friends went with Anso to Rackspace, he stayed out to pursue his own dream.  Exactly what that was hadn't quite formed yet, even as late as the fall of 2010.  He took a six-month sabbatical, during which he found time for that little earthquake modeling project for the World Bank.  He joined a working group on the role of governments and organizations in building technology infrastructure, along with a handful of friends and associates - Vint Cerf, Sergey Brin, and Vivek Kundra.</p>

<p>"None of those environments actually gave me the same opportunity to understand the requirements that I've gotten out of being the CEO of a startup," McKenty tells us.  "The reason to be Piston Cloud right now - aside from the fact that it's going to be the best business ever - [is] to be in the room with the people who are going to use the product, and actually talk to them about what they need.  It's very hard to get in the room with the folks who will be most impacted by the technology without being a vendor."</p>

<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/cloud/assets_c/2011/03/openstack_logo_0111-thumb-150x150-26160-thumb-150x150-27180-thumb-150x150-28472.png" style="" />
			</span>
Despite the most honest and thoughtful community outreach, says McKenty, a non-commercial or semi-commercial open source project simply does not garner the level of confidence from its customers that a commercial project does.  Almost immediately upon building Piston Cloud, McKenty found himself in direct communication with existing OpenStack users who had never been in contact with anyone representing OpenStack in the past.  Being a CEO makes one more accessible than being a committee.</p>

<p>One example:  A disaster risk reduction facility is headquartered in Indonesia, but has its OpenStack-based cloud services and its customers in Australia.  It wanted to expand its cloud to Jakarta, and needed Piston Cloud's help.  It's the type of customer contact Anso Labs could only have dreamed about, and now it's coming directly to the source.</p>

<p>"Nobody in the OpenStack community even knew that the Australian government was evaluating it, let alone had pushed it into production," remarks McKenty.  "In one phone call from an inbound request, I ended up on the phone with the folks in Australia who had been operating OpenStack at scale, and was able to talk to them about <i>exactly</i> what their needs were, and what they wanted to see in the next release.  And there was no way they were going to make it to Boston in person to talk about this.</p>

<p>"The property of being a vendor, and the property of being in the press, as odd as that sounds, gives us this amazing opportunity to really understand what people need," he continues.  "The commercial side of this has an amazingly crystalizing effect on people's priorities of what they want built.  We are an open source company, and every open source company lives or dies by their ability to balance their interaction with the community with their interaction with their customers...  Can we present ourselves in ways that are understandable to those two different communities without ever becoming two-faced, or engaging in a dichotomy?  It has to be, honestly, <i>this is who we are</i>, in all cases."</p>
                    ]]></description>
                <link>http://readwrite.com/2011/10/02/openstack-leader-open-source-n</link>
                <guid>http://readwrite.com/2011/10/02/openstack-leader-open-source-n</guid>
                <category>Interviews</category>
                <pubDate>Sun, 02 Oct 2011 03:00:00 -0700</pubDate>
                <author>Scott M. Fulton</author>
            </item>
                    <item>
                <title><![CDATA[Some Thoughts on the Passing of Dan McCracken (1930 - 2011)]]></title>
                <description><![CDATA[
                                        <p><a href="http://www.readwriteweb.com/hack/Dan%20McCracken.jpg"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/assets_c/2011/08/Dan%252520McCracken-thumb-150x203-32488.jpg" style="" />
			</span>
</a>There is a missing characteristic to most of what is published today on the subject of computing.  At some point in time, as a matter of course, we stopped treating the subject with respect and reverence, and we started adorning it with buzzwords, marketing promises and metaphors borrowed from the self-help department.</p>

<p>What Daniel D. McCracken managed to accomplish as early as 1957 was to give the knowledgeable layperson a foundation for understanding business processes in terms of procedural mathematics.  As a young author decades ago, I studied McCracken's methods and I attempted to take his lessons to heart.  In some of my first books on Visual Basic, I was inspired by McCracken to demonstrate a relatively simple concept using a substantively more complex tool:  I demonstrated program control using sort algorithms.</p>
<p>McCracken did this from the very beginning, and he may have mystified his audience as much as I did mine.  But he performed a necessary task, and he was probably the first to do it.  He showed how a complete program that performs a complex function is developed from the core out.  Although the concept was probably born in the first IBM laboratories to put FORTRAN to use on a daily basis, McCracken was the first to demonstrate a concept borrowed from artistry: the ability to see the complete product holistically, build working models instead of partial fragments and keep it working with each implementation.</p>

<p>McCracken knew exactly what he was doing - the man reasoned on all levels at once, as evidenced by his later treatises on public policy and theology.  In this excerpt from his 1975 paper, "How to Teach Structured COBOL to Beginners," McCracken reiterates the principle he had already, by that time, been putting to use in his classroom and his published work for two decades.  What he said, and how he said it, has not aged in the nearly four decades since.</p>

<blockquote>Structured programming, for the purposes of this paper, may be defined as a style of programming in which only three logic control elements are used, namely, sequence, selection (IFTHENELSE), and iteration (DOWHILE).  The scope of control of each selection and iteration element is displayed by consistent indentation.  The use of only three logic elements applies both to coding, using whatever logic elements the chosen language provides, and to program design, which is done using either structured flowcharts or, preferably, some form of pseudo-code.  A complete program design is achieved in a series of approximations, beginning with a simpler problem from which the desired design can be developed, in a process commonly described as stepwise refinement.

<p>The goal of structured programming is the production of programs that as clearly as possible display their structure, i.e., the interrelationship of their parts.  This clarity is a primary benefit of the restriction to only a few logic elements, which leads to programs that can be read in a "top-down" fashion, that is, without skipping around through the program.  The purpose and function of any program statement can generally be understood by looking at only a few other statements, all physically close by, which is very seldom true of programs written by conventional methods.  The main drawback of GO TO statements is that they tend to destroy this locality of context.</p>

<p>The design and coding stages of programming may or may not be shorter when structured programming methods are used, but the check-out and maintenance phases are generally much faster and easier to manage since programs are easier to understand.  As a result overall programmer productivity tends to increase, sometimes by a dramatic factor.</p></blockquote>

<p>There is a missing characteristic to what is <i>said</i> about computing, but nothing at all missing about what is <i>meant</i> by the finest of its practitioners.  Daniel D. McCracken gave the world six wonderful decades of brilliant explanation, most of which today is buried in barely accessible texts and vastly dispersed libraries.  Most of it is treated as obsolete.  A great deal of it, however, is about as obsolete and inapplicable to the current age as the three brilliant, illustrative paragraphs cited above.<br />
</p>
                    ]]></description>
                <link>http://readwrite.com/2011/08/15/some-thoughts-on-the-passing-o</link>
                <guid>http://readwrite.com/2011/08/15/some-thoughts-on-the-passing-o</guid>
                <category>Profiles</category>
                <pubDate>Mon, 15 Aug 2011 07:03:31 -0700</pubDate>
                <author>Scott M. Fulton</author>
            </item>
                    <item>
                <title><![CDATA[How Washington University is Developing the Next Generation of iOS Programmers]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/wustledu150.jpg" style="" />
			</span>
This week was finals week for the summer semester at Washington University in St. Louis, and one event that I regularly enjoy attending is the iOS programming class final presentations. Being the summer term, it was a very compressed schedule: the students, some of whom are older and have full-time day jobs, have about a month to learn how to use the various Apple tools, spec out and code their apps. </p>

<p>I've gone to several of these final presentations in the past (<a href="http://strom.wordpress.com/2009/12/08/developing-the-next-gen-of-iphone-apps-programmers/">here is a report from 2009</a>) and I continue to be impressed with what the students come up with. Sure, there are the usual mishaps: code that doesn't compile, or last-minute hacks to add one more feature or tweak a particular icon to display properly. But the class is a great arena for preparing these senior computer science majors with what they are going to have to face in the real world. </p>
<p>The students have to propose an idea for their app, look around on the App Store to see what is currently available, put together a data and coding plan and then write the code. Often they have to access particular Web services and public APIs for their app, and these interfaces can and do change over the course of the course, of course. (Sorry.) One of the teams was trying to access information from the video streaming site Justin.tv, and lamented the lack of any programmatic connection. Another was wrestling with a badly formed series of RSS feeds. How many computer science grads could even think about these things, let alone debate these issues? It brought a smile to my face. </p>

<p>Over the six semesters that the instructor, Todd Sproull, has taught the class, he has had close to 150 students, most of them male (last night was 100% guys). There is usually a waiting list, as there are only so many computers to go around. "The students have always been bright, but it seems that more of them are using tools from other courses and companies (Google and Facebook APIs as examples) to create more compelling apps," he said. Some of the students band together and program as a team. This creates a certain wow factor at the final presentation - "speaking to the power of teamwork," as Sproull mentioned. Others go it alone.  </p>

<p>Some of the graduates have gone on to get great jobs: one student interviewed at Apple, and as a result of demoing their iPhone project, got hired. It helped that they were doing active debugging of the app during the interview. So much for asking silly questions like how many manhole covers it would take to pave over a baseball field and such. Almost all of these students go on to work in the industry, no surprise. Recruiters and others: take note and come take a look at these fertile fields in the future.</p>

<p><a href="http://www.readwriteweb.com/hack/washu.jpg"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/assets_c/2011/08/washu-thumb-322x477-32182.jpg" style="" />
			</span>
</a>Five apps from all of the classes have actually been published on the App Store, with a few more awaiting campus approval. One of them is a dandy. It is called <a href="http://itunes.apple.com/us/app/wu-map/id403202850?mt=8">WU Map, and has the entire campus map right on your iPhone</a>, as you can see in the screenshot. This is a little thing, but for freshmen and people like me who aren't that familiar with the campus, it sure beats having to print out a map every time we have to visit. According to Sproull, several of these apps have at least made back the $100 listing fees paid by the students. </p>

<p>Sproull has done a better job preparing his students to understand Web data sources and how to get around some of the quirks of the Apple simulator, too. The summer class seemed to have more fun than in previous semesters; maybe it was just the concentrated amount of time that the students had for class together. One of the students had an app that would allow the user to note his favorite beers, geolocated to the bar where each was consumed. He found a database of more than 8,000 beers, and we joked that we could see that hands-on research really paid off with his app.</p>

<p>What surprised me was that only one of the apps presented was for iPads. This was an app that will be finished soon for the business school magazine. This was very polished, indeed looking better than many mag apps that I have seen from professional editorial operations. It used an RSS feed to grab the various articles from the magazine's Web site. The rest of the projects were focused on the iPhone.  </p>

<p>The biggest hurdle these days has nothing to do with the quality of the code, however. It is about the ownership and approval of intellectual property of the student-created apps. The university is still cogitating on exactly how this transpires. There is some concern about a student intentionally (or even unintentionally) distributing a malicious app, and how the school will approve apps that get distributed on the App Store.</p>

<p>Nevertheless, it was an enlightening evening, and I wish all these students well with their apps.  </p>
                    ]]></description>
                <link>http://readwrite.com/2011/08/04/how-washington-university-is-d</link>
                <guid>http://readwrite.com/2011/08/04/how-washington-university-is-d</guid>
                <category>Profiles</category>
                <pubDate>Thu, 04 Aug 2011 03:30:00 -0700</pubDate>
                <author>David Strom</author>
            </item>
                    <item>
                <title><![CDATA[Teaching Creative Writing with Programming]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/python_logo_0311.png" style="" />
			</span>
 One of my favorite sessions at <a href="http://www.oscon.com">OSCon</a> this week was <a href="http://www.oscon.com/oscon2011/public/schedule/detail/19022">Teaching Creative Writing with Python</a>. Adam Parrish talked about his course <a href="http://rwet.decontextualize.com/">Reading and Writing Electronic Text</a>, which he teaches at New York University as part of the Interactive Telecommunications Program (ITP). Although the title emphasizes teaching creative writing through programming, the reverse is also true: the course teaches programming through experimental writing.</p>
<p>So how exactly is Python programming useful in creative writing? Parrish's course doesn't deal with artificial intelligence, or attempts at creating narratives or creating interactive hypertext or anything like that. It covers, for lack of a better term, procedural poetry. Typically, a student takes a starting set of text, writes a Python program to modify that text and then interprets the results.</p>

<p>Parrish cited non-electronic procedural poetry experiments as inspirations for the course. For example, he talked about <a href="http://en.wikipedia.org/wiki/Hundred_Thousand_Billion_Poems">Raymond Queneau's Cent mille milliards de poèmes</a>, a book in which the text has been cut into strips that can be re-arranged to create nearly endless configurations:</p>

<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/Hundred-Thousand-Billion-Poems_photo_0711.jpg" style="" />
			</span>
<br />
<em>Photo by <a href="http://www.flickr.com/photos/thomasguest/3597995774/">Thomas Guest</a>. <a href="http://www.flickr.com/search/?q=Cent+mille+milliards+de+po%C3%A8mes&ss=0&ct=0&mt=all&w=all&adv=1">More photos of the book on Flickr</a>.</em></p>

<p>Parrish also mentioned <a href="http://en.wikipedia.org/wiki/Ted_Berrigan">Ted Berrigan's Sonnets</a> and <a href="http://en.wikipedia.org/wiki/David_Melnick">David Melnick's PCOET</a>. Parrish didn't mention them in his talk, but the <a href="http://dwwp.decontextualize.com/">course website</a> also mentions Brion Gysin and William S. Burroughs' work with the <a href="http://en.wikipedia.org/wiki/Cut-up_technique">cut-up technique</a>.</p>

<p>Using these works as a springboard, Parrish teaches his students UNIX commands for working with text, Python text processing techniques (ranging from basic string manipulation to n-gram analysis) and regular expressions to help them create their own procedural texts. He says he chose Python because it's easy to use and has a lot of tools for working with text. Using computers, students can process more text, and do so more quickly, than the experimenters of the past could with physical methods.</p>
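<p>To give a feel for what such a procedure might look like, here is a minimal sketch of word-level n-gram recombination. It is not taken from Parrish's course materials; the function names, the sample sentence and the fixed random seed are all assumptions made for illustration.</p>

```python
import random
from collections import defaultdict


def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_model(words, n=2):
    """Map each (n-1)-word prefix to the words that follow it in the source."""
    model = defaultdict(list)
    for gram in ngrams(words, n):
        model[gram[:-1]].append(gram[-1])
    return model


def generate(model, seed, length, rng):
    """Walk the model from a seed prefix, choosing a random follower each step."""
    out = list(seed)
    prefix = tuple(seed)
    for _ in range(length):
        followers = model.get(prefix)
        if not followers:
            break  # dead end: this prefix only appears at the end of the source
        out.append(rng.choice(followers))
        prefix = tuple(out[-len(seed):])
    return out


# Hypothetical source text; a student would start from a found text instead.
source = "the rose is a rose is a rose and the rose is red".split()
model = build_model(source, n=2)
poem = generate(model, seed=("the",), length=8, rng=random.Random(0))
print(" ".join(poem))
```

<p>Every adjacent pair of words in the output also appears somewhere in the source, which is what gives this kind of text its half-familiar feel. The student's job is then to interpret the result.</p>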

<p>Parrish says pacing is one of the most difficult issues faced in the class. Beginner programmers always feel the class moves too quickly, while experienced programmers find that it moves too slowly.</p>

<p>Parrish's focus is clearly on creative writing and helping students explore text in new ways. But it's a really interesting experiment in helping art or humanities students learn to program (see <a href="http://www.readwriteweb.com/hack/2011/05/douglas-rushkoff-interview.php">our interview with Douglas Rushkoff</a>, who is also a teacher at ITP, on why everyone should learn to program). A similar model could also be used to inject the humanities into programming and engineering education.</p>

<h2>Where to Find More</h2>

<p>Many of the lessons can be found <a href="http://www.decontextualize.com/teaching/rwet/">here</a> and the code examples are <a href="http://github.com/aparrish/rwet-examples">on GitHub</a>.</p>
                    ]]></description>
                <link>http://readwrite.com/2011/07/30/teaching-creative-writing-with-programming</link>
                <guid>http://readwrite.com/2011/07/30/teaching-creative-writing-with-programming</guid>
                <category>Profiles</category>
                <pubDate>Sat, 30 Jul 2011 10:19:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[The History of Programming Languages [Infographic]]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/quality_code_matrix.jpg" style="" />
			</span>
 <a href="http://rackspace.com">Rackspace</a> recently published a nice infographic on the <a href="http://www.rackspace.com/cloud/blog/2011/05/17/infographic-evolution-of-computer-languages/">evolution of programming languages</a>. It starts with FORTRAN and COBOL and runs through Ruby on Rails (which, yes, is a framework and not a language). </p>

<p>Unfortunately, it omits such influential languages as <a href="http://en.wikipedia.org/wiki/Lisp_(programming_language)">Lisp</a>, <a href="http://en.wikipedia.org/wiki/ALGOL_60">ALGOL 60</a>  and <a href="http://en.wikipedia.org/wiki/Smalltalk">Smalltalk</a>. But including every important language ever would make for a pretty long infographic.</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/programming-languages_infographic_0711.png" style="" />
			</span>
</p>

<p>The popularity stats at the end are apparently sourced from the <a href="http://www.readwriteweb.com/hack/2011/01/javascripts-popularity-decline.php">oft-criticized</a> TIOBE index. You can find the full-sized graphic and bibliography <a href="http://www.rackspace.com/cloud/blog/2011/05/17/infographic-evolution-of-computer-languages/">here</a>.</p>
                    ]]></description>
                <link>http://readwrite.com/2011/07/27/the-history-of-programming-languages-infographic</link>
                <guid>http://readwrite.com/2011/07/27/the-history-of-programming-languages-infographic</guid>
                <category>Profiles</category>
                <pubDate>Wed, 27 Jul 2011 12:20:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[JavaScript Creator Says the Language Wasn't Just Dumb Luck]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/javascript_logo_1210.png" style="" />
			</span>
 JavaScript creator Brendan Eich has spoken out against the perception that JavaScript was an arbitrary or random success. In a <a href="http://news.ycombinator.com/item?id=2783060">comment at Hacker News</a> Eich explains the historical context from which JavaScript emerged and how it was unlikely to have happened any other way.</p>

<p>In a comment at <a href="http://lambda-the-ultimate.org/node/4308#comment-66267">Lambda the Ultimate</a>, Eich wrote: "History has reason and rhyme as well as chance, it is not all and only random. For my part, there was little 'arbitrary' in what I did, including the mistakes -- some of those weirdly recapitulated early LISP mistakes."</p>
<p>Eich's comments were in response to comments like <a href="http://lambda-the-ultimate.org/node/4308#comment-66103">this one</a>: "If Brendan Eich chose SmallTalk for the Netscape browser, that's probably what you'd be gushing about today."</p>

<p>And this one: "It's just dumb luck and path dependence. If Netscape had put scheme into Navigator we'd be using that instead."</p>

<p>Eich writes at Hacker News:</p>

<blockquote>"Subtle chains of cause and effect were at play among people involved, going back years to Silicon Graphics (Netscape drew from UIUC and SGI, plus montulli from Kansas, and jwz). Also going back through the living history of programming languages. SICP and some of the Sussman & Steele 'Lambda the ...' papers made a big impression on me years before, although I did not understand their full meaning then.

<p>"Remember, I was recruited to 'do Scheme', which felt like bait and switch in light of the Java deal brewing by the time I joined Netscape. My interest in languages such as Self informed a subversive agenda re: the dumbed down mission to make 'Java's kid brother', to have objects without classes. Likewise with first-class functions, which were inspired by Scheme but quite different in JS, especially JS 1.0.</p>

<p>"Apart from the 'look like Java' mandate, and 'object-based' as a talking point, I had little direction. Only a couple of top people at Netscape and Sun really grokked the benefit of a dynamic language for tying together components, but they were top people (marca, Rick Schell [VP Eng Netscape], Bill Joy).</p>

<p>"Rather than dumb luck, I think a more meaningful interpretation is that I was a piece of an evolving system, exploring one particular path in a damn hurry. That system contains people playing crucial parts. Academic, business, and personal philosophical and friendship agendas all transmitted an analogue of genes: ideas and concrete inventions from functional programming and Smalltalk-related languages."</p></blockquote>

<p>Eich also tells the story of how JavaScript came to be <a href="http://brendaneich.com/2008/04/popularity/">here</a>. </p>

<p>In short, the decision to use JavaScript instead of a language like Smalltalk or Scheme was far from arbitrary and arose from specific circumstances.</p>

<p>Also, although the lack of a better alternative has certainly ensured JavaScript's popularity, neither Netscape's nor JavaScript's success was ensured from the beginning. Other browsers, such as <a href="http://en.wikipedia.org/wiki/Mosaic_browser">Mosaic</a> and <a href="http://en.wikipedia.org/wiki/Cello_(web_browser)">Cello</a>, existed. I had all three on my computer around the time JavaScript was being developed, and I knew people who still preferred <a href="http://en.wikipedia.org/wiki/Lynx_(web_browser)">Lynx</a> to the graphical Web. </p>

<p>And as Eich points out, not everyone saw the value in a scripting language in addition to Java. Had JavaScript not been good enough, it could have been discontinued or replaced.</p>
                    ]]></description>
                <link>http://readwrite.com/2011/07/22/javascript-was-no-accident</link>
                <guid>http://readwrite.com/2011/07/22/javascript-was-no-accident</guid>
                <category>Profiles</category>
                <pubDate>Fri, 22 Jul 2011 04:00:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[Twitter Engineer Talks About the Company's Migration from Ruby to Scala and Java]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/archives/twitter_newbird_boxed_whiteonblue.png" style="" />
			</span>
 Twitter is famous for its use of Ruby on Rails, but as it has scaled the service up it has migrated some of its code to other technologies. The company began by <a href="http://www.readwriteweb.com/hack/2011/04/video-how-twitter-scales-with.php">migrating its back-end message queue to Scala</a> (which runs on the Java Virtual Machine), continued by rebuilding its back-end search in Java and most recently <a href="http://www.readwriteweb.com/cloud/2011/04/twitter-drops-ruby-for-java.php">replaced its search front-end with a Java server</a>.</p>

<p><a href="http://www.infoq.com/articles/twitter-java-use">InfoQ</a> is running an interview with Twitter engineer Evan Weaver, who explains more about the shift.</p>
<p>Here are a few interesting points:</p>

<ul>
	<li>The first-class languages at Twitter are JavaScript, Ruby, Scala and Java. Sometimes C is used as well.</li>
	<li>The usage of Ruby is shrinking at Twitter as JavaScript takes over the front-end and JVM-based languages take over the back-end.</li>
	<li>In general, developers at Twitter from a Ruby background prefer Scala, and those with a C/C++ background prefer Java.</li>
	<li>The search team uses Lucene and is experienced in Java. Java is more convenient for them than Scala or Ruby.</li>
	<li>Twitter uses a library called <a href="http://twitter.github.com/finagle/">Finagle</a> for building asynchronous RPC servers and clients in Java, Scala or any JVM language.</li>
	<li>The move to Scala and Java at Twitter is driven more by a need for encapsulation than for performance and scalability, and much of the existing Ruby code is quite workable for the time being.</li>
<li>Static typing has been a productivity boon as Twitter search has moved towards a service-oriented architecture.</li>
</ul>

<p>The interview also goes into more specific technical reasons for preferring Scala to Rails, such as better vertical integration. Weaver also talks about Twitter's overall architecture, which was described in the talk we covered <a href="http://readwriteweb.com/cloud/2011/01/how-twitter-uses-nosql.php">here</a>.</p>
                    ]]></description>
                <link>http://readwrite.com/2011/07/06/twitter-java-scala</link>
                <guid>http://readwrite.com/2011/07/06/twitter-java-scala</guid>
                <category>Profiles</category>
                <pubDate>Wed, 06 Jul 2011 09:45:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[A Look Back at the BeOS File System ]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/beos_logo_0611.png" style="" />
			</span>
 Neal Stephenson <a href="http://www.cryptonomicon.com/beginning.html">once wrote</a> that BeOS was the Batmobile of operating systems (Windows was a station wagon, MacOS was a European sports car and Linux was a free army tank). It was created in 1991. Its legendary file system, BFS, was created in 1997. Be Inc. was sold to Palm in 2001, and BeOS was to become the foundation of <a href="http://www.access-company.com/news/press/PalmSource/2004/021004_cobalt.html">PalmOS 6</a>, which was never used in a Palm device. However, the ideas behind BeOS live on in <a href="http://haiku-os.org/">Haiku</a>, an open source clone of the OS.</p>

<p><a href="http://arstechnica.com/open-source/news/2010/06/the-beos-filesystem.ars/">Ars Technica</a> has an interesting retrospective on the BeOS file system, BFS, which is now in use in both Haiku and <a href="http://en.wikipedia.org/wiki/SkyOS">SkyOS</a>. BFS had many forward-looking features, including 64-bit data structures, journaling and metadata support.</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/beos_screenshot_0611.png" style="" />
			</span>
</p>

<p>The article includes a handy glossary of storage terms and interviews with a Be engineer responsible for BFS and Haiku developer Axel Dörfler. The Be engineer asked not to be named "to comply with the wishes of his current employer." </p>

<p>There's <a href="http://www.thesearethedroids.com/2009/10/15/androids-heritage/">a good chance</a> that employer <a href="http://www.linkedin.com/search/fpsearch?company=be+inc&currentCompany=CP&searchLocationType=I&countryCode=us&keepFacets=keepFacets&page_num=1&facet_CC=1441&search=&pplSearchOrigin=MDYS&viewCriteria=1&sortCriteria=R&facetsOrder=CC%2CN%2CI%2CPC%2CED%2CL%2CFG%2CTE%2CFA%2CSE%2CP%2CCS%2CF%2CDR%2CG&redir=redir">is Google</a>, which employs several former Be engineers. </p>

<p>Several ex-Be employees <a href="http://www.osnews.com/story/7265">went to work for Danger</a> after the company was sold to Palm. Some of them moved on to Android, which was co-founded by Danger co-founder Andy Rubin and acquired by Google. Others stayed on at Palm, but ended up joining Google after PalmSource (which was spun out of Palm) was acquired by <a href="http://en.wikipedia.org/wiki/Access_Co.">Access</a>.</p>

<p><a href="http://en.wikipedia.org/wiki/Dominic_Giampaolo">According to Wikipedia</a> Dominic Giampaolo, one of the main developers of BFS and author of <a href="http://www.nobius.org/~dbg/practical-file-system-design.pdf">Practical File System Design with the Be File System</a> (PDF), has been working on file systems for Apple since 2002. One of the other key developers, <a href="http://www.linkedin.com/profile/view?id=285624&authType=name&authToken=LoI3&locale=en_US&pvs=pp&trk=ppro_viewmore">Cyril Meurillon</a>, is now a consultant.</p>
                    ]]></description>
                <link>http://readwrite.com/2011/06/29/a-look-back-at-the-beos-file-s</link>
                <guid>http://readwrite.com/2011/06/29/a-look-back-at-the-beos-file-s</guid>
                <category>Profiles</category>
                <pubDate>Wed, 29 Jun 2011 12:00:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[ilearnedtoprogram.com - Share the Story of Why and How You Learned to Program]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/quality_code_matrix.jpg" style="" />
			</span>
 Every time you refresh <a href="http://ilearnedtoprogram.com/">ilearnedtoprogram.com</a> you'll see a different one-sentence story from a programmer about how or why they started programming. </p>

<p>At the moment, there are 351 stories.</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/images/ilearnedtoprogram_screenshot_0411.jpg" style="" />
			</span>
</p>

<p>Here are a few:</p>

<blockquote>
<p>"by finding artistic problems which made absorbing the knowledge easy." - <a href="http://twitter.com/LilliThompson">Lilli Thompson</a></p>

<p>"because my first computer would only sit with a blinking cursor if I didn't program it." - <a href="http://blogs.msdn.com/ashleyf">Ashley Feniello</a></p>

<p>"because my TRS-80 didn't come with any video games, so I had to write my own."<br />
 - <a href="http://mykle.com/">Mykle Hansen</a></p>

<p>"by writing a BASIC game on a manual typewriter, and being driven by my mom to the local Radio Shack where I typed it in to the TRS-80 I coveted." - <a href="https://www.facebook.com/mfhillman">Matt Hillman</a></p>

<p>"in college, when I realized I could have a big impact on the world by building software with a focus on humanity." - <a href="http://vanessahurst.com/">Vanessa Hurst</a></p>

<p>"because I believed (and still do) that it's one of the best skills to have in order to change people's lives for the better." - <a href="http://www.alyssadaw.com/">Alyssa Daw</a></p>

<p>"when I wanted a piano and got an Atari -- programmed the keys to make sounds." - <a href="http://www.cs.colorado.edu/~ksiek">Katie A. Siek</a></p>

<p>"because I wanted to make interactive electronic literature." - <a href="http://zuz.husarova.net/">Zuzana Husarova</a></p>

<p>"because I wanted to build cool things. Now that's my job." - <a href="http://alexgaynor.net/">Alex Gaynor</a></p>

<p>"because I like learning languages -- computer ones included!" - <a href="http://www.dotdiva.org/profiles/angelica.html">Angelica Lim</a></p>

<p>"after graduating with a Philosophy degree." - <a href="http://modern-carpentry.com/">Thomas Saunders</a></p>

<p>"by being a designer who needed a programmer and couldn't find one!" - <a href="http://www.catherinehicks.com/">Catherine Hicks</a></p></blockquote>
                    ]]></description>
                <link>http://readwrite.com/2011/04/26/ilearnedtoprogramcom---share-t</link>
                <guid>http://readwrite.com/2011/04/26/ilearnedtoprogramcom---share-t</guid>
                <category>Profiles</category>
                <pubDate>Tue, 26 Apr 2011 08:45:00 -0700</pubDate>
                <author>Klint Finley</author>
            </item>
                    <item>
                <title><![CDATA[Secrets of BackType's Data Engineers]]></title>
                <description><![CDATA[
                                        <p>
<span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/backtypelogo.jpg" style="" />
			</span>
How do three guys with only seed funding process a hundred million messages a day? I sat down with the <a href="http://backtype.com/">BackType</a> team to discover how they built a service relied upon by companies like bit.ly, Hunch and The New York Times. 
</p>
<p>
BackType captures online conversations, everything from tweets to blog comments to checkins and Facebook interactions. Its business is aimed at helping marketers and others understand those conversations by measuring them in a lot of ways, which means processing a massive amount of data.  
</p>
<p>
To give you an idea of the scale of its task, it has about 25 terabytes of compressed binary data on its servers, holding over 100 billion individual records. Its API serves 400 requests per second on average, and it keeps 60 EC2 servers running at all times, scaling up to 150 for peak loads. 
</p>
<div style="width:104; float:right; margin:30"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/backtype_christopherg.jpg" style="" />
			</span>

<i>Christopher Golda</i>
<span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/backtype_mikem.jpg" style="" />
			</span>

<i>Michael Montano</i>
<span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/backtype_nathanm.jpg" style="" />
			</span>

<i>Nathan Marz</i></div>
<p>It has pulled this off with only seed funding and just three employees: <a href="http://twitter.com/golda">Christopher Golda</a>, <a href="http://twitter.com/michaelmontano">Michael Montano</a> and <a href="http://twitter.com/nathanmarz">Nathan Marz</a>. They're all engineers, so there aren't even any sysadmins to take some of the load.
</p>
<p> 
Coping with that volume of data with limited resources has forced them to be extremely creative. They've invented their own language, <a href="http://nathanmarz.com/blog/introducing-cascalog-a-clojure-based-query-language-for-hado.html">Cascalog</a>, to make analysis easy, and their own database, ElephantDB, to simplify delivering the results of their analysis to users. They've even written a system to update traditional batch processing of massive data sets with new information in near real-time.
</p>
<p> 
The backbone of BackType's pipeline is Amazon Web Services, using S3 for storage and EC2 for servers. It leverages technologies such as Clojure, Python, <a href="http://hadoop.apache.org/">Hadoop</a>, <a href="http://cassandra.apache.org/">Cassandra</a> and <a href="http://thrift.apache.org/">Thrift</a> to process this data in batch and real-time. 
</p>
<p>
The start of the pipeline is a group of machines that ingest data from the Twitter firehose, the Facebook API and millions of sites and other social media services. The first interesting feature of the architecture is that it actually has two different pipelines: a traditional batch layer that takes hours to produce results, and a "speed layer" that reflects new changes immediately.
</p>
<p>
Captured data is fed into the batch layer through processes on each machine called collectors. These append new data to a local file, which is then copied over to S3 periodically. This raw data is then put through a process they call shredding, which organizes it in two different ways. First, data units are stored with others of the same type. For example the content of tweets or blog comments would be stored together and separate from the names of their authors. Second, the same data is sliced by time, so everything within a single day will be stored together.
</p>
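<p>
As a rough illustration of the shredding idea, the sketch below groups each field of a record by type and by day. It is not BackType's code: the record layout, field names and sample values are hypothetical, and plain in-memory lists stand in for files on S3.
</p>

```python
from collections import defaultdict
from datetime import datetime, timezone


def shred(records):
    """Group each field of each record under a (field_name, day) bucket.

    The record layout (id, timestamp, content, author) is hypothetical.
    """
    buckets = defaultdict(list)
    for rec in records:
        moment = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        day = moment.date().isoformat()
        for field in ("content", "author"):
            # Store the record id alongside the value so fields can be rejoined.
            buckets[(field, day)].append((rec["id"], rec[field]))
    return buckets


records = [
    {"id": 1, "timestamp": 1294790400, "content": "hello", "author": "alice"},
    {"id": 2, "timestamp": 1294794000, "content": "RT hello", "author": "bob"},
    {"id": 3, "timestamp": 1294876800, "content": "next day", "author": "alice"},
]
buckets = shred(records)

# A job that only needs message text for one day reads a single bucket.
print(buckets[("content", "2011-01-12")])
```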
<span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/hack/backtypediagram.png" style="" />
			</span>

<p>
Why do they do this? The organization of the data enables them to run more efficient queries only against the relevant data. When they have a job that requires analyzing Twitter retweets, for example, they can just pull out the content, sender and time for each message, and ignore all other metadata. This process is made a lot easier thanks to their use of Thrift for the data storage. Everything in their system is described by a graph-like Thrift schema, which controls the folder hierarchy the data is stored in, and automagically creates the Java/Python/etc code for serialization. 
</p>
<p>
Cascalog is one of their secret weapons, a <a href="http://clojure.org/">Clojure</a>-based query language for Hadoop that makes it simple for them to analyze their data in new ways. Inspired by the venerable <a href="http://en.wikipedia.org/wiki/Datalog">Datalog</a>, and built on top of <a href="http://www.cascading.org/">Cascading</a>, it allows you to write queries in Clojure and define even complex operations in simple code. Unlike alternatives like <a href="http://research.yahoo.com/project/90">Pig</a> or <a href="http://wiki.apache.org/hadoop/Hive">Hive</a>, it's written within a general-purpose language, so there's no need for separate user-defined functions, but it's still a highly-structured way of defining queries.
</p>
<p>
 Its power has enabled them to quickly add features like domain-level statistics and per-user influence scores with just a couple of screens of code. It's spread beyond BackType and has an active user community including companies like eHarmony, PBworks and Metamarkets.
</p>
<p>
The final part of the batch processing puzzle is how to get the results of your analysis to the final user. They experimented with writing out the data to a Cassandra cluster, but ran into performance issues. What they ended up creating instead was a system they call ElephantDB. It takes all the data from a batch job, splits it up into shards, each of which is written out to disk as BerkeleyDB-format files. After that they fire up an ElephantDB cluster to serve the shards. Unlike many traditional databases, it's read-only, so to update data served from the batch layer you create a new set of shards.
</p>
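<p>
The shape of that design can be sketched in a few lines. This is only an illustration of the read-only, sharded approach described above, not ElephantDB's implementation: the CRC-based shard assignment, the class name and the sample data are assumptions, and in-memory dictionaries stand in for the BerkeleyDB-format files.
</p>

```python
import zlib


def shard_for(key, num_shards):
    """Pick a shard with a stable hash so writers and readers agree."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_shards(results, num_shards):
    """Split a batch job's key/value output into num_shards dictionaries."""
    shards = [dict() for _ in range(num_shards)]
    for key, value in results.items():
        shards[shard_for(key, num_shards)][key] = value
    return shards


class ReadOnlyStore:
    """Serves one generation of shards; updates arrive as a whole new set."""

    def __init__(self, shards):
        self._shards = shards

    def get(self, key, default=None):
        shard = self._shards[shard_for(key, len(self._shards))]
        return shard.get(key, default)


first_run = {"example.com": 42, "example.org": 7}
store = ReadOnlyStore(build_shards(first_run, num_shards=4))
print(store.get("example.com"))

# A later batch run does not mutate the store; it builds a replacement.
second_run = {"example.com": 45, "example.org": 7}
store = ReadOnlyStore(build_shards(second_run, num_shards=4))
print(store.get("example.com"))
```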
<p>
So that's how the heavy processing is done, but what about instant updates? The speed layer exists to compensate for the high latency of the batch layer. It is completely transient, and because the batch layer is constantly running it only needs to worry about new data. The speed layer can often make aggressive trade-offs for performance because the batch layer will later extract deep insights and run tougher computations. It takes the data that came in after the last batch processing job and applies fast-running algorithms.
</p>
<p>
Because the Hadoop processing is run once or twice a day, the speed layer only has to keep track of a few hours of data to produce its results. The smaller volume makes it easy to use database technologies like MySQL, Tokyo Tyrant and Cassandra in the speed layer. Crawlers put new data on <a href="http://gearman.org/">Gearman</a> queues and workers process and write to a database. When the API is called, a thin layer of code queries both the speed layer database and the batch ElephantDB system, and merges the information from both to produce the final output that's shown to the outside world.
</p>
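<p>
For a simple additive metric, that merge step might look something like the sketch below. The two dictionaries are hypothetical stand-ins for the ElephantDB view and the speed layer database, and the assumption that the values are counts that can simply be summed will not hold for every statistic.
</p>

```python
def merged_count(key, batch_view, speed_view):
    """Combine the precomputed batch total with the recent speed-layer delta."""
    return batch_view.get(key, 0) + speed_view.get(key, 0)


# Hypothetical views: the batch view covers everything up to the last Hadoop
# run, and the speed view covers only what has arrived since then.
batch_view = {"example.com": 1200}
speed_view = {"example.com": 17, "new-site.example": 3}

for domain in ("example.com", "new-site.example", "unknown.example"):
    print(domain, merged_count(domain, batch_view, speed_view))
```

<p>
Once the next batch run finishes, the speed view can be thrown away and rebuilt from only the newest data, which is why it never has to hold more than a few hours' worth.
</p>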
<p>
BackType isn't the only startup to split its processing using this combination of speed and batch layers; Hunch does something similar for its user recommendations. The trouble is that nobody has found an approach that is as elegant or generally applicable as MapReduce for real-time processing of continuous streams of data. 
</p>
<p>
<div class="pullquote">Instead of the firefighting and housekeeping burden I'd expect from such a complex system, they seem to spend most of their time focused on applications that solve customer problems.</div>Yahoo's <a href="http://labs.yahoo.com/event/99">S4 "Distributed Stream Computing Platform"</a> is an interesting start, but Marz explained that they weren't able to build on top of it because it didn't offer any reliability guarantees, thanks to its use of UDP for communication. The lack of unit tests also made it daunting, since it would be tough to spot if any modifications they needed to make had introduced subtle bugs. 
</p>
<p>
Instead, Marz and Montano have been working on a new framework based on their own experiences. The technology that manages the stream processing and guarantees the reliability of messages is called Storm. Though it can run code written in a variety of languages, they've designed one especially for it, called Thunderlog, which is based on Cascalog.
</p>
<p>
Though they are not yet ready for release, Storm and Thunderlog are being actively developed and will soon replace BackType's more hand-coded speed layer. The new system will incorporate many of the lessons the team learned while building the first one. For instance, to avoid concurrency issues without paying a performance penalty, you can group events by key so that possibly conflicting changes happen on the same machine in a serial fashion.
</p>
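<p>
The grouping trick can be sketched as follows. This is an illustration under stated assumptions rather than Storm's implementation: the helper names <code>worker_for</code> and <code>partition</code> are hypothetical, and a CRC32 hash is just one stable way to map a key to a worker.
</p>

```python
import zlib


def worker_for(key, num_workers):
    """Map a key to a worker index using a hash that is stable across processes."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def partition(events, num_workers):
    """Split (key, value) events into one ordered list per worker.

    All events sharing a key land in the same list, in arrival order,
    so a single worker can apply them serially without locking.
    """
    queues = [[] for _ in range(num_workers)]
    for key, value in events:
        queues[worker_for(key, num_workers)].append((key, value))
    return queues
```

<p>
Since every update for a given key goes through one worker, two machines never race to change the same record, and no cross-machine coordination is needed.
</p>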
<p>
At the end of the tour of <a href="http://tech.backtype.com/">their technology</a>, I was left very impressed by how much they have accomplished with so few engineers. Instead of the firefighting and housekeeping burden I'd expect from such a complex system, they seem to spend most of their time focused on applications that solve customer problems. 
</p>
<p>
The secret is their ability to automate the routine tasks with tools like Cascalog, ElephantDB and Thunderlog. Writing those tools allows them to spend their limited time on new applications that offer direct value to their users, without having to wrestle with screenfuls of boilerplate code first. They are on the lookout for new team members, and say they've only stayed so small because they are committed to hiring only the very best. If you're interested in working on the cutting edge of big data processing, drop them an email at <a href="mailto:jobs@backtype.com">jobs@backtype.com</a>.
</p>
                    ]]></description>
                <link>http://readwrite.com/2011/01/12/secrets-of-backtypes-data-engineers</link>
                <guid>http://readwrite.com/2011/01/12/secrets-of-backtypes-data-engineers</guid>
                <category>Profiles</category>
                <pubDate>Wed, 12 Jan 2011 06:00:00 -0800</pubDate>
                <author>Pete Warden</author>
            </item>
            </channel>
</rss>

