In much the same way our bones absorb trace chemicals that can tell future scientists about the environment we lived in, our music absorbs cultural influences, social memes, worries and joys that are in the air at the moment of its creation. Tahir Hemphill, a designer, technologist and longtime student of hip-hop, wants to hasten our understanding of how music does that with The Hip-Hop Word Count.

This "searchable ethnographic database built from the lyrics of over 40,000 Hip-Hop songs from 1979 to present day...generates textual and quantified reports on searched phrases, syntax, memes and socio-political ideas."

"How can analyzing lyrics teach us about our culture?"

Tahir, a resident at Eyebeam Art+Technology Center, where the Hip-Hop Word Count is being developed, asked that question of himself, and answered it.

"The Hip-Hop Word Count locks in a time and geographic location for every metaphor, simile, cultural reference, phrase, meme and socio-political idea used in the corpus of Hip-Hop. The Hip-Hop Word Count then converts this data into explorable visualisations which help us to comprehend this vast set of cultural data.

This data can be used to chart the migration of ideas and builds a geography of language and is the engine for a teaching curriculum."

In other words, want to find out who the most latinate rappers are? The most monosyllabic? What the most popular champagne was in a given period? Which region produced rappers who name-checked the most? Which time period was the most concerned with social issues or consumer goods? Search the database.

The database contains metadata as well as data. For each lyric, the metadata includes the number of words, the average number of words per line and characters per word, the reading level of the lyrics (from junior high to post-grad) and the geolocation of the rapper.
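To make the simpler per-lyric measurements concrete, here is a minimal sketch of how a word count, average words per line and average characters per word might be computed. It is an illustration only, not the project's actual code: the function name `lyric_stats`, the tokenizing rule and the sample text are all assumptions, and reading level and geolocation are left out because the article does not say how they are derived.

```python
import re


def lyric_stats(lyric: str) -> dict:
    """Return simple size statistics for one lyric (hypothetical helper)."""
    # Count only non-empty lines so blank lines between verses are ignored.
    lines = [line for line in lyric.splitlines() if line.strip()]
    # Assumed tokenizing rule: runs of letters, digits and apostrophes are words.
    words = re.findall(r"[A-Za-z0-9']+", lyric)
    word_count = len(words)
    return {
        "word_count": word_count,
        "avg_words_per_line": word_count / len(lines) if lines else 0.0,
        "avg_chars_per_word": (
            sum(len(word) for word in words) / word_count if word_count else 0.0
        ),
    }


# Invented two-line sample, not taken from any real song.
sample = "money on my mind\ncash in hand"
print(lyric_stats(sample))
```

A real pipeline would also have to decide how to treat ad-libs, repeated hooks and hyphenated slang, each of which shifts these averages.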


Funding for Future Phases

Hip-Hop Word Count is seeking its next-stage capital via the crowdfunding site Kickstarter. But this isn't Tahir's first rodeo. He's a long-term fan steeped in the sounds of his native New York. Through his umbrella group Staple Crops, and under the auspices of Eyebeam, he is also conducting a series of Rap Research Groups.

The RRGs are "lively and casually moderated discussions between Rap enthusiasts, historians, creative technologists, cultural critics, linguists, teachers, MC's and academics."

As of this writing, the Hip-Hop Word Count has raised $4,376 of the $7,500 it needs. The money will pay for finishing the coding and design, as well as database clean-up and hosting. The funding campaign ends 24 days from today.

Hip-Hop Data

In a conversation, Tahir said the initial tool will be free and accessible to everyone, non-monetized, with an API available. His initial work has drawn the attention of academics as well as the music's enthusiasts. Professors have expressed an interest in using such a database for sociological and linguistic research. Tahir himself has begun to develop curricula for mathematics and language arts based on the database and its lyrics.

He has also begun strategizing the kind of reports that could be profitably drawn from a growing Hip-Hop Word Count.

"I have a data set of unemployment figures between 1980 and 2010. I'm pulling references, slang references, to money. You know, 'cream,' 'cash,' 'guacamole.' There may be a correlation between (the way these terms are used) and the situation."
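The kind of comparison Tahir describes could be sketched roughly as follows: count money-slang mentions per year, normalize by the number of words, and compute a Pearson correlation against a yearly unemployment series. Everything here is an assumption made for illustration, including the function names, the (year, lyric text) input shape and the per-1,000-words normalization. Only the three slang terms come from the quote, and no real unemployment figures are included.

```python
import math
import re
from collections import Counter

# The three terms Tahir mentions; a real study would use a much longer list.
MONEY_SLANG = {"cream", "cash", "guacamole"}


def slang_rate_by_year(songs):
    """Map year -> money-slang mentions per 1,000 words.

    `songs` is an iterable of (year, lyric_text) pairs (assumed input shape).
    """
    hits, totals = Counter(), Counter()
    for year, text in songs:
        words = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(words)
        hits[year] += sum(1 for word in words if word in MONEY_SLANG)
    return {year: 1000 * hits[year] / totals[year] for year in totals if totals[year]}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

To use it, one would line up the slang rates and the unemployment figures for the same years and pass the two lists to `pearson`. A coefficient near 1 or -1 would only suggest a relationship worth investigating; it would not show that the economy drives the vocabulary.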

You could imagine the money - I beg your pardon, the "scrillah," the "cheese," the "scratch" - a label, for instance, might be willing to pay for a study that linked certain types of language use with up-times and down-times. Tahir, an arts and sciences polymath with an advertising resume, acknowledges this, but says "I can't zoom out that far these days."

We'll see what happens when the Word Count is up and running in a public sphere.

Rap Research photo courtesy of Eyebeam