Mapping the Largest Hip-Hop Vocabularies of Rappers
New York-based designer Matt Daniels crunched the numbers on a list of 85 rappers to find which emcee has the largest vocabulary in hip-hop. To level the playing field between newer and older artists, the dataset was culled from the first 35,000 lyrics of each musician’s catalog on the website Rap Genius, a number that comes out to around three to five albums’ worth of music. For context, Daniels also included the famously verbose Moby-Dick and the works of Shakespeare as benchmarks.
Daniels describes his method this way: “I used a research methodology called token analysis to determine each artist’s vocabulary. Each word is counted once, so pimps, pimp, pimping, and pimpin are four unique words. To avoid issues with apostrophes (e.g., pimpin’ vs. pimpin), they’re removed from the dataset. It still isn’t perfect. Hip hop is full of slang that is hard to transcribe (e.g., shorty vs. shawty), compound words (e.g., king shit), featured vocalists, and repetitive choruses.”
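The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Daniels's actual code: the function name unique_words, the lowercasing step, and the rule that a token is any run of letters or digits are all hypothetical choices, since the article only says that apostrophes are removed, that each distinct word form counts once, and that the first 35,000 words are used.

```python
import re

# Straight and curly apostrophes, all removed before counting.
APOSTROPHES = "'\u2019\u2018"


def unique_words(lyrics, limit=35_000):
    """Return the set of distinct word forms among the first `limit` tokens."""
    # Drop apostrophes so "pimpin'" and "pimpin" collapse into one token.
    cleaned = lyrics.translate(str.maketrans("", "", APOSTROPHES))
    # Assumption: lowercase the text and treat runs of letters/digits as tokens.
    tokens = re.findall(r"[a-z0-9]+", cleaned.lower())
    # No stemming: "pimp", "pimps", and "pimping" stay distinct.
    return set(tokens[:limit])


sample = "Pimpin' ain't easy, pimpin aint easy; pimps pimp pimping"
print(len(unique_words(sample)))  # 6 distinct word forms
```

Because nothing is stemmed, inflected forms inflate the count, and because spelling is taken literally, shorty and shawty are counted as two different words, which matches the limitations acknowledged above.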
San Francisco rapper Aesop Rock came in first, followed by the Wu-Tang Clan, whose members scored highly both as a group and as solo acts.