Since September 2007, the walls of the Regenstein Library have provided me with an ever-growing data set of the joys, whimsies, and woes of University of Chicago students, as expressed through their graffiti. At the school where fun comes to die, I’ve seen long threads using philosophers’ names in wordplay, critical assessments of the advice offered by others, complete poems, and even hieroglyphic sex graffiti.
Working exclusively with a single data set, I developed certain assumptions about university graffiti in general. First, I assumed that most university libraries had a rich collection of this kind of illicit material—but this was disproven as soon as I started looking at other libraries on campus. Perhaps unsurprisingly, the law library was spotless, the value of speaking one’s mind hardly being worth the risk of being charged with a class B misdemeanor for criminal defacement of property. Even the science library was kept spick and span, and I had to beat the cleaning staff to the study room blackboards in the morning if I wanted anything resembling graffiti. As I expanded my collection to other universities (usually stopping by while I was on campus anyway for work-related trips), I found that cleaning practices vary widely. Small liberal arts colleges have tended to be the most graffiti-free, whereas the University of Colorado - Boulder, Arizona State University, and Brown are liberally covered in graffiti—though for the latter two, the graffiti is largely on surfaces that would be difficult to clean without causing serious damage. (If you’re ever picking furniture for a library, don’t choose wood or fabric-covered study carrels.)
Second, I assumed that all libraries would have graffiti more or less as interesting as that found at the University of Chicago, albeit perhaps with somewhat fewer arcane references. Regrettably, I was disappointed again and again. Each school was different, and the “interestingness” of the graffiti correlated well with the prestige of the institution. Arizona State’s corpus was so uninteresting that it had its own kind of morbid interest: visions of a post-apocalyptic future where written communication primarily takes the form of frat names. The University of Colorado - Boulder’s data set is hard to describe beyond “unremarkable”, lacking a striking amount of any particular genre of graffiti. Berkeley, the first university I visited beyond the University of Chicago, had a surprising number of pieces proclaiming and discussing identity (ethnic, religious, etc.). Brown came closest to the University of Chicago in content and spirit (yes, there are hieroglyphs there, too), though sex is a much greater focus of interest at Brown: nearly half the uses of “ass” and “suck” at Brown are sexual, compared to 25% and 16%, respectively, at the University of Chicago.
I began exploring other universities’ graffiti simply out of curiosity, but by last fall I realized what I had gathered: large corpora of sociological data from five fairly diverse institutions of higher education in the United States. I had my own general sense of each data set, but I wanted to look at them more rigorously and convert intuition into something quantifiable. In short, I wanted to take last year’s tongue-in-cheek “statistical analysis of graffiti” (which was published here on Inkling and which, to my great surprise, people took seriously despite pervasive methodological flaws that should have signaled comic intent rather than analytical rigor) and try doing it for real.
Data preparation and classification
I already had transcriptions of most of the graffiti, photo-by-photo, but many of the photos contained multiple unique pieces of graffiti, and photo-based groupings are completely arbitrary. So I separated each piece of graffiti, and linked together pieces that formed a conversation (to the extent I could discern their intent) using a unique identifier. I classified each piece of graffiti using 22 categories I thought would apply to most or all of the corpora, including advice, insults, love, meta (graffiti about graffiti, or the surface it’s written on), quotes, presence (“X was here” and variations), school, and sex. I later realized that I had erred in making “time” its own category—while the choice is defensible based on the University of Chicago and Brown corpora, its presence was mostly limited to those two schools.
Most controversially, I assigned each piece of graffiti an “interestingness” score, from 1 to 3. A score of 1 indicated something cliché, predictable, or incomplete: one “advice-1” was We will all be alright; one “love-1” was I love Jenny; and one “misc-1” was wage labor. I scored graffiti 2 if it represented a more fleshed-out contribution, a non-obvious reply, or use of less-obvious wording: one “advice-2” was Save yourself, it feels better, more rewarding; one “love-2” was How do you know when you’re in love?; and one “misc-2” was Pro’s - Eat pizza. - Con’s - Be tired tomorrow. A score of 3 was reserved for pieces with some substance or spark to them: one “advice-3” was Go to Tibet. Chant with the monks.; one “love-3” was Academia vs. Love (with tally marks underneath); and one “misc-3” was (in response to Alla-kazaam!) Semantically, does this likely derive from “Allah”? Probs!
A very small number of pieces were assigned a 4, in cases where I felt the content was a step above even the 3s in its category, such as this “misc-4”: Magnificence is dead. The nosferati wait. Random acts in sporadic art the graffitti on walls they don’t speak to me. Much is clever. Much is technical. Much is a knife twisting variation. But an act of art that leaves you shuddering. We are waiting. What about you? Perhaps soon.
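For anyone who wants to work with the spreadsheets programmatically, the coding scheme described above could be modeled roughly as follows. This is only a sketch, not the actual spreadsheet layout: the class name `GraffitiPiece`, its field names, the one-category-per-piece assumption, and the `conv-001` identifier are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GraffitiPiece:
    """One separated piece of graffiti (hypothetical field names)."""
    school: str
    text: str
    category: str          # assumed: one primary category, e.g. "advice"
    interestingness: int   # 1 to 3, with a rare 4
    conversation_id: Optional[str] = None  # shared by pieces in one thread

    def __post_init__(self):
        # Reject scores outside the range the scheme allows.
        if not 1 <= self.interestingness <= 4:
            raise ValueError("interestingness must be between 1 and 4")

    @property
    def label(self) -> str:
        # Builds labels in the style used above, such as "advice-3".
        return f"{self.category}-{self.interestingness}"


def group_conversations(pieces):
    """Map each conversation id to its pieces; standalone pieces are skipped."""
    threads = defaultdict(list)
    for piece in pieces:
        if piece.conversation_id is not None:
            threads[piece.conversation_id].append(piece)
    return dict(threads)


pieces = [
    GraffitiPiece("University of Chicago", "Go to Tibet. Chant with the monks.",
                  "advice", 3),
    # "conv-001" is a made-up identifier for illustration only.
    GraffitiPiece("University of Chicago",
                  "Semantically, does this likely derive from \"Allah\"? Probs!",
                  "misc", 3, conversation_id="conv-001"),
]
threads = group_conversations(pieces)
```

Keeping the conversation identifier on each piece, rather than nesting replies inside one another, mirrors the flat one-row-per-piece shape a spreadsheet would have.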
People may disagree with my interestingness rankings (and have already done so), but the data is all available as Google Docs spreadsheets, and I’d be curious to see how radically others’ evaluations of the data differ from my own. If you can’t buy into my interestingness criteria, the analysis as a whole has other things to offer, including sources of quotes and references, genres of music quoted, expressions of homophobia, sexual vs. non-sexual word use, and a Venn diagram of love vs. hate. Nonetheless, interestingness remains for me the major focus of the study.
Using the interestingness scores, I calculated the average (mean) interestingness for each corpus, and for each category within the corpus. Perhaps unsurprisingly, the University of Chicago came out on top, with a score of 1.79. Brown followed at 1.56, then Berkeley at 1.43, University of Colorado at 1.38, and Arizona State at a dismal 1.23.
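The averaging step is easy to reproduce. Below is a minimal sketch, assuming each row has been reduced to a `(school, category, score)` tuple; the function name `mean_interestingness` and the toy rows are hypothetical and are not drawn from the real corpora.

```python
from collections import defaultdict
from statistics import mean


def mean_interestingness(rows):
    """Return (mean per school, mean per (school, category)) for scored rows."""
    by_school = defaultdict(list)
    by_school_category = defaultdict(list)
    for school, category, score in rows:
        by_school[school].append(score)
        by_school_category[(school, category)].append(score)
    school_means = {key: mean(scores) for key, scores in by_school.items()}
    category_means = {key: mean(scores)
                      for key, scores in by_school_category.items()}
    return school_means, category_means


# Toy rows for illustration only; these are NOT the real data.
sample_rows = [
    ("School A", "advice", 1),
    ("School A", "advice", 3),
    ("School A", "love", 2),
    ("School B", "love", 1),
    ("School B", "misc", 2),
]

school_means, category_means = mean_interestingness(sample_rows)
for school, value in sorted(school_means.items()):
    print(f"{school}: {value:.2f}")
```

Note that a corpus-level mean weights every piece equally, so a category with many pieces pulls the school's score more than a sparse one does.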
Does sample size affect interestingness?
The University of Chicago’s high score inevitably leads to people crying foul on the grounds of sample size—the top two schools have by far the largest corpora, with the University of Chicago at 1455 pieces, followed by Brown at 930. Arizona State has 507, the University of Colorado has 262, and Berkeley has 147, so there’s clearly not a perfect correlation between sample size and interestingness score. Nonetheless, I wanted a better response to the critics.
Because I’ve checked the University of Chicago stacks for graffiti more or less weekly for years, I broke that data down by quarter, yielding individual sample sizes ranging from 204 down to 39, the low end being much smaller than even the Berkeley corpus. I calculated the Pearson coefficient (which measures the strength of a linear correlation, with 0 indicating no correlation and +1 or -1 indicating perfect correlation) for quarter-based corpus size and interestingness score, and got -0.11: a very weak correlation, and a slightly negative one at that, so larger samples did not produce higher scores. I think it’s fair to say that most of the metrics for the Berkeley corpus should be invalidated due to issues with the corpus size (do you really want to say that 33% of the quotes are from the Bible, when there are only three quotes?), but the interestingness score is solid, assuming you buy into the methodology.
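For readers who want to repeat the sample-size check, the Pearson coefficient can be computed by hand as sketched below. The function name `pearson_r` is my own, and the per-quarter numbers are placeholders: only the endpoints 204 and 39 come from the text above, while the other sizes and all of the scores are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariation / (spread_x * spread_y)


# Placeholder per-quarter values, NOT the real data.
quarter_sizes = [204, 120, 75, 39]
quarter_scores = [1.7, 1.9, 1.8, 1.8]

r = pearson_r(quarter_sizes, quarter_scores)
print(f"r = {r:.2f}")
```

On Python 3.10 or later, `statistics.correlation` from the standard library should give the same result; the hand-written version just makes the formula visible.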
The full analysis of Arizona State University, the University of Colorado - Boulder, Berkeley, Brown, and the University of Chicago can be found on the Crescat Graffiti blog, but I’ll conclude with a few unexpected results:
• I was initially dismissive of quotes as derivative works that didn’t reflect original thought, and was surprised to discover that actually, the habit of quoting sources is significantly more prevalent at higher-ranked schools like Brown (7% of all graffiti) and the University of Chicago (10%). References appear almost twice as often as quotes at the University of Colorado (9% vs. 5%), and the reference-to-quote ratio is 8:1 at Arizona State (8% vs. 1%).
• Rock music in its many flavors (indie, alternative, punk, and others) represents a plurality of the music quotes at the University of Chicago and Brown. Rap quotes appear almost twice as frequently at the University of Chicago (15%) as at Brown (8%).
• Love is overwhelmingly more common than hate at Brown and Arizona State, but at the University of Chicago the numbers are significantly closer. Both “you” (University of Colorado and Brown) and “school” (Arizona State and the University of Chicago) appear in multiple corpora as an object of both love and hate.
P.S. If you’d like to take your own crack at the data I’ve collected, or if you see your school represented here and you’ve got a burning question you’d like answered about the graffiti you see every day, you can find links to all the spreadsheets I’ve created here. I’d be delighted to see what you come up with.