Superhero Metrics, Part I: The Problem
A few weeks ago, I was talking to my brother about superheroes and other comic book characters, and specifically about how to rank them by popularity or cultural prominence. (Why? What do you talk to your little brother about?*)
We agreed that Batman, Superman, and Spider-Man were probably the top three, but we didn’t know who would come next.
Because I am an information scientist and this type of information problem is right up my alley, I kept thinking about this for a couple of days after the conversation.
The most straightforward approach would be to do a simple Google keyword search and count the number of hits. This works reasonably well if the words being searched don’t have prominent alternate meanings. (Most of the hits for “Batman,” for instance, are probably referring to the Dark Knight, and not the province in Turkey.)
However, this type of metric completely falls apart when you start trying to search for comics characters with names like “Wolverine” or “Cyclops.” (You could add limiting terms to the search, such as “Wolverine X-Men” or “Cyclops X-Men,” but then the results for those searches aren’t comparable to bare searches for “Batman” or “Superman.”†)
Having ruled out a keyword search as insufficiently precise, my thoughts turned to Wikipedia, one of my favorite resources.
The most obvious way to use Wikipedia to gauge cultural prominence would be to compare the lengths of the articles about different characters. For instance, the article about Professor X is about 11,000 words long, but the article about Xorn (a more recent addition to the X-Men franchise) is only about 3,000 words long.
I think this approach would work fairly well for differentiating between minor and major cultural icons. However, Wikipedia guidelines state that articles longer than around 10,000 words should be broken into multiple articles. That makes word count unsuitable for ranking the top cultural figures: once a character is prominent enough to reach that limit, article length plateaus, and the biggest names all end up looking about the same.
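To make the plateau problem concrete, here is a minimal Python sketch of the word-count metric. It is an illustration under stated assumptions, not a tool I actually ran. The function names and the "Hero A" and "Hero B" figures are made up for the example. The Professor X and Xorn numbers are the approximate counts mentioned above, and the 10,000-word cap stands in for the rough guideline.

```python
# Rough guideline figure from the post; treated here as a hard cap
# purely for illustration.
SPLIT_THRESHOLD_WORDS = 10_000


def word_count(text: str) -> int:
    """Count whitespace-separated words in a block of article text."""
    return len(text.split())


def rank_by_length(lengths: dict[str, int]) -> list[tuple[str, int]]:
    """Sort characters from longest article to shortest."""
    return sorted(lengths.items(), key=lambda item: item[1], reverse=True)


def capped(lengths: dict[str, int], cap: int = SPLIT_THRESHOLD_WORDS) -> dict[str, int]:
    """Model the split guideline: no single article grows past the cap."""
    return {name: min(words, cap) for name, words in lengths.items()}


if __name__ == "__main__":
    # Approximate article lengths quoted in the post.
    observed = {"Xorn": 3_000, "Professor X": 11_000}
    print(rank_by_length(observed))

    # Hypothetical amounts of material for two very prominent characters.
    hypothetical = {"Hero A": 40_000, "Hero B": 25_000}
    print(rank_by_length(capped(hypothetical)))
```

The first ranking separates a major character from a minor one, as expected. In the second, both hypothetical heroes collapse to the same capped length, so the metric can no longer say which one is more prominent.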
While I was playing around with Wikipedia articles, though, I did think of another way of using them to gauge popularity, which I’ll introduce in Part II.
* Actually, if you’re Humble Master, this is probably exactly what you talk to your little brother about.
† If pressed to use a keyword search approach, I might compare the search terms “Wolverine comics,” “Cyclops comics,” “Batman comics,” etc. Adding the term “comics” would effectively clear up any ambiguity, but it would also eliminate results that talk about the characters in other media. (Part of the reason that some of these characters are so culturally prominent is that they’ve moved beyond the comics realm to film, television, fiction, etc.)