Thoughts from the Physics Chick: June 2011

Monday, June 27, 2011

Superhero Metrics, Part II: A New Approach

Permit me, then, to introduce to you the Wikipedia Inter-language Metric of Cultural Significance (WIMCS)*.

The premise behind the WIMCS is simple: If a person, place, thing, etc., is highly culturally significant, it will not only appear in the English-language Wikipedia (which is the largest Wikipedia, by far), it will also appear in the Wikipedias of many other languages.

So, in order to calculate the WIMCS for a given entity, go to its Wikipedia page, open the “Languages” menu on the left side of the page, and simply count the number of languages listed.
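For anyone who would rather not count by hand, the same tally can be sketched in Python. This is a minimal sketch, not a tested tool: it assumes the MediaWiki API's `prop=langlinks` query and its default JSON layout, and the helper names (`langlinks_url`, `wimcs_from_response`) and the sample response are made up for illustration.

```python
from urllib.parse import urlencode

# English Wikipedia's API endpoint.
API = "https://en.wikipedia.org/w/api.php"


def langlinks_url(title):
    """Build a query URL asking for a page's inter-language links."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "format": "json",
    }
    return API + "?" + urlencode(params)


def wimcs_from_response(response):
    """Count the inter-language links in one parsed JSON response."""
    pages = response.get("query", {}).get("pages", {})
    return sum(len(page.get("langlinks", [])) for page in pages.values())


# Made-up response in the assumed layout, not real Wikipedia data.
sample = {"query": {"pages": {"1": {"title": "Example", "langlinks": [
    {"lang": "de", "*": "Beispiel"},
    {"lang": "fr", "*": "Exemple"},
]}}}}
print(wimcs_from_response(sample))
```

The sketch only builds the URL and counts links in a response that has already been fetched and parsed. A page with more links than fit in one response would need the API's continuation mechanism, which is left out here.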

Here’s some WIMCS data for the comic book characters from the conversation I described in Part I:

61 – Batman
57 – Spider-Man, Superman
37 – Wolverine
36 – X-Men
31 – The Fantastic Four, The Hulk, Iron Man
28 – Captain America
25 – The Joker
24 – Lex Luthor
23 – Flash, Wonder Woman
22 – Green Lantern
19 – Aquaman
18 – Jean Grey, Robin, Thor
17 – Hellboy, Professor X
9 – Hawkman

This data confirms that Batman, Superman, and Spider-Man are the top three superheroes (as expected), and suggests that Wolverine comes in at number four (at least, of the superheroes I’ve thought to check), which seems plausible.
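To make the ranking explicit, here is a small Python sketch that sorts the counts listed above and lets tied entries share a rank. The function name `rank_with_ties` is my own invention, and giving the next entry after a tie its plain position in the list (so a two-way tie for second is followed by fourth) is just one reasonable convention.

```python
# Counts copied from the list above.
wimcs = {
    "Batman": 61, "Spider-Man": 57, "Superman": 57, "Wolverine": 37,
    "X-Men": 36, "The Fantastic Four": 31, "The Hulk": 31, "Iron Man": 31,
    "Captain America": 28, "The Joker": 25, "Lex Luthor": 24, "Flash": 23,
    "Wonder Woman": 23, "Green Lantern": 22, "Aquaman": 19, "Jean Grey": 18,
    "Robin": 18, "Thor": 18, "Hellboy": 17, "Professor X": 17, "Hawkman": 9,
}


def rank_with_ties(scores):
    """Return (rank, name, score) tuples; tied scores share a rank."""
    # Highest score first; names break ties alphabetically for stable output.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    ranked = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == score:
            rank = ranked[-1][0]  # same score as the previous entry
        else:
            rank = position
        ranked.append((rank, name, score))
    return ranked


for rank, name, score in rank_with_ties(wimcs)[:4]:
    print(rank, name, score)
```

With the numbers above, Spider-Man and Superman share second place, which is why Wolverine lands at number four rather than number three.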

In Part III, I'll examine some of the advantages and disadvantages of this approach.

______________________________
*I’m not married to this name or to the acronym representing it. Other, catchier suggestions are highly welcome.

Monday, June 20, 2011

Superhero Metrics, Part I: The Problem

A few weeks ago, I was talking to my brother about superheroes and other comic book characters, and specifically about how to rank them by popularity or cultural prominence. (Why? What do you talk to your little brother about?*)

We agreed that Batman, Superman, and Spider-Man were probably the top three, but we didn’t know who would come next.

Because I am an information scientist and this type of information problem is right up my alley, I kept thinking about this for a couple of days after the conversation.

The most straightforward approach would be to do a simple Google keyword search and count the number of hits. This works reasonably well if the words being searched don’t have prominent alternate meanings. (Most of the hits for “Batman,” for instance, are probably referring to the Dark Knight, and not the province in Turkey.)

However, this type of metric completely falls apart when you start trying to search for comics characters with names like “Wolverine” or “Cyclops.” (You could add limiting terms to the search, such as “Wolverine X-Men” or “Cyclops X-Men,” but then the results for those searches aren’t comparable to bare searches for “Batman” or “Superman.”†)

Having ruled out a keyword search as insufficiently precise, my thoughts turned to Wikipedia, one of my favorite resources.

The most obvious way to use Wikipedia to gauge cultural prominence would be to compare the lengths of the articles about different characters. For instance, the article about Professor X is about 11,000 words long, but the article about Xorn (a more recent addition to the X-Men franchise) is only about 3,000 words long.

I think this approach would work fairly well for differentiating between minor and major cultural icons. However, Wikipedia guidelines state that articles longer than around 10,000 words should be broken into multiple articles, which makes word count unsuitable for ranking the top cultural figures, because article length will eventually plateau.

While I was playing around with Wikipedia articles, though, I did think of another way of using them to gauge popularity, which I’ll introduce in Part II.

______________________________
* Actually, if you’re Humble Master, this is probably exactly what you talk to your little brother about.

† If pressed to use a keyword search approach, I might compare the search terms “Wolverine comics,” “Cyclops comics,” “Batman comics,” etc. Adding the term “comics” would effectively clear up any ambiguity, but it would also eliminate results that talk about the characters in other media. (Part of the reason that some of these characters are so culturally prominent is that they’ve moved beyond the comics realm to film, television, fiction, etc.)