Thoughts from the Physics Chick: Superhero Metrics, Part I: The Problem

Monday, June 20, 2011

Superhero Metrics, Part I: The Problem

A few weeks ago, I was talking to my brother about superheroes and other comic book characters, specifically, ranking them by popularity or cultural prominence. (Why? What do you talk to your little brother about?*)

We agreed that Batman, Superman, and Spider-Man were probably the top three, but we didn’t know who would come next.

Because I am an information scientist and this type of information problem is right up my alley, I kept thinking about this for a couple of days after the conversation.

The most straightforward approach would be to do a simple Google keyword search and count the number of hits. This works reasonably well if the words being searched don’t have prominent alternate meanings. (Most of the hits for “Batman,” for instance, are probably referring to the Dark Knight, and not the province in Turkey.)

However, this type of metric completely falls apart when you start trying to search for comics characters with names like “Wolverine” or “Cyclops.” (You could add limiting terms to the search, such as “Wolverine X-Men” or “Cyclops X-Men,” but then the results for those searches aren’t comparable to bare searches for “Batman” or “Superman.”†)

Having ruled out a keyword search as insufficiently precise, my thoughts turned to Wikipedia, one of my favorite resources.

The most obvious way to use Wikipedia to gauge cultural prominence would be to compare the lengths of the articles about different characters. For instance, the article about Professor X is about 11,000 words long, but the article about Xorn (a more recent addition to the X-Men franchise) is only about 3,000 words long.

I think this approach would work fairly well for differentiating between minor and major cultural icons. However, Wikipedia guidelines state that articles longer than around 10,000 words should be broken into multiple articles, which makes word count unsuitable for ranking the top cultural figures, because article length will eventually plateau.
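For the curious, here is a minimal sketch of the word-count comparison in Python. The two word counts are the approximate figures quoted above, and the 10,000-word threshold is the rough guideline figure rather than an exact rule. The function and variable names are made up for illustration.

```python
# Approximate article lengths quoted in the post (assumed, not re-measured).
WORD_COUNTS = {"Professor X": 11_000, "Xorn": 3_000}

# Rough length at which an article would likely be split into sub-articles.
# This is the ballpark figure from the post, not an exact Wikipedia rule.
SPLIT_THRESHOLD = 10_000


def rank_by_length(word_counts):
    """Return character names sorted from longest article to shortest."""
    return sorted(word_counts, key=word_counts.get, reverse=True)


def is_saturated(words, threshold=SPLIT_THRESHOLD):
    """True when an article is at or past the length where it would plateau."""
    return words >= threshold


ranking = rank_by_length(WORD_COUNTS)
saturated = [name for name in ranking if is_saturated(WORD_COUNTS[name])]

print("Ranking by article length:", ranking)
print("At or past the plateau:", saturated)
```

Any character that lands in the second list can't be meaningfully ranked against the others in that list by length alone, which is exactly the problem with using word count for the top figures.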

While I was playing around with Wikipedia articles, though, I did think of another way of using them to gauge popularity, which I’ll introduce in Part II.

______________________________
* Actually, if you’re Humble Master, this is probably exactly what you talk to your little brother about.

† If pressed to use a keyword search approach, I might compare the search terms “Wolverine comics,” “Cyclops comics,” “Batman comics,” etc. Adding the term “comics” would effectively clear up any ambiguity, but it would also eliminate results that talk about the characters in other media. (Part of the reason that some of these characters are so culturally prominent is that they’ve moved beyond the comics realm to film, television, fiction, etc.)

11 Comments:

At June 20, 2011 3:03 PM, Blogger Optimistic. said...

Is the solution to find the number of articles referring to those superheroes, making allowances for length and relevance?

 
At June 20, 2011 9:36 PM, Blogger Katya said...

That's an interesting approach, and it would probably work. I can't think of a good way to get that information easily, though.

You can use the "What links here" tool, but that'll pull up links from redirects, templates, and user pages, not just articles. The redirects and user page links might arguably be good metrics of importance, but the template links would have a tendency to flatten out the data. (If you care, I can explain why.)

You could also add up the word count of the sub-articles, on the basis that a major cultural icon will be split into more sub-articles once the main article gets too long, but, again, that sounds like a lot of work to calculate. :)

 
At June 20, 2011 9:38 PM, Blogger Katya said...

Of course, the beauty of waiting to post my solution to the problem is that if someone comes up with something better in the comments, I can just pretend that's what I had in mind all along!

 
At June 20, 2011 10:34 PM, Blogger Mr. Fob said...

Clearly your solution is to ask Mr. Fob. In order of cultural significance, based on what percentage of Average Joes and Janes of various ages would recognize the character enough to form some kind of mental image upon hearing the name and/or name the character upon seeing a picture:

1. Superman
2. Batman
3. Spider-Man
4. Wonder Woman
5. Hulk
6. Wolverine (just about anyone born after 1975 knows who he is, not so much older people)
7. Iron Man (thanks to recent movies)
8. Green Lantern (ditto)
9. Thor (ditto)
10. Aquaman (would be up at about 6 or 7, circa 2007, but movies carry a lot of cultural weight)
11. Hawkman
12. Flash
13. Captain America (will be higher in a month or so)
14. All the X-Men who aren't Wolverine
15. Dr. Manhattan (more commonly known as Glowing Naked Blue Man)

(Not counting spin-off characters like Robin, Supergirl, etc.)

There. I saved you a lot of work.

 
At June 20, 2011 10:35 PM, Blogger Mr. Fob said...

Really, the way to do it would be to conduct a massive survey. Let me know how it goes.

 
At June 22, 2011 12:32 PM, Blogger Katya said...

I like the idea of your massive survey. I also like your list, because it'll be useful to compare to my results, when I produce them. (One of the downsides of my methodology is that I don't have a good way of verifying its accuracy other than gut instinct.)

Out of curiosity, are you assuming your survey respondents are Americans? (That might have an impact on whether or not it squares with my results.)

 
At June 22, 2011 12:56 PM, Blogger Mr. Fob said...

Good point. I am assuming an American point of view, and I agree that the list would be slightly different from a global perspective. (For example, Captain Britain would clearly be on the list in Captain America's place.) In addition to the results of your research, I'd be curious to hear what list your gut instinct gives you.

One other note: I'm aware my list is purely DC and Marvel characters, but (a) I couldn't think of others that belong on the list who clearly fit the description "superhero," and (b) technically "super heroes" is a registered trademark shared by Marvel and DC, so other characters don't officially count as superheroes.

 
At June 23, 2011 10:22 AM, Blogger Katya said...

In addition to the results of your research, I'd be curious to hear what list your gut instinct gives you.

That's hard to say, since I'm not very connected to the culture. (I'm familiar with comics characters mostly through TV and film adaptations, and even then I can be very behind the curve. As in, I saw the first X-Men film just a couple of months ago.)

I could probably group the comics figures into high, medium, and low profile groups, but not rank them within those groups with any precision.

Going off of your list, though, I'd say the thing that surprises me most is how highly you've ranked Wonder Woman.


I couldn't think of others that belong on the list who clearly fit the description "superhero" . . .

What about Hellboy?


Also, I'm planning on posting Part II in this series next Monday. Just in case you're going crazy with anticipation.

 
At June 23, 2011 6:08 PM, Blogger Mr. Fob said...

I think that not being connected to the culture makes you a better candidate to make a guess. When I think of superheroes, a few hundred completely obscure characters come to mind, none of which would be recognized by anyone who doesn't read superhero comics regularly.

It's always been my impression that Wonder Woman is one of the most recognized superheroes. She appeared in Superfriends and in the Lynda Carter TV show, among other things. She's had her own Underoos, which I'd say is a prerequisite to be on this list. And when my sister was a missionary in Peru, people called her Mujer Maravilla simply because she was tall and dark-haired. And then there are the millions upon millions of people who know about Wonder Woman because of my essay about her in last fall's Sunstone.

As for Hellboy, this is where I'm really not a good judge. Do people know who he is? I do, but that doesn't mean much.

 
At June 23, 2011 6:16 PM, Blogger Mr. Fob said...

Back to your initial question: It seems that the number of people who have contributed to a Wikipedia article is a better indicator of popularity and cultural prominence than the length of the article. Is that where you're headed?

 
At June 24, 2011 9:22 AM, Blogger Katya said...

I think that not being connected to the culture makes you a better candidate to make a guess.

That's a good point, although it's complicated by the fact that I tend to have a superficial knowledge of most things, and an unexpectedly in-depth knowledge of a few things. So, I don't know a lot about popular music, but I know a lot of 40s standards because they showed up in Looney Tunes cartoons, which were part of the Saturday morning cartoon lineup when I was growing up.

So, if I don't know about something, we can probably say that it's below a certain threshold of common knowledge, but if I do know about it, it could just be a fluke (which may be the case with Hellboy).

It seems that the number of people who have contributed to a Wikipedia article is a better indicator of popularity and cultural prominence than the length of the article. Is that where you're headed?

That's not exactly it, but it's along the right lines. (And, again, that methodology could serve as a good way to double check my work, so to speak. Also, it could actually help me out with another problem I'm working on, where my methodology is giving me a curve that's too flat to be useful.)

 
