Which Fictional Characters Are Mentioned the Most in Scientific Literature?

Fri Dec 03 2021

What do Harry Potter, Mickey Mouse, Cinderella, and Peppa Pig all have in common? They are all fictional characters that have been mentioned in scientific articles dozens of times. Yes, that's right; this post is about which fictional characters have been mentioned in the scientific literature and how many times.

[Figure: Mentions of fictional characters in academic literature]

Using the scite citation statement search, which covers 950M citation statements extracted from 27M full-text articles, we searched for popular fictional characters mentioned in the literature. The resulting list is neither exhaustive nor systematically compiled. Nonetheless, we think it's fun and shows just how diverse the scientific literature is, and how pervasive popular culture can be, even in areas where you might not expect it.

We crowdsourced the characters from friends, family, and members of the scite team, then searched for each one on scite. The data and links to searches can be found here.

The R code to generate the above graph is available below. (Adapted from the code present in: https://www.gwern.net/Variables)

frequency <- rev(
  c(
    1,1,4,4,12,12,13,32,49,59,62,67,100,110,175,180,204,210,230,299,302,321,321,386,559,586,793,866,888,906,1032,1040,1161,1442,1632,2041,2366,3090,3491,3547,3576,3771,3880,4396,14630,55852,82243
  )
)

# Share of all mentions accounted for by each character
round(frequency / sum(frequency), digits = 3)

library(ggplot2)
library(ggrepel)
# Character names, listed in increasing order of mentions and then
# reversed so they line up with `frequency` (most mentioned first)
characters <- trimws(rev(
  c("Snuffaluffagus ","Squidward","Scrooge McDuck","Rick and Morty","Rudolph the Red Nosed Reindeer","Bilbo Baggins","Optimus Prime","Minnie Mouse","Peppa Pig","Homer Simpson","Ninja Turtles","Pikachu","Easter Bunny","Donald Duck","Spongebob","Winnie the pooh","Ronald McDonald","Dora the explorer","X-men","Little Red Riding Hood","Spiderman","Voldemort","King Kong","Naruto","Peter Pan","James Bond","Super Mario","Godzilla","Mickey Mouse","Gonzo","Alice in Wonderland","Sherlock Holmes","Santa Claus","Gandalf","Snoopy","Mulan","Harry Potter","Popeye","Superman","Cinderella","Goldilocks","Barbie","Simba","Frankenstein","Elmo","Bert","Sonic Hedgehog"
  )
))
mentions <- data.frame(rank = seq_along(frequency), frequency = frequency, character = characters)

# Draw only the repelled text labels (no points), placed at each
# character's rank and mention count. ggplot() replaces qplot(), which is
# deprecated in recent ggplot2 releases, and rank is a continuous x value.
g <- ggplot(mentions, aes(x = rank, y = frequency, label = character)) +
  geom_text_repel(size = 3) +
  scale_x_continuous(expand = c(0.10, 0)) +
  scale_y_continuous(trans = 'log2') +
  xlab("Character names by rank-ordering (decreasing)") + ylab("Mentions of character (log scale)") +
  ggtitle("Mentions of fictional characters in academic literature")
g

Diving a bit deeper into how these names are used, we can see that sonic hedgehog has been discussed many times because a protein has been named after the character.

[Figure: sonic hedgehog]

Bert and Elmo, more commonly known to scientists as BERT and ELMo, are deep learning models that many scientists use, including us.

[Figure: BERT]
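
Since the three most-mentioned names are really about a protein and two models rather than the characters themselves, it can be useful to see how much of the total they account for. The R snippet below is a rough sketch of that check, reusing the counts and labels from the plotting code above. The variable names (`mentions`, `ambiguous`, `unambiguous`) are ours for illustration, and treating exactly these three names as ambiguous is an assumption based on the examples discussed here; other names in the list may be ambiguous too.

```r
# Counts and labels copied from the plotting code above (increasing order)
frequency <- c(
  1,1,4,4,12,12,13,32,49,59,62,67,100,110,175,180,204,210,230,299,302,321,321,386,559,586,793,866,888,906,1032,1040,1161,1442,1632,2041,2366,3090,3491,3547,3576,3771,3880,4396,14630,55852,82243
)
characters <- c("Snuffaluffagus ","Squidward","Scrooge McDuck","Rick and Morty","Rudolph the Red Nosed Reindeer","Bilbo Baggins","Optimus Prime","Minnie Mouse","Peppa Pig","Homer Simpson","Ninja Turtles","Pikachu","Easter Bunny","Donald Duck","Spongebob","Winnie the pooh","Ronald McDonald","Dora the explorer","X-men","Little Red Riding Hood","Spiderman","Voldemort","King Kong","Naruto","Peter Pan","James Bond","Super Mario","Godzilla","Mickey Mouse","Gonzo","Alice in Wonderland","Sherlock Holmes","Santa Claus","Gandalf","Snoopy","Mulan","Harry Potter","Popeye","Superman","Cinderella","Goldilocks","Barbie","Simba","Frankenstein","Elmo","Bert","Sonic Hedgehog")

# Named vector of mention counts; trimws() drops stray whitespace in labels
mentions <- setNames(frequency, trimws(characters))

# Assumption: these are the names whose hits mostly refer to something
# other than the fictional character (a protein and two models)
ambiguous <- c("Sonic Hedgehog", "Bert", "Elmo")

# Share of all mentions that comes from the ambiguous names
ambiguous_share <- sum(mentions[ambiguous]) / sum(mentions)
round(ambiguous_share, 3)

# Top five characters once the ambiguous names are set aside
unambiguous <- mentions[!(names(mentions) %in% ambiguous)]
head(sort(unambiguous, decreasing = TRUE), 5)
```

With these counts, the three names make up a little over three quarters of all mentions, and Frankenstein heads the remaining list.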

The oldest mention of a fictional character that we found dates to 1967, when three authors described a psychology experiment that used a Winnie the Pooh puzzle.

[Figure: oldest mention]

Turning to characters whose names are less ambiguous, we can also see who has published the most on them and where they appear. For example, Harry Potter has been mentioned most by Professor Arthur M. Jacobs (you can see how here), with some articles even mentioning Harry Potter in the title.

[Figure: Harry Potter]

The journals that publish the most articles on Harry Potter are as follows (maybe we need a Goblet of Fire journal ranking?):

[Figure: Harry Potter journals]

This post highlights how you can use the scite citation statement search not only for fun searches on fictional characters but also to search any topic and see what experts say about it (whether Peppa Pig or prostate cancer). So, try it out and share your fun searches by tagging us on Twitter (https://twitter.com/scite):

https://scite.ai/search/citations

This post was inspired by Gwern Branwen's post looking at the incidence of Greek characters in arXiv: https://www.gwern.net/Variables