The untold story of palmerpenguins 🐧

Dr. Kristen Gorman, University of Alaska Fairbanks

Dr. Allison Horst, UC Santa Barbara

Dr. Alison Hill, Voltron Data

What is palmerpenguins?

  • An R package featuring the penguins dataset

  • 344 penguins

  • 3 penguin species (Adélie, chinstrap, and gentoo)

  • Inf fun

palmerpenguins popularity

Line graph showing palmerpenguins package downloads from CRAN, which increased quickly after the package was published in 2020 and have since leveled off near 1000 downloads per day.

  • > 476,400 CRAN downloads since 2020-07-23

  • Used globally in courses, workshops, blog posts, and other learning materials

  • penguins now in Python, Julia, and TensorFlow

palmerpenguins hex sticker with all three penguin species by Allison Horst

How the penguins came to be

The research

Dr. Kristen Gorman in the field, surrounded by penguins, at islands near Palmer Archipelago, Antarctica

  • An integrative study of the breeding ecology and population structure of Pygoscelis penguins along the western Antarctic Peninsula as part of the Palmer LTER Program (US NSF)

  • The data were originally published in PLoS ONE in 2014¹

  • All data were made available through the Environmental Data Initiative

Little did the penguin research team know…

Wanted: an alternative to iris

Measurements of an iris flower

  • Collected by botanist Edgar Anderson in 1935

  • Used everywhere in data science teaching & resources

  • 150 flowers from 3 species of iris, with 4 size measurements each

  • No missing values

  • Lacks metadata

  • Variables like Sepal.Width

  • Published in The Annals of Eugenics (RA Fisher, 1936)

If something is a problem, offer a solution

  1. Keep using iris and use it as an opportunity to learn/teach about its problematic aspects.

  2. Find a better dataset to replace iris.

The eternal search for (better) useful datasets for teaching


  • Allison stumbles upon Gorman et al. and shares it with Alison

  • Alison writes a blog post using the penguins data

21 of the described species with their respective IUCN status. Aquatic and flightless birds with remarkable swimming adaptations, penguins inhabit the Southern Hemisphere (except the Galapagos Penguin). They range in size from the 45 kg Emperor Penguin to the 1.5 kg Little Penguin. Penguins are threatened by habitat loss and climate change, and the African, Galapagos, and Yellow-eyed penguins are among the most endangered species.

Finding the one

  • Meanwhile, Allison keeps playing with the penguins

  • And plotting with the penguins

  • And looking at more penguin pictures

  • The penguins data aligns eerily well with the iris data