Tidy Tuesdays

Week 1: Palmer Penguins

data visualization
ggplot2
penguins
Exploring penguin size measurements across species
Published: January 15, 2025

Introduction

This week’s Tidy Tuesday featured the Palmer Penguins dataset, which includes size measurements for three penguin species observed on three islands in the Palmer Archipelago, Antarctica.
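The code in this post works on a data frame called `df` whose column names match the table headers (for example `Bill Length (mm)`), but the step that creates it is not shown. Here is a minimal sketch of one way to build it, assuming the data comes from the `palmerpenguins` package and the columns were renamed by hand. The original setup may differ.

```r
library(dplyr)
library(palmerpenguins)

# Assumption: df is the packaged `penguins` data with display-style column names.
# rename() takes new_name = old_name pairs; backticks allow spaces and parentheses.
df <- penguins %>%
  rename(
    Species               = species,
    Island                = island,
    `Bill Length (mm)`    = bill_length_mm,
    `Bill Depth (mm)`     = bill_depth_mm,
    `Flipper Length (mm)` = flipper_length_mm,
    `Body Mass (g)`       = body_mass_g,
    Sex                   = sex,
    Year                  = year
  )

# Quick check of the resulting column names
names(df)
```

Because the new names contain spaces, every later reference to them (as in the `aes()` calls in this post) has to be wrapped in backticks.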

Initial Data Exploration

Let’s look at the first few rows of the dataset:

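library(dplyr)       # provides %>%
library(knitr)       # provides kable()
library(kableExtra)  # provides kable_styling()

# First six rows of df, rendered as a styled HTML table
head(df) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)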

Species  Island     Bill Length (mm)  Bill Depth (mm)  Flipper Length (mm)  Body Mass (g)  Sex     Year
Adelie   Torgersen  39.1              18.7             181                  3750           male    2007
Adelie   Torgersen  39.5              17.4             186                  3800           female  2007
Adelie   Torgersen  40.3              18.0             195                  3250           female  2007
Adelie   Torgersen  NA                NA               NA                   NA             NA      2007
Adelie   Torgersen  36.7              19.3             193                  3450           female  2007
Adelie   Torgersen  39.3              20.6             190                  3650           male    2007
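
The fourth row above has no measurements at all, so it is worth checking for missing values before plotting. This sketch uses base R only. To stay self-contained it re-enters the six rows shown above into a hypothetical `df_head` data frame, and the same two calls would apply to the full `df`.

```r
# The six rows shown above, typed in by hand so the snippet runs on its own.
# check.names = FALSE keeps the column names with spaces and parentheses intact.
df_head <- data.frame(
  Species               = rep("Adelie", 6),
  Island                = rep("Torgersen", 6),
  `Bill Length (mm)`    = c(39.1, 39.5, 40.3, NA, 36.7, 39.3),
  `Bill Depth (mm)`     = c(18.7, 17.4, 18.0, NA, 19.3, 20.6),
  `Flipper Length (mm)` = c(181, 186, 195, NA, 193, 190),
  `Body Mass (g)`       = c(3750, 3800, 3250, NA, 3450, 3650),
  Sex                   = c("male", "female", "female", NA, "female", "male"),
  Year                  = rep(2007, 6),
  check.names = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(df_head))
print(na_counts)

# Keep only the rows where both bill measurements are present
bill_cols <- c("Bill Length (mm)", "Bill Depth (mm)")
complete_bills <- df_head[complete.cases(df_head[, bill_cols]), ]
nrow(complete_bills)
```

ggplot2 drops rows like this on its own (with a warning), so filtering first is optional. It mainly makes the number of plotted points explicit.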

Key Variables

The dataset includes several key measurements:

  • Bill length (mm)
  • Bill depth (mm)
  • Flipper length (mm)
  • Body mass (g)

Visualizing Penguin Bill Dimensions

Let’s create a scatter plot showing the relationship between bill length and bill depth, colored by species:

library(ggplot2)

# Scatter plot of bill depth against bill length, one color per species
ggplot(df, aes(x = `Bill Length (mm)`, y = `Bill Depth (mm)`, color = Species)) +
  geom_point(alpha = 0.7) +
  labs(title = "Penguin Bill Dimensions by Species",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)") +
  theme_minimal() +
  theme(legend.position = "bottom")

Enhanced Visualization

Here’s an enhanced version with trend lines to better visualize the relationship:

ggplot(df, aes(x = `Bill Length (mm)`, y = `Bill Depth (mm)`, color = Species)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Penguin Bill Dimensions by Species (with trend lines)",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)") +
  theme_minimal() +
  theme(legend.position = "bottom")
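
Because `color = Species` is set in `aes()`, `geom_smooth(method = "lm")` fits a separate straight line for each species. To read those slopes as numbers rather than off the plot, the same per-species fits can be reproduced with `lm()`. This is a sketch that assumes the data comes from the `palmerpenguins` package and uses its original column names (`bill_length_mm`, `bill_depth_mm`, `species`) instead of the renamed columns in `df`.

```r
library(palmerpenguins)

# Drop rows with missing bill measurements (lm() would also drop them by default)
bill_cols <- c("bill_length_mm", "bill_depth_mm")
bills <- penguins[complete.cases(penguins[, bill_cols]), ]

# Fit bill depth ~ bill length separately for each species,
# mirroring the grouped fits that geom_smooth() draws
fits <- lapply(split(bills, bills$species), function(d) {
  lm(bill_depth_mm ~ bill_length_mm, data = d)
})

# Slope of each trend line: change in bill depth (mm) per 1 mm of bill length
slopes <- sapply(fits, function(fit) coef(fit)[["bill_length_mm"]])
print(round(slopes, 3))
```

Each slope corresponds to one of the colored lines in the plot above.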

Exploring Species Distribution by Island

Let’s look at how the penguin species are distributed across the islands:

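library(tidyr)  # provides pivot_wider()

# Count penguins per species and island, then turn islands into columns.
# pivot_wider() supersedes spread(); values_fill = 0 fills combinations that never occur.
species_by_island <- df %>%
  count(Species, Island) %>%
  pivot_wider(names_from = Island, values_from = n, values_fill = 0)

# Display as a nicely formatted table
species_by_island %>%
  kable(caption = "Penguin Species Distribution by Island") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)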

Penguin Species Distribution by Island

Species    Biscoe  Dream  Torgersen
Adelie     44      56     52
Chinstrap  0       68     0
Gentoo     124     0      0
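
The zeros in this table come from the reshaping step, not from the counting step: `count()` only returns rows for species and island pairs that actually occur, and the fill value supplies 0 for the rest. The base R sketch below shows the same idea with `xtabs()`. The hypothetical `counts` data frame holds just the five non-zero counts copied from the table above.

```r
# Non-zero counts copied from the table above, in long form
# (one row per species/island pair that was actually observed)
counts <- data.frame(
  Species = c("Adelie", "Adelie", "Adelie", "Chinstrap", "Gentoo"),
  Island  = c("Biscoe", "Dream", "Torgersen", "Dream", "Biscoe"),
  n       = c(44, 56, 52, 68, 124)
)

# xtabs() cross-tabulates n by Species and Island and
# puts 0 in every combination that has no row in `counts`
wide <- xtabs(n ~ Species + Island, data = counts)
print(wide)

# Add row and column totals
print(addmargins(wide))
```

The totals give a quick sanity check that no penguins were lost in the reshape.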

Conclusion

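This exploration shows how simple visualizations can reveal clear patterns in biological data. In the scatter plots, the three species form distinct clusters by bill length and bill depth, which suggests that these two measurements can help tell the species apart. The island table adds a second pattern: Adelie penguins were recorded on all three islands, while Chinstrap appear only on Dream and Gentoo only on Biscoe.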

Resources

  • Palmer Penguins R Package
  • Original Data Source
 