Filter & Storms dataset

Create subsets of data

Published

August 27, 2025

Warm-up

  1. What’s wrong with the following code?
library(tidyverse)
library(gapminder)
ggplot(gapminder,aes(x="year",y="pop")) + geom_point()

  2. Create an R script that does the following:
       1. Load the gapminder data.
       2. Create a variable where each value is a country in the gapminder data set, with no repeats.
       3. Use %in% to see if the variable contains various countries of your choice.
       4. Note how to execute the script from the keyboard, which is just one way to execute a script.
source("my_script.R", echo = TRUE)
  3. Repeat the exercise above, but use a Quarto document instead.
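
The two building blocks of the script exercise, removing repeats and testing membership with %in%, can be sketched on a small made-up vector. The fruit names and variable names here are purely illustrative and are not part of the gapminder data; adapting the idea to the country column is left to the exercise.

```r
# A made-up character vector with repeated values (illustrative only)
fruits <- c("apple", "pear", "apple", "kiwi", "pear")

# unique() keeps the first occurrence of each value and drops repeats
distinct_fruits <- unique(fruits)

# %in% returns one TRUE/FALSE per element on its left-hand side
has_kiwi <- "kiwi" %in% distinct_fruits
has_several <- c("kiwi", "mango") %in% distinct_fruits

print(distinct_fruits)
print(has_kiwi)
print(has_several)
```

Because %in% is vectorized over its left-hand side, several candidate values can be checked in one call.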

The filter command

Below we make a subset of the data that keeps only the rows whose country is China.

C <- filter(gapminder, 
       country == "China")

Run ?filter to learn other ways to write the second argument, such as combining conditions with &, |, and more.
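
As a small sketch of what ?filter describes, the calls below combine comparisons with & and |. The object names and the 1990 cutoff are illustrative choices only, not part of the lesson.

```r
library(dplyr)
library(gapminder)

# & keeps rows where BOTH conditions are TRUE
china_recent <- filter(gapminder, country == "China" & year >= 1990)

# | keeps rows where AT LEAST ONE condition is TRUE
china_or_india <- filter(gapminder, country == "China" | country == "India")

# Conditions separated by commas are combined with &,
# so this selects the same rows as china_recent
same_as_and <- filter(gapminder, country == "China", year >= 1990)

nrow(china_recent)
nrow(china_or_india)
```

Writing conditions as separate arguments is usually easier to read than one long expression joined with &.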

Use a filter to reduce the size of the data, and then label the points on a scatterplot using geom_text_repel().

library(ggrepel)
hi_pop_countries <- filter(gapminder, 
                           pop > 500000000)
ggplot(hi_pop_countries, 
       aes(x = year, y = gdpPercap)) + 
       geom_point() + 
       geom_text_repel(aes(label = country))

Here’s another filter, along with a preview of boxplots and the reorder() function.

hi_pop_countries <- filter(gapminder, 
                           pop > 50000000)

ggplot(hi_pop_countries, 
       aes(x = lifeExp, y = country)) + geom_boxplot()

Isn’t this better? Try the same kind of plot, using reorder() with a different variable.

ggplot(hi_pop_countries, 
       aes(x = lifeExp, y = reorder(country,lifeExp))) + 
    geom_boxplot()
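
To see what reorder() does to a factor, here is a minimal sketch on made-up data (the vectors grp and val are hypothetical). By default reorder() sorts the factor levels by the mean of the second argument within each level; passing FUN = median is one way to sort by a different summary.

```r
# Made-up grouping factor and values (illustrative only)
grp <- factor(c("a", "a", "b", "b", "c", "c"))
val <- c(5, 7, 1, 3, 9, 11)

# Default: levels ordered by the mean of val within each level
by_mean <- reorder(grp, val)

# Same idea, but ordered by the median instead
by_median <- reorder(grp, val, FUN = median)

levels(grp)      # original alphabetical order
levels(by_mean)  # order after sorting by group mean
```

ggplot2 draws a factor's categories in level order, which is why reordering the levels changes the order of the boxes.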

And here’s a histogram:

TCU <- filter(gapminder, country %in% c("Taiwan", "China", "United States"))

TCU |> ggplot(aes(x = lifeExp, fill = country)) + geom_histogram(bins = 10)

Explore the storms dataset

The data() command lists the datasets available in the packages you currently have loaded, including those that come with R and the Tidyverse. Note that the storms data is in dplyr.
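
As a small sketch, data() can also be restricted to one package through its package argument. The variable names dplyr_data and dataset_names are illustrative.

```r
library(dplyr)

# Ask only for the datasets that ship with dplyr;
# assigning the result avoids opening the listing in a viewer
dplyr_data <- data(package = "dplyr")

# The listing is stored as a matrix; the "Item" column holds the dataset names
dataset_names <- dplyr_data$results[, "Item"]
print(dataset_names)

# Quick look at the columns and types in storms
glimpse(storms)
```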

Use filter and various geoms, such as geom_point(), geom_histogram(), and geom_boxplot(), to compare storms across time.
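
One possible starting point, offered only as a sketch: filter storms down to a subset and then hand it to a geom. The status value and year cutoff below are arbitrary illustrative choices, and recent_hurricanes is a hypothetical name.

```r
library(dplyr)
library(ggplot2)

# Keep only observations recorded with hurricane status from 2000 onward
recent_hurricanes <- filter(storms, status == "hurricane", year >= 2000)

# One box of wind speeds per year; factor() makes year a discrete axis
p <- ggplot(recent_hurricanes, aes(x = factor(year), y = wind)) +
  geom_boxplot()
p
```

Changing the filter conditions or swapping in a different geom gives other views of the same data.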

Assignment 3

Complete these exercises. Append your answers to the Quarto document for Assignment 2, and submit it as an .html file. Copy the questions into your .qmd file and insert your responses after each one.

  1. Text: https://r4ds.hadley.nz/data-visualize.html#exercises

  2. Text: https://r4ds.hadley.nz/data-visualize#exercises-1

  3. Create three plots using filter and various geoms, such as geom_point(), geom_histogram(), and geom_boxplot(), to compare storms across time.