Row Functions
Permute the ordering of the rows. (Note the r in arrange, r for rows.) If you provide more than one column name, each additional column will be used to break ties in the values of preceding columns.
library(tidyverse)
library(nycflights13)
# Sort by month, breaking ties with departure time
arrange(flights, month, dep_time)
# A tibble: 336,776 × 19
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
<int> <int> <int> <int> <int> <dbl> <int> <int>
1 2013 1 13 1 2249 72 108 2357
2 2013 1 31 1 2100 181 124 2225
3 2013 1 9 2 2359 3 432 444
4 2013 1 13 2 2359 3 502 444
5 2013 1 16 2 2125 157 119 2250
6 2013 1 10 3 2359 4 426 437
7 2013 1 13 3 2030 213 340 2350
8 2013 1 16 3 1946 257 212 2154
9 2013 1 30 3 2159 124 100 2306
10 2013 1 31 4 2359 5 455 444
# ℹ 336,766 more rows
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
# tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
# hour <dbl>, minute <dbl>, time_hour <dttm>
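The tie-breaking rule can be checked on a table small enough to read by eye. This is a minimal sketch: the `scores` tibble and its columns are made up for illustration, and `desc()` is the dplyr helper for reversing the sort direction of a single column.

```r
library(dplyr)

# Hypothetical toy data, not part of nycflights13
scores <- tibble(
  team  = c("b", "a", "b", "a"),
  score = c(3, 2, 1, 5),
  id    = c(1, 2, 3, 4)
)

# Sort by team; rows with the same team are then ordered by score
by_team <- arrange(scores, team, score)

# desc() reverses the direction for one column only
by_team_desc <- arrange(scores, team, desc(score))

by_team$id       # 2 4 3 1
by_team_desc$id  # 4 2 1 3
```

Wrapping a column in `desc()` leaves the ordering of the other columns untouched, so directions can be mixed freely.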
Use distinct on several column names to find unique combinations. The .keep_all = TRUE argument retains all other columns, keeping the first row found for each combination.
library(nycflights13)
flights |> distinct(origin, dest)
# A tibble: 224 × 2
origin dest
<chr> <chr>
1 EWR IAH
2 LGA IAH
3 JFK MIA
4 JFK BQN
5 LGA ATL
6 EWR ORD
7 EWR FLL
8 LGA IAD
9 JFK MCO
10 LGA ORD
# ℹ 214 more rows
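To see what .keep_all = TRUE changes, it may help to compare both calls on a tiny table. The `trips` tibble below is invented for illustration; only `distinct()` and its `.keep_all` argument come from dplyr.

```r
library(dplyr)

# Hypothetical toy data with one repeated origin/dest pair
trips <- tibble(
  origin = c("EWR", "EWR", "JFK", "EWR"),
  dest   = c("IAH", "IAH", "MIA", "ORD"),
  delay  = c(5, 12, -3, 40)
)

# Only the columns named in the call are returned
pairs_only <- distinct(trips, origin, dest)

# All columns are returned; the first row of each combination is kept
pairs_full <- distinct(trips, origin, dest, .keep_all = TRUE)

names(pairs_only)  # "origin" "dest"
pairs_full$delay   # 5 -3 40
```

The second EWR to IAH row (delay 12) is dropped in both results, because only the first occurrence of each combination survives.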
Column Functions
Use mutate to add new variables (columns), usually via a formula involving existing ones.
helper arguments
.before = 1: place the new columns first
.after = some_var_name: place the new columns after some_var_name
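A short sketch of these placement arguments, assuming the function being described is dplyr's `mutate()`. The `legs` tibble and the `speed` formula are hypothetical; the units (miles and minutes) are an assumption made only so the arithmetic is easy to follow.

```r
library(dplyr)

# Hypothetical toy data: distance in miles, air_time in minutes
legs <- tibble(distance = c(120, 250), air_time = c(30, 50))

# By default the new column is appended on the right
at_end <- mutate(legs, speed = distance / air_time * 60)

# .before = 1 puts it in the first position
at_front <- mutate(legs, speed = distance / air_time * 60, .before = 1)

# .after = distance puts it immediately after that column
after_d <- mutate(legs, speed = distance / air_time * 60, .after = distance)

names(at_end)    # "distance" "air_time" "speed"
names(at_front)  # "speed" "distance" "air_time"
names(after_d)   # "distance" "speed" "air_time"
```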
Use select to choose which columns you wish to view. Useful if you have too many columns.
tips & helper functions
use : to select a range
use ! to exclude
use where with is.factor(), is.numeric(), or is.character()
starts_with("abc"): matches names that begin with “abc”.
ends_with("xyz"): matches names that end with “xyz”.
contains("ijk"): matches names that contain “ijk”.
num_range("x", 1:3): matches x1, x2 and x3.
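The helpers above can be tried side by side on one small table. This sketch assumes they are used inside dplyr's `select()`; the tibble `df` and its column names are invented so that each helper matches something.

```r
library(dplyr)

# Hypothetical toy data with names chosen to exercise each helper
df <- tibble(
  id        = 1:2,
  x1        = c(0.1, 0.2),
  x2        = c(1.5, 2.5),
  x3        = c(9, 8),
  abc_name  = c("u", "v"),
  total_xyz = c(10L, 20L)
)

sel_range   <- select(df, x1:x3)                 # x1, x2, x3
sel_exclude <- select(df, !id)                   # everything except id
sel_numeric <- select(df, where(is.numeric))     # id, x1, x2, x3, total_xyz
sel_starts  <- select(df, starts_with("abc"))    # abc_name
sel_ends    <- select(df, ends_with("xyz"))      # total_xyz
sel_has     <- select(df, contains("_"))         # abc_name, total_xyz
sel_num     <- select(df, num_range("x", 1:3))   # x1, x2, x3
```

Note that `where()` takes the predicate itself (`is.numeric`, without parentheses), since it calls the function once per column.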
Use rename to explicitly rename variables. Do so in bulk with janitor::clean_names.
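A minimal sketch of both approaches, assuming the explicit route is dplyr's `rename()`. The `raw` tibble and its awkward column names are hypothetical. Because janitor is a separate package that may not be installed, the bulk call is guarded.

```r
library(dplyr)

# Hypothetical toy data with names that are awkward to type
raw <- tibble(`Dep Time` = c(517, 533), `Arr Delay` = c(11, 20))

# rename() takes new_name = old_name pairs; other columns are untouched
renamed <- rename(raw, dep_time = `Dep Time`)

# Bulk cleanup of every name, only if janitor is available
if (requireNamespace("janitor", quietly = TRUE)) {
  cleaned <- janitor::clean_names(raw)
}

names(renamed)  # "dep_time" "Arr Delay"
```

With its default settings, `clean_names()` converts names to snake case, which removes the need for backticks afterwards.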
Permute the ordering of columns. (Notice the c in relocate, c for columns.)
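A brief sketch of how `relocate()` moves columns, using a made-up three-column tibble. The `.before` and `.after` arguments are part of dplyr; the column names are placeholders.

```r
library(dplyr)

# Hypothetical toy data
df <- tibble(x = 1, y = "a", z = 2)

# With no position given, the named columns move to the front
to_front <- relocate(df, z)

# .after places the moved columns right after another column
x_last <- relocate(df, x, .after = z)

# Selection helpers work here too: move all character columns before x
chr_first <- relocate(df, where(is.character), .before = x)

names(to_front)   # "z" "x" "y"
names(x_last)     # "y" "z" "x"
names(chr_first)  # "y" "x" "z"
```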
Table Operations
One way to glue two tables together is using bind_rows and bind_cols. Experiment to learn how different variables are handled.
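As a starting point for that experimentation, here is a small sketch with two invented tibbles, `t1` and `t2`, that share only one column name.

```r
library(dplyr)

# Hypothetical toy tables that share the id column only
t1 <- tibble(id = 1:2, name = c("a", "b"))
t2 <- tibble(id = 3L, score = 9.5)

# bind_rows() stacks tables, matching columns by name;
# columns missing from one table are filled with NA
stacked <- bind_rows(t1, t2)

# bind_cols() glues tables side by side, matching rows by position,
# so both inputs need the same number of rows
side <- bind_cols(t1, tibble(flag = c(TRUE, FALSE)))

names(stacked)        # "id" "name" "score"
is.na(stacked$score)  # TRUE TRUE FALSE
names(side)           # "id" "name" "flag"
```

Because `bind_cols()` relies purely on row position, it gives no protection against misaligned rows; a join is usually the safer choice when a key column exists.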