R for the Rest of Us Cheatsheets

This is an in-progress version of cheatsheets for R in 3 Months participants and other people taking the R for the Rest of Us courses Fundamentals of R and Going Deeper with R.

General Syntax

Order Syntax Meaning Example
1 <- assignment operator x <- 10
2 -> alternative assignment 10 -> x
3 = assignment (mostly in function arguments) mean(x = c(1,2,3))
4 ! logical NOT operator: returns FALSE if the statement is TRUE (combined with = as != it means "not equal") starwars |> filter(eye_color != "yellow")
5 == equal x == y
6 > greater than x > y
7 < less than x < y
8 >= greater than or equal to x >= y
9 <= less than or equal to x <= y
10 | logical OR operator: returns TRUE if at least one of the statements is TRUE starwars |> filter(eye_color != "yellow" | height > 150)
11 & logical AND operator: returns TRUE if both statements are TRUE starwars |> filter(eye_color != "yellow" & height > 150)
12 : creates a sequence of numbers 1:10
13 %in% checks if a value is in a vector 3 %in% c(1,2,3,4)
14 ( ) used for function calls and precedence sum(1, 2, 3)
15 |> pipe operator mtcars |> select(mpg, cyl)
16 function() defines a function f <- function(x) x^2
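
As a quick illustration of the syntax table above, here is a minimal base R sketch that combines several of the operators. The variable names (`x`, `y`, `one_to_ten`, `square`) are made up for this example, and the native pipe `|>` assumes R 4.1 or later.

```r
# Assignment: `<-` is the conventional operator
x <- 10
y <- 3

# Comparisons return TRUE or FALSE
x == y    # FALSE
x > y     # TRUE
x >= 10   # TRUE
!(x > y)  # FALSE

# Combine conditions with | (OR) and & (AND)
either <- x > 5 | y > 5  # TRUE: the first condition holds
both <- x > 5 & y > 5    # FALSE: the second condition fails

# Sequences and membership
one_to_ten <- 1:10
3 %in% c(1, 2, 3, 4)     # TRUE

# Define a function, then pass values through it with the pipe
square <- function(x) x^2
total <- one_to_ten |> square() |> sum()
```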

Main functions used in our program

Getting Started with R

[R in 3 Months] Week 1

This week we learn how to install and load packages, import our data, and explore it by creating quick summaries.

Order Function Name Package What it Does Example Category
1 install.packages() base R installs a package install.packages("tidyverse") Package Management
2 library() base R loads a package library(tidyverse) Package Management
3 read_csv() {readr} reads a csv file data <- read_csv("data.csv") Data Import/Export
4 glimpse() {dplyr} shows a summary of a data frame glimpse(penguins) Data Exploration
5 skim() {skimr} summary statistics about variables in data frames skim(penguins) Data Exploration
6 tbl_summary() {gtsummary} calculates descriptive statistics tbl_summary(penguins) Data Exploration
7 makeDataReport() {dataReporter} makes a data overview report makeDataReport(penguins) Data Exploration
8 scan_data() {pointblank} makes a data overview report scan_data(penguins) Data Exploration
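
The Week 1 functions usually appear together at the top of a script. The sketch below is one possible arrangement, not the course's own code: it writes a small hypothetical CSV to a temporary file so that it runs without any external data, and it loads {readr} and {dplyr} individually (`library(tidyverse)` would load both).

```r
# install.packages("tidyverse")  # run once per machine, not in every script
library(readr)
library(dplyr)

# Hypothetical data written to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,height", "Ana,160", "Ben,175", "Cy,182"), csv_path)

# Import the file, then take a quick look at its columns and types
survey <- read_csv(csv_path, show_col_types = FALSE)
glimpse(survey)
```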

Fundamentals of R

Data Wrangling and Analysis

[R in 3 Months] Week 2

These lessons teach us how to reduce our dataset to contain only variables and occurrences of interest, create new variables and summarise them.

Order Function Name Package What it Does Example Category
9 select() {dplyr} keep or drop columns starwars |> select(species, starships) Data Wrangling and Analysis
10 mutate() {dplyr} create, modify, and delete columns starwars |> mutate(body_ratio = mass / height) Data Wrangling and Analysis
11 filter() {dplyr} keep or drop rows that match a condition starwars |> filter(eye_color == "yellow") Data Wrangling and Analysis
12 summarize() {dplyr} summarize a variable starwars |> summarize(avg_height = mean(height, na.rm = TRUE)) Data Wrangling and Analysis
13 group_by() {dplyr} group by one or more variables starwars |> group_by(species) |> summarize(avg_height = mean(height, na.rm = TRUE)) Data Wrangling and Analysis
14 arrange() {dplyr} order the rows of a data frame starwars |> arrange(desc(height)) Data Wrangling and Analysis
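
These verbs are most useful when chained. The following sketch shows one way to combine all six on the `starwars` data that ships with {dplyr}; the new column names (`height_m`, `avg_height_m`, `n`) and the cut-off of two characters per species are choices made for this example only.

```r
library(dplyr)

tall_species <- starwars |>
  # Keep only rows where height and species are known
  filter(!is.na(height), !is.na(species)) |>
  # Create a new column: height in metres
  mutate(height_m = height / 100) |>
  # Summarize once per species
  group_by(species) |>
  summarize(avg_height_m = mean(height_m), n = n()) |>
  # Drop species represented by a single character
  filter(n >= 2) |>
  # Tallest species first
  arrange(desc(avg_height_m)) |>
  # Keep only the columns of interest
  select(species, avg_height_m)

tall_species
```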

Data Visualization

[R in 3 Months] Week 3

In these lessons, we learn the basics of the grammar of graphics, how to make bar plots and scatterplots, and how to set colours.

Order Function Name Package What it Does Example Category
15 ggplot() {ggplot2} initializes a ggplot object ggplot(data = starwars, aes(x = height, y = mass)) Data Visualization
16 aes() {ggplot2} defines aesthetic mappings aes(x = height, y = mass, color = gender) Data Visualization
17 geom_point() {ggplot2} creates a scatterplot ggplot(starwars, aes(x = height, y = mass)) + geom_point() Data Visualization
18 geom_bar() {ggplot2} creates a bar chart ggplot(starwars, aes(x = species)) + geom_bar() Data Visualization
19 geom_histogram() {ggplot2} creates a histogram ggplot(starwars, aes(x = height)) + geom_histogram(binwidth = 10) Data Visualization
20 geom_boxplot() {ggplot2} creates a boxplot ggplot(starwars, aes(x = gender, y = height)) + geom_boxplot() Data Visualization
21 facet_wrap() {ggplot2} splits the plot into multiple panels ggplot(starwars, aes(x = height, y = mass)) + geom_point() + facet_wrap(vars(species)) Data Visualization
22 facet_grid() {ggplot2} creates a grid of plots ggplot(starwars, aes(x = height, y = mass)) + geom_point() + facet_grid(rows = vars(gender), cols = vars(species)) Data Visualization
23 labs() {ggplot2} adds labels to the plot ggplot(starwars, aes(x = height, y = mass)) + geom_point() + labs(title = "Height vs Mass in Star Wars") Data Visualization
24 theme() {ggplot2} customizes plot appearance ggplot(starwars, aes(x = height, y = mass)) + geom_point() + theme_minimal() + theme(axis.title = element_text(face = "bold")) Data Visualization
25 scale_color_manual() {ggplot2} manually sets color scale ggplot(starwars, aes(x = height, y = mass, color = gender)) + geom_point() + scale_color_manual(values = c("blue", "pink", "purple")) Data Visualization
26 ggsave() {ggplot2} saves plots to a file ggsave(filename = "plots/starwars-plot.pdf", height = 8, width = 11, units = "in") Data Visualization
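
A plot is built by adding layers to `ggplot()` with `+`. The sketch below puts several rows of the table together and saves the result. It assumes {dplyr} is installed (for the `starwars` data), removes one very heavy character so the points are readable, and saves to a temporary file rather than a `plots/` folder so it runs anywhere.

```r
library(ggplot2)
library(dplyr)  # provides the starwars data

# Drop rows with missing values and one extreme mass value (an example choice)
plot_data <- starwars |>
  filter(!is.na(height), !is.na(mass), mass < 1000)

# Map columns to aesthetics, then add a geom, labels, and a theme
p <- ggplot(plot_data, aes(x = height, y = mass, color = gender)) +
  geom_point() +
  labs(
    title = "Height vs Mass in Star Wars",
    x = "Height (cm)",
    y = "Mass (kg)"
  ) +
  theme_minimal()

# Save the plot; passing `plot = p` avoids relying on the last plot displayed
out_file <- tempfile(fileext = ".pdf")
ggsave(filename = out_file, plot = p, height = 4, width = 6, units = "in")
```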

Quarto

[R in 3 Months] Week 4

In week 4, we dive into Quarto and learn about code chunks, how to format text using Markdown, and how to render basic reports in PDF, Word, and HTML.

Order Function Name Package What it Does Example Category
27 format: html Quarto sets document output format to HTML format: html Quarto
28 format: pdf Quarto sets document output format to PDF format: pdf Quarto
29 format: docx Quarto sets document output format to Word format: docx Quarto
30 toc: true Quarto adds a table of contents toc: true Quarto
31 # Heading Quarto creates a heading # Heading or ## Heading Quarto
32 bold Quarto formats text as bold **bold** Quarto
33 italic Quarto formats text as italic *italic* Quarto
34 Link Quarto creates a hyperlink [Google](https://www.google.com) Quarto
35 List item Quarto creates an unordered list - Item 1
- Item 2
- Item 3
Quarto
36 Ordered item Quarto creates an ordered list 1. First
2. Second
3. Third
Quarto
37 Blockquote Quarto creates a blockquote > This is a blockquote Quarto
38 Inline code Quarto formats text as code The top response was `r top_response` Quarto
39 Code block Quarto creates a chunk of R code in a Quarto document ```{r}
library(tidyverse)
```
Quarto
40 Horizontal line Quarto creates a horizontal line --- Quarto
41 Image Quarto inserts an image ![Quarto Logo](quarto.png) Quarto
42 #| echo: false Quarto code chunk option to hide code but show output #| echo: false Quarto
43 #| eval: false Quarto code chunk option to prevent code execution #| eval: false Quarto
44 #| warning: false Quarto code chunk option to hide warnings #| warning: false Quarto
45 #| message: false Quarto code chunk option to hide messages #| message: false Quarto
46 #| cache: Quarto code chunk option to cache code output #| cache: true Quarto
47 #| fig-align: "center" Quarto code chunk option to align figure in the center #| fig-align: "center" Quarto
48 #| fig-cap: "Caption" Quarto code chunk option to add a caption to a figure #| fig-cap: "Scatterplot" Quarto
49 #| fig-width: Quarto code chunk option to set figure width #| fig-width: 6 Quarto
50 #| fig-height: Quarto code chunk option to set figure height #| fig-height: 4 Quarto
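
To see how the pieces in the table fit into one document, the sketch below uses R to write a minimal `.qmd` file to a temporary folder. The file name, title, and chunk contents are hypothetical. The code fence is built with `strrep()` only so that this example can itself sit inside a code block; in a real Quarto file you would type the three backticks directly.

```r
# Three backticks, built programmatically to avoid nesting fences in this example
fence <- strrep("`", 3)

qmd_lines <- c(
  "---",                         # YAML header: document-level options
  "title: \"My report\"",
  "format: html",
  "toc: true",
  "---",
  "",
  "## Summary",                  # Markdown heading
  "",
  "This text is **bold** and this is *italic*.",
  "",
  paste0(fence, "{r}"),          # start of an R code chunk
  "#| echo: false",              # chunk options come first, one per line
  "#| warning: false",
  "#| fig-cap: \"Scatterplot\"",
  "plot(cars)",
  fence                          # end of the code chunk
)

qmd_path <- file.path(tempdir(), "report.qmd")
writeLines(qmd_lines, qmd_path)
```

Opening the resulting file in RStudio or Positron and clicking Render should produce an HTML report, assuming Quarto is installed.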

Going Deeper with R

Advanced Data Wrangling

[R in 3 Months] Weeks 6, 7 and 8

In the advanced data wrangling section, we learn how to reshape our datasets and how to combine them. We also learn about the tidy principles and how to make a messy dataset more tidy.

Order Function Name Package What it Does Example Category
51 download.file() base R downloads a file from the Internet download.file("url", destfile = "my_path/my_file.csv") Data Import/Export
52 read_excel() {readxl} reads an Excel file into R read_excel("data.xlsx", sheet = 1) Data Import/Export
53 write_csv() {readr} writes a dataframe to a CSV file write_csv(data, "output.csv") Data Import/Export
54 pivot_longer() {tidyr} converts wide data into long format pivot_longer(data, cols = c(var1, var2), names_to = "variable", values_to = "value") Data Wrangling and Analysis
55 pivot_wider() {tidyr} converts long data into wide format pivot_wider(data, names_from = "variable", values_from = "value") Data Wrangling and Analysis
56 separate_wider_delim() {tidyr} separates one column into multiple columns separate_wider_delim(data, cols = variable, delim = "_", names = c("var1", "var2")) Data Wrangling and Analysis
57 separate_longer_delim() {tidyr} splits one column into multiple rows by a delimiter separate_longer_delim(data, cols = tags, delim = ",") Data Wrangling and Analysis
58 count() {dplyr} counts instances of unique values in a column data |> count(category) Data Wrangling and Analysis
59 distinct() {dplyr} returns unique rows in a dataset data |> distinct() Data Wrangling and Analysis
60 parse_number() {readr} extracts numbers from character strings parse_number("$1,234.56") Data Wrangling and Analysis
61 as.numeric() base R converts a column to numeric type as.numeric(c("1", "2", "3")) Data Wrangling and Analysis
62 case_when() {dplyr} performs conditional value assignment mutate(data, category = case_when(value > 10 ~ "High", .default = "Low")) Data Wrangling and Analysis
63 case_match() {dplyr} maps values to new categories mutate(data, category = case_match(x, "A" ~ "Alpha", "B" ~ "Beta")) Data Wrangling and Analysis
64 na_if() {dplyr} replaces a specific value with NA mutate(data, var = na_if(var, "Unknown")) Data Wrangling and Analysis
65 contains() {dplyr} selects columns that contain a string select(data, contains("score")) Data Wrangling and Analysis
66 starts_with() {dplyr} selects columns that start with a string select(data, starts_with("age")) Data Wrangling and Analysis
67 str_detect() {stringr} checks if a string contains a pattern filter(data, str_detect(name, "John")) Data Wrangling and Analysis
68 str_replace() {stringr} replaces text patterns in a string mutate(data, name = str_replace(name, fixed("Mr."), "")) Data Wrangling and Analysis
69 bind_rows() {dplyr} binds multiple dataframes by rows bind_rows(df1, df2) Data Wrangling and Analysis
70 inner_join() {dplyr} joins two datasets, keeping only matching rows inner_join(df1, df2, join_by("id")) Data Wrangling and Analysis
71 left_join() {dplyr} joins two datasets, keeping all rows from the left left_join(df1, df2, join_by("id")) Data Wrangling and Analysis
72 right_join() {dplyr} joins two datasets, keeping all rows from the right right_join(df1, df2, join_by("id")) Data Wrangling and Analysis
73 full_join() {dplyr} joins two datasets, keeping all rows from both full_join(df1, df2, join_by("id")) Data Wrangling and Analysis
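
Reshaping, recoding, and joining often happen in a single pipeline. Here is a minimal sketch using two small hypothetical tibbles (`scores_wide` and `students`, invented for this example). It assumes dplyr 1.1.0 or later, which introduced `.default` and `join_by()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per student, one column per test
scores_wide <- tibble(
  id = c(1, 2, 3),
  test_1 = c(8, 12, 15),
  test_2 = c(11, 9, 14)
)

# Hypothetical lookup table; student 3 is missing on purpose
students <- tibble(id = c(1, 2), name = c("Ana", "Ben"))

scores_long <- scores_wide |>
  # Wide to long: one row per student per test
  pivot_longer(cols = c(test_1, test_2), names_to = "test", values_to = "score") |>
  # Recode scores; .default catches everything not matched above
  mutate(level = case_when(score > 10 ~ "High", .default = "Low")) |>
  # Keep every score row and add names where the id matches
  left_join(students, join_by(id))

scores_long
```

Because `left_join()` keeps all rows from the left table, the two rows for student 3 remain in the result with `NA` in the `name` column.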

Advanced Data Visualization

[R in 3 Months] Weeks 9, 10 and 11

This section teaches us how to make our plots look better. We learn how to tweak themes to declutter, highlight and explain.

Order Function Name Package What it Does Example Category
74 read_rds() {readr} reads an RDS file into R data <- read_rds("data.rds") Data Import/Export
75 write_rds() {readr} writes a data frame to an RDS file data |> write_rds("data.rds") Data Import/Export
76 ungroup() {dplyr} removes grouping structure from a dataset data |> ungroup() Data Wrangling and Analysis
77 is.nan() base R checks if a value is NaN (Not a Number) is.nan(c(1, NaN, 3)) Data Exploration
78 is.na() base R checks if a value is NA (Not Available) is.na(c(1, NA, 3)) Data Exploration
79 fct_reorder() {forcats} reorders a factor based on another variable data |> mutate(category = fct_reorder(category, value)) Data Wrangling and Analysis
80 geom_line() {ggplot2} adds a line plot layer to ggplot ggplot(data, aes(x, y)) + geom_line() Data Visualization
81 slice_max() {dplyr} selects rows with the highest values of a column data |> slice_max(order_by = value, n = 5) Data Wrangling and Analysis
82 pull() {dplyr} extracts a single column as a vector vector <- data |> pull(column_name) Data Wrangling and Analysis
83 fct_relevel() {forcats} relevels a factor, moving specified levels first data |> mutate(category = fct_relevel(category, "High")) Data Wrangling and Analysis
84 lag() {dplyr} shifts values down by one or more rows data |> mutate(prev_value = lag(value)) Data Wrangling and Analysis
85 geom_text() {ggplot2} adds text labels to a ggplot ggplot(data, aes(x, y, label = label)) + geom_text() Data Visualization
86 geom_text_repel() {ggrepel} adds text labels that avoid overlapping ggplot(data, aes(x, y, label = label)) + geom_text_repel() Data Visualization
87 scale_y_continuous() {ggplot2} adjusts the y-axis scale ggplot(data, aes(x, y)) + scale_y_continuous(labels = scales::percent) Data Visualization
88 rename() {dplyr} renames columns in a dataset data |> rename(new_name = old_name) Data Wrangling and Analysis
89 drop_na() {tidyr} removes rows with missing values drop_na(data, column_name) Data Wrangling and Analysis
90 str_glue() {stringr} creates formatted strings using variables data |> mutate(full_name = str_glue("My name is {first_name} {last_name}")) Data Wrangling and Analysis
91 percent_format() {scales} formats numbers as percentages ggplot(data, aes(x, y)) + scale_y_continuous(labels = percent_format()) Data Visualization
92 annotate() {ggplot2} adds text or shapes to a ggplot ggplot(data, aes(x, y)) + annotate("text", x = 5, y = 10, label = "Note") Data Visualization
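
Several of these functions work together when polishing a chart. The sketch below reorders bars, labels them, formats the axis as percentages, and adds a note. The `shares` data is hypothetical, and `geom_col()` (a {ggplot2} bar geom that is not listed in the table above) is used because the heights are already computed.

```r
library(dplyr)
library(forcats)
library(ggplot2)
library(scales)

# Hypothetical data: share of respondents per category
shares <- tibble(
  category = c("Low", "Medium", "High"),
  share = c(0.2, 0.5, 0.3)
)

# Reorder the categories by their share so the bars appear sorted
ordered_shares <- shares |>
  mutate(category = fct_reorder(category, share))

p <- ggplot(ordered_shares, aes(x = category, y = share, label = percent(share))) +
  geom_col() +
  # Place each label just above its bar
  geom_text(vjust = -0.5) +
  # Show the y-axis as percentages and leave room for the labels
  scale_y_continuous(labels = percent_format(), limits = c(0, 0.6)) +
  # Add a free-standing note at a fixed position
  annotate("text", x = 1, y = 0.55, label = "Hypothetical data") +
  labs(x = NULL, y = NULL)

p
```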

Advanced Quarto

[R in 3 Months] Week 12

In these optional advanced Quarto lessons, we learn how to make parameterized reports, how to use inline R code to print variables as text, and how to publish our reports.

Order Function Name Package What it Does Example Category
93 flextable() {flextable} creates a customizable table flextable(data) Tables
94 gt() {gt} creates a gt table for formatted output gt(data) Tables
95 cols_label() {gt} renames columns in a gt table data |> gt() |> cols_label(column_name = "New Label") Tables
96 cols_width() {gt} sets column widths in a gt table data |> gt() |> cols_width(column1 ~ px(100), column2 ~ px(200)) Tables
97 cols_align() {gt} aligns columns in a gt table data |> gt() |> cols_align(align = "center", columns = column_name) Tables
98 tab_caption() {gt} adds a caption to a gt table data |> gt() |> tab_caption("This is a table caption.") Tables
99 opt_interactive() {gt} makes the table interactive (sortable, searchable) data |> gt() |> opt_interactive() Tables
100 quarto_render() {quarto} renders a Quarto document to the specified format quarto_render("report.qmd", output_format = "html") Quarto
101 tibble() {tibble} creates a modern data frame tibble(name = c("Alice", "Bob"), age = c(25, 30)) Quarto
102 pwalk() {purrr} iterates over multiple lists, applying a function pwalk(list(names, ages), ~ print(paste(.x, "is", .y, "years old"))) Quarto
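
For parameterized reports, `tibble()` and `pwalk()` are commonly paired: the tibble holds one row per report and `pwalk()` calls a function once per row. The sketch below shows that pattern under some assumptions: the `reports` table, the `region` parameter, and the `render_one()` helper are all hypothetical, and the helper only prints a message so that the example runs without Quarto installed.

```r
library(tibble)
library(purrr)

# Hypothetical parameters: one row per report to produce
reports <- tibble(
  region = c("North", "South"),
  output_file = c("report-north.html", "report-south.html")
)

# Stand-in for the real rendering step. In a real project the body could
# instead call something like:
#   quarto::quarto_render(
#     "report.qmd",
#     output_file = output_file,
#     execute_params = list(region = region)
#   )
render_one <- function(region, output_file) {
  message("Would render ", output_file, " for region ", region)
}

# pwalk() calls render_one() once per row, matching columns to argument names
pwalk(reports, render_one)
```

Because the column names in `reports` match the argument names of `render_one()`, no extra mapping code is needed.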