2  Getting Started with R Cheat Sheet

R is one of the most popular programming languages in data science and is widely used across various industries and academia. It’s open-source, easy to learn, and capable of handling complex data and statistical manipulations, making it the preferred computing environment for many data scientists today.

This cheat sheet gives an overview of the basics you need to get started with R.

2.1 Using Packages in R

R packages are collections of functions and tools developed by the R community. They increase the power of R by improving existing base R functionalities or adding new ones.

## Installs a new package (e.g., the tidyverse package)
# install.packages("tidyverse")

## Loads a package so its functions can be used (e.g., the tidyverse package)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
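
A common pattern is to install a package only when it is missing. The sketch below uses base R's requireNamespace() for the check; the helper name ensure_package is hypothetical, and the install step assumes a network connection and a configured CRAN mirror.

```r
# Hypothetical helper: install a package only if it is not already available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can now be loaded
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so no installation is triggered here
ensure_package("stats")
```

When two attached packages export the same name, as in the conflicts listed above, writing package::function(), for example stats::filter(), makes the choice explicit.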

2.2 The Working Directory

The working directory is the folder that R uses as the starting point for relative file paths. That is, it’s the default location for importing and exporting files. An example of a working directory is “/file/path”.

## Returns your current working directory
getwd()
[1] "/home/pawan/r-cheatsheets"

## Changes your current working directory to a desired file path
setwd("~/")
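
To see how relative paths follow the working directory, here is a minimal sketch that switches to a temporary folder and then restores the original location. The file name example.txt and the variable names are made up for illustration.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# Switch to the session's temporary directory
setwd(tempdir())

# A relative path now resolves against the new working directory
writeLines("hello", "example.txt")
file_found <- file.exists(file.path(getwd(), "example.txt"))

# Clean up and go back to where we started
unlink("example.txt")
setwd(old_wd)
```

Restoring the previous directory at the end keeps the rest of a script from silently reading and writing in an unexpected place.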

2.3 Operators in R

2.3.1 Arithmetic Operators in R

Operator    Description
a + b       Adds a and b
a - b       Subtracts b from a
a * b       Multiplies a by b
a / b       Divides a by b
a ^ b       Raises a to the power b
a %% b      Remainder of a divided by b (modulo)
a %/% b     Integer division of a by b (the quotient without the remainder)
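
A quick worked example of the arithmetic operators, using the made-up values 7 and 2:

```r
a <- 7
b <- 2

a + b    # 9
a - b    # 5
a * b    # 14
a / b    # 3.5
a ^ b    # 49
a %% b   # 1, the remainder of 7 / 2
a %/% b  # 3, the whole-number part of 7 / 2

# The two division operators fit together: a equals (a %/% b) * b + (a %% b)
(a %/% b) * b + (a %% b)  # 7
```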

2.3.2 Relational Operators in R

Operator    Description
a == b      Tests whether a is equal to b
a != b      Tests whether a is not equal to b
a > b       Tests whether a is greater than b
a < b       Tests whether a is less than b
a >= b      Tests whether a is greater than or equal to b
a <= b      Tests whether a is less than or equal to b
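
Relational operators compare element by element and return logical values. The sketch below, with made-up data, also shows why == can surprise you with decimal numbers; all.equal() is the base R function for comparing with a small tolerance.

```r
x <- c(1, 5, 10)

x > 4    # FALSE TRUE TRUE
x == 5   # FALSE TRUE FALSE

# Floating-point arithmetic can make == misleading
0.1 + 0.2 == 0.3                    # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE
```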

2.3.3 Logical Operators in R

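Operator    Description
!           Logical NOT
&           Element-wise logical AND
&&          Logical AND for single values (evaluates left to right and stops as soon as the result is known)
|           Element-wise logical OR
||          Logical OR for single values (evaluates left to right and stops as soon as the result is known)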
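
The difference between the element-wise and single-value forms is easiest to see side by side. This is a small sketch with made-up vectors p and q:

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

!p      # FALSE TRUE FALSE
p & q   # TRUE FALSE FALSE
p | q   # TRUE TRUE TRUE

# && and || work on single values and stop evaluating early:
# the right-hand side is never evaluated here, so stop() is not called
FALSE && stop("not reached")  # FALSE
TRUE || stop("not reached")   # TRUE
```

Use & and | when filtering vectors, and && and || inside if () conditions, where a single TRUE or FALSE is expected.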

2.3.4 Assignment Operators in R

Operator         Description
x <- 1, x = 1    Assigns the value 1 to the variable x

2.3.5 Other Operators in R

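Operator    Description
%in%        Tests whether each element on the left appears in the vector on the right
$           Accesses a named element of an object, such as a list element or a data frame column
%>%         Pipe operator from the magrittr package; passes the object on its left as the first argument of the function on its right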
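
A short sketch of these three operators in use. It assumes the magrittr package is installed (it is installed along with the tidyverse); the list person is made-up example data.

```r
library(magrittr)  # provides %>%; assumed to be installed

# %in% returns one TRUE/FALSE per element on the left-hand side
c(2, 9) %in% c(1, 2, 3)   # TRUE FALSE

# $ extracts a named element from a list (or a column from a data frame)
person <- list(name = "Ada", age = 36)
person$age                # 36

# %>% passes the left-hand value as the first argument of the next function
c(4, 9, 16) %>% sqrt() %>% sum()   # 9, the same as sum(sqrt(c(4, 9, 16)))
```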

2.4 Getting Started with Vectors in R

A vector is a one-dimensional collection of values that all share the same type, which makes it the simplest tool for storing data in R. Common types of vectors include logical, integer, double, and character.

2.4.1 Creating Vectors in R

A vector can be thought of as a single column of data: it has a length, but no rows or columns. The c() function combines values into a vector.

c(1:59)  
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[51] 51 52 53 54 55 56 57 58 59

Here, c() is the combine function, and the values inside its parentheses are called elements. In the printed output, the number in square brackets at the start of each line (for example, [26]) is the position of the first element shown on that line.

There are several types of vectors:
Numeric vector - 1, 2, 3, 4
Character vector - “A”, “a”, “b”, “ram”
Logical vector - TRUE, FALSE
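
To check which type a vector has, base R provides typeof(). The sketch below uses made-up variable names and also shows what happens when types are mixed inside c().

```r
num <- c(1, 2, 3, 4)
chr <- c("A", "a", "b", "ram")
lgl <- c(TRUE, FALSE)

typeof(num)  # "double"
typeof(chr)  # "character"
typeof(lgl)  # "logical"
typeof(1:4)  # "integer"

# Mixing types coerces every element to the most flexible type
mixed <- c(1, "a", TRUE)
mixed          # "1" "a" "TRUE"
typeof(mixed)  # "character"
```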

  • Creates a vector using elements separated by commas
c(1,3,5)
[1] 1 3 5
  • Creates a vector of integers between two numbers
1:7
[1] 1 2 3 4 5 6 7
  • Creates a vector between two numbers, with a specified interval between each element
seq(2,8,by = 2)
[1] 2 4 6 8
  • Creates a vector of given elements repeated a number of times
rep(c(2, 8), times = 4)
[1] 2 8 2 8 2 8 2 8
  • Creates a vector of given elements repeating each element a number of times
rep(c(2, 8), each = 3)
[1] 2 2 2 8 8 8

2.5 Vector Functions in R

  • sort(my_vector) : Returns my_vector sorted in ascending order
  • rev(my_vector) : Returns my_vector in reverse order
  • table(my_vector) : Counts how often each value occurs in my_vector
  • unique(my_vector) : Returns the distinct elements of my_vector
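
A small illustration of these four functions on a made-up vector:

```r
my_vector <- c(3, 1, 2, 3, 1)   # made-up example data

sort(my_vector)     # 1 1 2 3 3
rev(my_vector)      # 1 3 2 1 3
unique(my_vector)   # 3 1 2
table(my_vector)    # counts: 1 occurs 2 times, 2 occurs 1 time, 3 occurs 2 times
```

None of these functions change my_vector itself; assign the result to a name if you want to keep it.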

2.6 Selecting Vector Elements in R

  • my_vector[6] : Returns the sixth element of my_vector
  • my_vector[-6] : Returns all but the sixth element
  • my_vector[2:6] : Returns elements two to six
  • my_vector[-(2:6)] : Returns all elements except the second through the sixth
  • my_vector[c(2,6)] : Returns the second and sixth elements
  • my_vector[my_vector == 5] : Returns elements equal to 5
  • my_vector[my_vector < 5] : Returns elements less than 5
  • my_vector[my_vector %in% c(2, 5, 8)] : Returns elements in the set {2, 5, 8}
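
The selections above can be tried on a small made-up vector. Note that the logical conditions are written in terms of my_vector itself.

```r
my_vector <- c(10, 5, 8, 2, 5, 7, 1)   # made-up example data

my_vector[6]          # 7
my_vector[-6]         # 10 5 8 2 5 1
my_vector[2:6]        # 5 8 2 5 7
my_vector[-(2:6)]     # 10 1
my_vector[c(2, 6)]    # 5 7

# A logical condition keeps the elements where the condition is TRUE
my_vector[my_vector == 5]               # 5 5
my_vector[my_vector < 5]                # 2 1
my_vector[my_vector %in% c(2, 5, 8)]    # 5 8 2 5
```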

3 Math Functions in R

  • log(x) : Returns the natural logarithm of x
  • exp(x) : Returns e raised to the power x
  • max(x) : Returns the maximum value of a vector
  • min(x) : Returns the minimum value of a vector
  • mean(x) : Returns the mean of a vector
  • sum(x) : Returns the sum of a vector
  • median(x) : Returns the median of a vector
  • quantile(x) : Returns sample quantiles of a vector (by default the 0%, 25%, 50%, 75%, and 100% quantiles)
  • round(x, n) : Rounds to n decimal places
  • rank(x) : Returns the rank of each element in a vector
  • signif(x, n) : Rounds to n significant figures
  • var(x) : Returns the sample variance of a vector
  • cor(x, y) : Returns the correlation between two vectors
  • sd(x) : Returns the sample standard deviation of a vector
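
A worked example with made-up numbers, chosen so the results are easy to verify by hand (y is an exact linear function of x, so the correlation is 1):

```r
x <- c(2, 4, 4, 5, 10)   # made-up example data
y <- 2 * x + 1           # exact linear function of x

sum(x)       # 25
mean(x)      # 5
median(x)    # 4
max(x)       # 10
min(x)       # 2
var(x)       # 9, the sample variance (divides by n - 1)
sd(x)        # 3, the square root of the variance
rank(x)      # 1.0 2.5 2.5 4.0 5.0 (tied values share the average rank)
quantile(x)  # 0%: 2, 25%: 4, 50%: 4, 75%: 5, 100%: 10
cor(x, y)    # 1, a perfect linear relationship

round(3.14159, 2)    # 3.14
signif(3.14159, 2)   # 3.1
log(exp(1))          # 1, because log() is the natural logarithm
```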

3.1 Getting Started with Strings in R

The stringr package makes it easier to work with strings in R. Install and load this package to use the following functions.

3.1.1 Find matches

  • Loads the stringr package
library(stringr)
  • Detects the presence of a pattern match in a string
str_detect(string, pattern, negate = FALSE) 
  • Detects the presence of a pattern match at the beginning of a string
str_starts(string, pattern, negate = FALSE) 
  • Finds the indexes of the strings that contain a pattern match
str_which(string, pattern, negate = FALSE)
  • Locates the start and end positions of the first pattern match in each string
str_locate(string, pattern)
  • Counts the number of pattern matches in a string
str_count(string, pattern)
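
The matching functions can be tried on a small character vector. This sketch assumes stringr is installed; the vector fruit is made-up example data.

```r
library(stringr)   # assumed to be installed

fruit <- c("apple", "banana", "cherry")   # made-up example data

str_detect(fruit, "an")    # FALSE TRUE FALSE
str_starts(fruit, "b")     # FALSE TRUE FALSE
str_which(fruit, "e")      # 1 3
str_count(fruit, "a")      # 1 3 0

# str_locate() returns a matrix with "start" and "end" columns,
# with NA in rows where the pattern does not match
str_locate(fruit, "an")
```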

3.1.2 Subset

  • Extracts substrings from a character vector
str_sub(string, start = 1L, end = -1L)
  • Returns strings that contain a pattern match
str_subset(string, pattern, negate = FALSE) 
  • Returns the first pattern match in each string as a vector
str_extract(string, pattern) 
  • Returns the first pattern match in each string as a matrix with a column for each group in the pattern
str_match(string, pattern)
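
A brief sketch of the subsetting functions, again assuming stringr is installed and using made-up strings:

```r
library(stringr)   # assumed to be installed

fruit <- c("apple", "banana", "cherry")   # made-up example data

str_sub(fruit, 1, 3)            # "app" "ban" "che"
str_sub(fruit, -2)              # "le" "na" "ry" (negative positions count from the end)
str_subset(fruit, "rr")         # "cherry"
str_extract(fruit, "[aeiou]")   # "a" "a" "e", the first vowel in each string

# str_match() adds one column per capture group in the pattern:
# column 1 is the full match, columns 2 and 3 are the two groups
m <- str_match("key=value", "(\\w+)=(\\w+)")
m[1, ]                          # "key=value" "key" "value"
```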

3.1.3 Mutate

  • Replaces substrings by selecting them with str_sub() and assigning new values to them
str_sub(string, start, end) <- value
  • Replaces the first matched pattern in each string
str_replace(string, pattern, replacement)  
  • Replaces all matched patterns in each string
str_replace_all(string, pattern, replacement) 
  • Converts strings to lowercase
str_to_lower(string) 
  • Converts strings to uppercase
str_to_upper(string) 
  • Converts strings to title case
str_to_title(string) 
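
The following sketch shows the replacement and case functions on a made-up sentence, assuming stringr is installed. The last step overwrites part of the string by assigning to str_sub().

```r
library(stringr)   # assumed to be installed

s <- "the quick brown fox"   # made-up example data

first_only <- str_replace(s, "o", "0")       # "the quick br0wn fox"
every_one  <- str_replace_all(s, "o", "0")   # "the quick br0wn f0x"

str_to_upper(s)        # "THE QUICK BROWN FOX"
str_to_title(s)        # "The Quick Brown Fox"
str_to_lower("LOUD")   # "loud"

# Assigning to str_sub() overwrites the selected characters in place
str_sub(s, 1, 3) <- "one"
s                      # "one quick brown fox"
```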

3.1.4 Join and split

  • Repeats each string a given number of times
str_dup(string, times)
  • Splits each string into at most n pieces and returns a matrix of substrings with n columns
str_split_fixed(string, pattern, n)
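
A minimal example of both functions, assuming stringr is installed; the date strings are made-up example data.

```r
library(stringr)   # assumed to be installed

str_dup("ab", 3)   # "ababab"

dates <- c("2024-01-15", "2023-12-31")   # made-up example data
parts <- str_split_fixed(dates, "-", 3)

dim(parts)    # 2 3: one row per string, one column per piece
parts[, 1]    # "2024" "2023"
parts[2, ]    # "2023" "12" "31"
```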