1 Lab Instructions

The goal of this lab is to give you more practice with scripting, R packages, and basic statistics functions.

You will need to submit your answers from Section 5.2 to Canvas.

2 R Scripting

For this lab, it is recommended that you use an R script to analyze the data. As a reminder, the code you create here can be used for HW 1. You can use the R script provided on Canvas or create your own.[1]

Additionally, remember to follow good scripting practices. This includes consistent naming, sensible placement of R package calls, and clear commenting.
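One possible layout for such a script is sketched below. The section markers and the object names are illustrative assumptions, not a required template, and the built-in datasets package (with its iris data) stands in for whatever package your script actually needs, so the sketch runs without installing anything.

```r
# Lab script layout (illustrative sketch, not a required template)
# Purpose: show package placement, naming, and commenting

# ---- Packages: load everything the script needs at the top ----
library(datasets)  # stand-in package; it ships with R

# ---- Problem 1: mean sepal length ----
# Descriptive snake_case names make later lines easier to read
iris_sepal_length <- iris$Sepal.Length
mean_sepal_length <- mean(iris_sepal_length)
mean_sepal_length
```

Keeping the package calls together at the top makes it easy to see what must be installed before the script will run.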

3 R Packages

R packages are used to increase the functionality of R. Additionally, many R package developers release data to the public via their packages. The data is saved as an RData file that can be easily accessed. For this section, you will install the palmerpenguins package from CRAN.

Installation

There are two ways to install an R package from CRAN: via the console or through RStudio (recommended). You can choose either way to install the package. If you decide to install palmerpenguins via the console, use the following code:

install.packages("palmerpenguins")
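If you re-run your script often, you may not want to reinstall the package every time. Below is a minimal sketch of one way to check first, using base R's requireNamespace function. The variable names pkg_name and pkg_available are only illustrative, and the sketch prints a reminder rather than installing anything itself.

```r
# Name of the package to check (illustrative variable name)
pkg_name <- "palmerpenguins"

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
pkg_available <- requireNamespace(pkg_name, quietly = TRUE)

if (pkg_available) {
  message(pkg_name, " is already installed.")
} else {
  message(pkg_name, " is not installed; run install.packages(\"", pkg_name, "\") once.")
}
```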

Accessing Data

Before you can access the data, you will need to load the package. Use the following code to load it:

library(palmerpenguins)

The data set is called penguins. Use the head function to view the first few rows of the data set. The output is provided below:

## # A tibble: 6 x 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male 
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA> 
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## 6 Adelie  Torge…           39.3          20.6              190        3650 male 
## # … with 1 more variable: year <int>
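By default, head returns the first six rows; the optional n argument changes that. The sketch below uses a small made-up data frame, toy_penguins (a hypothetical stand-in with invented values, not the real data), so it runs without the package.

```r
# Hypothetical stand-in data; the values are invented for illustration
toy_penguins <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap",
              "Adelie", "Gentoo", "Chinstrap"),
  body_mass_g = c(3750, 3800, 5000, 5200, 3700, 3450, 4900, 3600)
)

head(toy_penguins)         # first six rows (the default)
head(toy_penguins, n = 3)  # first three rows only
```

The same calls should work on the real data once the package is loaded, for example head(penguins, n = 3).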

If you look deeper into the penguins data set, you will notice that a few observations have missing values. We will remove these observations by creating a new data set called new_penguin. You can use the na.omit function to remove observations with missing values:

new_penguin <- na.omit(penguins)
new_penguin
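To see what na.omit does, it can help to try it on a tiny example first. In the sketch below, toy_data and toy_complete are hypothetical names and the values are made up; any row containing at least one NA is dropped.

```r
# Hypothetical data with two incomplete rows
toy_data <- data.frame(
  bill_length_mm = c(39.1, NA, 40.3, 36.7),
  sex = c("male", "female", NA, "female")
)

# Rows 2 and 3 each contain a missing value, so only rows 1 and 4 remain
toy_complete <- na.omit(toy_data)

nrow(toy_data)      # number of rows before removing missing values
nrow(toy_complete)  # number of rows after
```

Comparing nrow(penguins) with nrow(new_penguin) in the same way shows how many observations were removed from the real data.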

4 Creating Vectors

Using the new_penguin data set, convert the following variables to separate vectors:

  • species

  • island

  • sex

  • bill_length_mm

  • bill_depth_mm

  • flipper_length_mm

  • body_mass_g

As an example, the code below extracts the variable year and creates a new vector called penguins_year:

penguins_year <- new_penguin$year
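The $ operator returns a column as a plain vector. A small sketch with a made-up data frame shows the pattern and a quick way to check the result; toy_penguins, toy_island, and toy_flipper are hypothetical names, and the values are invented.

```r
# Hypothetical stand-in data
toy_penguins <- data.frame(
  island = c("Torgersen", "Biscoe", "Dream"),
  flipper_length_mm = c(181L, 210L, 195L)
)

# Extract each column into its own vector
toy_island <- toy_penguins$island
toy_flipper <- toy_penguins$flipper_length_mm

# Check that the results have one element per row and the expected type
length(toy_island)
class(toy_flipper)  # "integer"
```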

5 Basic Statistics Functions

5.1 Summary Statistics

R has built-in functions for calculating basic statistics on vectors. The tables below provide a limited list of these functions.

Numeric

Function    Description
max         Maximum
min         Minimum
range       Range (min and max)
mean        Mean
median      Median
sd          Standard Deviation
sum         Sum

Character

Function      Description
table         Obtain Frequencies
prop.table    Obtain Relative Frequencies
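Two details are easy to miss: range returns two values (the minimum and the maximum) rather than their difference, and prop.table expects the output of table rather than the raw vector. A short sketch with made-up vectors (toy_mass, toy_species, and species_counts are hypothetical names):

```r
# Hypothetical vectors with invented values
toy_mass <- c(3750, 3800, 3250, 3450)
toy_species <- c("Adelie", "Gentoo", "Adelie", "Adelie")

range(toy_mass)        # 3250 3800: minimum and maximum, not the difference
diff(range(toy_mass))  # 550: the spread, if that is what you need

species_counts <- table(toy_species)  # counts per category
prop.table(species_counts)            # proportions: Adelie 0.75, Gentoo 0.25
```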

5.2 Obtaining Statistics

Using the functions listed in Section 5.1, obtain the required statistics. Record your answers on the Lab 1B quiz on Canvas.

Numeric

For the numeric vectors, obtain the following statistics: mean, median, standard deviation, and sum. For example:

mean(penguins_year)
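These functions return NA when the vector contains a missing value, which is likely one reason the incomplete observations were removed earlier in the lab. A sketch with a made-up vector (toy_depth is a hypothetical name) shows the behaviour and the na.rm = TRUE alternative:

```r
# Hypothetical vector with one missing value
toy_depth <- c(18.7, 17.4, NA, 19.3)

mean(toy_depth)                  # NA, because one value is missing
mean(toy_depth, na.rm = TRUE)    # mean of the three observed values
median(toy_depth, na.rm = TRUE)  # 18.7
sd(toy_depth, na.rm = TRUE)      # standard deviation of the observed values
sum(toy_depth, na.rm = TRUE)     # 55.4
```

Because new_penguin has already had its incomplete rows removed, the vectors you created from it should not need na.rm.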

Character

For the character vectors, obtain the frequency table. For example:

table(penguins_year)

  1. Remember to comment which problems are being answered in the script.