
Introduction to working with data on R

Working with health data in R and RStudio

Corinne Riddell

August 28, 2020

Learning objectives for today:
1. What is a data frame
2. How to read a comma separated values (CSV) file using read_csv()
3. Get to know the data using str(), head(), dim(), and names()
4. Manipulate the data frame using the R package dplyr’s main functions:
• rename()
• select()
• arrange()
• filter()
• mutate()
• group_by()
• summarize()

Readings
• There are no chapters from the textbook for this lecture.
• Here are some additional online resources (optional, but helpful!):
– Data Frames
– 15 min intro to dplyr
– Data wrangling cheat sheet

What is a data frame?
• A data frame is a data set.
• We read data into R from common sources like Excel spreadsheets (.xls or .xlsx), text files (.txt),
comma-separated values files (.csv), and other formats.
• The simplest format of data contains one row for each individual in the study.
• The first column of the data identifies the individual (perhaps by a name or an ID variable).
• Subsequent columns are variables that have been recorded or measured.
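
To make the layout described above concrete, here is a minimal sketch that builds a small data frame by hand using base R's data.frame(). The object name example_data, the column names, and all values are made up for illustration; they are not from the lake study.

```r
# Hypothetical values for illustration only (not real study data).
example_data <- data.frame(
  id     = c("A01", "A02", "A03"),   # first column identifies each individual
  age    = c(34, 51, 28),            # a measured variable
  smoker = c(FALSE, TRUE, FALSE)     # a recorded variable
)

example_data        # print the data frame
nrow(example_data)  # one row per individual: 3
ncol(example_data)  # one ID column plus two variables: 3
```

Each row is one individual and each column is one variable, which is the same shape we expect from a file read into R.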

Lake data from Baldi and Moore (B&M)
• Exercise 1.25 from Edition 4 of B&M
• Six rows of data from a study of mercury concentration across 53 lakes
• I’ve added three fabricated rows
• I’ve placed these data in the Day-2 folder
• Let’s find it there

readr is a library to import data into R
• To access readr’s functions we load the library like this:

library(readr)

• Click the green arrow to run the code or place your cursor on the line of code and type cmd + enter
(Mac) or control + enter (PC)
• A green rectangle that temporarily appears next to the code shows you that it has run.

read_csv() to load the lake data in R
• read_csv() is a function from the readr library used to import csv files.
• code template: your_data <- read_csv("pathway_to_data.csv")
• The <- is called the assignment operator. It says to save the imported data into an object called
your_data.
lake_data <- read_csv("Data_mercury_lake.csv")

## Parsed with column specification:
## cols(
## lakes = col_character(),
## ph = col_double(),
## chlorophyll = col_double(),
## mercury_in_fish = col_double(),
## number_fish_sampled = col_double(),
## age_data = col_character()
## )
• Anytime you see “##” on the html slides or in the PDF lecture files, the text in those lines is the
output of running the code that precedes them. So the lines above are the output displayed when you
run the read_csv() function.
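
If you want to try the code template without the course file, the sketch below first writes a tiny CSV to a temporary location and then imports it. The object names tmp_path and toy_lakes and every value in the file are assumptions made up for this example; only the column names echo the lake data.

```r
library(readr)

# Write a small made-up CSV to a temporary file so the example is self-contained.
tmp_path <- tempfile(fileext = ".csv")
writeLines(c(
  "lakes,ph,mercury_in_fish",
  "Lake A,6.1,1.23",
  "Lake B,5.1,1.33",
  "Lake C,9.1,0.04"
), tmp_path)

# Import the file and use <- to save the result in an object called toy_lakes.
toy_lakes <- read_csv(tmp_path)

toy_lakes  # print the imported data
```

As with the lake data, read_csv() guesses a type for each column: here lakes is read as character and the other two columns as double.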

Exercise 1
1. Execute the above code either by clicking the green arrow or by placing your cursor on the line and
using the keyboard shortcut (cmd + enter on Mac or control + enter on PC).
2. Note that the data appears in the Environment pane in the top right.
• Notice the number of observations and the number of variables.
3. Click the tiny table icon to the right of lake_data in the Environment pane to open the Viewer
tab and inspect the data.


Check your understanding!
Four functions to get to know a dataset
• head(your_data): Shows the first six rows of the supplied dataset
• dim(your_data): Provides the number of rows by the number of columns
• names(your_data): Lists the variable names of the columns in the dataset
• str(your_data): Summarizes the above information and more
I use these functions all the time, often multiple times per session when working with data, to remind me
what the variable names are and what the data look like.
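
As a rough illustration of the four functions, the sketch below runs them on a small made-up data frame. The name toy_data and all of its values are hypothetical; eight rows are used so that head() visibly returns only the first six.

```r
# Made-up data frame with 8 rows and 3 columns (values are illustrative only).
toy_data <- data.frame(
  lake         = paste("Lake", LETTERS[1:8]),
  ph           = c(6.1, 5.1, 9.1, 6.9, 4.6, 7.3, 5.4, 8.1),
  fish_sampled = c(5, 7, 12, 12, 10, 24, 8, 15)
)

head(toy_data)   # first six rows only
dim(toy_data)    # number of rows, then number of columns: 8 3
names(toy_data)  # "lake" "ph" "fish_sampled"
str(toy_data)    # dimensions, column names, column types, and first few values
```

The same four calls work unchanged on lake_data once it has been imported.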

head()
First six rows:
head(lake_data)




