Class notes

Introduction to working with data on R

Uploaded on: 03-01-2021
Written in: 2020/2021

Working with health data in R and RStudio

Corinne Riddell

August 28, 2020

Learning objectives for today:
1. What is a data frame
2. How to read a comma separated values (CSV) file using read_csv()
3. Get to know the data using str(), head(), dim(), and names()
4. Manipulate the data frame using the R package dplyr’s main functions:
• rename()
• select()
• arrange()
• filter()
• mutate()
• group_by()
• summarize()

Readings
• There are no chapters from the textbook for this lecture.
• Here are some additional online resources (optional, but helpful!):
– Data Frames
– 15 min intro to dplyr
– Data wrangling cheat sheet

What is a data frame?
• A data frame is a data set.
• We read data into R from common sources like Excel spreadsheets (.xls or .xlsx), text files (.txt),
comma-separated values files (.csv), and other formats.
• The simplest format of data contains one row for each individual in the study.
• The first column of the data identifies the individual (perhaps by a name or an ID variable).
• Subsequent columns are variables that have been recorded or measured.
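
As a minimal sketch of this layout, the base R function data.frame() can build a small data frame by hand. The lake names and values below are invented for illustration and are not taken from the B&M data.

```r
# Hypothetical values, invented for illustration only.
toy_lakes <- data.frame(
  lake = c("Lake A", "Lake B", "Lake C"),  # first column identifies each individual
  ph = c(6.1, 5.1, 9.1),                   # subsequent columns are measured variables
  mercury = c(1.23, 1.33, 0.04)
)

# One row per lake, one column per variable
toy_lakes
nrow(toy_lakes)  # number of individuals (rows)
ncol(toy_lakes)  # number of columns
```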

Lake data from Baldi and Moore (B&M)
• Exercise 1.25 from Edition 4 of B&M
• Six rows of data from a study of mercury concentration across 53 lakes
• I’ve added three fabricated rows
• I’ve placed these data in the Day-2 folder
• Let’s find it there

readr is a library to import data into R
• To access readr’s functions we load the library like this:

library(readr)

• Click the green arrow to run the code or place your cursor on the line of code and type cmd + enter
(Mac) or control + enter (PC)
• A green rectangle that temporarily appears next to the code shows you that it has run.

read_csv() to load the lake data in R
• read_csv() is a function from the readr library used to import csv files.
• code template: your_data <- read_csv("pathway_to_data.csv")
• The <- is called the assignment operator. It says to save the imported data into an object called
your_data.
lake_data <- read_csv("Data_mercury_lake.csv")

## Parsed with column specification:
## cols(
##   lakes = col_character(),
##   ph = col_double(),
##   chlorophyll = col_double(),
##   mercury_in_fish = col_double(),
##   number_fish_sampled = col_double(),
##   age_data = col_character()
## )
• Anytime you see “##” on the html slides or in the PDF lecture files, the text on those lines is the
output of running the code that precedes it. So the lines above are the output displayed when you
run the read_csv() function.
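
The column specification that read_csv() reports can also be stated explicitly through its col_types argument. The sketch below is self-contained: it writes a tiny, made-up CSV to a temporary file and reads it back. The file contents and the object names csv_path and toy_data are assumptions for illustration, not the course data.

```r
library(readr)

# Write a small made-up CSV to a temporary file so the example runs anywhere.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("lakes,ph,age_data",
    "Lake A,6.1,recent",
    "Lake B,5.1,old"),
  csv_path
)

# Read it back, stating the column types instead of letting readr guess them.
toy_data <- read_csv(
  csv_path,
  col_types = cols(
    lakes = col_character(),
    ph = col_double(),
    age_data = col_character()
  )
)

toy_data
```

When the types are supplied this way, readr has nothing to guess, so the column specification message should not be printed.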

Exercise 1
1. Execute the code above, either by clicking the green arrow or by placing your cursor on the line
and using the keyboard shortcut (cmd + enter on Mac or control + enter on PC).
2. Note that the data appears in the Environment pane in the top right.
• Notice the number of observations and the number of variables.
3. Click the tiny table icon to the right of lake_data in the Environment pane to open the Viewer
tab and inspect the data.

Check your understanding!

Four functions to get to know a dataset
• head(your_data): Shows the first six rows of the supplied dataset
• dim(your_data): Provides the number of rows by the number of columns
• names(your_data): Lists the variable names of the columns in the dataset
• str(your_data): Summarizes the above information and more
I use these functions all the time, often several times per session, to remind myself what the
variable names are and what the data look like.
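
The four functions can be tried on any data frame. Here is a small sketch using base R only; toy_lakes and its values are hypothetical and stand in for the lake data.

```r
# Hypothetical data, invented for illustration only.
toy_lakes <- data.frame(
  lake = paste("Lake", LETTERS[1:8]),
  ph = c(6.1, 5.1, 9.1, 6.9, 4.6, 7.3, 5.4, 8.1),
  mercury = c(1.23, 1.33, 0.04, 0.44, 1.20, 0.27, 0.48, 0.19)
)

head(toy_lakes)   # first six of the eight rows
dim(toy_lakes)    # 8 3 (rows, then columns)
names(toy_lakes)  # "lake" "ph" "mercury"
str(toy_lakes)    # dimensions, column names, column types, and the first few values
```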

head()
First six rows:
head(lake_data)
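
Six rows is only the default: head() also accepts an n argument giving the number of rows to show. A short sketch with made-up values (toy_scores is a hypothetical data frame, not the lake data):

```r
# Hypothetical data frame with ten rows
toy_scores <- data.frame(id = 1:10, score = seq(50, 95, by = 5))

head(toy_scores)         # default: the first 6 rows
head(toy_scores, n = 3)  # only the first 3 rows
```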


Written for

Institution: University Of California - Berkeley
Course: Introduction to Biostatistics (PBHLTH142)
