Exam (elaborations)

Data Science Student Solutions Manual – Step-by-Step Answers & Learning Guide (2026)

Rating
-
Sold
-
Pages
70
Grade
A+
Uploaded on
25-04-2026
Written in
2025/2026

The DataScience_SSM (Student Solutions Manual) provides detailed step-by-step solutions and explanations for exercises in a data science textbook. It helps learners understand how to apply statistical, computational, and analytical methods used in modern data science. This resource is ideal for students, instructors, and self-learners who want structured guidance in data analysis, modeling, and interpretation.

Introduction to data science workflows
Data collection and cleaning
Exploratory data analysis (EDA)
Probability and statistical reasoning
Data visualization techniques
Regression and predictive modeling
Machine learning fundamentals
Interpretation of data results
Python/R-based analytical thinking (conceptual)

Step-by-step solutions to exercises
Clear explanations of data science methods
Supports coursework and exam preparation
Covers foundational data science concepts

Data Science Student Solutions Manual (SSM) – Step-by-Step Answers and Explanations
The Data Science Student Solutions Manual provides detailed solutions and explanations for textbook exercises. It covers key topics such as data analysis, probability, visualization, regression, and machine learning fundamentals, helping students build strong analytical and problem-solving skills in data science.

Institution
Computer Science
Course
Computer Science

Content preview

https://www.stuvia.com/user/openstaxstudyhub



Principles of Data Science



Chapter 1
What Are Data and Data Science?



Chapter Review
[1.1, LO 1.1.1, 1.1.2]
1. Select the incorrect step and goal pair of the data science cycle.
a. Data collection: collect the data so that you have something for analysis.
b. Data preparation: have the collected data stored in a server as is so that you can start
the analysis.
c. Data analysis: analyze the prepared data to retrieve some meaningful insights.
d. Data reporting: present the data in an effective way so that you can highlight the
insights found from the analysis.


Solution: b. Data preparation: have the collected data stored in a server as is so that you can
start the analysis.
Rarely is collected data already in good shape for analysis. Most of the time, collected data
needs to be processed to be suitable for the analysis of interest. An example of preparation can
be dealing with missing data—removing them or filling them.
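
As a rough illustration of the two options just mentioned, the sketch below uses pandas on a tiny made-up table. The column names and values are hypothetical and do not come from the textbook's datasets. `dropna()` removes rows that contain a missing value, while `fillna()` fills them, here with the mean of the observed values (only one of several possible fill strategies).

```python
import pandas as pd

# Hypothetical toy data; None marks a missing bpm value.
raw = pd.DataFrame({
    "track": ["A", "B", "C", "D"],
    "bpm": [120.0, None, 90.0, 150.0],
})

# Option 1: remove every row that contains a missing value.
removed = raw.dropna()

# Option 2: fill missing bpm values with the mean of the observed ones.
filled = raw.fillna({"bpm": raw["bpm"].mean()})

print(removed)
print(filled)
```

Which option is appropriate depends on the analysis: removing rows shrinks the dataset, while filling keeps every row but introduces estimated values.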

[1.2, LO 1.2.1]
3. Which of the following best exemplifies the interdisciplinary nature of data science in various
fields?
a. A historian traveling to Italy to study ancient manuscripts to uncover historical insights
about the Roman Empire
b. A mathematician solving complex equations to model physical phenomena
c. A biologist analyzing a large dataset of genetic sequences to gain insights about the
genetic basis of diseases
d. A chemist synthesizing new compounds in a laboratory


Solution: c. A biologist analyzing a large dataset of genetic sequences to gain insights about the
genetic basis of diseases
Traditionally, biologists would conduct lab experiments to answer questions in their field;
however, nowadays data science is being used to analyze large datasets to extract valuable
information that can shed light on complex topics such as the genetic basis of diseases. Option
a) is incorrect as studying primary sources does not inherently involve data science. Option b) is
incorrect as solving equations is not in the domain of data science. Option d) is incorrect as it
describes the traditional work of a chemist as a lab scientist.

11/11/24 For more free, peer-reviewed, openly licensed resources visit OpenStax.org.

Critical Thinking
[1.3, LO 1.3.4]
1. For each dataset, list the attributes.
a. Spotify dataset
b. CancerDoc dataset


Solution a: Following is the list of attributes in the Spotify dataset:
track_name, artist(s)_name, artist_count, released_year, released_month, released_day,
in_spotify_playlists, in_spotify_charts, streams, in_apple_playlists, in_apple_charts,
in_deezer_playlists, in_deezer_charts, in_shazam_charts, bpm, key, mode, danceability_%,
valence_%, energy_%, acousticness_%, instrumentalness_%, liveness_%, speechiness_%
Solution b: The CancerDoc dataset has three attributes; however, none of them has
a clear name. They are: the column with numeric identifiers (the first column), the column with
cancer type (the second column), and the actual text (the third column).
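
One way to list attributes programmatically is sketched below with pandas. The two in-memory strings are hypothetical stand-ins for the real files, and the labels `id`, `cancer_type`, and `text` are names chosen here for illustration, since the file itself provides none.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file that has a header row.
with_header = io.StringIO("track_name,bpm,danceability_%\nSong A,120,80\n")
spotify_like = pd.read_csv(with_header)
print(spotify_like.columns.tolist())

# Hypothetical stand-in for a CSV file without a header row.
no_header = io.StringIO('0,Example_Cancer,"some free-form text"\n')
# header=None tells pandas the first line is data, not column names;
# names=... attaches labels that we pick ourselves.
cancer_like = pd.read_csv(
    no_header, header=None, names=["id", "cancer_type", "text"]
)
print(cancer_like.columns.tolist())
```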

[1.3, LO 1.3.2]
3. For each dataset, identify the type of the dataset—structured vs. unstructured. Explain why.
a. Spotify dataset
b. CancerDoc dataset


Solution a: The Spotify dataset is a structured dataset since every item in the dataset has the
same form.
Solution b: The CancerDoc dataset is an unstructured dataset. Its main content is the third
column, which is free-form text; the first and second columns only serve as labels for each
entry (i.e., they are used to distinguish the items in the dataset).

[1.3, LO 1.3.4]
5. Open the WikiHow dataset (ch1-wikiHow.json) and list the attributes of the dataset.
Solution: The ch1-wikiHow.json file has a list of items in an array (i.e., [ ]). Each element of the array is an
object (i.e., { }) with nine attributes in total. The attributes are: “Time”, “URL”,
“MainTask”, “MainTaskSummary”, “Steps”, “Categories”, “Ingredients”, “Requirements”, and
“Tips”.
Note that some attributes hold data in the form of an array as well. For example, “Steps” is an
array in which each element is also an object with three fields: “Headline”, “Description”, and
“Links”.
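
The nested structure can be explored with Python's standard `json` module. The snippet below is a minimal sketch on a made-up miniature of the file: the values are invented and only five of the nine attributes are included, so the printed list is shorter than the real one.

```python
import json

# Hypothetical miniature of the structure described above (values are made up).
sample = """
[
  {
    "Time": "2020-01-01",
    "URL": "https://example.com/how-to",
    "MainTask": "Example task",
    "Steps": [
      {"Headline": "First step", "Description": "Do something.", "Links": []}
    ],
    "Tips": []
  }
]
"""

items = json.loads(sample)        # the outer array becomes a Python list
first = items[0]                  # each element is an object (a dict)
attributes = list(first.keys())   # attribute names of that object
step_fields = list(first["Steps"][0].keys())  # fields of one "Steps" element

print(attributes)
print(step_fields)
```

With the actual file, `json.load()` on the opened ch1-wikiHow.json would take the place of `json.loads(sample)`.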





[1.5, LO 1.5.3]
7. Regenerate the scatterplot of the Spotify dataset, but with a custom title and x-/y-axis label.
The title should be “BPM vs. Danceability.” The x-axis label should be titled “bpm” and range
from the minimum to the maximum bpm value. The y-axis label should be titled “danceability”
and range from the minimum to the maximum Danceability value.
a. Python Matplotlib (Hint: The DataFrame.min() and DataFrame.max() methods
return the min and max values of the DataFrame. You can also call these methods on a specific
column of a DataFrame. For example, if a DataFrame is named df and has a
column named "col1", df["col1"].min() will return the minimum value of the
"col1" column of df.)
b. A spreadsheet program such as MS Excel or Google Sheets (Hint: Calculate the minimum
and maximum value of each column somewhere else first, then simply use the value
when editing the scatterplot.)
Solution a: The following code draws the same scatterplot with the custom title and axis labels.

import matplotlib.pyplot as plt

# `data` is assumed to be the Spotify DataFrame loaded earlier in the chapter
plt.scatter(data["bpm"], data["danceability_%"])  # draw the scatterplot
plt.title("BPM vs. Danceability")  # set the title

# set the x-axis label and its range of values
plt.xlabel("bpm")
plt.xlim(data["bpm"].min(), data["bpm"].max())

# set the y-axis label and its range of values
plt.ylabel("danceability")
plt.ylim(data["danceability_%"].min(), data["danceability_%"].max())

plt.show()
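
For readers without the dataset at hand, here is a minimal self-contained sketch of the same idea using Matplotlib's object-oriented Axes interface. The three-row DataFrame is made up purely for illustration, and the `Agg` backend is an assumption made so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the Spotify DataFrame.
data = pd.DataFrame({"bpm": [90, 120, 150], "danceability_%": [40, 80, 65]})

fig, ax = plt.subplots()
ax.scatter(data["bpm"], data["danceability_%"])
ax.set_title("BPM vs. Danceability")

# Label each axis and clamp its range to the column's min and max.
ax.set_xlabel("bpm")
ax.set_xlim(data["bpm"].min(), data["bpm"].max())
ax.set_ylabel("danceability")
ax.set_ylim(data["danceability_%"].min(), data["danceability_%"].max())

print(ax.get_xlim(), ax.get_ylim())
```

Because the limits equal the data extremes, the outermost points sit exactly on the plot border.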




Solution b: (This solution is based on MS Excel.) You can edit the chart title by double-clicking
the title text. A cursor will show up, and you can edit the title text. The axis labels can be added
by clicking Chart Design > Add Chart Element > Axis Titles. Primary Horizontal and Primary
Vertical will add a text box for the x- and y-axes, respectively. You can edit the text boxes by
double-clicking them.

To set the axis ranges to the minimum and maximum values of the bpm and danceability
columns, you first need to calculate those values in Excel. You can do so by
using =MIN() and =MAX() on each column. Note those values somewhere and enter them in the
text boxes under Format Axis > Axis Options > Bounds. You can open the Format Axis menu by
either 1) double-clicking the axis elements or 2) right-clicking the axis elements and then
selecting Format Axis….


