What is a Series?
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.
Example:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
Labels
If nothing else is specified, the values are labeled with their index number: the first value has index 0,
the second value has index 1, and so on. This label can be used to access a specified value.
Example:
Return the first value of the Series:
print(myvar[0])
Create your own labels with the index argument:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a, index = ["x", "y", "z"])
print(myvar)
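Once custom labels are set, a value can be looked up by its label instead of its position. The following is a minimal sketch reusing the same data; the names y_value and subset are illustrative only and are not part of the original example.

```python
import pandas as pd

# Same data and labels as in the example above.
a = [1, 7, 2]
myvar = pd.Series(a, index=["x", "y", "z"])

# Access a single value by its label.
y_value = myvar["y"]
print(y_value)

# Pass a list of labels to get several values back as a new Series.
subset = myvar[["x", "z"]]
print(subset)
```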
You can also use a key/value object, like a dictionary, when creating a Series.
Note: The keys of the dictionary become the labels.
To select only some of the items in the dictionary, use the index argument and specify only the items
you want to include in the Series.
Example:
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories, index = ["day1", "day2"])
print(myvar)
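To make the effect of the index argument easier to see, the sketch below builds the Series twice from the same dictionary, once without index and once with it. The names all_days and some_days are illustrative only.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# Without the index argument, every dictionary key becomes a label.
all_days = pd.Series(calories)
print(all_days)

# With the index argument, only the listed keys are included.
some_days = pd.Series(calories, index=["day1", "day2"])
print(some_days)
```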
DataFrames
Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
A Series is like a column; a DataFrame is the whole table.
Example:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
myvar = pd.DataFrame(data)
print(myvar)
Pandas uses the loc attribute to return one or more specified rows:
print(myvar.loc[0])
Note: This example returns a Pandas Series.
To return several rows, pass a list of indexes:
print(myvar.loc[[0, 1]])
Note: When a list is passed inside [], the result is a Pandas DataFrame.
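The difference between the two return types can be checked directly. This is a small sketch that rebuilds the same table (named df here) and inspects what loc returns in each case; the names row and rows are illustrative only.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)

# A single index returns one row as a Series.
row = df.loc[0]
print(type(row).__name__)

# A list of indexes returns the selected rows as a DataFrame.
rows = df.loc[[0, 1]]
print(type(rows).__name__)
```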
Read a CSV file
If you have a large DataFrame with many rows, print(df) will only show the first 5 rows and the last 5
rows. To show the entire DataFrame, use:
df.to_string()
Example:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.to_string())
The number of rows that print(df) shows in full is set by pd.options.display.max_rows.
On my system the value is 60, which means that if the DataFrame contains more than 60 rows,
the print(df) statement will return only the headers and the first and last 5 rows. (I found this by running
the following: print(pd.options.display.max_rows).) We can change the maximum with:
import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('data.csv')
print(df)
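If the higher limit is only needed for one print, the setting can also be changed temporarily. The sketch below assumes no data.csv is at hand, so it builds a hypothetical 100-row DataFrame in memory as a stand-in; it uses pd.get_option to read the current limit and pd.option_context to raise it only inside the with-block.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: 100 rows built in memory.
df = pd.DataFrame({"value": range(100)})

# Read the current limit (60 unless it has been changed).
limit = pd.get_option("display.max_rows")
print(limit)

# Raise the limit only inside this block; the old value is restored afterwards.
with pd.option_context("display.max_rows", 9999):
    full_text = repr(df)  # the text that print(df) would show

print(full_text)
```

After the with-block ends, print(df) goes back to the shortened display.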