Series

Say we have a cat that has been sneezing a lot, and we suspect it is allergic to something. To investigate, we count its sneezes every day for one week. The Series data type provided by pandas is a good fit for storing these counts.

Start by importing it:

from pandas import Series  # Note the initial upper-case letter

Creating a Series

There are different ways we can add data to a Series. We start out with a simple list:

sneeze_counts = Series(data=[32, 41, 56, 62, 30, 22, 17])
print(sneeze_counts)

Output

0    32
1    41
2    56
3    62
4    30
5    22
6    17
dtype: int64

Note that the Series automatically adds an index on the left side. It also infers the best-fitting data type for the elements (here int64, a 64-bit integer).
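
To see the type inference at work, the following minimal sketch builds three small Series from made-up values (the variable names are only for illustration) and prints the dtype attribute of each. The exact dtype chosen for text depends on the installed pandas version, so no output is shown for it.

```python
from pandas import Series

# Whole numbers only: pandas picks an integer dtype
int_series = Series(data=[32, 41, 56])

# One decimal value is enough to make the whole Series floating point
float_series = Series(data=[32, 41.5, 56])

# Text values get a generic object or string dtype, depending on the pandas version
text_series = Series(data=["achoo", "sniff"])

print(int_series.dtype)
print(float_series.dtype)
print(text_series.dtype)
```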

Note: If you are not familiar with object-oriented programming, the way this works might catch you off guard. In short, pandas introduces Series as a new data type (like int, str, and the others), so the value of sneeze_counts is the whole series at once.
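
As a small illustration of this idea, the sketch below treats sneeze_counts as a single object: it asks for its type and then calls functionality that operates on all elements at once.

```python
from pandas import Series

sneeze_counts = Series(data=[32, 41, 56, 62, 30, 22, 17])

print(type(sneeze_counts))  # the variable holds one Series object
print(len(sneeze_counts))   # number of elements -> 7
print(sneeze_counts.sum())  # total sneezes over the week -> 260
```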

Extra Information

To make the data a bit more meaningful, let’s set a custom index:

days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
sneeze_counts.index = days_of_week
print(sneeze_counts)

Output

Monday       32
Tuesday      41
Wednesday    56
Thursday     62
Friday       30
Saturday     22
Sunday       17
dtype: int64
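
Once the index holds labels, single values can be looked up by label. The following sketch rebuilds the series from above and shows label-based and position-based access side by side.

```python
from pandas import Series

days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
sneeze_counts = Series(data=[32, 41, 56, 62, 30, 22, 17])
sneeze_counts.index = days_of_week

print(sneeze_counts["Thursday"])    # look up by label -> 62
print(sneeze_counts.loc["Monday"])  # explicit label-based lookup -> 32
print(sneeze_counts.iloc[0])        # position-based lookup still works -> 32
```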

Also, we add a name to the series, so we can distinguish it later:

sneeze_counts.name = "Sneezes"
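
A quick way to check that the name was stored is to read the attribute back or to print the series again. This sketch uses a shortened series to keep the output small.

```python
from pandas import Series

sneeze_counts = Series(data=[32, 41, 56])
sneeze_counts.name = "Sneezes"

print(sneeze_counts.name)  # -> Sneezes
print(sneeze_counts)       # the last line of the output now includes "Name: Sneezes"
```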

All at Once

The index and name can also be passed directly when creating the series.

We suspect that the illness of our cat is related to the weather, so we also log the average temperature and humidity for each day:

temperatures = Series(
    data=[10.9, 8.2, 7.6, 7.8, 9.4, 11.1, 12.4],
    index=days_of_week,
    name="Temperature"
)
humidities = Series(
    data=[62.5, 76.3, 82.4, 98.2, 77.4, 58.9, 41.2],
    index=days_of_week,
    name="Humidity"
)

Alternatively, you can provide the index while creating the series by passing a dictionary. The keys become the index labels and the values become the data:

sneeze_counts = Series(
    data={
        "Monday": 32,
        "Tuesday": 41,
        "Wednesday": 56,
        "Thursday": 62,
        "Friday": 30,
        "Saturday": 22,
        "Sunday": 17
    },
    name="Sneezes"
)
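
Both ways of creating the series should lead to the same result. The sketch below builds the series once from a list plus index and once from a dictionary, then compares them with the equals() method. The names from_list and from_dict are only used for this illustration.

```python
from pandas import Series

days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
counts = [32, 41, 56, 62, 30, 22, 17]

# Variant 1: list of values plus a separate index
from_list = Series(data=counts, index=days_of_week, name="Sneezes")

# Variant 2: dictionary, where the keys become the index labels
from_dict = Series(data=dict(zip(days_of_week, counts)), name="Sneezes")

print(from_list.equals(from_dict))  # -> True
```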

Quick Maths

To get a first statistical impression of the data, use the describe() method:

print(temperatures.describe())

Output

count     7.000000
mean      9.628571
std       1.871465
min       7.600000
25%       8.000000
50%       9.400000
75%      11.000000
max      12.400000
Name: Temperature, dtype: float64
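
If only a single figure is needed, the individual statistics are also available as separate methods. The following sketch shows a few of them for the temperature data. Because describe() itself returns a Series, its entries can be looked up by label as well.

```python
from pandas import Series

days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
temperatures = Series(
    data=[10.9, 8.2, 7.6, 7.8, 9.4, 11.1, 12.4],
    index=days_of_week,
    name="Temperature"
)

print(temperatures.mean())    # average temperature of the week
print(temperatures.median())  # middle value, the "50%" row of describe() -> 9.4
print(temperatures.idxmax())  # label of the warmest day -> Sunday

summary = temperatures.describe()
print(summary["max"])         # describe() returns a Series, so labels work here too -> 12.4
```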

Key Points

A Series is a one-dimensional data structure
You can use an index to label the individual values and a name to label the whole Series