
Combining Matplotlib and Pandas

The matplotlib framework integrates well with the popular pandas data processing library.

Success

If you would like to learn more about pandas, check out our other workshop!

For the following examples we will use our usual data:

months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92, 
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]

water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87, 
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]

Plotting Pandas Series

A pandas Series can be passed directly to the pyplot.plot(…) function. Its index supplies the x values and its data the y values.

from matplotlib import pyplot
from pandas import Series

# … Raw data as above

# We turn it into a series
measurements = Series(
    data=water_levels_2010,
    index=months,
    name="Water levels in 2010"
)

# And plot it
pyplot.plot(measurements, label=measurements.name)  # (1)
pyplot.legend()
pyplot.show()

Explanation

  1. Because the series has a name, we can reuse it as the plot label.
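To make the role of the index more tangible, here is a minimal sketch that draws the same series twice: once by handing the Series to matplotlib as a whole, and once by passing the index and the values explicitly. The side-by-side layout and the variable names `left` and `right` are only illustrative choices.

```python
from matplotlib import pyplot
from pandas import Series

months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]

measurements = Series(
    data=water_levels_2010,
    index=months,
    name="Water levels in 2010"
)

# Two plots next to each other that share the y axis
fig, (left, right) = pyplot.subplots(1, 2, sharey=True)

# Variant 1: pass the series as a whole, matplotlib picks up the index
left.plot(measurements, label=measurements.name)
left.legend()

# Variant 2: pass the x and y values explicitly
right.plot(
    measurements.index.to_numpy(),
    measurements.to_numpy(),
    label=measurements.name
)
right.legend()

# Call pyplot.show() here to display the figure
```

Both panels should show the same curve, with the month names along the x axis.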

Working with DataFrames

Matplotlib can also handle pandas DataFrames as input. Each column is drawn as a separate line:

from pandas import DataFrame
from matplotlib import pyplot

# … Raw data as above

measurements = DataFrame(
    data={
        "Water levels in 2010": water_levels_2010,
        "Water levels in 2020": water_levels_2020
    },
    index=months
)

pyplot.plot(measurements)
pyplot.legend(measurements.columns.values)
pyplot.show()
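If you prefer each line to carry its own label instead of passing the column names to the legend afterwards, one possible approach is to loop over the columns. This is a sketch using the `pyplot.subplots()` interface, which the examples above do not use; it is one option among several.

```python
from matplotlib import pyplot
from pandas import DataFrame

months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]

water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87,
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]

measurements = DataFrame(
    data={
        "Water levels in 2010": water_levels_2010,
        "Water levels in 2020": water_levels_2020
    },
    index=months
)

fig, ax = pyplot.subplots()

# Draw one line per column and use the column name as its label
for column in measurements.columns:
    ax.plot(measurements[column], label=column)

# The legend now picks up the labels attached to the lines
ax.legend()

# Call pyplot.show() here to display the figure
```

Attaching the label to each line keeps the legend correct even if you later change the order in which the columns are plotted.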

pandas DataFrames also come with their own plot() method, which uses matplotlib to draw the figure, so you could also write it this way:

# … Import and data frame creation as before

measurements.plot()
pyplot.show()

Isn’t that convenient? You can find more information on the DataFrame.plot documentation page.
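The plot() method returns the matplotlib Axes it drew on, so the result can still be adjusted with the usual matplotlib calls. The following sketch illustrates this; the title text, the axis label and the marker style are assumptions chosen for the example.

```python
from matplotlib import pyplot
from pandas import DataFrame

months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]

water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87,
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]

measurements = DataFrame(
    data={
        "Water levels in 2010": water_levels_2010,
        "Water levels in 2020": water_levels_2020
    },
    index=months
)

# DataFrame.plot() returns the Axes object it has drawn on
ax = measurements.plot(title="Water levels by month", marker="o")

# From here on it is plain matplotlib
ax.set_xlabel("Month")
ax.grid(True)

# Call pyplot.show() here to display the figure
```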