Task 04: Initial Exploration

This is a suggested solution. It is meant to help you out if you struggle with a certain aspect of the exercise. Your own solution may differ widely and can still be perfectly valid.

Statistics

Since the way to obtain the statistics is the same for all the columns we are interested in, we can do it in a convenient loop.

# Print minimum, maximum and mean for each column of interest.
for topic in [LABEL_TEMP, LABEL_DEW, LABEL_RAIN_1H]:
    print(
        "Statistics on", topic, "…",
        "Min:", weather_data[topic].min(),
        "Max:", weather_data[topic].max(),
        "Mean:", weather_data[topic].mean(),
    )

Hint

For a quick inspection, the describe() method of DataFrames is also a good choice. It summarizes every numeric column at once. Try it!
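
As a minimal sketch of this hint, the following uses a small made-up DataFrame to show what describe() reports. The column names and values are placeholders for illustration only, not the exercise data.

```python
import pandas as pd

# Hypothetical stand-in for weather_data; labels and values are made up.
sample = pd.DataFrame(
    {
        "Temperature": [1.5, 2.0, 3.5, 3.0],
        "Dew Point": [0.5, 1.0, 1.5, 1.0],
    }
)

# describe() returns count, mean, std, min, quartiles and max
# for each numeric column, with the statistic names as the row index.
summary = sample.describe()

# Pick out the three statistics computed manually in the loop above.
print(summary.loc[["min", "max", "mean"]])
```

With the real data, weather_data.describe() should give the same kind of table for every numeric column.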

Calculating the Overall Rain

Since we have hourly reports, it makes sense to add up the 1-hour measurements (instead of the 6-hour summaries that are also given, which would count the same rain several times). It’s a good thing we removed those pesky -1 values from the precipitation column, isn’t it? Otherwise, each of them would have reduced the total.

print("Total precipitation measured:", weather_data[LABEL_RAIN_1H].sum())
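
To see why the placeholder values matter, here is a small sketch with made-up numbers. It assumes the -1 values were replaced with NaN; the earlier cleaning step may have handled them differently.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly precipitation readings; -1 marks a missing value.
raw = pd.Series([0.0, 2.0, -1.0, 1.0, -1.0])

# Replacing the placeholder with NaN lets sum() skip it,
# because sum() ignores NaN by default (skipna=True).
cleaned = raw.replace(-1.0, np.nan)

print("Sum with placeholders:", raw.sum())
print("Sum after cleaning:", cleaned.sum())
```

In this toy series, the two placeholders lower the total by 2.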

Maximum Temperature Differences

To solve this task, we need a new Series with the differences between consecutive rows of data. Pandas offers the diff() method for this exact purpose. Besides the max() and min() methods, there are also the idxmax() and idxmin() methods, which return the index (in our case the date and hour) of the extreme values. Note that the first entry of the result is NaN because the first row has no predecessor; max(), min(), idxmax() and idxmin() skip it by default.

temperature_differences = weather_data[LABEL_TEMP].diff()

print(
    "Largest temperature rise on", temperature_differences.idxmax(),
    "with", temperature_differences.max(), "K in one hour"
)

print(
    "Largest temperature drop on", temperature_differences.idxmin(),
    "with", temperature_differences.min(), "K in one hour"
)
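
The behavior of diff(), idxmax() and idxmin() can be checked on a tiny example. The timestamps and temperatures below are invented for illustration.

```python
import pandas as pd

# Hypothetical hourly temperatures indexed by timestamp.
index = pd.to_datetime(
    [
        "2020-01-01 00:00",
        "2020-01-01 01:00",
        "2020-01-01 02:00",
        "2020-01-01 03:00",
    ]
)
temps = pd.Series([10.0, 12.5, 11.0, 7.0], index=index)

# Each entry is the value minus the previous one; the first entry is NaN.
diffs = temps.diff()

# idxmax()/idxmin() return the index label (here a timestamp),
# while max()/min() return the value itself. NaN is skipped.
print("Largest rise:", diffs.max(), "at", diffs.idxmax())
print("Largest drop:", diffs.min(), "at", diffs.idxmin())
```

The reported timestamp is the end of the hour in which the change happened, since each difference is stored at the later of the two rows.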

Wind Speed Changes

We start by creating the new column. For convenience, we should also introduce a new constant for the label of the column.

LABEL_SPEED_DELTA = "Wind Speed Change"
weather_data[LABEL_SPEED_DELTA] = weather_data[LABEL_SPEED].diff()

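Based on that, we can also figure out the absolute change and its maximum. Taking the absolute value first ensures that a strong decrease counts just as much as a strong increase.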

absolute_speed_delta = weather_data[LABEL_SPEED_DELTA].abs()

print(
    "Largest wind speed change on", absolute_speed_delta.idxmax(),
    "with", absolute_speed_delta.max(), "m/s in one hour"
)
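
A short sketch with invented wind speeds shows why the absolute value is needed: without it, max() would only find the largest increase.

```python
import pandas as pd

# Hypothetical hourly wind speeds in m/s.
speeds = pd.Series([3.0, 5.0, 1.0, 2.0])

# Signed change from one hour to the next (first entry is NaN).
delta = speeds.diff()

# Magnitude of the change, regardless of direction.
absolute_delta = delta.abs()

print("Largest increase:", delta.max(), "at row", delta.idxmax())
print("Largest change:", absolute_delta.max(), "at row", absolute_delta.idxmax())
```

Here the largest change is a drop, which max() on the signed differences alone would miss.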