Gather all the Data
Let's do some statistics! To collect a bundle of values, we do not need a separate variable for each one. A tuple has a fixed size once it is created, so we use a list instead, which can grow and shrink as we go along.
    population_over_time = []  # This is an empty list

    for current_day in range(START_DAY, START_DAY + simulation_duration):
        print("Start of day", current_day)
        (current_population, current_food) = simulate_day(current_population, current_food)
        current_food = current_food + food_per_day
        population_over_time.append(current_population)  # Put the new data point into our list

    print("Population over time:", population_over_time)
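To illustrate how a list grows and shrinks, here is a small sketch. The name `measurements` and its values are made up for this example and are not part of the simulation.

```python
# Hypothetical example data, not part of the simulation
measurements = []              # Start with an empty list
measurements.append(12)        # The list now has one element
measurements.append(15)        # ... now two
measurements.append(9)         # ... now three
print(len(measurements))       # 3

removed = measurements.pop()   # Removes and returns the last element
print(removed)                 # 9
print(measurements)            # [12, 15]
```

`append` adds an element at the end and `pop` takes the last one away again, so the length of the list changes while the program runs.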
You can access the elements of a list via an index, as with tuples.
Also, lists can be used as a data source in for-loops.
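Both kinds of access can be tried out in a short sketch. The values in the list below are hypothetical stand-ins for the simulation results, and `first_value` and `total` are names chosen only for this example.

```python
# Hypothetical sample values standing in for the simulation results
population_over_time = [10, 12, 15, 14]

# Index access works the same way as with tuples
first_value = population_over_time[0]
print("First recorded population:", first_value)

# A list can be the data source of a for-loop
total = 0
for population in population_over_time:
    total = total + population
print("Sum of all recorded populations:", total)
```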
A basic Evaluation
There are some nice built-in functions that we can use for some basic statistics. Many of those accept a list as input.
    # Calculate some statistical values
    gathered_values = len(population_over_time)  # Counts the elements in a list
    lowest_population = min(population_over_time)
    highest_population = max(population_over_time)
    average_population = sum(population_over_time) / gathered_values

    print("We gathered", gathered_values, "data points")
    print("Minimum:", lowest_population, "individuals")
    print("Maximum:", highest_population, "individuals")
    print("Average:", average_population, "individuals")
A less basic Evaluation
Now let's assume that we are also interested in the median value. That is the value in the middle of our list once all entries are sorted.
To understand the inner workings of this, we need to know how list entries can be accessed individually. Each element within a list has its current position given by a number, its so-called index. Indexes are counted starting from 0 and increase by 1 for each passed element.
As an example, consider the list
fruits = [ "apples", "bananas", "cherries", "dates", "elderberries" ].
Its indexes would look like this:

    Index  Element
    0      "apples"
    1      "bananas"
    2      "cherries"
    3      "dates"
    4      "elderberries"

So even though the list has five elements, there is no index 5.
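A short sketch can confirm where the valid indexes end. The `try`/`except` construct is used here only so that the example keeps running after the failed access, and `index_five_exists` is a name chosen for this illustration.

```python
fruits = ["apples", "bananas", "cherries", "dates", "elderberries"]

print(fruits[0])                # apples
print(fruits[len(fruits) - 1])  # elderberries, the last valid index is 4

# Asking for index 5 fails, because counting started at 0
try:
    fruits[5]
    index_five_exists = True
except IndexError:
    index_five_exists = False

print("Is there an index 5?", index_five_exists)
```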
There is also the reverse index, which uses negative numbers for counting and starts with the value -1 for the last element.
The values then decrease by 1 for each element towards the front of the list.
In our example we could access fruits[-1] and it would yield "elderberries", while fruits[-2] would give us "dates".
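The reverse index can be explored with the same list. In this sketch the variable names on the left are chosen for illustration only.

```python
fruits = ["apples", "bananas", "cherries", "dates", "elderberries"]

last_fruit = fruits[-1]        # Last element
second_to_last = fruits[-2]    # One step towards the front
first_fruit = fruits[-5]       # The same element as fruits[0]

print(last_fruit, second_to_last, first_fruit)

# A negative index i refers to the same element as len(fruits) + i
same_element = fruits[len(fruits) + (-2)]
print(same_element == second_to_last)
```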
Now that we have an idea how indexes work, we can pick out the element in the middle of a list, which we need for the median calculation. For a list with an odd number of elements this is fairly straightforward: Take the length of the list and (integer)-divide it by two. The result is the index of the middle element.
Example: Odd-length list
odd_list = [100, 200, 250, 300, 300].
len(odd_list) would give us 5.
len(odd_list) // 2 yields 2.
odd_list[2] would be 250.
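The odd-length case can be checked step by step in code. This is a minimal sketch that assumes the list is already sorted, and the intermediate names are chosen for this example.

```python
odd_list = [100, 200, 250, 300, 300]   # Already sorted

element_count = len(odd_list)          # 5
center_index = element_count // 2      # 2, integer division drops the remainder
median = odd_list[center_index]        # 250

print("Elements:", element_count)
print("Center index:", center_index)
print("Median:", median)
```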
In case of an even-length list, two elements form the center. Again we (integer)-divide the list length by two, which this time gives us the index of the element directly after the center. Subtract one from that element's index to get the element directly before the center. Lastly, we need to calculate the average of those two values to conform to the definition of the median value.
Example: Even-length list
even_list = [100, 200, 250, 400].
len(even_list) would give us 4.
len(even_list) // 2 yields 2.
even_list[2] would be 250.
To calculate the median, we would also need the element before that, which would be even_list[2 - 1], i.e. even_list[1].
This yields the value 200, which we would average with the previously obtained 250 to get a median value of 225.
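The even-length case follows the same pattern with one extra step. Again, this is a sketch that assumes the list is already sorted, with names picked for illustration.

```python
even_list = [100, 200, 250, 400]              # Already sorted

center_index = len(even_list) // 2            # 2
after_center = even_list[center_index]        # 250
before_center = even_list[center_index - 1]   # 200

# The / operator always produces a floating point number
median = (before_center + after_center) / 2

print("Median:", median)                      # Median: 225.0
```

Note that the result is printed as `225.0` rather than `225`, since the division with `/` yields a floating point number even when the result is whole.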
We want to encapsulate all of this into a function, since it is a bunch of maths that we want out of our way once we have written it all down.
    def calculate_median(data):
        """Calculate the median value from a list of numbers.

        To calculate the median value, the data will be sorted by value
        and the value in the center of the sorted list is returned.
        In case of an even-length list, the two values closest to the
        center position will be averaged.
        The list itself will not be affected, all required modifications
        will be done on a copy.

        Args:
            data: A list of numeric values for which the median is to be calculated.

        Returns:
            The median value of the list.
        """
        own_data = data.copy()  # (1) (2)
        own_data.sort()  # (1) (3)

        element_count = len(own_data)
        center_index = element_count // 2

        if element_count % 2 == 0:  # Do we have an even count of elements?
            before_middle = own_data[center_index - 1]
            after_middle = own_data[center_index]
            median = (before_middle + after_middle) / 2
        else:
            median = own_data[center_index]

        return median
- (1) We use the .-notation here to call a function that is only defined in the context of the specific data type. The concept behind it is called object-oriented programming and is a whole workshop on its own. For the purpose of this workshop, you can read a notation like some_thing.some_function() as similar to some_function(some_thing).
- (2) Because we do not want to mess with our original data, we will work on a copy instead.
- (3) The sort-function can sort the elements of a list as long as they can be compared with each other. Note that this modifies the list directly (this is why we use a copy in the first place).
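The effect of working on a copy can be seen in a small sketch. The list `original` and its values are hypothetical. The built-in `sorted()` shown at the end is an alternative that the function above does not use: it returns a new sorted list and leaves its input untouched.

```python
original = [300, 100, 250]     # Hypothetical, unsorted data

working_copy = original.copy()
working_copy.sort()            # Sorts the copy in place

print(original)                # [300, 100, 250], still in its old order
print(working_copy)            # [100, 250, 300]

# Alternative: sorted() creates a new sorted list in a single step
also_sorted = sorted(original)
print(also_sorted)             # [100, 250, 300]
```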
Finally, we can add another statistic to our program:
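A minimal sketch of how this might look. The function is repeated here in shortened form so that the example runs on its own. The sample values and the name `median_population` are assumptions for illustration, and the comparison with `statistics.median` from the standard library is only a cross-check, not part of the workshop program.

```python
import statistics


def calculate_median(data):
    """Shortened restatement of the function above."""
    own_data = data.copy()
    own_data.sort()
    element_count = len(own_data)
    center_index = element_count // 2
    if element_count % 2 == 0:
        return (own_data[center_index - 1] + own_data[center_index]) / 2
    return own_data[center_index]


# Hypothetical values standing in for the simulation results
population_over_time = [12, 15, 9, 20]

median_population = calculate_median(population_over_time)
print("Median:", median_population, "individuals")

# Cross-check against the standard library
print("Matches statistics.median:",
      median_population == statistics.median(population_over_time))
```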
- Lists can bundle up multiple values
- The size of a list is not fixed and may change as the program progresses
- Indexes are numeric values that can access individual elements
This is the code that we have so far: