Data Types Overview¶
This page gives an overview of the most common data types in Python and a short introduction to their peculiarities.
Features of Data Types¶
Each data type adheres to certain rules, and the Python language guarantees that these rules hold.
This section briefly explains the most commonly used of these features and includes a table showing which data types they apply to.
Overview¶
Data Type | Abbr. | Use case | Hashable | Immutable | Indexable | Iterable | Ordered |
---|---|---|---|---|---|---|---|
Boolean | bool | Express whether statements hold | Yes | Yes | - | - | - |
Dictionary | dict | Associate keys with values | No | No | By key | By keys | Yes[^1] |
Floating Point Number | float | Numbers with decimal part | Yes | Yes | - | - | - |
Integer | int | Whole numbers | Yes | Yes | - | - | - |
List | list | Ordered sequence of values | No | No | By position | Yes | Yes |
NoneType[^2] | — | Express no data or unknown | Yes | - | - | - | - |
String | str | Represent Text data | Yes | Yes | By position | Yes | Yes |
Set | set | Mathematical (algebraic) set | No | No | No | Yes | No |
Tuple | tuple | Fixed sequence of values | Yes | Yes | By position | Yes | Yes |
[^1]: Since Python 3.7
[^2]: There is exactly one value of this type, called None
Hashable¶
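An object is hashable if a hash value can be computed for it and that value never changes during the object's lifetime. This is the case for the immutable built-in data types, for example, although a tuple is only hashable if all of its elements are hashable too.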
The hash is used internally by many algorithms for optimizations. Being hashable is most prominently required for objects that you want to use as keys in dictionaries or as elements of sets.
You can use the built-in `hash(…)` function to calculate the hash of a given value (if possible).
… info “Example”
```
>>> hash("Hello World") # type `str` is hashable; the exact value differs between interpreter sessions
3884018592779822938
>>> hash( set() ) # type `set` is not hashable
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
>>> my_dictionary = { set(): 0} # since `set` is not hashable, it cannot be the key of a dictionary
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
```
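As a further sketch of the rule above, a tuple can serve as a dictionary key while a list cannot. The names `is_hashable` and `grid` are made up for this illustration and are not part of Python.

```python
def is_hashable(value):
    """Return True if hash() accepts the value (hypothetical helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# A tuple of hashable elements is hashable, so it works as a dictionary key
grid = {(0, 0): "origin"}  # made-up example data

print(is_hashable((0, 0)))    # True
print(is_hashable([0, 0]))    # False: lists are mutable
print(is_hashable((0, [0])))  # False: the tuple contains an unhashable list
```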
Immutable¶
An object that cannot be changed after creation is considered immutable. A common source of confusion is the assumption that re-assigning a variable changes the underlying object. With immutable objects that is never the case: the variable is simply bound to a different object.
… info “Example”
The `id(…)` of an object can be used to see whether the same object is still assigned to a variable or whether a new object has been created to represent a new value.
Let's look at the **immutable** case:
```
>>> my_value = 5 # Integer numbers are immutable
>>> id(my_value)
10918688
>>> my_value = my_value - 1 # Try to modify the value
>>> id(my_value)
10918656
```
Here we see that the ID of the object has changed.
The calculation created a new object (representing the number `4`) and re-assigned the variable to it.
Now, let's see the **mutable** case:
```
>>> my_list = [1, 2, 3] # Lists are mutable
>>> id(my_list)
140565007220360
>>> my_list.append(4) # Modify the list
>>> id(my_list)
140565007220360
```
In this case the underlying object is the same and the list has actually been modified.
(Im-)Mutability becomes relevant when multiple variables share a reference to the same object or when mutable data types are passed into functions as parameters. With mutable data types, a change made through one variable is also visible through the other.
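The effect of shared references can be sketched as follows. The function `add_item` and the variable names are hypothetical and only serve this illustration.

```python
def add_item(items, item):
    """Append to the list that was passed in (hypothetical helper)."""
    items.append(item)


first = [1, 2, 3]
second = first              # both names now refer to the same list object
second.append(4)
print(first)                # [1, 2, 3, 4]

add_item(first, 5)          # the function modifies the caller's list
print(second)               # [1, 2, 3, 4, 5]

independent = first.copy()  # a shallow copy is a separate list object
independent.append(6)
print(first)                # [1, 2, 3, 4, 5]
```

If a function should not affect the caller's data, passing a copy (or an immutable type such as a tuple) is one way to avoid surprises.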
Indexable¶
If a data type allows for multiple elements, it is considered indexable if the elements can be individually accessed by providing a specific identifier, called an index. The index can be for example the position of an element in case of ordered data structures or the key in case of dictionaries.
Data types like `tuple` or `list` have a numeric index (i.e. the position of the element is given by a number), with the first index being `0`.
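A minimal sketch of access by position and by key, using made-up example data:

```python
letters = ["a", "b", "c"]
point = (2, 3)
ages = {"alice": 31}  # made-up example data

first_letter = letters[0]   # position 0 is the first element
last_letter = letters[-1]   # negative positions count from the end
y = point[1]
alice_age = ages["alice"]   # dictionaries are indexed by key

print(first_letter, last_letter, y, alice_age)  # a c 3 31

# Asking for a position that does not exist raises an IndexError
try:
    letters[3]
except IndexError:
    print("there is no element at position 3")
```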
…info “Further Reading”
More details can be found in this [article about list indexing][list-indexing].
Iterable¶
If a data type allows for multiple elements, it is considered iterable if the elements can be retrieved one-by-one.
This is the prerequisite for being usable as a data source in `for`-loops.
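The following sketch iterates over a string, a tuple and a dictionary. The variable names and values are made up for illustration.

```python
characters = []
for character in "abc":        # a string yields its characters
    characters.append(character)

total = 0
for number in (1, 2, 3):       # a tuple yields its elements in order
    total += number

keys = []
for key in {"x": 10, "y": 20}:  # a dictionary yields its keys
    keys.append(key)

print(characters)  # ['a', 'b', 'c']
print(total)       # 6
print(keys)        # ['x', 'y']
```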
Mutable¶
Opposite of Immutable.
Ordered¶
For data types that may contain multiple elements, it can be important to know whether the elements are arranged according to a defined rule (e.g. by hash, by value or by insertion order). If that is the case, the data type is considered ordered. This matters when iterating over such data types or when trying to optimize algorithms.
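As a small sketch of what ordering means in practice (assuming Python 3.7 or later, with made-up data): a dictionary yields its keys in insertion order, whereas a set carries no order at all, so two sets with the same elements are equal however they were written.

```python
inventory = {}
inventory["pears"] = 3
inventory["apples"] = 5
inventory["kiwis"] = 1

# Keys come back in the order in which they were inserted
print(list(inventory))           # ['pears', 'apples', 'kiwis']

# Order is part of a list's identity, but not of a set's
print([1, 2, 3] == [3, 2, 1])    # False
print({1, 2, 3} == {3, 2, 1})    # True
```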
Basic Data Types¶
This section gives an overview of common data types in Python. While these are not all the data types available, they are commonly used and can already solve a wide range of programming problems. Custom data types can be defined by using Object-oriented Programming.
Integer¶
Integers represent whole numbers (positive as well as negative).
Floating Point Numbers¶
Floating point numbers represent decimal fractions like `3.14`.
They can also be written in scientific notation: `2.5e3` instead of `2500.0`.
On virtually all platforms, Python floats follow the double-precision format of the IEEE 754 standard, which specifies this data type in full.
… warning “Common Pitfalls”
* Using the wrong decimal separator: only `.` is allowed as a decimal separator, regardless of your actual language.
* Mistaking floating point values for integers: `2.0` is a floating point number even if it represents a whole number
* Special value `NaN`: _not a number_ is actually a floating point number by definition. Also `NaN` is not equal to itself.
* Not considering accuracy in sensitive calculations: Floating point numbers suffer from degrading accuracy over many calculations. Pay attention when your model is particularly reliant on precision!
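The pitfalls listed above can be reproduced with a few lines. This is only a sketch; `math.isclose` and `math.isnan` from the standard library are one common way to compare floats and to detect `NaN`.

```python
import math

print(type(2.0).__name__)            # float, even though the value is whole
print(2.0 == 2)                      # True: the values compare equal

print(0.1 + 0.2 == 0.3)              # False: rounding error in binary floats
print(math.isclose(0.1 + 0.2, 0.3))  # True: compare with a tolerance instead

nan = float("nan")
print(nan == nan)                    # False: NaN is not equal to itself
print(math.isnan(nan))               # True: the reliable way to detect NaN
```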
Boolean Values¶
This data type represents the truth value of a statement in Boolean logic, which is used very often in decision making.
Possible values for this data type are `True` and `False` (note the uppercase first letter).
The None-Value¶
`None` is the one and only possible value of the `NoneType`.
It is used to explicitly indicate missing, unknown or irrelevant data.
In this role it often finds its application as a default value for function parameters.
… warning “Common Pitfalls”
* `None` should not be confused with _not a number_ (`NaN`) or the empty string (`""`). Each of these carries a different implicit meaning and behaves differently.
* According to [PEP 8][pep8] it is preferable to check for `None` by using the `is` keyword instead of `==`
```python
if data is None:
    print("No information available")
```
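The use of `None` as a default value for a function parameter can be sketched like this. The function `greet` and its messages are hypothetical.

```python
def greet(name=None):
    """Return a greeting; `name` is optional (hypothetical function)."""
    if name is None:
        return "Hello, stranger!"
    return f"Hello, {name}!"


print(greet())       # Hello, stranger!
print(greet("Ada"))  # Hello, Ada!
print(greet(""))     # Hello, !  (the empty string is not None)
```

Checking with `is None` rather than relying on truthiness keeps the empty string distinguishable from a missing argument.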
Collection Data Types¶
Collections are data types that can hold an arbitrary number of elements (even 0).
All collections are iterable, and all except sets are also indexable.
In literals, the elements of a collection are separated by a `,` (except in the case of strings).
Strings¶
Strings represent sequences of characters, the individual elements that make up a text.
Thus, the string data type is most often used to represent textual information.
In Python, string literals must be enclosed in `"` or `'`.
… info “Example”
```python
my_text = "Hello World"
# Alternatively:
my_text = 'Hello World'
# There is no restriction to the supported characters:
hello_in_japanese = "こんにちは"
```
Within strings, you can use the backslash (`\`) to write several special symbols.
Sequence | Effect |
---|---|
`\\` | Generates a literal `\` |
`\n` | Generates a new line (line break) indicator |
`\"` | Generates a literal `"` |
… info “Example”
```python
print( "She said \"Hello\" and left.\nAnd I was like… :\\" )
```
Will output:
```
She said "Hello" and left.
And I was like… :\
```
… info “Further Reading”
More on [Strings in the Python Documentation][strings-python-doc]
Tuples¶
A tuple is an ordered, immutable sequence of values. 2-tuples are commonly referred to as pairs, 3-tuples as triplets.
… info “Example”
```python
xy_coordinates = (2, 3) # Note that the parentheses are recommended but optional
x_coordinate = xy_coordinates[0] # Elements can be accessed by position index
```
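A few more tuple behaviours, sketched with illustrative variable names: unpacking, the trailing comma required for a one-element tuple, and the error raised when trying to modify a tuple.

```python
xy_coordinates = 2, 3        # same as (2, 3)
x, y = xy_coordinates        # unpacking assigns each element to a name
print(x, y)                  # 2 3

single = (5,)                # a one-element tuple needs a trailing comma
print(type(single).__name__)  # tuple
print(type((5)).__name__)     # int: parentheses alone do not make a tuple

# Tuples are immutable, so item assignment fails
try:
    xy_coordinates[0] = 10
except TypeError:
    print("tuples cannot be modified")
```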
Lists¶
A list is a mutable sequence of values. By default, a list keeps its elements in insertion order; the order can be changed, for example, by sorting the list or by inserting elements in the middle.
… info “Example”
```python
shopping_list = ["milk", "bread", "onions", "butter", "cheese", "tomatoes"]
shopping_list.sort() # Now it is in alphabetic order
```
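The operations that change a list's contents and order can be sketched as follows. `sorted(…)` is shown as an alternative to `sort()` that leaves the original list untouched.

```python
shopping_list = ["milk", "bread"]

shopping_list.append("onions")     # add at the end
shopping_list.insert(1, "butter")  # insert at position 1
shopping_list.remove("milk")       # remove the first matching element
print(shopping_list)               # ['butter', 'bread', 'onions']

alphabetical = sorted(shopping_list)  # returns a new, sorted list
print(alphabetical)                   # ['bread', 'butter', 'onions']
print(shopping_list)                  # ['butter', 'bread', 'onions']
```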
Sets¶
The mathematical construct of a set is represented by the data type of the same name. Sets are mutable but make no guarantee about the order of their elements and therefore cannot be accessed by index. Duplicate elements are not allowed in a set, and every element has to be hashable.
… info “Example”
```python
multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}
multiples_of_2_or_3 = multiples_of_2.union(multiples_of_3)
# Will contain 6 and 12 only once, even though they were present in both sets
# No guarantee on the order of the elements in the resulting set
```
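Two typical uses of sets are removing duplicates and testing membership. The sketch below uses made-up visitor names and wraps results in `sorted(…)` so that the printed order is predictable.

```python
visitors = ["ann", "bob", "ann", "cy", "bob"]  # made-up example data
unique_visitors = set(visitors)                # duplicates are dropped

print(len(unique_visitors))      # 3
print("ann" in unique_visitors)  # True

multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}
print(sorted(multiples_of_2 & multiples_of_3))  # [6, 12] (intersection)
print(sorted(multiples_of_2 - multiples_of_3))  # [2, 4, 8, 10] (difference)
```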
Dictionaries¶
Dictionaries are one of Python's most prominent data types.
They are made up of a collection of key-value pairs, which has kept insertion order since Python 3.7.
Each of these pairs follows the scheme `key: value`.
While there are no restrictions on the values, keys have to be hashable and unique within the dictionary.
The keys further serve as the index into a dictionary making the data type very suitable for data where frequent lookups occur.
… info “Example”
```python
language_codes = {
    "cz": "Czech",
    "de": "German",
    "jp": "Japanese",
    "sw": "Swahili"
}
chosen_language = language_codes["jp"] # chosen_language is "Japanese"
```
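Beyond a plain lookup, the following sketch shows adding an entry, looking up a key that may be missing, and iterating over key-value pairs. The entries are illustrative only.

```python
language_codes = {"de": "German", "sw": "Swahili"}

language_codes["fr"] = "French"             # add a new key-value pair
print("de" in language_codes)               # True: membership tests use keys
print(language_codes.get("xx", "unknown"))  # unknown: default for missing keys

# Indexing with a missing key raises a KeyError
try:
    language_codes["xx"]
except KeyError:
    print("no such key")

# items() yields the pairs in insertion order
for code, name in language_codes.items():
    print(code, name)
```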