Data Types Overview¶

This page provides an overview of the most common data types in Python and gives a short introduction to their peculiarities.

Features of Data Types¶

Depending on the data type at hand, you can expect its values to adhere to certain rules. These rules are guaranteed to hold by the Python language.

In this section you will find a short explanation of the most commonly used of these features and a table detailing which data types they apply to.

Overview¶

Data Type	Abbr.	Use case	Hashable	Immutable	Indexable	Iterable	Ordered
Boolean	bool	Express whether statements hold	Yes	Yes	-	-	-
Dictionary	dict	Associate keys with values	No	No	By key	By keys	Yes[^1]
Floating Point Number	float	Numbers with decimal part	Yes	Yes	-	-	-
Integer	int	Whole numbers	Yes	Yes	-	-	-
List	list	Ordered sequence of values	No	No	By position	Yes	Yes
NoneType[^2]	—	Express no data or unknown	Yes	-	-	-	-
String	str	Represent text data	Yes	Yes	By position	Yes	Yes
Set	set	Mathematical (algebraic) set	No	No	No	Yes	No
Tuple	tuple	Fixed sequence of values	Yes	Yes	By position	Yes	Yes

[^1]: Dictionaries preserve insertion order since Python 3.7.
[^2]: There is exactly one value of this type, called None.

Hashable¶

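An object is hashable if a hash value can be computed for it that does not change during the object's lifetime. This is the case for Python's built-in immutable data types, provided that, for containers such as tuples, all contained elements are hashable as well.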

The hash is used internally by many algorithms for optimizations. Being hashable is most prominently required for objects that you want to use as keys in dictionaries or as elements of sets.

You can use the built-in hash(…) function to calculate the hash of a given value (if possible).

Example

>>> hash("Hello World")  # type `str` is hashable; the exact value usually differs between interpreter sessions
3884018592779822938

>>> hash( set() )  # type `set` is not hashable
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

>>> my_dictionary = {set(): 0}  # a set is not hashable, so it cannot be used as a dictionary key
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
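Immutability alone does not guarantee hashability: a tuple is only hashable if every element it contains is hashable too. The following minimal sketch illustrates this. The helper `is_hashable` is a hypothetical name introduced only for this example.

```python
def is_hashable(value):
    """Return True if hash() accepts the value (hypothetical helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


print(is_hashable("Hello World"))  # True
print(is_hashable((1, 2, 3)))      # True: a tuple of integers
print(is_hashable((1, [2, 3])))    # False: the tuple contains a list
print(is_hashable([1, 2, 3]))      # False: lists are mutable
```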

Immutable¶

An object that cannot be changed after creation is considered immutable. A common source of confusion is the assumption that re-assigning a variable changes the underlying object. With immutable objects this is never the case: the variable is simply bound to a different object.

Example

The id(…) of an object can be used to see whether a variable still refers to the same object or whether a new object has been created to represent the new value.

Let’s look at the immutable case:

>>> my_value = 5  # Integer numbers are immutable
>>> id(my_value)
10918688
>>> my_value = my_value - 1  # Try to modify the value
>>> id(my_value)
10918656

Here we see that the ID of the object has changed. The calculation created a new object (representing the number 4) and the variable was re-assigned to it.

Now, let’s see the mutable case:

>>> my_list = [1, 2, 3]  # Lists are mutable
>>> id(my_list)
140565007220360
>>> my_list.append(4)  # Modify the list
>>> id(my_list)
140565007220360

In this case the underlying object is the same and the list has actually been modified.

(Im-)Mutability becomes relevant when multiple variables share a reference to the same object or when mutable data types are passed into functions as parameters. In the case of mutable data types, a change made via one variable is also visible through the other one.
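The effect of shared references can be sketched as follows. The function `add_item` and all variable names are hypothetical and only serve as illustration.

```python
def add_item(items):
    """Append to the list that was passed in (this modifies the caller's object)."""
    items.append("new")


original = [1, 2, 3]
alias = original               # both names refer to the same list object
alias.append(4)
print(original)                # [1, 2, 3, 4]

add_item(original)             # the function receives a reference, not a copy
print(original)                # [1, 2, 3, 4, 'new']

independent = original.copy()  # a shallow copy is a separate object
independent.append("only here")
print(original)                # [1, 2, 3, 4, 'new']
```

If a function should not modify its argument, passing a copy (or using an immutable type such as a tuple) is one way to avoid surprises.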

Indexable¶

If a data type allows for multiple elements, it is considered indexable if the elements can be accessed individually by providing a specific identifier, called an index. The index can be, for example, the position of an element in the case of ordered data structures, or the key in the case of dictionaries.

Data types like tuple or list have a numeric index (i.e. the position of the element is given by a number), with the first index being 0.
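A short sketch of both kinds of index, using made-up example data:

```python
weekdays = ["Mon", "Tue", "Wed"]
print(weekdays[0])            # Mon (the first element has index 0)
print(weekdays[-1])           # Wed (negative indices count from the end)

opening_hours = {"Mon": "9-17", "Sat": "10-14"}
print(opening_hours["Sat"])   # 10-14 (dictionaries are indexed by key)

try:
    weekdays[3]               # there is no element at position 3
except IndexError as error:
    print("IndexError:", error)
```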

Iterable¶

If a data type allows for multiple elements, it is considered iterable if the elements can be retrieved one-by-one. This is the prerequisite for being usable as a data source in for-loops.
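As a minimal illustration (the data is invented for this example), a for-loop works the same way on every iterable type:

```python
for letter in "abc":                  # strings yield their characters
    print(letter)

for number in (1, 2, 3):              # tuples and lists yield their elements
    print(number)

prices = {"milk": 1.2, "bread": 2.5}
for item in prices:                   # dictionaries yield their keys
    print(item, prices[item])
```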

Mutable¶

The opposite of immutable: a mutable object can be changed after its creation.

Ordered¶

For data types that may contain multiple elements, it can be important to know whether their elements are arranged according to a certain rule (e.g. by hash, by value or by insertion order). If that is the case, these data types are considered to be ordered. This can become important when iterating over such data types or when trying to optimize algorithms.
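The following sketch, using invented example data, shows two consequences of ordering: dictionaries iterate in insertion order, and element order matters when comparing lists but not when comparing sets.

```python
visits = {}
visits["zoe"] = 3
visits["adam"] = 1
visits["mia"] = 2
print(list(visits))        # ['zoe', 'adam', 'mia'] (insertion order is kept)
print(sorted(visits))      # ['adam', 'mia', 'zoe'] (a new, sorted list of keys)

print([1, 2] == [2, 1])    # False: lists are ordered
print({1, 2} == {2, 1})    # True: sets have no order
```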

Basic Data Types¶

This section gives an overview of common data types in Python. While these are not all the data types available, they are commonly used and can already solve a wide range of programming problems. Custom data types can be defined by using Object-oriented Programming.

Integer¶

Integers represent whole numbers (positive as well as negative).

Floating Point Numbers¶

Floating point numbers represent numbers with a fractional part, like 3.14. They can also be written in scientific notation: 2.5e3 instead of 2500.0. A full specification of the underlying number format can be found in the IEEE 754 standard.

Common Pitfalls

Using the wrong decimal separator: only . is allowed as a decimal separator, regardless of your language or locale.
Mistaking floating point values for integers: 2.0 is a floating point number even though it represents a whole number.
Special value NaN: "not a number" is, by definition, itself a floating point value. In addition, NaN is not equal to any value, not even to itself.
Not considering accuracy in sensitive calculations: floating point numbers have limited precision, and rounding errors can accumulate over many calculations. Pay attention when your model is particularly reliant on precision!
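The pitfalls above can be reproduced with a few lines. This is a minimal sketch using only the standard library module `math`:

```python
import math

print(2.5e3)                          # 2500.0 (scientific notation yields a float)
print(type(2.0).__name__)             # float
print(2.0 == 2)                       # True: equal value, but different types

print(0.1 + 0.2 == 0.3)               # False: a small rounding error remains
print(math.isclose(0.1 + 0.2, 0.3))   # True: compare with a tolerance instead

nan = float("nan")
print(nan == nan)                     # False: NaN is not equal to itself
print(math.isnan(nan))                # True: the reliable way to detect NaN
```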

Boolean Values¶

This data type represents the truth value of a statement in Boolean logic, which is used very often in decision making. The possible values of this data type are True and False (note the uppercase first letter).

The None-Value¶

None is the one and only possible value of the NoneType. It is used to explicitly indicate missing, unknown or irrelevant data. In this role it is often used as a default value for function parameters.
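A minimal sketch of None as a default parameter value. The function `greet` is a hypothetical example, not part of any library.

```python
def greet(name=None):
    """Return a greeting; name=None means that no name was supplied."""
    if name is None:
        return "Hello, stranger!"
    return f"Hello, {name}!"


print(greet())        # Hello, stranger!
print(greet("Ada"))   # Hello, Ada!
```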

Common Pitfalls

None should not be confused with not a number (NaN) or the empty string (""). Each of these carries a different implicit meaning and behaves differently.
According to PEP 8 it is preferable to check for None by using the is keyword instead of ==:
```
data = None  # placeholder value for illustration
if data is None:
    print("No information available")
```

Collection Data Types¶

Collections are data types which can hold an arbitrary number of elements (even 0). All collections are iterable, and most of them are indexable (sets are the exception). In literals, the elements of a collection are separated by a , (except in the case of strings).

Strings¶

Strings represent sequences of characters, the individual elements that make up a text. Thus, the string data type is most often used to represent textual information. In Python, string literals must be enclosed in " or '.

Example

my_text = "Hello World"

# Alternatively, you may use single quotes:
my_text = 'Hello World'

# You may use all characters defined in the Unicode standard:
hello_in_japanese = "こんにちは"

Within Strings, you can use the backslash (\) for several special symbols.

Sequence	Effect
`\\`	Generates a literal `\`
`\n`	Generates a new line (linebreak) indicator
`\"`	Generates a literal `"`

Example

print( "She said \"Hello\" and left.\nAnd I was like… :\\" )

Will output:

She said "Hello" and left.
And I was like… :\

Tuples¶

A tuple is an ordered, immutable sequence of values. 2-tuples are commonly referred to as pairs, 3-tuples as triples.

Example

xy_coordinates = (2, 3)  # Note that the parentheses are recommended but optional
x_coordinate = xy_coordinates[0]  # Elements can be accessed by position index
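Two further properties of tuples, sketched with hypothetical variable names: they can be unpacked into individual variables, and any attempt to modify them raises a TypeError.

```python
point = (2, 3)
x, y = point             # unpacking assigns each element to its own name
print(x, y)              # 2 3

try:
    point[0] = 5         # tuples cannot be modified in place
except TypeError as error:
    print("TypeError:", error)

single = (42,)           # a one-element tuple needs a trailing comma
print(len(single))       # 1
```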

Lists¶

A list is a mutable, ordered sequence of values. By default, the elements of a list keep the order in which they were inserted. The order can, however, be changed by sorting the list, by inserting elements in the middle, or by other operations.

Example

shopping_list = ["milk", "bread", "onions", "butter", "cheese", "tomatoes"]
shopping_list.sort()  # Now it is in alphabetical order
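Some of the other modifying operations can be sketched like this (the list contents are made up for illustration):

```python
tasks = ["write", "test"]
tasks.append("deploy")       # add an element at the end
tasks.insert(1, "review")    # insert an element at position 1
print(tasks)                 # ['write', 'review', 'test', 'deploy']

tasks[0] = "design"          # replace an element by position
tasks.remove("test")         # remove the first occurrence of a value
print(tasks)                 # ['design', 'review', 'deploy']

print(sorted(tasks))         # ['deploy', 'design', 'review'] (new list, tasks is unchanged)
```

Note the difference between `sorted(…)`, which returns a new list, and the `sort()` method, which changes the list in place.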

Sets¶

The mathematical construct of a set is represented by the data type of the same name. Sets are mutable, but they make no guarantee about the order of their elements and therefore cannot be accessed by index. A set never contains duplicate elements: adding a value that is already present has no effect.

Example

multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}
multiples_of_2_or_3 = multiples_of_2.union(multiples_of_3)
# Will contain 6 and 12 only once, even though they were present in both sets
# No guarantee on the order of the elements in the resulting set
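Besides the union, sets support the other usual set operations. A minimal sketch, reusing the two sets from the example above (`sorted(…)` is only used here to print the elements in a predictable order):

```python
multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}

common = multiples_of_2 & multiples_of_3    # intersection
print(sorted(common))                       # [6, 12]

only_2 = multiples_of_2 - multiples_of_3    # difference
print(sorted(only_2))                       # [2, 4, 8, 10]

print(9 in multiples_of_3)                  # True (membership test)

unique = set([1, 1, 2, 3, 3])               # duplicates are discarded
print(len(unique))                          # 3
```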

Dictionaries¶

Dictionaries are one of Python's most prominent data types. They consist of a collection of key-value pairs, which has been ordered by insertion since Python 3.7. Each of these pairs follows the scheme key: value. While there are no restrictions on the values, keys have to be hashable and unique within the dictionary. The keys also serve as the index into a dictionary, which makes the data type very suitable for data with frequent lookups.

Example

language_codes = {
    "cz": "Czech",
    "de": "German",
    "jp": "Japanese",
    "sw": "Swahili"
}
chosen_language = language_codes["jp"]  # chosen_language is "Japanese"
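A few more common dictionary operations, sketched with a shortened version of the dictionary above:

```python
language_codes = {"cz": "Czech", "de": "German"}

language_codes["sw"] = "Swahili"     # add a new key-value pair
language_codes["de"] = "Deutsch"     # overwrite the value of an existing key

print("jp" in language_codes)                # False (membership tests check the keys)
print(language_codes.get("jp", "unknown"))   # unknown (no KeyError for a missing key)

for code, name in language_codes.items():    # iterates in insertion order
    print(code, name)
```

Accessing a missing key with `language_codes["jp"]` would raise a KeyError, so `get(…)` with a default value is a convenient alternative when a key may be absent.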