Data Types Overview

This page gives an overview of the most common data types in Python and a short introduction to their peculiarities.

Features of Data Types

Each data type adheres to certain rules, and the Python language guarantees that these rules hold.

This section briefly explains the most commonly used of these features; the table below shows which data types they apply to.

Overview

| Data Type | Abbr. | Use case | Hashable | Immutable | Indexable | Iterable | Ordered |
|---|---|---|---|---|---|---|---|
| Boolean | bool | Express whether statements hold | Yes | Yes | - | - | - |
| Dictionary | dict | Associate keys with values | No | No | By key | By keys | Yes[^1] |
| Floating Point Number | float | Numbers with decimal part | Yes | Yes | - | - | - |
| Integer | int | Whole numbers | Yes | Yes | - | - | - |
| List | list | Ordered sequence of values | No | No | By position | Yes | Yes |
| NoneType[^2] | | Express no data or unknown | Yes | - | - | - | - |
| String | str | Represent text data | Yes | Yes | By position | Yes | Yes |
| Set | set | Mathematical (algebraic) set | No | No | No | Yes | No |
| Tuple | tuple | Fixed sequence of values | Yes | Yes | By position | Yes | Yes |

[^1]: Since Python 3.7
[^2]: There is exactly one value of this type, called None

Hashable

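An object is hashable if a hash value can be computed for it by a so-called hash function and that value never changes during the object's lifetime. This is the case for the built-in immutable data types, for example (a tuple, however, is only hashable if all of its elements are).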

The hash is used internally by many algorithms for optimizations. Being hashable is most prominently required for objects that you want to use as keys in dictionaries or as elements of sets.

You can use the built-in hash(…) function to calculate the hash of a given value (if possible).

Example

>>> hash("Hello World")  # type `string` is hashable
3884018592779822938

>>> hash( set() )  # type `set` is not hashable
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

>>> my_dictionary = { set(): 0}  # since 'set' is not hashable, it can not be the key of a dictionary
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
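
One caveat may be worth a quick sketch: a tuple is only hashable when every element it holds is hashable too. The variable names below are made up for illustration.

```python
# A tuple of hashable elements is hashable and can serve as a dictionary key.
point = (2, 3)
distances = {point: 3.6}

# A tuple that contains a list is immutable itself, but not hashable.
mixed = (1, [2, 3])
try:
    hash(mixed)
    mixed_is_hashable = True
except TypeError:
    mixed_is_hashable = False

print(mixed_is_hashable)  # False
```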

Immutable

An object that cannot be changed after creation is considered immutable. A common source of confusion is the belief that assigning a new value to a variable changes the underlying object. For immutable objects this is not the case: the variable is simply bound to a different object.

Example

The id(…) of an object can be used to see whether a variable still refers to the same object or whether a new object has been created to represent the new value.

Let’s look at the immutable case:

>>> my_value = 5  # Integer numbers are immutable
>>> id(my_value)
10918688
>>> my_value = my_value - 1  # Try to modify the value
>>> id(my_value)
10918656

Here we see that the ID of the object has changed. The calculation has created a new object (representing the number 4), and the variable has been re-assigned to it.

Now, let’s see the mutable case:

>>> my_list = [1, 2, 3]  # Lists are mutable
>>> id(my_list)
140565007220360
>>> my_list.append(4)  # Modify the list
>>> id(my_list)
140565007220360

In this case the underlying object is the same and the list has actually been modified.

(Im-)Mutability becomes relevant when multiple variables share a reference to the same object or when mutable data types are passed into functions as arguments. With mutable data types, a change made via one variable is also visible through the other one.
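
The effect of shared references can be sketched as follows. The function add_item and the variable names are hypothetical and only serve as an illustration.

```python
def add_item(items):
    """Append to the list that was passed in (modifies the caller's object)."""
    items.append("new")


original = [1, 2, 3]
alias = original          # both names refer to the same list object
alias.append(4)
print(original)           # [1, 2, 3, 4]

add_item(original)
print(original)           # [1, 2, 3, 4, 'new']

# Make a copy when an independent list is wanted.
independent = list(original)
independent.append(99)
print(99 in original)     # False
```

Copying with list(…) creates a new list object, so later changes to the copy no longer affect the original.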

Indexable

If a data type allows for multiple elements, it is considered indexable if the elements can be individually accessed by providing a specific identifier, called an index. The index can be, for example, the position of an element in ordered data structures or the key in dictionaries.

Data types like tuple or list have a numeric index (i.e. the position of the element is given by a number), with the first index being 0.
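
A minimal sketch of both kinds of index, using made-up example data. Negative positions, which count from the end, are shown as an additional convenience.

```python
letters = ["a", "b", "c"]
first = letters[0]        # position 0 is the first element
last = letters[-1]        # negative positions count from the end

ages = {"alice": 31}
age = ages["alice"]       # dictionaries are indexed by key

try:
    letters[3]            # only positions 0, 1 and 2 exist
except IndexError as error:
    message = str(error)

print(first, last, age)   # a c 31
print(message)            # list index out of range
```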

Further Reading

More details can be found in this article about list indexing.

Iterable

If a data type allows for multiple elements, it is considered iterable if the elements can be retrieved one-by-one. This is the prerequisite for being usable as a data source in for-loops.
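
As a small illustration with made-up data, the loops below retrieve elements one by one from a string and from a dictionary, while a plain number turns out not to be iterable.

```python
collected = []
for character in "abc":          # strings yield their characters
    collected.append(character)

for key in {"x": 1, "y": 2}:     # dictionaries yield their keys
    collected.append(key)

print(collected)  # ['a', 'b', 'c', 'x', 'y']

# Numbers are not iterable, so iter() raises a TypeError for them.
try:
    iter(5)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False

print(int_is_iterable)  # False
```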

Mutable

Opposite of Immutable.

Ordered

For data types that may contain multiple elements, it can be important to know whether they follow a defined rule by which their elements are arranged (e.g. by hash, by value or by insertion order). If that is the case, these data types are considered to be ordered. This can become important when iterating over such data types or trying to optimize algorithms.
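
The difference can be sketched with a few comparisons; the example values are arbitrary.

```python
# Lists keep the order in which elements were added.
numbers = [3, 1, 2]
print(numbers)                    # [3, 1, 2]

# Dictionaries keep the insertion order of their keys (Python 3.7+).
scores = {}
scores["zoe"] = 1
scores["adam"] = 2
print(list(scores))               # ['zoe', 'adam']

# Sets compare equal regardless of the order used to write them,
# and their iteration order should not be relied upon.
print({1, 2, 3} == {3, 2, 1})     # True
print([1, 2, 3] == [3, 2, 1])     # False
```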

Basic Data Types

This section gives an overview of common data types in Python. While these are not all the data types available, they are commonly used and can already solve a wide range of programming problems. Custom data types can be defined by using Object-oriented Programming.

Integer

Integers represent whole numbers (positive as well as negative).

Floating Point Numbers

Floating point numbers represent numbers with a fractional part, like 3.14. They can also be written in scientific notation: 2.5e3 instead of 2500.0. The full specification of this data type can be found in the IEEE 754 standard.

Common Pitfalls

  • Using the wrong decimal separator: only . is allowed as a decimal separator, regardless of your locale.
  • Mistaking floating point values for integers: 2.0 is a floating point number even if it represents a whole number.
  • Special value NaN: despite its name, not a number (NaN) is a floating point value. Also, NaN is not equal to itself.
  • Not considering accuracy in sensitive calculations: floating point numbers have limited precision, and rounding errors can accumulate over many calculations. Pay attention when your model is particularly reliant on precision!
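
Several of these pitfalls can be reproduced in a few lines. The sketch below uses math.isnan and math.isclose from the standard library as one possible way to deal with them.

```python
import math

print(type(2.0).__name__)         # float
print(2.5e3)                      # 2500.0

nan = float("nan")
print(nan == nan)                 # False
print(math.isnan(nan))            # True

total = 0.1 + 0.2
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True
```

Comparing with a tolerance, as math.isclose does, is usually more robust than testing floating point numbers for exact equality.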

Boolean Values

This data type represents the truth value of a statement in boolean logic, which is very often used in decision making. The possible values are True and False (note the uppercase first letter).
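
A short sketch of a boolean driving a decision; the temperature thresholds are arbitrary example values.

```python
temperature = 25
is_warm = temperature > 20        # comparisons produce a bool
print(is_warm)                    # True

if is_warm and not temperature > 35:
    advice = "T-shirt weather"
else:
    advice = "Check the forecast"

print(advice)                     # T-shirt weather
```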

The None-Value

None is the one and only possible value of the NoneType. It is used to explicitly indicate missing, unknown or irrelevant data. In this role it is often used as a default value for function parameters.
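
The use as a default parameter value might look like this. The function greet is a hypothetical example, not part of any library.

```python
def greet(name=None):
    """Return a greeting; a missing name falls back to a generic one."""
    if name is None:
        return "Hello, stranger!"
    return f"Hello, {name}!"


print(greet())           # Hello, stranger!
print(greet("Ada"))      # Hello, Ada!
print(greet(""))         # Hello, ! (the empty string is not None)
```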

Common Pitfalls

  • None should not be confused with not a number (NaN) or the empty string (""). Each of these carries a different implicit meaning and behaves differently.
  • According to PEP 8, it is preferable to check for None by using the is keyword instead of ==:

    if data is None:
        print("No information available")
    

Collection Data Types

Collections are data types which can hold an arbitrary number of elements (even 0). All collections described here are iterable, and all except sets are indexable. In literals, the elements of a collection are separated by a , (except in the case of Strings).

Strings

Strings represent sequences of characters, the individual elements that make up a text. Thus, the string data type is most often used to represent textual information. In Python, string literals must be surrounded with " or '.

Example

my_text = "Hello World"

# Alternatively, you may use single quotes:
my_text = 'Hello World'

# You may use any character defined in the Unicode standard:
hello_in_japanese = "こんにちは"

Within Strings, you can use the backslash (\) for several special symbols.

| Sequence | Effect |
|---|---|
| `\\` | Generates a literal `\` |
| `\n` | Generates a new line (linebreak) indicator |
| `\"` | Generates a literal `"` |

Example

print( "She said \"Hello\" and left.\nAnd I was like… :\\" )

Will output:

She said "Hello" and left.
And I was like… :\

Further Reading

More on Strings in the Python Documentation

Tuples

A tuple is an ordered, immutable sequence of values. 2-tuples are commonly referred to as pairs, 3-tuples as triples.

Example

xy_coordinates = (2, 3)  # Note that the parentheses are recommended but optional
x_coordinate = xy_coordinates[0]  # Elements can be accessed by position index
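
Two further properties of tuples can be sketched briefly: they can be unpacked into separate variables, and they reject modification. The name single is made up for illustration.

```python
xy_coordinates = (2, 3)
x, y = xy_coordinates             # unpack into separate variables

try:
    xy_coordinates[0] = 5         # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

single = (42,)                    # a one-element tuple needs a trailing comma

print(x, y)                       # 2 3
print(changed)                    # False
```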

Lists

A list is a mutable sequence of values. By default, lists keep their elements in insertion order; however, the order can be changed by sorting the list, inserting elements in the middle, and other operations.

Example

shopping_list = ["milk", "bread", "onions", "butter", "cheese", "tomatoes"]
shopping_list.sort()  # Now it is in alphabetic order
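
A few more list operations, sketched on a shorter version of the shopping list. Unlike the sort() method, the built-in sorted(…) leaves the original list untouched.

```python
shopping_list = ["milk", "bread", "onions"]
shopping_list.append("butter")        # add at the end
shopping_list.insert(1, "cheese")     # add at position 1
print(shopping_list)   # ['milk', 'cheese', 'bread', 'onions', 'butter']

by_name = sorted(shopping_list)       # sorted() returns a new list
print(by_name)         # ['bread', 'butter', 'cheese', 'milk', 'onions']
print(shopping_list[0])               # milk (the original is unchanged)
```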

Sets

The mathematical construct of a set is represented by the data type of the same name. Sets are mutable but make no guarantee about the order of their elements, so elements cannot be accessed by index. Duplicate elements are not allowed in a set.

Example

multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}
multiples_of_2_or_3 = multiples_of_2.union(multiples_of_3)
# Will contain 6 and 12 only once, even though they were present in both sets
# No guarantee on the order of the elements in the resulting set
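
Other set operations follow the same pattern. The sketch below reuses the two sets from above; the names multiples_of_6, only_even and unique_letters are made up for illustration.

```python
multiples_of_2 = {2, 4, 6, 8, 10, 12}
multiples_of_3 = {3, 6, 9, 12}

multiples_of_6 = multiples_of_2 & multiples_of_3    # intersection
only_even = multiples_of_2 - multiples_of_3         # difference
print(multiples_of_6 == {6, 12})                    # True
print(9 in multiples_of_3)                          # True

# Duplicates collapse when a set is built from a list.
unique_letters = set(["a", "b", "a"])
print(len(unique_letters))                          # 2
```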

Dictionaries

Dictionaries are one of Python's most prominent data types. They consist of a collection of key-value pairs (ordered by insertion since Python 3.7). Each of these pairs follows the scheme key: value. While there are no restrictions on the values, keys have to be hashable and unique within the dictionary. The keys also serve as the index into a dictionary, which makes the data type very suitable for data where frequent lookups occur.

Example

language_codes = {
    "cz": "Czech",
    "de": "German",
    "jp": "Japanese",
    "sw": "Swahili"
}
chosen_language = language_codes["jp"]  # chosen_language is "Japanese"
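
Common lookups and updates can be sketched as follows, using a shortened dictionary. The keys "xx" and "fr" are illustrative additions.

```python
language_codes = {"de": "German", "sw": "Swahili"}

print("de" in language_codes)                  # True
print(language_codes.get("xx", "unknown"))     # unknown

language_codes["fr"] = "French"                # add a new pair
language_codes["de"] = "Deutsch"               # overwrite an existing value

for code, name in language_codes.items():      # iterates in insertion order
    print(code, name)
```

Using get(…) with a fallback avoids the KeyError that indexing with a missing key would raise.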