What's new in Python 3.11?
The first beta release of Python 3.11 is out, bringing some fascinating features for us to tinker with. Here is what you can expect from the final release later in 2022.
Even better error messages
Python 3.10 gave us better error messages in various regards, but Python 3.11 aims to improve them even more. Some of the most important things that are added to error messages in Python 3.11 are:
Exact error locations in tracebacks
Until now, the only information a traceback gave you about where an exception was raised was the line number. The problem could have been anywhere on that line, though, so sometimes this was not enough.
Here's an example:
def get_margin(data):
    margin = data['profits']['monthly'] / 10 + data['profits']['yearly'] / 2
    return margin

data = {
    'profits': {
        'monthly': 0.82,
        'yearly': None,
    },
    'losses': {
        'monthly': 0.23,
        'yearly': 1.38,
    },
}
print(get_margin(data))
This code results in an error, because one of these fields in the dictionary is None. This is what we get:
Traceback (most recent call last):
  File "/Users/tusharsadhwani/code/marvin-python/mytest.py", line 15, in <module>
    print(get_margin(data))
  File "/Users/tusharsadhwani/code/marvin-python/mytest.py", line 2, in get_margin
    margin = data['profits']['monthly'] / 10 + data['profits']['yearly'] / 2
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
But the traceback alone cannot tell you which part of the calculation caused the error.
On 3.11 however:
Traceback (most recent call last):
  File "asd.py", line 15, in <module>
    print(get_margin(data))
          ^^^^^^^^^^^^^^^^
  File "asd.py", line 2, in get_margin
    margin = data['profits']['monthly'] / 10 + data['profits']['yearly'] / 2
                                               ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
It is now clear that data['profits']['yearly'] was None.
To render this information, end line and end column data (end_line and end_col) was added to Python code objects. You can also access this information directly through the obj.__code__.co_positions() method.
Notes for exceptions
To make tracebacks even richer in context, Python 3.11 lets you add notes to exception objects. The notes are stored on the exception and displayed in the traceback when it is printed.
Take this code for example, where we add important information about some API data conversion logic:
def get_seconds(data):
    try:
        milliseconds = float(data['milliseconds'])
    except ValueError as exc:
        exc.add_note(
            "The time field should always be a number, this is a critical bug. "
            "Please report this to the backend team immediately."
        )
        raise  # re-raises the exception, instead of silencing it

    seconds = milliseconds / 1000
    return seconds

get_seconds({'milliseconds': 'foo'})  # 'foo' is not a number!
This added note gets printed just below the Exception message:
Traceback (most recent call last):
  File "asd.py", line 14, in <module>
    get_seconds({"milliseconds": "foo"})
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "asd.py", line 3, in get_seconds
    milliseconds = float(data["milliseconds"])
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: 'foo'
The time field should always be a number, this is a critical bug. Please report this to the backend team immediately.
Built-in TOML support
The standard library now has built-in support for reading TOML files, using the tomllib module:
import tomllib

with open('.deepsource.toml', 'rb') as file:
    data = tomllib.load(file)
tomllib is based on an open source TOML parsing library called tomli. Currently, only reading TOML files is supported. If you need to write data to a TOML file, consider using the tomli-w package.
asyncio Task Groups
When doing asynchronous programming, you often run into situations where you have to trigger many tasks to run concurrently, and then take some action when they are completed. For example, downloading a bunch of images in parallel, and then bundling them to a zip file at the end.
To do that, you need to collect the tasks and pass them to asyncio.gather. Here's a simple example of running tasks concurrently with the gather function:
import asyncio

async def simulate_flight(city, departure_time, duration):
    await asyncio.sleep(departure_time)
    print(f"Flight for {city} departing at {departure_time}PM")
    await asyncio.sleep(duration)
    print(f"Flight for {city} arrived.")

flight_schedule = {
    'Boston': [3, 2],
    'Detroit': [7, 4],
    'New York': [1, 9],
}

async def main():
    tasks = []
    for city, (departure_time, duration) in flight_schedule.items():
        tasks.append(simulate_flight(city, departure_time, duration))
    await asyncio.gather(*tasks)
    print("Simulations done.")

asyncio.run(main())
$ python asd.py
Flight for New York departing at 1PM
Flight for Boston departing at 3PM
Flight for Boston arrived.
Flight for Detroit departing at 7PM
Flight for New York arrived.
Flight for Detroit arrived.
Simulations done.
But having to maintain a list of tasks yourself just to await them is a bit clunky. So Python 3.11 adds a new asyncio API called Task Groups:
import asyncio

async def simulate_flight(city, departure_time, duration):
    await asyncio.sleep(departure_time)
    print(f"Flight for {city} departing at {departure_time}PM")
    await asyncio.sleep(duration)
    print(f"Flight for {city} arrived.")

flight_schedule = {
    'Boston': [3, 2],
    'Detroit': [7, 4],
    'New York': [1, 9],
}

async def main():
    async with asyncio.TaskGroup() as tg:
        for city, (departure_time, duration) in flight_schedule.items():
            tg.create_task(simulate_flight(city, departure_time, duration))
    print("Simulations done.")

asyncio.run(main())
When the asyncio.TaskGroup() context manager exits, it ensures that all the tasks created inside it have finished running.
Bonus: Exception groups
A similar feature was also added for exception handling inside async tasks, called exception groups.
Say you have many async tasks running together, and some of them raise errors. Before Python 3.11, the exception handling system did not work well in this scenario.
Here's a short demo of what it looks like with 3 concurrent crashing tasks:
import asyncio

async def bad_task():
    raise ValueError("oops")

async def main():
    tasks = []
    for _ in range(3):
        tasks.append(asyncio.create_task(bad_task()))
    await asyncio.gather(*tasks)


asyncio.run(main())
When you run this code:
$ python asd.py
Traceback (most recent call last):
  File "asd.py", line 13, in <module>
    asyncio.run(main())
  File "/usr/bin/python3.8/lib/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/bin/python3.8/lib/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "asd.py", line 10, in main
    await asyncio.gather(*tasks)
  File "asd.py", line 4, in bad_task
    raise ValueError("oops")
ValueError: oops
There's no indication that 3 of these tasks were running together. As soon as the first one fails, it crashes the whole program.
But in Python 3.11, the behaviour is a bit better:
import asyncio

async def bad_task():
    raise ValueError("oops")

async def main():
    async with asyncio.TaskGroup() as tg:
        for _ in range(3):
            tg.create_task(bad_task())

asyncio.run(main())
$ python asd.py
  + Exception Group Traceback (most recent call last):
  |   File "", line 1, in <module>
  |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 181, in run
  |     return runner.run(main)
  |            ^^^^^^^^^^^^^^^^
  |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 115, in run
  |     return self._loop.run_until_complete(task)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 650, in run_until_complete
  |     return future.result()
  |            ^^^^^^^^^^^^^^^
  |   File "", line 2, in main
  |   File "/usr/local/lib/python3.11/asyncio/taskgroups.py", line 139, in __aexit__
  |     raise me from None
  |     ^^^^^^^^^^^^^^^^^^
  | ExceptionGroup: unhandled errors in a TaskGroup (3 sub-exceptions)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "", line 2, in bad_task
    | ValueError: oops
    +---------------- 2 ----------------
    | Traceback (most recent call last):
    |   File "", line 2, in bad_task
    | ValueError: oops
    +---------------- 3 ----------------
    | Traceback (most recent call last):
    |   File "", line 2, in bad_task
    | ValueError: oops
    +------------------------------------
The exception now tells us that we had 3 errors thrown, in a structure known as an ExceptionGroup.
Exception handling with these exception groups is also interesting. You can either use except ExceptionGroup to catch all the exceptions in one go:
try:
    asyncio.run(main())
except ExceptionGroup as eg:
    print(f"Caught exceptions: {eg}")
$ python asd.py
Caught exceptions: unhandled errors in a TaskGroup (3 sub-exceptions)
Or you can catch them based on the exception type, using the new except* syntax:
try:
    asyncio.run(main())
except* ValueError as eg:
    print(f"Caught ValueErrors: {eg}")
$ python asd.py
Caught ValueErrors: unhandled errors in a TaskGroup (3 sub-exceptions)
Typing improvements
The typing module saw a lot of interesting updates this release. Here are some of the most exciting ones:
Variadic generics
Support for variadic generics has been added to the typing module in Python 3.11.
This means you can now define generic types that take an arbitrary number of type parameters. This is useful for defining generic methods for multi-dimensional data.
For example:
from typing import Generic, TypeVarTuple, Unpack, reveal_type

Shape = TypeVarTuple('Shape')

class Array(Generic[*Shape]):
    ...

# holds 1 dimensional data, like a regular list
items: Array[int] = Array()
# holds 3 dimensional data, for example, X axis, Y axis and value
market_prices: Array[int, int, float] = Array()

# This function takes in an `Array` of any shape, and returns the same shape
def double(array: Array[Unpack[Shape]]) -> Array[Unpack[Shape]]:
    ...

# This function takes an N+2 dimensional array and reduces it to an N dimensional one
def get_values(array: Array[int, int, *Shape]) -> Array[*Shape]:
    ...

# For example:
vector_space: Array[int, int, complex] = Array()
reveal_type(get_values(vector_space))  # revealed type is Array[complex]
Variadic generics can be really useful for defining functions that map over N-dimensional data. This feature can help a lot in type checking codebases that rely on data science libraries such as numpy or tensorflow.
The new Generic[*Shape] syntax is only supported in Python 3.11 and later. To use this feature in Python 3.10 and below, you can import TypeVarTuple and Unpack from the typing_extensions package and write Generic[Unpack[Shape]] instead.
singledispatch now supports unions
functools.singledispatch is a neat way to do function overloading in Python, based on type hints. It works by defining a generic function, and decorating it with @singledispatch. Then you can define specialized variants of that function, based on the type of the function arguments:
from functools import singledispatch

@singledispatch
def half(x):
    """Returns the half of a number"""
    return x / 2

@half.register
def _(x: int):
    """For integers, return an integer"""
    return x // 2

@half.register
def _(x: list):
    """For a list of items, get the first half of it."""
    list_length = len(x)
    return x[: list_length // 2]

# Outputs:
print(half(3.6))  # 1.8
print(half(15))  # 7
print(half([1, 2, 3, 4]))  # [1, 2]
By inspecting the type given to the function's arguments, singledispatch can create generic functions, providing a non object-oriented way to do function overloading.
But this is all old news. What Python 3.11 adds is the ability to pass union types for these arguments. For example, to register a function for all the number types, you previously had to do it separately for each type, such as float, complex or Decimal:
import decimal

@half.register
def _(x: float):
    return x / 2

@half.register
def _(x: complex):
    return x / 2

@half.register
def _(x: decimal.Decimal):
    return x / 2
But now, you can specify all of them in a Union:
@half.register
def _(x: float | complex | decimal.Decimal):
    return x / 2
And the code will work exactly as expected.
Self type
Previously, if you had to define a class method that returned an object of the class itself, adding types for it was a bit awkward. It would look something like this:
from typing import TypeVar

T = TypeVar('T', bound='Circle')

class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    @classmethod
    def from_diameter(cls: type[T], diameter: float) -> T:
        circle = cls(radius=diameter / 2)
        return circle
To say that a method returns an instance of whatever class it was called on, you had to define a TypeVar bound to the class, annotate cls as type[T], and declare T as the return type.
But with the Self type, none of that is needed:
from typing import Self
class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    @classmethod
    def from_diameter(cls, diameter: float) -> Self:
        circle = cls(radius=diameter / 2)
        return circle
Required[] and NotRequired[]
TypedDict is really useful to add type information to a codebase that uses dictionaries heavily to store data. Here's how you can use them:
from typing import TypedDict

class User(TypedDict):
    name: str
    age: int

user: User = {'name': "Alice", 'age': 31}
reveal_type(user['age'])  # revealed type is 'int'
However, TypedDict had a limitation: you could not mark keys as optional, somewhat like giving parameters default values in a function definition.
For example, you can do this with a NamedTuple:
from typing import NamedTuple

class User(NamedTuple):
    name: str
    age: int
    married: bool = False

marie = User(name='Marie', age=29, married=True)
fredrick = User(name='Fredrick', age=17)  # 'married' is False by default
This was not possible with a TypedDict (at least not without defining several TypedDict types). But now, you can mark any field as NotRequired to signal that the dictionary may omit it:
from typing import TypedDict, NotRequired

class User(TypedDict):
    name: str
    age: int
    married: NotRequired[bool]

marie: User = {'name': 'Marie', 'age': 29, 'married': True}
fredrick: User = {'name': 'Fredrick', 'age': 17}  # 'married' is not required
NotRequired is most useful when most fields in your dictionary are required and only a few are not. For the opposite case, you can tell TypedDict to treat every field as not required by default, and then use Required to mark the fields that are actually required.
For example, this is the same as the previous code:
from typing import TypedDict, Required

# `total=False` means all fields are not required by default
class User(TypedDict, total=False):
    name: Required[str]
    age: Required[int]
    married: bool  # now this is optional

marie: User = {'name': 'Marie', 'age': 29, 'married': True}
fredrick: User = {'name': 'Fredrick', 'age': 17}  # 'married' is not required
contextlib.chdir
contextlib has a small addition to it, which is a context manager called chdir. All it does is change the current working directory to the specified directory inside the context manager, and set it back to what it was before when it exits.
One potential use case is redirecting where you write logs to:
import contextlib
import os

def write_logs(logs):
    with open('output.log', 'w') as file:
        file.write(logs)

def foo():
    print("Scanning files...")
    files = os.listdir(os.curdir)  # lists files in current directory
    logs = do_scan(files)  # do_scan is a placeholder for your own logic

    print("Writing logs to /tmp...")
    with contextlib.chdir('/tmp'):
        write_logs(logs)

    print("Deleting files...")
    files = os.listdir(os.curdir)
    do_delete(files)  # do_delete is a placeholder as well
This way, you don't have to worry about changing and reverting the current directory manually. The context manager does it for you.
Summary
In addition to all of these, there's a cherry on top: Python also got 22% faster on average in this release. It might even get faster by the time the final release is out around October!
Also, work on Python 3.12 has already started. If you want to stay up to date with all the latest developments with the language, you can check out the pull requests page on Python's GitHub repository.