Wednesday, June 16, 2021
In preparation for Ewa’s departure, the Python Software Foundation will begin a search to find a new Executive Director. The Executive Director is a key player in helping the Foundation pursue our mission “to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.”
The Board of Directors will work together with the Staff, the outgoing Executive Director, and the community in developing a timeline for the transition as well as posting periodic updates on the search when it formally begins. Please keep an eye on the PSF blog for these updates as well as the forthcoming job listing.
Wednesday, June 09, 2021
The second day of the 2021 Python Language Summit finished with a series of lightning talks from Ronny Pfannschmidt, Pablo Galindo, Batuhan Taskaya, Luciano Ramalho, Jason R. Coombs, Mark Shannon, and Tobias Kohn.
Annotated Assertions: Debugging With Joy
Ronny Pfannschmidt spoke about annotated assertions. He is a pytest maintainer and loves approachable debugging.
He compared assertions in unittest with assertions in pytest. He remarked that mistakes have been made in the past and are still being made today. Before pytest 2.1, pytest would reinterpret assertions, which was bad for side effects. Today, pytest deals with side effects by handling all of the variables, collecting them, and showing them to you.
Here's what he would like to see in Python:
Here's what he'd like to do:
- Create a PEP or have a PEP sponsor
- Open the implementation of pytest to a wider audience
PEP 657: Fine-Grained Error Locations in Tracebacks
Pablo Galindo and Batuhan Taskaya shared their thoughts on what they want to do and what they don't want to do with PEP 657. The goal of this PEP is to improve the debugging experience by making the information in tracebacks more specific. It would also help with code coverage tools because it would allow expression-level coverage rather than just line-level coverage. JEP 358 has already accomplished something similar.
The speakers want to:
- Keep maintenance costs low
- Keep the size small without overcomplicating the compiler
- Provide an API for tools to consume
- Provide an opt-out mechanism
They want to avoid:
- Adding a new set of .pyc files
- Adding a new debugging info file format
- Having a large number of new flags to customize
- Implementing in memory/size encoding
- Complicating the compiler too much
- Providing more than one opt-out mechanism
- Having manual metadata propagation
For the opt-out mechanism, there will be two ways to deactivate the feature:
- Environment variable: PYTHONNODEBUGRANGES
- Command line option: -X no_debug_ranges
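As a concrete illustration (a hypothetical snippet, not from the talk), consider an error inside a chained subscript. Fine-grained locations let the traceback single out the failing piece rather than the whole line:

```python
def lookup(record):
    # With PEP 657 (Python 3.11+), the traceback underlines the exact
    # failing subexpression instead of only naming the line, roughly:
    #     return record["a"]["b"]["c"]
    #            ~~~~~~~~~~~~~~~^^^^^
    return record["a"]["b"]["c"]

# lookup({"a": {"b": None}}) raises TypeError because record["a"]["b"] is None.
```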
Who Speaks for Mort on python-dev?
Luciano Ramalho asked who represents different kinds of Python users on python-dev, using three classic developer personas:
- Mort is an opportunistic developer who likes to create quick solutions for immediate problems. He focuses on productivity and learns as needed.
- Elvis is a pragmatic programmer who likes to create long-lasting solutions. He learns while working on solutions.
- Einstein is a paranoid programmer who likes to create the most efficient solution to a problem. He typically learns before working on the solution.
Annotations as Transforms
Jason R. Coombs shared his thoughts on designating transformation functions to be applied to parameters and return values. He had originally been inspired by the simplicity and power of decorators, and his idea could in theory be applied with decorators today. However, he determined that it would be more elegant to use annotations.
Using this approach would have advantages:
- Elegant, simple declaration of intended behavior
- Clear separation of concerns
- Avoiding rewriting variables in the scope
- Easy reuse of transformations
- Explicit type transformation
However, there would also be challenges:
- Compatibility: Although older versions of Python don't have this functionality, you could implement a compatibility shim.
- Ambiguity between types and transforms: To address this concern, you could potentially:
  - Require transforming functions to be explicitly created
  - Provide a wrapping helper to specify that a type is used as a transform (e.g. -> transform(str))
  - Provide a wrapping helper or explicit types for nontransforming type declarations (e.g. Int or strict(int))
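A rough sketch of the decorator-based version of the idea that works today (the decorator and example names are hypothetical, not from the talk): parameter and return annotations are treated as callables and applied to the values.

```python
import functools
import inspect

def apply_transforms(func):
    """Hypothetical decorator: apply each callable annotation to the
    corresponding argument, and the return annotation to the result."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty and callable(ann):
                bound.arguments[name] = ann(value)
        result = func(*bound.args, **bound.kwargs)
        ret = sig.return_annotation
        if ret is not inspect.Signature.empty and callable(ret):
            result = ret(result)
        return result

    return wrapper

@apply_transforms
def shout(text: str.upper) -> str:
    return text + "!"
```

Calling shout("hi") upper-cases the argument before the function body runs. The ambiguity challenge above is visible here: str.upper is clearly a transform, but a plain str annotation could be read as either a type or a transform.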
Tiers of Execution: Making CPython Execute Efficiently
Mark Shannon described a tiered model for executing Python code:
- Tier 0: The slowest tier, with minimal memory usage and low startup time
- Tier 1: Primary interpreter, the adaptive, specializing interpreter
- Tier 2: Small region, lightweight JIT
- Tier 3: Large region, heavyweight JIT
The higher a tier, the hotter the code that it will execute. Today, CPython is at tier 0.3. It's a compromise between memory use and speed but isn't optimized for either. He said that tier 0 could be considered for Python 3.11 or later. It could:
- Minimize startup time and memory use at the expense of execution speed
- Support a full set of features, including sys.settrace
- Be able to execute from a .pyc file that is mmapped and immutable
Tier 1 could have:
- Adaptive, specializing interpreter (PEP 659)
- Possible lack of support for some features, such as sys.settrace
Running Parallel Python Code in the Browser
You could suspend the current task and let everything in the event queue happen so that the message can be processed, and then resume your task later on. To do that, you could use the bytecode in Python 3.6+ because the frame already has an index into the bytecode and captures state, to a certain extent. However, some bytecode instructions are too complex. __add__ can execute arbitrary Python code, fail, call __radd__, and execute other Python code. The standard bytecode is insufficient.
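For example, a single addition can bounce through arbitrary Python via the reflected protocol (a minimal sketch, not from the talk):

```python
class Money:
    def __init__(self, amount):
        self.amount = amount

    def __radd__(self, other):
        # int.__add__ returns NotImplemented for Money, so Python falls
        # back to this method: arbitrary Python code running in the
        # middle of one "simple" addition instruction.
        return Money(other + self.amount)

total = 1 + Money(2)  # dispatches to Money.__radd__
```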
He's currently using an MPI interface for parallel processing. There is:
- Early-stage multiprocessing support
- No blocking or freezing of the browser's UI
Sunday, June 06, 2021
At the 2021 Python Language Summit, Zac Hatfield-Dodds gave a presentation about fuzzing and testing with Python properties. This presentation tied in with the one he gave at the 2020 Python Language Summit.
What Is Testing?
For the purposes of this talk, he defined testing as the art and science of running code and then checking if it did what it was supposed to do. He added that, although assertions, type checkers, linters, and code review are good, they are not testing.
There are two general reasons why we might have tests:
- For correctness:
  - The goal is to validate software and determine that there are no bugs.
  - Nondeterminism is acceptable.
  - Finding any fault is a success.
- For software engineering (programming, over time, in teams):
  - The goal is to validate changes or detect regressions.
  - Nondeterminism is bad.
  - Bugs should only be in the diff.
When these two reasons for testing aren't distinguished, there can be miscommunications.
What Is Property-Based Testing?
There are many types of tests:
- Unit tests
- Integration tests
- Snapshot tests
- Parameterized tests
- Fuzz tests
- Property-based tests
- Stateful model tests
The speaker then walked the summit attendees through an example to explain going from traditional unit tests through to parameterized tests and then seeing how that plays into property-based tests.
Imagine that you needed to test the sorted() builtin. With a traditional set of unit tests, you can write a bunch of cases with the expected inputs and outputs:
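For example (a sketch, not the talk's exact code):

```python
def test_sorted_basics():
    # Hand-picked inputs paired with known expected outputs.
    assert sorted([3, 1, 2]) == [1, 2, 3]
    assert sorted([]) == []
    assert sorted([1, 2, 1]) == [1, 1, 2]
    assert sorted(["b", "a", "c"]) == ["a", "b", "c"]
```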
If you want to avoid repeating yourself, you can write a list of inputs and outputs:
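A sketch of the table-driven version (with pytest you would reach for @pytest.mark.parametrize; this dependency-free loop shows the same idea):

```python
CASES = [
    ([], []),
    ([1], [1]),
    ([3, 1, 2], [1, 2, 3]),
    ([1, 2, 1], [1, 1, 2]),
]

def test_sorted_cases():
    # One assertion per (input, expected) pair.
    for data, expected in CASES:
        assert sorted(data) == expected
```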
If you don't have a known good result, then you can still write tests using only the input argument. One option would be to compare to another reference implementation:
However, comparing with another reference implementation might not be an option, so you could just test if the output seems to be right:
In order to improve on this test, you might want to add another property that you can test. You could check that the length of the output is the same as the length of the input and that you have the same set of elements:
This would pass on the incorrect sorted([1, 2, 1]) -> [1, 2, 2]. A brute-force approach using itertools.permutations() would detect that too:
But the best solution is collections.Counter():
This last test uses property-based testing:
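With Hypothesis (a third-party library; a sketch using its @given API), the same assertions run against generated inputs:

```python
from collections import Counter

from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sorted(data):
    # Hypothesis generates many lists of integers, including edge cases
    # like [] and lists with duplicates.
    result = sorted(data)
    assert all(a <= b for a, b in zip(result, result[1:]))
    assert Counter(result) == Counter(data)
```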
In summary, property-based testing lets you:
- Generate input data that you might not have thought of yourself
- Check that the result isn't wrong, even without the right answer
- Discover bugs in your understanding rather than just in your code
If that's not enough, then you also have other options:
The Hypothesis database is a collection of files on disk that represent the various examples. Since it's a key-value store, it's easy to implement your own custom one:
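The key-value interface is tiny: save, fetch, and delete. A hypothetical in-memory version sketching the shape (a real one would subclass hypothesis.database.ExampleDatabase):

```python
from collections import defaultdict

class InMemoryExampleDatabase:
    """Toy stand-in for the Hypothesis example database interface."""

    def __init__(self):
        self._examples = defaultdict(set)

    def save(self, key, value):
        self._examples[key].add(value)

    def fetch(self, key):
        yield from self._examples[key]

    def delete(self, key, value):
        self._examples[key].discard(value)
```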
In this example, you have a local database on disk. You can also have a shared network database on something like Redis, for example.
Coverage-guided fuzzing takes this to the next level:
At the 2020 Python Language Summit, when he said that we would find more bugs if we used property-based testing for CPython and the standard library, the response was positive, but then not much happened. Since then, Paul Ganssle has opened a PR on CPython to add some Hypothesis tests for the zoneinfo library. Zac Hatfield-Dodds said that CPython is doing very well on unit testing and has a strong focus on regressions but that it would be quite valuable to add some of the tools that have been developed for testing for correctness.
These tools don't only find existing bugs. They're good at finding regressions where someone checked in new code with what turned out to be inadequate test coverage:
There is a pace at which we find and fix bugs that were preexisting in addition to the ongoing rate of introducing new bugs that then get detected by fuzzing instead of lasting for too long:
There is a three-step plan:
- Merge Paul Ganssle's PR or come up with an alternative proposal to get Hypothesis into CPython's CI in order to unblock ongoing incremental work
- Merge some tests
- Run them in CI and on OSS-Fuzz
Saturday, June 05, 2021
At the 2021 Python Language Summit, Eric Snow gave a presentation about how core developers can receive guidance to help them work on improvements to the language that will bring the most benefit to the Python community.
What Does the Community Need?
When Eric Snow first got involved in core development over a decade ago, he liked that there was so much to learn. But he found that the organization didn't offer a lot of direction in terms of guiding volunteers to the kind of work that would have the biggest impact.
Over the years, he's thought about how he and other contributors decide what to work on and why. What directs the efforts of contributors and core developers? Contributors make decisions based on their availability, expertise, and interests. Core developers need to act as stewards for the language. There's plenty of collaboration that goes on, but everyone has their own idea of what the community needs, what would be interesting to work on, and what the Python language needs. As a result, it can be hard to see the bigger picture.
Would a PM Help?
Time and time again, he has asked himself what he can work on to best help the Python community. We all care about this language and the community that surrounds it, so we want to help it as much as we can. Sometimes, it's hard to get a sense of what will help the community and the language the most from our own limited, individual perspectives.
One solution could be to have a dedicated PM who can provide the direction that we've been missing. This person could be provided by the PSF or a sponsoring organization. They could compile and maintain a list of improvements that would be most beneficial to the community. They wouldn't dictate what would be worked on, but they could surface what the community needs.
There has been talk of the Steering Council providing a road map. Having a PM could help:
- Provide a clear picture of what that road map could look like
- Help developers and maintainers make decisions about where to spend their time as volunteers
- Facilitate collaboration
Eric Snow was interested in hearing whether or not other core developers would find a PM helpful.
Luciano Ramalho said that he was strongly in favor of having a PM who could assess the needs of the community from the perspective of the community rather than only the perspective of core developers. This idea overlaps with the questions he raised in his lightning talk at this summit. He also mentioned that Go has a PM role similar to what Eric Snow was suggesting.
Other attendees discussed how this kind of role could be funded and considered how much benefit the role could bring considering that the PSF is working with limited financial resources. They also discussed the differences between a Product Manager and a Program Manager and determined that this role would be more like a Program Manager.
Saturday, May 29, 2021
What Is the stdlib?
He succinctly described the stdlib as "a collection of modules that ship with CPython (usually)." This was the most accurate definition he could give, considering how big it is, how varied its contents are, and how long it has been around.
He didn't offer an answer to the question of whether or not there should be a new informational PEP to define clear goals for the stdlib, but he wanted core developers to engage with the question. There are a variety of opinions on the stdlib, but it could be beneficial to come to some kind of agreement about:
- What it should be
- How to manage it
- What it should focus on
- How to decide what will or will not be added to it
He shared that he semi-regularly sees requests for adding a TOML parser to the stdlib. When he considers requests, he asks himself:
- Should a module be added?
- How should such a decision be made?
- What API should it have?
- Is there a limit to how big the stdlib should get?
So far, there haven't been basic guidelines for answering these kinds of questions. As a result, decisions have been made on a case-by-case basis.
How Big Is the stdlib?
He did some data digging in March of 2021 and shared his findings. Here are the broad strokes:
- python -v -S -c pass imports 14 modules.
- There are 208 top-level modules in the standard library.
- The ratio of modules to people registered to vote in the last Steering Council election is 2.3, but not all of those people are equally available to help maintain the stdlib.
What Should the stdlib Cover?
Some people have suggested that the stdlib should be focused on helping users bootstrap pip and no more. Others have said that it should focus on system administration or other areas. Considering that there are thirty-one thematic groupings in the index, the people who maintain the stdlib don't seem to have come to a collective decision either. The groupings cover everything from networking to GUI libraries to REPLs and more.
The stdlib has been around for a long time, and we need to be careful about breaking people's code, so the goal is not to deprecate what is already there but to consider guidelines for making additions.
How Do We Decide What Goes Into the stdlib?
He compared PEP 603 and graphlib to show how this question has been answered in different ways in the past. The goal of PEP 603 was to add the class frozenmap to the collections module. However, the graphlib module was only ever discussed in an issue and never got a PEP before it was added to the stdlib. There is no standardized approach for making these kinds of decisions, so he would like to know what approaches core developers think would be most appropriate.
What Is the Maintenance Cost?
The PR queue is already long, which can be overwhelming for maintainers and discouraging for contributors.
The following modules aren't used in any of the top 4000 projects:
Seventy-six modules are used in less than 1% of the 4000 most downloaded projects on PyPI. That's over 36% of all the modules in the stdlib. This raises some questions:
- Do we want to continue to ship these modules?
- What does this tell us about what the community finds useful in the stdlib?
- How can that inform future guidelines about what to include in the stdlib?
Based on the data from March 2021, there were:
- 37 modules with no open PRs
- 1,451 PRs involving the stdlib, which made up the bulk of all the PRs
Monday, May 24, 2021
Mariatta Wijaya and Carol Willing gave a presentation about a new documentation work group at the 2021 Python Language Summit. Last year, Carol Willing and Ned Batchelder spoke about laying the groundwork for this project at the 2020 Python Language Summit.
Why Does Python Need a Documentation Work Group?
The mission of the Python Software Foundation is to advance the Python language and grow a diverse, international community of Python programmers so that the language can continue to flourish well into the future. However, when it comes to documentation, core developers don't necessarily reflect the larger world of Python users. If we can bring together core developers, documentarians, and educators, then we can have better documentation that serves the needs of the wider community more effectively.
What Should the Documentation Work Group Achieve?
The work group has two main goals:
- Improve documentation content so there's more documentation aimed at people who are learning the language
- Modernize documentation themes to make them responsive on mobile so they can better serve users working with limited bandwidth
Although core developers have sometimes felt a great deal of ownership of parts of the documentation, all of the documentation is a community resource. As a result, no one person should be responsible for any one part of the documentation. If the work group is large enough, then it can serve as an editorial board that could work towards consensus.
The work group will:
- Set priorities and projects for the next year
- Build a larger documentation community and help them feel engaged, connected, and empowered
What Will Stay the Same?
Changes to documentation will still go through the same PR process that is described in the dev guide. There will be the same commitment to quality. Although there will be new documentation to meet the needs of underserved users and topics, the existing docs at docs.python.org and devguide.python.org will remain.
What Will Change?
Documentation is a gateway to education. In order for it to be more effective, we need broader input from the community. There have already been considerable efforts with translation and localization, but it would also be beneficial to have a new landing page to help users find the resources they need.
The next step is to deal with the logistics of work-group membership. The current members are Mariatta Wijaya, Carol Willing, Ned Batchelder, and Julien Palard. The charter for the work group states that it can have up to twenty members. The application process will be similar to the one that the code of conduct work group used. The current members expect applications to come from the wider Python community as well as from core developers.
Once the group has more members, they will hold a monthly meeting that will be scheduled to accommodate a variety of time zones. They will discuss docs issues, open PRs, the status of projects, achievements, next steps, and more.
There are also plans for AMA sessions on Discourse so that docs team members can answer questions and connect with the wider docs community. In addition, the group will reach out to PyLadies and tap into the diverse skill sets of their members.
Where Can You Learn More?
To learn more, you can check out the Python docs community on:
Sunday, May 23, 2021
- CPython sources and how they fit with Debian and Ubuntu
- Ownership of module installations
- Architecture and platform support
- Inclusiveness and the ubiquity of Python on various platforms
- Communication issues
What Is Python Like on Debian and Ubuntu?
He shared what Debian 11 and Ubuntu 21.04 have installed for Python. By default, there is almost nothing, so it's usually pulled in by various seeds or images. Linux distributions usually have a mature packaging system and don't ship naked CPython, unlike macOS or Windows.
These Debian and Ubuntu versions still use Python 2 to bootstrap PyPy, but one version of Python 3 is also shipped with them. CPython itself is split into multiple binary packages for license issues, dependencies, development needs, and cross-buildability needs. There is also a new python3-full package that includes everything for the batteries-included experience. About twenty percent of the packages shipped by Debian use Python. He said that this is usually enough for desktop users but may not be enough for developers. However, the line between those two groups is not always clear.
As for the QA that Python is getting in Debian and Ubuntu, for the main part of the archive, it must conform to the Debian Free Software Guidelines. These guidelines include free distribution, inclusion of source code, and ability to modify and create derived works. Packages have to build from source and pass the upstream test suite as well as CI tests.
What Distro-Specific Decisions Were Made for Debian and Ubuntu?
Debian has a policy for shipping Python packages that is also used by Ubuntu. Usually, applications are in application-specific locations so they don't get in the way of anything else. They ship modules as --single-version-externally-managed, and they are usually shipped only if the application needs them.
The site-packages directory has been renamed to dist-packages: /usr/lib/python3/dist-packages for packaged modules and /usr/local/lib/python3.x/dist-packages for local installs. The path doesn't change during Python upgrades (PEP 3147 and PEP 3149). Although a large number of packages use Python, pip is not used to build the archive; it is only provided for users.
You can't call Python with python, python2, or python2.x anymore. There is no python executable by default because most of the Python 2 stuff was removed. There is a package called python-is-python3 that reinstalls the python symlink. That package was a compromise, and there was some difficulty getting it into Debian.
In the past, there have been license issues with shipping the CPython upstream sources. There are still license issues with the _dbm module, which is only buildable with a GPL-3+ license. There were also some executables included in the sources that were removed for 3.10. The big remaining issue is that wheels are still included without the source and can't be shipped, so you have to build them using the regular setuptools and pip distributions. Usually, symlinks and dependencies are used to point to the proper setuptools and pip packages.
The relationship between pip and Linux distributions is a difficult one, and there is more than one way to install Python modules. Part of the motivation behind renaming site-packages to dist-packages was that pip was breaking desktop systems. They also wanted to resolve conflicts with locally built Python (installed in /usr/local).
There has been some controversy about what can break your system:
- sudo rm -rf /
- sudo pip install pil
- sudo apt install python3-pil
PyPA does not consider sudo pip install to be dangerous, but Debian and Ubuntu have different opinions about how pip should behave. Mixing packages from two different installers can lead to problems. Although PEP 517 appears to say that pip is only recommended, pip does seem to be enforced more and more. Matthias Klose joked that perhaps ensurepip should be renamed to enforcepip. He also said that having some kind of offline mode for pip would help.
He discussed inclusiveness and said that Python being ubiquitous should be seen as an asset rather than a burden. He was sad to see negative attitudes towards platforms that are used less, such as AIX, and didn't see why Python would want to exclude some communities.
What Communication Issues Are There?
Last November and December, there were tweets about problems within Debian and Ubuntu. He considered some of the concerns that were brought up to be valid, but he said that there were also legal threats made against the Debian project on behalf of the PSF. He was of the opinion that all parties could improve and that it wasn't just a Debian or Ubuntu problem.
Here are some of the communication problems he highlighted:
- Distro issues don't reach distro people.
- Problems with pip breaking systems don't reach pip developers.
- There is no single place to discuss PyPA issues.
- There have been problems with manylinux wheels built for CentOS that came up in distro channels.
Saturday, May 22, 2021
The Stable ABI and Limited C API
Petr Viktorin spoke about the stable ABI and the limited C API. The stable ABI is a way to compile a C extension on one version of Python 3 and run the same binary on later Python 3 versions. It was introduced in 2009 with PEP 384. You can use it to simplify extension maintenance, and it allows a single build to support more Python versions. But it does have lower performance, and you can't do everything with it that you could with the full API.
Petr Viktorin would like to see it used for bindings and embeddings. If Python is just a small part of your application and you don't want to invest a lot of maintainer time into supporting Python, then that would be a good use case. You could also use it to support unreleased Python versions.
If you limit yourself to the subset of the limited C API, then you will get an extension that conforms to the stable ABI. The limited C API aims to avoid implementation details and play well with:
- Alternate Python implementations
- Extension languages other than C
- New features, such as isolated subinterpreters
However, the limited C API is not stable.
The limited C API and the stable ABI are now defined in Misc/stable_abi.txt. There are already tests, and soon there will be documentation as well. To learn more, check out:
Promoting PyLadies in CPython Development
Lorena Mesa spoke about PyLadies, which is an international mentorship group with a focus on helping more women become active participants and leaders in the Python open-source community.
Most of the growth in the PyLadies community has been coming from outside the USA and Europe. South America has the most active chapters, with Brazil in the lead. In order to help chapters better support their members, PyLadies is working on a centralized mandate and a global governance model.
While PyLadies has been working on education and outreach, it has been challenging to quantify how the group is helping women become active participants and leaders in the OSS community. In order to address this issue, they are preparing a survey about the challenges their members face in open source. Members may be having difficulty with:
- Language barriers
- Technical expertise
She also mentioned ways that core developers could support PyLadies members:
- Submit a recording of their workflow to publish on YouTube
- Offer feedback
CircuitPython: A Subset of CPython
Jython 3: Something Completely Different?
- Built-in methods, but not yet their call sites
- An interpreter for a subset of CPython bytecode
At the 2021 Python Language Summit, Antonio Cuni gave a presentation about HPy. He also gave a presentation about HPy at the 2020 Python Language Summit, so this year he shared updates on how the project has evolved since then.
What Is HPy?
HPy is an alternative API for writing C extensions. Although the current Python C API shows CPython implementation details, HPy hides all the implementation details that would otherwise be exposed. Antonio Cuni said that, if everyone used HPy, then it would help Python evolve in the long term.
Using HPy extensions will make it easier to support alternative implementations. HPy is designed to be GC friendly and isn't built on top of ref counting. It is also designed to have zero overhead on CPython, so you can port an existing module from the Python C API to HPy without any performance loss. It also allows incremental migration, so you can port your existing extension one function, or even one method, at a time. HPy is also faster than the existing Python C API on alternative implementations such as PyPy and GraalPython.
What's New With HPy?
In the past year since Antonio Cuni last shared an update at a Python Language Summit, HPy has continued to make progress. It now has:
- Support for Windows
- Support for creating custom types in C
- A debug mode to help you find mistakes in your C code
- Setuptools integration to make it easier to compile HPy extensions
There has also been work on a very early port of some parts of NumPy to HPy. The feedback from the NumPy team has been positive so far. Soon, the HPy team will start writing a Cython backend so that all Cython extensions will be able to automatically use HPy as well.
The HPy team has made a lot of progress with building community and getting funding. There is now a site for HPy as well as a blog, and there has been a lot of interest and involvement from the Python community. For example, someone independently started porting Pillow. Oracle, IBM, and Quansight Labs have provided some funding, but there has still been plenty of non-funded open source development, as usual.
How Do the CPython ABI and the Universal ABI Compare?
There are some differences between the CPython ABI and the Universal ABI:
On the Universal ABI side, there is no way to support wheels.
How Does Debug Mode Work?
HPy's debug mode may be useful to you even if you aren't concerned about the problems that HPy is intended to solve because it can help you find common problems in your C code, such as memory leaks. Here's an example of an HPy function that takes an object and increments it by one:
HPy_Close() isn't called on the object that was created, so you have a memory leak. If you want to compile this file into an extension, then you can use setup.py:
Now, you can load the module and debug:
What Does the Future Hold?
Antonio Cuni closed his presentation by asking the CPython developers at the summit if it would be possible to make HPy a semi-official API in the future, with first-class support for importing modules and distributing wheels. Some attendees suggested writing a PEP to make that happen.
Thursday, May 20, 2021
At the 2021 Python Language Summit, Guido van Rossum gave a presentation about plans for making CPython faster. This presentation came right after Dino Viehland's talk about Instagram's performance improvements to CPython and made multiple references to it.
Can We Make CPython Faster?
We can, but it's not yet clear by how much. Last October, Mark Shannon shared a plan on GitHub and python-dev. He asked for feedback and said that he could make CPython five times faster in four years, or fifty percent faster per year for four years in a row. He was looking for funding and wouldn't reveal the details of his plan without it.
How Will We Make CPython Faster?
Seven months ago, Guido van Rossum left a brief retirement to work at Microsoft. He was given the freedom to pick a project and decided to work on making CPython faster. Microsoft will be funding a small team consisting of Guido van Rossum, Mark Shannon, Eric Snow, and possibly others.
The team will:
- Collaborate fully and openly with CPython's core developers
- Make incremental changes to CPython
- Take care of maintenance and support
- Keep all project-specific repos open
- Have all discussions in trackers on open GitHub repos
The changes will probably touch:
- The byte code
- The compiler
- The internals of a lot of objects
Who Will Benefit?
You'll benefit from the speed increase if you:
- Run CPU-intensive pure Python code
- Use tools or websites built in CPython
You won't see much of a benefit if you:
- Rewrote your code in C, Cython, C++, or similar to increase speed already (e.g. NumPy, TensorFlow)
- Have code that is mostly waiting for I/O
- Use multithreading
- Need to make your code more algorithmically efficient first
Cinder is Instagram's internal performance-oriented production version of CPython 3.8, so all of the comparisons in this presentation dealt with Python 3.8. Cinder has a lot of performance optimizations, including bytecode inline caching, eager evaluation of coroutines, a method-at-a-time JIT, and an experimental bytecode compiler that uses type annotations to emit type-specialized bytecode that performs better in the JIT.
Instagram did a lot of work with asynchronous I/O. One big change was sending and receiving values without raising StopIteration. Raising all of those exceptions was a huge source of overhead. On simple benchmarks, this was 1.6 times faster, but it was also a 5% win in production. These changes have been upstreamed to Python 3.10 (bpo-41756 & bpo-42085).
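At the Python level, you can see why this matters: a generator (and, under the hood, a pre-optimization coroutine) delivers its return value by raising StopIteration, so every completed call pays the cost of raising and catching an exception.

```python
def compute():
    yield 1           # a generator, like a coroutine before this optimization
    return "result"   # the return value travels inside StopIteration

gen = compute()
first = next(gen)     # runs to the first yield
try:
    next(gen)         # finishing the generator raises StopIteration
except StopIteration as exc:
    final = exc.value

print(first, final)   # → 1 result
```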
Instagram also made another change to asynchronous I/O that hasn't been upstreamed yet: eager evaluation. Often, in their workload, if they await a call to a function, then it can run and immediately complete. If the call completes without blocking, then they don't have to create a coroutine object. Instead, a wait handle is returned. (One singleton instance is used, as the handle is immediately consumed.)
They used the new vectorcall API to do this work, so they have a new flag to show that a call is being awaited at the call site. In addition to having functions check this flag, they also have asyncio.gather() check the flag. This avoids overhead for task creation and scheduling. These changes led to a 3% win in production and haven't been upstreamed yet, but there have been discussions.
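The idea can be sketched at the Python level (the names here are hypothetical; the real implementation lives in C inside the interpreter): step the coroutine once, and only fall back to normal scheduling if it actually suspends.

```python
def maybe_eager(coro):
    # Hypothetical sketch of eager evaluation: advance the coroutine one
    # step. If it completes without blocking, return its value directly
    # (Cinder returns a singleton wait handle here, since the handle is
    # immediately consumed). Otherwise, a real implementation would
    # schedule the suspended coroutine on the event loop as usual.
    try:
        coro.send(None)
    except StopIteration as exc:
        return ("completed", exc.value)
    return ("suspended", coro)

async def cached_lookup():
    return 42  # completes without ever awaiting

status, value = maybe_eager(cached_lookup())
print(status, value)  # → completed 42
```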
Another big change is inline caching for byte code, which they call shadow byte code. Although Python 3.10 has some inline caching as well, Instagram took a somewhat different approach. In Instagram's implementation, hot methods get a complete copy of their byte code and caches. As the function executes, the opcodes in that shadow copy are replaced with more specialized versions. This resulted in a 5% win in production.
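A toy model of the fast path (pure illustration, not Cinder's actual implementation): each call site remembers the receiver type it last saw and the method it resolved, and only repeats the full lookup when the type changes.

```python
class CallSiteCache:
    # Toy per-call-site inline cache: remember the last receiver type
    # and the method it resolved to, skipping the lookup on a hit.
    def __init__(self, name):
        self.name = name
        self.cached_type = None
        self.cached_method = None

    def call(self, obj):
        if type(obj) is self.cached_type:
            return self.cached_method(obj)      # fast path: cache hit
        method = getattr(type(obj), self.name)  # slow path: full lookup
        self.cached_type, self.cached_method = type(obj), method
        return method(obj)

site = CallSiteCache("upper")
print(site.call("abc"), site.call("def"))  # → ABC DEF (second call hits the cache)
```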
Dino Viehland also spoke about dictionary watchers, which haven't been upstreamed to CPython. Dictionary watchers provide updates to globals for builtins when they are modified. Instagram achieved this by reusing the existing version tag in dictionaries to mark dictionaries that are being watched. They took the low bit from that, so now whenever they need to bump the dictionary version, they bump it by two. This led to an additional 5% win when combined with shadow byte code.
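The trick can be sketched in pure Python (hypothetical names; the real change is inside CPython's dict implementation): steal the low bit of the version tag as a "watched" flag, and bump the version by two so the flag survives updates.

```python
WATCHED = 1  # low bit of the version tag marks a watched dict

class VersionedDict(dict):
    # Toy sketch of dictionary watchers: reuse the version tag's low
    # bit as the watched flag, so versions always bump by two.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0
        self.watchers = []

    def watch(self, callback):
        self.watchers.append(callback)
        self.version |= WATCHED

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 2  # bump by two: the low bit stays intact
        if self.version & WATCHED:
            for callback in self.watchers:
                callback(key, value)

events = []
d = VersionedDict()
d.watch(lambda k, v: events.append((k, v)))
d["len"] = len
print(events)
```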
Instagram made targeted optimizations as well. The CPython documentation mentions that assigning to __builtins__ is a CPython implementation detail. But it's an unusual one, because when you assign to it, it may not be respected immediately. For example, if you're using the same globals, then you use the existing builtins. Instagram made that always point to the fixed builtins dictionary, which led to a 1% win in production.
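You can observe that builtins resolution is tied to the globals dict in pure Python: exec() consults the __builtins__ entry of the globals you pass in.

```python
# Name lookup falls back to the __builtins__ entry of the globals dict,
# so an empty __builtins__ makes even len() unresolvable.
ns = {"__builtins__": {}}
error = None
try:
    exec("len([])", ns)
except NameError as exc:
    error = exc

print(type(error).__name__)  # → NameError
```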
They also made some small changes to PyType_Lookup that were upstreamed and will be in Python 3.10. You can check bpo-43452 to learn more. In addition, Instagram worked on ThreadState lookup avoidance and prefetching variables before they're loaded, but frame creation is still expensive.
Instagram has tried some experimental changes as well. One big one was the JIT. They have a custom method-at-a-time JIT with nearly full coverage of the opcodes. The few unsupported opcodes, such as IMPORT_STAR, are rare and not used in methods. There are a couple of intermediate representations: the front end lowers to an HIR, where the code is converted to SSA form and run through a ref count insertion pass as well as other optimization passes. After the HIR level, it's lowered to an LIR, which is closer to x64.
Another experimental idea is something they call static Python. It provides similar performance gains as MyPyC or Cython, but it works at runtime and has no extra compile steps. It starts with a new source loader that loads files marked with import __static__, and it supports cross module compilation across different source files. There are also new byte codes such as INVOKE_FUNCTION and LOAD_FIELD that can be bound tightly at runtime. It uses normal PEP 484 annotations.
Interop needs to enforce types at the boundaries between untyped Python and static Python. If you call a typed function with a wrongly typed argument, then you might get a TypeError. Static Python has a whole new static compiler that uses the regular Python ast module and is based on the Python 2.x compiler package.
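The boundary check can be sketched with a decorator (a hypothetical illustration; static Python does this in the compiler and new byte codes, not with decorators): ordinary PEP 484 annotations are checked when an untyped caller crosses into typed code.

```python
import functools
import inspect
import typing

def enforce_boundary(func):
    # Hypothetical sketch: check PEP 484 annotations at the call
    # boundary, raising TypeError the way static Python does when
    # untyped code passes a wrongly typed argument.
    hints = typing.get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return func(*args, **kwargs)
    return wrapper

@enforce_boundary
def double(x: int) -> int:
    return x * 2

print(double(21))  # → 42
```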
In addition, Pyro is an unannounced, experimental, from-scratch implementation that reuses the standard library. The main differences between Pyro and CPython are:
- Compacting garbage collection
- Tagged pointers
- Hidden classes
To support C extensions, the PEP 384 subset of the C API is emulated.
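Tagged pointers can be illustrated with integer arithmetic (a toy model, not Pyro's actual encoding): small integers are stored in the word itself rather than behind a pointer, distinguished from real pointers by a tag bit.

```python
TAG_INT = 1  # low bit set means "this word is a small int, not a pointer"

def box_int(n):
    return (n << 1) | TAG_INT   # shift left and set the tag bit

def is_small_int(word):
    return word & TAG_INT == 1

def unbox_int(word):
    return word >> 1            # arithmetic shift restores the value

word = box_int(-21)
print(is_small_int(word), unbox_int(word))  # → True -21
```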
Production improvements are difficult to measure because changes have been incremental over time, but they are estimated at between 20% and 30% overall. When Instagram was benchmarking, they used CPython 3.8 as the baseline and compared Cinder, Cinder with the JIT, and Cinder JIT noframe, which Instagram is not yet using in production but wants to move towards so they won't have to create Python frame objects for jitted code.
Sunday, May 16, 2021
The 2021 Python Language Summit: Progress on Running Multiple Python Interpreters in Parallel in the Same Process
Victor Stinner started by explaining why the changes under discussion are needed. One use case is embedding Python to extend an application's features, as in Vim, Blender, LibreOffice, and pybind11. Another use case is subinterpreters. For example, Apache mod_wsgi uses subinterpreters to handle HTTP requests, and there are also plugins for WeeChat, an IRC client written in C.
One of the current issues with embedding Python is that it doesn't explicitly release memory at exit. If you use a tool to track memory leaks, such as Valgrind, then you can see a lot of memory leaks when you exit Python.
Python makes the assumption that the process is done as soon as you exit, so you wouldn't need to release memory. But that doesn't work for embedded Python because applications can survive after calling Py_Finalize(), so you have to modify Py_Finalize() to release all memory allocations done by Python. Doing that is even more important for Py_EndInterpreter(), which is used to exit the subinterpreter.
Running Multiple Interpreters in Parallel
The idea is to run one interpreter per thread and one thread per CPU, so you use as many interpreters as you have CPUs to distribute the workload. It's similar to multiprocessing use cases, such as distributing machine learning.
Why Do We Need a Single Process?
There are multiple advantages to using a single process. Not only can it be more convenient, but it can also be more efficient for some use cases. Admin tools are designed for handling a single process rather than multiple processes. Some APIs don't work across processes since they are designed for use within a single process. On Windows, creating a thread is faster than creating a process. In addition, macOS decided to ban fork(), so multiprocessing uses spawn by default and is slower.
No Shared Object
The issue with running multiple interpreters is that all CPUs have access to the same memory, so there is concurrent access to each object's reference count. One way to keep the code correct is to put a lock on the reference counter or use an atomic operation, but that can create a performance bottleneck. One solution would be to not share any objects between interpreters, even immutable ones.
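The bottleneck can be sketched with a toy refcount guarded by a lock: every incref and decref from any thread must pass through the same lock (or an atomic operation), which is exactly the contention the no-shared-objects design avoids.

```python
import threading

class SharedObject:
    # Toy sketch: a refcount shared by many threads must be protected,
    # and the lock (or atomic op) becomes a point of contention.
    def __init__(self):
        self.refcnt = 1
        self._lock = threading.Lock()

    def incref(self):
        with self._lock:
            self.refcnt += 1

    def decref(self):
        with self._lock:
            self.refcnt -= 1
            return self.refcnt == 0  # True means the object can be freed

obj = SharedObject()
threads = [threading.Thread(target=lambda: (obj.incref(), obj.decref()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(obj.refcnt)  # → 1
```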
What Drawbacks Do Subinterpreters Have?
If you have a crash, like a segfault, then all subinterpreters will be killed. You need to make sure that all imported extensions support subinterpreters.
C API & Extensions
Next, Dong-hee Na shared the current status of the extension modules that support heap types, module state, and multiphase initialization. In order to support multiple subinterpreters, you need to support multiphase initialization (PEP 489), but first you need to convert static types to heap types and add module state. PEP 384 and PEP 573 support heap types, mostly through the PyType_FromSpec() and PyType_FromModuleAndSpec() APIs. Dong-hee Na walked the summit attendees through an example with the _abc module extension.
Work Done So Far
Victor Stinner outlined some of the work that has already been done. They had to deal with many things to make interpreters not share objects anymore, such as free lists, singletons, slice cache, pending calls, type attribute lookup cache, interned strings, and Unicode identifiers. They also had to deal with the states of modules because there are some C APIs that directly access states, so they needed to be per interpreter rather than per module instance.
One year ago, Victor Stinner wrote a proof of concept to check if the design for subinterpreters made sense and if they're able to scale with the number of CPUs:
Work That Still Needs to Be Done
Some of the easier TODOs are:
- Converting remaining extensions and static types
- Making _PyArg_Parser per interpreter
- Dealing with the GIL itself
Some of the more challenging TODOs are:
- Removing static types from the public C API
- Making None, True, and False singletons per interpreter
- Getting the Python thread state (tstate) from a thread local storage (TLS)
There are some ideas for the future:
- Having an API to directly share Python objects
- Sharing data and using one Python object per interpreter with locks
- Supporting spawning subprocesses (fork)
If you want to know more, you can play around with this yourself: