Wednesday, June 16, 2021

Update on the Python Software Foundation Executive Director

After ten years of exceptional service to the Python Software Foundation, the PSF Executive Director Ewa Jodlowska has decided to leave the Foundation at the end of 2021. We wish to thank Ewa for her many years of service and contributions to not only the Foundation but to the entire Python community. It’s safe to say the PSF, PyCon and the whole Python community would not be where it is today if not for Ewa.

In preparation for Ewa’s departure, the Python Software Foundation will begin a search to find a new Executive Director. The Executive Director is a key player in helping the Foundation pursue our mission “to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.”

The Board of Directors will work together with the Staff, the outgoing Executive Director, and the community in developing a timeline for the transition as well as posting periodic updates on the search when it formally begins. Please keep an eye on the PSF blog for these updates as well as the forthcoming job listing.

Wednesday, June 09, 2021

The 2021 Python Language Summit: Lightning Talks, Round 2

The second day of the 2021 Python Language Summit finished with a series of lightning talks from Ronny Pfannschmidt, Pablo Galindo, Batuhan Taskaya, Luciano Ramalho, Jason R. Coombs, Mark Shannon, and Tobias Kohn.

Annotated Assertions: Debugging With Joy

Ronny Pfannschmidt spoke about annotated assertions. He is a pytest maintainer and loves approachable debugging.

He compared assertions in unittest with assertions in pytest. He remarked that mistakes have been made in the past and are still being made today. Before pytest 2.1, pytest would reinterpret assertions, which was bad for side effects. Today, pytest deals with side effects by handling all of the variables, collecting them, and showing them to you. 

Here's what he would like to see in Python:

Here's what he'd like to do:

  • Create a PEP or have a PEP sponsor
  • Open the implementation of pytest to a wider audience

PEP 657: Fine-Grained Error Locations in Tracebacks

Pablo Galindo and Batuhan Taskaya shared their thoughts on what they want to do and what they don't want to do with PEP 657. The goal of this PEP is to improve the debugging experience by making the information in tracebacks more specific. It would also help with code coverage tools because it would allow expression-level coverage rather than just line-level coverage. JEP 358 has already accomplished something similar.

The speakers want to:

  • Keep maintenance costs low
  • Keep the size small without overcomplicating the compiler
  • Provide an API for tools to consume
  • Provide an opt-out mechanism

They want to avoid:

  • Adding a new set of .pyc files
  • Adding a new debugging info file format
  • Having a large number of new flags to customize
  • Implementing in memory/size encoding
  • Complicating the compiler too much
  • Providing more than one opt-out mechanism
  • Having manual metadata propagation

For the opt-out mechanism, there will be two ways to deactivate the feature:

  1. Environment variable: PYNODEBUGRANGES
  2. Command line option: -Xnodebugranges

Who Speaks for Mort on python-dev?

Luciano Ramalho explained that Mort, Elvis, and Einstein are names for personas that have been used within Microsoft to understand the needs of users:

  • Mort is an opportunistic developer who like to create quick solutions for immediate problems. He focuses on productivity and learns as needed.
  • Elvis is a pragmatic programmer who likes to create long-lasting solutions. He learns while working on solutions.
  • Einstein is a paranoid programmer who likes to create the most efficient solution to a problem. He typically learns before working on the solution.
Users of Python can be organized into similar groups with distinct needs. Since Einsteins may not clearly understand the needs of Morts and Elvises, Luciano Ramalho suggested that it may be time to recruit core users to speak for the Python users who aren't also core developers.

Annotations as Transforms

Jason R. Coombs shared his thoughts on designating transformation functions to be applied to parameters and return values. He had originally been inspired by the simplicity and power of decorators, and his idea could in theory be applied with decorators today. However, he determined that it would be more elegant to use annotations.

Using this approach would have advantages:

  • Elegant, simple declaration of intended behavior
  • Clear separation of concerns
  • Avoiding rewriting variables in the scope
  • Easy reuse of transformations
  • Explicit type transformation

However, there would also be challenges:

  • Compatibility: Although older versions of Python don't have this functionality, you could implement a compatibility shim.
  • Ambiguity between types and transforms: In order to address this concern, you could potentially:
    • Require transforming functions to be explicitly created
    • Provide a wrapping helper to specify that a type is used as a transform (e.g. -> transform(str))
    • Provide a wrapper helper or explicit types for nontransforming type declarations (e.g. Int or strict(int))

Tiers of Execution: Making CPython Execute Efficiently

Mark Shannon started by defining four tiers of execution:

  • Tier 0: The slowest tier, with minimal memory usage and low startup time
  • Tier 1: Primary interpreter, the adaptive, specializing interpreter
  • Tier 2: Small region, lightweight JIT
  • Tier 3: Large region, heavyweight JIT

The higher a tier, the hotter the code that it will execute. Today, CPython is at tier 0.3. It's a compromise between memory use and speed but isn't optimized for either. He said that tier 0 could be considered for Python 3.11 or later. It could:

  • Minimize startup time and memory use at the expense of execution speed
  • Support a full set of features, including sys.settrace
  • Be able to execute from a .pyc file that is mmapped and immutable
Tier 1 is planned for Python 3.11:
  • Adaptive, specializing interpreter (PEP 659)
  • Possible lack of support for some features, such as sys.settrace
Tiers 2 and 3 are entirely hypothetical at the moment and would involve JIT compilers. They maybe be more like LuaJIT than JVM.

Switching between tiers can be expensive, but the goal is to make it cheaper by having the same in-memory data layout for all tiers. In order to support all of Python, we will need to switch between tiers often. Each tier should be maintained mostly independently for open-source development. The performance cost won't be high if the memory layout is designed carefully.

Running Parallel Python Code in the Browser

Tobias Kohn has been working on TPython, a new Python implementation that works in the browser. His objectives were to do multiprocessing in the browser, not block the UI, and use native JavaScript libraries. 

Because JavaScript has a single thread event queue that contains even I/O and garbage collection, as long as your current thread is running, nothing else can happen while your current task is running. You can use web workers with messages in each of the web worker's event queues, but those messages won't become visible until the event queue gets to them.

You could suspend the current task and let everything in the event queue happen so that the message can be processed and then resume your task later on. To do that, you could use the bytecode in Python 3.6+ because the frame already has an index into the bytecode and captures state, to a certain extent. However, some bytecode instructions are too complex. _add_ can execute arbitrary Python code, fail, call _radd_, and execute other Python code. The standard bytecode is insufficient.

He's currently using an MPI interface for parallel processing. There is:

  • Early-stage multiprocessing support
  • A NumPy-like interface for JavaScript typed arrays
  • No blocking or freezing of the browser's UI
It runs on unmodified CPython 3.6+ bytecode.

Sunday, June 06, 2021

The 2021 Python Language Summit: Fuzzing and Testing Python With Properties

At the 2021 Python Language Summit, Zac Hatfield-Dodds gave a presentation about fuzzing and testing with Python properties. This presentation tied in with the one he gave at the 2020 Python Language Summit.

Zac Hatfield-Dodds
 

What Is Testing?

For the purposes of this talk, he defined testing as the art and science of running code and then checking if it did what it was supposed to do. He added that, although assertions, type checkers, linters, and code review are good, they are not testing.

There are two general reasons why we might have tests:

  1. For correctness:
    1. The goal is to validate software and determine that they are no bugs.
    2. Nondeterminism is acceptable.
    3. Finding any fault is a success.
  2. For software engineering (programming, over time, in teams):
    1. The goal is to validate changes or detect regressions.
    2. Nondeterminism is bad.
    3. Bugs should be in only the diff.

When these two reasons for testing aren't distinguished, there can be miscommunications.

What Is Property-Based Testing?

There are many types of tests:

  • Unit tests
  • Integration tests
  • Snapshot tests
  • Parameterized tests
  • Fuzz tests
  • Property-based tests
  • Stateful model tests

The speaker then walked the summit attendees through an example to explain going from traditional unit tests through to parameterized tests and then seeing how that plays into property-based tests.

Imagine that you needed to test the sorted() builtin. With a traditional set of unit tests, you can write a bunch of cases with the expected inputs and outputs:


If you want to avoid repeating yourself, you can write a list of inputs and outputs:


If you don't have a known good result, then you can still write tests using only the input argument. One option would be to compare to another reference implementation:

However, comparing with another reference implementation might not be an option, so you could just test if the output seems to be right:

In order to improve on this test, you might want to add another property that you can test. You could check that the length of the output is the same as the length of the input and that you have the same set of elements:

This would pass on the incorrect sorted([1, 2, 1]) -> [1, 2, 2].  A brute-force approach using itertools.permutations() would detect that too:


But the best solution is collections.Counter():

This last test uses property-based testing:

 
Instead of having a specific list of inputs, you could use Hypothesis:
 

That test will fail because NaN compares unequal to itself, so any list containing NaN will appear to not be in sorted order. So it could be good to have specified behavior for the ordering on NaN elements in the sorting algorithm:

 
He said that one of the big advantages of using something like Hypothesis rather than a list of handwritten examples is that is will raise conceptual issues that you may not have already thought through yourself.

In summary, property-based testing lets you:

  • Generate input data that you might not have thought of yourself
  • Check that the result isn't wrong, even without the right answer
  • Discover bugs in your understanding rather than just in your code
Often, you don't even need assertions in the test. Generating unusual input data is surprisingly effective. It can give you the sort of feedback you could get from real users, but you don't need to ship before getting the feedback.

A common concern is that, if you have randomized testing, then are things flaky? How do you deal with determinism? Hypothesis has been working on that for years, so they have solid answers to these kinds of questions:

If that's not enough, then you also have other options:

The Hypothesis database is a collection of files on disk that represent the various examples. Since it's a key-value store, it's easy to implement your own custom one:

In this example, you have a local database on disk. You can also have a shared network database on something like Redis, for example.

Coverage-guided fuzzing takes this to the next level:

What's New?

At the 2020 Python Language Summit, when he said that we would find more bugs if we used property-based testing for CPython and the standard library, the response was positive, but then not much happened. Since then, Paul Ganssle has opened a PR on CPython to add some Hypothesis tests for the zoneinfo library. Zac Hatfield-Dodds said that CPython is doing very well on unit testing and has a strong focus on regressions but that it would be quite valuable to add some of the tools that have been developed for testing for correctness.

These tools don't only find existing bugs. They're good at finding regressions where someone checked in new code with what turned out to be inadequate test coverage:

There is a pace at which we find and fix bugs that were preexisting in addition to the ongoing rate of introducing new bugs that then get detected by fuzzing instead of lasting for too long:


What's Next?

There is a three-step plan:

  1. Merge Paul Ganssle's PR or come up with an alternative proposal to get Hypothesis into CPython's CI in order to unblock ongoing incremental work
  2. Merge some tests
  3. Run them in CI and on OSS-Fuzz
For interested parties, you can see and engage in the follow-ups to this work on the Python Steering Council's issue tracker.

Saturday, June 05, 2021

The 2021 Python Language Summit: What Should I Work on as a Core Dev?

At the 2021 Python Language Summit, Eric Snow gave a presentation about how core developers can receive guidance to help them work on improvements to the language that will bring the most benefit to the Python community.

Eric Snow

What Does the Community Need?

When Eric Snow first got involved in core development over a decade ago, he liked that there was so much to learn. But he found that the organization didn't offer a lot of direction in terms of guiding volunteers to the kind of work that would have the biggest impact. 

Over the years, he's thought about how he and other contributors decide what to work on and why. What directs the efforts of contributors and core developers? Contributors make decisions based on their availability, expertise, and interests. Core developers need to act as stewards for the language. There's plenty of collaboration that goes on, but everyone has their own idea of what the community needs, what would be interesting to work on, and what the Python language needs. As a result, it can be hard to see the bigger picture.

Would a PM Help?

Time and time again, he has asked himself what he can work on to best help the Python community. We all care about this language and the community that surrounds it, so we want to help it as much as we can. Sometimes, it's hard to get a sense of what will help the community and the language the most from our own limited, individual perspectives.

One solution could be to have a dedicated PM who can provide the direction that we've been missing. This person could be provided by the PSF or a sponsoring organization. They could compile and maintain a list of improvements that would be most beneficial to the community. They wouldn't dictate what would be worked on, but they could surface what the community needs.

There has been talk of the Steering Council providing a road map. Having a PM could help:

  • Provide a clear picture of what that road map could look like
  • Help developers and maintainers make decisions about where to spend their time as volunteers
  • Facilitate collaboration

Discussion

Eric Snow was interested in hearing whether or not other core developers would find a PM helpful.  

Luciano Ramalho said that he was strongly in favor of having a PM who could assess the needs of the community from the perspective of the community rather than only the perspective of core developers. This idea overlaps with the questions he raised in his lightning talk at this summit. He also mentioned that Go has a PM role similar to what Eric Snow was suggesting.

Other attendees discussed how this kind of role could be funded and considered how much benefit the role could bring considering that the PSF is working with limited financial resources. They also discussed the differences between a Product Manager and a Program Manager and determined that this role would be more like a Program Manager.