To increase the security of Python package downloads, we're beginning to introduce two-factor authentication (2FA) as a login security option on the Python Package Index. This work is funded by a grant from the Open Technology Fund and coordinated by the Packaging Working Group of the Python Software Foundation.
Starting today, the canonical Python Package Index at PyPI.org and the test site at test.pypi.org offer 2FA for all users. We encourage project maintainers and owners to log in and go to their Account Settings to add a second factor. This will help improve the security of their PyPI user accounts, and thus reduce the risk of vandals, spammers, and thieves gaining account access.
PyPI's maintainers tested this new feature throughout May and fixed several resulting bug reports; regardless, you might find a new issue. If you find any potential security vulnerabilities, please follow our published security policy. (Please don't report security issues in Warehouse via GitHub, IRC, or mailing lists. Instead, please directly email one or more of our maintainers.) If you find an issue that is not a security vulnerability, please report it via GitHub.
PyPI currently supports a single 2FA method: generating a code through a Time-based One-time Password (TOTP) application. After you set up 2FA on your PyPI account, you must provide a TOTP code (along with your username and password) to log in. To use 2FA on PyPI, you'll therefore need to provision an application (usually a mobile phone app) to generate authentication codes; see our FAQ for suggestions and pointers.
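For readers curious about what a TOTP application computes, here is a minimal sketch of the algorithm as standardized in RFC 6238 (built on HOTP from RFC 4226). It is not PyPI's implementation. The function names `hotp` and `totp` are made up for illustration, and the parameters (HMAC-SHA-1, 30-second time step, 6 digits) are the RFC defaults, assumed here rather than taken from PyPI's configuration.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    # HMAC-SHA-1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the number of elapsed steps."""
    return hotp(secret, int(unix_time) // step, digits)


# Shared secret from the RFC test vectors (real secrets are random bytes,
# usually exchanged as a Base32 string or QR code).
rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, 59))  # 287082
```

Because the server and the application share the secret and the clock, both sides derive the same short-lived code without any network exchange.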
You'll need to verify your primary email address on your Test PyPI and/or PyPI accounts before setting up 2FA. You can also do that in your Account Settings.
As noted, TOTP is currently the only supported 2FA method. Also, 2FA affects only login via the website, which safeguards against malicious changes to project ownership, deletion of old releases, and account takeovers. Package uploads will continue to work without 2FA codes.
But we're not done! We're currently working on WebAuthn-based multi-factor authentication, which will let you use, for instance, Yubikeys for your second factor. Then we'll add API keys for package upload, then an advanced audit trail of sensitive user actions. More details are in our progress reports.
Thanks to the Open Technology Fund for funding this work. And please sign up for the PyPI Announcement Mailing List for future updates.
Thursday, May 30, 2019
Wednesday, May 29, 2019
2018 in review!
Happy New Year from the PSF! We’d like to highlight some of our activities from 2018 and update the community on the initiatives we are working on.
PyCon 2018
PyCon 2018 was held in Cleveland, Ohio, US. The conference brought together 3,389 attendees from 41 countries. We awarded $118,543 in financial aid to 143 attendees. In addition to financial aid, the conference continues to offer childcare for attendees, a newcomer orientation, a PyLadies lunch, and many more events.
Registration is now open for PyCon 2019: https://pycon.blogspot.com/2018/11/pycon-2019-registration-is-open.html .
Community Support
We initiated a Python Software Foundation Meetup Pro network at the end of the year, which supports 37 meetups in 8 countries, with further expansion planned. The sponsorship model allows the PSF to invite existing groups to the Meetup Pro network. Organizers no longer pay for the meetup subscription once they become part of the PSF network. This initiative will save approximately 32 hours of PSF staff time and 21 hours of meetup organizer time.
To help with transparency, the PSF launched its first newsletter in December! If you’d like to receive our next edition, subscribe here: https://www.python.org/psf/newsletter/. You can read our first edition here: https://mailchi.mp/53049c7e2d8b/python-software-foundation-q4-newsletter
This year we formalized our fiscal sponsorship program to better support mission-related projects. The PSF has signed fiscal sponsorship agreements with 8 groups: Pallets (Flask), PhillyPUG, PuPPy, PyCascades, PyHawaii, PyMNtos, PyArkansas, and the Python San Diego User Group. Through this effort, the PSF is able to support these projects by handling their accounting and admin work so the projects can concentrate on furthering their goals.
Python Package Index
Thanks to a generous award from the Mozilla Open Source Support program, the rollout of the all-new Python Package Index, based on the Warehouse codebase, was completed in April 2018.
If you are interested in what the Packaging Group is currently working on, check out their RFP for security and accessibility development: http://pyfound.blogspot.com/2018/12/upcoming-pypi-improvements-for-2019.html.
Grants
The Python Ambassador program helps further the PSF's mission with the help of local Pythonistas. The goal is to perform local outreach and introduce Python to areas where it may not exist yet. In March 2018, the board approved expanding our Python Ambassador program to include East Africa. Kato Joshua and the Afrodjango Initiative have been doing great outreach in universities in Uganda, Rwanda, and Kenya.
In a general overview, $324,000 was paid in grants last year to recipients in 51 different countries. We awarded $59,804 more in grants in 2018 than 2017. That's a 22.6% increase for global community support.
[Chart: global grant distribution in 2018]
PSF Staff
In June, Ernest W. Durbin III was hired as Director of Infrastructure. Ernest will be evaluating and strengthening internal systems, supporting and improving community infrastructure, and developing programs that benefit the Python community worldwide.
In September, the PSF hired Jackie Augustine as Event Manager. Jackie will be working with the team on all facets of PyCon and managing several community resources for regional conferences.
It is with great pleasure that we announce that Ewa Jodlowska will be the PSF's first Executive Director, starting January 1, 2019. Given her years of dedicated service to the PSF from event manager to her current position as Director of Operations, we can think of no one more qualified to fill this role as the PSF continues to grow and develop.
Community Recognition
Throughout 2018, we presented several awards to recognize those who go above and beyond in our community. This year we gave out several Community Service Awards, a Distinguished Service Award, and a Frank Willison Memorial Award. To find out more about our awards or how to nominate someone for a Community Service Award, check out: https://www.python.org/community/awards/.
Community Service Awards
Chukwudi Nwachukwu was recognized for his contributions to the growth of Python in the Nigerian community and for his dedication and research on behalf of the PSF grants work group.
Mario Corchero was awarded a CSA for his leadership of the organization of PyConES, PyLondinium, and the PyCon Charlas track in 2018. His work has been instrumental in promoting the use of Python and fostering Python communities in Spain, Latin America, and the UK.
We also honored our Job Board volunteers: Jon Clements, Melanie Jutras, Rhys Yorke, Martijn Pieters, Patrice Neff, and Marc-Andre Lemburg, who have spent many hours reviewing and managing the hundreds of job postings submitted on an annual basis.
Mariatta Wijaya was an awardee for her contributions to CPython, her efforts to improve the workflow of the Python core team, and her work to increase diversity in our community. In addition, her work as co-chair of PyCascades helps spread the growth of Python.
Alex Gaynor received an award for his contributions to the Python and Django communities and the Python Software Foundation. Alex previously served as a PSF Director in 2015-2016. He currently serves as an Infrastructure Staff member and contributes to legacy PyPI and the next-generation Warehouse. He has also improved security (by disabling unsupported OpenID) and cut bandwidth costs by compressing 404 images.
2018 Distinguished Service Award
The 2018 Distinguished Service Award was presented to Marc-Andre Lemburg for his significant contributions to Python as a core developer, EuroPython chair, PSF board member, and board member of the EuroPython Society.
2018 Frank Willison Memorial Award
The Frank Willison Memorial Award for Contributions to the Python Community was awarded to Audrey Roy Greenfeld and Daniel Roy Greenfeld for their contributions to the development of Python and the global Python community through their speaking, teaching, and writing.
Donations and Sponsorships
In 2018 we welcomed 17 new sponsors, including our first Principal Sponsors, Facebook and Capital One. Thank you for your very generous support.
We welcome your thoughts on how you’d like to see our Foundation involved in Python’s ecosystem and are always interested in hearing from you. Email us!
We wish you a very successful 2019!
Ewa Jodlowska
Executive Director
Betsy Waliszewski
Sponsor Coordinator
Tuesday, May 28, 2019
Python Core Developer Mentorship
Core developer Victor Stinner described to the Language Summit his method of mentoring potential new core developers. His former apprentices Pablo Galindo Salgado and Cheryl Sabella, who have both been promoted to core developer in the last year, recounted their experiences of the mentorship program.
Read more 2019 Python Language Summit coverage.
Barriers To Entry
Python needs more core developers now, according to Stinner, to spread the burden of maintaining the code and reviewing contributions. Regular contributors can be promoted to the core team, but this process can take up to five years and few contributors stay engaged for long enough, because contributing to the Python project is discouraging.
Contributors’ main frustration is that pull requests can languish for months or years without a review, so they give up and seek a responsive project instead. Python is caught in a Catch-22, where the core team’s understaffing makes the project unwelcoming to potential recruits, which means the team stays understaffed. But there are other hurdles for contributors: The code base is 30 years old, with some dusty corners and complex parts, and it supports a wild variety of platforms. Python’s popularity can also be a barrier; it is frightening to modify code used by millions of people.
The Fast Path To The Core Team
Stinner described how core developers can overcome the Catch-22 by personally mentoring prospective teammates, as he does. He identifies promising coders who contribute frequently, and contacts them to offer mentorship.
Stinner said that a mentor must follow the apprentice’s progress closely over a period of many months, at least. Not all worthwhile effort results in a Git commit: an apprentice must spend time learning the workflow, the codebase, and so on. With close attention, a mentor will know that the apprentice is making progress even during quiet periods. Once an apprentice submits a pull request, the mentor’s job is to provide a prompt, thorough review, or recruit the appropriate expert to do so.
Stinner admitted that it is difficult to prioritize among the many items on his to-do list, so he dedicates time on his schedule for mentoring to ensure he is available for his apprentice. It is particularly important to ask regularly, "What are you doing? Are you stuck? Do you need some help?"
The main goal of the mentorship is to keep the apprentice motivated to contribute to Python. Compared to the usual contributor’s experience of submitting a patch and getting no response for months, an apprentice with a committed mentor will have reliable feedback and encouragement. If the mentor and apprentice stay engaged for long enough, the apprentice can earn the mentor’s trust and be nominated for promotion to the core team.
In Stinner’s view, mentorship must happen in private, so the apprentice can be comfortable asking “dumb” questions. The mentorship should also be secret, at least at the beginning. Core developer Ned Deily commented that it would be helpful to know who is being mentored so he could prioritize reviewing their patches and answering their questions. But Stinner said he does not announce when he begins mentoring someone, to avoid pressure. “It can be very scary to see many people looking at your work.”
Mentors should provide a series of rewards for apprentices to earn. Stinner said that he initially considered gamifying the mentorship process with badges, but rejected this idea. Instead, contributors are rewarded with ever greater responsibilities. Bug triage is a good first responsibility, since the cost of mistakes is trivial: a bug closed in error can be reopened, a mislabelled bug can be labelled correctly. In Stinner’s experience, apprentices are eager to gain more responsibility and they take each new task seriously. “They understand what it means and they do their best not to make mistakes.”
Stinner invited two recently promoted core developers to describe their experience as apprentices.
Pablo Galindo Salgado
Pablo Galindo Salgado was promoted to core developer in June 2018. He told the Language Summit that one of a mentor’s most important roles is as a source of tribal knowledge. Many tasks as a core developer require knowledge of undocumented behaviors, or the historical context for a piece of code, or who is the current expert about a certain aspect of Python. Apprentices have an advantage over other contributors because they have someone to ask these questions.
According to Galindo, there must be a moment in the mentorship where the core developer encourages the apprentice to embrace failure. “I committed some mistakes in the beginning,” he said. “When you don't have context, you think you broke the world.” Victor Stinner and Yury Selivanov explained gently that everyone is human, and shared stories about their own past mistakes.
Cheryl Sabella
Cheryl Sabella became a core developer in February 2019. When she began working on CPython two years earlier, it was the first open source project she had contributed to. “So I knew nothing,” she told the Language Summit. Fortunately, she said, the community supported her as she learned git, the Python development workflow, and the codebase itself. Her first pull request was a documentation change. When Mariatta Wijaya approved it and commented with the “Ta-da” emoji, says Sabella, “I was over the moon.”
Sabella contributed for some time, especially to IDLE, and one day Stinner wrote to say that he’d granted her bug triage authority. As she recounted to the Language Summit, this new power made her nervous; she would never have asked for it. The next year, when Stinner invited her to become a core developer, she hesitated for so long that Stinner eventually told her, "Okay, I'm not going to bother you anymore about this." Then he invited her again in January 2019, saying, "Well, I told you I wasn't going to but I'm bothering you again."
Sabella said she had not begun the mentorship program with the intent of becoming a core developer; she only wanted to contribute. It was the core team’s regular guidance and cheerleading that motivated her to join.
Victor Stinner
Victor Stinner returned to the podium to share his insights as a mentor. He said mentors should choose apprentices who represent not only diverse nationalities and genders, but also diverse skill levels. The core team spends much of their time reviewing contributors’ pull requests, and they need a variety of skills and personalities to review them all: some patches are documentation, some are in Python or C, some require specialized knowledge, some are just very tedious.
Stinner said that mentors should accept a range of outcomes. “It's not a failure if, at the end of some mentoring, the mentoree doesn't become a core developer.” Mentorships are often interrupted by professional duties or events in either participant’s life, or the mentor and apprentice turn out to be a poor match. There is value from the relationship nevertheless. The apprentice becomes a better Python contributor and a better programmer. The mentor learns, by observing the apprentice’s difficulties, about barriers to entry on the Python project, such as gaps in the documentation or tooling.
Mentoring is a small burden, Stinner told the Language Summit. Apprentices are typically available only one day a week, because Python competes with a regular job or university program, so they can consume only a few hours of the mentor’s time. The mentorship program is efficient and effective: in the previous twelve months, five apprentices have been promoted to core developer. Stinner told the Language Summit, “I think everybody in this room can do more mentoring.”
Monday, May 27, 2019
Mariatta Wijaya: Let's Use GitHub Issues Already!
Core developer Mariatta Wijaya addressed her colleagues, urging them to switch to GitHub issues without delay. Before she began, Łukasz Langa commented that the previous two sessions had failed to start any controversies. “Come on, we can do better!”
Wijaya replied, “Hold my tequila.”
Read more 2019 Python Language Summit coverage.
Python’s Issue Tracker Is Stagnating
The current Python bug tracker is hosted at bugs.python.org (“BPO”) and uses a bug tracker called Roundup. Roundup’s development is stagnant, and it lacks many features that the Python project could use. In theory, the Python community could improve Roundup, but there are barriers: Roundup is versioned in Mercurial and it has no continuous integration testing. “If the community cared about improving bugs.python.org,” Wijaya asked, “why we haven't been doing it all this time? Seems like you're interested in doing something else.”
Compared to Roundup, GitHub issues have a number of superior features. Project administrators can easily edit issues there or report abuse. GitHub issues permit replying by email, and GitHub supports two-factor authentication. The GitHub APIs allow the core team to write bots that take over many Python development chores. Already, bots backport patches and enforce the Contributor License Agreement (“CLA”); bots could become even more powerful once issues are moved to GitHub.
GitHub Issues: A Yearlong Debate
Shortly after last year’s summit, Wijaya proposed in PEP 581 that Python migrate to GitHub issues. She acknowledged that it was wrenching to give up on BPO and Roundup after so many years. However, in her opinion, it is time to move to a different issue tracker, and GitHub is the natural choice. The core developers are all familiar with it, as are most potential contributors.
The plan for moving to GitHub issues is split into two PEPs: The rationale is explained in PEP 581, and the migration plan is in PEP 588. The first steps are to back up all GitHub data already associated with the repository, and to set up a CLA assistant for issues—research for both tasks is in progress. Additionally, the Python organization on GitHub needs a bug triage team for people with permission to label or close tickets.
Of course, the main job is to copy thousands of issues from Roundup to GitHub with maximum fidelity, which requires knowledge of the Roundup codebase. Wijaya asked for help from someone who could write the migration code or teach her how to do it. Either way, it is likely to be the core team’s final encounter with Roundup’s code.
“Now,” said Wijaya, “Let's just use Github already! Why aren't we doing this yet?” She asked the audience what anxieties GitHub issues provoked, or what questions were still unanswered in her PEP. The sooner the migration is complete, she believes, the better for the core developers and the entire Python community.
Discussion
Ned Deily suggested revising the Python Development Guide early to describe the GitHub issues workflow before migration begins. This would prevent a period of confusion among core developers after the migration. Besides, the process of updating the Guide might flush out more details that the PEPs need to specify.
Thomas Wouters made a proposal, which he feared was controversial: Don’t migrate the old bugs. Wijaya and audience members responded with several versions of this idea. BPO could be made read-only, with the addition of a “Migrate to GitHub” button on bugs that anyone could press if they cared about an old bug. Or BPO could stay read-write for a while; active bugs would be automatically migrated until a sunset date. Some issues have useful patches or comments which should not be lost, so either BPO must be kept online with links from GitHub issues to their BPO ancestors, or else each BPO issue’s entire history must be copied to GitHub.
Guido van Rossum concluded that there were many decisions yet to be made before the migration could begin. “I'm not trying to say let's spend another year thinking about this,” he said. “I want this as badly as you want it.” However, the team must consider the consequences more carefully before they act.
Steve Dower spoke up to say that he would prefer to stay on BPO. The current tracker’s “experts index” is particularly useful: it automatically notifies the Windows team, for example, when a relevant bug is filed, and there is no equivalent GitHub feature. He rebelled at being told in effect, “Here is the change, why haven't we done it already?” He felt the default decision on any PEP ought to be maintaining the status quo.
Barry Warsaw said, “Let's remember we have friends at GitHub that will help us with the process.” If the core team finds missing features in GitHub issues, perhaps GitHub will implement them.
Carol Willing argued, “There comes a point in time when have to put a stake in the ground. Nobody's saying Github is perfect, but you need to ask, are we holding back other contributions by staying on BPO?” Many scientific Python projects such as NumPy already track their issues on GitHub. If Python migrates to GitHub issues it could interact better with them, as well as with future projects that take Python in new directions. “By staying locked in bugs.python.org, we're doing ourselves a disservice.”
Postscript
Two weeks after the Summit, PEP 581 was officially approved, making the migration to GitHub inevitable.
Tuesday, May 21, 2019
Petr Viktorin: Extension Modules And Subinterpreters
When a Python subinterpreter loads an extension module written in C, it tends to unwittingly share state with other subinterpreters that have loaded the same module, unless that module is written very carefully. Petr Viktorin addressed the Python Language Summit to describe the problem in detail and propose a cleaner isolation of subinterpreters.
Read more 2019 Python Language Summit coverage.
Python-Based Libraries Use Subinterpreters For Isolation
Python can run several interpreter instances in a single process, keeping each subinterpreter relatively isolated from the others. There are two ways this feature could be used in the future, but both require improvements to Python. First, Python could achieve parallelism by giving each subinterpreter its own Global Interpreter Lock (GIL) and passing messages between them; Eric Snow has proposed this use of subinterpreters in PEP 554.
Another scenario is when libraries happen to use Python as part of their implementation. Viktorin described, for example, a simulation library that uses Python and NumPy internally, or a chat library that uses Python and asyncio. It should be possible for one application to load multiple libraries such as this, each of which uses a Python interpreter, without cross-contamination. This use case was the subject of Viktorin’s presentation. The problem, he said, is that “CPython is not ready for this,” because it does not properly manage global state.
There Are Many Kinds Of Global State
Viktorin described a hierarchy, or perhaps a tree, of kinds of global state in an interpreter.
Process state: For example, open file descriptors.
Runtime state: The Python memory allocator’s data structures, and the GIL (until PEP 554).
Interpreter state: The contents of the "builtins" module and the dict of all imported modules.
Thread state: Thread locals like asyncio’s current event loop; fortunately this is per-interpreter.
Context state: Implicit state such as decimal.context.
Module state: Python variables declared at file scope or with the “global” keyword, which in fact creates module-local state.
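The last item in the list above can be counterintuitive, so here is a minimal sketch of it. It builds two throwaway module objects from the same source with `types.ModuleType` and `exec`, then shows that a variable declared with `global` lives in each module's own namespace. The module names `demo_a` and `demo_b` and the `SOURCE` string are invented for illustration and are not from the talk.

```python
import types

# Source for a tiny module with one "global" variable (hypothetical example).
SOURCE = """
counter = 0

def bump():
    global counter  # "global" means "this module's namespace"
    counter += 1
    return counter
"""


def make_module(name):
    """Create a fresh module object and run SOURCE inside its namespace."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


first = make_module("demo_a")
second = make_module("demo_b")

first.bump()
first.bump()
second.bump()

# Each module keeps its own counter: the state is module-local.
print(first.counter, second.counter)  # 2 1
```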
Module State Behaves Surprisingly
With a series of examples, Viktorin demonstrated the subtle behavior of module-level state.
To begin with a non-surprising example, a pure-Python module’s state is recreated by re-importing it:
import sys
import enum
old_enum = enum
del sys.modules['enum']
import enum
old_enum == enum  # False
But surprisingly, a C extension module only appears to be recreated when it is re-imported:
import sys
import _sqlite3
old_sqlite3 = _sqlite3
del sys.modules['_sqlite3']
import _sqlite3
old_sqlite3 == _sqlite3  # False
The last line seems to show that the two modules are distinct, but as Viktorin said, “This is a lie.” The module’s initialization is not re-run, and the contents of the two modules are shared:
old_sqlite3.Error is _sqlite3.Error # True
It is far too easy to contaminate other subinterpreters with these shared contents—in effect, a C extension’s module state is therefore a process global state.
Modules Must Be Rewritten Thoughtfully
C extensions written in the new style avoid this problem with subinterpreters. Not all C extensions in the standard library are updated yet; Christian Heimes commented that the ssl module must be ported to the new style of initialization. Although it is simple to find modules that must be ported, the actual porting requires thought. Coders must meticulously distinguish among different kinds of global state. C static variables are process globals, PyState_FindModule returns an interpreter-global reference to a module, and PyModule_GetState returns module-local state. Each nugget of module data must be deliberately placed at one of the levels in the hierarchy.
As an example of how tricky this is, Viktorin pointed out a bug in the csv module. If it is imported twice, exception handling breaks:
import sys
import _csv
old_csv = _csv
del sys.modules['_csv']
import _csv
try:
    # Pass an invalid array to reader(): should be a string, not 1.
    list(old_csv.reader([1]))
except old_csv.Error:
    # The exception clause should catch the error but doesn't.
    pass
The old_csv.reader function ought to raise an instance of old_csv.Error, which would match the except clause. In fact, the csv module has a bug. When it is re-imported it overwrites interpreter-level state, including the _csv.Error type, instead of keeping its state at the module-local level.
Audience members agreed this was a bug, but Viktorin insisted that this particular bug is merely a symptom of a larger problem: it is too hard to write properly isolated extension modules. Viktorin and three coauthors have proposed PEP 573 to ease this problem, with special attention to exception types.
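The failure mode described above can be imitated in pure Python. The sketch below is only an analogy, not the actual C code of the csv module: `shared_state`, `FAKE_CSV_SOURCE`, and `load_fake_csv` are hypothetical names, and the dictionary stands in for state that a C extension keeps outside its module object.

```python
import types

# Stand-in for state stored outside the module object (hypothetical; in a C
# extension this could be a static variable or another interpreter-level slot).
shared_state = {}

FAKE_CSV_SOURCE = '''
class Error(Exception):
    pass

# Each (re)import overwrites the shared slot with a brand-new Error class.
shared_state["Error"] = Error

def reader(rows):
    for row in rows:
        if not isinstance(row, str):
            # Looks the exception type up in shared state, not in this module.
            raise shared_state["Error"]("expected a string")
    return list(rows)
'''


def load_fake_csv():
    """Build a new module object from the source, as a re-import would."""
    module = types.ModuleType("fake_csv")
    module.shared_state = shared_state
    exec(FAKE_CSV_SOURCE, module.__dict__)
    return module


old = load_fake_csv()
new = load_fake_csv()  # simulates deleting the module and importing it again

caught_by_old_error = False
raised_type = None
try:
    old.reader([1])
except old.Error:
    caught_by_old_error = True
except Exception as exc:
    raised_type = type(exc)

print(caught_by_old_error)       # False
print(raised_type is new.Error)  # True
```

Storing the `Error` class on the module itself and raising that class would keep the old and new copies independent, which is the module-local discipline the talk recommends.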
Viktorin advised all module authors to keep state at the module level. He recognized that this is not always possible: for example, the Python standard library’s readline module wraps the C readline library, which has global hooks. These are necessarily process-global state. He asked the audience how this scenario should be handled. Should readline raise an error if it is imported in more than one subinterpreter? He said, “There’s some thinking to do.” In any case, CPython needs a good default.
The correct way to code a C extension is to use module-local state, and that should be the most obvious place to store state from C. It seems to Viktorin that the newest style APIs do emphasize module-local state as he desires, but they are not yet well-known.
Further reading:
PEP 384 (3.2): Defining a Stable ABI
PEP 489 (3.5): Multi-phase extension module initialization
PEP 554 (3.9): Multiple Interpreters in the Stdlib
PEP 573 (3.9): Module State Access from C Extension Methods
Not a PEP yet: CPython C API Design Guidelines (layers & rings)
Saturday, May 18, 2019
Scott Shawcroft: History of CircuitPython
Scott Shawcroft is a freelance software engineer working full time for Adafruit, an open source hardware company that manufactures electronics that are easy to assemble and program. Shawcroft leads development of CircuitPython, a Python interpreter for small devices.
The presentation began with a demo of Adafruit’s Circuit Playground Express, a two-inch-wide circular board with a microcontroller, ten RGB lights, a USB port, and other components. Shawcroft connected the board to his laptop with a USB cable and it appeared as a regular USB drive with a source file called code.py. He edited the source file on his laptop to dim the brightness of the board’s lights. When he saved the file, the board automatically reloaded the code and the lights dimmed. “So that's super quick,” said Shawcroft. “I just did the demo in three minutes.”
Read more 2019 Python Language Summit coverage.
CircuitPython Is Optimized For Learning Electronics
The history of CircuitPython begins with MicroPython, a Python interpreter written from scratch for embedded systems by Damien George starting in 2013. Three years later, Adafruit hired Shawcroft to port MicroPython to the SAMD21 chip they use on many of their boards. Shawcroft’s top priority was serial and USB support for Adafruit’s boards, and then to implement communication with a variety of sensors. “The more hardware you can support externally,” he said, “the more projects people can build.”
As Shawcroft worked with MicroPython’s hardware APIs, he found them ill-fitting for Adafruit’s goals. MicroPython customizes its hardware APIs for each chip family to provide speed and flexibility for hardware experts. Adafruit’s audience, however, is first-time coders. Shawcroft said, “Our goal is to focus on the first five minutes someone has ever coded.”
To build a Python for Adafruit’s needs, Shawcroft forked MicroPython and created a new project, CircuitPython. In his Language Summit talk, he emphasized it is a “friendly fork”: both projects are MIT-licensed and share improvements in both directions. In contrast to MicroPython’s hardware APIs that vary by chip, CircuitPython has one hardware API, allowing Adafruit to write one set of libraries for them all.
MicroPython has a distinct standard library that differs from CPython’s: for example, its time functions are in a module named utime with a different feature set from the standard time module. It also ships modules with features not found in CPython’s standard library, such as advanced filesystem management features. In CircuitPython, Shawcroft removed the nonstandard features and modules. This change helps new coders ramp smoothly from CircuitPython on a microcontroller to CPython on a full-size computer, and it makes Adafruit’s libraries reusable on CPython itself.
Another motive for forking was to create a separate community for CircuitPython. In the original MicroPython project’s community, Shawcroft said, “There are great folks, and there's some not-so-great folks.” The CircuitPython community welcomes beginners, publishes documentation suitable for them, and maintains standards of conduct that are safe for minors.
Audience members were curious about CircuitPython’s support for Python 3.8 and beyond. When Damien George began MicroPython he targeted Python 3.4 compliance, which CircuitPython inherits. Shawcroft said that MicroPython has added some newer Python features, and decisions about more language features rest with Damien George.
Minimal Barrier To Entry
Shawcroft aims to remove all roadblocks for beginners to be productive with CircuitPython. As he demonstrated, CircuitPython auto-reloads and runs code when the user saves it; there are two more user experience improvements in the latest release. First, serial output is shown on a connected display, so a program like print("hello world") will have visible output even before the coder learns how to control LEDs or other observable effects. Second, error messages are now translated into nine languages, and Shawcroft encourages anyone with language skills to contribute more.
Guido van Rossum and A. Jesse Jiryu Davis were excited to see these translations and suggested contributing them to CPython. Shawcroft noted that the existing translations are MIT-licensed and can be ported; however, the translations do not cover all the messages yet, and CircuitPython cannot show messages in non-Latin characters such as Chinese. Chinese fonts are several megabytes of characters, so the size alone presents an unsolved problem.
Later this year, Shawcroft will add Bluetooth support for coders to connect their phone or tablet to an Adafruit board and enjoy the same quick edit-refresh cycle there. Touchscreens will require a different sort of code editor, perhaps more like EduBlocks. Despite the challenges, Shawcroft echoed Russell Keith-Magee’s insistence on the value of mobile platforms: “My nieces, they have tablets and phones. They do not have laptops.”
Shawcroft’s sole request for the core developers was to keep new language features simple, with few special cases. He gave two reasons. First, each new CPython feature must be reimplemented in MicroPython and CircuitPython, and special cases make this work thorny. Second, complex logic translates into large code size, and the space for code on microcontrollers is minuscule.
Amber Brown: Batteries Included, But They're Leaking
Amber Brown of the Twisted project shared her criticisms of the Python standard library. This proved to be the day’s most controversial talk; Guido van Rossum stormed from the room during Q & A.
Applications Need More Than The Standard Library
Python claims to ship with batteries included, but according to Brown, without external packages it is only “marginally useful.” For example, asyncio requires external libraries to connect to a database or to speak HTTP. Brown asserted that there were many such dependencies from the standard library to PyPI: typing works best with mypy, the ssl module requires a monkeypatch to connect to non-ASCII domain names, datetime needs pytz, and six is non-optional for writing code for Python 2 and 3.
Other standard library modules are simply inferior to alternatives on PyPI. The http.client documentation advises readers to use Requests, and the datetime module is confusing compared to its competitors such as arrow, dateutil, and moment.
Poor Quality, Lagging Features, And Obsolete Code
“Python's batteries are leaking,” said Brown. She thinks that some bugs in the standard library will never be fixed. And even when bugs are fixed, PyPI libraries like Twisted cannot assume they run on the latest Python, so they must preserve their bug workarounds forever.
There are many modules that few applications use, but there is no method to install a subset of the standard library. Brown called out the XML parser and tkinter in particular for making the standard library larger and harder to build, burdening all programmers for the sake of a few. As Russell Keith-Magee had described earlier in the day, the size of the standard library makes it difficult for PyBee to run Python on constrained devices. Brown also noted that some standard library modules were optimized in C for Python 3, but had to be reimplemented in pure Python for PyPy to support them.
Brown identified new standard library features that were “too little, too late,” leaving users to depend on backports to use those features in Python 2. For example, socket.sendmsg was added only recently, meaning Twisted must ship its own C extension to use sendmsg in Python 2. Although Python 2 is nearly at its end of life, this only holds for the core developers, according to Brown; for users, Red Hat and other distributors will keep Python 2 alive “until the goddam end of time.” Brown also mentioned that some itertools code is shown as examples in the documentation instead of shipped as functions in the itertools module.
Guido van Rossum, sitting at the back of the room, interrupted at this moment, “Can you keep to one topic? I'm sorry but this is just one long winding rant. What is your point?” Brown responded that her point was that there are a multitude of problems in the standard library.
Standard Library Modules Crowd Out Innovation
Brown’s most controversial opinion, in her own estimation, is that adding modules to the standard library stifles innovation, by discouraging programmers from using or contributing to competing PyPI packages. Ever since asyncio was announced she has had to explain why Twisted is still worthwhile, and now that data classes are in the standard library Hynek Schlawack must defend his attrs package.
Even as standard library modules crowd out other projects, they lag behind them. According to Brown, “the standard library is where code sometimes goes to die,” because it is difficult and slow to contribute code there. She acknowledged recent improvements, from Mariatta Wijaya’s efforts in particular, but Python is still harder to contribute to than PyPI packages. “So I know a lot of this is essentially a rant,” she concluded, “but it's fully intended to be.”
Discussion
Nick Coghlan interpreted Brown’s proposal as generalizing the “ensurepip” model to ensure some packages are always available but can be upgraded separately from the standard library, and he thought this was reasonable.
Van Rossum was less convinced. He asked again, “Amber, what is your point?” Brown said her point was to move asyncio to PyPI, along with most new feature development. “We should embrace PyPI,” she exhorted. Some ecosystems such as JavaScript rely too much on packages, she conceded, but there are others like Rust that have small standard libraries and high-quality package repositories. She thinks that Python should move farther in that direction.
Van Rossum argued instead that if the Twisted team wants the ecosystem to evolve, they should stop supporting older Python versions and force users to upgrade. Brown acknowledged this point, but said half of Twisted users are still on Python 2 and it is difficult to abandon them. The debate at this point became personal for Van Rossum, and he left angrily.
Nathaniel Smith commented, “I'm noticing some tension here.” He guessed that Brown and the core team were talking past each other because the core team had different concerns from other Python programmers. Brown went further, adding that because few Python core developers are also major library maintainers, library authors’ complaints are devalued or ignored.
The remaining core developers continued the technical discussion. Barry Warsaw said that the core team had discussed deprecating modules in the standard library, or creating slim distributions with a subset of it, but that it required a careful design. Others objected that slimming down the standard library risked breaking downstream code, or making work for programmers in enterprises that trust the standard library but not PyPI.
Pablo Galindo Salgado was concerned that moving modules from the standard library to PyPI would create an explosion of configurations to test, but in Brown’s opinion, “We are already living that life.” Some Linux and Python distributions have selectively backported features and fixes, leading to a much more complex set of configurations than the core team realizes.
Wednesday, May 15, 2019
Paul Ganssle: Time Zones In The Standard Library
Python boasts that it comes with “batteries included,” but programmers have long been frustrated at one set of missing batteries: the standard library does not include any time zone definitions. The datetime module supports the idea of time zones, but a programmer who wants to know when Daylight Saving Time starts in Cleveland must install a third-party package. Paul Ganssle spoke to the Python Language Summit to offer a solution. Ganssle maintains the PyPI package dateutil, and contributes to the standard library datetime module. He described the state of Python time zone support and how time zone definitions could be added to the standard library.
Python Comes With Limited Time Zone Support
A time zone is a function that maps a naïve time to an unambiguous Coordinated Universal Time (UTC). Individual time zones can be quite eccentric, so Python does not attempt to define time zone logic; it simply provides an abstract base class, tzinfo, that implementors subclass. Although there could theoretically be unlimited kinds of time zones, most programmers encounter three concrete types:
1. UTC or a fixed offset from it.
2. Local time.
3. A time zone from the IANA database.
The first of these was added to the standard library in Python 3.2. Ganssle said, “Whenever I teach people about datetimes, it's really nice to be able to say, if you're using Python 3, you can just have a UTC object.” The purpose of Ganssle’s proposal was to add the second and third.
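As a small illustration of the first kind, the standard library's timezone class covers UTC and fixed offsets. The date and the offset below are arbitrary choices for the example, not taken from the talk.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime pinned to UTC using the built-in UTC object.
utc_noon = datetime(2019, 5, 1, 12, 0, tzinfo=timezone.utc)

# A fixed offset five hours behind UTC; it has no DST rules of its own.
minus_five = timezone(timedelta(hours=-5))

# Same instant, rendered with the fixed offset.
local_view = utc_noon.astimezone(minus_five)
print(local_view.isoformat())
```

A fixed offset cannot express Daylight Saving Time transitions, which is why the second and third kinds need more than this class.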
Ambiguous Times
Ganssle explained that when Eastern Daylight Time ends, clocks are set back from 2:00am to 1:00am, so there are two UTC times that map to 1:30am local time on that day:
    >>> from datetime import datetime, timedelta
    >>> from dateutil import tz
    >>> NYC = tz.gettz("America/New_York")
    >>> dt0 = datetime(2004, 10, 31, 5, 30, tzinfo=tz.UTC)
    >>> print(dt0.astimezone(NYC))
    2004-10-31 01:30:00-04:00
    >>> print((dt0 + timedelta(hours=1)).astimezone(NYC))
    2004-10-31 01:30:00-05:00
PEP 495 solved the problem of ambiguous times by adding the “fold” attribute to datetime objects. A datetime with fold=0 is the first occurrence of that local time; the second occurrence has fold=1. With this addition, standard Python provides all the prerequisites for proper time zones, so Ganssle argued they should now be added to the standard library.
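To make the role of fold concrete, here is a minimal sketch of a tzinfo subclass that consults it. The Eastern2004 class is hypothetical and hard-codes the 2004 US Eastern transitions only; it is a teaching toy, not how dateutil or any real time zone library is implemented.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class Eastern2004(tzinfo):
    """Toy zone covering US Eastern rules for 2004 only (hypothetical)."""

    # Local wall-clock moments of the two 2004 transitions.
    DST_START = datetime(2004, 4, 4, 2)
    DST_END = datetime(2004, 10, 31, 2)

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.DST_START + HOUR <= wall < self.DST_END - HOUR:
            return HOUR
        if self.DST_END - HOUR <= wall < self.DST_END:
            # The repeated hour: fold=0 is still EDT, fold=1 is already EST.
            return HOUR if dt.fold == 0 else ZERO
        return ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


first = datetime(2004, 10, 31, 1, 30, tzinfo=Eastern2004())
second = first.replace(fold=1)

print(first.isoformat(), "->", first.astimezone(timezone.utc).isoformat())
print(second.isoformat(), "->", second.astimezone(timezone.utc).isoformat())
```

Both objects show 01:30 on the wall clock, yet they convert to UTC instants one hour apart, because only the fold value differs.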
How To Maintain The Time Zone Definitions?
IANA time zones are the de facto standard for time zone data, and they ship with many operating systems. Both Ganssle’s dateutil and the competing pytz package use the IANA database as their source of truth. Therefore it would be natural to include the IANA time zones in the Python standard library, but this presents a problem: the IANA database changes every time a government changes a time zone, which occurs as often as 20 times a year. Time zone changes are far more frequent than Python releases.
Ganssle offered two solutions for updating time zone data, and then offered a compromise between them as his actual proposal. The first solution is to rely on the operating system’s time zone database. Python could rely on the system update mechanism to refresh this data, and it would use the same time zone definitions as most other applications. System time zone data is not officially supported on Windows, however, and is not always installed on Linux.
The second solution is to publish IANA time zone definitions as a PyPI package. It could be updated frequently, but the core team would have to invent some way to notify users when it is time to update their time zone data. Plus, it would be risky for Python to use different time zones than the rest of the system.
Ganssle proposed a hybrid: the Python standard library should use the system’s time zone data if possible, otherwise fall back to a PyPI package which would be installed conveniently, analogous to installing pip with “ensurepip” today.
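The lookup order in this hybrid can be sketched in a few lines. Everything named here is an assumption made for illustration: the find_time_zone_source function, the list of system directories, and the fallback package name "tzdata" are not part of Ganssle's talk.

```python
import importlib.util
import os

# Common Unix locations for compiled IANA data (an assumption; systems vary).
SYSTEM_TZ_DIRS = ("/usr/share/zoneinfo", "/usr/lib/zoneinfo", "/etc/zoneinfo")


def find_time_zone_source(key, search_paths=SYSTEM_TZ_DIRS, fallback_package="tzdata"):
    """Report where data for an IANA key such as "America/New_York" would come from."""
    # First choice: the operating system's copy, kept fresh by system updates.
    for root in search_paths:
        candidate = os.path.join(root, *key.split("/"))
        if os.path.isfile(candidate):
            return ("system", candidate)
    # Second choice: an installable package, if one is present.
    if importlib.util.find_spec(fallback_package) is not None:
        return ("package", fallback_package)
    raise LookupError(f"no time zone data found for {key!r}")
```

Calling find_time_zone_source("America/New_York") on a typical Linux machine would be expected to report the system copy, while a machine without system data would depend on the package being installed.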
The Local Time Zone
Naïve times in Python are sometimes treated as times in the local time zone, sometimes not. Ganssle showed an example demonstrating that if a programmer converts a naïve time to UTC, Python assumes its original time zone is local:
    >>> from datetime import datetime, timezone
    >>> dt = datetime(2020, 1, 1, 12)
    >>> print(dt.astimezone(timezone.utc))
    2020-01-01 17:00:00+00:00
However, arithmetic that mixes a naïve time with a UTC time is prohibited:
    >>> datetime(2020, 1, 1, 12) - datetime(2020, 1, 1, tzinfo=timezone.utc)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't subtract offset-naive and offset-aware datetimes
Ganssle’s dateutil package offers a more thorough implementation of “local time zone”, and he thinks Python programmers would appreciate local times in the standard library. To add them, however, the core team must first handle the astonishing behavior of local times when the system time zone changes. The first surprise is that changing the system time zone has no effect until the Python program calls time.tzset(). (And on Windows, time.tzset() is not available.) The second surprise is that changing the system time zone and then calling time.tzset() changes the UTC offset of existing times created before the change.
Ganssle proposed several ways the standard library could act in this scenario. It could ignore changes to the system time zone while a Python program is running, or it could detect time zone changes but avoid mutating the offsets of existing time objects. He had no opinion about the best outcome.
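One way to picture the "avoid mutating existing objects" option is with behavior the standard library already has: calling astimezone() with no argument interprets a naïve time as local and attaches a fixed-offset timezone, so the offset is frozen in the object. This sketch only illustrates that idea and is not Ganssle's proposed design.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2020, 1, 1, 12)
aware = datetime(2020, 1, 1, tzinfo=timezone.utc)

# Mixing the two kinds in arithmetic is rejected.
try:
    naive - aware
except TypeError as exc:
    error = str(exc)

# Interpret the naive time as local and pin the current local offset to it.
pinned = naive.astimezone()

# Both operands are aware now, so subtraction is well defined.
elapsed = pinned - aware
print(error)
print(pinned.isoformat(), elapsed)
```

The value of elapsed depends on the machine's local zone, but once created, pinned carries its own offset and does not consult the system zone again.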
Conclusion
Ned Deily wondered what Ganssle’s proposal would solve that pytz does not. Ganssle responded that pytz’s author has stopped maintaining the package because he believes time zones should move to the standard library. Full time zone support is a basic feature that should always be available. In Ganssle’s view, however, his own dateutil is a better package to emulate than pytz. “I would take dateutil, clean up some of the rough edges, and propose it as some of the batteries that would be included.”
Łukasz Langa said that he planned, as Python 3.8’s release manager, to issue monthly patch releases, and he thought that should be frequent enough to keep users’ time zone data updated. Russell Keith-Magee said no, North Korea once announced a time zone change with three days’ notice. Other audience members thought this scenario was obscure, and the PEP should not be required to handle such emergencies.
At the end of his talk Ganssle summarized his proposal. He believes that the standard library should support IANA time zones, using the operating system as the source of time zone data or falling back to a PyPI package. There are several options for handling local time zone changes at runtime. The design should be formalized in at least an informational PEP, “if not one where it's contentious and we all hate each other at the end of it.”
Russell Keith-Magee: Python On Other Platforms
Russell Keith-Magee spoke in his capacity as the founder and “Benevolent Dictator For Now” of the BeeWare Project. The project’s slogan is “Write once. Deploy everywhere.” The goal of the BeeWare Project is to run Python applications on Android, iOS, in the browser, even on smart watches, and to distribute Python applications using platform-specific channels like app stores. Keith-Magee described a number of obstacles to this goal, and expressed his hope that the core team would consider these problems when they plan Python’s future.
Cross-Compilation Is Supported And Tested Poorly
On the server side, x86-64 dominates, but mobile architectures are varied and rapidly changing. iOS alone has encompassed six architectures in recent memory: i386, x86-64, armv6, armv7, armv7s and arm64.
To make matters worse, anyone deploying Python to these architectures must cross-compile, using a desktop or server machine to compile Python and its C extensions, but designating a mobile architecture as the target. “The good news,” said Keith-Magee, is that CPython uses Autoconf, and “Autoconf has really good support for cross compilation baked in. The bad news is that CPython doesn't so much.” Cross compilation is not tested continuously, and so it is broken occasionally in CPython’s build configuration. There is a vendored copy of libffi in the Python source tree, apparently only for supporting the ctypes module on PowerPC, which makes compilation for iOS even more difficult.
Distutils and pip present a tougher challenge. They do not support cross-compilation at all, and assume that the machine on which they run is the target architecture of the C extensions they compile. Neither do they support the “fat binary” format required to support multiple iOS architectures in a single extension module.
OS Differences Break The Test Suite
Python’s assumptions about system calls are violated, in a variety of ways, on mobile and web. The most dangerous pitfalls are fork and spawn: they are provided on iOS but any program that calls them will hang. Large amounts of code in the Python standard library assume the existence of fork and spawn, and even more in the test suite.
Some years ago Keith-Magee provided a patch to fix or disable tests as needed for iOS, but as he recalled, core developer Ned Deily found them “too invasive.” Instead of changing the test suite so pervasively, a new approach might be to skip large chunks of the test suite based on platform limitations.
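Skipping by capability rather than editing each test could look roughly like this, using the standard unittest.skipUnless decorator. The HAS_WORKING_FORK flag, the set of platform names, and the test class are assumptions made for the example, not CPython's actual test helpers.

```python
import os
import sys
import unittest

# Hypothetical capability flag: fork exists AND is expected to work here.
# The platform names listed are an assumption for illustration.
HAS_WORKING_FORK = hasattr(os, "fork") and sys.platform not in {"ios", "emscripten", "wasi"}

# One reusable decorator gates every test that needs the capability.
requires_fork = unittest.skipUnless(HAS_WORKING_FORK, "platform cannot fork")


class ProcessTests(unittest.TestCase):
    @requires_fork
    def test_child_exits_cleanly(self):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: leave immediately without cleanup
        _, status = os.waitpid(pid, 0)
        self.assertEqual(status, 0)

    def test_logic_without_processes(self):
        # Runs everywhere, including platforms where fork is unusable.
        self.assertEqual(sum([1, 2, 3]), 6)
```

The design choice is that the platform knowledge lives in one flag, so adding a new constrained platform means changing one line instead of touching every affected test.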
Python Applications Must Include a Python Installation
Each Python application on a phone or tablet OS must ship an entire Python installation with it, since Python is not available as a system library, and applications cannot install it in a shared location. “If you've got ten Python apps on your phone,” says Keith-Magee, “you are going to have ten installs of Python.” Python for iOS is a fat binary that supports multiple architectures, weighing around 100 MB in total. Python’s size handicaps browser applications, too, which cannot begin running until the Python runtime has been loaded.
The JavaScript community uses “tree shaking”: JavaScript applications are distributed with the minimum subset of all their dependencies that are actually required. Keith-Magee proposes a tool that would automatically apply this technique to the Python standard library. Python builtins or portions of the interpreter itself could be jettisoned, too. For example, if an application is distributed as bytecode, then the parser and compiler could be removed. Keith-Magee said, “While I'm sure that someone has written code that uses Python's complex number data type, I am not that person, and in all my apps, the code for complex number handling is dead code.” To write a tool that would remove such components, however, would require some public API for editing the runtime and standard library.
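A first step toward such a tool might be to list which top-level modules an application imports. The sketch below uses the standard ast module; the function name is hypothetical, and a real tree shaker would also need to follow transitive imports and cope with dynamic ones, which this does not attempt.

```python
import ast


def imported_top_level_modules(source):
    """Return the top-level module names that a piece of source code imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" depends on the top-level package "os".
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Relative imports (level > 0) point inside the app, so skip them.
            names.add(node.module.split(".")[0])
    return names


sample = "import os.path\nfrom json import dumps\nfrom . import sibling\n"
print(sorted(imported_top_level_modules(sample)))
```

Anything in the standard library that never appears in the resulting set, directly or through its dependencies, is a candidate for removal from the shipped bundle.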
asyncio Does Not Integrate With Mobile GUI Events
Mobile applications are GUIs, and GUI programming is event-based. Python’s asyncio is a natural framework for GUI programming, but it must be adapted to run atop the GUI environment’s own event loop. According to Keith-Magee this is relatively easy on Unix platforms like Linux, macOS, and iOS, with poll-based APIs. He estimates it requires a few hundred lines of code. On Android, Windows, and the web, however, the event model is entirely different, and asyncio integration has been an open bug for years. Keith-Magee said he does not know asyncio internals well enough to propose a solution but welcomes any suggestions.
Only CPython Can Use C Standard Library Modules
Keith-Magee has been experimenting with alternatives to CPython for mobile and web. Replacing the core language implementation is “relatively achievable,” he said, “but the standard library’s another thing.” The C modules in the standard library are only compatible with CPython. Some, like the decimal module, have recently been ported from Python to C, which accelerates them but presents more hindrances for non-CPython interpreters.
Keith-Magee asked the core developers to maintain pure-Python implementations of all standard library modules, for the sake of alternative interpreters. But recognizing how burdensome that would be, he requested as a fallback that the C interface for each extension module be clearly defined.
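The usual shape of a module that keeps a pure-Python implementation alongside an optional accelerator is a guarded import. In this sketch the accelerator name _fastmath_hypothetical and the clamp function are invented for illustration; since no such C module exists, the fallback branch is the one that runs.

```python
try:
    # Hypothetical C accelerator; on interpreters without it the import fails.
    from _fastmath_hypothetical import clamp
except ImportError:
    def clamp(value, low, high):
        """Pure-Python reference implementation with the same interface."""
        return max(low, min(value, high))


print(clamp(15, 0, 10))
```

The pure-Python function doubles as a specification of the interface, which is close to the fallback Keith-Magee asked for when a full pure-Python module is too burdensome.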
Setup.py Cannot Build App Store Bundles
An application packaged for a mobile app store requires various metadata, such as a bundle identifier, that is not currently expressible in setup.py. Keith-Magee said he was unsure how to proceed, given his confusion about the direction of Python packaging in general. Should the metadata be in the pyproject.toml file specified by PEP 518? Should he adapt pip or use a custom build tool? He felt that if the core team has any clear vision for the future of Python packaging, “I can tell you from down in the trenches that message isn't getting through.” He requested more information and more opinions about packaging from the core team.
Wish List
The talk concluded with a wish list for adapting Python to mobile and web:
1. Host/target separation testing in CI
2. Host/target separation in distutils/pip
3. Feature gating (especially in the test suite)
4. Unvendoring libffi for macOS
5. Tree (and/or root) shaking
6. AsyncIO support for other eventing styles
7. Reference implementation of modules (or a clear native interface)
8. Clearer communications on packaging
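For item 6, one simple (and admittedly crude) way to let a foreign event loop drive asyncio is to run a single iteration of the asyncio loop on each pass of the host loop. The sketch below assumes a hypothetical fake_gui_main standing in for a GUI toolkit's main loop; it illustrates the polling idea only and is not a solution Keith-Magee proposed.

```python
import asyncio


def pump_once(loop):
    """Run the asyncio callbacks that are ready right now, then return."""
    loop.call_soon(loop.stop)  # stop after the current batch of callbacks
    loop.run_forever()


async def count_to(n, log):
    for i in range(n):
        log.append(i)
        await asyncio.sleep(0)  # hand control back to the event loop


def fake_gui_main(max_frames=10):
    """Stand-in for a GUI toolkit's main loop (hypothetical)."""
    loop = asyncio.new_event_loop()
    log = []
    task = loop.create_task(count_to(3, log))
    frames = 0
    try:
        while not task.done() and frames < max_frames:
            # A real toolkit would dispatch native UI events here.
            pump_once(loop)
            frames += 1
    finally:
        loop.close()
    return log, frames


log, frames = fake_gui_main()
print(log, frames)
```

Polling like this wastes cycles when nothing is ready, which is one reason a proper integration with each platform's event model is still on the wish list.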
Keith-Magee claims that Python faces an “existential risk” if mobile and web support does not improve. His son, who uses only an iPad for school, asked him, “When can I learn to program like you?” Unless students like him can program Python on their devices, Python risks being left behind by the next generation of programmers.
The 2019 Python Language Summit
The Python Language Summit is a small gathering of Python language implementers, both the core developers of CPython and alternative Pythons, held on the first day of PyCon. The summit features short presentations from Python developers and community members, followed by longer discussions. The 2019 summit is the first held since Guido van Rossum stepped down as Benevolent Dictator for Life, replaced by a five-member Steering Council.
LWN.net covered the proceedings from 2015 to 2018; this year the PSF has chosen to feature summit coverage on its own blog, written by A. Jesse Jiryu Davis.
Lightning Talks, Round 1 (pre-selected lightning talks)
Async REPL and Async-Exec, Matthias Bussonnier
The Night’s Watch Is Fixing the CIs in the Darkness for You, Pablo Galindo Salgado
Asyncio and the Case for Re-Entrancy, Jason Fried
Optimising CPython, or Not, Mark Shannon
Black Under github.com/python, Łukasz Langa
Half-Hour Sessions
Python on Other Platforms, Russell Keith-Magee
Time Zones in the Standard Library, Paul Ganssle
Batteries Included, but They’re Leaking, Amber Brown
History of CircuitPython, Scott Shawcroft
Dynamic Extension Module Objects, Petr Viktorin
Let's Use GitHub Issues Already! Mariatta Wijaya
Python Core Developer Mentorship, Victor Stinner
Lightning Talks, Round 2 (on-site signup)
SSL Module Updates, Christian Heimes
Let’s Argue About Clinic, Larry Hastings
The C-API, Eric Snow
Python in the Windows Store, Steve Dower
Bors: How the Rust Team Avoids Pablo’s Problems, Nathaniel Smith
Mypyc for Stdlib: Extended Discussion, Michael Sullivan
Python Core Sprints at Bloomberg in September, Pablo Galindo
Status of the Stable ABI, Victor Stinner
Wednesday, May 01, 2019
Building the PSF: the Q2 2019 Fundraiser
Thank you to everyone who has donated to our past fundraisers! Donations, memberships, and sponsorships support sprints, meetups, community events, Python documentation, fiscal sponsorships, software development, and community projects.
We can’t do any of this without your financial contributions. We’ve just launched a new fundraiser for 2019! Please donate today and help us meet our goal of $60,000 by June 30th!
Your donations have IMPACT!
- The PSF awarded $118,543 in financial aid to 143 PyCon attendees in 2018
- $324,000 was paid in grants in 2018 to recipients in 51 different countries
- Donations and fundraisers resulted in $489,152 of revenue. This represents 15% of our total 2018 revenue.
- PSF and PyCon sponsors contributed over $1,071K in revenue!
We understand the need for transparency and hope to help our community and stakeholders find necessary information about the PSF in a single place. We’re proud to launch our first ever Annual Report. 2018 was a year of growth for the PSF while still focussing on sustainability for our staff and community. We’re excited to share these data points with you!
Something new this year - the PSF and JetBrains!
This year we're trying something new. In addition to our regular donation drive, we're partnering with JetBrains to help raise money for the PSF. JetBrains PyCharm and the PSF are happy to announce a 30% discount, with all proceeds going to the Python Software Foundation general fund!
Please consider becoming a PSF Supporting Member today. Or simply make a one-time or recurring donation.
If you're attending PyCon this year, you can donate at the PSF booth and get your choice of a limited edition pin or sticker set!
Help spread the word about our fundraiser! Here are some suggested tweets:
===================
- Donate today and help @ThePSF reach their fundraising goal! https://www.python.org/psf/donations/2019-q2-drive/ #idonatedtothepsf
- Your contributions help fund workshops, conferences, and pay meetup fees. The PSF can't do this without your support. Please consider donating to help us continue to do our work - https://www.python.org/psf/donations/2019-q2-drive
- A Supporting Membership of $99 pays for 6 months of Python meetup subscriptions. Become a member and help us continue our mission! https://www.python.org/psf/donations/2019-q2-drive/. #idonatedtothepsf
==================
Thank you for your support!
The PSF Team