Friday, September 17, 2021

Tereza Iofciu Awarded the PSF Community Service Award for Q1 2021

 


Tereza Iofciu, Data Science coach, PyLadies Hamburg organizer, and PSF Code of Conduct working group member, has been awarded the Python Software Foundation 2021 Q1 community service award.

RESOLVED, that the Python Software Foundation award the Q1 2021 Community Service Award to Tereza Iofciu. Tereza is a PSF Code of Conduct WG member and has done a wonderful job helping, participating, and driving the Code of Conduct WG discussions. Tereza formed and continues to help organize the PyLadies event in 2021. Tereza is also a member of the newly formed PSF Diversity & Inclusion WG.

We interviewed Tereza to learn more about her inspiration and work with the Python community. Georgi Ker, a close associate of Tereza, also speaks about Tereza.

The Origin Story

Can you tell us about your origin story? Like how you got into tech?


I got into tech quite traditionally, I studied Computer Science in Bucharest, Romania, but I chose that not for a particular love for Informatik. I was good at Math and Physics in high school but I couldn't study those as I didn't want to become a teacher, seeing how teachers were treated in school. 

 

In the year 2000, Computer Science seemed like a thing for the future.

 

After that I kind of went with the flow, and the flow got me to Germany and doing a Ph.D. in Information Retrieval as the field of Data Science was emerging.


After that, I worked as a Data Scientist, Data Engineer, Product Management, Leadership, and now I am teaching (ha! the irony) Data science at the Neuefische Bootcamp.


Involvement with the Python Community and Inspiration


What was your earliest involvement with the Python community?

 

I would say in 2018 I saw on Twitter a friend of mine posting she was looking for a new job where diversity was part of the culture. 

 

Through her, I discovered the PyLadies Berlin meetups and I realized that I was missing such a community in Hamburg. We had lots of meetups in the city (things used to still be in-person back then), but most were talks and networking, and not so much about teaching and learning. 

 

It took a while to set it up but then I started the PyLadies Hamburg that year, which I wrote about here.

You have been a volunteer coordinator and organizer of PyLadies Hamburg. You are also a member of the PSF Code of Conduct WG, and the Diversity & Inclusion WG. This is amazing. What drives and inspires you into volunteering your time and resources in the Python Community?

 

I often felt that a normal day job doesn't fulfill all my needs, one gets paid for work and it is hard for companies to be consistent in providing other goals. Business is business and in the end, things come down to profit. 

 

So one rarely gets the opportunity to be surrounded at work by like-minded people all the time. 

 

I have volunteered in other organizations, but I found that the PyLadies does attract people who, while they are active in it, are very passionate and inspiring about making tech accessible to more than the majority. So in the end PyLadies was also a refuge and an energy top-up. 

 

It is like finding your village in the world! 

 

Tech companies in Germany are still very behind with diversity, and changing that needs all the help it can get. Women and people from underrepresented groups need a space where they can learn and grow and get inspired without invisible glass ceilings.



How has your involvement within the Python community helped your career?

 

Being involved helped my career in several ways - I've discovered that I learn better when I teach, that is I cannot be bothered to learn a new thing when it is just for the sake of me learning it. 

 

This ultimately led to me believing I would succeed in my current role, and thus I took the opportunity. 

 

We've organized a lot of events - meetups, full-day workshops (IoT workshop at PyCon DE 2019), and conferences like Python Pizza Hamburg in 2019 and 2020, and International Women's Day PyLadies over 3 timezones. 

 

One learns a lot from organizing and it can also be lots of fun. Also, I have been in a leadership role since 2019, and part of the job is to inspire people to get out of their comfort zone, present their work, organize workshops, do meetups and this is something that I was already practicing within the community. 

 

And the network, being around inspiring people is inspiring, and in the end, one is part of an inspiration loop - people also come back with stories on how their life got better with PyLadies. 


Impact of Covid in the Python Community


How has Covid affected your work with the Python community and what steps are you taking to push the community forward during this trying time?

 

We moved pretty quickly to remote events, nobody really felt like being responsible for spreading covid and now there is the remote everywhere. 

 

Aside from the fatigue of the pandemic, going remote has greatly made the events accessible to more people, people from other cities, countries, or people who have to take care of other people and wouldn't have been able to travel to a meetup. 

 

We had this year’s workshops with speakers from the US and Canada. This would have not been possible previously.

 

On the PyLadies Hamburg side, we try to keep to the rhythm of monthly events. 

 

And the International Women's Day event became a three timezone event quite randomly, I posted about organizing one event in Hamburg and looking for speakers among the PyLadies organizers, then Lorena Mesa from Chicago saw it and asked if she could do a joint one in Chicago and then I asked her if she knows anyone on the other side of the globe for symmetry, and she said Georgi Ker in Bangkok who said: "of course."

 

This year I also attended for the first time PyCon US and I was part of the panel presenting the Diversity & Inclusion Workgroup, and we were geographically spread all over the world.


Georgi Ker Speaks on Tereza Iofciu's Impact

Georgi Ker, who had the opportunity of working together with Tereza and Lorena Mesa in organizing the online International Women’s Day 2021 event, speaks on Tereza’s impact.


Tereza is everywhere! I don't even know where to start. She was the one who initiated organizing the PyLadies IWD - International Women's Day - event in different time zones. Making the event accessible for more people.

Apart from involvement in the Interim Global Council, she is also one of the PyLadies moderators to ensure that PyLadies stays as a safe environment for everyone.

Tereza is like the guardian of PyLadies and PSF protecting the gates of the Python community caring for people.

We at the Python Software Foundation wish to once again congratulate and celebrate Tereza Iofciu for her amazing contributions to PyLadies and the wider Python community.


Wednesday, August 18, 2021

Shamika Mohanan has joined the PSF as Packaging Project Manager

The Python Software Foundation (PSF) is excited to welcome Shamika Mohanan as our new Packaging Project Manager! You can learn specifics about the role in our post announcing the position.

Recognizing that the success of the Python language and community relies on the success of its packaging ecosystem, the PSF is excited for the Packaging Project Manager role to facilitate, coordinate, and amplify the existing momentum in this space.


Shamika will be performing outreach to Python users to help the PSF better understand the landscape, identify fundable initiatives, seek grants, oversee funded projects, and report on their progress and results to improve Python packaging for all users. Shamika will also work with the PSF Director of Infrastructure to make progress on developing PyPI into a sustainable service that the community can continue to rely on for years to come.


Once again we want to thank our Visionary Sponsor Bloomberg for their initiative in “Shifting Left” and supporting this role for its initial term of two years.


Wednesday, July 21, 2021

Python Software Foundation Fellow Members for Q2 2021

The PSF is pleased to announce its second batch of PSF Fellows for 2021! Let us welcome the new PSF Fellows for Q2! The following people continue to do amazing things for the Python community:

Cheuk Ting Ho

Twitter, GitHub, LinkedIn, Website

Emily Morehouse-Valcarcel

Twitter, GitHub, Website

Francisco Palm

Twitter, GitHub, LinkedIn, Website

Ivan Levkivskyi

GitHub

Jakub Baláš

João Sebastião de Oliveira Bueno

Twitter, GitHub, StackOverflow profile

Jukka Lehtosalo

Michael J. Sullivan

Miroslav Šedivý


Thank you for your continued contributions. We have added you to our Fellow roster online.

The above members help support the Python ecosystem by being phenomenal leaders, sustaining the growth of the Python scientific community, contributing to diversity efforts through PyLadies and other communities, maintaining virtual Python communities, maintaining Python libraries, creating educational material, organizing Python events and conferences, starting Python communities in local regions, and overall being great mentors in our community. Each of them continues to help make Python more accessible around the world. To learn more about the new Fellow members, check out their links above.

Let's continue recognizing Pythonistas all over the world for their impact on our community. The criteria for Fellow members are available online: https://www.python.org/psf/fellows/. If you would like to nominate someone to be a PSF Fellow, please send a description of their Python accomplishments and their email address to psf-fellow at python.org. We are accepting nominations for quarter 3 through August 20, 2021.

Are you a PSF Fellow and want to help the Work Group review nominations? Contact us at psf-fellow at python.org.

Monday, July 12, 2021

Łukasz Langa is the inaugural CPython Developer-in-Residence!


The PSF and the Python Steering Council are pleased to announce that the inaugural Developer-in-Residence role will be held by core developer Łukasz Langa.


CPython, the reference implementation of Python, is developed and primarily maintained by volunteers. Inspired by the Django Fellowship Program's success, the PSF has strategically planned to support CPython in a similar way beginning this year. Thanks to the support from sponsors such as Google, this effort is now being put into motion! 

Łukasz will work full-time for one year to assist CPython maintainers and the Steering Council. Areas of responsibility will include analytical research to understand the project's volunteer hours and funding, investigation of project priorities and their tasks going forward, and beginning work on those priorities. Regular reporting and full transparency to the community are also a large part of Łukasz’ role. If the program is impactful and the PSF raises enough funds, there is potential for the Developer-in-Residence role to continue beyond one year. We look forward to updating the community as work progresses!

Check out Łukasz’ personal announcement here.

Wednesday, June 16, 2021

Update on the Python Software Foundation Executive Director

After ten years of exceptional service to the Python Software Foundation, the PSF Executive Director Ewa Jodlowska has decided to leave the Foundation at the end of 2021. We wish to thank Ewa for her many years of service and contributions to not only the Foundation but to the entire Python community. It’s safe to say the PSF, PyCon and the whole Python community would not be where it is today if not for Ewa.

In preparation for Ewa’s departure, the Python Software Foundation will begin a search to find a new Executive Director. The Executive Director is a key player in helping the Foundation pursue our mission “to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.”

The Board of Directors will work together with the Staff, the outgoing Executive Director, and the community in developing a timeline for the transition as well as posting periodic updates on the search when it formally begins. Please keep an eye on the PSF blog for these updates as well as the forthcoming job listing.

Wednesday, June 09, 2021

The 2021 Python Language Summit: Lightning Talks, Round 2

The second day of the 2021 Python Language Summit finished with a series of lightning talks from Ronny Pfannschmidt, Pablo Galindo, Batuhan Taskaya, Luciano Ramalho, Jason R. Coombs, Mark Shannon, and Tobias Kohn.

Annotated Assertions: Debugging With Joy

Ronny Pfannschmidt spoke about annotated assertions. He is a pytest maintainer and loves approachable debugging.

He compared assertions in unittest with assertions in pytest. He remarked that mistakes have been made in the past and are still being made today. Before pytest 2.1, pytest would reinterpret assertions, which was bad for side effects. Today, pytest deals with side effects by handling all of the variables, collecting them, and showing them to you. 
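
The contrast between the two styles can be sketched as follows. This is a minimal illustration rather than code from the talk: the class and function names are hypothetical, and the comment about failure output describes pytest's assertion rewriting only in general terms.

```python
import unittest


class SortedTestCase(unittest.TestCase):
    # unittest style: a dedicated assertion method for each kind of comparison.
    def test_sorted(self):
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])


# pytest style: a plain function with a bare assert statement. When the
# assertion fails under pytest, the report shows the values of the
# subexpressions involved in the comparison.
def test_sorted():
    assert sorted([3, 1, 2]) == [1, 2, 3]
```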

Here's what he would like to see in Python:

Here's what he'd like to do:

  • Create a PEP or have a PEP sponsor
  • Open the implementation of pytest to a wider audience

PEP 657: Fine-Grained Error Locations in Tracebacks

Pablo Galindo and Batuhan Taskaya shared their thoughts on what they want to do and what they don't want to do with PEP 657. The goal of this PEP is to improve the debugging experience by making the information in tracebacks more specific. It would also help with code coverage tools because it would allow expression-level coverage rather than just line-level coverage. JEP 358 has already accomplished something similar.
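
As a rough illustration of the kind of API tools could consume, the sketch below reads per-instruction source ranges from a code object. It assumes an interpreter that implements PEP 657 as it later shipped in Python 3.11, where code objects have a co_positions() method; on older interpreters the sketch falls back to an empty list. The lookup function is a hypothetical example.

```python
def lookup(data):
    # Two subscript operations on one line: line numbers alone cannot
    # tell which of them failed.
    return data["a"]["b"]


code = lookup.__code__

# co_positions() is only present on interpreters that implement PEP 657.
if hasattr(code, "co_positions"):
    # One (start_line, end_line, start_column, end_column) tuple per instruction.
    positions = list(code.co_positions())
else:
    positions = []

for position in positions:
    print(position)
```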

The speakers want to:

  • Keep maintenance costs low
  • Keep the size small without overcomplicating the compiler
  • Provide an API for tools to consume
  • Provide an opt-out mechanism

They want to avoid:

  • Adding a new set of .pyc files
  • Adding a new debugging info file format
  • Having a large number of new flags to customize
  • Implementing in memory/size encoding
  • Complicating the compiler too much
  • Providing more than one opt-out mechanism
  • Having manual metadata propagation

For the opt-out mechanism, there will be two ways to deactivate the feature:

  1. Environment variable: PYNODEBUGRANGES
  2. Command line option: -Xnodebugranges

Who Speaks for Mort on python-dev?

Luciano Ramalho explained that Mort, Elvis, and Einstein are names for personas that have been used within Microsoft to understand the needs of users:

  • Mort is an opportunistic developer who likes to create quick solutions for immediate problems. He focuses on productivity and learns as needed.
  • Elvis is a pragmatic programmer who likes to create long-lasting solutions. He learns while working on solutions.
  • Einstein is a paranoid programmer who likes to create the most efficient solution to a problem. He typically learns before working on the solution.
Users of Python can be organized into similar groups with distinct needs. Since Einsteins may not clearly understand the needs of Morts and Elvises, Luciano Ramalho suggested that it may be time to recruit core users to speak for the Python users who aren't also core developers.

Annotations as Transforms

Jason R. Coombs shared his thoughts on designating transformation functions to be applied to parameters and return values. He had originally been inspired by the simplicity and power of decorators, and his idea could in theory be applied with decorators today. However, he determined that it would be more elegant to use annotations.

Using this approach would have advantages:

  • Elegant, simple declaration of intended behavior
  • Clear separation of concerns
  • Avoiding rewriting variables in the scope
  • Easy reuse of transformations
  • Explicit type transformation

However, there would also be challenges:

  • Compatibility: Although older versions of Python don't have this functionality, you could implement a compatibility shim.
  • Ambiguity between types and transforms: In order to address this concern, you could potentially:
    • Require transforming functions to be explicitly created
    • Provide a wrapping helper to specify that a type is used as a transform (e.g. -> transform(str))
    • Provide a wrapper helper or explicit types for nontransforming type declarations (e.g. Int or strict(int))
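
The decorator-based variant mentioned above could look roughly like the sketch below. It is not the speaker's implementation: apply_transforms and total are hypothetical names, and the sketch simply treats every callable annotation as a transform, which is exactly the ambiguity between types and transforms listed among the challenges.

```python
import functools
import inspect


def apply_transforms(func):
    """Hypothetical decorator: call each callable annotation as a transform."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        # Transform each argument with the annotation of its parameter.
        for name, value in list(bound.arguments.items()):
            transform = signature.parameters[name].annotation
            if transform is not inspect.Parameter.empty and callable(transform):
                bound.arguments[name] = transform(value)
        result = func(*bound.args, **bound.kwargs)
        # Transform the return value with the return annotation.
        return_transform = signature.return_annotation
        if return_transform is not inspect.Signature.empty and callable(return_transform):
            result = return_transform(result)
        return result

    return wrapper


@apply_transforms
def total(values: sorted, scale: float = 1) -> str:
    # values arrives sorted, scale arrives as a float, the result leaves as str.
    return sum(values) * scale


print(total([3, 1, 2], "2"))
```

Here the string "2" is converted by float before the body runs, and the numeric result is converted by str on the way out, without reassigning any variables inside the function.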

Tiers of Execution: Making CPython Execute Efficiently

Mark Shannon started by defining four tiers of execution:

  • Tier 0: The slowest tier, with minimal memory usage and low startup time
  • Tier 1: Primary interpreter, the adaptive, specializing interpreter
  • Tier 2: Small region, lightweight JIT
  • Tier 3: Large region, heavyweight JIT

The higher a tier, the hotter the code that it will execute. Today, CPython is at tier 0.3. It's a compromise between memory use and speed but isn't optimized for either. He said that tier 0 could be considered for Python 3.11 or later. It could:

  • Minimize startup time and memory use at the expense of execution speed
  • Support a full set of features, including sys.settrace
  • Be able to execute from a .pyc file that is mmapped and immutable
Tier 1 is planned for Python 3.11:
  • Adaptive, specializing interpreter (PEP 659)
  • Possible lack of support for some features, such as sys.settrace
Tiers 2 and 3 are entirely hypothetical at the moment and would involve JIT compilers. They may be more like LuaJIT than the JVM.

Switching between tiers can be expensive, but the goal is to make it cheaper by having the same in-memory data layout for all tiers. In order to support all of Python, we will need to switch between tiers often. Each tier should be maintained mostly independently for open-source development. The performance cost won't be high if the memory layout is designed carefully.

Running Parallel Python Code in the Browser

Tobias Kohn has been working on TPython, a new Python implementation that works in the browser. His objectives were to do multiprocessing in the browser, not block the UI, and use native JavaScript libraries. 

Because JavaScript has a single-threaded event queue that handles even I/O and garbage collection, nothing else can happen while your current task is running. You can use web workers and send messages to each worker's event queue, but those messages won't become visible until the event queue gets to them.
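
The same effect can be reproduced in Python with asyncio, which also runs tasks on a single-threaded event loop. The sketch below is only an analogy, not TPython or browser code, and the worker and run names are hypothetical: a task that never yields finishes before anything else runs, while a task that yields lets the loop process other work in between.

```python
import asyncio


async def worker(log, name, cooperative):
    for step in range(2):
        log.append(f"{name}:{step}")
        if cooperative:
            # Hand control back to the event loop so other tasks can run.
            await asyncio.sleep(0)


async def run(cooperative):
    log = []
    await asyncio.gather(
        worker(log, "a", cooperative),
        worker(log, "b", cooperative),
    )
    return log


# Without yielding, task "a" runs to completion before "b" starts.
blocking = asyncio.run(run(cooperative=False))
# With yielding, the two tasks take turns.
interleaved = asyncio.run(run(cooperative=True))
print(blocking)
print(interleaved)
```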

You could suspend the current task and let everything in the event queue happen so that the message can be processed and then resume your task later on. To do that, you could use the bytecode in Python 3.6+ because the frame already has an index into the bytecode and captures state, to a certain extent. However, some bytecode instructions are too complex. __add__ can execute arbitrary Python code, fail, call __radd__, and execute other Python code. The standard bytecode is insufficient.
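
To see why a single addition is hard to suspend halfway, consider the sketch below. The Left and Right classes are hypothetical; they only record which special methods run when the + operator is evaluated.

```python
calls = []


class Left:
    def __add__(self, other):
        # Arbitrary Python code runs here, then the operation is declined.
        calls.append("Left.__add__")
        return NotImplemented


class Right:
    def __radd__(self, other):
        # The interpreter falls back to the reflected method of the right operand.
        calls.append("Right.__radd__")
        return "handled by Right"


result = Left() + Right()
print(calls)
print(result)
```

One addition in the source therefore involved two separate calls into Python code.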

He's currently using an MPI interface for parallel processing. There is:

  • Early-stage multiprocessing support
  • A NumPy-like interface for JavaScript typed arrays
  • No blocking or freezing of the browser's UI
It runs on unmodified CPython 3.6+ bytecode.

Sunday, June 06, 2021

The 2021 Python Language Summit: Fuzzing and Testing Python With Properties

At the 2021 Python Language Summit, Zac Hatfield-Dodds gave a presentation about fuzzing and testing Python with properties. This presentation tied in with the one he gave at the 2020 Python Language Summit.

Zac Hatfield-Dodds
 

What Is Testing?

For the purposes of this talk, he defined testing as the art and science of running code and then checking if it did what it was supposed to do. He added that, although assertions, type checkers, linters, and code review are good, they are not testing.

There are two general reasons why we might have tests:

  1. For correctness:
    1. The goal is to validate software and determine that there are no bugs.
    2. Nondeterminism is acceptable.
    3. Finding any fault is a success.
  2. For software engineering (programming, over time, in teams):
    1. The goal is to validate changes or detect regressions.
    2. Nondeterminism is bad.
    3. Bugs should be in only the diff.

When these two reasons for testing aren't distinguished, there can be miscommunications.

What Is Property-Based Testing?

There are many types of tests:

  • Unit tests
  • Integration tests
  • Snapshot tests
  • Parameterized tests
  • Fuzz tests
  • Property-based tests
  • Stateful model tests

The speaker then walked the summit attendees through an example to explain going from traditional unit tests through to parameterized tests and then seeing how that plays into property-based tests.

Imagine that you needed to test the sorted() builtin. With a traditional set of unit tests, you can write a bunch of cases with the expected inputs and outputs:
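
The original example is not reproduced here. A minimal sketch of such unit tests, with hypothetical test names and inputs, might look like this:

```python
def test_sorted_integers():
    assert sorted([3, 1, 2]) == [1, 2, 3]


def test_sorted_floats():
    assert sorted([0.5, 0.25, 1.0]) == [0.25, 0.5, 1.0]


def test_sorted_strings():
    assert sorted(["b", "a", "c"]) == ["a", "b", "c"]
```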


If you want to avoid repeating yourself, you can write a list of inputs and outputs:
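
The original example is not reproduced here. One way to sketch this is a table of hypothetical cases checked in a loop; with pytest, the same table could be passed to pytest.mark.parametrize instead.

```python
# Each entry pairs an input with the output we expect from sorted().
CASES = [
    ([], []),
    ([1], [1]),
    ([3, 1, 2], [1, 2, 3]),
    (["b", "a"], ["a", "b"]),
]


def test_sorted_cases():
    for given, expected in CASES:
        assert sorted(given) == expected, (given, expected)
```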


If you don't have a known good result, then you can still write tests using only the input argument. One option would be to compare to another reference implementation:
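
The original example is not reproduced here. As a sketch, the reference below is a hypothetical, deliberately simple insertion sort; any independent implementation you trust would serve the same purpose.

```python
def insertion_sort(items):
    """Hypothetical reference implementation: slow but easy to verify by eye."""
    result = []
    for item in items:
        index = len(result)
        # Walk left past every element that is larger than the new item.
        while index > 0 and result[index - 1] > item:
            index -= 1
        result.insert(index, item)
    return result


INPUTS = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]


def test_sorted_matches_reference():
    # Only inputs are listed; the expected output comes from the reference.
    for given in INPUTS:
        assert sorted(given) == insertion_sort(given)
```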

However, comparing with another reference implementation might not be an option, so you could just test if the output seems to be right:
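
The original example is not reproduced here. A minimal sketch, with a hypothetical is_ordered helper, checks that every element is less than or equal to the one after it:

```python
def is_ordered(items):
    # Compare each element with its right-hand neighbor.
    return all(a <= b for a, b in zip(items, items[1:]))


INPUTS = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]


def test_sorted_is_ordered():
    for given in INPUTS:
        assert is_ordered(sorted(given))
```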

In order to improve on this test, you might want to add another property that you can test. You could check that the length of the output is the same as the length of the input and that you have the same set of elements:
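
The original example is not reproduced here. A sketch of the strengthened test, again with hypothetical inputs, could be:

```python
INPUTS = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]


def test_sorted_keeps_length_and_elements():
    for given in INPUTS:
        output = sorted(given)
        # Property 1: the output is in order.
        assert all(a <= b for a, b in zip(output, output[1:]))
        # Property 2: nothing was added or dropped.
        assert len(output) == len(given)
        # Property 3: the same distinct elements appear.
        assert set(output) == set(given)
```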

This check would still pass on an incorrect result such as sorted([1, 2, 1]) -> [1, 2, 2]. A brute-force approach using itertools.permutations() would catch that error:
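
The original example is not reproduced here. The sketch below uses two hypothetical checker functions to show the difference: the length-and-set check accepts the incorrect output, while the brute-force check rejects it. Enumerating permutations grows factorially with the input length, so this is only practical for very short lists.

```python
from itertools import permutations


def is_ordered(items):
    return all(a <= b for a, b in zip(items, items[1:]))


def weak_check(given, output):
    # Ordered, same length, same set of elements.
    return (
        is_ordered(output)
        and len(output) == len(given)
        and set(output) == set(given)
    )


def brute_force_check(given, output):
    # Ordered, and an actual rearrangement of the input.
    return is_ordered(output) and tuple(output) in set(permutations(given))


given = [1, 2, 1]
wrong_output = [1, 2, 2]

print(weak_check(given, wrong_output))
print(brute_force_check(given, wrong_output))
print(brute_force_check(given, sorted(given)))
```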


But the best solution is collections.Counter():
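
The original example is not reproduced here. A sketch with a hypothetical check_sorted helper shows how comparing element counts rejects the incorrect output without enumerating permutations:

```python
from collections import Counter


def check_sorted(given, output):
    ordered = all(a <= b for a, b in zip(output, output[1:]))
    # Counter compares how many times each element occurs, so duplicates
    # that were lost or invented are detected.
    return ordered and Counter(output) == Counter(given)


print(check_sorted([1, 2, 1], [1, 2, 2]))
print(check_sorted([1, 2, 1], sorted([1, 2, 1])))
```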

This last test uses property-based testing:

 
Instead of having a specific list of inputs, you could use Hypothesis:
 

That test will fail because NaN compares unequal to itself, so any list containing NaN will appear to not be in sorted order. So it could be good to have specified behavior for the ordering on NaN elements in the sorting algorithm:
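
The NaN behavior can be reproduced with the standard library alone. This sketch is not from the talk, and the input list is a hypothetical example:

```python
nan = float("nan")

# NaN is not equal to itself, and every ordering comparison with it is False.
print(nan == nan)
print(nan <= 1.0, 1.0 <= nan)

data = [1.0, nan, 0.0]
output = sorted(data)

# Wherever NaN ends up, the comparison with its neighbor is False,
# so the pairwise ordering check cannot succeed.
is_ordered = all(a <= b for a, b in zip(output, output[1:]))
print(is_ordered)
```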

 
He said that one of the big advantages of using something like Hypothesis rather than a list of handwritten examples is that it will raise conceptual issues that you may not have already thought through yourself.

In summary, property-based testing lets you:

  • Generate input data that you might not have thought of yourself
  • Check that the result isn't wrong, even without the right answer
  • Discover bugs in your understanding rather than just in your code
Often, you don't even need assertions in the test. Generating unusual input data is surprisingly effective. It can give you the sort of feedback you could get from real users, but you don't need to ship before getting the feedback.

A common concern is whether randomized testing makes tests flaky, and how to keep them deterministic. Hypothesis has been working on that for years, so they have solid answers to these kinds of questions:

If that's not enough, then you also have other options:

The Hypothesis database is a collection of files on disk that represent the various examples. Since it's a key-value store, it's easy to implement your own custom one:
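
The original example is not reproduced here. The sketch below assumes the third-party Hypothesis library is installed and subclasses its ExampleDatabase base class with the three methods it requires. DictDatabase is a hypothetical name, and keeping the values in a dictionary is purely for illustration, since Hypothesis already ships an in-memory database.

```python
from hypothesis.database import ExampleDatabase


class DictDatabase(ExampleDatabase):
    """Hypothetical key-value store that maps each key to a set of values."""

    def __init__(self):
        super().__init__()
        self._data = {}

    def save(self, key, value):
        # Keys and values are bytes; several values may share one key.
        self._data.setdefault(key, set()).add(value)

    def fetch(self, key):
        # Yield every value stored under the key, in no particular order.
        yield from self._data.get(key, ())

    def delete(self, key, value):
        # Removing a value that is not present is not an error.
        self._data.get(key, set()).discard(value)
```

An instance can then be passed to a test through the database argument of hypothesis.settings.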

In this example, you have a local database on disk. You can also have a shared network database on something like Redis, for example.

Coverage-guided fuzzing takes this to the next level:

What's New?

At the 2020 Python Language Summit, when he said that we would find more bugs if we used property-based testing for CPython and the standard library, the response was positive, but then not much happened. Since then, Paul Ganssle has opened a PR on CPython to add some Hypothesis tests for the zoneinfo library. Zac Hatfield-Dodds said that CPython is doing very well on unit testing and has a strong focus on regressions but that it would be quite valuable to add some of the tools that have been developed for testing for correctness.

These tools don't only find existing bugs. They're good at finding regressions where someone checked in new code with what turned out to be inadequate test coverage:

There is a pace at which we find and fix bugs that were preexisting in addition to the ongoing rate of introducing new bugs that then get detected by fuzzing instead of lasting for too long:


What's Next?

There is a three-step plan:

  1. Merge Paul Ganssle's PR or come up with an alternative proposal to get Hypothesis into CPython's CI in order to unblock ongoing incremental work
  2. Merge some tests
  3. Run them in CI and on OSS-Fuzz
For interested parties, you can see and engage in the follow-ups to this work on the Python Steering Council's issue tracker.

Saturday, June 05, 2021

The 2021 Python Language Summit: What Should I Work on as a Core Dev?

At the 2021 Python Language Summit, Eric Snow gave a presentation about how core developers can receive guidance to help them work on improvements to the language that will bring the most benefit to the Python community.

Eric Snow

What Does the Community Need?

When Eric Snow first got involved in core development over a decade ago, he liked that there was so much to learn. But he found that the organization didn't offer a lot of direction in terms of guiding volunteers to the kind of work that would have the biggest impact. 

Over the years, he's thought about how he and other contributors decide what to work on and why. What directs the efforts of contributors and core developers? Contributors make decisions based on their availability, expertise, and interests. Core developers need to act as stewards for the language. There's plenty of collaboration that goes on, but everyone has their own idea of what the community needs, what would be interesting to work on, and what the Python language needs. As a result, it can be hard to see the bigger picture.

Would a PM Help?

Time and time again, he has asked himself what he can work on to best help the Python community. We all care about this language and the community that surrounds it, so we want to help it as much as we can. Sometimes, it's hard to get a sense of what will help the community and the language the most from our own limited, individual perspectives.

One solution could be to have a dedicated PM who can provide the direction that we've been missing. This person could be provided by the PSF or a sponsoring organization. They could compile and maintain a list of improvements that would be most beneficial to the community. They wouldn't dictate what would be worked on, but they could surface what the community needs.

There has been talk of the Steering Council providing a road map. Having a PM could help:

  • Provide a clear picture of what that road map could look like
  • Help developers and maintainers make decisions about where to spend their time as volunteers
  • Facilitate collaboration

Discussion

Eric Snow was interested in hearing whether or not other core developers would find a PM helpful.  

Luciano Ramalho said that he was strongly in favor of having a PM who could assess the needs of the community from the perspective of the community rather than only the perspective of core developers. This idea overlaps with the questions he raised in his lightning talk at this summit. He also mentioned that Go has a PM role similar to what Eric Snow was suggesting.

Other attendees discussed how this kind of role could be funded and considered how much benefit the role could bring considering that the PSF is working with limited financial resources. They also discussed the differences between a Product Manager and a Program Manager and determined that this role would be more like a Program Manager.

Saturday, May 29, 2021

The 2021 Python Language Summit: What Is the stdlib?

Brett Cannon gave a presentation at the 2021 Python Language Summit about the standard library in order to start a conversation about whether it's time to write a PEP that more clearly defines it.

Brett Cannon

 

What Is the stdlib?

He succinctly described the stdlib as "a collection of modules that ship with CPython (usually)." This was the most accurate definition he could give, considering how big it is, how varied its contents are, and how long it has been around.

He didn't offer an answer to the question about whether or not there should be a new informational PEP to define clear goals for the stdlib, but he wanted core developers to engage with the question. There are a variety of opinions on the stdlib, but it could be beneficial to come to some kind of agreement about:

  • What it should be
  • How to manage it
  • What it should focus on
  • How to decide what will or will not be added to it

He shared that he semi-regularly sees requests for adding a TOML parser to the stdlib. When he considers requests, he asks himself:

  • Should a module be added?
  • How should such a decision be made?
  • What API should it have?
  • Is there a limit to how big the stdlib should get?

So far, there haven't been basic guidelines for answering these kinds of questions. As a result, decisions have been made on a case-by-case basis.

How Big Is the stdlib?

He did some data digging in March of 2021 and shared his findings. Here are the broad strokes:

  • python -v -S -c pass imports 14 modules.
  • There are 208 top-level modules in the standard library.
  • The ratio of modules to people registered to vote in the last Steering Council election is 2.3, but not all of those people are equally available to help maintain the stdlib.
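
A rough way to repeat the module count on your own interpreter is sketched below. It assumes Python 3.10 or later, where sys.stdlib_module_names is available, and falls back to an empty set otherwise. The result will not necessarily match the figure above, because it depends on the Python version and on how private modules are counted.

```python
import sys

# Names of top-level standard library modules and packages (Python 3.10+).
names = getattr(sys, "stdlib_module_names", frozenset())

# Leave out modules whose names mark them as private implementation details.
public_names = sorted(name for name in names if not name.startswith("_"))

print(len(public_names))
```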

What Should the stdlib Cover?

Some people have suggested that the stdlib should be focused on helping users bootstrap pip and no more. Others have said that it should focus on system administration or other areas. Considering that there are thirty-one thematic groupings in the index, the people who maintain the stdlib don't seem to have come to a collective decision either. The groupings cover everything from networking to GUI libraries to REPLs and more.

The stdlib has been around for a long time, and we need to be careful about breaking people's code, so the goal is not to deprecate what is already there but to consider guidelines for making additions.

How Do We Decide What Goes Into the stdlib?

He compared PEP 603 and graphlib to show how this question has been answered in different ways in the past. The goal of PEP 603 was to add the class frozenmap to the collections module. However, the graphlib module was only ever discussed in an issue and never got a PEP before it was added to the stdlib. There is no standardized approach for making these kinds of decisions, so he would like to know what approaches core developers think would be most appropriate.

What Is the Maintenance Cost?

The PR queue is already long, which can be overwhelming for maintainers and discouraging for contributors.

The following modules aren't used in any of the top 4000 projects:

  • mailcap
  • binhex
  • chunk
  • nis

Seventy-six modules are used in less than 1% of the 4000 most downloaded projects on PyPI. That's over 36% of all the modules in the stdlib. This raises some questions:

  • Do we want to continue to ship these modules?
  • What does this tell us about what the community finds useful in the stdlib?
  • How can that inform future guidelines about what to include in the stdlib?

Based on the data from March 2021, there were:

  • 37 modules with no open PRs
  • 1,451 PRs involving the stdlib, which made up the bulk of all the PRs
The module with the highest number of PRs was asyncio, which had only 50. That's only 3% of all of the open PRs at the time. 
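
The percentages quoted in this section can be checked with a little arithmetic. The sketch below uses only figures from the text; dividing the asyncio count by the 1,451 stdlib PRs is an assumption about which total the 3% refers to.

```python
stdlib_modules = 208        # top-level modules in the standard library
rarely_used_modules = 76    # used in less than 1% of the top 4000 projects
stdlib_prs = 1451           # open PRs involving the stdlib
asyncio_prs = 50            # open PRs for the module with the most PRs

rarely_used_share = rarely_used_modules / stdlib_modules
asyncio_share = asyncio_prs / stdlib_prs

print(f"{rarely_used_share:.1%}")
print(f"{asyncio_share:.1%}")
```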
 
The standard library has a significant maintenance cost, but core developers can formulate a plan to get the most out of the maintenance that goes into the stdlib by deciding what it should focus on. They can discuss these issues and work towards resolving them this year.

Monday, May 24, 2021

The 2021 Python Language Summit: The Python Documentation Work Group

Mariatta Wijaya and Carol Willing gave a presentation about a new documentation work group at the 2021 Python Language Summit. Last year, Carol Willing and Ned Batchelder spoke about laying the groundwork for this project at the 2020 Python Language Summit.

Carol Willing and Mariatta Wijaya
 

Why Does Python Need a Documentation Work Group?

The mission of the Python Software Foundation is to advance the Python language and grow a diverse, international community of Python programmers so that the language can continue to flourish well into the future. However, when it comes to documentation, core developers don't necessarily reflect the larger world of Python users. If we can bring together core developers, documentarians, and educators, then we can have better documentation that serves the needs of the wider community more effectively.

What Should the Documentation Work Group Achieve?

The work group has two main goals:

  1. Improve documentation content so there's more documentation aimed at people who are learning the language
  2. Modernize documentation themes to make them responsive on mobile so they can better serve users working with limited bandwidth

Although core developers have sometimes felt a great deal of ownership of parts of the documentation, all of the documentation is a community resource. As a result, no one person should be responsible for any one part of the documentation. If the work group is large enough, then it can serve as an editorial board that could work towards consensus.

The work group will:

  • Set priorities and projects for the next year
  • Build a larger documentation community and help them feel engaged, connected, and empowered

What Will Stay the Same?

Changes to documentation will still go through the same PR process that is described in the dev guide. There will be the same commitment to quality. Although there will be new documentation to meet the needs of underserved users and topics, the existing docs at docs.python.org and devguide.python.org will remain.

What Will Change?

Documentation is a gateway to education. In order for it to be more effective, we need broader input from the community. There have already been considerable efforts with translation and localization, but it would also be beneficial to have a new landing page to help users find the resources they need.

What's Next?

The next step is to deal with the logistics of work-group membership. The current members are Mariatta Wijaya, Carol Willing, Ned Batchelder, and Julien Palard. The charter for the work group states that it can have up to twenty members. The application process will be similar to the one that the code of conduct work group used. The current members expect applications to come from the wider Python community as well as from core developers.

Once the group has more members, they will hold a monthly meeting that will be scheduled to accommodate a variety of time zones. They will discuss docs issues, open PRs, the status of projects, achievements, next steps, and more.

There are also plans for AMA sessions on Discourse so that docs team members can answer questions and connect with the wider docs community. In addition, the group will reach out to PyLadies and tap into the diverse skill sets of their members.

Where Can You Learn More?

To learn more, you can check out the Python docs community on:

Sunday, May 23, 2021

The 2021 Python Language Summit: The Challenges of Packaging Python for a Linux Distro

Matthias Klose gave a talk about the challenges of packaging Python for a Linux distribution at the 2021 Python Language Summit. He wanted to discuss:

  • CPython sources and how they fit with Debian and Ubuntu
  • Ownership of module installations 
  • Architecture and platform support
  • Inclusiveness and the ubiquity of Python on various platforms
  • Communication issues
 

What Is Python Like on Debian and Ubuntu?

He shared what Debian 11 and Ubuntu 21.04 have installed for Python. By default, there is almost nothing, so it's usually pulled in by various seeds or images. Linux distributions usually have a mature packaging system and don't ship naked CPython, unlike macOS or Windows.

These Debian and Ubuntu versions still use Python 2 to bootstrap PyPy, but one version of Python 3 is also shipped with them. CPython itself is split into multiple binary packages for license issues, dependencies, development needs, and cross-buildability needs. There is also a new python3-full package that includes everything for the batteries-included experience. About twenty percent of the packages shipped by Debian use Python. He said that this is usually enough for desktop users but may not be enough for developers. However, the line between those two groups is not always clear.

As for the QA that Python is getting in Debian and Ubuntu, for the main part of the archive, it must conform to the Debian Free Software Guidelines. These guidelines include free distribution, inclusion of source code, and ability to modify and create derived works. Packages have to build from source and pass the upstream test suite as well as CI tests.

What Distro-Specific Decisions Were Made for Debian and Ubuntu?

Debian has a policy for shipping Python packages that is also used by Ubuntu. Usually, applications are in application-specific locations so they don't get in the way of anything else. They ship modules as --single-version-externally-managed, and they are usually shipped only if the application needs them.

The site-packages directory has been renamed to /usr/lib/python3/dist-packages and /usr/local/lib/python3.x/dist-packages for local installs. The path doesn't change during Python upgrades (PEP 3147 and PEP 3149). Although a large number of packages use Python, pip is not used within the archive and is only provided for users.
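To see which convention a given interpreter follows, you can look at the entries on sys.path. The sketch below is only an illustration: the helper name is made up, and it assumes POSIX-style paths like the ones mentioned above.

```python
import sys
from pathlib import PurePosixPath


def third_party_dirs(entries):
    """Group path entries by whether they end in site-packages or dist-packages."""
    groups = {"site-packages": [], "dist-packages": []}
    for entry in entries:
        name = PurePosixPath(entry).name
        if name in groups:
            groups[name].append(entry)
    return groups


# Inspect the interpreter that is running this script.
for kind, dirs in third_party_dirs(sys.path).items():
    print(kind, dirs)
```

On a system that follows the Debian policy described above, the dist-packages group would be expected to list the system and local directories, while an upstream build or a virtual environment would typically show site-packages instead.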

You can't call Python with python, python2, python3.x, or python3. There is no Python executable name by default to call Python because they just removed most of the Python 2 stuff. There is a package called python-is-python3 that reinstalls the Python symlink. That package was a compromise, and there was some difficulty getting it into Debian.
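A script that needs to know which interpreter names are available on a particular machine can check the PATH rather than assume one. This is a minimal sketch using shutil.which from the standard library; the list of candidate names is an assumption chosen for the example.

```python
import shutil

# Hypothetical list of executable names to probe.
CANDIDATE_NAMES = ("python", "python2", "python3")


def find_interpreters(names=CANDIDATE_NAMES):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_interpreters().items():
    print(f"{name}: {path or 'not found'}")
```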

In the past, there have been license issues with shipping the CPython upstream sources. There are still license issues with the _dbm module, which is only buildable with a GPL-3+ license. There were also some executables included in the sources that were removed for 3.10. The big remaining issue is that wheels are still included without the source and can't be shipped, so you have to build them using the regular setuptools and pip distributions. Usually, symlinks and dependencies are used to point to the proper setuptools and pip packages.

The relationship between pip and Linux distributions is a difficult one, and there is more than one way to install Python modules. Part of the motivation behind renaming site-packages to dist-packages was that pip was breaking desktop systems. They also wanted to resolve conflicts with locally built Python (installed in /usr/local).

There has been some controversy about what can break your system:

  • sudo rm -rf /
  • sudo pip install pil
  • sudo apt install python3-pil

PyPA does not consider sudo pip install to be dangerous, but Debian and Ubuntu have different opinions about how pip should behave. Mixing packages from two different installers can lead to problems. Although PEP 517 appears to say that pip is only recommended, pip does seem to be enforced more and more. Matthias Klose joked that perhaps ensurepip should be renamed to enforcepip. He also said that having some kind of offline mode for pip would help.
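One way to get a feel for how mixed an environment is would be to group the installed distributions by the tool that recorded itself as the installer. The sketch below relies on the optional INSTALLER file in each distribution's metadata directory, which pip fills in with its own name. Whether a given distro package provides this file is not something the talk covered, so anything without it is simply grouped as "unknown" here. The function name is illustrative.

```python
from collections import defaultdict
from importlib import metadata


def group_by_installer():
    """Group installed distribution names by the content of their INSTALLER file."""
    groups = defaultdict(list)
    for dist in metadata.distributions():
        # read_text returns None when the file is absent
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        name = dist.metadata["Name"] or "unnamed"
        groups[installer].append(name)
    return {key: sorted(names, key=str.lower) for key, names in groups.items()}


installer_groups = group_by_installer()
for installer, names in installer_groups.items():
    print(f"{installer}: {len(names)} distributions")
```

This only reports what the metadata says. It does not detect files that one installer has overwritten on behalf of another, which is the kind of breakage described above.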

He discussed inclusiveness and said that Python being ubiquitous should be seen as an asset rather than a burden. He was sad to see negative attitudes towards platforms that are used less, such as AIX, and didn't see why Python would want to exclude some communities.

What Communication Issues Are There?

In November or December, there were tweets about problems within Debian and Ubuntu. He considered some of the concerns that were brought up to be valid, but he said that there were also legal threats made against the Debian project on behalf of the PSF. He was of the opinion that all parties could improve and that it wasn't just a Debian or Ubuntu problem.

Here are some of the communication problems he highlighted:

  • Distro issues don't reach distro people.
  • Problems with pip breaking systems don't reach pip developers.
  • There is no single place to discuss PyPA issues.
  • There have been problems with manylinux wheels built for CentOS that came up in distro channels.

He would like to see communication improve and reminded the summit attendees that no group speaks with a single voice.