SciPy—a NumFOCUS Affiliated Project—recently crossed a major milestone for any open source project: version 1.0! NumFOCUS extends our hearty congratulations to all of the SciPy contributors and community members who helped get the project to this point.
“SciPy the library has been a cornerstone for the scientific Python community. By providing a consistent interface to the best scientific libraries, it has helped millions of researchers, analysts, engineers, data scientists, and more. Crossing the 1.0 milestone is an event that signals to the community to trust and use our tools.”
~Andy Terrel, NumFOCUS Board President
Some of the highlights of the SciPy 1.0 release are:
- Major build improvements. Windows wheels are available on PyPI for the first time, and continuous integration has been set up on Windows and OS X in addition to Linux.
- A set of new ODE solvers and a unified interface to them (scipy.integrate.solve_ivp).
- Two new trust region optimizers and a new linear programming method, with improved performance compared to what scipy.optimize offered previously.
- Many new BLAS and LAPACK functions were wrapped. The BLAS wrappers are now complete.
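For readers who want to try two of these highlights, here is a minimal sketch. It assumes SciPy 1.0 or later and NumPy are installed. The decay equation, tolerances, and starting point are illustrative choices, not taken from the release notes, and `trust-exact` is used here as one of the two new trust-region methods.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

# --- Unified ODE interface: solve dy/dt = -0.5 * y with y(0) = 2 ---
def decay(t, y):
    # Right-hand side of the ODE (illustrative example problem).
    return -0.5 * y

t_points = np.linspace(0.0, 10.0, 5)
sol = solve_ivp(decay, (0.0, 10.0), [2.0], method="RK45",
                t_eval=t_points, rtol=1e-8, atol=1e-10)

# Compare against the known analytic solution y(t) = 2 * exp(-0.5 * t).
exact = 2.0 * np.exp(-0.5 * sol.t)
max_ode_error = np.max(np.abs(sol.y[0] - exact))

# --- Trust-region optimizer: minimize the Rosenbrock test function ---
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])  # arbitrary starting point
res = minimize(rosen, x0, method="trust-exact",
               jac=rosen_der, hess=rosen_hess)

print("ODE solver succeeded:", sol.success)
print("Max ODE error vs analytic solution:", max_ode_error)
print("Optimizer result:", res.x)
```

Swapping `method="RK45"` for another solver name supported by `solve_ivp` (for example `"Radau"` or `"BDF"` for stiff problems) leaves the rest of the call unchanged, which is the point of the unified interface.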
The SciPy Community is Large and Growing
Impressively, 74 out of 121 contributors to the 1.0 release were first-time contributors. (That’s 61%!) The SciPy community is large and strong: GitHub counts 550 contributors. In the last month, 29 authors have pushed 95 commits to master and 115 commits to all branches. On master, 229 files have changed and there have been 4,056 additions and 1,635 deletions.
Ilhan Polat, one of the SciPy devs, says, “I think the strength of SciPy can also be seen from the mainstream Q&A sites StackOverflow/reddit-python/..
16 years to 1.0
SciPy Development Timeline:
- 2001: the first SciPy release
- 2005: transition to NumPy
- 2007: creation of scikits
- 2008: scipy.spatial module and first Cython code added
- 2010: moving to a 6-monthly release cycle
- 2011: SciPy development moves to GitHub
- 2011: Python 3 support
- 2012: adding a sparse graph module and unified optimization interface
- 2012: removal of scipy.maxentropy
- 2013: continuous integration with TravisCI
- 2015: adding Cython interface for BLAS/LAPACK and a benchmark suite
- 2017: adding a unified C API with scipy.LowLevelCallable; removal of scipy.weave
- 2017: SciPy 1.0 release
A version number should reflect the maturity of a project, and SciPy has been a mature and stable library, heavily used in production settings, for a long time already. From that perspective, the 1.0 version number is long overdue.
Some key project goals, both technical (e.g. Windows wheels and continuous integration) and organisational (a governance structure, code of conduct and a roadmap), have been achieved recently.
Many of us are a bit perfectionist, and are therefore reluctant to call something “1.0” because it may imply that it’s “finished” or that “we are 100% happy with it”. This is normal for many open source projects; however, that doesn’t make it right. We acknowledge to ourselves that it’s not perfect, and that there are some dusty corners left (that will probably always be the case). Despite that, SciPy is extremely useful to its users, has on average high-quality code and documentation, and gives the stability and backwards compatibility guarantees that a 1.0 label implies.
Gina Helfrich, NumFOCUS Communications Director: “What would you say were the major things that prevented SciPy from a 1.0 release before now? Was it lack of specific key features (e.g. Windows integration), or was it more apprehension about the “completeness” that a v.1.0 represents?”
Tyler Reddy, SciPy dev & Steering Council Member: “Probably the latter, as mentioned in Ralf’s reflections on the release. A lot of the core developers are perfectionists and experiencing the thoroughness of the peer review process when submitting code changes / improvements is consistent with this. The need to be incredibly thorough is reflected in the wide usage of this library — breaking something could cause some serious downstream issues for many people / organizations. Adding the (asv) benchmark suite was also quite important — we want to make sure we are moving forward (code doesn’t get slower over time / as new things are added) and we now have an automated way to check for regressions over many parts of the code base.”
Ilhan Polat, SciPy developer: “Regarding why it was not done before, I guess individual modules were lacking certain parts that didn’t feel complete. As they have grown and became satisfactory, the general look towards SciPy probably changed in the eyes of the ‘residents’ of each library. Beyond this particular threshold, version number 1.0 proposal, as far as I can tell, did not receive any objections.
I think there were both psychological saturation about the version 0.xxx numbering and also there was an expectancy from such a round version number that implied a necessity of a ‘major’ release that previous releases were not ‘worthy’ enough. It is a very nice coincidence that Windows builds made it to version 1.0. It both makes the userbase unified around pip/conda installations and also gives us immediate feedback since our test suite can also test for Windows builds and hence, increased test coverage. It provided a nice reset about the painful installation process up to that point. I agree with Tyler and a ‘brutal’ review process is only possible with people who know what they are talking about.”
Implementing Formal Governance
Traditionally, Project leadership was provided by a subset of Contributors, called Core Developers, whose active and consistent contributions have been recognized by their receiving “commit rights” to the Project GitHub repositories. In general all Project decisions are made through consensus among the Core Developers with input from the Community.
While this approach has served us well, as the Project grows we see a need for a more formal governance model. The SciPy Core Developers expressed a preference for a leadership model which includes a BDFL (Benevolent Dictator for Life). Therefore, moving forward, the Project leadership will consist of a BDFL and a Steering Council.
SciPy now has a formal governance structure consisting of a Steering Council and a BDFL (Benevolent Dictator For Life). From the governance docs, “The overall role of the Council is to ensure, through working with the BDFL and taking input from the Community, the long-term well-being of the project, both technically and as a community […] the BDFL is more a role for fallback decision making rather than that of a director/CEO.” SciPy has also adopted an official code of conduct in order to “welcome and encourage participation by everyone.”
The current Steering Council members are:
- Anne Archibald
- Andrew Nelson
- Charles Harris
- CJ Carey
- Denis Laxalde
- Eric Larson
- Eric Moore
- Eric Quintero
- Evgeni Burovski
- Jaime Fernández del Río
- Josef Perktold
- Josh Wilson
- Matthew Brett
- Nikolay Mayorov
- Pauli Virtanen
- Ralf Gommers (Council Chair and Release Manager for 1.0)*
- Tyler Reddy
- Warren Weckesser
The BDFL (Benevolent Dictator For Life) is Pauli Virtanen.
*(Ralf Gommers is the current Secretary for the NumFOCUS Board of Directors and has served on our Board since 2012.)
Gina Helfrich, NumFOCUS Communications Director: “Why do you think it took so long for the project to move to a formal governance structure? What do you see as the most important benefits of formal governance?”
Tyler Reddy, SciPy dev & Steering Council Member: “The main ‘formal’ component that I’ve noticed recently is the implementation of a Code of Conduct. There are obvious advantages to deterring inappropriate behavior in a project of this size. We want to encourage a diverse pool of contributors and have a protected / anonymous reporting mechanism when an issue arises. Hopefully this means we have even more contributors moving forward, and that cases where potential contributors are (silently?) discouraged are reduced.
In terms of the delay in adopting formal governance: most of the issues that the core team addresses on a daily basis are related to computer code & can typically be resolved in a mathematical way (i.e., the classic ‘code wins arguments’). Many of us are drawn into open projects like SciPy because it allows us to make the world better using computer code, without having to worry about various political issues (using a single repository / code hosting tool, development workflow, and a single primary programming language is effectively a utopia for making progress compared to, e.g., ‘real-world’ science in many fields). What limited time we have we often want to spend building amazing things that can make the world better, rather than recreating political / governance structures that generate many of the obstacles to progress in more restrictive / fragmented ‘real-world’ (closed) development / scientific workflows.
I think that general prioritization of making amazing things with code will remain, but perhaps the recent (completely justified) increase in pressure to address discrimination / diversity issues in software engineering (and related fields) reached a sufficiently high critical mass that remaining informal in good behavior enforcement could have been risky; i.e., silence on the matter might be interpreted as a lack of interest in supporting a diverse contributor / core team. It is also perhaps a good reminder to those of us with a ‘coding first’ mindset that we have to strike a balance between ‘code wins arguments’ (i.e., harsh feedback) and being welcoming on, e.g., new pull requests to the project.”
Ilhan Polat, SciPy developer: “I don’t have all the necessary historical details about the internal organizational structure, but as far as I can see, there was also an external pressure from the open-source community standards. We might speculate that neighboring projects were, one after the other, assuming such formal structures and it was about time.
From my limited point of view, a formal governance gives a clear message about its sustained existence and its place in the ecosystem of Python packages to the outside world. It signals some sense of an implicit guarantee that your code can rely on this package and it is handled with a professional attitude rather than a bunch of people adding code as they see fit. Even in a package like SciPy, which was already quite mature, this makes a difference.
But I think, if you look at the software conferences in general lately, there is an increased awareness about issues such as diversity/handling misconducts/discrimination/