NumFOCUS Affiliated Projects are focused on open source data science, make meaningful use of NumFOCUS-sponsored tools, have a significant and consistent community of contributors, and have supported the open source data science computing community through contributions of code. Affiliated Projects are not fiscally sponsored by NumFOCUS.
We highlight affiliated projects to encourage the community to contribute to, promote, and support these open source tools! If your project meets the above criteria and you would like to become a NumFOCUS Affiliated Project, please .
Bokeh is a Python interactive visualization library that targets modern web browsers for presentation.
Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, but also deliver this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
Conda is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them.
It works on Linux, OS X and Windows, and was created for Python programs but can package and distribute any software.
A community-led collection of recipes, build infrastructure and distributions for the conda package manager.
The conda-forge GitHub organization contains thousands of repositories of conda recipes and a framework of automation scripts for facilitating CI setup and maintenance for these recipes. The goal is to provide peer-reviewed, community-standard recipes and a self-consistent ecosystem of binary packages that those recipes produce. In its current implementation, conda-forge relies on free services from AppVeyor, CircleCI and Travis CI to power the continuous build service on Windows, Linux and OS X, respectively. Each recipe is contained in a separate repository also containing the CI configuration. This repository is referred to as a feedstock, and is automatically built in a clean and repeatable way on each platform. Package hosting is currently done on anaconda.org.
Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.
This allows developers to write complex parallel algorithms and execute them in parallel either on a modern multi-core machine or on a distributed cluster.
The Data Retriever is a package manager for data. It downloads, cleans, and stores publicly available data, so that analysts spend less time cleaning and managing data, and more time analyzing it.
It is inspired by NumPy, the Python array programming library at the core of the scientific Python stack, but tries to address a number of obstacles encountered by some of its users. Examples include support for variable-sized strings, ragged array types, and convenient usage from C++. The library is in a preview development state, and can be thought of as a sandbox where features are being tried and tweaked to gain experience with them.
MDAnalysis is a Python library to analyze trajectories from molecular dynamics (MD) simulations.
It can read and write most popular formats, and provides a flexible and fast framework for writing custom analysis through making the underlying data easily available as NumPy arrays.
Numba gives you the power to speed up your applications with high performance functions written directly in Python.
With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.
pomegranate is a Python module for fast and flexible probabilistic modeling inspired by the design of scikit-learn.
A primary focus of pomegranate is to abstract away the intricacies of a model from its definition, allowing users to easily prototype with complex models and training strategies. Its modular implementation allows for probability distributions to be swapped in or out for each other with ease and for models to be stacked within each other, yielding such delights as a mixture of Bayesian networks or a Gaussian mixture model Bayes classifier.
It is also the name of a very popular conference on scientific programming with Python. The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization.
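To give a sense of the numerical routines mentioned above, here is a minimal sketch using `scipy.integrate.quad` for numerical integration and `scipy.optimize.minimize_scalar` for optimization. The integrand and the quadratic objective are arbitrary choices made for this example.

```python
import math

from scipy import integrate, optimize

# Numerical integration: integrate exp(-x^2) over the whole real line.
# The exact value is sqrt(pi), which makes the result easy to check.
area, abs_err = integrate.quad(lambda x: math.exp(-x * x), -math.inf, math.inf)

# Optimization: minimize a simple quadratic whose minimum lies at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area, math.sqrt(math.pi))
print(result.x, result.fun)
```

`quad` returns both the estimate and an error bound, and `minimize_scalar` returns a result object whose `x` and `fun` attributes hold the minimizer and the minimum value.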
Spack is a flexible package manager that builds multiple versions of packages for different configurations, platforms, and compilers. It was created to deploy large-scale scientific simulations on HPC systems, but it can deploy software on Linux and macOS machines, as well.
Interactive development environment for Python that features advanced editing, interactive testing, debugging and introspection capabilities, as well as a numerical computing environment made possible through the support of IPython, NumPy, SciPy, and matplotlib.
xarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures.