ML Libraries

Exploring using BlackJAX, JAX & PyTorch.

Web

  • Shumai - Fast differentiable tensor library for research in TypeScript and JavaScript. Built with bun + flashlight. (HN)
  • ml5.js - Friendly machine learning for the web.
  • ml.js - Machine learning tools in JavaScript.

Embedded

  • NNoM - High-level inference Neural Network library specifically for microcontrollers.

Other

  • SynapseML - Simple and Distributed Machine Learning. (Web) (Article)
  • imgaug - Image augmentation for machine learning experiments.
  • PlaidML - Framework for making deep learning work everywhere.
  • Leaf - Open Machine Intelligence Framework for Hackers. (GPU/CPU).
  • Apache MXNet - Deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity.
  • Sonnet - Library built on top of TensorFlow for building complex neural networks.
  • tvm - Open deep learning compiler stack for CPU, GPU and specialized accelerators.
  • dgl - Python package built to ease deep learning on graphs, on top of existing DL frameworks.
  • PySyft - Library for encrypted, privacy preserving deep learning.
  • numpy-ml - Machine learning, in numpy.
  • cuML - Suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.
  • ONNX Runtime - Cross-platform, high performance scoring engine for ML models. (Web) (HN)
  • MLflow - Machine Learning Lifecycle Platform.
  • auto-sklearn - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
  • TensorNetwork - Library for easy and efficient manipulation of tensor networks.
  • lambda-ml - Small machine learning library aimed at providing simple, concise implementations of machine learning techniques and utilities.
  • scikit-learn - Python module for machine learning built on top of SciPy. (Tutorials) (Course) (Web) (HN) (Examples)
  • MLBox - Powerful Automated Machine Learning Python library.
  • Mlxtend (machine learning extensions) - Python library of useful tools for day-to-day data science tasks.
  • CrypTen - Framework for Privacy Preserving Machine Learning built on PyTorch.
  • Faiss - Library for efficient similarity search and clustering of dense vectors. (Tips) (HN)
  • pyHSICLasso - Versatile Nonlinear Feature Selection Algorithm for High-dimensional Data.
  • AutoGluon - AutoML Toolkit for Deep Learning.
  • DeepLearning.scala - Simple library for creating complex neural networks from object-oriented and functional programming constructs.
  • Optuna - Hyperparameter optimization framework. (Optuna Dashboard)
  • Vowpal Wabbit - Machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. (Web) (Article)
  • Brancher - User-centered Python package for differentiable probabilistic inference.
  • Karate Club - General purpose community detection and network embedding library for research built on NetworkX.
  • FlexFlow - Distributed deep learning framework that supports flexible parallelization strategies.
  • DeltaPy - Tabular Data Augmentation & Feature Engineering.
  • TensorStore - Library for reading and writing large multi-dimensional arrays. (Article)
  • FATE - Industrial Level Federated Learning Framework.
  • Deepkit - Collaborative and real-time machine learning training suite: Experiment execution, tracking, and debugging.
  • Sls - Stochastic Line Search.
  • PyCaret - Open source low-code machine learning library in Python that aims to reduce the hypothesis to insights cycle time in an ML experiment. (Web)
  • scikit-multilearn - Python module capable of performing multi-label learning tasks.
  • imbalanced-learn - Python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
  • DeepSpeed - Deep learning optimization library that makes distributed training easy, efficient, and effective.
  • HoMM - Library for Homoiconic Meta-mapping.
  • Hummingbird - Library for compiling trained traditional ML models into tensor computations.
  • Ax - Accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments.
  • Neuropod - Uniform interface to run deep learning models from multiple frameworks.
  • aerosolve - Machine learning package built for humans in Scala.
  • Kur - Descriptive Deep Learning.
  • NNI (Neural Network Intelligence) - Lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture Search, Hyperparameter Tuning and Model Compression.
  • LMfit-py - Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize.leastsq, and with many additional classes and methods for curve fitting.
  • tslearn - Machine learning toolkit for time series analysis in Python.
  • Libra - Ergonomic machine learning for everyone. (Docs)
  • NGBoost - Natural Gradient Boosting for Probabilistic Prediction.
  • LightGBM - Gradient boosting framework that uses tree based learning algorithms.
  • XGBoost - Optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.
  • DMLC-Core - Common bricks library for building scalable and portable distributed machine learning.
  • Linear Models - Add linear models including instrumental variable and panel data models that are missing from statsmodels.
  • skift - scikit-learn wrappers for Python fastText.
  • pulearn - Positive-unlabeled learning with Python.
  • pescador - Library for streaming (numerical) data, primarily for use in machine learning applications.
  • TPOT (Tree-based Pipeline Optimization Tool) - Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. (Docs)
  • GraKeL - Library that provides implementations of several well-established graph kernels. scikit-learn compatible.
  • creme - Python library for online machine learning. All the tools in the library can be updated with a single observation at a time, and can therefore be used to learn from streaming data. (Docs)
  • RecBole - Unified, comprehensive and efficient recommendation library.
  • NNFusion - Flexible and efficient DNN compiler that can generate high-performance executables from a DNN model description.
  • ncnn - High-performance neural network inference computing framework optimized for mobile platforms.
  • Scikit-Optimize - Sequential model-based optimization with a scipy.optimize interface.
  • scikit-rebate - Scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
  • Fedlearner - Collaborative machine learning framework that enables joint modeling of data distributed between institutions.
  • SkLearn2PMML - Python library for converting Scikit-Learn pipelines to PMML.
  • vecstack - Python package for stacking (machine learning technique).
  • LightSeq - High Performance Inference Library for Sequence Processing and Generation.
  • modestpy - Facilitates parameter estimation in models compliant with Functional Mock-up Interface.
  • Distiller - Open-source Python package for neural network compression research.
  • modAL - Modular active learning framework for Python.
  • Bambi - BAyesian Model-Building Interface in Python.
  • Bolt - Deep learning library with high performance and heterogeneous flexibility.
  • hypothesis - Python toolkit for (simulation-based) inference and the mechanization of science.
  • MMFeat - Multi-modal features toolkit in Python.
  • Flower - Friendly Federated Learning Framework. (Web) (Flower Summit 2021)
  • brain.js - GPU accelerated Neural networks in JavaScript for Browsers and Node.js. (Web)
  • Buffalo - Fast and scalable production-ready open source project for recommender systems.
  • EvalML - AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.
  • MindSpore - New open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
  • Flashlight - Fast, Flexible Machine Learning in C++.
  • raster-deep-learning - ArcGIS built-in python raster functions for deep learning to get you started fast.
  • CTranslate2 - Fast inference engine for OpenNMT models.
  • Causal Discovery Toolbox - Algorithms for graph structure recovery (including algorithms from the bnlearn, pcalg packages), mainly based out of observational data.
  • FedML - Research Library and Benchmark for Federated Machine Learning.
  • Auto_TS - Automatically build multiple Time Series models using a Single Line of Code.
  • AutoGL (Auto Graph Learning) - AutoML framework & toolkit for machine learning on graphs.
  • tsalib - Tensor Shape Annotation Library (numpy, tensorflow, pytorch, ...).
  • MMClassification - Open source image classification toolbox based on PyTorch.
  • Nimble - Lightweight and Parallel GPU Task Scheduling for Deep Learning.
  • Dannjs - Neural Network library for JavaScript. (Web)
  • Shapley - Python library for evaluating binary classifiers in a machine learning ensemble.
  • Orion - Machine learning library built for unsupervised time series anomaly detection.
  • BigDL - Distributed Deep Learning on Apache Spark. (Docs)
  • MNN - Blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba.
  • Haste - CUDA implementation of fused RNN layers with built-in DropConnect and Zoneout regularization.
  • sklearn-xarray - Metadata-aware machine learning.
  • dabnn - Accelerated binary neural networks inference framework for mobile platform.
  • OneFlow - Performance-centered and open-source deep learning framework.
  • DeepWalk - Deep Learning for Graphs. (Web)
  • sequitur - Autoencoders for sequence data.
  • cleanlab - Machine learning Python package for learning with noisy labels and finding label errors in datasets. (Web) (Lobsters)
  • deeptime - Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation.
  • Jelly Bean World - Framework for experimenting with never-ending learning.
  • Larq - Open-source deep learning library for training neural networks with extremely low precision weights and activations, such as Binarized Neural Networks (BNNs). (Web)
  • tsai - State-of-the-art Deep Learning for Time Series and Sequence Modeling.
  • edbo - Experimental Design via Bayesian Optimization.
  • TensorJS - JS/TS library for accelerated tensor computation intended to be run in the browser.
  • micro-TCN - Efficient neural networks for audio effect modeling. (Web)
  • DESlib - Python library for dynamic classifier and ensemble selection.
  • BytePS - High performance and generic framework for distributed DNN training.
  • Hyperactive - Hyperparameter optimization and meta-learning toolbox for convenient and fast prototyping of machine-learning models.
  • Jittor - Just-in-time (JIT) deep learning framework.
  • autofeat - Linear Prediction Model with Automated Feature Engineering and Selection Capabilities.
  • Distrax - Lightweight library of probability distributions and bijectors. It acts as a JAX-native reimplementation of a subset of TensorFlow Probability (TFP).
  • scikit-learn-extra - Set of useful tools compatible with scikit-learn.
  • GeneticAlgorithmPython - Building Genetic Algorithm in Python.
  • Newt - Gaussian process library in JAX.
  • Hedgehog - Bayesian networks in Python.
  • Backdoors 101 - PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.
  • Sabertooth - Standalone pre-training recipe with JAX+Flax.
  • ProbFlow - Python package for building Bayesian models with TensorFlow or PyTorch.
  • Mars - Tensor-based unified framework for large-scale data computation which scales NumPy, pandas, scikit-learn and Python functions.
  • DeepMatch - Deep matching model library for recommendations & advertising.
  • Layout Parser - Unified toolkit for Deep Learning Based Document Image Analysis. (Web)
  • scikit-survival - Survival analysis built on top of scikit-learn.
  • PySR - Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing.
  • Snowman Hotword Detection
  • CLU - Contains common functionality for writing ML training loops using JAX.
  • SparseML - Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.
  • CogDL - Extensive Toolkit for Deep Learning on Graphs. (Web)
  • TensorLy - Tensor Learning in Python. (Web)
  • Cornac - Comparative Framework for Multimodal Recommender Systems.
  • MegEngine - Fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
  • SeqIO - Task-based datasets, preprocessing, and evaluation for sequence models.
  • OpenAI Python - Provides convenient access to the OpenAI API from applications written in Python.
  • Mesh Transformer JAX - Model parallel transformers in JAX and Haiku. (HN)
  • Checking out a 6-Billion parameter GPT model, GPT-J, from Eleuther AI (2021)
  • deepC - Vendor independent deep learning library, compiler and inference framework designed for small form-factor devices.
  • Dlib - Modern C++/Python Toolkit for Machine Learning. (Web) (HN)
  • Continuum - Clean and simple data loading library for Continual Learning.
  • Smile - Statistical Machine Intelligence & Learning Engine.
  • AugLy - Data augmentations library for audio, image, text, and video.
  • Surprise - Python scikit for building and analyzing recommender systems. (Web)
  • TNN - High-performance, lightweight neural network inference framework.
  • Parallax - Immutable Torch Modules for JAX.
  • EvalAI - Open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. (Web)
  • Avalanche - End-to-End Library for Continual Learning. (Docs)
  • PyKale - Knowledge-Aware machine LEarning (KALE) from multiple sources in Python.
  • mltrace - Coarse-grained lineage and tracing for machine learning pipelines.
  • PPLNN - High-performance deep-learning inference engine for efficient AI inferencing.
  • Petastorm - Enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format.
  • Collie - Library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch. (Docs)
  • voxelmorph - Unsupervised Learning for Image Registration.
  • uTensor - TinyML AI inference library.
  • Tangram - Train a model from a CSV file on the command line. (Web) (HN)
  • AdaptDL - Resource-adaptive cluster scheduler for deep learning training.
  • Triage - General Purpose Risk Modeling and Prediction Toolkit for Policy and Social Good Problems.
  • Gorse - Open source recommender system service written in Go. (Web) (HN)
  • LensKit - Python Tools for Recommender Experiments. (Web)
  • StarSpace - Learning embeddings for classification, retrieval and ranking.
  • ELFI - Engine for Likelihood-Free Inference. (Docs)
  • DaisyRec - Python toolkit dealing with rating prediction and item ranking issues.
  • AutoTS - Forecasting Model Selection for Multiple Time Series.
  • PyFlux - Open source time series library for Python.
  • trajax - Python library for differentiable optimal control on accelerators.
  • TransmogrifAI - End-to-end AutoML library for structured data written in Scala that runs on top of Apache Spark. (Web)
  • chitra - Multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and Model Deployment.
  • DoubleML - Double Machine Learning in Python.
  • jaxfg - Factor graphs and nonlinear optimization in JAX.
  • pyltr - Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.
  • Wrangl - Ray-based parallel data preprocessing for NLP and ML.
  • Treex - Pytree-based Module system for Deep Learning in JAX. (Docs)
  • PhiFlow - Open-source simulation toolkit built for optimization and machine learning applications.
  • OpenVINO Toolkit - Deploy pre-trained deep learning models through a high-level C++ Inference Engine API integrated with application logic.
  • WILDS - Machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.
  • TurboTransformers - Fast and user-friendly runtime for transformer inference on CPU and GPU.
  • DeepOps - Mini Deep Learning framework supporting GPU accelerations written with CUDA.
  • Bayex - Bayesian Optimization Python Library powered by JAX.
  • Merlion - Machine Learning Framework for Time Series Intelligence.
  • Feast - Feature Store for Machine Learning. (Web)
  • nnabla - Neural Network Libraries by Sony. (Web)
  • RevLib - Simple and efficient RevNet-Library with DeepSpeed support.
  • DeepSparse - Neural network inference engine that delivers GPU-class performance for sparsified models on CPUs.
  • NVTabular - Engineering and preprocessing library for tabular data that is designed to easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems.
  • Treeo - Small library for creating and manipulating custom JAX Pytree classes.
  • FedJAX - JAX-based open source library for Federated Learning simulations that emphasizes ease-of-use in research.
  • oneAPI - OneAPI Deep Neural Network Library (oneDNN).
  • MosaicML Composer - Library of methods, and ways to compose them together for more efficient ML training.
  • deep-significance - Easy and Better Significance Testing for Deep Neural Networks.
  • Finetuner - Finetuning any DNN for better embedding on neural search tasks. (Docs)
  • mlcrate - Python module of handy tools and functions, mainly for ML and Kaggle.
  • mle-hyperopt - Lightweight Hyperparameter Optimization Tool.
  • Feature Engine - Python library with multiple transformers to engineer and select features for use in machine learning models.
  • BaaL - Bayesian active learning library.
  • TorchArrow - torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format.
  • Arm NN - Software and tools that enable machine learning workloads on power-efficient devices.
  • OpenRec - Open-source and modular library for neural network-inspired recommendation algorithms.
  • ColossalAI - Unified Deep Learning System for Large-Scale Parallel Training. (Docs) (Examples)
  • XManager - Framework for managing machine learning experiments.
  • T5X - Modular, composable, research-friendly framework for high-performance, configurable, self-service training.
  • mlinspect - Inspect ML Pipelines in Python in the form of a DAG.
  • Privacy Lint - Library that allows you to perform a privacy analysis (Membership Inference) of your model in PyTorch.
  • NVIDIA Object Detection Toolkit (ODTK) - Fast and accurate single stage object detection with end-to-end GPU optimization.
  • DeAI - Decentralized privacy-preserving ML training software framework, using p2p networking.
  • Varuna - Tool for efficient training of large DNN models on commodity GPUs and networking.
  • reXmeX - General purpose recommender metrics library for fair evaluation.
  • Einshape - DSL-based reshaping library for JAX and other frameworks.
  • BlobCity AutoAI - Framework to find the best performing AI/ML model for any AI problem.
  • PyPAL - Multiobjective active learning with tunable accuracy/efficiency tradeoff and clear stopping criterion.
  • dcbench - Benchmark of data-centric tasks from across the machine learning lifecycle.
  • Cockpit - Visual and statistical debugger specifically designed for deep learning.
  • CatBoost - Machine learning method based on gradient boosting over decision trees. (Web) (Tutorials)
  • Xplique - Neural Networks Explainability Toolbox.
  • Causal ML - Python Package for Uplift Modeling and Causal Inference with ML.
  • sklearn-onnx - Convert scikit-learn models and pipelines to ONNX.
  • Tools for JAX - Variety of tools for the differential programming library JAX.
  • KML - Machine Learning Framework for Operating Systems & Storage Systems. (HN)
  • ENN Incubator - Collection of in-progress libraries for entity neural networks.
  • Syne Tune - Large scale and asynchronous Hyperparameter Optimization at your fingertips.
  • Maggy - Framework for distribution transparent machine learning experiments on Apache Spark.
  • Apache SINGA - Distributed deep learning system. (Web)
  • Tiny CUDA Neural Networks - Lightning fast & tiny C++/CUDA neural network framework.
  • Apache TVM - Open Deep Learning Compiler Stack.
  • imodels - Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
  • FLSim - Flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API.
  • Human Learn - Machine Learning models should play by the rules, literally.
  • MiniTorch - DIY teaching library for machine learning engineers who wish to learn about the internal concepts underlying deep learning systems.
  • TorchRecipes - Train machine learning models with a couple of lines of code.
  • DABS - Domain-Agnostic Benchmark for Self-Supervised Learning.
  • apricot - Implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.
  • Theseus - Library for differentiable nonlinear optimization built on PyTorch.
  • MMSelfSup - OpenMMLab Self-Supervised Learning Toolbox and Benchmark.
  • NVFlare - NVIDIA Federated Learning Application Runtime Environment. (Docs)
  • OSLO - Open Source framework for Large-scale transformer Optimization.
  • snntorch - Deep and online learning with spiking neural networks in Python.
  • NVIDIA DALI - GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
  • MIPLearn - Framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML).
  • tree-math - Mathematical operations for JAX pytrees.
  • ExplainX - Explainable AI framework for data scientists. Explain & debug any blackbox machine learning model with a single line of code.
  • Contextual AI - Adds explainability to different stages of machine learning pipelines.
  • jax_dataclasses - Pytrees + static analysis.
  • kingly - Zero-cost state-machine library for robust, testable and portable user interfaces (most machines compile ~1-2KB).
  • RTNeural - Lightweight neural network inferencing engine written in C++.
  • JAXopt - Hardware accelerated, batchable and differentiable optimizers in JAX.
  • chop - Optimization library based on PyTorch, with applications to adversarial examples and structured neural network training.
  • WebDNN - Fastest DNN Running Framework on Web Browser.
  • nonconformist - Python implementation of the conformal prediction framework.
  • jaxdf - JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations.
  • DoWhy - End-to-end library for causal inference.
  • hypopt - Parallelized hyper-param optimization with validation set, not crossval.
  • ML Collections - Library of Python Collections designed for ML use cases.
  • Latte - Cross-framework Python Package for Evaluation of Latent-based Generative Models.
  • Raster Vision - Open source framework for deep learning on satellite and aerial imagery.
  • SPEAR - Semi-Supervised Data Programming for Data Efficient Machine Learning.
  • Ivy - Unified machine learning framework, enabling framework-agnostic functions, layers and libraries. (Web)
  • NeuralForecast - Python library for time series forecasting with deep learning models.
  • Pythae - Library for Variational Autoencoder benchmarking. (Paper)
  • Pyraug - Data Augmentation with Variational Autoencoders.
  • product-quantization - Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
  • learned_optimization - Training and evaluating learned optimizers in JAX.
  • OTT - Sturdy, versatile and efficient optimal transport solvers, taking advantage of JAX features, such as JIT, auto-vectorization and implicit differentiation.
  • Marian - Efficient Neural Machine Translation framework written in pure C++ with minimal dependencies. (Web)
  • segmind - MLOps for end-to-end deep learning lifecycle.
  • FLUTE - Federated Learning Utilities and Tools for Experimentation.
  • evosax - JAX-Based Evolution Strategies.
  • Neural Processes - Framework for composing Neural Processes in Python.
  • Anomalib - Library for benchmarking, developing and deploying deep learning anomaly detection algorithms.
  • Fasterai - Library to make smaller and faster models with FastAI.
  • ClearML Server - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, ML-Ops and Data-Management.
  • Human Library - 3D Face Detection & Rotation Tracking, Face Description & more.
  • Towhee - Flexible, application-oriented framework for generating embedding vectors via a pipeline of ML models and other operations.
  • AutoFaiss - Automatically create Faiss knn indices with the most optimal similarity search parameters.
  • Statistical Forecast - Lightning fast forecasting with statistical and econometric models.
  • MLSpec - Standardize the intercomponent schemas for a multi-stage ML Pipeline.
  • Alfred Python - Command line tool for deep-learning usage.
  • Bacon - Framework for orchestrating machine learning experiments on AWS.
  • PyClustering - Python, C++ data mining library.
  • PQk-means - Fast and memory-efficient clustering.
  • LeanTransformer - Memory-efficient transformer.
  • HoloClean - Machine Learning System for Data Enrichment. Built on top of PyTorch and PostgreSQL.
  • OpenDelta - Open-Source Framework for Parameter Efficient Tuning (Delta Tuning).
  • Alpa - Automatically parallelizes tensor computational graphs and runs them on a distributed cluster.
  • GPBoost - Combining Tree-Boosting with Gaussian Process and Mixed Effects Models.
  • CORDS - Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.
  • DISTIL - Cut down your labeling cost and time by 3x-5x.
  • OpenFL - Open-Source Framework For Federated Learning.
  • Basenji - Sequential regulatory activity predictions with deep convolutional neural networks.
  • PyDP - Python Differential Privacy Library.
  • veGiantModel - Torch-based, highly efficient training library developed by the Applied Machine Learning team at Bytedance.
  • Flame - Federated learning system for edge with flexibility and scalability at the core of its design.
  • DPU Utilities - Utilities used by the Deep Program Understanding team.
  • XGBoost-Ray - Distributed backend for XGBoost, built on top of distributed computing framework Ray.
  • Easy Parallel Library - General and efficient library for distributed model training.
  • MetricFlow - Allows you to define, build, and maintain metrics in code.
  • HuggingFace Evaluate
  • PADL - Pipeline Abstractions for Deep Learning.
  • Vertex AI SDK for Python - Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
  • Tempo - MLOps Python Library.
  • LightFM - Python implementation of LightFM, a hybrid recommendation algorithm.
  • fklearn - Functional Machine Learning.
  • Transformer PhysX - Transformers for modeling physical systems.
  • Feathr - Enterprise-Grade, High Performance Feature Store. (Article)
  • To what extent can Rust be used for Machine Learning? (2022)
  • Vectorflow - Minimalist neural network library optimized for sparse data and single machine environments.
  • D2Go - Toolkit for efficient deep learning.
  • Slideflow - Deep learning pipeline for histology image analysis, with both TensorFlow and PyTorch support.
  • Forte - Bring good software engineering to your ML solutions, starting from Data.
  • Machine Learning(-ish) nix packages
  • PaddleSeg - High-Efficient Development Toolkit for Image Segmentation.
  • TorchSparse - High-performance neural network library for point cloud processing.
  • H2O - In-memory platform for distributed, scalable machine learning.
  • Ranger - Synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one code base.
  • Unseal - Mechanistic Interpretability for Transformer Models.
  • ANTsPy - Advanced Normalization Tools in Python.
  • FasterTransformer Backend - Triton backend for the FasterTransformer.
  • Nixtla - Automated time series processing and forecasting.
  • FederatedScope - Easy-to-use federated learning platform.
  • Habitat Lab - Modular high-level library to train embodied AI agents across a variety of tasks, environments, and simulators.
  • Ranger21 - Integrating the latest deep learning components into a single optimizer.
  • Tevatron - Flexible toolkit for dense retrieval research and development.
  • mlrose - Python package for implementing a number of Machine Learning, Randomized Optimization and SEarch algorithms.
  • Scikit-Learn Compiled Trees
  • KotlinDL - High-level Deep Learning Framework written in Kotlin and inspired by Keras.
  • PGBM - Probabilistic Gradient Boosting Machines.
  • Fiddle - Python-first configuration library particularly well suited to ML applications.
  • tpunicorn - Python library and command-line program for managing TPUs.
  • CLAP - Contrastive Language-Audio Pretraining.
  • COMET - Neural Framework for MT Evaluation.
  • Magnitude - Feature-packed Python package and vector storage file format for utilizing vector embeddings in machine learning models.
  • TorchANI - Accurate Neural Network Potential on PyTorch.
  • gap-train - Gaussian Approximation Potential Training.
  • lleaves - LLVM-based compiler for LightGBM decision trees.
  • TensorScript - High-level language for specifying finite-dimensioned tensor computation. (Web)
  • Neural Fluid Fields - Small library for doing fluid simulation with neural fields.
  • OmniXAI - Library for eXplainable AI.
  • mmap.ninja - Library for storing your datasets in memory-mapped files, which leads to a dramatic speedup in the training time. Accelerate the iteration over your machine learning dataset by up to 20 times.
  • geomloss - Geometric loss functions between point clouds, images and volumes.
  • morphsnakes - Implementation of the Morphological Snakes for image segmentation. Supports 2D images and 3D volumes.
  • HyperLib - Common Neural Network components in the hyperbolic space (using the Poincare model).
  • Lite.Ai.ToolKit - C++ toolkit of awesome AI models.
  • RecZilla - Metalearning for algorithm selection on Recommender Systems.
  • EdgeML - Machine learning algorithms for edge devices developed at Microsoft Research India.
  • Quaterion - Framework for fine-tuning similarity learning models.
  • SecretFlow - Unified framework for privacy-preserving data analysis and machine learning.
  • pycox - Python package for survival analysis and time-to-event prediction with PyTorch.
  • AI2 Tango - Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.
  • ADAPT - Awesome Domain Adaptation Python Toolbox.
  • giotto-deep - Deep learning made topological.
  • DeepSpeed-MII - Library from DeepSpeed, designed to make low-latency, low-cost inference of powerful transformer models feasible.
  • logreg - Bayesian inference for a logistic regression model in various languages.
  • PINA - Physics-Informed Neural networks for Advanced modeling.
  • PyCave - Traditional Machine Learning Models for Large-Scale Datasets in PyTorch.
  • Draco - Formal framework for representing design knowledge about effective visualization design as a collection of constraints.
  • GRAPE - Rust/Python library for high-performance Graph Processing and Embedding.
  • dp-transformers - Differentially-private transformers using HuggingFace and Opacus.
  • TinyMaix - Tiny inference library for microcontrollers (TinyML).
  • x-unet - Implementation of a U-net complete with efficient attention as well as the latest research findings.
  • TorchPQ - Efficient implementations of Product Quantization and its variants using PyTorch and CUDA.
  • Mjx - Framework for Mahjong AI research.
  • LibMTL - PyTorch Library for Multi-Task Learning.
  • FEDOT - Automated modeling and machine learning framework.
  • ELI5 - Library for debugging/inspecting machine learning classifiers and explaining their predictions.
  • DeePMD-kit - Deep learning package for many-body potential energy representation and molecular dynamics.
  • Dragonfly - Open source Python library for scalable Bayesian optimization.
  • fastMONAI - Simplifying deep learning for medical imaging.
  • Contextual Bandits - Python implementations of contextual bandits algorithms.
  • Open3D-ML - Extension of Open3D to address 3D Machine Learning tasks.
  • Daft - Fast, ergonomic and scalable open-source dataframe library: built for Python and Complex Data/Machine Learning workloads.
  • StellarGraph - Machine Learning on Graphs.
  • Sliceline - Python library for fast slice finding for Machine Learning model debugging.
  • AITemplate - Python framework which renders neural networks into high-performance CUDA/HIP C++ code. (Article)
  • spidr - Accelerated machine learning with dependent types.
  • Transformer Engine - Library for accelerating Transformer models on NVIDIA GPUs.
  • visu3d - 3D without friction (TF, JAX, NumPy).
  • Simulate - Creating and sharing simulation environments for embodied and synthetic data research.
  • smol - Statistical Mechanics on Lattices.
  • pathos - Parallel graph management and execution in heterogeneous computing.
  • FewBit - Library for memory efficient training of large neural networks.
  • Vizier - Reliable and Flexible Blackbox Optimization.
  • SubModLib - Easy-to-use, efficient and scalable Python library for submodular optimization with a C++ optimization engine.
  • be_great - Novel approach for synthesizing tabular data using pretrained large language models.
  • TabSurvey - Experiments on Tabular Data Models.
  • pocoMC - Python implementation of Preconditioned Monte Carlo for accelerated Bayesian Computation.
  • WTTE-RNN - Framework for churn and time to event prediction.
  • ONE - High-performance, on-device neural network inference framework.
  • Neograd - Deep learning framework created from scratch with Python and NumPy.
  • MMEval - Unified and open cross-framework evaluation library.
  • cuda-convnet - High-performance C++/CUDA implementation of abstract convolutional neural networks.
  • Vectory - Collection of tools to track and compare embedding versions.
  • Lovely Tensors - Tensors, ready for human consumption.
  • EPyMARL - Extended Python MARL framework.
  • Zero - Mangaki's recommendation algorithms.
  • LassoNet - Feature selection in neural networks.
  • AutoOED - Automated Optimal Experimental Design Platform.
  • WeightWatcher - Tool for predicting the accuracy of Deep Neural Networks. (Web)
  • Web Neural Network API samples
  • SparseTIR - Sparse Tensor Compiler for Deep Learning.
  • Embetter - Scikit-learn compatible embeddings for computer vision and text.
  • FlowMC - Normalizing-flow enhanced sampling package for probabilistic inference.
  • PyImpetus - Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features.
  • TPU Care - Automatically take good care of your preemptible TPUs.
  • Merlin Dataloader - Lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX.
  • SISH - Fast and scalable search of whole-slide images via self-supervised deep learning.
  • TuneTA - Intelligently optimizes technical indicators and optionally selects the least intercorrelated for use in machine learning models.
  • PyTensor - Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays.
  • SIATune - Hyperparameter Tuning Toolbox for OpenMMLab Frameworks, especially for Remote Sensing Tasks.
  • ggml - Tensor library for machine learning in C.
  • ONNXRuntime-Extensions - Pre- and post-processing library for ONNX Runtime.
  • McTorch Lib - Manifold optimization functionality for PyTorch.
  • Torchhd - Python library for Hyperdimensional Computing.
  • DeeProb-kit - Python Library for Deep Probabilistic Modeling.
  • Allegro - Building highly scalable and accurate equivariant deep learning interatomic potentials.
  • Tiktoken - Fast tokenizer by OpenAI. (HN)
  • Fortuna - Library for Uncertainty Quantification.
  • A-UNet - Library that provides building blocks to customize UNets, in PyTorch.
  • Concrete-ML - Privacy-Preserving Machine Learning (PPML) open-source set of tools built on top of The Concrete Framework by Zama.
  • Poniard - Scikit-learn companion library that streamlines the process of fitting different machine learning models and comparing them.
  • Mango - Parallel Hyperparameter Tuning in Python.
  • River - Online machine learning in Python.
  • hypertune - Library for performing hyperparameter optimization.
  • CausalPy - Python package for causal inference in quasi-experimental settings.
  • TensorDict - PyTorch dedicated tensor container.
  • Runhouse
  • Hidet - Compilation-based DNN inference framework.
  • Colossal-AI - Unified Deep Learning System for the Big Model Era.
  • difflogic - Library for Differentiable Logic Gate Networks.
  • EZKL - Library and command-line tool for doing inference for deep learning models and other computational graphs in a zk-SNARK.
  • DeLFT - Deep Learning Framework for Text.
  • lambdaprompt - Functional programming interface for building AI systems.
  • D-Adaptation - D-Adaptation for SGD, Adam and AdaGrad.
  • BBopt - Black box hyperparameter optimization made easy.
  • PyMC-Marketing - Bayesian marketing toolbox in PyMC. Media Mix, CLV models and more.
  • skforecast - Time series forecasting with scikit-learn regressors.
  • Alibi - Algorithms for explaining machine learning models.
  • UCC - Unified Communication Collectives Library.
  • Renate - Library for automatic retraining and continual learning.
  • CausalAI - Fast and Scalable framework for Causal Analysis of Time Series and Tabular Data.
  • Flashy - Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit.
  • Streaming - Data Streaming Library for Efficient Neural Network Training.
  • LogAI - Library for Log Analytics and Intelligence.
  • nn-Meter - DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
  • pyhf - Pure-Python HistFactory implementation with tensors and autodiff.
  • fcmaes - Python 3 gradient-free optimization library.
  • pyPESTO - Widely applicable and highly customizable toolbox for parameter estimation.
  • textgenrnn - Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
  • Parallelformers - Efficient Model Parallelization Toolkit for Deployment.
  • BenchML - ML benchmarking and pipelining framework.
  • MIOpen - AMD's library for high performance machine learning primitives.
  • Substra - Federated learning (FL) software. Enables the training and validation of machine learning models on distributed datasets.
  • Rust Circuit - Library for expressing and manipulating tensor computations for neural network interpretability, written in Rust and used in Python notebooks.
  • SparseConvNet - PyTorch library for training Submanifold Sparse Convolutional Networks.
  • PyLops - Linear-Operator Library for Python.
  • robustlearn - Robust machine learning for responsible AI.
  • fastRAG - Efficient Retrieval Augmentation and Generation Framework.
  • HuggingFace.js - Utilities to use the Hugging Face hub API.
  • DLRover - Automatically trains the Deep Learning model on the distributed cluster.
  • Consistency Models - Mini-library for training consistency models.
  • GraphStorm - Graph machine learning (GML) framework for enterprise use cases.
  • scikit-learn-ts - Powerful machine learning library for Node.js – uses Python's scikit-learn under the hood.
  • MMdnn - Comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models.
  • BanditPAM - C++ implementation and Python package for k-medoids clustering. (HN)
  • pandas_dq - Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
  • FLAML - Fast library for AutoML and tuning.
  • ms2ml - Helps you convert raw mass spec data into tensors.
  • eindex - Multidimensional indexing for tensors.
  • Unmanic - Library Optimiser.
  • scikit-matter - Collection of scikit-learn compatible utilities that implement methods born out of the materials science and chemistry communities.
  • pymoo - NSGA2, NSGA3, R-NSGA3, MOEAD, Genetic Algorithms (GA), Differential Evolution (DE), CMAES, PSO.
  • Hamilton - General purpose micro-framework for creating dataflows from Python functions.
  • Outlines - Generative Model Programming. (Tweet) (HN)
  • crepes - Conformal regressors and predictive systems.
  • Loopy - Transformation-Based Generation of High-Performance CPU/GPU Code.
  • MSPrior - Multi(scale/stream) prior model for realtime temporal learning.
  • EasyRunner - Lightweight tool for efficiently managing and executing parallel experiments.
  • synthcity - Library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
  • Multi-Output Gaussian Process Toolkit
  • Aeon - Unified framework for machine learning with time series. (HN)
  • Nixtla - Scalable machine learning for time series forecasting.
  • MS-AMP - Microsoft Automatic Mixed Precision Library.
  • AXLearn - Library for deep learning built upon JAX and GSPMD to support large-scale training.
  • functime - Time-series machine learning and embeddings at scale.
  • Torch-Grammar - Restricts a model to output a token sequence that conforms to a provided EBNF grammar.
  • CMSIS NN - Efficient neural network kernels developed to maximize the performance and minimize the memory footprint of neural networks on Arm Cortex-M processors.
  • Trident - Performance library for machine learning applications.
  • ytopt - Machine-learning-based search methods for autotuning.
  • ragas - Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines.
  • pykoi - Active learning in one unified interface. (Web) (HN)
  • TensorRTx - Implementation of popular deep learning networks with TensorRT network definition API.
  • GSLB - Comprehensive benchmark of Graph Structure Learning.
  • aleatory - Python library for Stochastic Processes Simulation and Visualisation.
  • micrograd - Minimalist neural networks library built on a tiny autograd engine.
  • NATTEN - Interface to neighborhood attention, and more generally sliding window attention.
  • teenygrad - If tinygrad wasn't small enough for you.
  • Autodidact - Pedagogical implementation of Autograd.
  • disco - Toolkit for Distributional Control of Generative Models.