ML Libraries
Exploring using BlackJAX, JAX & PyTorch.
Top
Web
- Shumai - Fast differentiable tensor library for research in TypeScript and JavaScript. Built with bun + flashlight. (HN)
- ml5.js - Friendly machine learning for the web.
- ml.js - Machine learning tools in JavaScript.
Embedded
- NNoM - High-level inference Neural Network library specifically for microcontrollers.
Other
- SynapseML - Simple and Distributed Machine Learning. (Web) (Article)
- imgaug - Image augmentation for machine learning experiments.
- PlaidML - Framework for making deep learning work everywhere.
- Leaf - Open Machine Intelligence Framework for Hackers. (GPU/CPU).
- Apache MXNet - Deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity.
- Sonnet - Library built on top of TensorFlow for building complex neural networks.
- tvm - Open deep learning compiler stack for CPU, GPU and specialized accelerators.
- dgl - Python package built to ease deep learning on graphs, on top of existing DL frameworks.
- PySyft - Library for encrypted, privacy preserving deep learning.
- numpy-ml - Machine learning, in numpy.
- cuML - Suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.
- ONNX Runtime - Cross-platform, high performance scoring engine for ML models. (Web) (HN)
- MLflow - Machine Learning Lifecycle Platform.
- auto-sklearn - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
- TensorNetwork - Library for easy and efficient manipulation of tensor networks.
- lambda-ml - Small machine learning library aimed at providing simple, concise implementations of machine learning techniques and utilities.
- scikit-learn - Python module for machine learning built on top of SciPy. (Tutorials) (Course) (Web) (HN) (Examples)
- MLBox - Powerful Automated Machine Learning Python library.
- Mlxtend (machine learning extensions) - Python library of useful tools for day-to-day data science tasks.
- CrypTen - Framework for Privacy Preserving Machine Learning built on PyTorch.
- Faiss - Library for efficient similarity search and clustering of dense vectors. (Tips) (HN)
- pyHSICLasso - Versatile Nonlinear Feature Selection Algorithm for High-dimensional Data.
- AutoGluon - AutoML Toolkit for Deep Learning.
- DeepLearning.scala - Simple library for creating complex neural networks from object-oriented and functional programming constructs.
- Optuna - Hyperparameter optimization framework. (Optuna Dashboard)
- Vowpal Wabbit - Machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. (Web) (Article)
- Brancher - User-centered Python package for differentiable probabilistic inference.
- Karate Club - General purpose community detection and network embedding library for research built on NetworkX.
- FlexFlow - Distributed deep learning framework that supports flexible parallelization strategies.
- DeltaPy - Tabular Data Augmentation & Feature Engineering.
- TensorStore - Library for reading and writing large multi-dimensional arrays. (Article)
- FATE - Industrial Level Federated Learning Framework.
- Deepkit - Collaborative and real-time machine learning training suite: Experiment execution, tracking, and debugging.
- Sls - Stochastic Line Search.
- PyCaret - Open source low-code machine learning library in Python that aims to reduce the hypothesis to insights cycle time in an ML experiment. (Web)
- scikit-multilearn - Python module capable of performing multi-label learning tasks.
- imbalanced-learn - Python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
- DeepSpeed - Deep learning optimization library that makes distributed training easy, efficient, and effective.
- HoMM - Library for Homoiconic Meta-mapping.
- Hummingbird - Library for compiling trained traditional ML models into tensor computations.
- Ax - Accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments.
- Neuropod - Uniform interface to run deep learning models from multiple frameworks.
- aerosolve - Machine learning package built for humans in Scala.
- Kur - Descriptive Deep Learning.
- NNI (Neural Network Intelligence) - Lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture Search, Hyperparameter Tuning and Model Compression.
- LMfit-py - Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize.leastsq, and with many additional classes and methods for curve fitting.
- tslearn - Machine learning toolkit for time series analysis in Python.
- Libra - Ergonomic machine learning for everyone. (Docs)
- NGBoost - Natural Gradient Boosting for Probabilistic Prediction.
- LightGBM - Gradient boosting framework that uses tree based learning algorithms.
- XGBoost - Optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.
- DMLC-Core - Common bricks library for building scalable and portable distributed machine learning.
- Linear Models - Add linear models including instrumental variable and panel data models that are missing from statsmodels.
- skift - scikit-learn wrappers for Python fastText.
- pulearn - Positive-unlabeled learning with Python.
- pescador - Library for streaming (numerical) data, primarily for use in machine learning applications.
- TPOT (Tree-based Pipeline Optimization Tool) - Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. (Docs)
- GraKeL - Library that provides implementations of several well-established graph kernels. scikit-learn compatible.
- creme - Python library for online machine learning. All the tools in the library can be updated with a single observation at a time, and can therefore be used to learn from streaming data. (Docs)
- RecBole - Unified, comprehensive and efficient recommendation library.
- NNFusion - Flexible and efficient DNN compiler that can generate high-performance executables from a DNN model description.
- ncnn - High-performance neural network inference computing framework optimized for mobile platforms.
- Scikit-Optimize - Sequential model-based optimization with a scipy.optimize interface.
- scikit-rebate - Scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
- Fedlearner - Collaborative machine learning framework that enables joint modeling of data distributed between institutions.
- SkLearn2PMML - Python library for converting Scikit-Learn pipelines to PMML.
- vecstack - Python package for stacking (machine learning technique).
- LightSeq - High Performance Inference Library for Sequence Processing and Generation.
- modestpy - Facilitates parameter estimation in models compliant with Functional Mock-up Interface.
- Distiller - Open-source Python package for neural network compression research.
- modAL - Modular active learning framework for Python.
- Bambi - BAyesian Model-Building Interface in Python.
- Bolt - Deep learning library with high performance and heterogeneous flexibility.
- hypothesis - Python toolkit for (simulation-based) inference and the mechanization of science.
- MMFeat - Multi-modal features toolkit in Python.
- Flower - Friendly Federated Learning Framework. (Web) (Flower Summit 2021)
- brain.js - GPU accelerated Neural networks in JavaScript for Browsers and Node.js. (Web)
- Buffalo - Fast and scalable production-ready open source project for recommender systems.
- EvalML - AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.
- MindSpore - New open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
- Flashlight - Fast, Flexible Machine Learning in C++.
- raster-deep-learning - ArcGIS built-in Python raster functions for deep learning to get you started fast.
- CTranslate2 - Fast inference engine for OpenNMT models.
- Causal Discovery Toolbox - Algorithms for graph structure recovery (including algorithms from the bnlearn, pcalg packages), mainly based out of observational data.
- FedML - Research Library and Benchmark for Federated Machine Learning.
- Auto_TS - Automatically build multiple Time Series models using a Single Line of Code.
- AutoGL (Auto Graph Learning) - AutoML framework & toolkit for machine learning on graphs.
- tsalib - Tensor Shape Annotation Library (numpy, tensorflow, pytorch, ...).
- MMClassification - Open source image classification toolbox based on PyTorch.
- Nimble - Lightweight and Parallel GPU Task Scheduling for Deep Learning.
- Dannjs - Neural Network library for JavaScript. (Web)
- Shapley - Python library for evaluating binary classifiers in a machine learning ensemble.
- Orion - Machine learning library built for unsupervised time series anomaly detection.
- BigDL - Distributed Deep Learning on Apache Spark. (Docs)
- MNN - Blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba.
- Haste - CUDA implementation of fused RNN layers with built-in DropConnect and Zoneout regularization.
- sklearn-xarray - Metadata-aware machine learning.
- dabnn - Accelerated binary neural networks inference framework for mobile platform.
- OneFlow - Performance-centered and open-source deep learning framework.
- DeepWalk - Deep Learning for Graphs. (Web)
- sequitur - Autoencoders for sequence data.
- cleanlab - Machine learning Python package for learning with noisy labels and finding label errors in datasets. (Web) (Lobsters)
- deeptime - Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation.
- Jelly Bean World - Framework for experimenting with never-ending learning.
- Larq - Open-source deep learning library for training neural networks with extremely low precision weights and activations, such as Binarized Neural Networks (BNNs). (Web)
- tsai - State-of-the-art Deep Learning for Time Series and Sequence Modeling.
- edbo - Experimental Design via Bayesian Optimization.
- TensorJS - JS/TS library for accelerated tensor computation intended to be run in the browser.
- micro-TCN - Efficient neural networks for audio effect modeling. (Web)
- DESlib - Python library for dynamic classifier and ensemble selection.
- BytePS - High performance and generic framework for distributed DNN training.
- Hyperactive - Hyperparameter optimization and meta-learning toolbox for convenient and fast prototyping of machine-learning models.
- Jittor - Just-in-time (JIT) deep learning framework.
- autofeat - Linear Prediction Model with Automated Feature Engineering and Selection Capabilities.
- Distrax - Lightweight library of probability distributions and bijectors. It acts as a JAX-native reimplementation of a subset of TensorFlow Probability (TFP).
- scikit-learn-extra - Set of useful tools compatible with scikit-learn.
- GeneticAlgorithmPython - Building Genetic Algorithm in Python.
- Newt - Gaussian process library in JAX.
- Hedgehog - Bayesian networks in Python.
- Backdoors 101 - PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.
- Sabertooth - Standalone pre-training recipe with JAX+Flax.
- ProbFlow - Python package for building Bayesian models with TensorFlow or PyTorch.
- Mars - Tensor-based unified framework for large-scale data computation which scales NumPy, pandas, scikit-learn and Python functions.
- DeepMatch - Deep matching model library for recommendations & advertising.
- Layout Parser - Unified toolkit for Deep Learning Based Document Image Analysis. (Web)
- scikit-survival - Survival analysis built on top of scikit-learn.
- PySR - Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing.
- Snowman Hotword Detection
- CLU - Contains common functionality for writing ML training loops using JAX.
- SparseML - Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.
- CogDL - Extensive Toolkit for Deep Learning on Graphs. (Web)
- TensorLy - Tensor Learning in Python. (Web)
- Cornac - Comparative Framework for Multimodal Recommender Systems.
- MegEngine - Fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
- SeqIO - Task-based datasets, preprocessing, and evaluation for sequence models.
- OpenAI Python - Provides convenient access to the OpenAI API from applications written in Python.
- Mesh Transformer JAX - Model parallel transformers in JAX and Haiku. (HN)
- Checking out a 6-Billion parameter GPT model, GPT-J, from Eleuther AI (2021)
- deepC - Vendor independent deep learning library, compiler and inference framework designed for small form-factor devices.
- Dlib - Modern C++/Python Toolkit for Machine Learning. (Web) (HN)
- Continuum - Clean and simple data loading library for Continual Learning.
- Smile - Statistical Machine Intelligence & Learning Engine.
- AugLy - Data augmentations library for audio, image, text, and video.
- Surprise - Python scikit for building and analyzing recommender systems. (Web)
- TNN - High-performance, lightweight neural network inference framework.
- Parallax - Immutable Torch Modules for JAX.
- EvalAI - Open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. (Web)
- Avalanche - End-to-End Library for Continual Learning. (Docs)
- PyKale - Knowledge-Aware machine LEarning (KALE) from multiple sources in Python.
- mltrace - Coarse-grained lineage and tracing for machine learning pipelines.
- PPLNN - High-performance deep-learning inference engine for efficient AI inferencing.
- Petastorm - Enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format.
- Collie - Library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch. (Docs)
- voxelmorph - Unsupervised Learning for Image Registration.
- uTensor - TinyML AI inference library.
- Tangram - Train a model from a CSV file on the command line. (Web) (HN)
- AdaptDL - Resource-adaptive cluster scheduler for deep learning training.
- Triage - General Purpose Risk Modeling and Prediction Toolkit for Policy and Social Good Problems.
- Gorse - Open source recommender system service written in Go. (Web) (HN)
- LensKit - Python Tools for Recommender Experiments. (Web)
- StarSpace - Learning embeddings for classification, retrieval and ranking.
- ELFI - Engine for Likelihood-Free Inference. (Docs)
- DaisyRec - Python toolkit dealing with rating prediction and item ranking issues.
- AutoTS - Forecasting Model Selection for Multiple Time Series.
- PyFlux - Open source time series library for Python.
- trajax - Python library for differentiable optimal control on accelerators.
- TransmogrifAI - End-to-end AutoML library for structured data written in Scala that runs on top of Apache Spark. (Web)
- chitra - Multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and Model Deployment.
- DoubleML - Double Machine Learning in Python.
- jaxfg - Factor graphs and nonlinear optimization in JAX.
- pyltr - Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.
- Wrangl - Ray-based parallel data preprocessing for NLP and ML.
- Treex - Pytree-based Module system for Deep Learning in JAX. (Docs)
- PhiFlow - Open-source simulation toolkit built for optimization and machine learning applications.
- OpenVINO Toolkit - Deploy pre-trained deep learning models through a high-level C++ Inference Engine API integrated with application logic.
- WILDS - Machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.
- TurboTransformers - Fast and user-friendly runtime for transformer inference on CPU and GPU.
- DeepOps - Mini Deep Learning framework supporting GPU accelerations written with CUDA.
- Bayex - Bayesian Optimization Python Library powered by JAX.
- Merlion - Machine Learning Framework for Time Series Intelligence.
- Feast - Feature Store for Machine Learning. (Web)
- nnabla - Neural Network Libraries by Sony. (Web)
- RevLib - Simple and efficient RevNet-Library with DeepSpeed support.
- DeepSparse - Neural network inference engine that delivers GPU-class performance for sparsified models on CPUs.
- NVTabular - Engineering and preprocessing library for tabular data that is designed to easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems.
- Treeo - Small library for creating and manipulating custom JAX Pytree classes.
- FedJAX - JAX-based open source library for Federated Learning simulations that emphasizes ease-of-use in research.
- oneAPI - OneAPI Deep Neural Network Library (oneDNN).
- MosaicML Composer - Library of methods, and ways to compose them together for more efficient ML training.
- deep-significance - Easy and Better Significance Testing for Deep Neural Networks.
- Finetuner - Finetuning any DNN for better embedding on neural search tasks. (Docs)
- mlcrate - Python module of handy tools and functions, mainly for ML and Kaggle.
- mle-hyperopt - Lightweight Hyperparameter Optimization Tool.
- Feature Engine - Python library with multiple transformers to engineer and select features for use in machine learning models.
- BaaL - Bayesian active learning library.
- TorchArrow - torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format.
- Arm NN - Software and tools that enable machine learning workloads on power-efficient devices.
- OpenRec - Open-source and modular library for neural network-inspired recommendation algorithms.
- ColossalAI - Unified Deep Learning System for Large-Scale Parallel Training. (Docs) (Examples)
- XManager - Framework for managing machine learning experiments.
- T5X - Modular, composable, research-friendly framework for high-performance, configurable, self-service training.
- mlinspect - Inspect ML Pipelines in Python in the form of a DAG.
- Privacy Lint - Library that allows you to perform a privacy analysis (Membership Inference) of your model in PyTorch.
- NVIDIA Object Detection Toolkit (ODTK) - Fast and accurate single stage object detection with end-to-end GPU optimization.
- DeAI - Decentralized privacy-preserving ML training software framework, using p2p networking.
- Varuna - Tool for efficient training of large DNN models on commodity GPUs and networking.
- reXmeX - General purpose recommender metrics library for fair evaluation.
- Einshape - DSL-based reshaping library for JAX and other frameworks.
- BlobCity AutoAI - Framework to find the best performing AI/ML model for any AI problem.
- PyPAL - Multiobjective active learning with tunable accuracy/efficiency tradeoff and clear stopping criterion.
- dcbench - Benchmark of data-centric tasks from across the machine learning lifecycle.
- Cockpit - Visual and statistical debugger specifically designed for deep learning.
- CatBoost - Machine learning method based on gradient boosting over decision trees. (Web) (Tutorials)
- Xplique - Neural Networks Explainability Toolbox.
- Causal ML - Python Package for Uplift Modeling and Causal Inference with ML.
- sklearn-onnx - Convert scikit-learn models and pipelines to ONNX.
- Tools for JAX - Variety of tools for the differential programming library JAX.
- KML - Machine Learning Framework for Operating Systems & Storage Systems. (HN)
- ENN Incubator - Collection of in-progress libraries for entity neural networks.
- Syne Tune - Large scale and asynchronous Hyperparameter Optimization at your fingertips.
- Maggy - Framework for distribution transparent machine learning experiments on Apache Spark.
- Apache SINGA - Distributed deep learning system. (Web)
- Tiny CUDA Neural Networks - Lightning fast & tiny C++/CUDA neural network framework.
- Apache TVM - Open Deep Learning Compiler Stack.
- imodels - Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
- FLSim - Flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API.
- Human Learn - Machine Learning models should play by the rules, literally.
- MiniTorch - DIY teaching library for machine learning engineers who wish to learn about the internal concepts underlying deep learning systems.
- TorchRecipes - Train machine learning models with a couple of lines of code.
- DABS - Domain-Agnostic Benchmark for Self-Supervised Learning.
- apricot - Implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.
- Theseus - Library for differentiable nonlinear optimization built on PyTorch.
- MMSelfSup - OpenMMLab Self-Supervised Learning Toolbox and Benchmark.
- NVFlare - NVIDIA Federated Learning Application Runtime Environment. (Docs)
- OSLO - Open Source framework for Large-scale transformer Optimization.
- snntorch - Deep and online learning with spiking neural networks in Python.
- NVIDIA DALI - GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
- MIPLearn - Framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML).
- tree-math - Mathematical operations for JAX pytrees.
- ExplainX - Explainable AI framework for data scientists. Explain & debug any blackbox machine learning model with a single line of code.
- Contextual AI - Adds explainability to different stages of machine learning pipelines.
- jax_dataclasses - Pytrees + static analysis.
- kingly - Zero-cost state-machine library for robust, testable and portable user interfaces (most machines compile ~1-2KB).
- RTNeural - Lightweight neural network inferencing engine written in C++.
- JAXopt - Hardware accelerated, batchable and differentiable optimizers in JAX.
- chop - Optimization library based on PyTorch, with applications to adversarial examples and structured neural network training.
- WebDNN - Fastest DNN Running Framework on Web Browser.
- nonconformist - Python implementation of the conformal prediction framework.
- jaxdf - JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations.
- DoWhy - End-to-end library for causal inference.
- hypopt - Parallelized hyper-param optimization with validation set, not crossval.
- ML Collections - Library of Python Collections designed for ML use cases.
- Latte - Cross-framework Python Package for Evaluation of Latent-based Generative Models.
- Raster Vision - Open source framework for deep learning on satellite and aerial imagery.
- SPEAR - Semi-Supervised Data Programming for Data Efficient Machine Learning.
- Ivy - Unified machine learning framework, enabling framework-agnostic functions, layers and libraries. (Web)
- NeuralForecast - Python library for time series forecasting with deep learning models.
- Pythae - Library for Variational Autoencoder benchmarking. (Paper)
- Pyraug - Data Augmentation with Variational Autoencoders.
- product-quantization - Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
- learned_optimization - Training and evaluating learned optimizers in JAX.
- OTT - Sturdy, versatile and efficient optimal transport solvers, taking advantage of JAX features, such as JIT, auto-vectorization and implicit differentiation.
- Marian - Efficient Neural Machine Translation framework written in pure C++ with minimal dependencies. (Web)
- segmind - MLOps for end-to-end deep learning lifecycle.
- FLUTE - Federated Learning Utilities and Tools for Experimentation.
- evosax - JAX-Based Evolution Strategies.
- Neural Processes - Framework for composing Neural Processes in Python.
- Anomalib - Library for benchmarking, developing and deploying deep learning anomaly detection algorithms.
- Fasterai - Library to make smaller and faster models with FastAI.
- ClearML Server - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, ML-Ops and Data-Management.
- Human Library - 3D Face Detection & Rotation Tracking, Face Description & more.
- Towhee - Flexible, application-oriented framework for generating embedding vectors via a pipeline of ML models and other operations.
- AutoFaiss - Automatically create Faiss knn indices with the most optimal similarity search parameters.
- Statistical Forecast - Lightning fast forecasting with statistical and econometric models.
- MLSpec - Standardize the intercomponent schemas for a multi-stage ML Pipeline.
- Alfred Python - Command line tool for deep-learning usage.
- Bacon - Framework for orchestrating machine learning experiments on AWS.
- PyClustering - Python, C++ data mining library.
- PQk-means - Fast and memory-efficient clustering.
- LeanTransformer - Memory-efficient transformer.
- HoloClean - Machine Learning System for Data Enrichment. Built on top of PyTorch and PostgreSQL.
- OpenDelta - Open-Source Framework for Parameter Efficient Tuning (Delta Tuning).
- Alpa - Automatically parallelizes tensor computational graphs and runs them on a distributed cluster.
- GPBoost - Combining Tree-Boosting with Gaussian Process and Mixed Effects Models.
- CORDS - Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.
- DISTIL - Cut down your labeling cost and time by 3x-5x.
- OpenFL - Open-Source Framework For Federated Learning.
- Basenji - Sequential regulatory activity predictions with deep convolutional neural networks.
- PyDP - Python Differential Privacy Library.
- veGiantModel - Torch-based, highly efficient training library developed by the Applied Machine Learning team at Bytedance.
- Flame - Federated learning system for edge with flexibility and scalability at the core of its design.
- DPU Utilities - Utilities used by the Deep Program Understanding team.
- XGBoost-Ray - Distributed backend for XGBoost, built on top of distributed computing framework Ray.
- Easy Parallel Library - General and efficient library for distributed model training.
- MetricFlow - Allows you to define, build, and maintain metrics in code.
- HuggingFace Evaluate
- PADL - Pipeline Abstractions for Deep Learning.
- Vertex AI SDK for Python - Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
- Tempo - MLOps Python Library.
- LightFM - Python implementation of LightFM, a hybrid recommendation algorithm.
- fklearn - Functional Machine Learning.
- Transformer PhysX - Transformers for modeling physical systems.
- Feathr - Enterprise-Grade, High Performance Feature Store. (Article)
- To what extent can Rust be used for Machine Learning? (2022)
- Vectorflow - Minimalist neural network library optimized for sparse data and single machine environments.
- D2Go - Toolkit for efficient deep learning.
- Slideflow - Deep learning pipeline for histology image analysis, with both TensorFlow and PyTorch support.
- Forte - Bring good software engineering to your ML solutions, starting from Data.
- Machine Learning(-ish) nix packages
- PaddleSeg - High-Efficient Development Toolkit for Image Segmentation.
- TorchSparse - High-performance neural network library for point cloud processing.
- H2O - In-memory platform for distributed, scalable machine learning.
- Ranger - Synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one code base.
- Unseal - Mechanistic Interpretability for Transformer Models.
- ANTsPy - Advanced Normalization Tools in Python.
- FasterTransformer Backend - Triton backend for the FasterTransformer.
- Nixtla - Automated time series processing and forecasting.
- FederatedScope - Easy-to-use federated learning platform.
- Habitat Lab - Modular high-level library to train embodied AI agents across a variety of tasks, environments, and simulators.
- Ranger21 - Integrating the latest deep learning components into a single optimizer.
- Tevatron - Flexible toolkit for dense retrieval research and development.
- mlrose - Python package for implementing a number of Machine Learning, Randomized Optimization and SEarch algorithms.
- Scikit-Learn Compiled Trees
- KotlinDL - High-level Deep Learning Framework written in Kotlin and inspired by Keras.
- PGBM - Probabilistic Gradient Boosting Machines.
- Fiddle - Python-first configuration library particularly well suited to ML applications.
- tpunicorn - Python library and command-line program for managing TPUs.
- CLAP - Contrastive Language-Audio Pretraining.
- COMET - Neural Framework for MT Evaluation.
- Magnitude - Feature-packed Python package and vector storage file format for utilizing vector embeddings in machine learning models.
- TorchANI - Accurate Neural Network Potential on PyTorch.
- gap-train - Gaussian Approximation Potential Training.
- lleaves - LLVM-based compiler for LightGBM decision trees.
- TensorScript - High-level language for specifying finite-dimensioned tensor computation. (Web)
- Neural Fluid Fields - Small library for doing fluid simulation with neural fields.
- OmniXAI - Library for eXplainable AI.
- mmap.ninja - Library for storing your datasets in memory-mapped files, which leads to a dramatic speedup in the training time. Accelerate the iteration over your machine learning dataset by up to 20 times.
- geomloss - Geometric loss functions between point clouds, images and volumes.
- morphsnakes - Implementation of the Morphological Snakes for image segmentation. Supports 2D images and 3D volumes.
- HyperLib - Common Neural Network components in the hyperbolic space (using the Poincare model).
- Lite.Ai.ToolKit - C++ toolkit of awesome AI models.
- RecZilla - Metalearning for algorithm selection on Recommender Systems.
- EdgeML - Machine learning algorithms for edge devices developed at Microsoft Research India.
- Quaterion - Framework for fine-tuning similarity learning models.
- SecretFlow - Unified framework for privacy-preserving data analysis and machine learning.
- pycox - Python package for survival analysis and time-to-event prediction with PyTorch.
- AI2 Tango - Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.
- ADAPT - Awesome Domain Adaptation Python Toolbox.
- giotto-deep - Deep learning made topological.
- DeepSpeed-MII - Library from DeepSpeed, designed to enable low-latency, low-cost inference of powerful transformer models.
- logreg - Bayesian inference for a logistic regression model in various languages.
- PINA - Physics-Informed Neural networks for Advanced modeling.
- PyCave - Traditional Machine Learning Models for Large-Scale Datasets in PyTorch.
- Draco - Formal framework for representing design knowledge about effective visualization design as a collection of constraints.
- GRAPE - Rust/Python library for high-performance Graph Processing and Embedding.
- dp-transformers - Differentially-private transformers using HuggingFace and Opacus.
- TinyMaix - Tiny inference library for microcontrollers (TinyML).
- x-unet - Implementation of a U-net complete with efficient attention as well as the latest research findings.
- TorchPQ - Efficient implementations of Product Quantization and its variants using PyTorch and CUDA.
- Mjx - Framework for Mahjong AI research.
- LibMTL - PyTorch Library for Multi-Task Learning.
- FEDOT - Automated modeling and machine learning framework.
- ELI5 - Library for debugging/inspecting machine learning classifiers and explaining their predictions.
- DeePMD-kit - Deep learning package for many-body potential energy representation and molecular dynamics.
- Dragonfly - Open source Python library for scalable Bayesian optimization.
- fastMONAI - Simplifying deep learning for medical imaging.
- Contextual Bandits - Python implementations of contextual bandits algorithms.
- Open3D-ML - Extension of Open3D to address 3D Machine Learning tasks.
- Daft - Fast, ergonomic and scalable open-source dataframe library: built for Python and Complex Data/Machine Learning workloads.
- StellarGraph - Machine Learning on Graphs.
- Sliceline - Python library for fast slice finding for Machine Learning model debugging.
- AITemplate - Python framework which renders neural networks into high performance CUDA/HIP C++ code. (Article)
- spidr - Accelerated machine learning with dependent types.
- Transformer Engine - Library for accelerating Transformer models on NVIDIA GPUs.
- visu3d - 3D without friction (TF, JAX, NumPy).
- Simulate - Creating and sharing simulation environments for embodied and synthetic data research.
- smol - Statistical Mechanics on Lattices.
- pathos - Parallel graph management and execution in heterogeneous computing.
- FewBit - Library for memory efficient training of large neural networks.
- Vizier - Reliable and Flexible Blackbox Optimization.
- SubModLib - Easy-to-use, efficient and scalable Python library for submodular optimization with a C++ optimization engine.
- be_great - Novel approach for synthesizing tabular data using pretrained large language models.
- TabSurvey - Experiments on Tabular Data Models.
- pocoMC - Python implementation of Preconditioned Monte Carlo for accelerated Bayesian Computation.
- WTTE-RNN - Framework for churn and time to event prediction.
- ONE - High-performance, on-device neural network inference framework.
- Neograd - Deep learning framework created from scratch with Python and NumPy.
- MMEval - Unified and open cross-framework evaluation library.
- cuda-convnet - High-performance C++/CUDA implementation of abstract convolutional neural networks.
- Vectory - Collection of tools to track and compare embedding versions.
- Lovely Tensors - Tensors, ready for human consumption.
- EPyMARL - Extended Python MARL framework.
- Zero - Mangaki's recommendation algorithms.
- LassoNet - Feature selection in neural networks.
- AutoOED - Automated Optimal Experimental Design Platform.
- WeightWatcher - Tool for predicting the accuracy of Deep Neural Networks. (Web)
- Web Neural Network API samples
- SparseTIR - Sparse Tensor Compiler for Deep Learning.
- Embetter - Scikit-learn compatible embeddings for computer vision and text.
- FlowMC - Normalizing-flow enhanced sampling package for probabilistic inference.
- PyImpetus - Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features.
- TPU Care - Automatically take good care of your preemptible TPUs.
- Merlin Dataloader - Lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX.
- SISH - Fast and scalable search of whole-slide images via self-supervised deep learning.
- TuneTA - Intelligently optimizes technical indicators and optionally selects the least intercorrelated for use in machine learning models.
- PyTensor - Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays.
- SIATune - Hyperparameter Tuning Toolbox for OpenMMLab Frameworks, especially for Remote Sensing Tasks.
- ggml - Tensor library for machine learning in C.
- ONNXRuntime-Extensions - Pre- and post-processing library for ONNX Runtime.
- McTorch Lib - Manifold optimization functionality for PyTorch.
- Torchhd - Python library for Hyperdimensional Computing.
- DeeProb-kit - Python Library for Deep Probabilistic Modeling.
- Allegro - Building highly scalable and accurate equivariant deep learning interatomic potentials.
- Tiktoken - Fast tokenizer by OpenAI. (HN)
- Fortuna - Library for Uncertainty Quantification.
- A-UNet - Library that provides building blocks to customize UNets, in PyTorch.
- Concrete-ML - Privacy-Preserving Machine Learning (PPML) open-source set of tools built on top of The Concrete Framework by Zama.
- Poniard - Scikit-learn companion library that streamlines the process of fitting different machine learning models and comparing them.
- Mango - Parallel Hyperparameter Tuning in Python.
- River - Online machine learning in Python.
- hypertune - Library for performing hyperparameter optimization.
- CausalPy - Python package for causal inference in quasi-experimental settings.
- TensorDict - PyTorch dedicated tensor container.
- Runhouse
- Hidet - Compilation-based DNN inference framework.
- Colossal-AI - Unified Deep Learning System for Big Model Era.
- difflogic - Library for Differentiable Logic Gate Networks.
- EZKL - Library and command-line tool for doing inference for deep learning models and other computational graphs in a zk-snark.
- DeLFT - Deep Learning Framework for Text.
- lambdaprompt - Functional programming interface for building AI systems.
- D-Adaptation - D-Adaptation for SGD, Adam and AdaGrad.
- BBopt - Black box hyperparameter optimization made easy.
- PyMC-Marketing - Bayesian marketing toolbox in PyMC. Media Mix, CLV models and more.
- skforecast - Time series forecasting with scikit-learn regressors.
- Alibi - Algorithms for explaining machine learning models.
- UCC - Unified Communication Collectives Library.
- Renate - Library for automatic retraining and continual learning.
- CausalAI - Fast and Scalable framework for Causal Analysis of Time Series and Tabular Data.
- Flashy - Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit.
- Streaming - Data Streaming Library for Efficient Neural Network Training.
- LogAI - Library for Log Analytics and Intelligence.
- nn-Meter - DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
- pyhf - Pure-Python HistFactory implementation with tensors and autodiff.
- fcmaes - Python 3 gradient-free optimization library.
- pyPESTO - Widely applicable and highly customizable toolbox for parameter estimation.
- textgenrnn - Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
- Parallelformers - Efficient Model Parallelization Toolkit for Deployment.
- BenchML - ML benchmarking and pipelining framework.
- MIOpen - AMD's library for high performance machine learning primitives.
- Substra - Federated learning (FL) software. Enables the training and validation of machine learning models on distributed datasets.
- Rust Circuit - Library for expressing and manipulating tensor computations for neural network interpretability, written in Rust and used in Python notebooks.
- SparseConvNet - PyTorch library for training Submanifold Sparse Convolutional Networks.
- PyLops - Linear-Operator Library for Python.
- robustlearn - Robust machine learning for responsible AI.
- fastRAG - Efficient Retrieval Augmentation and Generation Framework.
- HuggingFace.js - Utilities to use the Hugging Face hub API.
- DLRover - Automatically trains the Deep Learning model on the distributed cluster.
- Consistency Models - Mini-library for training consistency models.
- GraphStorm - Graph machine learning (GML) framework for enterprise use cases.
- scikit-learn-ts - Powerful machine learning library for Node.js – uses Python's scikit-learn under the hood.
- MMdnn - Comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models.
- BanditPAM - C++ implementation and Python package of the BanditPAM k-medoids clustering algorithm. (HN)
- pandas_dq - Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
- FLAML - Fast library for AutoML and tuning.
- ms2ml - Helps you convert raw mass spec data into tensors.
- eindex - Multidimensional indexing for tensors.
- Unmanic - File library optimiser.
- scikit-matter - Collection of scikit-learn compatible utilities that implement methods born out of the materials science and chemistry communities.
- pymoo - NSGA2, NSGA3, R-NSGA3, MOEAD, Genetic Algorithms (GA), Differential Evolution (DE), CMAES, PSO.
- Hamilton - General purpose micro-framework for creating dataflows from Python functions.
- Outlines - Generative Model Programming. (Tweet) (HN)
- crepes - Conformal regressors and predictive systems.
- Loopy - Transformation-Based Generation of High-Performance CPU/GPU Code.
- MSPrior - Multi(scale/stream) prior model for realtime temporal learning.
- EasyRunner - Lightweight tool for efficiently managing and executing parallel experiments.
- synthcity - Library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
- Multi-Output Gaussian Process Toolkit
- Aeon - Unified framework for machine learning with time series. (HN)
- Nixtla - Scalable machine learning for time series forecasting.
- MS-AMP - Microsoft Automatic Mixed Precision Library.
- AXLearn - Library for deep learning built upon JAX and GSPMD to support large-scale training.
- functime - Time-series machine learning and embeddings at scale.
- Torch-Grammar - Restricts a model to output a token sequence that conforms to a provided EBNF grammar.
- CMSIS NN - Efficient neural network kernels developed to maximize the performance and minimize the memory footprint of neural networks on Arm Cortex-M processors.
- Trident - Performance library for machine learning applications.
- ytopt - Machine-learning-based search methods for autotuning.
- ragas - Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines.
- pykoi - Active learning in one unified interface. (Web) (HN)
- TensorRTx - Implementation of popular deep learning networks with TensorRT network definition API.
- GSLB - Comprehensive benchmark of Graph Structure Learning.
- aleatory - Python library for Stochastic Processes Simulation and Visualisation.
- micrograd - Minimalist neural networks library built on a tiny autograd engine.
- NATTEN - Interface to neighborhood attention, and more generally sliding window attention.
- teenygrad - If tinygrad wasn't small enough for you.
- Autodidact - Pedagogical implementation of Autograd.
- disco - Toolkit for Distributional Control of Generative Models.