Natural language processing
Find generative ML like Stable Diffusion/Midjourney fascinating.
Love using ChatGPT and its open variants like Alpaca. Alpaca specifically has lots of momentum building on top of it, like this UI.
Prompt Engineering is a great read.
spaCy (with their NLP course) & Fairseq are interesting libraries. Natural Language Processing with Transformers Book is a nice book. Hugging Face NLP Course is probably the best NLP intro out there.
DALL·E 2 is fascinating too. Trying to understand the DALL-E in PyTorch implementation, although Midjourney is a strictly superior technology now.
Getting started with NLP for absolute beginners is a nice intro.
LangChain & Petals are interesting. Lightning GPT is a nice minimal GPT implementation. Want to try using the LLaMA model.
Tokenizers & tiktoken are interesting tokenizers.
rust-bert is useful for making NLP pipelines.
Want to explore fine-tuning the FLAN-T5 model together with examples from OpenAI Cookbook.
LLM-Chain is an amazing tool. Building LLM applications for production is a great read.
This is a great thread to understand transformer neural nets.
GPT in 60 Lines of NumPy is a great read to understand how large language models work.
Notes
- Figuring out correctly when/what to escalate to a human would change customer service more than anything else.
- GPT-3 was created by mining a human-written internet that will never again exist, thanks to the creation of GPT-3.
- Creating a delightful AI assistant is no longer a problem of getting smarter models. It is now a product problem. Better models will help, but the main blocker is 100% a product problem at this point.
Links
- spaCy - Industrial-strength Natural Language Processing (NLP) with Python and Cython. (HN: spaCy 3.0 (2021))
- Adding voice control to your projects
- Increasing data science productivity; founders of spaCy & Prodigy
- Course materials for "Natural Language" course
- NLP progress - Track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art across the most common NLP tasks and their corresponding datasets. (Web)
- Natural - General natural language facilities for Node.
- YSDA Natural Language Processing course (2018)
- PyText - Natural language modeling framework based on PyTorch.
- FlashText - Extract Keywords from sentence or Replace keywords in sentences.
- BERT PyTorch implementation
- LASER Language-Agnostic SEntence Representations - Library to calculate and use multilingual sentence embeddings.
- StanfordNLP - Python NLP Library for Many Human Languages.
- nlp-tutorial - Tutorial for those studying NLP (Natural Language Processing) using TensorFlow and PyTorch.
- Better Language Models and Their Implications (2019)
- gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners".
- Lingvo - Framework for building neural networks in TensorFlow, particularly sequence models.
- Fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
- Stanford CS224N: NLP with Deep Learning (2019) - Course page. (HN)
- Advanced NLP with spaCy: Free Course (Web) (HN) (HN)
- Code for Stanford Natural Language Understanding course, CS224u (2019)
- Awesome Reinforcement Learning for Natural Language Processing
- ParlAI - Framework for training and evaluating AI models on a variety of openly available dialogue datasets.
- Training language GANs from Scratch (2019)
- Olivia - Your new best friend built with an artificial neural network.
- Learn-Natural-Language-Processing-Curriculum
- This repository recorded my NLP journey
- Project Alias - Open-source parasite to train custom wake-up names for smart home devices while disturbing their built-in microphone.
- Cornell Tech NLP Code
- Cornell Tech NLP Publications
- Thinc - spaCy's Machine Learning library for NLP in Python. (Docs)
- Knowledge is embedded in language neural networks but can they reason? (2019)
- NLP Best Practices
- Transfer NLP library - Framework built on top of PyTorch to promote reproducible experimentation and Transfer Learning in NLP.
- FARM - Fast & easy transfer learning for NLP. Harvesting language models for the industry.
- Transformers - State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch. (Web)
- NLP Roadmap 2019
- Flair - Very simple framework for state-of-the-art NLP. Developed by Zalando Research.
- Unsupervised Data Augmentation - Semi-supervised learning method which achieves state-of-the-art results on a wide variety of language and vision tasks.
- Rasa - Open source machine learning framework to automate text- and voice-based conversations.
- T5 - Text-To-Text Transfer Transformer.
- 100 Must-Read NLP Papers (HN)
- Awesome NLP
- NLP Library - Curated collection of papers for the NLP practitioner.
- spacy-transformers - spaCy pipelines for pre-trained BERT, XLNet and GPT-2.
- AllenNLP - Open-source NLP research library, built on PyTorch. (Announcing AllenNLP 1.0)
- GloVe - Global Vectors for Word Representation.
- Botpress - Open-source Virtual Assistant platform.
- Mycroft - Hackable open source voice assistant. (HN)
- VizSeq - Visual Analysis Toolkit for Text Generation Tasks.
- Awesome Natural Language Generation
- How I used NLP (Spacy) to screen Data Science Resume (2019)
- Introduction to Natural Language Processing book - Survey of computational methods for understanding, generating, and manipulating human language, which offers a synthesis of classical representations and algorithms with contemporary machine learning techniques.
- Natural Language Processing with PyTorch: Build Intelligent Language Applications Using Deep Learning (Code)
- Tokenizers - Fast State-of-the-Art Tokenizers optimized for Research and Production. (Article)
- Example Notebook using BERT for NLP with Keras (2020)
- NLP 2019/2020 Highlights
- Overview of Modern Deep Learning Techniques Applied to Natural Language Processing
- Language Identification from Very Short Strings (2019)
- SentenceRepresentation - Code accompanying the paper 'Learning Sentence Representations from Unlabelled Data' by Felix Hill, Kyunghyun Cho and Anna Korhonen (2016).
- Deep Learning for Language Processing course
- Megatron LM - Ongoing research training transformer language models at scale, including: BERT & GPT-2. (Megatron with FastMoE) (Fork)
- XLNet - New unsupervised language representation learning method based on a novel generalized permutation language modeling objective.
- ALBERT - Lite BERT for Self-supervised Learning of Language Representations.
- BERT - TensorFlow code and pre-trained models for BERT.
- Multilingual Denoising Pre-training for Neural Machine Translation (2020)
- List of NLP tutorials built on PyTorch
- sticker - Sequence labeler that uses either recurrent neural networks, transformers, or dilated convolution networks.
- sticker-transformers - Pretrained transformer models for sticker.
- pke - Python Keyphrase Extraction module.
- How to train a new language model from scratch using Transformers and Tokenizers (2020)
- Interactive Attention Visualization - Small example of an interactive visualization for attention values as being used by transformer language models like GPT2 and BERT.
- The Annotated GPT-2 (2020)
- GluonNLP - Toolkit that enables easy text preprocessing, datasets loading and neural models building to help you speed up your NLP research.
- Finetune - Scikit-learn style model finetuning for NLP.
- Stanza: A Python Natural Language Processing Toolkit for Many Human Languages (2020) (HN)
- NLP Newsletter
- NLP Paper Summaries
- Advanced NLP with spaCy
- Myle Ott's research
- Natural Language Toolkit (NLTK) - Suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. (Web) (Book)
- NLP 100 Exercise - Bootcamp designed for learning skills for programming, data analysis, and research activities. (Code)
- The Transformer Family (2020)
- Minimalist Implementation of a BERT Sentence Classifier
- fastText - Library for efficient text classification and representation learning. (Code) (Article) (HN) (Fork)
- Awesome NLP Paper Discussions - Papers & presentations from Hugging Face's weekly science day.
- SynST: Syntactically Supervised Transformers
- The Cost of Training NLP Models: A Concise Overview (2020)
- Tutorial - Transformers (Tweet)
- TTS - Deep learning for Text to Speech.
- MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer (2020)
- gpt-2-simple - Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts.
- BERTScore - BERT score for text generation.
- ML and NLP Paper Discussions
- NLP Index - Collection of NLP resources.
- NLP Datasets
- Word Embeddings (2017)
- NLP from Scratch: Annotated Attention (2020)
- This Word Does Not Exist - Allows people to train a variant of GPT-2 that makes up words, definitions and examples from scratch. (Code) (HN)
- Ultimate guide to choosing an online course covering practical NLP (2020)
- HuggingFace nlp library - Quick overview (2020) (Twitter)
- aitextgen - Robust Python tool for text-based AI training and generation using GPT-2. (HN)
- Self Supervised Representation Learning in NLP (2020) (HN)
- Synthetic and Natural Noise Both Break Neural Machine Translation (2017)
- Inferbeddings - Injecting Background Knowledge in Neural Models via Adversarial Set Regularisation.
- UCL Natural Language Processing group
- Interactive Lecture Notes, Slides and Exercises for Statistical NLP
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList
- CMU LTI Low Resource NLP Bootcamp 2020
- GPT-3: Language Models Are Few-Shot Learners (2020) (HN) (Code)
- nlp - Lightweight and extensible library to easily share and access datasets and evaluation metrics for NLP.
- Brainsources for NLP enthusiasts
- Movement Pruning: Adaptive Sparsity by Fine-Tuning (Paper)
- NLP Resources
- TaBERT: Learning Contextual Representations for Natural Language Utterances and Structured Tables (Article) (HN)
- vtext - NLP in Rust with Python bindings.
- Language Technology Lab @ University of Cambridge
- The Natural Language Processing Dictionary
- Introduction to NLP using Fastai (2020)
- Gwern on GPT-3 (HN)
- Semantic Machines - Solving conversational artificial intelligence. Part of Microsoft.
- The Reformer – Pushing the limits of language modeling (HN)
- GPT-3 Creative Fiction (2020) (HN)
- Classifying 200k articles in 7 hours using NLP (2020) (HN)
- HN: Using GPT-3 to generate user interfaces (2020)
- Thread of GPT-3 use cases (2020)
- GPT-3 Code Experiments (Examples)
- How GPT3 Works - Visualizations and Animations (2020) (Lobsters) (HN)
- What is GPT-3? written in layman's terms (2020) (HN)
- GPT3 Examples (HN)
- DQI: Measuring Data Quality in NLP (2020)
- Humanloop - Train and deploy NLP. (HN)
- Do NLP Beyond English (2020) (HN)
- Giving GPT-3 a Turing Test (2020) (HN)
- Neural Network Methods for Natural Language Processing (2017)
- Tempering Expectations for GPT-3 and OpenAI’s API (2020)
- Philosophers on GPT-3 (2020) (HN)
- GPT-3 Explorer - Power tool for experimenting with GPT-3. (Code)
- Recent Advances in Natural Language Processing (2020) (HN)
- Project Insight - NLP as a Service. (Forum post)
- Bob Coecke: Quantum Natural Language Processing (QNLP) (2020) (Article)
- Language-Agnostic BERT Sentence Embedding (2020)
- Language Interpretability Tool (LIT) - Interactively analyze NLP models for model understanding in an extensible and framework agnostic interface.
- Booste Pre Trained Models - Free-to-use GPT-2 API. (HN)
- Context-theoretic Semantics for Natural Language: an Algebraic Framework (2007)
- THUNLP (Natural Language Processing Lab at Tsinghua University) research
- AI training method exceeds GPT-3 performance with fewer parameters (2020) (HN)
- BERT Attention Analysis
- Neural Modules and Models for Conversational AI (2020)
- BERTopic - Topic modeling technique that leverages BERT embeddings and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions.
- NLP Pandect - Comprehensive reference for all topics related to Natural Language Processing.
- Practical Natural Language Processing book (Code)
- NLP Research Project: Best Practices for Finetuning Large Transformer Language models (2020)
- Deep Learning for NLP notes (2020)
- Modern Practical Natural Language Processing course
- LXMERT: Learning Cross-Modality Encoder Representations from Transformers in PyTorch
- Awesome software for Text ML
- Pretrained Transformers for Text Ranking: BERT and Beyond (2020)
- SpaCy v3.0 Nightly (2020) (HN) (Tweet)
- Explore trained spaCy v3.0 pipelines
- spacy-streamlit - spaCy building blocks for Streamlit apps. (Tweet)
- Informers - State-of-the-art natural language processing for Ruby.
- How to Structure and Manage Natural Language Processing (NLP) Projects (2020)
- Sentence-BERT for spaCy - Wraps sentence-transformers (also known as sentence-BERT) directly in spaCy.
- Lingua Franca - Mycroft's multilingual text parsing and formatting library.
- Simple Transformers - Based on the Transformers library by HuggingFace. Lets you quickly train and evaluate Transformer models.
- Deep Bidirectional Transformers for Language Understanding (2020) - Explains a legendary paper, BERT. (HN)
- EasyTransfer - Designed to make the development of transfer learning in NLP applications easier.
- LambdaBERT - Transformers-style implementation of BERT using LambdaNetworks instead of self-attention.
- DialoGPT - State-of-the-Art Large-scale Pretrained Response Generation Model.
- Neural reading comprehension and beyond - Danqi Chen's Thesis (2020) (Code)
- LAMA: LAnguage Model Analysis - Probe for analyzing the factual and commonsense knowledge contained in pretrained language models.
- awesome-2vec - Curated list of 2vec-type embedding models.
- Rethinking Attention with Performers (2020) (HN)
- BERT Research - Key Concepts & Sources (2019)
- The Pile - Large, diverse, open source language modelling data set that consists of many smaller datasets combined together.
- Bort - Companion code for the paper "Optimal Subarchitecture Extraction for BERT."
- Vector AI - Encode And Deploy Vectors At The Edge. (Code)
- KeyBERT - Minimal keyword extraction with BERT. (Web)
- Multimodal Transformer for Unaligned Multimodal Language Sequences - In PyTorch.
- The Illustrated GPT-2 (Visualizing Transformer Language Models) (2020)
- A Primer in BERTology: What we know about how BERT works (2020) (HN)
- GPT Neo - Open-source GPT model, with pretrained 1.3B & 2.7B weight models. (HN)
- TextSynth - Bellard's free GPT-NeoX-20B, GPT-J playground and paid API. (Playground) (HN)
- How to Go from NLP in 1 Language to NLP in N Languages in One Shot (2020)
- Contextualized Topic Models - Family of topic models that use pre-trained representations of language (e.g., BERT) to support topic modeling.
- Language Style Transfer - Code for Style Transfer from Non-Parallel Text by Cross-Alignment paper.
- NLU - Power of Spark NLP, the Simplicity of Python. 1 line for hundreds of NLP models and algorithms.
- PyTorch Implementation of Google BERT
- High Performance Natural Language Processing (2020)
- duoBERT - Multi-stage passage ranking: monoBERT + duoBERT.
- Awesome GPT-3
- SMAC3 - Sequential Model-based Algorithm Configuration.
- Semantic Experiences by Google - Experiments in understanding language.
- Long-Range Arena - Systematic evaluation of efficient transformer models.
- PaddleHub - Awesome pre-trained models toolkit based on PaddlePaddle.
- DeepSPIN (Deep Structured Prediction in Natural Language Processing) (GitHub)
- Multi-Task Learning in NLP
- FastSeq - Provides efficient implementation of popular sequence models (e.g. Bart, ProphetNet) for text generation, summarization, translation tasks etc.
- Sentence Embeddings with BERT & XLNet
- FastFormers - Provides a set of recipes and methods to achieve highly efficient inference of Transformer models for Natural Language Understanding (NLU).
- Adversarial NLI - Adversarial Natural Language Inference Benchmark.
- textract - Extract text from any document. No muss. No fuss. (Docs)
- NLP and Named Entity Recognition (2020)
- Big Bird: Transformers for Longer Sequences
- NLP PyTorch Tutorial
- EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
- CrossWeigh: Training Named Entity Tagger from Imperfect Annotations (2019) (Code)
- Does GPT-2 Know Your Phone Number? (2020)
- Towards Fully Automated Manga Translation (2020)
- Text Classification Models - All kinds of text classification models and more with deep learning.
- Awesome Text Summarization
- Shortformer: Better Language Modeling using Shorter Inputs (2020) (HN)
- huggingface_hub - Client library to download and publish models and other files on the huggingface.co hub.
- Embeddings from the Ground Up (2020)
- Ecco - Tools to visualize and explore NLP language models. (Web) (HN)
- Interfaces for Explaining Transformer Language Models (2020)
- DALL·E: Creating Images from Text (2021) (HN) (Reddit)
- CLIP: Connecting Text and Images (2021) (HN) (Paper) (Code)
- OpenNRE - Open-Source Package for Neural Relation Extraction (NRE).
- Princeton NLP Group (GitHub)
- Must-read papers on neural relation extraction (NRE)
- FewRel Dataset, Toolkits and Baseline Models
- Tree Transformer: Integrating Tree Structures into Self-Attention (2019) (Code)
- SentEval: evaluation toolkit for sentence embeddings
- gpt-scrolls - Collaborative collection of open-source safe GPT-3 prompts that work well.
- SLING - A natural language frame semantics parser - Built to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion.
- Awesome Neural Adaptation in NLP
- Natural language generation: The commercial state of the art in 2020 (HN)
- Non-Autoregressive Generation Progress
- Trankit: A Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
- VecMap - Framework to learn cross-lingual word embedding mappings.
- Kiri - Natural Language Engine. (Web)
- GPT3 List - List of things that people are claiming are enabled by GPT3.
- DeBERTa - Decoding-enhanced BERT with Disentangled Attention.
- Sockeye - Open-source sequence-to-sequence framework for Neural Machine Translation based on Apache MXNet. (Docs)
- Robustness Gym - Python evaluation toolkit for natural language processing.
- State-of-the-Art Conversational AI with Transfer Learning
- GPT-Neo - GPT-3-sized model, open source and free. (HN) (Code)
- Deep Daze - Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network).
- Notebooks using the Hugging Face libraries
- NLP Cloud - Serve spaCy pre-trained models, and your own custom models, through a RESTful API.
- CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters (2020) (Code)
- jiant - Multitask and transfer learning toolkit for NLP. (Web)
- Must-read Papers on Textual Adversarial Attack and Defense
- Reranker - Build Text Rerankers with Deep Language Models.
- rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...).
- rust-tokenizers - Offers high-performance tokenizers for modern language models.
- Replicating GPT-2 at Home (2021) (HN)
- Shifterator - Interpretable data visualizations for understanding how texts differ at the word level.
- CMU Neural Networks for NLP Course (2021) (Videos)
- minnn - Exercise in developing a minimalist neural network toolkit for NLP.
- Controllable Sentence Simplification (2019) (Code)
- Awesome Relation Extraction
- retext - Natural language processor powered by plugins, part of the unified collective. (Awesome)
- CLIP Playground - Try OpenAI's CLIP model in your browser.
- GPT-3 Demo - GPT-3 Examples, Demos, Showcase, and NLP Use-cases.
- Big Sleep - Simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN.
- Beyond the Imitation Game Benchmark (BIG-bench) - Collaborative benchmark intended to probe large language models, and extrapolate their future capabilities.
- AutoNLP - Automatic way to train, evaluate and deploy state-of-the-art NLP models for different tasks.
- DeText - Deep Neural Text Understanding Framework for Ranking and Classification Tasks.
- Paragraph Vectors in PyTorch
- NeuSpell: A Neural Spelling Correction Toolkit
- Natural Language YouTube Search - Search inside YouTube videos using natural language.
- Accelerate - Simple way to train and use NLP models with multi-GPU, TPU, mixed-precision.
- Classical Language Toolkit (CLTK) - Python library offering natural language processing (NLP) for pre-modern languages. (Web)
- Guide: Finetune GPT2-XL
- GENRE (Generative ENtity REtrieval) - Uses a sequence-to-sequence approach to entity retrieval (e.g., linking), based on fine-tuned BART architecture.
- Teachable NLP - GPT-2 Training as a Service.
- DensePhrases - Provides answers to your natural language questions from the entire Wikipedia in real-time.
- How to use GPT-3 recursively to solve general problems (2021)
- Podium - Framework agnostic Python NLP library for data loading and preprocessing.
- Prompts - Advanced GPT-3 playground. (Code)
- TextFlint - Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing.
- Awesome Text Summarization
- SimCSE: Simple Contrastive Learning of Sentence Embeddings (2021) (Code)
- Berkeley Neural Parser - High-accuracy NLP parser with models for 11 languages. (Web)
- nlpaug - Data augmentation for NLP.
- Top2Vec - Learns jointly embedded topic, document and word vectors.
- Focused Attention Improves Document-Grounded Generation (2021) (Code)
- NLPretext - All the go-to functions you need to handle NLP use-cases.
- spaCy + UDPipe
- adapter-transformers - Friendly fork of HuggingFace's Transformers, adding Adapters to PyTorch language models.
- TextAttack - Generating adversarial examples for NLP models.
- GPT-NeoX - Implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library.
- Transfer Learning in Natural Language Processing (2019) (Code)
- Cohere - Help computers understand language. (Tweet)
- Transformers Interpret - Model explainability tool designed to work exclusively with the transformers package.
- Whatlang - Natural language detection library for Rust. (Web)
- Category Theory + NLP Papers
- UniLM - Pre-trained models for natural language understanding (NLU) and generation (NLG) tasks. (HN)
- AutoNLP - Faster and easier training and deployments of SOTA NLP models.
- TAble PArSing (TAPAS) - End-to-end neural table-text understanding models.
- Replacing Bert Self-Attention with Fourier Transform: 92% Accuracy, 7X Faster (2021)
- FNet: Mixing Tokens with Fourier Transforms (2021) (Tweet)
- True Few-Shot Learning with Language Models (2021) (Tweet) (Code)
- End-to-end NLP workflows from prototype to production (Web)
- Haystack - End-to-end Python framework for building natural language search interfaces to data. (HN)
- PLMpapers - Must-read Papers on pre-trained language models.
- English-to-Spanish translation with a sequence-to-sequence Transformer in Keras
- Evaluation Harness for Large Language Models - Framework for few-shot evaluation of autoregressive language models.
- MLP GPT - Jax - GPT, made only of MLPs, in Jax.
- Few-Shot Question Answering by Pretraining Span Selection (2021) (Code)
- Neural Extractive Search (2021) (Demo)
- Hugging Face NLP Course (Code)
- SentencePiece - Unsupervised text tokenizer for Neural Network-based text generation.
- LoRA: Low-Rank Adaptation of Large Language Models (2021) (Code) (Code) (HN)
- PromptPapers - Must-read papers on prompt-based tuning for pre-trained language models.
- Obsei - Automation tool for text analysis needs.
- Evaluating Large Language Models Trained on Code (2021) (Code)
- Survey of Surveys for Natural Language Processing (SOS4NLP)
- CLIP guided diffusion
- Data driven literary analysis
- DALL·E Mini - Generate images from a text prompt.
- Jury - Evaluation for Natural Language Generation.
- Rubrix - Free and open-source tool to explore, label, and monitor data for NLP projects.
- Knowledge Neurons in Pretrained Transformers (2021) (Code) (Code)
- OpenCLIP - Open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).
- Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning (2021) (Code)
- Can a Fruit Fly Learn Word Embeddings? (2021)
- Spark NLP - Natural Language Processing library built on top of Apache Spark ML. (Web)
- Spark NLP Workshop - Showcasing notebooks and codes of how to use Spark NLP in Python and Scala.
- ConceptNet Numberbatch - Set of semantic vectors (also known as word embeddings) that can be used directly as a representation of word meanings.
- OpenAI Codex - AI system that translates natural language to code. (HN)
- Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM (2021)
- NL-Augmenter - Collaborative Repository of Natural Language Transformations.
- wevi - Word embedding visual inspector. (Code)
- clip-retrieval - Easily computing clip embeddings and building a clip retrieval system with them.
- NVIDIA NeMo - Toolkit for conversational AI.
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
- BEIR - Heterogeneous benchmark containing diverse IR tasks. It also provides a common and easy framework for evaluation of your NLP-based retrieval models within the benchmark.
- UER-py - Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo.
- ExplainaBoard - Explainable Leaderboard for NLP.
- Fast-BERT - Super easy library for BERT based NLP models.
- Genie Toolkit - Generator of Natural Language Parsers for Compositional Virtual Assistants. (Paper)
- Quantum Stat - Your NLP Model Training Platform.
- Mistral - Framework for transparent and accessible large-scale language model training, built with Hugging Face. (Docs)
- NERDA - Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks.
- Data Augmentation Techniques for NLP
- Feed forward VQGAN-CLIP model
- Yet Another Keyword Extractor (Yake) - Unsupervised Approach for Automatic Keyword Extraction using Text Features.
- Challenges in Detoxifying Language Models (2021) (Tweet)
- TextBrewer - PyTorch-based model distillation toolkit for natural language processing.
- GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain (2021)
- PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models (2021) (Code)
- VQGAN-CLIP Overview - Repo for running VQGAN+CLIP locally.
- TLDR: Extreme Summarization of Scientific Documents (2020) (Code)
- Can Language Models be Biomedical Knowledge Bases? (2021)
- ColBERT: Contextualized Late Interaction over BERT (2020)
- Investigating Pretrained Language Models for Graph-to-Text Generation (2020) (Code)
- Ubiquitous Knowledge Processing Lab (GitHub)
- DedupliPy - Python package for deduplication/entity resolution using active learning.
- Flexible Generation of Natural Language Deductions (2021) (Code)
- Machine Translation Reading List
- Compressive Transformers for Long-Range Sequence Modelling (2020) (Code)
- pyxclib - Tools for multi-label classification problems.
- ELECTRA - Pre-training Text Encoders as Discriminators Rather Than Generators.
- OpenPrompt - Open-Source Toolkit for Prompt-Learning.
- Unsupervised Neural Machine Translation with Generative Language Models Only (2021) (Tweet)
- Grounding Spatio-Temporal Language with Transformers (2021) (Code)
- Fast Sentence Embeddings (fse) - Compute Sentence Embeddings Fast.
- Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (2021)
- Surge AI - Build powerful NLP datasets using our global labeling force and platform. (Python SDK)
- Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels (Code)
- ogen - OpenAPI v3 code generator for Go.
- PromptSource - Toolkit for collecting and applying prompts to NLP datasets. (Web) (HN)
- Creating User Interface Mock-ups from High-Level Text Descriptions with Deep-Learning Models (2021)
- Filtlong - Tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset.
- Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System (2021) (Code)
- xFormers - Hackable and optimized Transformers building blocks, supporting a composable construction.
- Language Models As or For Knowledge Bases (2021)
- Wikipedia2Vec - Tool for learning vector representations of words and entities from Wikipedia. (Code)
- Reflections on Foundation Models (2021) (Tweet)
- textacy - NLP, before and after spaCy.
- Natural Language Processing Specialization Course (Tweet)
- Hugging Face on Amazon SageMaker Workshop
- CS224N: Natural Language Processing with Deep Learning | Winter 2021 - YouTube
- GPT-3 creates geofoam, but out of text (2021)
- Towards Efficient NLP: A Standard Evaluation and A Strong Baseline (2021) (Code)
- Hierarchical Transformers Are More Efficient Language Models (2021) (HN) (Code)
- Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration (2021) (Code)
- GPT-3 is no longer the only game in town (2021) (HN)
- PatrickStar - Parallel Training of Large Language Models via a Chunk-based Memory Management.
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) (2021)
- Text2Art - AI Powered Text-to-Art Generator.
- Emergent Communication of Generalizations (2021) (Code)
- Awesome Pretrained Models for Information Retrieval
- SummerTime - Text Summarization Toolkit for Non-experts.
- NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework (2021) (Code)
- Differentially Private Fine-tuning of Language Models (2021) (Tweet)
- TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning (2021) (Code)
- Aphantasia - CLIP + FFT/DWT/RGB = text to image/video.
- OpenAI’s API Now Available with No Waitlist (2021) (HN)
- Recent trends of Entity Linking, Disambiguation, and Representation
- Intro to Large Language Models with Cohere
- spacy-experimental - Cutting-edge experimental spaCy components and features.
- AdaptNLP - High level framework and library for running, training, and deploying state-of-the-art Natural Language Processing (NLP) models for end to end tasks. (Docs)
- Reading list for Awesome Sentiment Analysis papers
- Aspect-Based-Sentiment-Analysis: Transformer & Explainable ML (TensorFlow)
- Deploy optimized transformer based models in production
- PyConverse - Conversational text Analysis using various NLP techniques.
- KILT - Library for Knowledge Intensive Language Tasks.
- RoFormer: Enhanced Transformer with Rotary Position Embedding (2021) (Code)
- N-grammer: Augmenting Transformers with latent n-grams (2021) (Code)
- textsearch - Find strings/words in text; convenience and C speed.
- Mastering spaCy Book (2021) (Code)
- sense2vec - Contextually-keyed word vectors.
- Pureformer: Do We Even Need Attention? (2021)
- Knover - Toolkit for knowledge grounded dialogue generation based on PaddlePaddle.
- Language Modelling at Scale: Gopher, Ethical considerations, and Retrieval | DeepMind (2021) (HN)
- CMU Advanced NLP 2021 - YouTube
- CMU Advanced NLP 2022 - YouTube (Tweet)
- whatlies - Toolkit to help understand "what lies" in word embeddings. Also benchmarking.
- CLIP-Guided-Diffusion
- Factual Probing Is [MASK]: Learning vs. Learning to Recall (2021) (Code)
- Improving Compositional Generalization with Latent Structure and Data Augmentation (2021)
- PORORO - Platform Of neuRal mOdels for natuRal language prOcessing.
- PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization (2021) (Code)
- To Understand Language Is to Understand Generalization (2021) (HN)
- GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (2020) (Code)
- Multimodal Transformers | Transformers with Tabular Data (Article)
- Learn to Resolve Conversational Dependency: A Consistency Training Framework for Conversational Question Answering (2021) (Code)
- Improving Language Models by Retrieving from Trillions of Tokens (2021)
- Open Information Extraction (OIE) Resources
- Deeper Text Understanding for IR with Contextual Neural Language Modeling (2019) (Code)
- x-clip - Concise but complete implementation of CLIP with various experimental improvements from recent papers.
- Calamity - Self-hosted GPT playground.
- VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation (2021) (Code)
- Transactions of the Association for Computational Linguistics (2021) (Code)
- DocEE - Toolkit for document-level event extraction, containing some SOTA model implementations.
- Autoregressive Entity Retrieval (2020)
- Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation (2020)
- A Span-Based Model for Joint Overlapped and Discontinuous Named Entity Recognition (2021)
- Deduplicating Training Data Makes Language Models Better (2021) (Code)
- Transformers without Tears: Improving the Normalization of Self-Attention (2019) (Code)
- CTCDecoder - Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
- Custom Named Entity Recognition with Spacy3
- BARTScore: Evaluating Generated Text as Text Generation (2021) (Code)
- minDALL-E on Conceptual Captions - PyTorch implementation of a 1.3B text-to-image generation model trained on 14 million image-text pairs.
- Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation (2021) (Code)
- Multitask Prompted Training Enables Zero-Shot Task Generalization (2021) (Code)
- spaCy models - Models for the spaCy Natural Language Processing (NLP) library.
- Awesome Huggingface
- SyntaxDot - Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.
- STriP Net - Semantic Similarity of Scientific Papers (S3P) Network.
- Small-Text - Active Learning for Text Classification in Python.
- Plug and Play Language Models: A Simple Approach to Controlled Text Generation (2020) (Code)
- RuDOLPH - One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP.
- PLM papers - Paper list of pre-trained language models (PLMs).
- Ongoing research training transformer language models at scale, including: BERT & GPT-2
- Improving language models by retrieving from trillions of tokens (2022) (Code)
- EntitySeg Toolbox - Towards precise and open-world image segmentation.
- Aligning Language Models to Follow Instructions (2022) (Tweet) (Code)
- Simple Questions Generate Named Entity Recognition Datasets (2021) (Code)
- KRED: Knowledge-Aware Document Representation for News Recommendations (2019) (Code)
- Stanford Open Information Extraction
- Python3 wrapper for Stanford OpenIE
- I-BERT: Integer-only BERT Quantization (2021) (Code)
- spaCy-wrap - Wrapping fine-tuned transformers in spaCy pipelines.
- DeepMatcher - Python package for performing Entity and Text Matching using Deep Learning.
- Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond (2020) (Code)
- medspacy - Library for clinical NLP with spaCy.
- Natural Language Processing with Transformers Book (Code)
- blurr - Library that integrates huggingface transformers with the world of fastai, giving fastai devs everything they need to train, evaluate, and deploy transformer specific models.
- HanLP - Multilingual NLP library for researchers and companies, built on PyTorch and TensorFlow 2.x.
- Awesome Text-to-Image
- NLP News Newsletter
- Named Entity Recognition as Dependency Parsing (2020) (Code)
- Multilingual-CLIP - OpenAI CLIP text encoders for any language.
- FasterTransformer - Transformer related optimization, including BERT, GPT.
- Papers about Causal Inference and Language
- EET (Easy and Efficient Transformer) - Efficient PyTorch inference plugin focused on Transformer-based models with large model sizes and long sequences.
- Measuring Massive Multitask Language Understanding (2021) (Code)
- A Theoretical Analysis of the Repetition Problem in Text Generation (2021) (Code)
- TransformerSum - Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.
- Transformer Memory as a Differentiable Search Index (2022) (HN) (Tweet)
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (2020) (Code)
- spaCy + Stanza - Use the latest Stanza (StanfordNLP) research models directly in spaCy.
- Awesome Document Understanding
- Sequential Transformer - Code for training Transformers on sequential tasks such as language modeling.
- bert-as-service - Mapping a variable-length sentence to a fixed-length vector using BERT model.
- A Contrastive Framework for Neural Text Generation (2022) (Code)
- Parallax - Tool for interactive embeddings visualization.
- Serve PyTorch model as an API using AWS + serverless framework
- Neural reality of argument structure constructions (2022)
- DeepNet: Scaling Transformers to 1,000 Layers (2022) (HN)
- Large Models of Source Code - Guide to using pre-trained large language models of source code.
- HyperMixer: An MLP-based Green AI Alternative to Transformers (2022)
- NLP Course Material & QA
- Survey of Surveys (NLP & ML) - Collection of 700+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).
- Awesome CLIP - Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
- MAGMA - GPT-style multimodal model that can understand any combination of images and language.
- Timexy - spaCy custom component that extracts and normalizes temporal expressions.
- New Capabilities for GPT-3: Edit and Insert (2022) (HN)
- Which hardware to train a 176B parameters model? (2022) (Tweet)
- Fundamentals of NLP - Series of hands-on notebooks for learning the fundamentals of NLP.
- BertViz - Visualize Attention in Transformer Models (BERT, GPT2, BART, etc.).
- Attention Is All You Need (2017) (Code) (PyTorch Code)
- Word2Vec Explained. Explaining the Intuition of Word2Vec (2021) (HN)
- imgbeddings - Python package to generate image embeddings with CLIP without PyTorch/TensorFlow.
- Linking Emergent and Natural Languages via Corpus Transfer (2022)
- Transformer Inference Arithmetic (2022)
- Training Compute-Optimal Large Language Models (2022) (Tweet)
- KeyphraseVectorizers - Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a document-keyphrase matrix.
- Gramformer - Framework for detecting, highlighting and correcting grammatical errors on natural language text.
- Classy Classification - Easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
- Sphere - Web-scale retrieval for knowledge-intensive NLP.
- muTransformers - Common Huggingface transformers in maximal update parametrization (µP).
- Event Extraction papers - List of NLP resources focused on event extraction task.
- Summarization Papers
- GLID-3 - Combination of OpenAI GLIDE, Latent Diffusion and CLIP.
- Optimum Transformers - Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.
- Pathways Language Model (PaLM): Scaling to 540B parameters (2022) (HN) (Code) (Code)
- A Divide-and-Conquer Approach to the Summarization of Long Documents (2020) (Code)
- Resources for learning about Text Mining and Natural Language Processing
- LinkBERT: Pretraining Language Models with Document Links (2022) (Code)
- Dall-E 2 (2022) (HN) (Tweet) (Tweet) (Code) (Code) (Code) (Tweet) (Tweet) (HN) (Video Summary) (HN) (Tweet)
- Variations of the Similarity Function of TextRank for Automated Summarization (2016) (Code)
- Logic-Guided Data Augmentation and Regularization for Consistent Question Answering (2020) (Code)
- Awesome Knowledge Distillation
- You Only Look at One Sequence (2021)
- Towards Understanding and Mitigating Social Biases in Language Models (2021) (Code)
- DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization (2021) (Code)
- Humanloop Programmatic - Create large high-quality datasets for NLP in minutes. No hand labelling required. (HN)
- Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language (2022)
- Second order effects of the rise of large language models (2022)
- Simple Annotated implementation of GPT-NeoX in PyTorch
- BLEURT: Learning Robust Metrics for Text Generation (2020) (Code)
- Bootleg - Self-supervised named entity disambiguation (NED) system that links mentions in text to entities in a knowledge base. (Code)
- DALL-E in Mesh-TensorFlow
- A few things to try with DALL·E (2022) (HN)
- Google's 540B PaLM Language Model & OpenAI's DALL-E 2 Text-to-Image Revolution (2022)
- Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution (2021) (Code)
- Simple and Effective Multi-Paragraph Reading Comprehension (2017) (Code)
- Researchers Glimpse How AI Gets So Good at Language Processing (2022)
- Cornell Conversational Analysis Toolkit (ConvoKit) - Toolkit for extracting conversational features and analyzing social phenomena in conversations.
- UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models (2022) (Code)
- exBERT - Visual Analysis Tool to Explore Learned Representations in Transformers Models.
- How DALL-E 2 Works (2022) (HN)
- Getting started with NLP for absolute beginners (2022)
- EasyNLP - Comprehensive and Easy-to-use NLP Toolkit.
- Reframing Human-AI Collaboration for Generating Free-Text Explanations (2021) (Tweet)
- Detoxify - Comment Classification with PyTorch Lightning and Transformers.
- DLATK - End to end human text analysis package, specifically suited for social media and social scientific applications.
- Language modeling via stochastic processes (2022) (Code)
- An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling (2022) (Code)
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization (2021) (Code)
- DataLab - Unified platform that allows for NLP researchers to perform a number of data-related tasks in an efficient and easy-to-use manner.
- Limitations of DALL-E (HN)
- AutoPrompt - Automatic Prompt Construction for Masked Language Models.
- DALL·E Flow - Human-in-the-Loop workflow for creating HD images from text.
- Recon NER - Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data.
- CausalNLP - Practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.
- OPT: Open Pre-trained Transformer Language Models (2022) - Meta's 175B parameter language model. (Reddit) (Tweet)
- Bert Extractive Summarizer - Easy to use extractive text summarization with BERT.
- Dialogue Response Ranking Training with Large-Scale Human Feedback Data (2020) (Code)
- LM-Debugger - Interactive tool for inspection and intervention in transformer-based language models.
- 100 Pages of raw notes released with the language model OPT-175 (HN)
- Unsupervised Cross-Task Generalization via Retrieval Augmentation (2022) (Code)
- On Continual Model Refinement in Out-of-Distribution Data Streams (2022)
- GLID-3-XL - 1.4B latent diffusion model from CompVis back-ported to the guided diffusion codebase.
- Neutralizing Subjectivity Bias with HuggingFace Transformers (2022)
- Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists (2022) (Code) (Tweet)
- gse - Go efficient multilingual NLP and text segmentation; supports English, Chinese, Japanese and others.
- BERTopic: The Future of Topic Modeling (2022) (HN)
- Unifying Language Learning Paradigms (2022) (Code)
- GLM: General Language Model Pretraining with Autoregressive Blank Infilling (2021) (Code)
- GPT-3 limitations (2022)
- Natural Language Processing Demystified
- Concise Concepts - Contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
- Dynamic language understanding: adaptation to new knowledge in parametric and semi-parametric models (2022) (Tweet)
- nlprule - Fast, low-resource Natural Language Processing and Text Correction library written in Rust.
- Quark: Controllable Text Generation with Reinforced Unlearning (2022) (Tweet)
- DALL-E 2 has a secret language (HN) (Tweet) (HN)
- AdaTest - Find and fix bugs in natural language machine learning models using adaptive testing.
- Diffusion-LM Improves Controllable Text Generation (2022) (Code) (Tweet)
- RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering (2021) (Code)
- Neural Prompt Search - Searching prompt modules for parameter-efficient transfer learning.
- makemore - Most accessible way of tinkering with a GPT - one hackable script.
- DALL-E Playground - Playground for DALL-E enthusiasts to tinker with the open-source version of OpenAI's DALL-E, based on DALL-E Mini.
- Craiyon - AI model drawing images from any prompt. Formerly DALL-E mini.
- Contrastive Learning for Natural Language Processing
- MSCTD: A Multimodal Sentiment Chat Translation Dataset (Code)
- Auto-Lambda: Disentangling Dynamic Task Relationships (2022) (Code)
- Concepts in Neural Networks for NLP
- DinkyTrain - Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration.
- Pretrained Language Models
- BERT-of-Theseus: Compressing BERT by Progressive Module Replacing (2020) (Code)
- YaLM 100B - GPT-like neural network for generating and processing text by Yandex. (HN) (Article)
- Pathways Autoregressive Text-to-Image model (Parti) - Autoregressive text-to-image generation model that achieves high-fidelity photorealistic image generation and supports content-rich synthesis involving complex compositions and world knowledge. (Web) (HN)
- How Imagen Actually Works (2022)
- First impressions of DALL-E, generating images from text (2022) (Lobsters)
- Meta is inviting researchers to pick apart the flaws in its version of GPT-3 (2022) (HN)
- 'Making Moves' In DALL·E mini (2022)
- min(DALL·E) - Minimal implementation of DALL·E Mini. It has been stripped to the bare essentials necessary for doing inference, and converted to PyTorch.
- Awesome Document Similarity Measures
- RETRO Is Blazingly Fast (2022)
- LightOn - Unlock Extreme-Scale Machine Intelligence. Most repos are focused on the use of photonic hardware. (GitHub)
- Minerva: Solving Quantitative Reasoning Problems with Language Models (2022) (Paper)
- winkNLP - Developer friendly Natural Language Processing. (Docs)
- Facebook Low Resource (FLoRes) MT Benchmark
- Using GPT-3 to explain how code works (2022) (Lobsters) (HN)
- Awesome Topic Models
- Introducing The World’s Largest Open Multilingual Language Model: BLOOM
- The DALL·E 2 Prompt Book (HN) (Tweet)
- RWKV - RNN with Transformer-level performance, which can also be directly trained like a GPT transformer (parallelizable).
- Kern AI - Open-source IDE for data-centric NLP. Combining programmatic labeling, extensive data management and neural search capabilities. (Code) (HN)
- spaCy fishing - spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata.
- DALL·E Now Available in Beta (2022) (HN)
- Inside language models (from GPT-3 to PaLM)
- Timeline of AI and language models
- Cascades - Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference, and more.
- Awesome Neural Symbolic
- Towards Knowledge-Based Recommender Dialog System (2019) (Code)
- Asent - Rule-based sentiment analysis library for Python made using SpaCy.
- extractacy - Pattern extraction and named entity linking for spaCy.
- A Hazard Analysis Framework for Code Synthesis Large Language Models (2022)
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? (2022) (Code)
- A Frustratingly Easy Approach for Entity and Relation Extraction (2021) (Code)
- Chinchilla's Wild Implications (2022) (HN)
- DALL·E 2 prompt book (2022) (HN)
- GLM-130B - Open Bilingual Pre-Trained Model.
- An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion (2022) (Code)
- DALL-E + GPT-3 = ♥ (2022) (HN)
- Run your own DALL-E-like image generator (2022) (HN)
- Stable Diffusion launch announcement (2022) (HN)
- Stable Diffusion
- MidJourney Styles and Keywords Reference
- Spent $15 in DALL·E 2 credits creating this AI image (2022) (HN)
- Phraser - Better way to generate prompts.
- Seminar on Large Language Models (2022)
- DocQuery - Document Query Engine Powered by NLP. (Article) (Tweet)
- Petals - Decentralized platform for running 100B+ language models. (Web) (HN) (HN)
- LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (2022) (Code)
- ekphrasis - Text processing tool, geared towards text from social networks.
- ALToolbox - Framework for practical active learning in NLP.
- Tools and scripts for experimenting with Transformers: Bert, T5
- Action Transformer (ACT-1) model in action
- Label Sleuth - Open source no-code system for text annotation and building text classifiers.
- Vectoring Words (Word Embeddings) (2022)
- CodeGeeX: A Multilingual Code Generative Model (2022)
- The first neural machine translation system for the Erzya language (2022) (Code)
- Awesome Efficient PLM Papers
- Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
- Interactive Composition Explorer - Python library and trace visualizer for language model programs.
- TrAVis: Transformer Attention Visualizer (Code)
- Knowledge Unlearning for Mitigating Privacy Risks in Language Models (2022) (Code)
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model (2022) (Code)
- End-to-end Neural Coreference Resolution in spaCy (2022)
- Ask Me Anything: A simple strategy for prompting language models
- Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval (2022) (Code)
- A Kernel-Based View of Language Model Fine-Tuning (2022) (Code)
- Large Language Models are few(1)-shot Table Reasoners (2022) (Tweet)
- The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains (2022) (Code)
- Binding Language Models in Symbolic Languages (2022) (Code)
- ML and text manipulation tools (2022)
- Table-To-Text generation and pre-training with TabT5 (2022)
- concepCy - SpaCy wrapper for ConceptNet.
- AliceMind - ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab.
- CrossRE: A Cross-Domain Dataset for Relation Extraction (2022) (Code)
- Scaling Instruction-Finetuned Language Models (2022) (Tweet) (Tweet)
- Large Language Models Can Self-Improve (2022) (Tweet)
- Everyprompt - Playground for GPT-3. (Tweet)
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (2021) (Tweet)
- Composable Text Controls in Latent Space with ODEs (2022) (Code)
- flashgeotext - Extract city and country mentions from text like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.
- lm-scorer - Language Model based sentences scoring library.
- CodeT: Code Generation with Generated Tests
- Bloom - BigScience Large Open-science Open-access Multilingual Language Model. (Tweet)
- Prompts - Free and open-source (FOSS) curation of prompts for OpenAI’s GPT-3, EleutherAI’s GPT-J, and other LMs.
- FSNER - Few-shot Named Entity Recognition.
- Ilya Sutskever (OpenAI): What's Next for Large Language Models (LLMs) (2022)
- Galactica - General-purpose scientific language model. It is trained on a large corpus of scientific text and data. (Code) (Tweet)
- Three-level Hierarchical Transformer Networks for Long-sequence and Multiple Clinical Documents Classification (2021) (Code)
- WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models (2022) (Code)
- Convenient Text-to-Text Training for Transformers
- Homophone Reveals the Truth: A Reality Check for Speech2Vec (2022) (Code)
- RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder (2022) (Code)
- Generate conversation starters given two personalities using AI
- MetaICL: Learning to Learn In Context (2021) (Code)
- PAL: Program-aided Language Models (2022) (Code)
- ReAct: Synergizing Reasoning and Acting in Language Models (2022) (Code)
- CogIE - Information Extraction Toolkit for Bridging Text and CogNet.
- T-NER - All-Round Python Library for Transformer-based Named Entity Recognition.
- mGPT: Multilingual Generative Pretrained Transformer
- LangChain - Building applications with LLMs through composability. (HN)
- HN Summary - Summarizes top stories from Hacker News using a large language model and posts them to a Telegram channel. (HN)
- OpenAI Model index for researchers
- ChatGPT
- Adventures in generating music via ChatGPT text prompts (2022)
- All the best examples of ChatGPT, from OpenAI
- ChatGPT nice examples
- WhatsApp-GPT
- What ChatGPT features/improvements do you want?
- Summarize-Webpage - Small NLP SaaS project that summarizes a webpage.
- Nonparametric Masked Language Modeling (2022) - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks. (Reddit) (Code)
- Holistic Evaluation of Language Models - Framework to increase the transparency of language models. (Paper)
- Dramatron - Uses large language models to generate long, coherent text and could be useful for authors for co-writing theatre scripts and screenplays. (HN)
- ExtremeBERT - Toolkit that accelerates the pretraining of customized language models on customized datasets.
- Talking About Large Language Models (2022) (HN) (Tweet)
- The GPT-3 Architecture, on a Napkin (2022) (HN)
- Discovering Latent Knowledge in Language Models Without Supervision (2022) (HN)
- Lightning GPT
- Bricks - Open-source natural language enrichments at your fingertips.
- GPT-2 Output Detector
- Language Model Operationalization
- NLQuery - Natural language query engine on WikiData.
- Categorical Tools for Natural Language Processing (2022)
- Historical analogies for large language models (2022) (Tweet)
- CMU Advanced NLP Assignment: End-to-end NLP System Building
- New and Improved Embedding Model for OpenAI (2022) (HN)
- GPT-NeoX (HN)
- OpenAI Cookbook - Examples and guides for using the OpenAI API.
- OpenAI Question Answering using Embeddings
- GreaseLM: Graph REASoning Enhanced Language Models for Question Answering (2022) (Code)
- Rank-One Model Editing (ROME) - Locating and editing factual associations in GPT.
- Open Assistant - Give everyone access to a great chat based large language model. (Web) (HN)
- Characterizing Emergent Phenomena in Large Language Models (2022)
- SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features (2022) (Code)
- Blob - Powerful tool that uses large language models (LLMs) to assist in the creation and maintenance of software projects.
- Chain of Thought Prompting Elicits Reasoning in Large Language Models (2022) (Code)
- Compress-fastText - Python 3 package for compressing fastText word embedding models.
- Large Language Models are Zero-Shot Reasoners (2022)
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning (2022)
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (2022) (Code)
- Improving Language Model Behavior by Training on a Curated Dataset (2021)
- Reasoning in Large Language Models
- SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization (2022) (Code)
- Happy Transformer - Package built on top of Hugging Face's transformers library that makes it easy to utilize state-of-the-art NLP models.
- TextBox - Text generation library with pre-trained language models.
- Advances in Neural Information Processing Systems 30 (NIPS 2017)
- Poincaré Embeddings for Learning Hierarchical Representations (2017) (Code)
- llm-strategy - Implementing the Strategy Pattern using LLMs.
- Zshot - Zero and Few shot named entity & relationships recognition.
- Cramming: Training a Language Model on a Single GPU in One Day (2022) (Code)
- Trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models"
- Training language models to follow instructions with human feedback (2022) (Web) (Code)
- Lila: A Unified Benchmark for Mathematical Reasoning (2022)
- LibMultiLabel - Library for Multi-class and Multi-label Text Classification.
- Paper Notes on Pretrain Language Models with Factual Knowledge
- Atlas: Few-shot Learning with Retrieval Augmented Language Models (2022) (Code)
- Some Remarks on Large Language Models (2023) (HN)
- Massive Language Models Can Be Accurately Pruned in One-Shot (2023) (Reddit)
- LM Identifier - Toolkit for identifying pretrained language models from potentially AI-generated text.
- BRIO: Bringing Order to Abstractive Summarization (2022) (Code)
- DOC: Improving Long Story Coherence With Detailed Outline Control (2022) (Code)
- InPars: Data Augmentation for Information Retrieval using Large Language Models (2022) (Code)
- Unified Structure Generation for Universal Information Extraction (2022) (Code)
- Awesome Resource for NLP
- PromptArray: A Prompting Language for Neural Text Generators
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022) (Code)
- Multi Task NLP - Utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
- FairSeq with Apollo optimizer
- TFKit - Handling multiple NLP tasks in one pipeline.
- Repository of Language Instructions for NLP Tasks
- tasksource - Datasets curation and datasets metadata for NLP extreme multitask learning.
- ChatLangChain - Implementation of a chatbot specifically focused on question answering over the LangChain documentation.
- summaries - Toolkit for summarization analysis and aspect-based summarizers.
- SymbolicAI - Neuro-Symbolic Perspective on Large Language Models (LLMs).
- PEFT - Parameter-Efficient Fine-Tuning.
- Large Transformer Model Inference Optimization (2023) (HN)
- Embed-VTT - Generate & query embeddings from VTT files using OpenAI & Pinecone on Andrej Karpathy's latest GPT tutorial.
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation (2021) (Code)
- Awesome LLM Engineering
- Minimal GPT-NeoX-20B in PyTorch
- Language Models of Code are Few-Shot Commonsense Learners (2022) (Code)
- Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP (2022) (Code)
- DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations (2022) (Code)
- LangChainHub (Article)
- NLP-Cube - Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing.
- Dust - Design and Deploy Large Language Models Apps. (Code) (Twitter)
- Awesome papers on Language-Model-as-a-Service (LMaaS)
- Sentences - Command line sentence tokenizer.
- Diff Models – A New Way to Edit Code (2023) (HN)
- MegaBlocks - Light-weight library for mixture-of-experts (MoE) training.
- Read Pilot - Analyzes online articles and generates Q&A cards for you. Powered by OpenAI & Next.js. (Code)
- Promptify - Prompt engineering: solve NLP problems with LLMs and easily generate prompts for different NLP tasks.
- polymath - Utility that uses AI to intelligently answer free-form questions based on a particular library of content.
- Incorporating External Knowledge through Pre-training for Natural Language to Code Generation (2020) (Code)
- Longformer: The Long-Document Transformer (2020) (Code)
- ProbSem - Probabilistic semantic parsing with program synthesis LLMs.
- Generate rather than Retrieve: Large Language Models are Strong Context Generators (2023) (Code)
- Text Generation Inference - Large Language Model Text Generation Inference.
- New AI classifier for indicating AI-written text (2023) (HN)
- DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature (2023) (HN)
- Towards Continual Knowledge Learning of Language Models (2022) (Code)
- AI Text Classifier - OpenAI API
- Fine-tuning GPTJ and other GPT models
- Adversarial Prompts
- Ignore Previous Prompt: Attack Techniques For Language Models (2022) (Code)
- Multimodal Chain-of-Thought Reasoning in Language Models (2023) (Paper)
- Prodigy OpenAI recipes - Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3.
- Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees (2023) (Code)
- Online Language Modelling Training Pipeline
- Storing OpenAI embeddings in Postgres with pgvector (2023) (HN)
- Theory of Mind May Have Spontaneously Emerged in Large Language Models (2023) (HN)
- Steamship Python Client Library For LangChain
- Toolformer: Language Models Can Teach Themselves to Use Tools (2023) (HN) (Code) (HN)
- Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery (2023) (Code)
- Understanding Large Language Models – A Transformative Reading List (2023) (HN)
- Discovering Latent Knowledge Without Supervision
- Offsite-Tuning: Transfer Learning without Full Model (2023) (Code)
- Awesome Neural Reprogramming Acoustic Prompting
- Chroma - Open-source embedding database. Makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.
- Prompt Engine - Microsoft's prompt engineering library. (HN)
- PCAE: A framework of plug-in conditional auto-encoder for controllable text generation (2022) (Code)
- EasyLM - Easy to use model parallel large language models in JAX/Flax with pjit support on cloud TPU pods.
- Promptable - Library that enables you to build powerful AI applications with LLMs and Embeddings providers such as OpenAI, Hugging Face, Cohere and Anthropic.
- Lightning + Colossal-AI - Efficient Large-Scale Distributed Training with Colossal-AI and Lightning AI.
- MarioGPT: Open-Ended Text2Level Generation through Large Language Models (2023) (Code)
- LangChain.js - Building applications with LLMs through composability.
- Top resources on prompt engineering (2023)
- What are Transformers & Named Entity Recognition (2023)
- Text is All You Need (2023) (HN)
- Awesome LLM
- On Prompt Engineering (2023)
- MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation (2022) (Code)
- How to make LLMs say true things (2023)
- A Fast Post-Training Pruning Framework for Transformers (2022) (Code)
- Awesome Prompt Engineering
- FlexGen - Running large language models on a single GPU. (HN) (HN)
- Butterfish - CLI tools for LLMs.
- Elk - Eliciting latent knowledge inside the activations of a language model.
- Neurosymbolic Reading Group
- One Embedder, Any Task: Instruction-Finetuned Text Embeddings (2022) (Code)
- Fine-tune FLAN-T5 for chat & dialogue summarization (2022)
- Cohere Playground - Summarize texts up to 50K characters.
- SGPT: GPT Sentence Embeddings for Semantic Search (2022) (Code)
- PromptKG - Gallery of Prompt Learning & KG-related research works, toolkits, and paper-list.
- Text generation web UI - Gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, GPT-Neo, and Pygmalion.
- Knowledge is a Region in Weight Space for Fine-tuned Language Models (2023)
- LangChain Sidecar - UI starterkit for building LangChain apps that can be embedded on any website, similar to how Intercom can be embedded.
- embedland - Collection of text embedding experiments.
- Understanding large language models
- MindsJS - Build your workflows and app backends with large language models (LLMs) like OpenAI, Cohere and AlephAlpha.
- LLaMA Inference code
- Language Is Not All You Need: Aligning Perception with Language Models (2023) (Tweet)
- LLMs are compilers (2023) (Lobsters)
- Beating OpenAI CLIP with 100x less data and compute (2023) (HN)
- SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
- TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks
- LLM Security - New ways of breaking app-integrated LLMs.
- LangChain Chat
- Awesome Generative Information Retrieval
- Facebook LLAMA is being openly distributed via torrents (2023)
- Batch Prompting: Efficient Inference with Large Language Model APIs (2023) (Code)
- Local attention - Implementation of local windowed attention for language modeling.
- Tiktokenizer - Online playground for OpenAI tokenizers. (Code)
- LLaMA: INT8 edition - Hastily quantized inference code for LLaMA models.
- The Waluigi Effect: an explanation of bizarre semiotic effects in LLMs (2023) (HN)
- Vellum - Dev platform for LLM apps. (HN)
- Large Language Model Training Playbook
- Inference-only implementation of LLaMA in plain NumPy
- GPT-3 will ignore tools when it disagrees with them (2023)
- Palm-E: An Embodied Multimodal Language Model (2023) (HN)
- UForm - Multi-Modal Inference Library For Semantic Search Applications and Mid-Fusion Vision-Language Transformers.
- Basaran - Open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
- 4 bits quantization of LLaMa using GPTQ
- ClickPrompt - Streamline your prompt design.
- Fork of Facebook’s LLaMa model to run on CPU (HN)
- Running LLaMA 7B on a 64GB M2 MacBook Pro with llama.cpp (2023)
- Llama.cpp - Port of Facebook's LLaMA model in C/C++, with Apple Silicon support. (HN)
- Large language models are having their Stable Diffusion moment right now (2023) (HN)
- Vaporetto - Fast and lightweight pointwise prediction-based tokenizer.
- Using LLaMA with M1 Mac (2023) (HN)
- Dalai - Automatically install, run, and play with LLaMA on your computer. (HN) (Code)
- What is Temperature in NLP? (2021) (HN)
- FLAN Instruction Tuning
- Minimal LLaMA
- ALLaMo - Simple, hackable and fast implementation for training/finetuning medium-sized LLaMA-based models.
- Stanford Alpaca - Instruction-following LLaMA model. (HN) (Web) (HN) (HN) (Web)
- Modern language models refute Chomsky’s approach to language (2023)
- High-throughput Generative Inference of Large Language Models with a Single GPU (2023) (HN)
- LLaMA-rs - Run LLaMA inference on CPU, with Rust. (HN)
- llama-dl - High-speed download of LLaMA, Facebook's 65B parameter GPT model. (HN)
- RLLaMA - Rust+OpenCL+AVX2 implementation of LLaMA inference code.
- Self-Instruct: Aligning Language Model with Self Generated Instructions (2022) (Code)
- LLaMA - Run LLM in A Single 4GB GPU
- GPT-4 (2023) (HN) (Demo) (Tweet) (Tweet)
- Evals - Framework for evaluating OpenAI models and an open-source registry of benchmarks. (HN)
- Anthropic | Introducing Claude (2023) (HN)
- Prompt in Context-Learning - Awesome resources for in-context learning and prompt engineering.
- GPT-4 System Card (2023)
- LangFlow - User Interface For LangChain.
- Alpaca-LoRA: Low-Rank LLaMA Instruct-Tuning
- Paper list of "The Life Cycle of Knowledge in Big Language Models: A Survey"
- AI Q&A for huggingface/diffusers
- bloomz.cpp - Inference of HuggingFace's BLOOM-like models in pure C/C++.
- MiniLLM: Large Language Models on Consumer GPUs
- Guardrails - Python package for specifying structure and type, validating and correcting the outputs of large language models.
- Alpaca.cpp - Run an Instruction-Tuned Chat-Style LLM on a MacBook. (HN)
- TextSynth Server - REST API to large language models. (HN)
- Wolverine - Give your python scripts regenerative healing abilities.
- Recursive LLM prompts - Implement recursion using English as the programming language and GPT as the runtime.
- Alpaca-LoRA as a Chatbot Service
- Simple UI for LLaMA Model Finetuning
- Serge - Web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
- llama-cli - Self-hosted, Simple LLaMA/alpaca API & CLI written in go.
- Kor - Extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples.
- Cheating is all you need (2023) (HN)
- Prompt Engineering (2023)
- Prompt Engineering Guide
- Anthropic Python SDK - Access to Anthropic's safety-first language model APIs.
- Alpaca-LoRA with Docker (HN)
- Autodoc - Toolkit for auto-generating codebase documentation using LLMs. (HN)
- Dolly - Fine-tunes the GPT-J 6B model on the Alpaca dataset using a Databricks notebook.
- Reflexion: an autonomous agent with dynamic memory and self-reflection (2023) (Code)
- CodeAlpaca – Instruction following code generation model (HN)
- LLaMA retrieval plugin - Using OpenAI's retrieval plugin.
- Retrieval in LangChain (2023) (HN)
- Open Sourcing Cody – Sourcegraph's AI-enabled editor assistant (2023) (HN)
- Cerebras-GPT: A Family of Open, Compute-Efficient, Large Language Models (2023) (HN)
- Lit-LLaMA - Open-source implementation of LLaMA. (HN)
- GPT4All - Demo, data and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo Generations based on LLaMa. (HN) (Tweet)
- LLMs and GPT: Some of my favorite learning materials (HN)
- Malleable software in the age of LLMs (2023)
- Llama.cpp 30B runs with only 6GB of RAM now (2023) (HN)
- Baseplate - Back end-as-a-service for LLM apps. (HN)
- Vocode - Library for voice conversation with LLMs. (HN) (Code)
- Finetuning LLMs on a Single GPU Using Gradient Accumulation (2023)
- Marvin - Build AI functions that use an LLM as a runtime. (HN)
- llm-cli - Access large language models from the command-line.
- xturing - Build and control your own LLMs.
- Five years of GPT progress (2023)
- Actually, Othello-GPT Has A Linear Emergent World Representation (2023)
- PromptPerfect - Elevate your prompts to perfection. (HN)
- gpt4all.cpp
- Wove - Tool for building long-running workflows with LLMs.
- Eight Things to Know about Large Language Models (2023) (HN)
- MiniChain - Tiny library for coding with large language models.
- Dynamic Web Interface - Dynamically generate UI with Large Language Models.
- JARVIS - System to connect LLMs with ML community.
- OpenFlamingo - Open-source framework for training large multimodal models.
- LangChain Tutorials - Overview and tutorial of the LangChain Library.
- babyagi - Python script that serves as an example of an AI-powered task management system.
- Implicit Representations of Meaning in Neural Language Models (2023)
- LLM playground you can run on your laptop (HN)
- Introducing Agents in Haystack: Make LLMs resolve complex tasks (2023) (HN)
- Fast Inference Solutions for BLOOM
- LLM Augmenter
- gpt4all-ts - TypeScript library that provides an interface to interact with GPT4All.
- PyLLaMACpp - Officially supported Python bindings for llama.cpp + gpt4all.
- Baize - Open-source chatbot trained with ChatGPT self-chatting data.
- e2b - IDE powered by AI agents. Developers describe what they want to build by writing documentation.
- RPTQ: Reorder-based Post-training Quantization for Large Language Models (2023) (Code)
- Module: AI-Assisted Code Generation
- Alpaca Libre - Reimplementation of the task generation part from the Alpaca paper.
- TextAugment - Text Augmentation Library.
- NLP Test: Deliver Safe & Effective Models
- nanoT5 - Fast & Simple repository for pre-training and fine-tuning T5-style models.
- llama_infer - Inference script for Meta's LLaMA models using Hugging Face wrapper.
- OpenAI Tokenizer (HN)
- GradientJ - Build NLP Applications Faster with LLMs. (HN)
- Simply explained: how does GPT work? (2023) (HN)
- LLM Agents - Build agents which are controlled by LLMs.
- LLMParser - Classify and extract structured data with LLMs.
- ChatArena - Multi-Agent Language Game Environments for LLMs.
- State-of-the-art open-source chatbot, Vicuna-13B, just released model weights (HN)
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (2023) (HN)
- Diffusion language models (2023)
- Tabby - Self-Hosted GitHub Copilot. (HN)
- LMQL - Query language for programming (large) language models.
- Instruction Tuning with GPT-4 (2023) (Code)
- UltraChat: Large-scale, Informative, and Diverse Multi-round Dialogue Data
- GPTCache - Library for Creating Semantic Cache for LLM Queries.
- clip-interrogator - Image to prompt with BLIP and CLIP.
- StackLlama: A hands-on guide to train LlaMa with RLHF
- XPretrain - Multi-modality pre-training.
- Alpaca-Turbo - Web UI to run alpaca model locally.
- EVAL - Elastic Versatile Agent with Langchain. Will execute all your requests. Just like an eval method.
- GPT4All.zig - Run a GPT4All model locally.
- LLaMA-Adapter: Efficient Fine-tuning of LLaMA
- llama-node - Large Language Model LLaMA on node.js.
- LangChain as an AIPlugin
- Motörhead - Memory and information retrieval server for LLMs.
- Flan-Eval: Reproducible Held-Out Evaluation for Instruction Tuning
- LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models (2023) (Code)
- AgentGPT - Assemble, configure, and deploy autonomous AI Agents in your browser. (Web)
- LLaMa-Pruning: Structural Pruning for LLaMa
- Simple Hierarchical Transformer
- The LLama Effect: Leak Sparked a Series of Open Source Alternatives to ChatGPT (2023) (HN)
- Baby GPT (HN)
- Large Language Models Are Human-Level Prompt Engineers (2023) (HN)
- LLaMA_MPS - Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
- Ask HN: Open source LLM for commercial use? (2023)
- DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation (2022)
- Why is GPT-3 15.77x more expensive for certain languages? (2023)
- Data processing pipeline for the Koala chatbot language model
- TurboPilot - Self-hosted copilot clone which uses the library behind llama.cpp to run the 6 Billion Parameter Salesforce Codegen model in 4GiB of RAM.
- Prompt Lang - Mini programming language for prompting LLMs.
- Python Bindings for llama.cpp
- Maximizing the Potential of LLMs: A Guide to Prompt Engineering (2023) (HN)
- LLM-Chain - Rust crate for building chains in large language models allowing you to summarise text and complete complex tasks.
- PhaseLLM - Large language model evaluation and workflow framework from Phase AI. (HN)
- The Coming of Local LLMs (2023) (HN)
- CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society
- BabyAGI: an Autonomous and Self-Improving agent, or BASI
- I built my own AutoGPT that makes videos (2023)
- How does LangChain actually work? (2023)
- Awesome Decentralized LLM
- OpenAGI: When LLM Meets Domain Experts (2023) (Code)
- Teaching Large Language Models to Self-Debug (2023)
- Flowise - LangchainJS UI - Drag & drop UI to build your customized LLM flow using LangchainJS.
- LlamaChat - Chat with your favorite LLaMA models, right on your Mac.
- BabyAGI JS
- LLaMA.go - Like llama.cpp in pure Go.
- Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM (2023) (Reddit)
- How does a Large Language Model like ChatGPT actually work?
- Building LLM applications for production (2023) (HN)
- Poking around OpenAI (2023) (Lobsters)
- Web LLM - WebGPU Powered Inference of Large Language Models. (HN)
- Databerry - Connect your data to large language models.
- StableLM - Stability AI Language Models. (Article) (HN)
- Web LLM runs the vicuna-7b Large Language Model entirely in your browser, and it’s very impressive (2023) (Lobsters)
- How does GPT-3 spend its 175B parameters? (2023)
- Peak LLM? (2023) (HN)
- Autonomous Agents and Agent Simulations (2023) (HN)
- How to train your own large language models (2023) (HN)
- Question Extractor - Generate question/answer training pairs out of raw text.
- An example of LLM prompting for programming (2023) (HN)
- Awesome Adapter Resources - Collection of Tools and Papers related to Adapters (aka Parameter-Efficient Transfer Learning/ Fine-Tuning).
- Keeping Track of Affordable Language Models, 🦙 Cult and More
- Reasoning with Language Model Prompting Papers
- InsightFlow - LLM-based tool for parsing information and chatting with it.
- Megabots - State-of-the-art, production ready LLM apps made mega-easy.
- openpm-langchain - Openpm is a package manager for OpenAPI files.
- supercharger - Leverage locally-hosted Large Language Models to write software + unit tests for you.
- LLaVA: Large Language and Vision Assistant - Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities. (Web)
- GPT4All Chat - Locally-running AI chat application powered by the GPT4All-J.
- AI Playground by Vercel Labs
- Understanding Large Language Models (2023)
- RedPajama - Create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens.
- How to run your own LLM (GPT) (2023)
- H2O LLM Studio - Framework and no-code GUI for fine-tuning LLMs.
- Finetuning Large Language Models (2023) (HN)
- LLaMA-8bit-LoRA - Training a LoRA for the LLaMA model on HuggingFace with 8-bit quantization.
- Alpaca Electron - Simpler way to run Alpaca.
- Large Language Models: Scaling Laws and Emergent Properties (2023)
- Prompt engineering vs. blind prompting (2023) (HN)
- CameLLM - Run your favorite LLMs locally on macOS from Swift.
- Pretraining Language Models with Human Preferences (2023) (Code)
- LocalAI - OpenAI compatible API to run LLM models locally on consumer grade hardware. (Reddit)
- Reverse engineering of Google's Bard API
- How many decisions and pre-processing steps you need to train large language models (LLMs) such as LLaMA (2023)
- gpt-llama.cpp - Llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.
- Micro Agent - Tiny implementation of an autonomous agent powered by LLMs (OpenAI GPT-4).
- spacy-huggingface-pipelines - Use pretrained transformer models for text and token classification.
- LLMSurvey - Collection of papers and resources related to Large Language Models.
- Sapiens - Fun with chatGPT API, llm-chain and huelib.
- Anthropic TypeScript SDK
- Recurrent Memory Transformer (2022) (Code)
- Dataless Knowledge Fusion by Merging Weights of Language Models (2023) (Code)
- PotatoGPT - Pure Typescript, dependency free, ridiculously slow implementation of GPT2 for educational purposes.
- Aria - AI Research Assistant Powered by Large Language Models.
- igoGPT - Tool inspired by AutoGPT implemented in Go.
- The Dual LLM pattern for building AI assistants that can resist prompt injection (2023) (Lobsters)
- HuggingChat (HN) (Code) (HN)
- PromptWatch - LangChain tracing on steroids.
- QuiLLMan - Voice Chat with LLMs. (HN)
- NeMo Guardrails - Toolkit for easily adding programmable guardrails to LLM-based conversational systems.
- LLMComposer - Go framework for language model-powered applications with composability and chaining. Inspired by LangChain.
- MicroGPT - Minimal general-purpose autonomous agent based on GPT-3.5 / GPT-4.
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (2023) (HN)
- A guide to prompting AI, for what it is worth (2023) (HN)
- deepdoctection - Document extraction and analysis using deep learning models. (HN)
- LangForge - Toolkit for Creating and Deploying LangChain Apps.
- Self-INSTRUCT - Similar to Auto-GPT but with better reasoning and planning.
- Semantic Tokenizer for Enhanced Natural Language Processing (2023) (HN)
- Current architectural best practices for LLM applications (2023) (HN)
- You probably don't know how to do Prompt Engineering (2023) (HN)
- Lamini - LLM engine for rapidly customizing models.
- Stability AI releases StableVicuna, a RLHF LLM Chatbot (2023) (HN)
- A brief history of LLaMA models (2023) (HN)
- Introducing Lamini, the LLM Engine for Rapid Customization (2023)
- The Practical Guides for Large Language Models - Curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers). (Paper)
- Instruction Tuning Papers
- WangChanGLM - Multilingual Instruction-Following Model.
- Otter - Instruction-tuned model built upon OpenFlamingo that has been customized for a context.
- Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models (HN)
- OpenLLaMA - Open Reproduction of LLaMA. (HN)
- Avoiding hallucinations in LLM-powered applications (2023) (HN)
- Replit's new Code LLM: Open Source, 77% smaller than Codex, trained in 1 week (2023) (HN)
- LangChain Supabase template
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
- llm.ts - Call any LLM with a single API. Zero dependencies.
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (2023) (HN)
- Augmenting LLMs Beyond Basic Text Completion and Transformation (2023) (HN)
- Fixing Hallucination with Knowledge Bases (HN)
- Re-implementing LangChain in 100 lines of code (2023) (HN)
- gpt-json - Structured and typehinted GPT responses in Python. (HN)
- BMTools - Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins.
- AutoGPTQ - Easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
- StarCoder - Language model (LM) trained on source code and natural language text.
- ReLLM: Exact Structure for Large Language Model Completions (2023) (HN)
- Releasing 3B and 7B RedPajama (2023) (HN)
- Unlimiformer: Long-Range Transformers with Unlimited Length Input (2023) (HN) (Code)
- MosaicML MPT-7B: A Commercially-Usable LLaMa-Quality Model (2023) (HN) (Reddit)
- WizardVicunaLM - LLM that combines the principles of wizardLM and vicunaLM.
- Dolphin - General video interaction platform based on LLMs.
- Uses Auto-GPT with Llama.cpp
- Open LLMs
- Awesome Instruction Dataset
- VardaGPT - Associative memory-enhanced GPT-2 model.
- LLM Foundry - LLM training code for MosaicML foundation models.
- ThinkGPT - Agent techniques to augment your LLM and push it beyond its limits.
- Indexify - Document Indexing Service with SOTA embedding models and Pluggable Vector Stores.
- ReplitLM - Inference code and configs for the ReplitLM model family.
- ReLLM - Regular Expressions for Language Model Completions.
- Pre-Training to Learn in Context (2023)
- PandaLM: Reproducible and Automated Language Model Assessment
- RasaGPT - Headless LLM chatbot platform built on top of Rasa and Langchain. (HN)
- privateGPT - Ask questions to your documents without an internet connection, using the power of LLMs. (HN)
- Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models (2023) (Code)
- AutoGPT.js - Auto-GPT on the browser.
- Lit-Parrot - Implementation of the StableLM/Pythia/INCITE language models based on nanoGPT. Supports flash attention, LLaMA-Adapter fine-tuning, pre-training.
- HugNLP - Unified and comprehensive NLP library based on HuggingFace Transformer.
- Automated interpretability
- OpenLM - OpenAI-compatible Python client that can call any LLM.
- Dromedary - Towards helpful, ethical and reliable LLMs.
- PaLM - Open-source implementation of Google's PaLM models.
- Pretraining Without Attention (2022) (Code)
- Hugging Face Releases Agents (HN)
- WizardLM: An Instruction-following LLM Using Evol-Instruct
- The Leverage of LLMs for Individuals (2023) (HN)
- PaLM 2 Technical Report (2023) (HN)
- Tips and tricks for working with Large Language Models like OpenAI's GPT-4 (HN)
- Open-Llama - Complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
- LangChain Go
- Anthropic - Introducing 100K Token Context Windows, Around 75,000 Words (2023) (HN)
- SmartGPT - Program that provides LLMs with the ability to complete complex tasks using plugins.
- LangChainTS Starter
- How to run Llama 13B with a 6GB graphics card (2023) (HN)
- Context-Free Grammar Parsing with LLMs (2023)
- Smol Developer - Human-centric & Coherent Whole Program Synthesis aka your own personal junior developer. (HN)
- Google Bard API
- StarCoder: may the source be with you! (2023) (HN)
- React LLM - Easy-to-use headless React Hooks to run LLMs in the browser with WebGPU.
- Guidance - Guidance language for controlling large language models. (HN)
- Dify - One API for plugins and datasets, one interface for prompt engineering and visual operation, all for creating powerful AI applications.
- Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 (2023) (Code)
- BriefGPT - Locally hosted tool that connects documents to LLMs for summarization and querying, with a simple GUI.
- StructGPT: A General Framework for Large Language Model to Reason over Structured Data (2023) (Code)
- Numbers every LLM Developer should know (HN)
- Rebuff - Prompt Injection Detector.
- API Bot tutorial
- openai-ext - Extension to OpenAI's API to support streaming chat completions.
- Langdock - Open Source LLM Plugin Platform.
- Awesome LangChain
- SuperAgent - Deploy LLM Agents to production.
- TokenHawk - WebGPU LLM inference tuned by hand.
- LMDB - Database powered by language models.
- ChatALL - Chat with ALL AI Bots Concurrently, Discover the Best.
- Zeno Build - Build, evaluate, analyze, and understand LLM-based apps.
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (2023)
- LangChain: The Missing Manual (2023) (HN)
- LLM VM - LLM infrastructure for developers.
- LLM-Pruner - On the Structural Pruning of Large Language Models.
- Poe API Node
- Danswer - OpenSource Enterprise Question-Answering. (HN)
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
- Langchain Chat Bot - AI Chatbot for analyzing/extracting information from data in conversational format. Mainly PDF files.
- Lanarky - Ship production-ready LLM projects with FastAPI.
- Emdash - Uses AI to organize text snippets so you can actually remember & learn from what you read.
- PyLLMs - Minimal Python library to connect to LLMs (OpenAI, Anthropic, AI21, Cohere, Aleph Alpha, HuggingfaceHub, Google PaLM2), with a built-in model performance benchmark.
- string2string - String-to-String Algorithms for Natural Language Processing.
- gmessage - Web UI for gpt4all.
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023) (Code) (Code)
- llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs (2023)
- GPTeam - Open-source multi-agent simulation.
- python-llm - LLM API for Humans.
- Plug and Plai - Integrating AI plugins to LLMs.
- ChatGPT, GenerativeAI and LLMs Timeline
- Agent Smith - Customizable CLI agents to answer all your quick questions.
- Open LLM Server - Run local LLMs via HTTP API in a single command.
- Pinecone Chatbot Demo
- Awesome Rust LLM
- BLOOMChat Training Repo
- Zep - Long-term memory store for LLM / Chatbot applications.
- Mercury - Train your own custom GPT. Chat with any file or website.
- Nomic - Interact with Massive Embedding and Text Datasets in Your Web Browser.
- Open LLM Leaderboard
- Chatbot Arena Leaderboard
- Webpilot - Copilot for web. Allows you to have free-form conversations with web pages or engage in automatic arguments with other users.
- Attention - Visualizing attention for LLM users.
- WorkGPT
- Python Poe API - Reverse engineered API wrapper for Quora's Poe, which gives you free access to OpenAI's ChatGPT and GPT-4, as well as Anthropic's Claude.
- yay - Interact with OpenAI API from command line.
- RWKV: Reinventing RNNs for the Transformer Era (2023) (HN)
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (2023) (Code)
- OpenAI Manager - Speed up your OpenAI requests by balancing prompts to multiple API keys.
- GirlfriendGPT - Python project to build your own AI girlfriend using ChatGPT 4.0.
- TheoremQA: A Theorem-driven Question Answering dataset (2023) (Code)
- GPT Code UI - Open source implementation of OpenAI's ChatGPT Code interpreter.
- airoboros - Customizable implementation of the self-instruct paper.
- Glass - Domain-specific language for interacting with language models.
- Fondant - Sweet data-centric foundation model fine-tuning.
- OpenLLaMa on AWS Lambda
- GPT Tokenizer - JavaScript BPE Encoder Decoder for GPT-2 / GPT-3 / GPT-4.
- AlpacaFarm - Simulation Framework for RLHF and alternatives.
- RAVE-Latent Diffusion - Generate new latent codes for RAVE with Denoising Diffusion models.
- DiffusionNER: Boundary Diffusion for Named Entity Recognition (2023) (Code)
- LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model (2023) (Code)
- RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text (2023)
- minicons - Utility for analyzing Transformer based representations of language.
- InfiniteGPT - Python script that lets you input an unlimited size text into the OpenAI API. No more tedious copy & pasting. Long live multithreading.
- Why the Original Transformer Figure Is Wrong, and Some Other Interesting Historical Tidbits About LLMs (2023) (HN)
- ChainForge - Open-source visual programming environment for battle-testing prompts to LLMs.
- HelpHub - GPT chatbot for any site. (HN)
- How to Finetune GPT Like Large Language Models on a Custom Dataset (2023) (HN)
- LoopGPT - Modular Auto-GPT Framework.
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
- A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models (2023) (HN)
- Voyager: An Open-Ended Embodied Agent with Large Language Models (2023) (HN) (Code)
- The False Promise of Imitating Proprietary LLMs (2023) (HN)
- MPT-Play - Command-line script for inferencing from models such as MPT-7B-Chat.
- State of GPT (2023)
- Gorilla: Large Language Model Connected with Massive APIs (2023) (Code) (HN)
- iX - Autonomous GPT-4 Agent Platform.
- Hard stuff when building products with LLMs (2023) (HN)
- Making your LLM run Python code with llm-chain-tools in Rust (2023)
- PrivateGPT - App to interact privately with your documents using the power of GPT, 100% privately, no data leaks.
- Salute - Build AI agents in a declarative way. Designed to be easy to use for both humans and AIs.
- UnlimitedGPT - Python wrapper for OpenAI's ChatGPT API.
- Macaw-LLM - Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration.
- Multi-Modal LangChain agents in Production
- PandaGPT - One Model To Instruction-Follow Them All.
- Redco - Distributed LLM training with a single line of code.
- Ask HN: What's the best self hosted/local alternative to GPT-4? (2023)
- localGPT - Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
- SuperAGI - Build and run useful autonomous agents.
- Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance (HN)
- Scaling Data-Constrained Language Models (2023)
- Awesome Multimodal LLM
- PRM800K: A Process Supervision Dataset
- Large Language Models as Tool Makers (2023) (Code)
- Landmark Attention: Random-Access Infinite Context Length for Transformers (2023) (Code)
- Resources to Help Global Equality for PhDs in NLP / AI
- Aviary - Evaluate multiple LLMs easily.
- Falcon 40B LLM (which beats Llama) now Apache 2.0 (HN)
- Ambrosia - Clean up your LLM datasets.
- Langchain, Pinecone, and GPT with Next.js - Full Stack Starter
- Can AI Code? - Self-evaluating interview for AI coding models.
- Leaked Prompts
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration (2023) (Code)
- h2oGPT - Open-source GPT with document and image Q&A, 100% private chat, no data leaks.
- AIAvatarKit - Building AI-based conversational avatars lightning fast.
- Chainlit - Build Python LLM apps in minutes.
- CodeTF - One-stop Transformer Library for State-of-the-art Code LLM.
- A Minimal LLM API Starterkit with FastAPI and LangChain (2023)
- Pare - API for LLMs.
- BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks (2023) (Code)
- InternLM - Multilingual foundational language model with 104B parameters.
- LLaMa 7b with CUDA acceleration implemented in Rust. Minimal GPU memory needed.
- llmchain - Rust + Large Language Models - Make AI Services Freely and Easily. Inspired by LangChain.
- tiktoken-rs - Ready-made tokenizer library for working with GPT and tiktoken.
- gptee - LLMs done the UNIX-y way.
- femtoGPT - Pure Rust implementation of a minimal Generative Pretrained Transformer.
- Bytes Are All You Need: Transformers Operating Directly On File Bytes (2023) (HN)
- LLamaFlow - Typescript-first prompt engineering toolkit for working with chat based LLMs.
- Awesome Graph LLM
- What are embeddings - Deep dive into embeddings starting from fundamentals.
- GPT best practices (HN)
- The Falcon has landed in the Hugging Face ecosystem (2023)
- Reverse engineered API for Quora Poe
- Gopilot - 290M-parameter language model trained exclusively on Go code using a small research budget (~$100).
- Proxy GPT - GPT backend powered by OpenAI.
- Understanding GPT Tokenizers (2023) (HN)
- Agnaistic - AI Agnostic (Multi-user and Multi-bot) Chat with Personalised Characters. Designed with scale in mind.
- FileGPT - Start a chat with any document with Ada Embedding and Davinci Completion.
- Efficient Long Sequence Modeling via State Space Augmented Transformer (2022) (Code)
- Daneel - Template for an OpenAI chat bot app, built with React, Tailwind and TypeScript.
- RedPajama 7B now available, instruct model outperforms all open 7B models on HELM benchmarks (2023)
- ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models (2023) (Code)
- Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method (2023) (Code)
- ExLlama - More memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
- Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs (2023) (Code)
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (2023) (Code)
- How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources (2023) (Code)
- GPT Engineer - Specify what you want it to build, the AI asks for clarification, and then builds it. (HN)
- GPT4Free TypeScript Version
- Poe OpenAI Proxy
- ExpertLLaMA - Open source ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability.
- OpenLLM - Open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease. (HN)
- ChatGLM-finetune-LoRA
- Llama.cpp: Full CUDA GPU Acceleration (HN)
- Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (2023) (HN)
- garak - Security probing tool for LLMs.
- JS tokenizer for LLaMA based LLMs (HN)
- LLaMA-TRL - Fine-tuning LLaMA with PPO and LoRA.
- Native JSON Output from GPT-4 (2023) (HN)
- SqueezeLLM: Dense-and-Sparse Quantization (2023) (Code)
- TruLens - Evaluation and Tracking for LLM Experiments.
- WizardLM - Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning.
- PromptForge - AI assistant for prompt engineers.
- ChainFury - Build complex chat apps using LLMs in 4 clicks.
- RecurrentGPT - Official Code for Paper: RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text.
- SillyTavern - LLM Frontend for Power Users.
- Explore large language models on any computer with 512MB of RAM
- smol logger - Minimal viable logger for Prompt/LLM Engineering.
- The Secret Sauce behind 100K context window in LLMs: all tricks in one place (2023)
- OpenLLaMA 13B Released (HN)
- BigTrans - Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages.
- simpleaichat - Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
- Floneum - Graph editor for local AI workflows. (Intro) (HN)
- Full Parameter Fine-tuning for Large Language Models with Limited Resources (2023) (Code)
- vLLM - Easy, fast, and cheap LLM serving for everyone.
- Emerging architectures for LLM applications (2023) (HN)
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (HN)
- Textbooks Are All You Need (2023) (HN)
- Navigating Sharp Edges in OpenAI's Function Call Feature (2023)
- Pruning by Weights and Activations
- Rift - AI-native language server for your personal AI software engineer.
- Large Language Model Course
- reliableGPT - Stop OpenAI Errors in Production.
- Promptrix - Prompt layout engine for Large Language Models.
- embedchain - Framework to easily create LLM powered bots over any dataset.
- LLM-based code completion engine
- LLM Fine Tuning Guide for Enterprises in 2023
- IntelliNode - Unified access to various AI models, such as ChatGPT, Diffusion, Cohere, and others, using a few JS lines.
- LMFlow - Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.
- LLaMA Server - Combines the power of LLaMA C++ with the beauty of Chatbot UI.
- MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
- Pezzo - Open-source AI development toolkit designed to streamline prompt design, version management, publishing, collaboration, troubleshooting, observability and more.
- LangChain Decorators - Syntactic sugar for LangChain.
- LangChain Guide to Get Started (2023) (HN)
- MPT-30B: Raising the bar for open-source foundation models (2023)
- From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought (2023)
- The Magic of Embeddings (2023)
- What are embeddings? (HN)
- ChatHN - Chat with Hacker News using natural language. Built with OpenAI Functions and Vercel AI SDK.
- LLM Powered Autonomous Agents (2023) (HN)
- Attempting Large Code Refactor using LLMs (Lobsters)
- XGen-7B, a new 7B foundational model trained on up to 8K length for 1.5T tokens (2023) (HN)
- Gorilla CLI - LLMs for CLI including K8s/AWS/GCP/Azure/sed and 1500 APIs. (HN)
- How long can open-source LLMs truly promise on context length? (2023) (HN)
- Training LLMs with AMD MI250 GPUs and MosaicML (2023)
- GPT-Migrate - Easily migrate your code from one framework or language to another. (HN)
- Awesome LLM Compression
- XGen - Salesforce open-source LLMs with 8k sequence length.
- C++ GPT-2 inference engine (HN)
- HeimdaLLM - Use LLMs to construct trusted output from untrusted input.
- Swarms - Automating all digital activities with AI Agents.
- AnythingLLM - Full-stack application that turns any documents into an intelligent chatbot with a sleek UI and easier way to manage your workspaces.
- LongChat - Supports training and evaluating long-context LLM based chatbots.
- Falcon LLM - Helper scripts and examples for exploring the Falcon LLM models.
- InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback (2023) (Code)
- OpenLLMs: Less is More for Open-source Models
- Generating Images with Multimodal Language Models
- InternLM - 7 billion parameter base model, a chat model tailored for practical scenarios and the training system. (HN)
- LongNet: Scaling Transformers to 1B Tokens (2023) (HN)
- LlamaIndex - Data Framework for LLM Applications. (HN)
- LiteChain - Build robust LLM applications with true composability.
- Langchain Is Pointless (2023) (HN)
- LLM CLI tool now supports self-hosted language models via plugins (2023)
- Hacking LangChain for fun and profit (2023) (HN)
- Claude 2 (2023) (HN)
- GPT-Prompt-Engineer (HN)
- Classifying customer messages with LLMs vs traditional ML (2023) (HN)
- Auto-Evaluator - App to evaluate the performance of question-answering LLM chains. (Web)
- AI Companion with Memory - Lightweight stack to create and host your own AI companions.
- OpenAI app using OpenAI GPT Plugins and Replicate to combine all AI APIs into one
- local.ai - Desktop app for local, private, secured AI experimentation.
- Shikra - Unleashing Multimodal LLM’s Referential Dialogue Magic.
- Blockwise Parallel Transformer for Long Context Large Models (2023) (Code)
- Mining Meaningful Methods from Large Language Models (2023)
- Ziplm - Gzip-Backed Language Model. (HN)
- Aya: An open science project to build open multilingual models and datasets (2023)
- Tinygrad + rusticl + aco: why not? (2023) (HN)
- magentic - Seamlessly integrate LLMs as Python functions.
- lambeq - High-level Python library for Quantum Natural Language Processing.
- GPT Researcher - GPT based autonomous agent that does online comprehensive research on any given topic.
- Managing LLM Context Is a Knapsack Problem (2023)
- "Attention", "Transformers", in Neural Network "Large Language Models" (2023)
- Awesome Efficient LLM
- FastEdit - Editing large language models within 10 seconds.
- Claude 2 Internal API Client and CLI (HN)
- CLIPascene: Scene Sketching with Different Types and Levels of Abstraction (Code)
- apca - Crate for interacting with the Alpaca API at alpaca.markets.
- Petals Chat - Chatbot web app + HTTP and WebSocket endpoints for LLM inference with the Petals client.
- LLM Training Puzzles
- The Future of LLMs with Arthur, MosaicML, LangChain, and Weaviate (2023)
- Cody - Code AI with codebase context.
- Planting a SEED of Vision in Large Language Model
- Automorphic - Structured output from LLMs without reprompting. (Code) (HN)
- Large Language Models Are Human-Level Prompt Engineers
- Copy is all you need (2023) (HN)
- Llama 2 – Meta AI (2023) (HN)
- AutoChain - Build lightweight, extensible, and testable LLM Agents. (HN)
- Llama 2 Fine-tuning / Inference Recipes and Examples
- llm-replicate - LLM plugin for models hosted on Replicate.
- llm-mpt30b - LLM plugin adding support for the MPT-30B language model.
- llm-palm - Plugin for LLM adding support for Google's PaLM 2 model.
- model-catalog - Collection of standardized JSON descriptors for Large Language Model (LLM) files.
- LLaMA 2 Chatbot App
- LLM Engine - Open source engine for fine-tuning large language models.
- Accessing Llama 2 from the command-line with the llm-replicate plugin (2023) (HN)
- LightGlue - Local Feature Matching at Light Speed.
- Cursive - Intuitive LLM framework.
- Llama 2: an incredible open LLM (2023)
- MiniGPT4 in C++ - 4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML. (HN)
- LLaMA Efficient Tuning - Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan).
- RealChar - Create, customize and talk to your AI character/companion in real time.
- promptfoo - Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
- OpenCompass - LLM evaluation platform, supporting a wide range of models.
- LangSmith - Debugging, testing, evaluating, and monitoring for LLM applications.
- ACL 2023 Tutorial: Retrieval-Based Language Models and Applications (2023) (HN)
- ZodGPT - Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.
- Ollama - Run, create, and share large language models (LLMs). (HN)
- TypeChat (HN) (Code)
- LLM-Blender - Ensembling LLMs with Pairwise Ranking & Generative Fusion.
- LLM API - Fully typed & consistent chat APIs for OpenAI, Anthropic, Azure's chat models for browser, edge, and node environments.
- CopilotKit - Add a powerful & hackable copilot to any app, in an afternoon.
- llmss - LLM simple serving (tensor model parallel, pubsub, grpc).
- BabyAGI UI - Make it easier to run and develop with babyagi in a web app, like a ChatGPT.
- Code Review GPT - Personal code reviewer powered by LLMs.
- mPLUG-Owl - Modularization Empowers Large Language Models with Multimodality.
- Langroid - Harness LLMs with Multi-Agent Programming.
- In the LLM space, "open source" is being used to mean "downloadable weights" (2023) (HN)
- Llama: Add grammar-based sampling (HN)
- LightLLM - Python-based LLM inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
- FreeWilly 1 and 2, two new open-access LLMs (2023) (HN)
- LLaMA 2 - Every Resource you need (2023)
- Lemon AI - Gateway to empower LLM agents to interact with the world.
- Llama2.c - Inference llama 2 in one file of pure C. (HN)
- What We Know About LLMs (2023) (HN)
- Marsha - Functional, higher-level, English-based programming language that gets compiled into tested Python software by an LLM. (HN)
- Comprehensive guide to running Llama 2 locally (2023) (HN)
- LLAMA.go - Native Go version of llama2.c.
- llama2 in Rust
- LLaMA2-Accessory - Open-source Toolkit for LLM Development.
- EasyEdit - Easy-to-use Framework to Edit Large Language Models.
- LLaMA2 WebUI - Run Llama 2 locally with gradio UI on GPU or CPU from anywhere.
- litellm - Lightweight package to simplify LLM API calls - Azure, OpenAI, Cohere, Anthropic. Manages input/output translation.
- How to scale LLMs better with an alternative to transformers (2023) (HN)
- MetaGPT - Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo.
- llama2.rs - Inference Llama 2 in one file of pure Rust.
- Universal and Transferable Attacks on Aligned Language Models
- Chidori - Reactive runtime for building durable AI agents.
- LLM Reading List
- A simple guide to fine-tuning Llama 2 (2023)
- llm - Access large language models from the command-line. (Docs)
- Large language models, explained with a minimum of math and jargon (2023)
- llama2.go - Go port of llama2.c.
- Alpaca Eval Leaderboard (HN)
- ResearcherGPT
- Llama2 LLM ported to Rust burn
- Awesome LLM Security
- FacTool - Fact-checking tool that detects factual errors.
- Prem - Unified environment to develop AI applications and deploy AI models on your infrastructure.
- Edmonbrain - Langchain driven project to create flexible LLM bots on Google Cloud Platform.
- Google Bard CLI in Rust
- LLMFlows - Simple, Explicit and Transparent LLM Apps.
- GrammarLLM - Grammar sampling with open source LLMs.
- llama2 - One-file Rust implementation of Llama2 that works pretty well. (HN)
- llama2.zig - Inference Llama 2 in one file of pure Zig.
- ToolBench - Open platform for training, serving, and evaluating large language model for tool learning.
- Text Generation Inference - Rust, Python and gRPC server for text generation inference.
- Awesome AI Agents
- LLMs walking the code graph (2023)
- Ax - Comprehensive AI framework for TypeScript.
- Sweep - AI junior developer. (HN)
- Awesome LLM-Powered Agent
- Llama2.jl - Llama2.c but in Julia.
- PromptTools - Open-source tools for evaluating LLMs and vector DBs. (HN)
- Run Llama 2 on your own Mac using LLM and Homebrew (2023) (HN)
- Alfred-40B, an OSS RLHF version of Falcon40B (2023) (HN)
- LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
- Gentopia - Build AGI through Interaction of Specialized Agents.
- NewHope: Harnessing 99% of GPT-4's Programming Capabilities
- Patterns for Building LLM-based Systems & Products (2023) (HN)
- BTLM-3B-8k: Small LLM that fits within 3GB of memory
- Run Llama 2 Uncensored Locally (2023) (HN)
- LocalLlama Reddit
- Why Gzip-KNN Works: The LZ77 Factor in Text Classification (2023)
- Catching up on the weird world of LLMs (2023) (Talk)
- wikivec2text - Simple embedding -> text model trained on a small subset of Wikipedia sentences.
- Add an AI Code Copilot to your product using GPT-4 (2023) (HN)
- ad-llama - Structured inference with LLaMa 2 in your browser.
- Gdansk AI - Full stack AI voice chatbot (speech-to-text, LLM, text-to-speech).
- wikid - Generate a SQLite database from Wikipedia & Wikidata dumps.
- OpenPipe - Test and deploy your llm prompts in a data-driven way on an open-source and self-hostable platform.
- Non-determinism in GPT-4 is caused by Sparse MoE (2023) (HN)
- Rust and LLM AI Infrastructure: Embracing the Power of Performance (2023)
- Mass Editing Memory in a Transformer (2023) (HN)
- Llama 2 Powered By ONNX (HN)
- LangChain x Pinecone: Supercharging Llama-2 with RAG (2023)
- Cria - OpenAI compatible API for serving LLAMA-2 model.
- Megatron LLM - Distributed trainer for LLMs.
- Multimodal Neurons in Pretrained Text-Only Transformers (2023)
- What's new in Llama 2 and how to run it locally (2023) (HN)
- MTEB: Massive Text Embedding Benchmark
- FlagEmbedding - Open-source Embeddings.
- Knit - Better LLM Playground. (HN)
- Rectified Rotary Position Embeddings (ReRoPE)
- ConformalLLM - Extending Conformal Prediction to LLMs.
- GPT-4 can't reason (2023) (HN)
- Agentflow - Complex LLM Workflows from Simple JSON.
- Chat with your data using OpenAI, Pinecone, Airbyte and Langchain (HN)
- Swift Chat and Language Model Tester - Shows how to integrate swift-transformers in a Swift app.
- A Simple and Effective Pruning Approach for Large Language Models (2023) (Code)
- AgentBench - Comprehensive Benchmark to Evaluate LLMs as Agents.
- Llama from Scratch (or how to implement a paper without crying) (2023) (HN)
- MTEB Leaderboard (Tweet)
- Fine-Tune LLaMA 2 with QLoRA (Tweet)
- Stack More Layers Differently: High-Rank Training Through Low-Rank Updates (2023) (Code)
- Generative Agents: Interactive Simulacra of Human Behavior
- LangChain "Advanced Retrieval" Webinar (2023)
- Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Custom Models (2023) (HN)
- liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching (HN)
- Beginner's Guide to Llama Models (2023) (HN)
- Awesome LLMOps
- ts_zip: Text Compression using Large Language Models (HN)
- Platypus - Series of fine-tuned variants based on the LLaMA and LLaMA 2 transformer architectures.
- Cumulative Reasoning with Large Language Models
- ggml.js - Run LLaMa2 on the Browser with Ggml.js. (Demo) (HN)
- Ask HN: I learned useless skill of prompt engineering, how relevant will it be? (2023)
- Agent Protocol - Common interface for interacting with AI agents.
- ContinualLM - Extensible Continual Learning Framework Focused on Language Models.
- Lit-GPT - Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT.
- Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies (2023) (Code)
- Council - Open-source platform for the rapid development and robust deployment of customized generative AI applications.
- Go TypeChat - Library that makes it easy to build natural language interfaces using types.
- Large Language Model Training Handbook
- Continuous batching to increase LLM inference throughput and reduce p50 latency (2023) (HN)
- LlaMA2 Embeddings FastAPI Server (HN)
- XLang Paper Reading - Paper collection on building and evaluating language model agents via executable language grounding.
- j-dev - Helps write code in an existing project under the command of an OpenAI GPT model.
- Using GPT-4 for content moderation (2023)
- Evaluating Language-Model Agents on Realistic Autonomous Tasks (2023)
- How Is LLaMa.cpp Possible? (2023) (HN)
- AI00 RWKV Server - Inference API server based on the RWKV model.
- AlpacaEval - Automatic Evaluator for Instruction-following Language Models.
- Neurips 1 LLM 1 GPU Challenge
- Ask HN: If we train an LLM with “data” instead of “language” tokens (2023)
- LlamaGPT - Self-hosted, offline, private AI chatbot, powered by Llama 2. (HN)
- Open Challenges in LLM Research (2023) (HN)
- DeepEval - Unit Testing for LLMs. (HN)
- Reading list of hallucination in LLMs
- Running my own LLM (2023)
- The Mathematics of Training LLMs (2023) (HN)
- Large Language Models are Zero-Shot Rankers for Recommender Systems (2023) (Code)
- Anti-hype LLM reading list (HN)
- You probably don’t need to fine-tune LLMs (2023)
- Poozle - Open Source Plaid for LLMs. (Web) (HN)
- Why GPT-3.5 is (mostly) cheaper than Llama 2 (2023)
- Bench - Tool for evaluating LLMs for production use cases.
- Opportunities and Risks of LLMs for Scalable Deliberation with Polis (2023)
- Why You (Probably) Don't Need to Fine-tune an LLM (2023) (Tweet)
- GPT-3.5 Turbo fine-tuning and API updates (2023) (HN)
- OpenCopilot - Build and embed open-source AI Copilots into your product with ease.
- Extending LLM Context Length
- Fast vector similarity using Rust and Python (HN)
- Lemur: The State-of-the-art Open Pretrained Large Language Models Balancing Text and Code Capabilities
- OpenAI Detector - OpenAI classifier for indicating AI-written text.
- Graph of Thoughts (GoT) implementation
- GPT Pilot - How can GPT-4 be utilized to generate working apps.
- fairseq2: FAIR Sequence Modeling Toolkit 2
- SeamlessM4T - Foundational Models for State-of-the-Art Speech and Text Translation.
- ccserver - LSP server leveraging LLMs for code completion.
- prompt2model - Generate Deployable Models from Natural Language Instructions.
- Code Llama - Inference code for CodeLlama models. (HN) (Article) (HN)
- Inspecting and Editing Knowledge Representations in Language Models (2023) (Code)
- Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B (2023) (HN)
- go-llama.cpp - LLama.cpp golang bindings.
- Active Prompting with Chain-of-Thought for Large Language Models (2023) (Code)
- WebAI to API - ChatGPT, Claude, Bard to API.
- Graph of Thoughts: Solving Elaborate Problems with Large Language Models (2023) (HN)
- Awesome Visual Question Answering
- Build a chatbot with custom data sources, powered by LlamaIndex (2023)
- LLM.report - Open-source logging and analytics platform for OpenAI: Log your ChatGPT API requests, analyze costs, and improve your prompts.
- How Llama 2 learned how to code
- Fine-Tuning Embedding for RAG with Synthetic Data
- DoctorGPT - Advanced LLM prompting for PDFs and webpages.
- Lilac - Analyze, structure and clean unstructured data with AI.
- Llama-X - Open Academic Research on Improving LLaMA to SOTA LLM.
- Chatbox - Desktop app for multiple cutting-edge LLM models.
- GodMode - AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2.
- api2ai - Create API agents from OpenAPI Specs.
- Functionary - Chat language model that can interpret and execute functions/plugins.
- Out-of-the-box Large Language Model for Open Domain Sequence Understanding (2023)
- Experiments on speculative sampling with Llama models
- PMET: Precise Model Editing in a Transformer (2023) (HN)
- Personal assistant - Multiplatform app to run and chat with an AI locally.
- rust_llama.cpp - LLama.cpp rust bindings.
- OpenCopilot - AI Copilot for your own SaaS product. Open source AI sidekick for everyone.
- Lightning Segment-Anything Model - Fine-tune Segment-Anything Model with Lightning Fabric.
- AIScripts - Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub.
- WebLLM - Llama2 in the Browser. (HN)
- Langfuse - Open source observability and analytics for LLM applications. (HN)
- Llama from scratch - Llama from scratch, or How to implement a paper without crying.
- LLM Python/CLI tool adds support for embeddings (2023) (HN)
- Open Interpreter - OpenAI's Code Interpreter in your terminal, running locally.
- TinyLlama - Pretrain a 1.1B Llama model on 3 trillion tokens.
- Survey on LLM-based Autonomous Agents
- Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc (2023) (Lobsters)
- Batched LoRAs - Maximize GPU util by routing inference through multiple LoRAs in the same batch.
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (2022) (Code)
- Paperify - Transform any document, web page, or ebook into a research paper. (HN)
- LLM Finetuning Hub - Fine tune LLMs via the Fine tuning Hub. (HN)
- Awesome Pruning
- SEC Insights - Uses the Retrieval Augmented Generation (RAG) capabilities of LlamaIndex to answer questions about SEC 10-K & 10-Q documents.
- GPTMe - Fancy CLI to interact with LLMs in a Chat-style interface, with additional capabilities like executing commands on the local machine.
- openai-chat-tokens - Estimate the number of tokens an OpenAI chat completion request will use.
- Can LLMs learn from a single example? (2023) (HN)
- OrchestrAI - Framework for building and testing custom autonomous agents.
- Open LLMetry - Open-source observability for your LLM application, based on OpenTelemetry.
- Traceloop - LLM Application Observability. (GitHub)
- Awesome LLM Prompt Reading List
- LLMs, RAG, and the missing storage layer for AI (2023) (HN)
- OnPrem.LLM - Tool for running on-premises large language models with non-public data. (HN)
- Falcon 180B (2023) (HN)
- Running a 180B parameter LLM on a single Apple M2 Ultra (2023) (HN)
- thiggle - Structured LLM APIs. (API code)
- ChatDev - Communicative Agents for Software Development.
- Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration (2023) (Code)
- What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
- llmonitor - Open-source monitoring & analytics for AI apps and agents.
- Rivet - AI agent and prompt chaining IDE and library.
- LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (2023)
- Mind2Web: Towards a Generalist Agent for the Web (2023) (Code)
- Emoji Generator with AI (HN)
- Asking 60 LLMs a set of 20 questions (2023) (HN) (Reddit)
- What models are you running? (2023)
- Prompt Refine - Helps you run better prompt experiments.
- Lindy.ai - Your AI Executive Assistant.
- LLM Training: RLHF and Its Alternatives (2023)
- AgentVerse - Framework for Multi-LLM Environment Simulation.
- Open Interpreter Docker
- SemanticFinder - Front end live semantic search with transformers.js.
- Awesome Code LLM
- LLM plugin for clustering embeddings