Databases
I currently prefer Grafbase or PlanetScale as my data store for web apps, and SQLite when I need a local DB. Turso is nice for deploying SQLite.
In a perfect world, EdgeDB gets replication for lower latency. It builds on top of Postgres and has an amazing query language and declarative schema modeling. Maybe EdgeDB can incorporate Neon's work on multi-cloud, autoscaling Postgres.
Database Internals & Designing Data-Intensive Applications are great books on databases.
Metabase is a great SQL queries visualizer. Database access optimization doc is a good read.
Ditto, Replicache & Tuple Database are fascinating tools for syncing local state for fast network-bound operations using CRDTs and other tools. PlanetScale seems awesome too.
I'm also trying out Cozo Graph DB; it has many sleek features.
trustfall is a great query engine for all kinds of data sources. Atlas is a nice DB tool.
toyDB & minikeyvalue have nice code to study.
Slashbase is a nice GUI app to talk with databases.
LSM in a Week, TinyKV Course & How Query Engines Work are great resources.
Interesting
- PlanetScale - Database for Developers. (HN) (Release Article)
- Materialize - Streaming SQL Database powered by Timely Dataflow. (Web)
- Tuple Database - Local-first, "end-user database" database. Embedded FoundationDB. Reactive indexable graph database. (Tweet)
- GreptimeDB - Open-source, cloud-native, distributed time-series database. (Web) (Article) (Design) (GreptimeDB Storage Engine Design)
- LMDB Store - Simple, efficient, ultra-fast, scalable data store wrapper for LMDB.
- Mentat - Persistent, relational store inspired by Datomic and DataScript.
- RocksDB - Persistent Key-Value Store for Flash and RAM Storage. (Rocksplicator - RocksDB Replication)
- TerarkDB - RocksDB compatible KV storage engine with better performance. (HN)
- LevelDB - Fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
- GoLevelDB - LevelDB key/value database in Go.
- NodeLevelDB - Fast & simple storage. A Node.js-style LevelDB wrapper for Node.js, Electron and browsers. (Awesome)
- TiDB - Distributed HTAP database compatible with the MySQL protocol. (Awesome)
- TiKV - Distributed transactional key-value database, originally created to complement TiDB. (pd - Placement driver for TiKV) (TiKV Rust Client)
- Noria - Dynamically changing, partially-stateful data-flow for web application backends. (Paper) (Notes)
- RethinkDB - Pushes JSON to your apps in realtime.
- RedixDB - Persistent real-time key-value store that uses the Redis protocol, with powerful features.
- AresDB - GPU-powered real-time analytics storage and query engine.
- Sophia - Modern transactional key-value/row storage library.
- Bolt - Embedded key/value database for Go.
- InfluxDB - Scalable datastore for metrics, events, and real-time analytics. (Web)
- EdgeDB - Open-source object-relational database built on top of PostgreSQL. (EdgeDB 1.0 Beta) (Web) (GitHub) (Talk)
- Cassandra - Highly-scalable partitioned row store. Rows are organized into tables with a required primary key.
- Scylla - Drop-in Apache Cassandra alternative big data database that powers your applications with ultra-low latency and extremely high throughput, while reducing TCO to a fraction of most NoSQL databases. (Code) (Scylla University)
- JanusGraph - Open-source, distributed graph database. (Web) (HN)
- DuckDB - Embeddable SQL OLAP Database Management System. (HN) (Playing With DuckDB) (5 minute intro) (Web)
- DuckDB-Wasm - Efficient Analytical SQL in the Browser. (HN) (Code)
- sled - Modern embedded database. (sledtool - CLI tool to work with Sled key-value databases)
- Genji - Document-oriented, embedded SQL database, works with Bolt, Badger and memory. (Web)
- Atlas - In-memory dimensional time series database.
- Pebble - RocksDB/LevelDB inspired key-value database in Go. (Intro article) (HN) (Pebble vs RocksDB: Implementation Differences)
- LogDevice - Distributed storage for sequential data.
- SQLite - C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.
- better-sqlite3 - Fastest and simplest library for SQLite3 in NodeJS.
- gStore - Graph database engine for managing large graph-structured data.
- MongoDB - General purpose, document-based, distributed database built for modern application developers.
- Ardb - High Performance Persistent NoSql, Full Redis-Protocol Compatibility.
- Datahike - Durable datalog implementation adaptable for distribution. (Web) (CSV Loader)
- Yugabyte DB - High-performance distributed SQL database for global, internet-scale apps.
- JuliaDB - Parallel analytical database in pure Julia. (Docs) (HN)
- Delta Lake - Storage layer that brings scalable, ACID transactions to Apache Spark and other big-data engines.
- M3 - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform. Prometheus compatible. (Web)
- WatermelonDB - Reactive & asynchronous database for powerful React and React Native apps.
- Neo4j - High performance graph store with all the features expected of a mature and robust database, like a friendly query language and ACID transactions.
- Dgraph - Horizontally scalable and distributed graph database, providing ACID transactions, consistent replication and linearizable reads.
- MeiliDB - Full-text search database based on the fast LMDB key-value store.
- CrateDB - Distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.
- Riak - Distributed, decentralized data storage system.
- CockroachDB - Open source, cloud-native SQL database. (CockroachDB: The Resilient Geo-Distributed SQL Database) (HN)
- ActorDB - Distributed SQL database.
- ksqlDB - Event streaming database purpose-built for stream processing applications. (HN) (Code)
- OmniSciDB - Open source SQL-based, relational, columnar database engine that leverages the full performance and parallelism of modern hardware.
- Sonnerie - Simple timeseries database.
- Dolt - Git for Data. SQL database that you can fork, clone, branch, merge. (HN) (Web) (HN)
- Crux - Open source document database with bitemporal graph queries. (Website) (Article) (HN)
- LokiJS - Document oriented database written in JavaScript.
- terrier - Carnegie Mellon's new database system project that is replacing Peloton.
- Nebula Graph - Open-source graph database capable of hosting super large scale graphs with dozens of billions of vertices (nodes) and trillions of edges, with milliseconds of latency. (HN) (Active Fork)
- SeaTable - Online lightweight database with a spreadsheet interface. (Code)
- Ceph - Distributed object, block, and file storage platform.
- Vitess - Database clustering system for horizontal scaling of MySQL through generalized sharding. (Web)
- MinIO - High Performance, Kubernetes Native Object Storage. (Web) (GitHub) (MinIO: A Bare Metal Drop-In for AWS S3) (MinIO Console) (MinIO Operator)
- Memory-Efficient Search Trees for Database Management Systems (2020) (HN)
- ShareDB - Realtime database backend based on Operational Transformation (OT).
- Irmin - Distributed database built on the same principles as Git. (Code) (HN)
- Noms - Decentralized database philosophically descendant from the Git version control system.
- SwayDB - Fast embeddable persistent and in-memory key-value storage engine that provides storage as simple data structures - Map, Set & Queue.
- TrailDB - Efficient tool for storing and querying series of events.
- QuestDB - Relational database with ultimate time-series performance. (HN)
- Prometheus - Systems and service monitoring system.
- Akumuli - Time-series database.
- SSDB - Redis compatible NoSQL database stored on disk.
- minikeyvalue - Distributed key value store in under 1000 lines. (HN)
- Bedrock - Simple, modular, WAN-replicated, Blockchain-based data foundation for global-scale applications. (Web)
- TerminusDB - Full featured in-memory graph database management system with a rich query language. (Code) (HN) (GitHub)
- WhiteDB - Lightweight database library operating fully in main memory. Disk is used only for dumping/restoring database and logging.
- FaunaDB - Database built for serverless, featuring native GraphQL.
- ImmuDB - Lightweight, high-speed immutable database for systems and applications. Written in Go. (HN) (HN)
- NutsDB - Simple, fast, embeddable, persistent key/value store written in pure Go.
- remoteStorage - Open protocol for per-user storage on the Web.
- TimescaleDB - Open-source database built for analyzing time-series data with the power and convenience of SQL. (timescaledb-tune) (HN) (2.0 release) (HN)
- Timescale Cloud (HN) (HN)
- ClickHouse - Open-source column-oriented database management system that allows generating analytical data reports in real time. (How ClickHouse Saved our Data) (HN) (Faster ClickHouse Imports) (HN: ClickHouse, Inc.) (Article) (HN)
- ArangoDB - Natively store data for graph, document and search needs. Utilize feature-rich access with one query language. (Go Driver) (Arangolite - Go Driver) (Python Driver) (Feed)
- LiteStore - Lightweight, self-contained, RESTful, multi-format NoSQL document store server written in Nim and powered by a SQLite backend for storage.
- RecallGraph - Versioning data store for time-variant graph data. (HN)
- Apache Pinot - Realtime distributed OLAP datastore. (Code)
- Apache Ignite - Horizontally scalable, fault-tolerant distributed in-memory computing platform for building real-time applications that can process terabytes of data with in-memory speed.
- TileDB - Storage Engine for Data Science.
- Pravega - Open source distributed storage service implementing Streams. It offers Stream as the main primitive for the foundation of reliable storage systems.
- libmdbx - Extremely fast, compact, powerful, embedded, transactional key-value store database.
- libfpta - Ultra fast, compact, Embedded Database for tabular and semistructured data.
- Realm - Mobile database that runs directly inside phones, tablets or wearables.
- HSE - Embeddable key-value store designed for SSDs based on NAND flash or persistent memory. (Docs)
- GhostDB - Distributed, in-memory, general purpose key-value data store that delivers microsecond performance at any scale. (HN)
- Datalevin - Port of Datascript in-memory Datalog database to Lightning Memory-Mapped Database (LMDB).
- DagDB - Syncable database built on IPLD.
- MonetDB - Column-store pioneer. (Web)
- RxDB - NoSQL-database for JavaScript Applications like Websites, hybrid Apps, Electron-Apps, Progressive Web Apps and NodeJs. (HN)
- Graviton Database - Simple, fast, versioned, authenticated, embeddable key-value store database in pure Go. (HN)
- SeaweedFS - Distributed object store and file system to store and serve billions of files fast.
- IndexedDB - IndexedDB, but with promises.
- JsStore - Complete IndexedDB wrapper with SQL like syntax. (Web)
- Quadrable - Authenticated multi-version database: sparse binary merkle tree with compact partial-tree proofs.
- Manticore Search - Database designed specifically for search, including full-text search. (HN) (HN)
- Amazon QLDB - Fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log. (Awesome)
- Oxigraph - Graph database implementing the SPARQL standard.
- JavaScript Database (JSDB) - Transparent, in-memory, streaming write-on-update JavaScript database for Small Web applications that persists to a JavaScript transaction log. (Intro) (Lobsters)
- Cete - Distributed key value store server written in Go built on top of BadgerDB.
- NoisePage - Self-Driving Database Management System. (Code) (HN)
- Sir.DB - Git-diff-able JSON database on yer filesystem. (HN)
- Bigbucket - Serverless NoSQL database with a focus on scalability, availability and simplicity. It has a Bigtable-style data model with storage backed by a Cloud Storage Bucket.
- AnnaBellaDB - Proof-of-concept (PoC) network latency and access-pattern aware key-value store.
- OpenCog AtomSpace - In-RAM knowledge representation (KR) database, an associated query engine and graph-re-writing system, and a rule-driven inferencing engine that can apply and manipulate sequences of rules to perform reasoning. (Web)
- Sybil - Append only analytics datastore with no up front table schema requirements. Just log JSON records to a table and run queries.
- Comdb2 - Clustered RDBMS built on Optimistic Concurrency Control techniques.
- Arctic - High performance datastore for time series and tick data.
- Warp 10 - Open Source Geo Time Series Platform designed to handle data coming from sensors, monitoring systems and the Internet of Things. (Web)
- Eva - Distributed database-system implementing an entity-attribute-value data-model that is time-aware, accumulative, and atomically consistent.
- Firestore - Develop rich applications using a fully managed, scalable, and serverless document database. (Intro) (Running Google Firestore locally)
- Graphik - Identity-aware, permissioned, persistent document/graph database & pubsub server written in Go.
- AIStore - Lightweight object storage system with the capability to linearly scale-out with each added storage node and a special focus on petascale deep learning. (Web)
- DatenLord - Computing Defined Storage, an application-orientated, cloud-native distributed storage system.
- AgateDB - Embeddable, persistent and fast key-value (KV) database written in pure Rust.
- TensorBase - Modern big data warehouse with performance in its core mind. (Web)
- Redwood - Highly-configurable, distributed, realtime database that manages a state tree shared among many peers.
- Drivine - Best and fastest graph database client (Neo4j & AgensGraph) for Node.js & TypeScript. (Code) (Starter Template)
- InfiniCache - In-memory cache that is built atop ephemeral serverless functions. (HN) (Docs)
- Blazegraph - Ultra high-performance graph database supporting Blueprints and RDF/SPARQL APIs.
- Escanor - High performance key value database with useful json document indexing and manipulations.
- Condensation - General-purpose distributed data system with conflict-free synchronization, and inherent end-to-end security. (GitHub)
- ZenoDB - Go-based embeddable time series database optimized for performing aggregated analytical SQL queries on dimensional data.
- IndraDB - Graph database written in Rust.
- SteveCare - Peer-to-peer database system that enables people to build complex databases between peers, without any intermediary platform.
- CORTX - Open Source Mass-Capacity Optimized Object Store.
- PouchDB - JavaScript Database that Syncs. (Code)
- CryptDB - Database system that can process SQL queries over encrypted data. (Code)
- LBADD - Experimental distributed SQL database, written in Go.
- Baserow - Self-hosted Airtable alternative. (HN) (Code)
- KuiBaDB - Another Postgres rewritten with Rust and multi-threading.
- LeanStore - High-performance OLTP storage engine optimized for many-core CPUs and NVMe SSDs. (Web)
- KVS - Abstract Chain Database. (Code)
- Yorkie - Document store for building collaborative editing applications. (Code)
- InfluxDB IOx - Future core of InfluxDB, an open source time series database.
- EliasDB - Graph-based database.
- Resql - SQL database server that uses SQLite as its SQL engine and it provides replication and automatic failover capabilities.
- Tarantool - In-memory computing platform. (Go client) (Lua Code)
- EventQL - Database for large-scale event analytics. (Code)
- OceanBase - Distributed, banking suitable, open-source relational database featuring high scalability and high compatibility. (HN)
- Kvrocks - Distributed key value NoSQL database based on RocksDB and compatible with Redis protocol.
- UnQLite - Embedded NoSQL, Transactional Database Engine. (Web)
- LinDB - Scalable, high performance, high availability distributed time series database. (Web)
- TimeBase - High performance time series database. (Web)
- Greenplum Database - Advanced, fully featured, open source data warehouse, based on PostgreSQL. (Web)
- JinDB - Small relational database engine written in Rust with the standard library and no external dependencies.
- Memgraph - In-Memory Cypher Graph Database.
- LemonGraph - Log-based transactional graph (nodes/edges/properties) database engine that is backed by a single file.
- Go SQL DB - Relational database that supports SQL queries for research purposes in Go.
- Skizze - Probabilistic data structure service and storage.
- Skytable - Extremely fast, secure and reliable real-time NoSQL database with automated snapshots and TLS. (Web)
- IceFireDB - Distributed disk storage database based on Raft and Redis protocol. (HN)
- RefineDB - Strongly-typed document database that runs on any transactional key-value store.
- Engula - Cloud-native storage engine for next-generation data infrastructures. (Code)
- BerylDB - A data structure data manager that can be used to store data as key-value entries. (Docs)
- Hyrise - Research in-memory database. (Web)
- Apache Doris - Fast MPP database for all modern analytics on big data. (Code)
- Vertica - Big Data Analytics On-Premises, in the Cloud, or on Hadoop. (Getting Started with Vertica)
- Embeddinghub - Vector database built for Machine Learning embeddings. (HN)
- GQLite - Embedded graph database implemented with Rust.
- Xata - Database service for serverless apps. (HN) (GitHub) (Supabase to Xata)
- SpiceDB - Zanzibar-inspired database that stores, computes, and validates application permissions. (Article) (HN) (CLI)
- Authzed - Managed permissions database for everyone. (GitHub) (Authzed API)
- Datomic - Transactional database with a flexible data model, elastic scaling, and rich queries. (GitHub) (Replicating with Datomic)
- EdgelessDB - Open-source MySQL-compatible database for confidential computing. Runs entirely inside a secure enclave and comes with advanced features for collaboration, recovery, and access control. (Intro)
- Infinitree - Scalable and encrypted embedded database with 3-tier caching.
- Zerostash - Deduplicated, encrypted data store that provides native versioning capabilities, and was designed to secure all metadata related to the files.
- BonsaiDb - Rust-written, ACID-compliant, document-database inspired by CouchDB. (Web) (Retro One Year In) (HN) (Lobsters)
- Amazon Timestream - Fast, scalable, serverless time series database. (Tools and Samples)
- Hive - Lightweight and blazing fast key-value database written in pure Dart. (Docs)
- Couchbase Lite for iOS and MacOS - Lightweight, embedded, syncable NoSQL database engine for iOS and MacOS apps.
- Kepler - Decentralized storage based on permissioned data overlays called orbits.
- Ambry - Distributed object store that supports storage of trillions of small immutable objects (50K-100K) as well as billions of large objects.
- ChaosDB - Unauthorized Privileged Access to Microsoft Azure Cosmos DB. (Explained) (HN)
- MirDB - Persistent Key-Value Store with Memcached protocol.
- StupiDB - Built to understand how a relational database might be implemented.
- MatrixOne - Planet scale, cloud-edge native big data engine crafted for heterogeneous workloads. (Docs)
- doxa - Simple in-memory database, trying to copy the best solutions from datascript, xtdb, fulcro, autonormal and especially shadow-grove.
- Scalaris - Scalable, transactional, distributed and fault-tolerant key-value-store with strong data consistency for online databases and Web 2.0 services.
- Basenine - Schema-free, document-oriented streaming database that is optimized for monitoring network traffic in real-time.
- MeerkatDB - Distributed append-only (no UPDATE/DELETE support) eventually consistent columnar storage for events and timeseries.
- OpenMLDB - Open-source machine learning database that provides a full-stack FeatureOps solution for enterprises.
- Google F1 - Distributed transactional database. Built on Google's Spanner so that it can reach strong consistency. (Paper)
- Skate - Personal key-value store. Use it to save and retrieve anything you’d like—even binary data.
- Hazelcast - Distributed computation and storage platform for consistently low-latency querying, aggregation and stateful computation against event streams and traditional data sources. (Web)
- SimpleDB - Simple database built from scratch that has some of the basic RDBMS features like a SQL query parser, transactions, and a query optimizer. (HN)
- Garage - Lightweight S3-compatible distributed object store. (Web) (Article) (HN)
- StorageTapper - Scalable real time MySQL change data streaming, logical backup and logical replication service.
- PoloDB - Embedded JSON-based database.
- TinyDB - Lightweight document oriented database optimized for your happiness.
- CloverDB - Lightweight NoSQL database designed for being simple and easily maintainable, thanks to its small code base. Inspired by tinyDB.
- Vearch - Scalable distributed system for efficient similarity search of deep learning vectors.
- RemixDB - Read- and write-optimized concurrent KV store. Fast point and range queries. Extremely low write-amplification.
- RisingLight - OLAP database system for educational purpose.
- SurrealDB - Scalable, distributed, collaborative, document-graph database, for the real time web. (Web) (HN)
- classic-level - Abstract-level database backed by LevelDB.
- Apache Druid - Database for modern analytics applications. (Code)
- EJDB - Embeddable JSON database engine.
- SQLive - General-purpose SQL database that lets you subscribe to changes to your queries.
- LotusDB - Fast k/v database compatible with LSM tree and B+ tree.
- CnosDB - Open Source Distributed Time Series Database with high performance, high compression ratio and high usability.
- Nubostore - Data store like Firestore and Algolia all in one.
- Surge - Fastest next-gen NoSQL db.
- StarfishQL - Graph database and query engine to enable graph analysis and visualization on the web. (Web)
- SingleStore - Unified database for data-intensive applications. (Twitter) (Flexible Parallelism in SingleStoreDB) (Kysely SingleStore)
- Apache Impala - Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.
- RisingWave - Cloud-native streaming database that uses SQL as the interface language.
- RunKV - Experimental cloud-native distributed KV engine for OLTP workload.
- ATE - Distributed immutable data store with strong encryption and authentication.
- YDB - Open-source Distributed SQL Database that combines high availability and scalability with strict consistency and ACID transactions. (Web) (HN) (Python SDK) (Go SDK)
- ForestDB - Fast Key-Value Storage Engine Based on Hierarchical B+-Tree Trie.
- Realm - Mobile database: an alternative to SQLite & key-value stores. (Code)
- Instant - Graph Database on the Client.
- Apache CouchDB - Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability. (Web) (Web Code)
- PranaDB - Distributed streaming database, designed from the outset to be horizontally scalable.
- eyros - Multi-dimensional interval database.
- AntidoteDB - Planet scale, highly available, transactional database built on CRDT technology. (Web)
- DarkBird - Document-oriented, high-concurrency in-memory storage that also persists data to disk to avoid data loss.
- Apache Calcite - Dynamic data management framework. (Code)
- PolarDB-X - Cloud native distributed SQL Database designed for high concurrency, massive storage and complex querying scenarios.
- SplinterDB - Key-value store designed for high performance on fast storage devices.
- jammdb - Embedded, single-file database that allows you to store key / value pairs as bytes.
- tectonicdb - Fast, highly compressed standalone database and streaming protocol for order book ticks.
- TigerGraph - Fast and scalable graph database for the enterprise.
- CeresDB - High-performance, distributed, schema-less, cloud native time-series database that can handle both time-series and analytics workloads. (Python Client)
- StoneDB - Open-source, MySQL HTAP and MySQL-native database for OLTP and real-time analytics. (Web)
- ClientDB - Open-source in-memory database for real-time web apps. (Code)
- Kerf - Columnar tick database and time-series language for Linux/OSX/BSD/iOS/Android.
- AnnaDB - Next-generation developer-first NoSQL database.
- StarRocks - Next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query. (Web)
- TuGraph - Efficient graph database that supports high data volume, low latency lookup and fast graph analytics.
- Seafowl - Analytical database for modern data-driven Web applications.
- WooriDB - General purpose time serial database. It is a schemaless key-value store and uses its own query syntax that is similar to SPARQL.
- Apache Pegasus - Horizontally scalable, strongly consistent and high-performance key-value store. (HN)
- Tidis - Distributed transactional large-scale NoSQL database powered by TiKV.
- RonDB - Stable distribution of NDB Cluster, a key-value store with SQL capabilities.
- dobby - Homemade table-oriented (but not really relational) database engine with a modular design.
- NucliaDB - Cloud-native database for unstructured data, indexing vectors, text, paragraphs and relations.
- PhotonDB - Storage engine for modern hardware, built from scratch in Rust.
- RadonDB - Open source, Cloud-native MySQL database for unlimited scalability and performance.
- Segment - Simple and fast in-memory key-value database written in Rust.
- Reindexer - Embeddable, in-memory, document-oriented database with a high-level Query builder interface.
- AssemblageDB - Distributed Document/Graph DB for Connected Pages & Documents.
- Apache ORC - High-Performance Columnar Storage for Hadoop. (Code)
- SolomonDB - Embedded Gremlin-compatible graph database written in Rust.
- Kuzu - In-process property graph database management system (GDBMS) built for query speed and scalability.
- zgraph - Embeddable graph database for large-scale vertices and edges.
- Flink Table Store - Data lake storage for streaming updates/deletes changelog ingestion and high-performance queries in real time.
- Snowflake (Learn SnowflakeDB)
- Apache Kvrocks - Distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol. (Web Code)
- Billy - Super simple data store in Go.
- AllyDB - In-memory database similar to Redis, built using Elixir.
- RixxDB - Versioned, embedded, strongly-consistent, key-value database.
- OctoBase - Offline-available, scalable, self-contained collaborative database, which was originally designed for AFFiNE.
- Fireproof - Real time database for today's interactive applications.
- ArcticDB - High performance, serverless DataFrame database built for the Python Data Science ecosystem.
- SKDB - SQL database that tells you when your query results changed.
- HeisenbergDB - Distributed vector database.
- JunoDB - Secure, consistent and highly available key-value store. (HN)
- FlyDB - High-performance KV storage engine based on the Bitcask paper that supports the Redis protocol and the corresponding data structures.
- Weaviate - Vector database.
- ReductStore - Time series database for storing and managing large amounts of blob data.
- Supabase Vector - Open source vector toolkit for Postgres.
- Yotta Store - Next generation storage system aiming to scale out to the yottabyte range and scale up to millions of concurrent reads and writes per record. (Rust bindings)
- GlareDB - Fast SQL database for querying and analyzing distributed data.
Tools
- TablePlus - Modern, native, and friendly GUI tool for relational databases. (HN) (Issues)
- SQLiteStudio - Free, open source, multi-platform SQLite database manager.
- litecli - Command-line client for SQLite databases that has auto-completion and syntax highlighting.
- Beekeeper Studio - Query and manage your relational databases. (Code)
- Diwata - User-friendly database interface.
- Sequel Ace - MySQL/MariaDB database management for macOS. (Web)
- ExtendsClass - Online MySQL playground for testing.
- Dropbase - Turn offline files into live databases instantly. (HN)
- Synth - Create synthetic data environments in seconds. (HN)
- Baserow - Open source online database tool and Airtable alternative.
- SHMIG - Database migration tool written in BASH.
- goose - Database migration tool. Manage your database schema by creating incremental SQL changes or Go functions.
- migrate - Database migrations written in Go. Use as CLI or import as library.
- Flyway - Database Migrations Made Easy. (Tweet) (Code)
- Liquibase - Open Source Version Control for Your Database.
- gh-ost - GitHub's Online Schema Migrations for MySQL.
- Dbmate - Lightweight, framework-agnostic database migration tool.
- ShardingSphere - Distributed Database Middleware Ecosphere. (Web)
- ln2sql - NLP tool to query a database in natural language.
- Hue - Open source SQL Assistant for Data Warehouses. (Code)
- ley - Driver-agnostic database migrations.
- DBeaver - Free Universal Database Tool. (Code)
- Skeema - Schema management CLI for MySQL.
- noisepage-test - DBMS Performance & Correctness Testing Framework.
- erd - Translates a plain text description of a relational database schema to a graphical entity-relationship diagram.
- CloudBeaver - Database Management from Browser. (Code)
- DbGate - Database manager for MySQL, PostgreSQL, SQL Server and MongoDB. (Code)
- Condenser - Database subsetting tool.
- NocoDB - Turns your SQL database into a Nocode platform. Free & Open Source. (Code) (HN) (HN)
- Owoof - Program for querying and modifying information in a datalog-like format backed by SQLite.
- Autogenerate a CRUD app from a CSV file (HN)
- Gobang - Cross-platform TUI database management tool written in Rust. (HN)
- Jailer - Truly relational database client. (HN)
- dbcritic - Finds problems in a database schema.
- IceCream - Sync Realm Database with CloudKit.
- Kinto - Minimalist JSON storage service with synchronisation and sharing abilities. (Docs)
- SchemaCrawler - Free database schema discovery and comprehension tool. (Web)
- dbmigrate - PostgreSQL/SQLite/MySQL migration tool in Rust.
- Qsh - Improved database querying from your terminal. (HN)
- trona - Write DB migrations with SQL and run them with a CLI.
- Azimutt - Entity Relationship diagram (ERD) visualization tool, with various filters and inputs to help understand your SQL schema. (Code)
- Models - Tool for automated migrations for PostgreSQL, SQLite and MySQL.
- Atlas - Set of tools designed to help companies better work with their data. It includes several components that can be used individually but are designed to work very well together. (Code) (HN)
- replikativ - Open, scalable and distributive infrastructure for a data-driven community of applications. (Web) (Unified storage IO)
- Bytebase - Web-based, zero-config, dependency-free database schema change and version control management tool for developers and DBAs. (Web)
- Sequelize-Auto - Automatically generate models for SequelizeJS via the command line.
- DrawSQL - Database schema diagrams.
- SQLize - Generate MySQL/PostgreSQL Migration from Go struct and existing SQL.
- OmniDB - Web tool for database management. (Code)
- Maxwell's Daemon - Application that reads MySQL binlogs and writes row updates as JSON to Kafka, Kinesis, or other streaming platforms. (Code)
- MaxScale - Intelligent database proxy. (Docs)
- Couchbase - Modern Database for Enterprise Applications.
- Couchbase Mobile - SQLite Alternative. (C++ Client)
- Morph - Database migration tool that helps you to apply your migrations. Written with Go.
- loadgen - Generate database load.
- Sqitch - Database change management application.
- Antares SQL - Modern, fast and productivity driven SQL client with a focus on UX. (Code)
- Jugglr - Test data management tool that enables reliable testing with a Docker containerized database.
- data-diff - Efficiently diff rows across two different databases.
- Malewicz - Hackable GUI SQL-manager written in SQL itself.
- Go Database Code Generator - Tool to help you generate schema migrations and CRUD code in Go from an entity definition in the form of JSON.
- FeatureBase - Real-time analytical database built on bitmaps. (HN)
Notes
- Database queries can be especially fast if the whole database fits in RAM and you copy it there, since reads no longer wait on disk I/O.
- Images and videos are most likely stored in something like AWS S3, with the database holding only links to them. It is possible to store an image directly in a database, but it would be kept as blob/buffer data that is turned back into an image on the client. The blob/base64 string approach can be slower and is probably not recommended. For fast loading, apps typically put a cache or CDN in front of the stored files.
- It's good to start any app design with database schema and mock the UI to connect them.
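
As a rough illustration of the note about copying a database into RAM, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. `Connection.backup` copies the on-disk database into a `:memory:` connection. The function names, the `example.db` file, and the `items` table are hypothetical and exist only for this example.

```python
import os
import sqlite3
import tempfile


def load_into_memory(path: str) -> sqlite3.Connection:
    """Copy the on-disk SQLite database at `path` into a new in-memory database."""
    disk = sqlite3.connect(path)
    memory = sqlite3.connect(":memory:")
    try:
        disk.backup(memory)  # copies every page from the disk file into RAM
    finally:
        disk.close()
    return memory


def demo() -> int:
    """Build a small hypothetical on-disk database, load it into RAM, and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.db")  # hypothetical file name
        disk = sqlite3.connect(path)
        disk.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
        disk.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])
        disk.commit()
        disk.close()

        memory = load_into_memory(path)
    # The temporary file is deleted here; the in-memory copy still answers queries.
    count = memory.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    memory.close()
    return count
```

Whether this actually speeds things up depends on the workload: the operating system's page cache often keeps hot pages in memory already, and writes to the in-memory copy are lost unless they are backed up to disk again.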
Links
- Stanford DB course
- Algebraic Query Language - Schemas as categories, DB instances as functors, provable data migration as functor composition.
- David Nolen: Out of the Tarpit, Revisited (2017)
- syncing-thesis - Syncing strategies for mobile apps.
- Storage Performance Development Kit - Provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications.
- Let's Build a Simple Database - Writing a sqlite clone from scratch in C. (Code) (In Rust)
- Readings in Databases - List of papers essential to understanding databases and building new data systems.
- Turning the database inside-out with Apache Samza (2015)
- Designing Data Intensive Applications Book - Deep dives into different types of data storage solutions, their history, and how they actually work. (Review) (Notes) (Literature References) (Review) (Notes)
- Scalable SQL and NoSQL Data Stores - Good paper that helps differentiate similar but different datastores. Really helpful when you're trying to pick a modern data solution.
- "Transactions: myths, surprises and opportunities" by Martin Kleppmann (2015)
- Using Apache Arrow, Calcite and Parquet to build a Relational Cache (2017)
- Apache Arrow - Development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. (Web) (HN) (Awesome)
- Blazer - Explore your data with SQL. Easily create charts and dashboards, and share them with your team.
- Moving on from RocksDB to something FASTER - Matthew Brookes (2019)
- List of software that turns your database into a REST/GraphQL API
- What Are Databases? (2019)
- Advanced Database Systems (2019) (Other courses) (Web) (Code)
- Facebook Scuba, MongoDB, CockroachDB (CMU Databases Systems lecture) (2019)
- DB - Version control for databases: save, restore, and archive snapshots of your database from the command line. (HN)
- Why databases use ordered indexes but programming uses hash tables (2019)
- Samuel Madden professor research page
- Curated list of resources for graph databases and graph computing tools
- Graph Databases book
- Big News In Databases — Fall 2019
- Ask HN: What are some examples of good database schema designs? (2020)
- Data flows and security architecture in CockroachDB (2020)
- Awesome Database Learning - List of learning materials to understand databases internals.
- Sharing an SQLite database across containers is surprisingly brilliant (2020)
- Your Database as an API (2020) (Lobsters)
- Elle - Black-box transactional safety checker based on cycle detection.
- Millions of Tiny Databases (2020) (Article)
- The Next 50 Years of Databases (2015) (HN)
- Interview with Noria’s creator: a promising dataflow research database implemented in Rust (2019)
- The High Cost of Splitting Related Data (2020)
- Awesome Database Tools
- Storage: Complete Overview for Developers (2020)
- Event-Reduce - Algorithm to optimize database queries that run multiple times. (HN)
- Apache Druid vs. Time-Series Databases (2019) (HN)
- Lopez: Breaking boundaries between programming languages and databases (2019)
- Declarative Frameworks and Optimization Techniques for Developing Scalable Advanced Analytics over Databases and Data Streams (2019)
- Things I Wished More Developers Knew About Databases (2020) (HN)
- toyDB - Distributed SQL database in Rust, written as a learning project.
- Gallery of 200 database schema diagrams of open-source packages (HN)
- MySQL sharding at Quora (HN)
- Database Internals book - Deep Dive Into How Distributed Data Systems Work.
- Fast and maintainable patterns for fetching from a database (2020)
- DbCleaner - Clean database for testing, inspired by database_cleaner for Ruby.
- polluter - Easiest solution to seed database with Go.
- In Search of a Local-First Database (2020)
- Local-first database: remoteStorage.js (2020)
- Jon Gjengset's PhD thesis
- Succinct Data Structures and Delta Encoding for Modern Databases (2020)
- About Pool Sizing
- I want to own the database that my apps use (2020) (Lobsters)
- Hermitage: Testing transaction isolation levels
- Amazon's Dynamo (2007)
- The Curious Case of Small Files (2020) (Lobsters)
- MiniCouchDB in Rust (2020) (HN) (Lobsters)
- Accessing SQLite, PostgreSQL and MySQL through ODBC
- 17 Things Developers Need to Know About Databases - Peter Zaitsev (2020)
- Readings in Database Systems (HN)
- Stanford Future Data Systems Research Group
- Stanford Data Management and Data Systems (2017)
- Concept-oriented model: Modeling and processing data using functions (2019) (Summary)
- DBCore - Generate applications powered by your database. (Lobsters) (Code) (Article) (HN)
- Recent database technology that should be on your radar (2020) (HN)
- DB Weekly - Weekly round-up of database technology news and articles covering new developments, SQL, NoSQL, document databases, graph databases, and more.
- Making Databases Work: The Pragmatic Wisdom of Michael Stonebraker (2018) (HN)
- SOSD: A Benchmark for Learned Indexes (Code)
- RMI - Recursive model index, a learned index structure. (Go implementation)
- SchemaHero - Kubernetes operator for declarative database schema management. (Web)
- TablaM - Practical general language but tailored for data-manipulation and database (in the broad sense of the word) coding. (Comment)
- What are databases? (2020)
- The myth of “joins don't scale” (2020) (HN)
- Your database is a distributed system (2015)
- Sieuferd - General-purpose user interface for relational databases.
- Unofficial Guide to Datomic Internals (2014) (HN)
- Lobsters: Does anyone use advanced database access control anymore?
- Introduction to database schemas
- BaseDash - Build internal tools for your database. (HN)
- Databases, Types, and the Relational Model: The Third Manifesto - Rigorously define a type-safe (and NULL-safe) data model and query language based on the relational algebra. (PDF) (HN)
- When are full database backups faster than incremental backups? (2020)
- Testing Database Engines via Pivoted Query Synthesis (2020) (Tweet)
- Splitgraph - Integrated data catalog and database proxy. (Code) (Splitgraph Data Delivery Network) (HN)
- Database backup strategies (2019)
- Monarch: Google’s Planet-Scale In-Memory Time Series Database (HN) (Notes)
- The database I wish I had (2020) (HN) (Lobsters)
- Advanced Database Systems course by Andy Pavlo (2020) (Talks)
- sled simulation guide - Contains basic information about deterministic testing of distributed, message-based, event-driven, or actor systems.
- Database of Databases - Discover and learn about database management systems. (Code)
- DB-Engines - Knowledge Base of Relational and NoSQL Database Management Systems.
- Old, Good Database Design (2020)
- Berkeley: Introduction to Database Systems Course (Tweet)
- The Datacenter as a Computer - Introduction to the Design of Warehouse-Scale Machines.
- Database migrations lessons learned (2020)
- Delos: Simple, flexible control plane storage (2019)
- Time for a WTF MySQL Moment (2020) (HN) (Lobsters)
- The Database is on Fire (2020) (Lobsters)
- Alphora-Style Database Diagramming (2020) (Lobsters)
- Things every developer needs to know about database indexing - Kai Sassnowski (2020)
- Fizz - Common DSL for Migrating Databases.
- SQL vs NoSQL | MySQL vs MongoDB | Relational Databases vs DynamoDB, CosmosDB | When to use each (2020)
- Neural Databases (2020) (HN)
- Movine - Migration manager written in Rust that attempts to be smart yet minimal.
- We deleted the production database by accident (2020) (HN) (Lobsters)
- Cuckoo Index - Lightweight Secondary Index Structure.
- DBML - Database Markup Language. (Code)
- MindsDB - Open-Source Predictive AI layer for existing databases. (Docs) (Docs code) (Code) (Using QuestDB as a datasource for MindsDB)
- Migrating Data When You Never Erase History (2020)
- Thesis: Partial State in Dataflow-Based Materialized Views (2020)
- Universal Relation Data Modelling Considered Harmful (2020) (Lobsters)
- Helios: Hyperscale Indexing for the Cloud and Edge (2020)
- The World’s Best In-Memory Database (2020)
- Anonymized Cache Request Traces from Twitter Production - Describes the traces from Twitter's in-memory caching (Twemcache/Pelikan) clusters.
- Advanced Join Patterns for the Actor Model Based on CEP Techniques (2020) (HN)
- Testing Database Engines via Pivoted Query Synthesis (2020) (HN)
- If All You Have Is a Database, Everything Looks Like a Nail (2020) (HN)
- In-Database Machine Learning (2020) (HN)
- Feature Casualties of Large Databases (2020) (Lobsters)
- Seeing is Believing: A Client-Centric Specification of Database Isolation
- Scaling Datastores at Slack with Vitess (2020)
- dbdocs - Database Documentation and Catalog Tool.
- Fast database UPDATE/DELETE operations (2020)
- Building an Event Storage
- Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics (2020)
- clepsydra - Implementation of a core protocol for a minimalist distributed database.
- A shared database is still an anti-pattern, no matter what the justification (2013) (Tweet)
- Your legacy database is outgrowing itself (2021) (HN)
- An unlikely database migration (2021) (HN)
- Kalavar - Project attempting to bring a fast, efficient, secure, and asynchronous query model to the modern database system. (GitHub)
- But how, exactly, databases use mmap? (2021) (Lobsters)
- Building a personal data warehouse in Snowflake for fun and no profit (2021) (HN)
- ProxySQL - High performance, high availability, protocol aware proxy for MySQL and forks (like Percona Server and MariaDB).
- Implementing Data Replication in MemgraphDB (2021)
- How Buffer Pool Works: An Implementation In Go (2021) - Exploring how buffer pool management works in databases by building one.
- Database as a Queue (2021) (Lobsters)
- The Database Inside Your Codebase (2021) (Lobsters) (HN)
- DBCLI - Commandline Database Clients with Autocompletion and Syntax Highlighting. (HN)
- Evolving Schemaless into a Distributed SQL Database (2021)
- Turning the database inside out with Apache Samza by Martin Kleppmann (2014)
- Database Reliability Engineering Book (2017)
- Are graph databases worth using in 2020?
- Grouparoo - Open Source Data Synchronization Framework. (Code) (HN)
- How to Efficiently Choose the Right Database for Your Applications (2021) (HN)
- MySQL from Below (2021)
- Database Normalization in plain language for the working dev (with examples) (2021)
- Array Databases: Concepts, Standards, Implementations (HN)
- Ask HN: Best low-/no-code solution for simple web-based database front ends (2021)
- We ditched MySQL and made our dashboard really fast. Here is how we did it. (2021) (Tweet)
- Databunker - Secure storage for personal records built to comply with GDPR. (Code) (HN)
- Database options (2021)
- Database Deep Dives with Andy Pavlo (2021)
- Graphs with Spidey DB (2021) (HN)
- dbdiagram.io - Database Relationship Diagrams Design Tool.
- Foundations of Databases (1995) - Book published by Addison Wesley. (HN)
- Segcache: a memory-efficient, scalable cache for small objects with TTL (2021)
- Ask HN: How would you store 10PB of data for your startup today? (2021)
- Query Engines: Push vs. Pull (2021)
- How Litestream Eliminated My Database Server for $0.03/month (2021) (HN)
- “What Goes Around Comes Around”: A Brief History of Databases (2017) (Summary)
- How Discord Stores Billions of Messages (2017)
- Experience with using NoSQL for a startup (2021)
- Common data model mistakes made by startups (HN)
- Easily Build Advanced Similarity Search With The Pinecone Vector Database (2021)
- Minimal Modeling Blog - Dedicated to in-depth discussion of all kinds of topics related to the database modeling.
- Getting Started with IndexedDB for Big Data Storage (2021)
- The pedantic checklist for changing your data model in a web application (2021) (Lobsters) (HN)
- Introduction to Graph Databases
- Khadas - Shenzhen based hardware manufacturer.
- 7 Database Paradigms (2020)
- Ask HN: Do you self-host your database? (2021)
- How databases handle 10 million devices in high-cardinality benchmarks (2021)
- BullFrog: Online Schema Migration, On Demand (2021) (Paper)
- PathQuery, Google's Graph Query Language (2021) (HN)
- Write a time-series database engine from scratch (2021) (HN)
- Purdue CS590: Cloud-Native Database Systems (2021)
- Scylla's Approach to Improve Performance for CPU-bound workloads (2017)
- Our Airtable sync process, layer by layer (2021)
- Why are graph databases not more popular? (2021)
- The Database Ruins All Good Ideas (2021) (Lobsters) (HN)
- Old pattern powering modern tech (2021) - Or why modern storage is just a faster tape.
- Migrating Facebook to MySQL 8.0 (2021) (HN)
- Pufferfish, please scale the site! (2021)
- Your connection deserves a name (2021) - Assign a name to your RabbitMQ, Redis, and PostgreSQL connection. (Code)
- What time-weighted averages are and why you should care (HN)
- Your database connection deserves a name (2021) (Lobsters)
- ZippyDB: Facebook's key value store (2021)
- Time to Retire the CSV? (2021) (Lobsters) (HN)
- How database indexing actually works internally (2021) (HN)
- Building PlanetScale with PlanetScale (2021) (HN)
- Database internals are becoming less important than developer experience (2021) (HN)
- Fastest table sort in the West – Redesigning DuckDB's sort (2021) (HN)
- Apache Iceberg - Table format for storing large, slow-moving tabular data. (Code)
- TiDB Development Guide (Code)
- Catabase: a database of categories
- Ask HN: What could a modern database do that PostgreSQL and MySQL can't (2021)
- unstorage - Universal Storage Layer.
- Procella: Unifying serving and analytical data at YouTube (2019) (Summary)
- Choosing a database model for a hierarchical content (2021)
- Real world database latency (2021) (HN)
- Best JS/TS library to use for subscribing to DB changes (2021)
- Notes on Database Normalization
- Partitioning GitHub’s relational databases to handle scale (2021) (HN)
- Cloudflare R2 Storage: Rapid and Reliable Object Storage, minus the egress fees (2021) (HN) (Pricing Analysis) (Tweet) (HN)
- Will Cloudflare R2 Win Customers from Amazon S3? (2021) (HN)
- The Reactive Monolith – How to Move from CRUD to Event Sourcing (2021) (HN)
- Ask HN: Why are relational DBs are the standard instead of graph-based DBs? (2021)
- Cheapest / fastest way to load Stripe data into a SQL database (2021)
- Pelikan - Twitter's unified cache backend. (HN)
- Relational Databases Aren’t Dinosaurs, They’re Sharks (2021)
- The One Crucial Difference Between Spanner and CockroachDB (2021)
- TAO: Facebook’s Distributed Data Store for the Social Graph (2021) (HN)
- Can you explain how a database index works in an interview? (2017)
- How Time Series Databases Work—and Where They Don't (2021) (HN)
- Awesome Database Development
- Offline-First Database Comparison (HN)
- How we built a serverless SQL database (2021) (HN)
- A Return to the General Purpose Database (2021) (HN)
- Are Stored Procedures and Triggers Anti-Patterns in the Cloud Native World? (2021)
- TimescaleDB vs ClickHouse (2021) (HN)
- OLAP Databases (2020)
- Spending $5k to learn how database indexes work (2021) (HN)
- A terrible schema from a clueless programmer (2021) (HN) (Lobsters)
- How do/would you approach storing "likes"? (2021)
- The Story behind The Truth: Designing a Data Model (2018)
- I hate databases (2021)
- Flags v. gates (2021)
- The history of Berkeley DB (2021)
- Papers for database systems powered by artificial intelligence (machine learning for database)
- Things I learned from building a production database (2021) (HN)
- LSM-Tree Key-Value Store based on RocksDB
- Build a Simple Database Tutorial
- Database Development Reddit
- Upgrading MySQL at Shopify (2021)
- DuckDB quacks Arrow: A zero-copy data integration between Apache Arrow and DuckDB (2021)
- Some indexing best practices (2021)
- Slashbase - Open-source collaborative IDE for your databases. (Code) (HN)
- Indexing a database: how to do it well and not do it badly (in Russian)
- Ask HN: How do you manage direct updates to databases in a production system (2021)
- Databases in 2021: A Year in Review (HN)
- TinyKV Course - Course to build a distributed key-value service based on the TiKV model.
- Database Systems Resources
- 2021 in Database Startups: Gold Rush (HN)
- Databass, Part 1: Queries (2021)
- The Third Manifesto: Documents and Books on Database Design
- Database System Readings
- UUIDs Are Popular, but Bad for Performance (2019) (HN)
- Acra - Database security suite for sensitive and personal data protection.
- Are You Sure You Want to Use MMAP in Your Database Management System? (CIDR 2022) (Lobsters) (HN)
- Electric Tables – an experiment in personal databases (2022) (HN)
- The Strength of the Record
- Wikipedia and irregular data: how much can you fetch in one expression? (2022)
- HTSQL - Database Query Language. (Lobsters)
- Organizational scalability and flexible database schemas (2022)
- What to use for caching DB requests? Redis? (2022)
- wait4it - Simple Go application to test whether a port is ready to accept a connection, or whether a MySQL, PostgreSQL, MongoDB or Redis server is ready.
- A decade of major cache incidents at Twitter (HN)
- How to store subscriptions? A practical guide and analysis of 3 selected databases (2022) (Reddit)
- Migrate To Graph - Tool to migrate an existing database to a graph database.
- Vector (approximate nearest neighbor) databases are fairly underground still
- Securely delegating trust with digital signatures and secret storage systems (2022)
- StaticBackend - No vendor-lock-in backend as a service. (Code)
- How much logic should I keep at the database vs. application layer? (2022)
- Database System Concepts Book
- How can we do migrations better?
- How Query Engines Work by Andy Grove (2022) - Introduces the high-level concepts behind query engines and walks through all aspects of building a fully working SQL query engine in Kotlin. (Code) (In Go)
- Database as Code Manifesto
- mtail - Extract internal monitoring data from application logs for collection in a time series database.
- Yao - Low-code application engine written in Go. By writing JSON descriptions you can quickly create API interfaces, data management systems, command-line tools and other applications. (Code)
- A Gentle Introduction to Vector Databases (2021)
- Exploring a database with Datasette
- Have you tried rubbing a database on it?
- Building data-centric apps with a reactive relational database (2022) (Tweet) (Tweet) (Tweet)
- How does database indexing work? (HN)
- Database Naming Convention
- RepliByte - Tool to seed your dev database with real data. (HN) (HN)
- Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes (2022)
- SQL/NoSQL DB Guide
- IMDBench - Benchmarking ORMs with realistic queries.
- Migrations Done Well: Typical Migration Approaches
- Database Meetup by SPLVM
- Writing a document database from scratch in Go: Lucene-like filters and indexes (2022) (Lobsters) (HN) (Code)
- Architecture of a Database System (2007)
- Kerchunk - Cloud-friendly access to archival data.
- Klepto - Tool for copying and anonymising data.
- Databases to keep an eye on (2022)
- QuestDB: Fast Open Source Time Series Database (Vlad Ilyushchenko) (2022)
- Vectorization in OLAP Databases (2022)
- Nice cheap hosted databases (2022)
- Trustfall - Query engine, which can be used to query any data source or combination of data sources: databases, APIs, raw files (JSON, CSV, etc.), git version control, etc. (HN)
- DatabaseConsistency - Tool to find inconsistency between models schema and database constraints.
- There's always an events table (2022)
- Rohmu - Python library for building backup tools for databases providing functionality for compression, encryption and transferring data between the database and an object storage.
- A Decent Database Service (2022)
- Datascript and Datomic tutorial book
- Zero downtime migrations (2022) (HN)
- CaskDB - Project to teach you building a key value store. (HN)
- Database access optimization | Django Docs
- Getting started with database development (2022)
- Ask HN: Free and open source distributed database written in C++ or C (2022)
- Creating Distributed KV Database by Implementing Raft Consensus Using Go (2020)
- Super-Structured Data: Rethinking the Schema (HN)
- Husky, Datadog's Third-Generation Event Store (2022) (HN)
- Persistence Programming (2022)
- Docker DB Backup - Back up multiple database types on a scheduled basis with many customizable options.
- Databases = Frameworks for Distributed Systems (2022) (HN)
- Ditto - Sync without Internet. (Twitter) (HN) (Docs) (Docs Code)
- DBngin - All-in-One Database Version Management Tool. (Code)
- Building a Cloud Database from Scratch: Why We Moved from C++ to Rust (2022) (HN)
- LSI: A Learned Secondary Index Structure (2022)
- Warp: Lightweight Multi-Key Transactions for Key-Value Stores (2015) (Review)
- MiniSQL - Designed to be a distributed relational database system. Final project of ZJU Database System Concept course.
- Ask HN: What are interesting new developments in databases related fields? (2022)
- TiFlow - Unified data replication platform around TiDB.
- Ideas on better database design
- Data-Parallel Actors: A Programming Model for Scalable Query Serving Systems (2022)
- Cache made consistent: Meta’s cache invalidation solution (HN)
- JOIN: The Ultimate Projection (2022) (Lobsters)
- Seeing is Believing: A Client-Centric Specification of Database Isolation (2022)
- Common DB schema change mistakes (2022)
- Let's Remix Distributed Database Design! (2022)
- SchemaSpy - Database Documentation Built Easy. (Code)
- Percona Community - Community hub for installing, running, optimizing, and learning everything around databases and software architectures. (Code)
- Things to know about databases (2022) (HN)
- GeoPub: A Multi-Model Database (2022)
- DBPack - DB proxy for distributed transactions, read/write splitting and sharding. Supports any language and can be deployed as a sidecar in a pod.
- Arana - Cloud Native Database Proxy. It can also be deployed as a Database mesh sidecar.
- GitLab is splitting database into Main and CI (2022) (HN)
- Starting from Zero: Build an LSM Database with 500 Lines of Code (2021)
- Closing the B-tree vs. LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression (2021)
- Log Structured Merge Trees (2015)
- Code in database vs. code in application (2022) (Lobsters)
- Dimensional Modeling Techniques - Kimball Group
- ClickBench — Benchmark For Analytical DBMS (Snowflake, Druid, Redshift) (HN)
- Comparing Popular Time Series Databases (2022)
- Offline data access: a dream come true? (2022)
- Soft Deletion Probably Isn't Worth It (2022) (HN) (Lobsters)
- Overview of Consistency Levels in Database Systems (2019) (HN)
- Join Doe - Tool for replicating database contents between environments while deidentifying sensitive data.
- GoBackup - Simple tool for backing up your databases and files to FTP / SCP / S3 storage.
- ClickBench - Benchmark For Analytical Databases.
- The Slotted Counter Pattern (2022)
- Convex - Global state management platform for web developers. (GitHub)
- Assembling a Query Engine From Spare Parts (2022)
- Reddit’s database has two tables (2012) (HN)
- Ideas for DataScript 2 (HN)
- Rise of the Anti-Join (2022) (Lobsters)
- ArrayQL Integration into Code-Generating Database Systems (2022)
- How Discord stores billions of messages (2017) (HN)
- Seaborn Data - Data repository for seaborn examples.
- Ask HN: Do you use foreign keys in relational databases? (2022)
- kvdbd - Daemon that enables reading/writing of flat-file key/value databases available via HTTP API, using REST/JSON or Protobufs.
- Doozer - Highly-available, completely consistent store for small amounts of extremely important data.
- dblab - Interactive client for PostgreSQL, MySQL and SQLite3.
- Cachegrand - Modern OSS Key-Value store built for today's hardware. (HN)
- Differential dataflow is the next level of query optimization (2022)
- echodb - Embedded, in-memory, immutable, copy-on-write, key-value database engine.
- Retrospection and Learnings from Dgraph Labs (2022) (HN)
- Database concepts you wish you understood better (2022)
- A minimal distributed key-value database with Hashicorp's Raft library (2022)
- Bustle - Benchmarking harness for concurrent key-value collections.
- Prequel - Data push & data warehouse integration. (HN)
- Chimp: Efficient Lossless Floating Point Compression for Time Series Databases (2022)
- The next generation of Materialize (2022) (HN)
- A Streaming Database (2021)
- Don’t make databases available on the public internet. Use pgproxy (2022)
- Hera - High Efficiency Reliable Access to data stores.
- Low-Latency Distributed Data Strategies at P99 CONF: SQL, NoSQL & Event Streaming (2022)
- A Database Without Dynamic Memory Allocation (2022) (HN)
- Real-World Engineering Challenges: Migrations (2022) (HN)
- The B-Tree, LSM-Tree, and the Bw-Tree in Between (2022)
- Big Data Storage (2022)
- Rewriting a high performance vector database in Rust | Pinecone (2022)
- Ephemeral DB, a sacrificial database line for high-throughput data (2022)
- fake2db - Generate databases filled with fake but valid data for test purposes, using the most popular patterns.
- Serverless ETL runtime for cloud databases
- Database Review 2021 (HN)
- Ask HN: What do you use for a personal database? (2022)
- "The Evolution of a Planetary-scale Distributed Database" by Kevin Scaldeferri (2022)
- Database optimization, analytics and burnout (2022)
- Index Merges vs Composite Indexes in Postgres and MySQL (2022) (Lobsters)
- Modern data modeling: Start with the end? (2022) (HN)
- Get Rid of Your Old Database Migrations (2022)
- Ariga - New way to manage database schemas.
- EDMA - Interactive embedded database management system.
- Building a database in the 2020s (HN)
- Database Drivers: Naughty or Nice? (2022) (HN)
- How to visualize the database using Minimal Modeling (2022)
- QPML - Query Plan Markup Language.
- LSM in a Week - Tutorial of building an LSM-Tree storage engine in a week.
- Closing The Gap Between Your Users And Their Data (2022)
- Awesome Data Temporality - Curated list to help you manage temporal data across many modalities.
- Offline-First databases/tools
- Make your database tables smaller (2022) (HN)
- ULIDs and Primary Keys (2022) (HN)
- Databases in 2022: A Year in Review (HN)
- Understanding N + 1 queries problem (2023) (HN)
- Personal blog about my PostgreSQL daily learnings
- Percona Monitoring and Management - View and monitor the performance of your MySQL, MongoDB, PostgreSQL, and MariaDB databases.
- Awesome Identifiers - Pick the best database primary key. (Code)
- How Query Engines Work (HN)
- ULID Identifiers and ULID Tools Website (2023) (Lobsters)
- Bullshit graph database performance benchmarks (2023) (HN)
- Modern storage is plenty fast. It is the APIs that are bad (2020)
- Time Series Benchmark Suite - Tool for comparing and evaluating databases for time series data.
- CaskDB - Build your own disk based KV store in Go.
- Query Graphs - Visualizer for queries - Hyper, Postgres, Tableau. (Code)
- Building a Simple DB in Rust (2023) (Part 2)
- cder - Lightweight, simple database seeding tool for Rust.
- The Magic of Small Databases (2023) (HN)
- Client-side reactive databases (2023)
- 2023 State of Databases for Serverless & Edge
- How to protect your database (from yourself) (2023)
- 15 futuristic databases you’ve never heard of (2023)
- TypeORM Seeding - Delightful way to seed test data into your database.
- Efficient and Compact Spreadsheet Formula Graphs (2023)
- Our Mad Journey of Building a Vector Database in Go (2023)
- cpc - Copy tool for incremental copies of large files, such as databases.
- Five Methods For Database Obfuscation (2023)
- Techniques for Scaling Applications with a Database (2023)
- Speedy Transactions in Multicore In-Memory Databases (2023) (HN)
- Dumping databases for faster furigana (2023)
- Why (Graph) DBMSs Need New Join Algorithms: The Story of Worst-case Optimal Join Algorithms (2023) (HN)
- A Relational Spreadsheet (2023) (HN)
- Database Cryptography Fur the Rest of Us (2023) (Lobsters)
- How to model one-to-one relationships - Tigris Data Modeling Series (2023)
- How Binary JSON Works in YDB (2022)
- How Did I Become Database Engineer at 23 (2022)
- Things DBs Don't Do - But Should (2023)
- A developer-driven approach to building secondary indexes presentation (2023)
- Neuledge - Universal language to model, share, and interact with databases.
- MVCC for Rust - Rust implementation of the Hekaton optimistic multiversion concurrency control algorithm.
- High-Performance Concurrency Control Mechanisms for Main-Memory Databases (2012)
- Relational Operators in Apache Calcite (2021) (HN)
- Do you need a vector database? (2023) (HN)
- LanceDB - Serverless, low-latency vector database for AI applications. (HN)
- Production grade databases in Rust (2023)
- Scaling Databases at Activision (2023) (HN)
- Build Your Own Database From Scratch (HN)
- Database branching: three-way merge for schema changes (2023) (HN)
- OtterTune - AI Powered Automatic PostgreSQL & MySQL Tuning.
- Datomic is Free (2023)
- HyperDB - Hyper-fast local vector database for use with LLM Agents.
- What is a Vector Database? (2023) (HN)
- ReefDB - Minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-text search. (Reddit)
- Merklizing the key/value store for fun and profit (2023) (HN)
- Ask HN: It's 2023, how do you choose between MySQL and Postgres? (2023)
- Product Quantization for Vector Search (2023) (HN)
- Kayvee - Distributed in-memory key-value store built using hashicorp/memberlist with HTTP API.
- Data wrangling with Data Wrangler (2023)
- Simplest Vector DB Implementation? (2023)
- Vemcache - In-memory vector database.
- Pinecone Python Client
- High-Performance Graph Databases That Are Portable, Programmable, and Scale to Hundreds of Thousands of Cores (2023) (HN)
- Ask HN: Suggestions to host 10TB data with a monthly +100TB bandwidth (2023)
- Qdrant under the hood: Product Quantization (2023)
- Sketch of a Post-ORM (2023) (HN)
- The growing pains of database architecture (2023) (HN)
- mutable - Database System for Research and Fast Prototyping.
- Openline customerOS - Platform that enables you to bring your customer back to the center of your work.
- Bulker - HTTP server that simplifies streaming large amounts of data into databases. It is designed to be used as a part of ETL pipelines.
- Mosaic - Extensible framework for linking databases and interactive views.
- Undb - Private first, unified, self-hosted no code database. (Code)
- Is ORM still an anti-pattern? (HN)
- Scaling Linear's Sync Engine (2023) (HN) (Tweet)
- tinyvector - Tiny nearest-neighbor embedding database built with SQLite and PyTorch.
- vlite - Fast, lightweight, and simple vector database written in less than 200 lines of code.
- tinyvector - Tiny embedding database in pure Rust.
- Migrating a 2TB database in 7.5 minutes (2023)
- VectorDB - Python vector database you just need - no more, no less.
- Awesome Vector Database
- keyva - Distributed key-value store.
- Building and operating a pretty big storage system called S3 (2023)
- Readings in Database Systems, 5th Edition
- CachewDB - Light weight, typed, in-memory, ordered, key-value database.
- Testing graph databases (2023)
- How Databases Store and Retrieve Data with B-Trees (2023)
- Artie Transfer - Real-time data replication from OLTP to OLAP dbs.
- SpacetimeDB - Multiplayer at the speed of light. (Web) (HN) (HN)
- Do we really need a specialized vector database? (2023) (HN)
- Epsilla - High performance Vector Database Management System. (Code) (HN)
- Uses and abuses of cloud data warehouses (2023) (HN)
- Vector databases: analyzing the trade-offs (2023) (HN)
- Database Performance at Scale (2023)