DevOps
I think Railway is an amazing example of doing DevOps right. Porter, Backstage & Gaia are nice too.
Google SRE Book is great. Airplane is nice for exposing common commands for everyone on the team to use.
Devbox is nice for bootstrapping environments.
It's interesting to see how much internal infrastructure Google and FB use.
Notes
Links
- Ask HN: What is the fastest way to ramp up on DevOps, k8 and GCP? (2021)
- DevOps, SRE, and Platform Engineering (2021)
- We're Reddit's Infrastructure team, ask us anything! (2018)
- Vercel - Develop. Preview. Ship. (Web)
- Now Examples - Examples of Now deployments you can use.
- I forgot how to manage a server (2019) (HN)
- Applikatoni - Self-hosted deployment server for your team.
- Lobsters: What’s your container-less deployment process? (2019)
- A developer goes to a DevOps conference (2019) (HN)
- Deploy your side-projects at scale for basically nothing - Google Cloud Run (2020) (HN)
- DevOps Questions & Exercises
- Ops Lessons We All Learn The Hard Way (2020)
- Juju - Simple, secure devops tooling built to manage today's complex applications wherever you run your software. (Web)
- Book Recommendations for the Infrastructure Engineer
- Ask HN: How do you make sure your servers are up as a single founder? (2020)
- CTO.ai - Allows you and your software development team to implement DevOps automations in minutes rather than days.
- Deploys at Slack (2020)
- We Need DevOps for ML Data (2020) (HN)
- Awesome Pipeline - Curated list of awesome pipeline toolkits inspired by Awesome Sysadmin.
- Awesome Sysadmin - Curated list of awesome open source sysadmin resources.
- Using SRE to meet reliability challenges | Google Cloud (2020)
- Gruntwork - DevOps as a Service.
- pyinfra - Automates infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more. (HN) (Web) (HN)
- Testinfra - Write unit tests in Python to test actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
- PagerDuty Incident Response Documentation (Code)
- Building an online community around learning from incidents (2019) (HN)
- The Rise of Platform Engineering (2020) (HN)
- How we monitor our services at SourceHut (2020)
- Reference checklist for going to production
- Revolv - Create a complete cloud architecture on your Amazon Web Services, Google Cloud Platform or Microsoft Azure account. (HN)
- Clutch - Extensible platform for infrastructure management. (Announcement)
- What is DevOps? (2020)
- Sysdig - Security, Compliance & Performance for your Devops Workflows.
- A List of Skills and Practices We Use to Train Our DevOps Internally (2020)
- Bridgecrew - Codified cloud security for DevOps. (GitHub)
- You Reap What You Code (2020)
- How we use HashiCorp Nomad (2020) (HN)
- Ask HN: Has anyone moved from Kubernetes to Nomad? (2020)
- Qovery - Deploy your apps on any Cloud providers in just a few seconds. (Web)
- packagecloud - Private npm registry and Maven, RPM, DEB, PyPI and RubyGem repository.
- Gravitational - Remote Access and Secure Deployments.
- DeployHQ - Automatically build and deploy code from your repositories.
- Cooking Infrastructure by Chef (Code)
- Unleash - Open source feature toggle service. (Code) (GitHub)
- The golden age of configuration languages (2020) (HN)
- School of SRE (Code) (HN)
- Christine Dodrill: ex-SRE, Lightspeed (2020)
- driftctl - Detect, track and alert on infrastructure drift. (Code)
- Shipyard - Modern cloud native development environments. (Web)
- FAUN - DevOps community.
- DevOps Maturity Framework
- Bitnami - Packaged Applications for Any Platform - Cloud, Container, Virtual Machine. (GitHub)
- Bitnami Library for Kubernetes
- Kira - Project management framework with deep philosophy underneath.
- Site Reliability Engineer Interview Preparation Guide
- fastlane - App automation done right. (Code)
- List of Devops Resources
- werf - Git as a single source of truth. Build. Deploy to Kubernetes. Stay in sync. (Web)
- Zero-downtime deploys with DigitalOcean, GitHub, and Docker (2021)
- Running Nomad for home server (2021) (Lobsters) (HN)
- They SRE - Curated Collection on Site Reliability Engineering.
- DevOps Resources
- We are far from a better Heroku for production apps in a hyper cloud (2021) (HN)
- coolify - Open-source, self-hostable Heroku and Netlify alternative. (Code) (HN)
- CloudARK - Platform-As-Code. (GitHub)
- Meltano - ELT for the DataOps era. (Code)
- DigitalOcean Agent - Collects system metrics from DigitalOcean Droplets.
- Pulumi - Modern Infrastructure as Code. Any cloud, any language. (Code) (HN) (HN 2) (Awesome)
- Piku - Tiniest PaaS you've ever seen. Piku allows you to do git push deployments to your own servers. (GitHub)
- Awesome Incident Response
- Fleet - Open source device management. (Code)
- Reliably CLI - Reliability as Code: SRE automation at the tip of your fingers. (Web)
- To PaaS or not (2021)
- SRE at Google: Our complete list of CRE life lessons (2021)
- Bad Machinery: Managing Interrupts Under Load
- Securing DevOps: Security in the Cloud (2018)
- Craft - Universal Release Tool (And More).
- DevOps Cheat Sheets (Code)
- MegaEase - High Performance Software Architecture. (GitHub)
- Erda - Enterprise-grade application building, deploying, monitoring platform.
- DevOps Engineering Course for Beginners (2021)
- How to improve your website’s uptime (2021)
- Peanut - Deploy Databases and Services Easily for Development and Testing Pipelines. (Web)
- DevOps Engineer Crash Course (2021)
- Artillery.io - Modern load testing & smoke testing for SRE and DevOps. (Code)
- Top-10 talks of SREcon18 Europe (2018)
- The DevOps: A Concise Understanding to the DevOps Philosophy and Science. (Technical Report) (2021)
- Cachito - Caching service for source code and external dependencies.
- envsafe - Makes sure you don't accidentally deploy apps with missing or invalid environment variables.
- Uptime Kuma - Fancy self-hosted monitoring tool. (HN)
- Ask HN: Solo-preneurs, how do you DevOps to save time? (2021)
- How to Use Hydra as your Deployment Source of Truth (2021) (Lobsters)
- What to Ask in an SRE Technical Interview (2021)
- DevOps Newsletters of Note
- batou - Helps you to automate your application deployments using Python DSL. (Docs)
- Smallstep - Automated Certificate Management for DevOps. (GitHub)
- Learn-by-Doing Platforms for Dev, DevOps, and SRE Folks (2021)
- StackStorm - Platform for integration and automation across services and tools, taking actions in response to events. (Code)
- Grafana OnCall - Easy-to-use on-call management tool. (HN)
- Ironic - Service for managing and provisioning Bare Metal servers.
- Scaled Agile DevOps Maturity Framework - Enterprise transformation without the risk of culture change.
- Plunder - Single-binary server designed to make the provisioning of servers, platforms and applications easier.
- Equinix Metal Images
- Cloud Droid - Cloud Incident and Response Simulations.
- The Reports of Devops's death are greatly exaggerated (2021)
- hcltm - Threat Modeling with HCL.
- Hyperping - Uptime monitoring with public status pages.
- hashi-up - Bootstrap HashiCorp Consul, Nomad, or Vault over SSH in under a minute.
- faas-nomad - OpenFaaS provider for Nomad.
- A Multi Cluster and Multi Orchestrator home lab (2021)
- DevOps in academic research (2021)
- Hetzner Pulumi Intro (2021)
- The Operator Pattern in Nomad (2021)
- Dev Lake - Brings all your DevOps data into one practical, personalized, extensible view. Ingest, analyze, and visualize data.
- Fastly Resource Provider
- OOPS (Learning from the incident you didn't have) writeup template (2021)
- Awesome DevOps
- Ultimate DevSecOps library
- Common Infrastructure Errors I've Made (2021) (Lobsters) (HN)
- Lightweight Experiment & Resource Monitoring
- Howie: The Post-Incident Guide
- Jeli - Dedicated Incident Analysis Platform.
- Zero - Opinionated infrastructure to take you from idea to production on day one. (Code)
- ClusterDev - Cloud Management and Automation Framework. (Code)
- Deployment from Scratch - Complete guide to web application deployment. (HN) (One year of sales)
- Awesome Event IDs - Collection of Event ID resources useful for Digital Forensics and Incident Response.
- Cloudkeeper - “housekeeping for clouds” - find leaky resources, manage quota limits, detect drift and clean up.
- Atomist - Keep Your Containerized Applications Safe. (GitHub)
- UpCheck - Declarative checker for website uptime to run continuously for monitoring.
- GRR - Incident response framework focused on remote live forensics.
- OWASP DevSecOps Guideline - Helps embed security as a part of the development pipeline.
- DevOps by Example
- Brev.dev - Your local-only cloud computer. (CLI)
- Goss - Quick and Easy server testing/validation.
- FeatureHub - Cloud native feature flags, A/B testing and remote configuration service. (Code)
- waifud - Few tools to help me manage and run virtual machines across a homelab cluster. (Progress Report)
- Delayed Job vs. Sidekiq (2022) (HN)
- Cincinnati - Update protocol designed to facilitate automatic updates.
- Ministry of Justice Modernization Platform - Defined and managed in Terraform.
- fw - Workspace productivity booster.
- Motive - Programmable task runner built with Rust that uses a special version of Lua. (Reddit)
- DevStream - Open-source DevOps toolchain manager (DTM).
- Opta - Infrastructure-As-Code framework where you work with high-level constructs instead of getting lost in low level cloud configuration.
- Yaru - Command line tool that manages simple tasks.
- Site Reliability Engineering University
- EaseProbe - Simple, standalone, and lightweight tool that can do health/status checking, written in Go.
- Porter - Enables you to package your application artifact, client tools, configuration and deployment logic together as a versioned bundle that you can distribute, and install with a single command. (Code)
- Fiberplane - Collaborative notebooks for debugging your incidents. (GitHub) (Twitter)
- Fundamentals & Deployment (2022) (HN)
- Entropy - Framework to safely and predictably create, change, and improve modern cloud applications and infrastructure using familiar languages, tools, and engineering practices.
- Firefly - Bring your cloud up-to-code. (GitHub)
- bldr - Tool to build and package software distributions. Build process runs in buildkit (or docker buildx), build result can be exported as container image.
- Boost Note - Document driven project management tool that maximizes remote DevOps team velocity. (Code)
- arx - Bundles code and a job to run for local or remote execution.
- Open Build Service - Generic system to build and distribute binary packages from sources in an automatic, consistent and reproducible way. (CLI)
- JReleaser - Release projects quickly and easily. (Web)
- 90 Days of DevOps
- Glances - Cross-platform monitoring tool which aims to present a large amount of monitoring information through a curses or Web based interface.
- OpenStack Glance - OpenStack project that provides services and associated libraries to store, browse, share, distribute and manage bootable disk images.
- Regula - Tool that evaluates infrastructure as code files for potential AWS, Azure, Google Cloud, and Kubernetes security and compliance violations prior to deployment. (Docs)
- Awesome Site Reliability Engineering Tools
- SRE Cheat Sheet
- Massdriver - Effortless DevOps.
- Checkup - Gather static analysis insights for your projects.
- Gatus - Automated service health dashboard. (Web)
- Lightweight Cluster/Cloud VM Job Management
- Echoes HQ - Developer-friendly activity reports.
- Gasper - Intelligent Platform as a Service (PaaS) used for deploying and managing applications and databases in any cloud topology.
- Post-Incident Review on the Atlassian April 2022 Outage (Lobsters) (HN)
- Founding Uber SRE (HN)
- DevSecOps Playbook
- How we deploy to production over 100 times a day (2022)
- Release - Minimalistic, opinionated, and predictable release automation tool.
- StatusBase - Uptime monitoring tool & beautiful status pages.
- Dagu - Self-contained, standalone No-code workflow executor that runs DAGs defined in a simple, declarative YAML format that is similar to GitHub Actions or Argo Workflows with built-in Web UI.
- atmos - Universal Tool for DevOps and Cloud Automation (works with terraform, helm, helmfile, etc). (Guide)
- Delivery CLI - Command line tool for the workflow capabilities in Chef Automate.
- Spin Cycle - Automate and expose complex infrastructure tasks to teams and services.
- A review of Accelerate: The Science of Lean Software and DevOps (2022)
- sake - Command runner for local and remote hosts.
- Nomad Helper - Useful tools for working with HashiCorp Nomad at scale.
- Interval - Batteries-included approach to building rich internal tools directly in your app’s backend codebase. (Twitter) (Explained)
- Wander - Terminal application for Nomad by HashiCorp.
- Monitoring tiny web services (2022) (Lobsters) (HN)
- Updatecli - GitDevOps Automation Engine.
- Flow Distributed Workflow System - Provides a gRPC API that is used by clients to submit and manage workflows. (Docs)
- Tactical RMM - Remote monitoring & management tool, built with Django, Vue and Go.
- Using BSD make (Lobsters)
- What are the best books or resources on SRE automation that are also practical? (2022)
- Superblocks - IDE for Internal Apps, APIs and Cron Jobs. (HN)
- Multy - Easily deploy multi cloud infrastructure. Write cloud-agnostic config deployed across multiple clouds. (Web)
- Who's NOT using Kubernetes these days and want to share their exciting bit/tooling? (2022)
- Developers, please nurture your coding experience
- Excav - Tool for patching repositories in bulk.
- multi-semantic-release - Proof of concept that wraps semantic-release to work with monorepos.
- Coder - Remote development on your infrastructure. (Awesome)
- DevOps RoadMap
- Gru - Fast and concurrent orchestration framework powered by Go and Lua, which allows you to manage your UNIX/Linux systems with ease.
- Cello - Service for running infrastructure as code software tools including CDK, Terraform and CloudFormation via GitOps.
- Good CI/CD and SRE Blogs (2022)
- Shipping to Production
- Traditional packaging is not suitable for modern applications? (2022) (HN)
- Platformatic - Set of open source tools that you can use to build your own Internal Developer Platform.
- Internal Developer Platform - Helps Ops teams structure their setup and enable developer self-service. (Code)
- k6 - Modern load testing tool, using Go and JavaScript by Grafana. (Web) (k6 learn)
- Bucketeer - Feature Flag Management and A/B Testing platform. (Code)
- How to Build Software Like an SRE (2022) (HN) (Lobsters)
- DevOps has devolved (2022) (HN)
- Gaia - Build powerful pipelines in any programming language.
- Damon - Terminal Dashboard for HashiCorp Nomad.
- Waterwheel - Job scheduler similar to Airflow but with a very different design.
- Retool Workflows: Cron, but Better (2022) (HN)
- Hadmean - Generate powerful, fully functional, ready-to-be-deployed admin apps in seconds. (Web)
- Sidekiq Server - Sidekiq server implemented in Rust.
- Eventline - Job scheduling and execution platform. (Code)
- Infisical - Open-source, E2EE tool to sync environment variables across your team and infrastructure. (Code) (HN) (HN)
- Apex - Interface definition language (IDL) for modeling software. Generate source code, documentation, integration, everything automatically. (GitHub)
- Flipt - Open source, self-hosted feature flag solution. (Code)
- beanstalkd - Fast general purpose work queue.
- aiac - Artificial Intelligence Infrastructure-as-Code Generator.
- Ask HN: How do you deploy your side-projects? (2022)
- Things I want from Devs as SRE/DevOps (2022) (HN)
- Getting Started with Server Health Checks
- Ask HN: What is the cheapest, easiest way to host a cronjob in 2022?
- Miradors - Simple tool that monitors whether your websites are up and sends you an email if not.
- tandem - Parallel task runner for servers and long-running commands.
- Single-file scripts that download their dependencies (HN)
- EnvShare - Share Environment Variables Securely. (Code)
- Edge Flags - Feature flags for edge functions.
- Learn DevOps Links
- octocov - Toolkit for collecting code metrics (code coverage, code to test ratio and test execution time).
- Envless - Secure and sync your secrets. (Code) (Docs)
- Cron-Job.org - Scheduled execution of your websites and scripts. (Code)
- precloud - Dynamic tests for infrastructure-as-code.
- SRE Checklist
- vendir - Easy way to vendor portions of git repos, github releases, helm charts, docker image contents, etc. declaratively. (Web)
- Little Loadshedder - Rust hyper/tower service that implements load shedding with queuing & concurrency limiting based on latency.
- OneUptime - Open-source complete SRE and DevOps platform. (Code)
- Ask HN: Best practices for self-healing apps? (2023)
- Enrolla - Open source feature flags for B2Bs. (Code) (HN)
- Keep - Manage your alerts by code, write better, more actionable and accurate alerts. (HN)
- Don't deploy manually (2023)
- xc - Simple, convenient, Markdown-defined task runner. (HN)
- Traceo - Simple platform to monitor application performance with error handling.
- Chaos - Tool that can cause chaos on running servers.
- Framework-defined infrastructure (2023)
- Featurevisor - Feature flags and experimentation management solution for developers.
- Tramline - Release apps without drowning in process. (Lobsters)
- Universal Binary Installer
- CHOMP - 'JS Make' - parallel task runner for the frontend ecosystem with a JS extension system.
- The funniest performance regression you've seen (2023)
- Dagger, a ❤️ story (2023)
- A love letter to make (Lobsters) (HN)
- hof - Framework that joins data models, schemas, code generation, and a task engine. Language and technology agnostic.
- CloudKnit - Open Source Solution for Managing Cloud Environments.
- MRSK: hot deployment tool to watch—or a total game changer? (2023)
- Opslib - Pythonic toolkit to manage infrastructure.
- DevPod - Client-only tool to create developer environments based on a devcontainer.json on any backend.
- DevOps Notes (Code)
- Preevy - Quickly deploy preview environments to the cloud.
- Hop CLI - Interact with Hop in your terminal.
- Framed - CLI tool for project management.
- Nocalhost - Cloud Native Dev Environment.
- Levant - Open source templating and deployment tool for HashiCorp Nomad jobs.
- How to Build Software From Source - Andrew Kelley (2023)
- OpenStatus - Open-source status page. (Code)
- Load Testing Tips (2022)
- Funnel - Toolkit for distributed task execution via a simple, standard API.
- Don’t Configure Control Flow (2023) (Lobsters)
- Zeabur - Deploying your service with one click. (CLI) (Code)
- zbpack - Build your project into static assets, serverless function or container image with magic, no Dockerfile needed.
- Sly CLI - Add code, not dependencies. (Code)
- Awesome Infrastructure-from-Code
- Risor - Fast and flexible scripting for Go developers and DevOps.
- Whenever - Task Scheduler.
- System Initiative - Collaborative power tool designed to remove the papercuts from DevOps work. (Code)
- System Initiative Open Source (Lobsters)
- DevOpsGPT - AI-Driven Software Development Automation Solution.
- Bump My Version - Version-bump your software with a single command.