<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[César D. Velandia]]></title><description><![CDATA[Architecting Secure Software | Building with Rust | Automating LLM Flows | Embedded Future]]></description><link>https://cesar.velandia.co/</link><image><url>https://cesar.velandia.co/favicon.png</url><title>César D. Velandia</title><link>https://cesar.velandia.co/</link></image><generator>Ghost 4.1</generator><lastBuildDate>Fri, 06 Mar 2026 00:04:03 GMT</lastBuildDate><atom:link href="https://cesar.velandia.co/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[The Build System Illusion: What We Lose When Everything Looks Like a Cloud Deploy]]></title><description><![CDATA[<p>Someone on my team asked yesterday: &quot;Can we just containerize this and deploy it like everything else?&quot;</p><p>We were talking about pushing a security agent to endpoint devices. Not EC2 instances. Not Kubernetes nodes. Actual routers sitting in living rooms, maybe getting power-cycled when someone&apos;s toddler</p>]]></description><link>https://cesar.velandia.co/build-system-illusion/</link><guid isPermaLink="false">6948ad4715a35701511cccd8</guid><category><![CDATA[DevOps]]></category><dc:creator><![CDATA[César D. 
Velandia]]></dc:creator><pubDate>Mon, 22 Dec 2025 02:34:56 GMT</pubDate><media:content url="https://cesar.velandia.co/content/images/2025/12/ComfyUI_00604_.png" medium="image"/><content:encoded><![CDATA[<img src="https://cesar.velandia.co/content/images/2025/12/ComfyUI_00604_.png" alt="The Build System Illusion: What We Lose When Everything Looks Like a Cloud Deploy"><p>Someone on my team asked yesterday: &quot;Can we just containerize this and deploy it like everything else?&quot;</p><p>We were talking about pushing a security agent to endpoint devices. Not EC2 instances. Not Kubernetes nodes. Actual routers sitting in living rooms, maybe getting power-cycled when someone&apos;s toddler finds the button (real story).</p><p>The question reveals something interesting about how we think about infrastructure now. We&apos;ve spent a decade building elegant abstractions for cloud deployments, and somewhere along the way we convinced ourselves these abstractions are universal. They&apos;re not. And when you try to force them onto problems they weren&apos;t designed for, you start discovering all the assumptions baked into your tooling.</p><p>This isn&apos;t a tutorial. This is me documenting what breaks when you stop deploying to datacenters you control.</p><h2 id="what-we-forgot-while-building-cicd-pipelines">What We Forgot While Building CI/CD Pipelines</h2><p>Every CI/CD tutorial follows the same script: code &#x2192; test &#x2192; build &#x2192; deploy. The tutorials work because they make assumptions they never mention out loud.</p><p>Your deployment target can reach out and pull updates. It has enough bandwidth to do this efficiently. When something fails, you can just retry. If the retry fails, you can roll back by deploying something else. 
The network is reliable enough that these operations complete in reasonable time.</p><p>These aren&apos;t features of &quot;deployment.&quot; These are features of deploying to infrastructure you control.</p><p>I started mapping our current pipeline to see what would need to change for distributed device deployment. Standard setup:</p><ul><li>Git push triggers CI</li><li>Run tests, security scans, build artifacts</li><li>Push to artifact registry</li><li>Orchestrator pulls and distributes</li></ul><p>This is elegant. It works beautifully. And it will likely fall apart when your deployment target is 10,000 clients that:</p><ul><li>Might be offline when you push the update</li><li>Might stay offline for days</li><li>Might have 5 Mbps connections shared with a family streaming Netflix</li><li>Might fail the update halfway through and just... stay that way</li><li>Might be running old firmware you didn&apos;t even know existed</li></ul><p>The scary part isn&apos;t that the pipeline doesn&apos;t work. The scary part is how long it took us to realize it wouldn&apos;t work. We&apos;ve gotten so good at cloud deployments that we&apos;ve forgotten these are solved problems for a very specific environment.</p><h2 id="what-breaks-first-the-optimistic-network-assumptions">What Breaks First: The Optimistic Network Assumptions</h2><p>Cloud deployments assume the network is basically a non-issue. Sure, you might have a timeout here or there, but fundamentally you trust that HTTP requests complete and your orchestrator can talk to your services.</p><p>This assumption dies immediately with distributed devices.</p><p>Picture this: You push an update at 2 AM (because that&apos;s when traffic is lowest). Of your 10,000 clients, maybe 7,000 are online. They start pulling the artifact. 
Except it&apos;s 50MB, and some of these devices are on rural connections, and now you&apos;re essentially DDoS&apos;ing your own artifact registry.</p><p>The ones that do start downloading&#x2014;what happens when the connection drops halfway through? In Kubernetes, a failed pull just reschedules. The container runtime handles it. With a router? You might have a partially written filesystem and a device that won&apos;t boot on the next power cycle.</p><p>&quot;Just use resumable downloads,&quot; someone will say. Sure. But now you need:</p><ul><li>Devices to track download state locally</li><li>Verification that partial downloads aren&apos;t corrupted</li><li>A way to clean up attempts that never complete</li><li>Some kind of backoff so 3,000 devices don&apos;t hammer your CDN simultaneously when they retry</li></ul><p>You&apos;re rebuilding a download manager. For every device type. Because you can&apos;t just <code>docker pull</code> anymore.</p><p>The deeper issue is that cloud infrastructure trained us to ignore these problems. When was the last time you thought about how <code>kubectl apply</code> actually gets your deployment to the nodes? You don&apos;t, because Kubernetes handles it. 
That&apos;s the abstraction working.</p><p>But abstractions leak, and when they leak at the edge, you&apos;re left rebuilding things you thought were solved problems.</p><h2 id="the-prototype-reinventing-what-we-thought-was-solved">The Prototype: Reinventing What We Thought Was Solved</h2><p>I built a prototype because I wanted to understand what we&apos;re actually trading when we move from cloud to edge.</p><p>The architecture isn&apos;t novel:</p><ul><li>Build system creates signed artifacts (same as before)</li><li>Artifacts get split into chunks with content-addressed storage</li><li>Regional edge nodes cache the chunks</li><li>Devices pull from nearest node when online</li><li>Each device verifies integrity before applying</li><li>Interrupted downloads can resume</li></ul><p>This is roughly what every OTA update system does. Which raises the question: why doesn&apos;t everyone just use an existing OTA framework?</p><p>Because OTA frameworks assume you&apos;re updating firmware. They live in a world where:</p><ul><li>Updates happen maybe monthly</li><li>You&apos;re replacing the entire system image</li><li>The device reboots as part of the process</li><li>&quot;Rollback&quot; means keeping two full system partitions</li></ul><p>We&apos;re trying to update application code. Potentially daily. Without rebooting. While the application is serving traffic. With minimal storage overhead.</p><p>The existing solutions don&apos;t fit. So you end up rebuilding pieces of:</p><ul><li>Package managers (dependency resolution, version management)</li><li>Container runtimes (layered filesystems, atomic updates)</li><li>Service orchestrators (health checking, rollback logic)</li><li>Download managers (chunking, resume, verification)</li></ul><p>All the things we thought were solved because Kubernetes and Docker handle them for us. 
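</p><p>The chunking layer, at least, is small enough to sketch. A toy version of the content-addressed step, assuming SHA-256 digests and an in-memory chunk store (a real system would persist chunks on the regional edge nodes):</p>

```python
import hashlib

def chunk_artifact(blob, chunk_size=1024 * 1024):
    """Split an artifact into chunks keyed by content hash.

    Chunks shared between versions dedupe for free: an edge node only
    fetches digests it doesn't already hold, and the ordered manifest is
    all a device needs to verify each piece as it arrives.
    """
    manifest, store = [], {}
    for i in range(0, len(blob), chunk_size):
        piece = blob[i:i + chunk_size]
        digest = hashlib.sha256(piece).hexdigest()
        store[digest] = piece        # content-addressed store
        manifest.append(digest)      # order is needed to reassemble
    return manifest, store

def reassemble(manifest, store):
    """Rebuild the artifact, verifying every chunk against its digest."""
    parts = []
    for digest in manifest:
        piece = store[digest]
        if hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError("corrupt chunk: " + digest)
        parts.append(piece)
    return b"".join(parts)
```

<p>Because identical chunks hash to identical keys, a new artifact version that shares most of its bytes with the previous one costs the fleet only the chunks that actually changed.</p><p>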
Except now you can&apos;t use Kubernetes or Docker because they assume resources you don&apos;t have.</p><h2 id="what-you-give-up-the-observability-black-hole">What You Give Up: The Observability Black Hole</h2><p>Here&apos;s where it gets uncomfortable.</p><p>In cloud deployments, observability is basically solved. Prometheus scrapes metrics. Logs stream to Elasticsearch. Traces go to Jaeger. You can see what&apos;s happening across your entire fleet in real-time.</p><p>With distributed devices? You&apos;re flying blind.</p><p>A device pulls an update. It applies the update. Maybe it works. Maybe it doesn&apos;t. You won&apos;t know unless:</p><ul><li>The device can phone home (costs bandwidth)</li><li>It&apos;s currently online (can&apos;t assume this)</li><li>Your telemetry isn&apos;t broken (but how would you know?)</li><li>The update didn&apos;t break the telemetry itself (classic chicken-egg)</li></ul><p>I added a lightweight telemetry system to my prototype. Devices queue status updates locally and flush when they have connectivity. Sounds reasonable until you realize:</p><p>If an update breaks the device, the telemetry can&apos;t report the failure. So you see... nothing. Just silence. Which could mean &quot;device is offline&quot; or &quot;device is broken&quot; or &quot;device is fine but busy&quot; or &quot;telemetry is broken but device is fine.&quot;</p><p>You&apos;ve lost the feedback loop that makes cloud development tolerable. Push a bad update and you might not know for hours or days. By then, understanding what failed requires physical access to the device.</p><p>This isn&apos;t a solvable problem. It&apos;s a fundamental trade-off. You&apos;re giving up observability for distribution. 
The question is whether you understand what you&apos;re trading.</p><h2 id="the-parts-where-cloud-thinking-fails-completely">The Parts Where Cloud Thinking Fails Completely</h2><p><strong>Rollback: The Illusion of Control</strong></p><p>Cloud deployment rollback is simple. You deploy version N-1. The orchestrator handles it. Done.</p><p>This works because you control both ends of the transaction. You control when the rollback happens. You control which nodes get rolled back first. You can verify the rollback worked.</p><p>Now try rolling back 10,000 distributed devices. You push the rollback command. What happens?</p><p>3,000 devices are offline. They&apos;ll get the rollback... eventually. Maybe tomorrow. Maybe next week. 2,000 devices are in the middle of applying the broken update. Do they abort? Do they finish and then roll back? 1,000 devices already applied the update and have been running it for 6 hours. Their local state might be incompatible with the old version.</p><p>The remaining 4,000 successfully roll back. Probably. You think. You won&apos;t actually know for a while because of the observability problem.</p><p>What you&apos;ve lost is the ability to reason about system state. In Kubernetes, rollback is atomic (or near enough). With distributed devices, you&apos;re managing a multi-day eventually-consistent rollback process where you can&apos;t observe success and can&apos;t guarantee completion.</p><p>The solution? Most OTA systems don&apos;t really support rollback. They support &quot;keep the old version around and boot into it if the new version crashes repeatedly.&quot; Which helps with catastrophic failures but does nothing for subtle bugs that don&apos;t crash.</p><p>You&apos;re rediscovering why mobile app developers test so carefully before release. 
Because once it&apos;s deployed, &quot;rollback&quot; isn&apos;t really an option.</p><p><strong>Heterogeneity: When Your Build Matrix Explodes</strong></p><p>Cloud infrastructure is beautifully uniform. You might have different instance types, but they&apos;re all running basically the same OS, same CPU architecture, same capabilities.</p><p>Edge devices laugh at this uniformity.</p><p>Half your devices have 512MB RAM. The other half have 2GB. Some have hardware crypto. Some don&apos;t. Different CPU architectures. Different kernel versions. Different storage constraints.</p><p>Do you:</p><ul><li>Build separate artifacts for each variant? (Your CI now builds 15 different versions)</li><li>Build one binary with feature detection? (Bloated, complex, testing nightmare)</li><li>Modular plugins? (Adds deployment complexity, version compatibility matrix)</li></ul><p>Each choice trades something. More build complexity for smaller binaries. Simpler builds for harder testing. Runtime efficiency for deployment flexibility.</p><p>The cloud abstracted this away. EC2 instances are predictable. Kubernetes nodes are predictable. You built your tools for predictability.</p><p>Now you&apos;re remembering why embedded developers have such elaborate build systems. Because building for heterogeneous hardware is genuinely complicated, and there&apos;s no abstraction that makes it simple without hiding real constraints.</p><p><strong>State Management: The Distributed Database You Didn&apos;t Mean to Build</strong></p><p>Here&apos;s a fun one: device configuration.</p><p>In cloud deployments, configuration is in git. You push changes, they roll out, done. Want to update a feature flag? Change the value, deploy.</p><p>With distributed devices, configuration becomes a distributed database problem.</p><p>Device A gets new config version 5. Device B is offline, still running config version 3. 
Device C failed to apply version 4 and rolled back to version 3, but its local state reflects changes from version 4.</p><p>Now a user reports a bug. Which config version were they running? Which version should they be running? Is the bug because of the config, or because of the application, or because of interaction between mismatched versions?</p><p>You can&apos;t just &quot;check the current config&quot; because there is no current config. There are 10,000 different configs in various states of consistency.</p><p>The cloud solution&#x2014;configuration management tools, service discovery, distributed consensus&#x2014;doesn&apos;t work when devices are offline more often than they&apos;re online.</p><p>You&apos;re reinventing eventual consistency patterns from distributed databases. Except you&apos;re doing it for configuration management. And you&apos;re discovering why distributed databases are hard.</p><h2 id="what-we-forgot-while-building-kubernetes">What We Forgot While Building Kubernetes</h2><p>This started as a simple deployment problem. It turned into a reminder about how much knowledge we&apos;ve lost.</p><p>A generation of developers now knows how to deploy to Kubernetes but not how to build systems that don&apos;t assume Kubernetes exists. We know how to write microservices but not how to design for intermittent connectivity. We know how to scale horizontally but not how to handle actual physical distribution.</p><p>This isn&apos;t their fault. The abstractions worked so well that we stopped teaching the fundamentals they abstracted away.</p><p>Want to deploy an update? <code>kubectl apply</code>. Want to rollback? <code>kubectl rollout undo</code>. Want observability? Install Prometheus. These are solved problems. Solved so thoroughly that we forgot what the problems were.</p><p>Edge computing is forcing us to remember.</p><p>The network isn&apos;t reliable. Deployment targets aren&apos;t uniform. 
State isn&apos;t eventually consistent&#x2014;it&apos;s just inconsistent. Rollback isn&apos;t atomic. Observability isn&apos;t centralized. Configuration isn&apos;t unified.</p><p>These aren&apos;t bugs in edge computing. These are properties of distributed systems that cloud infrastructure successfully hid from us. We thought we&apos;d solved distributed systems. We just built a very expensive abstraction that works when you control the datacenter.</p><h2 id="what-this-means-for-the-next-decade">What This Means for the Next Decade</h2><p>We&apos;re about to see a lot of bad architecture. Not because people are incompetent, but because the tooling they know doesn&apos;t apply and the knowledge base to build new tooling has atrophied.</p><p>Mobile developers know these problems. Embedded systems engineers know these problems. Game developers know these problems.</p><p>But there&apos;s a whole generation of backend/cloud engineers who&apos;ve never had to think about:</p><ul><li>Offline-first operation</li><li>Bandwidth-constrained deployments</li><li>Devices that can&apos;t be SSH&apos;d into</li><li>Updates that take days to propagate</li><li>Debugging without centralized logs</li><li>State that&apos;s distributed by default, not by choice</li></ul><p>The industry is hiring cloud engineers to build edge systems. Then expressing surprise when the solutions look like complex, expensive reinventions of problems that were solved in embedded systems 20 years ago.</p><p>We need to rediscover institutional knowledge we thought we didn&apos;t need anymore. Or admit that &quot;edge computing&quot; is just going to be &quot;cloud computing with extra steps and worse performance.&quot;</p><h2 id="the-reading-list-nobody-talks-about">The Reading List Nobody Talks About</h2><p>The resources I&apos;ve found useful aren&apos;t from cloud computing thought leaders. 
They&apos;re from the places that never forgot these problems:</p><ul><li>SWUpdate documentation (embedded Linux) - Atomic updates, rollback strategies</li><li>Mender architecture - OTA updates for embedded devices</li><li>Chrome OS update_engine - Delta updates, cryptographic verification</li><li>FreeRTOS OTA - Working with device constraints</li><li>Balena documentation - Fleet management for IoT</li></ul><p>These aren&apos;t sexy. They&apos;re not written by FAANG engineers blogging about scaling to billions of requests. They&apos;re written by people who&apos;ve been deploying to heterogeneous hardware with unreliable networks for decades.</p><p>The irony is that we&apos;re now reinventing their solutions, but worse, because we&apos;re trying to apply cloud patterns to problems cloud patterns don&apos;t fit.</p><h2 id="what-im-actually-building-and-why">What I&apos;m Actually Building (And Why)</h2><p>My prototype isn&apos;t trying to be production-ready. It&apos;s trying to understand the problem well enough to recognize good solutions when I see them.</p><p>Right now I&apos;m working through:</p><ul><li>Chunk size optimization (smaller = better resumability, larger = less overhead)</li><li>Testing at scale without 10,000 physical devices</li><li>Security/verification vs. resource constraints trade-offs</li><li>Configuration schema migrations when updates aren&apos;t atomic</li></ul><p>These aren&apos;t new problems. But they&apos;re new to me, because cloud infrastructure let me ignore them for a decade.</p><p>The goal isn&apos;t to build another OTA framework the world doesn&apos;t need. The goal is to develop enough understanding that I can make informed architectural decisions when edge deployment becomes unavoidable.</p><p>Which, if current trends continue, will be soon.</p><p>Because the next wave of computing isn&apos;t happening in data centers. It&apos;s happening in cars, factories, homes, and cities. 
And all those devices need software updates.</p><p>Your CI/CD pipeline wasn&apos;t designed for this. Neither was mine.</p>]]></content:encoded></item><item><title><![CDATA[The Tool Collector's Fallacy]]></title><description><![CDATA[<p>Most teams have 10+ tools in their DevOps stack. Sometimes it feels like keeping up with DevOps tooling is overwhelming.</p><p>Kubernetes for orchestration. Terraform for infrastructure. ArgoCD for GitOps. Prometheus for metrics. Grafana for dashboards. Loki for logs. Jaeger for traces. Vault for secrets. Trivy for container scanning. Snyk for</p>]]></description><link>https://cesar.velandia.co/the-tool-collectors-fallacy/</link><guid isPermaLink="false">6948b07915a35701511cccf1</guid><category><![CDATA[DevOps]]></category><dc:creator><![CDATA[César D. Velandia]]></dc:creator><pubDate>Wed, 09 Jul 2025 01:54:00 GMT</pubDate><media:content url="https://cesar.velandia.co/content/images/2025/12/ComfyUI_00662_.png" medium="image"/><content:encoded><![CDATA[<img src="https://cesar.velandia.co/content/images/2025/12/ComfyUI_00662_.png" alt="The Tool Collector&apos;s Fallacy"><p>Most teams have 10+ tools in their DevOps stack. Sometimes it feels like keeping up with DevOps tooling is overwhelming.</p><p>Kubernetes for orchestration. Terraform for infrastructure. ArgoCD for GitOps. Prometheus for metrics. Grafana for dashboards. Loki for logs. Jaeger for traces. Vault for secrets. Trivy for container scanning. Snyk for dependency scanning. SonarQube for code quality. Jenkins for legacy pipelines. GitHub Actions for new ones. Slack for notifications. PagerDuty for incidents. Datadog for... something, I forget what Datadog is supposed to do that Prometheus doesn&apos;t.</p><p>Each tool was added to solve a real problem. Each tool made sense at the time. 
Together, they&apos;ve created a different problem: nobody actually understands the system anymore.</p><h2 id="how-we-became-tool-collectors">How We Became Tool Collectors</h2><p>The pattern is always the same:</p><p><strong>Act 1: The Problem</strong><br>&quot;We need better observability. Prometheus metrics aren&apos;t enough.&quot;</p><p><strong>Act 2: The Research</strong><br>&quot;Grafana Loki looks promising. Integrates with our existing stack. Open source. Good community.&quot;</p><p><strong>Act 3: The POC</strong><br>&quot;Loki works great in our test environment. Let&apos;s roll it out.&quot;</p><p><strong>Act 4: The Rollout</strong><br>&quot;Now we have centralized logging. Problem solved.&quot;</p><p><strong>Act 5: Six Months Later</strong><br>&quot;Why isn&apos;t anyone using Loki? Why are we still grepping container logs?&quot;</p><p>Because adding the tool didn&apos;t solve the problem. It added another thing to learn, another thing to maintain, another thing to debug when it breaks. And it broke last Tuesday when the Loki cluster ran out of disk space and nobody noticed for six hours because we monitor Loki&apos;s health with... Prometheus. Which doesn&apos;t alert when Loki is down because that check was never added.</p><p>We collected a tool. We didn&apos;t solve the problem.</p><h2 id="the-hidden-costs-nobody-talks-about">The Hidden Costs Nobody Talks About</h2><p>Each new tool in your stack has an obvious cost (license, hosting, maintenance) and several hidden ones:</p><p><strong>Cognitive Load Tax</strong><br>Your team now needs to know 24 tools instead of 23. Each with its own:</p><ul><li>Configuration syntax (YAML, HCL, JSON, TOML, all slightly different)</li><li>CLI interface and flags</li><li>API patterns and authentication</li><li>Debugging approaches</li><li>Failure modes</li><li>Update procedures</li></ul><p>A new engineer joining the team used to learn Kubernetes, Terraform, and CI/CD. That was already a lot. 
Now they also need to learn Loki, Jaeger, Vault, ArgoCD, and whatever we add next quarter.</p><p>We&apos;re not making them more productive. We&apos;re just raising the barrier to productivity.</p><p><strong>Integration Debt</strong><br>Every tool needs to talk to every other tool. Prometheus scrapes Loki&apos;s metrics. Loki ingests logs from Kubernetes. Kubernetes pulls images scanned by Trivy. Trivy&apos;s results go to Slack via Jenkins. Jenkins auth uses Vault. Vault secrets come from Terraform.</p><p>This web of dependencies is fine until something breaks. Then you spend three hours discovering that the Slack notifications stopped because the Vault token expired, which Jenkins didn&apos;t detect properly, which meant Trivy scan results weren&apos;t being posted, which meant nobody noticed the critical CVE in the base image we&apos;ve been deploying all week.</p><p>The tools work. The integrations are fragile.</p><p><strong>The Knowledge Silo Problem</strong><br>Person A knows Terraform. Person B knows Kubernetes. Person C knows Vault. Nobody knows all three well enough to debug the interaction when Kubernetes can&apos;t pull secrets from Vault because Terraform configured the IAM role wrong.</p><p>So you have a meeting. Three people spend two hours teaching each other enough about their respective tools to understand the problem. You fix it. Document it. Move on.</p><p>Next quarter, Person B leaves. New person joins. Doesn&apos;t know Kubernetes. The institutional knowledge evaporated.</p><p>We&apos;ve built a system that requires tribal knowledge to operate. Then we&apos;re surprised when it&apos;s fragile.</p><p><strong>Update Paralysis</strong><br>Remember when updating your infrastructure meant updating one thing? 
Neither do I.</p><p>Now updates cascade:</p><ul><li>Terraform has a new provider version</li><li>Which requires updating the Vault configuration syntax</li><li>Which breaks the Kubernetes integration</li><li>Which means ArgoCD can&apos;t sync</li><li>Which blocks deployments</li><li>Which means we defer the update</li></ul><p>Six months later, we&apos;re three major versions behind on everything. Security vulnerabilities pile up. Features we want are in newer versions. But the cost of updating&#x2014;testing 23 tools and all their integrations&#x2014;is too high.</p><p>So we stay on old versions and tell ourselves we&apos;ll &quot;catch up next quarter.&quot;</p><h2 id="the-specialization-trap">The Specialization Trap</h2><p>Here&apos;s how we used to solve observability:</p><p><strong>2015 Approach</strong></p><ul><li>App writes logs to stdout</li><li>Logs get aggregated</li><li>Grep for errors</li><li>Write scripts to parse patterns</li><li>Done</li></ul><p>It was basic. It worked. Anyone on the team could debug it.</p><p><strong>2024 Approach</strong></p><ul><li>Apps expose Prometheus metrics (RED method)</li><li>Prometheus scrapes and stores</li><li>Grafana visualizes</li><li>Loki ingests logs</li><li>Jaeger traces requests</li><li>Correlation between metrics/logs/traces requires:</li><li>Understanding PromQL</li><li>Writing Grafana queries</li><li>Configuring Loki labels correctly</li><li>Instrumenting code with trace IDs</li><li>Ensuring all three systems agree on time synchronization</li></ul><p>It&apos;s sophisticated. It&apos;s powerful. It requires specialists.</p><p>We&apos;ve turned observability into a discipline that requires dedicated engineers. Not because the problems got harder&#x2014;debugging is still debugging&#x2014;but because the tools got more complex.</p><p>The tools were supposed to make us more productive. 
Instead, they created new job titles.</p><h2 id="when-solutions-become-problems">When Solutions Become Problems</h2><p>I watched a team spend two weeks setting up GitOps with ArgoCD. Their deployment process was:</p><ul><li>Push to main</li><li>CI builds and pushes image</li><li>Kubernetes manifests update</li><li>Deploy</li></ul><p>It worked. Nobody complained. But someone went to a conference, saw a GitOps talk, came back convinced we needed it.</p><p>Now the process is:</p><ul><li>Push to main</li><li>CI builds and pushes image</li><li>CI updates manifests in Git</li><li>ArgoCD watches Git</li><li>ArgoCD syncs to cluster</li><li>Deploy</li></ul><p>The new process does the same thing with an extra tool and two more failure points. The benefits:</p><ul><li>Declarative state (we already had this, manifests were in Git)</li><li>Audit trail (we already had this, Git history)</li><li>Rollback capability (we already had this, redeploy old version)</li></ul><p>The costs:</p><ul><li>Another tool to learn, maintain, debug</li><li>Another place to check when deploys fail</li><li>Another integration to keep working</li><li>Another source of &quot;why isn&apos;t this deploying?&quot;</li></ul><p>We solved a problem we didn&apos;t have. Then celebrated the solution.</p><h2 id="the-vendor-love-affair">The Vendor Love Affair</h2><p>Tech Twitter convinced everyone that if you&apos;re not using the latest tools, you&apos;re falling behind. So we chase:</p><p><strong>Last Year&apos;s Hotness</strong></p><ul><li>Service mesh! Install Istio!</li><li>(Six months later: why is our cluster so slow?)</li><li>(One year later: nobody remembers how Istio works)</li><li>(18 months later: we removed Istio)</li></ul><p><strong>This Year&apos;s Hotness</strong></p><ul><li>eBPF observability! 
Install Pixie!</li><li>(Three months later: it&apos;s using 2GB per node)</li><li>(Six months later: conflicts with our CNI plugin)</li><li>(Nine months later: we&apos;re evaluating replacements)</li></ul><p><strong>Next Year&apos;s Hotness</strong></p><ul><li>Platform engineering! Build an IDP!</li><li>(Currently in POC phase)</li><li>(Check back in 18 months)</li></ul><p>Each wave promises to solve all our problems. Each wave adds complexity. Each wave eventually gets replaced by the next wave.</p><p>We&apos;re not building infrastructure. We&apos;re collecting tools like Pok&#xE9;mon.</p><h2 id="what-we-lost-along-the-way">What We Lost Along the Way</h2><p>Somewhere between &quot;deploy code&quot; and &quot;orchestrate 23 tools,&quot; we lost something important:</p><p><strong>Simplicity</strong><br>The ability to explain how your infrastructure works to a new team member in under an hour. Now it takes weeks.</p><p><strong>Debuggability</strong><br>The ability to trace a problem from symptom to cause without consulting five different UIs and correlating timestamps across systems.</p><p><strong>Ownership</strong><br>When everything requires specialists, nobody owns the whole system. Problems fall into gaps between tool boundaries.</p><p><strong>Velocity</strong><br>Each new tool slows down the next change. Want to add a feature? First make sure it works with all 23 existing tools.</p><p><strong>Understanding</strong><br>Most people on the team can operate the tools. Few people understand how they actually work. 
So when something breaks in a novel way, nobody knows how to fix it.</p><p>We traded these things for:</p><ul><li>Better metrics (that we don&apos;t look at)</li><li>Fancier dashboards (that we don&apos;t maintain)</li><li>More automation (that we don&apos;t trust)</li><li>Modern architecture (that we don&apos;t fully understand)</li></ul><h2 id="the-best-practices-trap">The &quot;Best Practices&quot; Trap</h2><p>Industry best practices say you should have:</p><ul><li>Infrastructure as Code (Terraform)</li><li>Container orchestration (Kubernetes)</li><li>GitOps (ArgoCD)</li><li>Observability (Prometheus + Grafana + Loki)</li><li>Secret management (Vault)</li><li>Security scanning (Trivy + Snyk)</li><li>Service mesh (Istio/Linkerd)</li></ul><p>So teams adopt all of them. Even when their needs are:</p><ul><li>3 microservices</li><li>100 requests per second</li><li>2 person team</li></ul><p>The best practices aren&apos;t wrong. They&apos;re just designed for problems most teams don&apos;t have. Netflix needs sophisticated observability. Your startup probably doesn&apos;t.</p><p>But we cargo cult the architecture because that&apos;s what &quot;good engineering&quot; looks like. Then we spend 60% of our time maintaining the infrastructure instead of building features.</p><h2 id="what-actually-matters">What Actually Matters</h2><p>I&apos;ve been thinking about what effective DevOps actually requires. Not the tools, the capabilities:</p><p><strong>Deployment Confidence</strong><br>Can you deploy without anxiety? Do you trust your deployment process?</p><p>This doesn&apos;t require GitOps. It requires:</p><ul><li>Automated tests that catch real issues</li><li>Rollback mechanism that works</li><li>Monitoring that detects problems quickly</li></ul><p>You can have this with GitHub Actions and <code>kubectl apply</code>. 
Or you can not have it with ArgoCD and Flux.</p><p><strong>Incident Response Speed</strong><br>When things break at 3 AM, can you fix them?</p><p>This doesn&apos;t require Grafana + Loki + Jaeger. It requires:</p><ul><li>Logs accessible from one place</li><li>Metrics that show what&apos;s wrong</li><li>Runbooks that work</li></ul><p>Simple alerting and basic log aggregation often work better than a sophisticated observability stack nobody understands.</p><p><strong>Development Velocity</strong><br>Can developers ship features without waiting on the platform team?</p><p>This doesn&apos;t require a sophisticated IDP. It requires:</p><ul><li>Clear ownership boundaries</li><li>Self-service capabilities</li><li>Good documentation</li></ul><p>Sometimes a well-documented <code>kubectl</code> template works better than a custom UI nobody maintains.</p><h2 id="what-im-actually-recommending">What I&apos;m Actually Recommending</h2><p>Not &quot;stop using tools.&quot; But &quot;stop collecting tools.&quot;</p><p><strong>Before Adding a New Tool, Ask:</strong></p><ul><li>What problem does this actually solve?</li><li>Do we have this problem?</li><li>Can we solve it with existing tools?</li><li>What&apos;s the maintenance cost?</li><li>Who will own this long-term?</li><li>What happens if this breaks?</li><li>Can we remove it later if it doesn&apos;t work out?</li></ul><p><strong>When You Already Have Too Many Tools:</strong></p><ul><li>Audit honestly: which tools are actually used?</li><li>Which ones could be replaced by simpler alternatives?</li><li>Which ones exist because someone went to a conference?</li><li>What would break if we removed X?</li></ul><p>Sometimes the answer is &quot;we really do need all 23 tools.&quot; More often it&apos;s &quot;we could probably do this with 12.&quot;</p><p><strong>Prefer Boring Technology</strong><br>Not because boring is better. 
Because boring is:</p><ul><li>Well understood</li><li>Well documented</li><li>Well supported</li><li>Debuggable by your entire team</li><li>Less likely to break in novel ways</li></ul><p>Postgres is boring. Kubernetes is becoming boring. Terraform is boring. Prometheus is boring.</p><p>Boring doesn&apos;t make exciting blog posts. But boring works at 3 AM when you&apos;re on call.</p><h2 id="the-discipline-we-need">The Discipline We Need</h2><p>The hard part isn&apos;t adding tools. Any team can do that. The hard part is saying no.</p><p>No to the latest hype. No to the conference talk solution. No to the tool that&apos;s technically superior but operationally complex. No to complexity for its own sake.</p><p>We need the discipline to:</p><ul><li>Keep infrastructure boring</li><li>Add tools only for real problems</li><li>Remove tools that don&apos;t work out</li><li>Resist FOMO about latest trends</li><li>Value simplicity over sophistication</li><li>Optimize for long-term maintainability</li></ul><p>This is unpopular. It feels like falling behind. It looks like you&apos;re not innovating.</p><p>But maintaining 12 tools well beats maintaining 23 tools poorly. And being able to debug your infrastructure at 3 AM beats having the most sophisticated architecture that nobody understands.</p><p>The goal isn&apos;t to have the best tools. The goal is to have working systems that don&apos;t require heroic efforts to operate.</p><p>Most teams would be better off with half the tools and twice the understanding.</p>]]></content:encoded></item><item><title><![CDATA[Why Jujutsu (jj) Is Perfect for AI-Generated Code]]></title><description><![CDATA[<p><em>If you&apos;re using AI to write code, you need better version control. Git wasn&apos;t designed for the iterative, experimental nature of AI-assisted development. 
Jujutsu is.</em></p><h2 id="the-problem-git-fights-ai-workflows">The Problem: Git Fights AI Workflows</h2><p>When AI writes your code, your development process changes fundamentally. Instead of carefully crafted commits,</p>]]></description><link>https://cesar.velandia.co/why-jujutsu-jj-is-perfect-for-ai-generated-code/</link><guid isPermaLink="false">68c90d9d15a35701511ccc23</guid><category><![CDATA[AI development]]></category><category><![CDATA[writing]]></category><dc:creator><![CDATA[César D. Velandia]]></dc:creator><pubDate>Wed, 02 Jul 2025 07:18:00 GMT</pubDate><media:content url="https://cesar.velandia.co/content/images/2025/09/jujutsu.png" medium="image"/><content:encoded><![CDATA[<img src="https://cesar.velandia.co/content/images/2025/09/jujutsu.png" alt="Why Jujutsu (jj) Is Perfect for AI-Generated Code"><p><em>If you&apos;re using AI to write code, you need better version control. Git wasn&apos;t designed for the iterative, experimental nature of AI-assisted development. Jujutsu is.</em></p><h2 id="the-problem-git-fights-ai-workflows">The Problem: Git Fights AI Workflows</h2><p>When AI writes your code, your development process changes fundamentally. Instead of carefully crafted commits, you&apos;re dealing with:</p><ul><li><strong>Rapid iteration cycles</strong>: AI generates, you test, refine the prompt, regenerate</li><li><strong>Experimental branches</strong>: Multiple attempts at the same problem with different approaches</li><li><strong>Frequent rewrites</strong>: AI rarely matches your actual architecture on the first try</li><li><strong>History chaos</strong>: Your commit log becomes a graveyard of &quot;fix AI-generated bug&quot; commits</li></ul><p>Traditional Git workflows crumble under this pressure. You end up with messy histories, complex rebases, and the constant fear of losing work during cleanup. 
It&apos;s like trying to write a novel with a typewriter instead of a word processor (technically possible, but you&apos;re fighting your tools).</p><h2 id="enter-jujutsu-version-control-for-the-ai-age">Enter Jujutsu: Version Control for the AI Age</h2><p>Jujutsu (jj) reimagines version control around &quot;changes&quot; instead of &quot;commits.&quot; This subtle shift unlocks workflows that are perfect for AI-assisted development.</p><h3 id="changes-vs-commits-the-time-machine-effect">Changes vs. Commits: The Time Machine Effect</h3><p>In Git, once you commit, you&apos;re committed (pun intended). Want to fix something three commits back? Welcome to rebase hell.</p><p>In jj, every change is mutable. You can edit any change at any time, and jj automatically rebases everything downstream. It&apos;s like having a time machine for your code.</p><p><strong>Real scenario</strong>: Your AI generates a new API endpoint, but after testing, you realize the interface needs adjustment. In Git, you&apos;d either:</p><ul><li>Add a &quot;fix interface&quot; commit (messy history)</li><li>Interactive rebase (risk breaking things)</li></ul><p>In jj, you just edit the original change. Everything depending on it updates automatically. No Git surgery required.</p><h3 id="perfect-for-ai-experimentation">Perfect for AI Experimentation</h3><p>AI development is inherently experimental. You&apos;ll often have multiple AI-generated solutions to compare. Jj&apos;s branching model makes this trivial:</p><pre><code class="language-bash"># Create three different AI attempts at the same feature
# Start each attempt from main so the three are siblings, not a stack
jj new main -m &quot;AI attempt 1: REST API approach&quot;
# ... let AI generate solution 1

jj new main -m &quot;AI attempt 2: GraphQL approach&quot;
# ... let AI generate solution 2

jj new main -m &quot;AI attempt 3: RPC approach&quot;
# ... let AI generate solution 3

# Check your current status
jj st
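
# Compare the attempts side by side; &lt;attempt-N&gt; below are placeholders
# for the change IDs that jj log prints, not literal syntax
jj log
jj diff --from &lt;attempt-1&gt; --to &lt;attempt-2&gt;

# Keep the winner and discard the rest
jj abandon &lt;rejected-attempt&gt;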
</code></pre><p>Your history stays clean, and you can easily compare approaches without complex Git gymnastics.</p><h3 id="stacked-changes-for-iterative-development">Stacked Changes for Iterative Development</h3><p>AI rarely gets complex features right on the first try. You&apos;ll typically iterate:</p><ol><li>AI generates basic structure</li><li>AI adds error handling</li><li>AI adds tests</li><li>AI optimizes performance</li></ol><p>In jj, these become a natural stack of changes:</p><pre><code>@ Add performance optimizations
&#x25C9; Add comprehensive tests
&#x25C9; Add error handling  
&#x25C9; Basic feature implementation
&#x25C9; main
</code></pre><p>When the AI needs to fix something in the basic implementation, you edit that change, and all the dependent changes automatically rebase. It&apos;s like editing the foundation of a house and having all the floors automatically adjust (no manual reconstruction required).</p><h3 id="bookmarks-git-branches-that-actually-make-sense">Bookmarks: Git Branches That Actually Make Sense</h3><p>Here&apos;s where jj really shines for GitHub workflows. Instead of Git&apos;s branch model, jj uses &quot;bookmarks.&quot;</p><p>In Git, you have to decide on a branch name before you know what you&apos;re building:</p><pre><code class="language-bash">git checkout -b feature/maybe-user-auth-or-something
</code></pre><p>In jj, you build first, name later:</p><pre><code class="language-bash"># Work on several changes
jj new -m &quot;Add user model&quot;
jj new -m &quot;Add authentication&quot;
jj new -m &quot;Add middleware&quot;

# Later, when you know what you built:
jj bookmark create user-auth-system
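
# If you stack more changes afterwards, point the bookmark at the new tip
jj bookmark set user-auth-system -r @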
</code></pre><p>Bookmarks are just pointers to changes. Multiple bookmarks can point to the same change, and they move automatically as you rebase. When you&apos;re ready for a PR:</p><pre><code class="language-bash">jj git push --bookmark user-auth-system
</code></pre><p>This pushes your bookmark as a Git branch to GitHub. Your PR workflow remains unchanged, but your local development becomes infinitely more flexible.</p><h3 id="the-ai-development-workflow">The AI Development Workflow</h3><p>Here&apos;s how jj transforms AI-assisted development:</p><p><strong>1. Start with Architecture</strong> (Install jj first: <a href="https://github.com/martinvonz/jj/releases">GitHub releases</a>)</p><pre><code class="language-bash"># In your existing Git repo
jj init --git-repo .   # newer jj releases use: jj git init --colocate

jj new -m &quot;Define API interface&quot;
# You design the interface, AI fills implementation
</code></pre><p><strong>2. Iterative Implementation</strong></p><pre><code class="language-bash">jj new -m &quot;Implement user service&quot;
# Let AI implement based on your interface

jj new -m &quot;Add validation&quot;
# AI adds validation layer

jj st  # Check your current status

jj new -m &quot;Add error handling&quot;
# AI improves error handling
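
# If a fix really belongs in the previous change, fold the working
# copy into its parent instead of stacking yet another change
jj squash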
</code></pre><p><strong>3. Discover Issues and Fix Retroactively</strong></p><pre><code class="language-bash"># Oh no, the interface needs adjustment
jj edit &lt;change-id&gt;   # jj edit takes a change ID (shown by jj log), not a description
# Make changes, everything downstream updates automatically
</code></pre><p><strong>4. Ship When Ready</strong></p><pre><code class="language-bash">jj bookmark create feature-user-service
jj git push --bookmark feature-user-service
</code></pre><p>Your final history tells the story of what was built, not how many times the AI hallucinated.</p><h3 id="handling-ais-favorite-mistake-the-everything-sandwich">Handling AI&apos;s Favorite Mistake: The Everything Sandwich</h3><p>AI loves to mix concerns. It&apos;ll add logging, error handling, database migrations, and new features all in one glorious mess. With jj, you can easily split these apart after the fact:</p><pre><code class="language-bash"># AI generated everything in one messy change
jj split
# Interactively split into logical pieces
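
# Then give each resulting piece a clear message;
# &lt;change-id&gt; is a placeholder for an ID from jj log
jj describe &lt;change-id&gt; -m &quot;Extract logging into its own change&quot;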
</code></pre><p>This is nearly impossible in Git without performing open-heart surgery on your repository.</p><h3 id="the-refactoring-advantage">The Refactoring Advantage</h3><p>Mitchell Hashimoto (creator of Vagrant, Terraform) notes that AI agents excel at refactoring: &quot;<a href="https://zed.dev/blog/agentic-engineering-with-mitchell-hashimoto">Anytime I ask it to do that, it&apos;s always perfect.</a>&quot;</p><p>Jj makes AI refactoring risk-free. Since any change is editable, you can let AI aggressively refactor, knowing you can always edit or revert specific changes without losing other work. It&apos;s like having an undo button that actually understands context.</p><h2 id="why-this-matters-for-ai-generated-code">Why This Matters for AI-Generated Code</h2><p>AI is changing how we write software. The traditional Git model (linear commits, careful history curation) was designed for human development patterns.</p><p>AI generates code differently:</p><ul><li>More experimental (they don&apos;t have egos to protect)</li><li>Rapid iteration (they don&apos;t get tired)</li><li>Frequent architectural changes (they don&apos;t fall in love with their first solution)</li><li>Multiple attempts at solutions (they&apos;re happy to start over)</li></ul><p>Jj was built for exactly this kind of workflow. It&apos;s not just better at handling AI-generated code; it transforms how you think about version control entirely. 
Git feels like accounting software after using jj (technically correct but unnecessarily painful).</p><h2 id="the-learning-curve-hours-not-weeks">The Learning Curve: Hours, Not Weeks</h2><p>Basic jj concepts map to Git:</p><ul><li><code>git add</code> &#x2192; <code>jj new</code> (creates a change)</li><li><code>git commit</code> &#x2192; automatic (changes are always &quot;committed&quot;)</li><li><code>git log</code> &#x2192; <code>jj log</code></li><li><code>git status</code> &#x2192; <code>jj st</code></li><li><code>git push</code> &#x2192; <code>jj git push --bookmark &lt;name&gt;</code></li><li><code>git branch</code> &#x2192; <code>jj bookmark create &lt;name&gt;</code></li></ul><p>The difference is that jj&apos;s model is simpler and more forgiving. It&apos;s version control designed for the AI era.</p><h2 id="takeaways">Takeaways</h2><ul><li><strong>Mutable Changes</strong>: Edit any change anytime, automatic downstream rebasing</li><li><strong>Experiment Freely</strong>: Easy comparison of multiple AI solutions</li><li><strong>Stack Naturally</strong>: Iterative development cycles become manageable</li><li><strong>Risk-Free Refactoring</strong>: Let AI aggressively refactor with easy rollback</li><li><strong>Bookmarks &gt; Branches</strong>: Name things when you understand them, not before</li><li><strong>Zero Migration Cost</strong>: Works with existing Git repos and GitHub workflows</li><li><strong>Simple Mental Model</strong>: Changes instead of commits reduces cognitive overhead</li></ul><h2 id="dive-deeper">Dive deeper</h2><p><strong><a href="https://reasonablypolymorphic.com/blog/jj-strategy">Jujutsu Strategies</a></strong></p><p><strong><a href="https://steveklabnik.github.io/jujutsu-tutorial/">Steve&apos;s Jujutsu Tutorial</a></strong></p>]]></content:encoded></item></channel></rss>