Vibe Coding

I have worked on systems built through vibe coding. The problems compound faster than conventional technical debt accumulates.

Vibe coding means prompting an AI to generate code rapidly, often without reading or understanding what was produced. The output looks correct, the demo runs, and the developer moves on. The problems appear later — in production, under load, during an incident, or when someone else inherits the codebase.

The problems below are real. Two recur most often: test coverage that is weak or fake, and the false sense of competence that a working demo creates. I cover both in detail toward the end.

Nobody Understands the System

When something fails in a system built through iterative prompting, the developer often cannot explain what the system is doing or why. The architecture was never reasoned through; it was accumulated through successive prompts. Each prompt solved an immediate problem. Nobody thought about the system as a whole.

This creates a specific kind of incident: the failure is in production, time pressure is high, and the person responsible has no mental model to debug against. They cannot form a hypothesis about the failure mode because they never built a mental representation of the system's assumptions.

Reading code you did not write is different from understanding it. Understanding requires engaging with the reasoning behind the design. That reasoning does not exist in a vibe-coded system.

The Security Layer Is Invisible Until It Is Missing

AI-generated code is optimized for functionality. A model asked to implement an endpoint will implement an endpoint. Authentication, authorization, input validation, rate limiting, and secret handling usually require explicit framing to appear in the output. When the prompt does not ask for them, the model often leaves them out. The absence is not visible. The code compiles, the tests pass, and the feature ships.

Security holes in AI-generated code look like normal code. They pass code review from anyone reading only for surface correctness. The reviewer has no reason to suspect something is missing because the code is syntactically sound and the endpoint works.
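
A minimal sketch of what this can look like, using plain Python functions rather than any real web framework. The names INVOICES, get_invoice_unsafe, get_invoice, and user_id are hypothetical and exist only for illustration. Both handlers return the right data when the caller is honest; only one checks who is asking.

```python
# Hypothetical in-memory data store, for illustration only.
INVOICES = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 75},
}


def get_invoice_unsafe(invoice_id):
    """The kind of handler a bare 'implement the endpoint' prompt can yield."""
    # Works for every valid ID, for every caller, including the wrong ones.
    return INVOICES[invoice_id]


def get_invoice(invoice_id, user_id):
    """Same lookup, plus an ownership check that nobody asked for."""
    invoice = INVOICES.get(invoice_id)
    # Treat 'missing' and 'not yours' the same so IDs cannot be probed.
    if invoice is None or invoice["owner"] != user_id:
        raise PermissionError("invoice not found or not accessible")
    return invoice
```

Nothing about get_invoice_unsafe looks wrong in review. The problem is a line that is not there.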

Technical Debt at Generative Speed

Manual technical debt accumulates at the rate a developer can produce code. AI-generated technical debt accumulates as fast as prompts can be issued. Six months of AI-assisted output can contain the structural problems of a decade of unmaintained code.

AI optimizes for immediate completion. It has no stake in long-term maintainability. It cannot reference decisions made six months ago, reversals that happened in between, or invariants the system was supposed to maintain. Each prompt is answered locally. The global structure deteriorates out of sight, and nobody notices the accumulation until modifying the code becomes dangerous.

Debugging Costs More Than Writing Would Have

Vibe coding assumes that rapid generation saves time overall. This often proves false. LLMs produce plausible-looking code containing subtle logic errors — off-by-one mistakes, incorrect state transitions, silent data truncation, wrong concurrency assumptions. The code is syntactically correct, passes visual inspection, and fails in specific conditions that require understanding the implementation to diagnose.
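
As an illustration of the off-by-one category, here is a hypothetical pagination helper. The function names are mine, not taken from any real codebase. The buggy version gives the right answer whenever the item count divides evenly, which is exactly what a quick demo tends to exercise.

```python
def page_count_buggy(total_items, page_size):
    # Floor division silently drops the final partial page.
    return total_items // page_size


def page_count(total_items, page_size):
    # Ceiling division: a partial page still counts as a page.
    return (total_items + page_size - 1) // page_size


# With 100 items and pages of 10, both versions agree on 10 pages.
# With 101 items, the buggy version still says 10 and the last item vanishes.
```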

In my experience, developers often spend longer debugging AI-generated code than they would have spent writing it from scratch. The time saved at generation is repaid with interest during debugging. The cost is higher because the person debugging usually did not write the code and has no intuition about its failure modes.

The Codebase Becomes Incoherent

Different prompts produce different abstractions, naming conventions, and patterns for the same problem class. One session produces functional composition throughout; the next introduces class hierarchies over the same domain. There is no integration between them because there was no architect, only a sequence of disconnected prompts.

Over time the codebase becomes difficult to read as a whole. Reading one module tells you little about how another works. The code contradicts itself in patterns that have no clean resolution short of a rewrite.

What AI Does Not Generate

AI generates the happy path. It generates the common case, the expected input, the straightforward flow. It does not reliably generate uncommon states, failure modes, recovery logic, concurrency handling, permission edge cases, or data integrity constraints that make a system function correctly under real conditions.

These are the parts that require reasoning about what can go wrong, which requires understanding the system well enough to enumerate failure modes. That understanding never develops in vibe-coded systems.
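
A small sketch of the gap, assuming a hypothetical form field that carries an order quantity. The names parse_quantity_happy, parse_quantity, and the maximum of 1000 are illustrative assumptions. The first function is the happy path; the second is what it looks like once someone has asked what can go wrong.

```python
def parse_quantity_happy(raw):
    # Fine for "3". Raises on "" or "three", and accepts "-5" without complaint.
    return int(raw)


def parse_quantity(raw, maximum=1000):
    """Return a validated quantity, or raise ValueError with a clear reason."""
    try:
        value = int(raw.strip())
    except (AttributeError, ValueError):
        # AttributeError covers non-string input such as None.
        raise ValueError(f"quantity must be a whole number, got {raw!r}") from None
    if not 1 <= value <= maximum:
        raise ValueError(f"quantity must be between 1 and {maximum}, got {value}")
    return value
```

The second version is longer only because it encodes decisions: what counts as valid, what the upper bound is, and what the caller sees on failure. Those decisions are the part that has to come from someone who understands the system.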

Context Window Failures

AI models have finite context. On large systems, the model answers prompts without access to the full history of the codebase: decisions that were made and reversed, APIs that were deprecated, constraints discovered after the initial design. The result is new code that conflicts with old code, duplicated logic across modules, and regressions from additions that had no visibility into what they were changing.

The model produces confident output based on incomplete context. It cannot signal what it has forgotten. The gaps only become visible when the system breaks.

False Sense of Competence

Non-programmers can reach a working demo in hours using AI tools. This creates a specific danger: the person who built the demo believes they have built a system.

A demo does not require authentication. It does not handle error states. It does not degrade gracefully under load. It has no deployment strategy, no monitoring, no incident response, no data backup policy. The person who built it in four hours often cannot reason about any of these, not because they lack ability, but because they were never required to engage with them. The output gave them no signal that anything was missing.

The surface appearance of AI-assisted output is indistinguishable from production engineering. That is what makes this dangerous. The person is not lying when they say the system works. It does — in the conditions they tested, which are the conditions the AI imagined. Everything outside that envelope is unknown, and they have no way to know what they do not know.

Test Coverage Is Weak or Fake

This is the worst one.

AI-generated tests often fail in a specific way. They test the implementation by mirroring it — they check that the code does what the code does, not what the system should do. Tests written by reading the implementation and asserting its current behavior provide no signal when that behavior is wrong. They pass when the code is broken because they were written to describe the broken code.
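
To make the mirroring problem concrete, here is a deliberately broken function with two tests. Everything in it (apply_discount, mirror_test, spec_test, the prices) is a hypothetical example. The first test restates the implementation; the second states the requirement independently.

```python
def apply_discount(price_cents, percent):
    # Bug: the discount is added instead of subtracted.
    return price_cents + price_cents * percent // 100


def mirror_test():
    # Recomputes the expected value with the same formula as the code,
    # so it passes even though the code is wrong.
    price, percent = 2000, 10
    return apply_discount(price, percent) == price + price * percent // 100


def spec_test():
    # States the requirement directly: 10% off 2000 cents is 1800 cents.
    return apply_discount(2000, 10) == 1800
```

Here mirror_test() returns True and spec_test() returns False. Only the second one knows anything about what the system is supposed to do.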

Some developers skip tests entirely because the generated output looks correct. Looking correct is not a property that survives edge cases, changed dependencies, or data that was not anticipated when the code was written.

The deeper problem emerges when something fails. A vibe coder who encounters a bug will paste the error into the AI and ask it to fix the problem. The AI produces a patch and says "I have fixed it." The conversation ends there. The developer marks the issue closed.

The AI cannot verify that the fix is correct. It cannot run the code. It cannot observe the system under real conditions. It cannot check whether the patch introduced a regression, whether the root cause was actually addressed, or whether the new code passes the tests that were supposed to catch exactly this kind of failure. The AI says it has fixed the problem because producing a confident answer is what the model is trained to do. Verifying correctness is not.

This loop (bug appears, AI is asked, AI says fixed, nobody checks) is how vibe-coded systems accumulate silent failures. Each fix is an unvalidated guess. The test suite, if one exists, was written by the same process and carries the same assumptions. Nothing in the workflow catches anything.
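
Breaking the loop does not need much machinery. The sketch below is one hypothetical way to check a proposed fix before closing the issue: confirm the bug case actually fails on the old code, passes on the new code, and that previously correct cases still pass. The helper verify_fix and the example functions are assumptions made for illustration, and the sketch assumes the functions return values rather than raise.

```python
def verify_fix(old_impl, new_impl, bug_case, regression_cases):
    """Return a list of problems; an empty list means the fix checks out.

    Each case is an (args, expected) pair.
    """
    problems = []
    bug_args, bug_expected = bug_case
    if old_impl(*bug_args) == bug_expected:
        problems.append("bug case does not reproduce the failure on the old code")
    if new_impl(*bug_args) != bug_expected:
        problems.append("new code still fails the bug case")
    for args, expected in regression_cases:
        if new_impl(*args) != expected:
            problems.append(f"regression on input {args!r}")
    return problems


def old_pages(total, size):
    return total // size  # drops the partial last page


def patched_pages(total, size):
    return total // size + 1  # a plausible-looking patch that is wrong


def correct_pages(total, size):
    return (total + size - 1) // size


bug_case = ((101, 10), 11)
known_good = [((100, 10), 10), ((0, 10), 0)]

patched_report = verify_fix(old_pages, patched_pages, bug_case, known_good)
correct_report = verify_fix(old_pages, correct_pages, bug_case, known_good)
```

The plausible patch fixes the reported input and breaks two that used to work, so patched_report lists two regressions while correct_report is empty. A claim of "fixed" says nothing about either outcome; running the cases does.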

The Review Burden Falls on Senior Engineers

AI-generated code does not eliminate code review. It increases the volume of code requiring review while reducing its average quality. Senior engineers end up reading large amounts of output that was never carefully reasoned through — looking for security holes, logic errors, architectural violations, and missing edge cases that the generating developer never considered.

The review cost is high enough that some projects have restricted or banned AI-generated contributions. The productivity gain at generation does not always survive the review overhead that follows.

The Core Problem

The failure is not AI-generated code. The failure is code with no owner — where nobody can explain the design, nobody verified the security properties, nobody thought through the failure modes, and when something breaks, the response is to paste the error into the AI and trust the response.

Strong developers using AI tools verify the output, own the architecture, reason through the edge cases, and treat the model as something that accelerates implementation of decisions they have already made. They do not ask the AI if the fix worked. They check.

That is the entire difference.
