// LUCA //

Vibe Coding

I have had to work on fixing vibe-coded systems. The number of problems I have seen is unbelievable.

Vibe coding means prompting an AI to generate code rapidly, often without reading or understanding what was produced. The output looks correct, the demo runs, and the developer moves on. The problems appear later — in production, under load, during an incident, or when someone else inherits the codebase.

Most of the problems below are real things I have encountered. Two of them are the worst offenses I keep running into: test coverage that is weak or fake, and the false sense of competence. I will get to those.

Nobody Understands the System

When something fails in a system built through iterative prompting, the developer often cannot explain what the system is doing or why. The architecture was never reasoned through — it was accumulated. Each prompt solved an immediate problem. The emergent whole was never the object of anyone's thinking.

This creates a specific kind of incident: the failure is in production, time pressure is high, and the person responsible has no mental model to debug against. They cannot form a hypothesis about the failure mode because they never understood the assumptions the system was built on.

Code you did not write is not automatically code you understand. Understanding requires deliberate engagement with the reasoning behind the implementation.

The Security Layer Is Invisible Until It Is Missing

AI-generated code is structurally optimized toward functionality. A model asked to implement an endpoint will implement an endpoint. Authentication, authorization, input validation, rate limiting, and secret handling are ancillary concerns — they require deliberate framing to appear in the output. When the prompt does not ask for them, the model does not include them, and the absence is not visible. The code compiles, the tests pass, and the feature ships.

Security holes in AI-generated code are not dramatic. They look like normal code. They pass code review from anyone who is also just reading for surface correctness.
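
To make that concrete, here is a minimal sketch in plain Python, with no web framework. Everything in it is hypothetical and exists only for illustration: the `INVOICES` data, the `Request` type, and both handler names. The first handler is the kind of thing a functionality-only prompt tends to yield. The second adds the checks nobody asked for.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory data, for illustration only.
INVOICES = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 75},
}


@dataclass
class Request:
    user: Optional[str]  # None means the caller is not authenticated
    invoice_id: int


class Forbidden(Exception):
    """Raised when the caller may not see the requested invoice."""


def get_invoice_unchecked(req: Request) -> dict:
    # Works in the demo, and returns any invoice to anyone who asks.
    return INVOICES[req.invoice_id]


def get_invoice_checked(req: Request) -> dict:
    # Authentication: is there a caller at all?
    if req.user is None:
        raise Forbidden("authentication required")
    invoice = INVOICES.get(req.invoice_id)
    # Authorization: same error for "missing" and "not yours",
    # so a caller cannot probe which ids exist.
    if invoice is None or invoice["owner"] != req.user:
        raise Forbidden("not found or not permitted")
    return invoice
```

Both functions look like normal code. Only one of them survives a request for somebody else's invoice, and nothing in the first one tells a reviewer that a check is absent.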

Technical Debt at Generative Speed

Manually written technical debt accumulates slowly, because a developer can only write so much code per day. AI-generated technical debt accumulates as fast as prompts can be issued. A codebase that absorbed six months of AI-assisted output can contain the structural incoherence of a decade of neglect.

AI optimizes for immediate completion. It does not have a stake in the long-term maintainability of the system. It does not know what the codebase looked like six months ago, what decisions were reversed and why, or what invariants the system is supposed to maintain. Each prompt is answered locally. The global structure deteriorates without anyone noticing until it becomes difficult to modify anything without breaking something else.

Debugging Costs More Than Writing Would Have

There is an implicit assumption in vibe coding that generating code quickly saves time overall. This is often false. LLMs produce plausible-looking code that can contain subtle logic errors — off-by-one mistakes, incorrect state transitions, silent data truncation, wrong concurrency assumptions. The code is syntactically correct, passes superficial inspection, and fails in specific conditions that require actually understanding the implementation to diagnose.
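
A small sketch of what such an error can look like, using an off-by-one in pagination. The function names are made up for this example and do not come from any real codebase.

```python
def page_count_buggy(total_items: int, page_size: int) -> int:
    # Plausible at a glance, and correct whenever total_items is an
    # exact multiple of page_size. Floor division silently drops the
    # final partial page otherwise.
    return total_items // page_size


def page_count(total_items: int, page_size: int) -> int:
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Ceiling division: a partial last page still counts as a page.
    return (total_items + page_size - 1) // page_size


# The demo dataset has 100 items, so both versions agree.
assert page_count_buggy(100, 10) == page_count(100, 10) == 10

# One extra item and the buggy version hides it from every user.
assert page_count_buggy(101, 10) == 10
assert page_count(101, 10) == 11
```

The buggy version passes the demo and a casual review. It fails only when the data stops being a round number, which is usually in production.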

Developers can spend longer debugging AI-generated code than they would have spent writing clean code directly. The time saved at generation is paid back with interest at debugging. The cost is compounded because the person debugging often did not write the code and has no intuition about where to look.

The Codebase Becomes Incoherent

Different prompts produce different abstractions, different naming conventions, different patterns for solving the same class of problem. One session produces functional composition throughout; the next produces a class hierarchy over the same domain. There is no negotiation between them because there was no architect — only a sequence of prompts.

Over time the codebase becomes difficult to reason about as a whole. Reading one module gives you no reliable information about how another module works. The code contradicts itself in ways that have no clean resolution.

What AI Does Not Generate

AI generates the happy path. It generates the common case, the expected input, the straightforward flow. What it does not reliably generate are the uncommon states, the failure modes, the recovery logic, the concurrency handling, the permission edge cases, and the data integrity constraints that make a system behave correctly under real conditions.

These are the parts of a system that require reasoning about what can go wrong — which requires understanding the system well enough to enumerate the ways it can fail. That understanding is exactly what vibe coding never builds.
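
As an illustration, here is a hypothetical balance transfer written twice. The names and the dictionary-based "ledger" are assumptions made for the sketch. The first version is the happy path. The second enumerates the ways the operation can fail before it touches any data.

```python
def transfer_happy_path(balances: dict, src: str, dst: str, amount: int) -> None:
    # Correct for the expected input and nothing else.
    balances[src] -= amount
    balances[dst] += amount


def transfer(balances: dict, src: str, dst: str, amount: int) -> None:
    # Validate everything first, so a rejected transfer changes nothing.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if src == dst:
        raise ValueError("source and destination must differ")
    if src not in balances or dst not in balances:
        raise KeyError("unknown account")
    if balances[src] < amount:
        raise ValueError("insufficient funds")
    balances[src] -= amount
    balances[dst] += amount


# Happy path with a mistyped destination: the debit happens,
# the credit raises, and the money is gone.
ledger = {"alice": 100, "bob": 0}
try:
    transfer_happy_path(ledger, "alice", "bbo", 30)
except KeyError:
    pass
assert ledger == {"alice": 70, "bob": 0}

# The checked version rejects the same call and leaves the data intact.
ledger = {"alice": 100, "bob": 0}
try:
    transfer(ledger, "alice", "bbo", 30)
except KeyError:
    pass
assert ledger == {"alice": 100, "bob": 0}
```

This sketch still ignores concurrency and durability. The point is only that every added line in the second version comes from asking what can go wrong, which is a question the happy path never raises.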

Context Window Failures

AI models have finite context. On large systems, the model is answering prompts without access to the full history of the codebase — the decisions that were made and reversed, the APIs that were deprecated, the constraints that were discovered after the initial design. The result is new code that conflicts with old code, duplicated logic across modules, and regressions introduced by additions that had no visibility into what they were changing.

The model cannot tell you it has forgotten something. It produces confident output based on incomplete context, and the gaps are only visible when the system breaks.

False Sense of Competence

Non-programmers can reach a working demo in hours using AI tools. This creates a specific and dangerous illusion: the person who built the demo believes they have built a system.

A demo typically has no authentication. It does not handle error states. It does not degrade gracefully under load. It has no deployment strategy, no monitoring, no incident response, no data backup policy. The person who built it in four hours often cannot evaluate any of these dimensions. That is not because they are incapable, but because they were never required to engage with them and the output gave them no signal that anything was missing.

The surface output of AI-assisted development is visually indistinguishable from production engineering. This is what makes the false sense of competence dangerous. The person is not lying when they say the system works. It does work — in the conditions they tested, which are the conditions the AI imagined when generating the code. Everything outside that envelope is unknown, and they do not know that they do not know.

Test Coverage Is Weak or Fake

This is the worst one.

AI-generated tests are often useless in a specific way. They test the implementation by mirroring it — they check that the code does what the code does, not that the system does what the system should do. Tests written by reading the implementation and asserting its current behavior provide no signal when that behavior is wrong. They pass when the code is broken because they were written to describe the broken code.
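
Here is a minimal sketch of the difference. The `apply_discount` function and its bug are invented for the example, and the checks are plain functions with asserts rather than any particular test framework.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    # Deliberate bug for illustration: subtracts the percentage
    # as if it were an amount in cents.
    return price_cents - percent


def check_mirrors_implementation() -> None:
    # Written by reading the code. It restates the implementation,
    # so it passes even though the implementation is wrong.
    price, percent = 2000, 10
    assert apply_discount(price, percent) == price - percent


def check_states_requirement() -> None:
    # Written from the requirement: 10% off 20.00 is 18.00.
    assert apply_discount(2000, 10) == 1800


check_mirrors_implementation()  # passes, and proves nothing

try:
    check_states_requirement()
    caught_bug = False
except AssertionError:
    caught_bug = True

assert caught_bug  # only the requirement-based check notices the bug
```

The first check can never fail unless someone edits the function and forgets the test. The second encodes an expected value that came from outside the code, which is the only kind of test that can disagree with the code.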

Some developers skip tests entirely because the generated output looks correct. "Looks correct" is not a property that survives contact with edge cases, changed dependencies, or data that was not anticipated when the code was written.

But the deeper problem is what happens when something fails. A vibe coder who encounters a bug will paste the error into the AI and ask it to fix the problem. The AI will produce a patch and say "I have fixed it." The conversation ends there. The developer marks the issue closed.

The AI has not verified that the fix is correct. In a plain chat session it cannot run the code or observe the system under real conditions. It has not checked whether the patch introduced a regression, whether the root cause was actually addressed, or whether the new code passes the tests that were supposed to catch exactly this kind of failure. The AI says it has fixed the problem because the model is trained to produce a confident response, not to verify correctness.

This loop — bug appears, AI is asked, AI says fixed, nobody checks — is how vibe-coded systems accumulate silent failures. Each fix is a guess that was never validated. The test suite, if one exists, was written by the same process and carries the same assumptions. Nothing in the workflow catches anything.
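
Breaking the loop does not take much. One possible shape, sketched below with made-up names: write the reported bug down as a case with a known expected result, confirm the old code fails it, and accept the patch only if the new code passes it along with the cases that already worked.

```python
def normalize_email_v1(raw: str) -> str:
    # Original code. Reported bug: surrounding whitespace is kept.
    return raw.lower()


def normalize_email_v2(raw: str) -> str:
    # Proposed patch, wherever it came from.
    return raw.strip().lower()


# (input, expected) pairs. The first reproduces the bug report,
# the second guards behavior that already worked.
REGRESSION_CASES = [
    (" Ada@Example.com \n", "ada@example.com"),
    ("a@b.c", "a@b.c"),
]


def failing_cases(fn) -> list:
    """Return the inputs for which fn does not give the expected result."""
    return [raw for raw, expected in REGRESSION_CASES if fn(raw) != expected]


def fix_is_verified(before, after) -> bool:
    # A fix counts only if the cases reproduce the bug on the old code
    # and the new code passes all of them.
    return bool(failing_cases(before)) and not failing_cases(after)


assert failing_cases(normalize_email_v1) == [" Ada@Example.com \n"]
assert fix_is_verified(normalize_email_v1, normalize_email_v2)
```

The check on the old code matters as much as the check on the new one. If the case does not fail before the patch, it was never testing the bug, and a green result after the patch means nothing.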

The Review Burden Falls on Senior Engineers

AI-generated code does not eliminate the need for code review. It increases the volume of code requiring review while reducing its average quality. Senior engineers end up reading large amounts of output that was never carefully reasoned through — looking for the security holes, the logic errors, the architectural violations, and the missing edge cases that the generating developer never considered.

The review cost is real and high enough that some projects have started restricting or banning AI-generated contributions. The productivity gain at generation does not always survive the review overhead that follows.

The Core Problem

The failure is not AI-generated code. The failure is code with no owner — where nobody can explain the design, nobody verified the security properties, nobody thought through the failure modes, and when something breaks, the response is to paste the error into the AI and trust the reply.

Strong developers using AI tools verify the output, own the architecture, reason through the edge cases, and treat the model as something that accelerates implementation of decisions they have already made. They do not ask the AI if the fix worked. They check.

That is the entire difference.
