Deep Engineering Specials: Vibe Coding—Promise, Pressure, and Practical Limits

What recent research tells us about vibe coding: where it accelerates, where it breaks, and how to adopt it without undermining engineering discipline

Aug 04, 2025

Welcome to this special issue of Deep Engineering.

With this special issue we go beyond the hype of vibe coding. Drawing on first-party research from Microsoft, Google, IFS, and independent academics, we examine where this paradigm helps, where it breaks, and what it asks of software teams if it scales. For architects, leads, and developers navigating a shifting toolchain, this piece aims to provide some coordinates: empirical findings, adoption thresholds, and governance strategies.

To Vibe or Not to Vibe: That is the Question

A research-based examination of vibe coding’s promises, pitfalls, and what it means for the future of software teams.

According to Stack Overflow’s 2025 Developer Survey, nearly 72% of developers said “vibe coding” (defined as generating entire applications from prompts) is not part of their workflow, and a further 5% emphatically ruled out ever making it part of their workflow.

Empirical research and position papers published this year provide some more context.

Sarkar, A. (University of Cambridge and University College London) and Drosos, I. (Microsoft Research) conducted an observational study (June 2025) analyzing five publicly available coding sessions from YouTube and Twitch, in which developers self-identified their activity as "vibe coding." All participants were experienced programmers using conversational, LLM-based tools such as Cursor, Claude, and GPT-4o to complete real-world programming tasks. The researchers applied framework analysis to think-aloud transcripts, tool interactions, and coding behavior to identify prompting techniques, debugging strategies, and patterns of human-AI collaboration. They found that while developers reported efficiency gains in generating and adapting standard patterns, these advantages declined in more complex scenarios, where manual expertise and intervention remained essential.

Debugging AI-generated code remained a major friction point, often requiring developers to mentally reverse-engineer the logic or manually rewrite portions of the output. Importantly, users expressed consistent uncertainty about the correctness and reliability of generated code, underscoring that trust in the AI remained limited.

Sapkota, R., et al. (2025) conducted a structured literature review and conceptual comparison of two emerging AI-assisted programming paradigms: vibe coding and agentic coding. The paper defines vibe coding as an intent-driven, prompt-based programming style in which humans interact with an LLM through conversational instructions, iteratively refining output. By contrast, agentic coding involves AI agents that autonomously plan, code, execute, and adapt with minimal human input. The authors argue that these paradigms represent distinct axes in AI-assisted development—one human-guided and interactive, the other goal-oriented and autonomous.

They propose a comparative taxonomy based on ten dimensions, including autonomy, interactivity, task granularity, execution environment, and user expertise required. They claim that vibe coding excels in creative, exploratory, and early-stage prototyping contexts, while agentic coding shows promise in automating repetitive, well-scoped engineering tasks. However, both approaches face common challenges, including error handling, debugging, quality assurance, and system integration. The authors conclude that hybrid systems combining the strengths of vibe coding and agentic coding—conversational guidance with agentic automation—may be the most practical path forward.

Stephane H. Maes, CTO and CPO at IFS & ESSEM Research, in a position paper based on a literature review and enterprise experience (April 2025), states that code written through vibe coding often lacks documentation, architectural coherence, and design rationale. Without rigorous standards and tooling for verification, maintainability, and lifecycle control, the adoption of AI-generated code introduces operational risks. Maes proposes that successful adoption of vibe coding in production environments requires not just technical integration but structured governance: workflows, tooling, and cultural norms that enforce accountability, traceability, and testability. The core thesis is that “real coding is support and maintenance,” and vibe coding, in its current form, largely sidesteps these responsibilities.

And yet, despite these limitations and negative developer experience, vibe coding remains very much a part of the conversation. Why? Not because it works at scale today, but because it gestures toward a future where programming feels more like intent-driven design than manual construction. It flatters a seductive idea: that software can be summoned by describing it, rather than engineered line by line.

Gadde, A. (May 2025), in a paper based on a literature review, positions vibe coding more positively as the next evolution in AI-assisted software development, arguing that it significantly lowers barriers to entry by enabling users to generate working software from natural language prompts. Gadde characterizes vibe coding as a practical middle ground between low-code platforms and agentic AI systems, combining human intent expression with generative code synthesis. Unlike traditional development workflows, Gadde claims, vibe coding empowers users, even those without formal programming experience, to act as high-level specifiers, while generative models handle much of the underlying implementation.

Engineers don’t just build systems for today; they chart trajectories. And so, with today’s special feature, we aim to:

Identify where vibe coding works today (early-stage prototypes, educational contexts, speculative design),
Understand why it falls short elsewhere (debugging, integration, maintainability),
Anticipate the organizational and skill implications, so you can lead with context when the tooling matures.

Where and How Vibe Coding Helps

Vibe coding works best when the goal is to explore, not to ship; to experiment, not to scale. In these scenarios, its limitations are tolerable, and its productivity gains are real.

Contexts where vibe coding is most effective:

Rapid prototyping and ideation: The AI-assisted conversational workflow drastically accelerates early development. What once took weeks can often be scaffolded in hours. Solo developers, according to Ardor Labs, report building functional prototypes—from simple web apps to plugin systems—by iteratively prompting an LLM, adjusting results, and redeploying within a single day.
Startups and hackathons: Early-stage teams exploit vibe coding to punch above their weight. Y Combinator managing partner Jared Friedman has said that, “A quarter of the W25 startup batch have 95% of their codebases generated by AI.” In this context, code maintainability is a secondary concern; speed to demo or MVP is paramount.
Exploratory use by professionals: Developers may use vibe coding for spinning up proof-of-concepts or exploring unfamiliar frameworks, even if they ultimately rewrite the code manually. AI researcher Andrej Karpathy (the originator of the term vibe coding) himself has described this as ideal for “weekend projects” or “rapid ideation” scenarios.
One-click deployment pipelines: Google’s guide notes that coupling vibe coding with integrated cloud deployment creates “the fastest path from concept to a live, shareable application,” especially when platforms like Replit or Google Cloud streamline backend provisioning.
Lowering the barrier to entry: Because it uses natural language, vibe coding attracts those with minimal programming background. Google highlights that it makes “app building more accessible,” while Gadde frames it as the next phase in no-code evolution—enabling domain experts to act as high-level specifiers without writing syntax-bound code.
Educational and learning contexts: Sapkota et al. note that vibe coding performs well in educational and exploratory settings, particularly when the emphasis is on learning through experimentation rather than delivering production-ready systems. Students can engage in prompt-driven debugging or request scaffolded solutions to better understand programming constructs.

For all its speed and surface-level convenience, vibe coding introduces architectural liabilities that make experienced developers cautious—if not outright resistant—to using it beyond disposable or exploratory projects.

Limitations: Maintainability, Debugging, and Technical Debt

The issues with vibe coding are not just about code that fails to run, but about code that fails to last. Vibe coding shortcuts implementation, but often bypasses the rigor, clarity, and accountability that production-grade systems require.

Why vibe-coded software tends to erode under pressure:

Poor structural hygiene. AI-generated code often lacks internal consistency and coherent design. As Ardor Labs reports, repetitive prompting typically results in a patchwork of quick fixes, duplicated logic, and workarounds that accumulate into technical debt.
Invisible complexity. Maes notes that repeated AI-driven edits can produce systems even their authors no longer understand. Without documentation or rationale, the code becomes opaque—even to its original creator.
Debugging burdens. Because developers often see AI-generated code only after an error appears, root cause analysis becomes guesswork. IBM’s overview highlights the lack of clear architectural structure, making it harder to trace failures through unfamiliar logic paths.
Prompting is not a substitute for engineering judgment. While it's tempting to patch issues by prompting another fix, this iterative loop can obscure responsibility and create brittle dependencies. As some developers now observe, “using one AI to debug another” may sound clever but is often insufficient without human involvement.

Production pitfalls: performance, scale, and security

Scalability bottlenecks. Sarkar & Drosos observed that developers often had to switch from vibe coding to manual optimization as application complexity increased. AI-generated prototypes may appear functional but suffer from poor resource usage and brittle error handling when scaled.
Security vulnerabilities. A 2021 NYU cybersecurity study found that around 40% of the programs GitHub Copilot generated in security-relevant test scenarios contained exploitable flaws, from SQL injection risks to use of deprecated libraries. The same vulnerabilities can silently propagate into vibe-coded applications, especially when users copy output without review.
False confidence. Vibe coding’s conversational interface can lull developers—particularly those with limited experience—into accepting functional output as production-ready. As Ardor Labs warns, this “move fast” approach may ship apps that run but cannot be maintained, audited, or secured.
Neglected lifecycle thinking. Maes (2025) captures this gap directly: “coding can be done with ‘no code’” via AI, “but such code is not maintainable”—a critical failure if the system is expected to evolve beyond a demo.
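To make the SQL injection risk named in the security item above concrete, here is a minimal sketch using Python's standard sqlite3 module. The table, the function names, and the payload are hypothetical illustrations, not taken from the NYU study or from any tool's actual output. It contrasts a query built by string interpolation, the kind of pattern that is easy to accept unreviewed, with a parameterized query.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database with two users, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable pattern: user input is spliced directly into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized query: the value is bound separately from the SQL text
    # and is only ever treated as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # every row matches
    print(len(find_user_safe(conn, payload)))    # no row matches
```

Both functions run without error and return correct results for ordinary names, which is why this class of flaw tends to survive a "does it work?" check and only surfaces under review or attack.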

For all its promise, vibe coding comes with serious “gotchas” that make seasoned engineers hesitant to use it in production. But as the attention the paradigm continues to attract shows, developers and enterprises are not giving up on it yet.

Workforce and Organizational Implications

The rise of vibe coding raises important questions about software engineering roles, required skills, and how organizations should adapt. Who stands to benefit the most, and whose work might be displaced or transformed?

Democratization vs. De-skilling

Vibe coding lowers the barriers to entry. Non-developers and junior developers can now build software that once required full-stack expertise. A solo entrepreneur, equipped only with a vision and the right AI tools, can ship a working prototype. In this framing, the AI serves as a kind of expert consultant, accelerating iteration and enabling domain specialists to turn ideas into software without hiring a team. This democratization of software creation is one of vibe coding’s most widely advertised benefits.

But this accessibility comes with a paradox. Heavy reliance on AI for everyday coding tasks can cause skills to atrophy. Ray, P. (May 2025) identifies this as a core concern: if developers grow accustomed to prompting and accepting output without deep understanding, they risk losing the foundational skills required to validate, debug, and maintain that software.

The illusion of productivity can further obscure the issue. A 2025 METR study, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity,” found that “When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.”

Without strong engineering judgment, the value of AI assistance can quickly become negative.

Who Benefits—and Who Might Be Left Behind?

In its current form, vibe coding offers the greatest leverage to small, agile teams and individuals operating under time constraints. For early-stage startups, the appeal is obvious: speed to prototype, speed to market. For these teams, robustness is a secondary concern—shipping something that works, even partially, is often enough to secure feedback, funding, or traction. Similarly, larger organizations may use vibe coding to prototype features quickly without committing senior developer time, particularly in product discovery phases.

By contrast, engineers at companies with established production systems remain cautious. The architectural demands of long-lived systems, along with maintainability and security concerns, make “pure” vibe coding untenable. Google’s guidance distinguishes between two modes: an “experimental” vibe coding mode suited to rapid ideation, and a “disciplined” mode in which the AI acts as a subordinate pair-programmer, with the human remaining accountable for quality.

This bifurcation in usage reflects a broader split in how developers perceive AI's impact on the profession. According to the 2025 Stack Overflow Developer Survey, 64% of respondents said they do not view AI tools—including coding assistants—as a threat to their employment. Instead, many see these tools as a way to offload repetitive work and focus on higher-order engineering problems. However, that figure has dropped from 68% the previous year, indicating a subtle but real shift: developers increasingly recognize that roles are evolving, and that staying competitive will require new skills.

The differentiator is not whether one uses AI, but how. Engineers who add prompt engineering, AI supervision, and LLM-aware debugging to their toolset will likely outperform those who default to traditional workflows for all tasks. Conversely, those who resist this shift entirely may find themselves outpaced—not by the AI, but by peers who know how to manage it effectively.

Leadership Response: Strategic Adoption with Guardrails

For CTOs, software architects, and engineering leads, the responsible response to vibe coding is neither rejection nor blind adoption, but strategic containment. Its introduction should be scoped to workflows where quality risk is minimal and speed adds clear value—such as internal prototypes, automated test generation, or scaffolding of non-critical features that engineers can later refactor. Governance is essential. Maes proposes structured frameworks like VIBE4M, which emphasize verification, maintainability, and monitoring as prerequisites for accepting AI-generated code into supported systems. Even in the absence of formal frameworks, the principle holds: all AI contributions must undergo human review. Review checklists may need to explicitly flag AI-authored code for scrutiny, and CI pipelines should incorporate tools like Snyk or ESLint with AI-focused rules to catch common faults. These checks inevitably introduce friction—but they are precisely what distinguish engineering from experimentation. As Maes notes, rigorous validation “goes against the trend [of] AI makes developers more productive” in the short term, but is non-negotiable for sustainable practice.
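One way to picture the "all AI contributions must undergo human review" principle as a CI check is the sketch below. It assumes a purely hypothetical team convention in which commits carry an "AI-Assisted: true" trailer and reviewers add a "Reviewed-by:" trailer. Neither trailer, nor the Commit type, nor the function names come from Maes, VIBE4M, Snyk, or ESLint; they are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    message: str


def parse_trailers(message):
    """Collect 'Key: value' lines from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    trailers = {}
    if not paragraphs:
        return trailers
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            trailers.setdefault(key.strip().lower(), []).append(value.strip())
    return trailers


def unreviewed_ai_commits(commits):
    """Return SHAs of commits marked AI-assisted that lack a Reviewed-by trailer."""
    flagged = []
    for commit in commits:
        trailers = parse_trailers(commit.message)
        ai_assisted = any(
            value.lower() == "true" for value in trailers.get("ai-assisted", [])
        )
        if ai_assisted and not trailers.get("reviewed-by"):
            flagged.append(commit.sha)
    return flagged


if __name__ == "__main__":
    commits = [
        Commit("a1", "Add parser\n\nAI-Assisted: true\nReviewed-by: Dana <dana@example.com>"),
        Commit("b2", "Add cache\n\nAI-Assisted: true"),
        Commit("c3", "Fix typo"),
    ]
    print(unreviewed_ai_commits(commits))
```

A pipeline step could fail the build whenever the returned list is non-empty. The check only works if contributors label AI-assisted commits honestly, so it supports the review culture described here rather than replacing it.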

Equally critical is the cultural framing of vibe coding within teams. Leaders should position it not as a shortcut, but as a collaboration—one that still demands comprehension, accountability, and domain judgment. Encouraging developers to re-express or review AI-generated solutions—whether to a colleague or back to the model—can ensure they understand the logic they are deploying. This guards against blind acceptance and reinforces human agency. Forward-looking leaders will also recognize and reward the kinds of work AI cannot yet replicate: deep architectural reasoning, creative problem decomposition, and user empathy. These capabilities will define developer impact in a world where code generation is easy but understanding remains hard.

When it comes to delivering reliable, maintainable systems at scale, the fundamentals of software engineering still apply. The organizations that will benefit most are those that blend the “vibes” with vigilance: embracing AI-driven development to speed up outcomes, while doubling down on human expertise in architecture, validation, and security to ensure those outcomes stand the test of time. In doing so, we can harness the promise of vibe coding – conversational and intuitive development – without losing the hard-won lessons of decades of engineering practice.

Vibe Coding in Practice: The Tooling Landscape

In the pre-publication paper "A Review on Vibe Coding: Fundamentals, State-of-the-art, Challenges and Future Directions," Ray, P. presents a qualitative, exploratory analysis of non-peer-reviewed sources such as product blogs, documentation, and public demos. The paper surveys a wide range of vibe coding tools (natural-language-driven development environments) and maps them along an interaction spectrum, from delegation to pairing, and a layered stack architecture extending from prompt interfaces to deployment infrastructure. It highlights the growing sophistication of both browser-native platforms and IDE-integrated agents. A summary follows.

Browser-native platforms feature prominently. Tools such as v0 by Vercel, Bolt.new, Create, and Lazy AI allow users to scaffold, preview, and deploy full-stack applications from prompt-based workflows. These platforms commonly embed frontend frameworks like Next.js and Tailwind, along with real-time CI/CD, auth, and database orchestration. Others—Trickle AI and Napkins.dev—generate UIs from screenshots or sketches, while HeyBoss, Softgen, and Rork focus on zero-config application builds with export to GitHub or direct deployment.

IDE-integrated tools like Cursor, Cody, and Zed offer agent-assisted development with context-aware completions, semantic diffs, and local vector search. More advanced platforms such as Windsurf and Zencoder AI incorporate retrieval-augmented generation, multi-agent workflows, and enterprise readiness features. Some, including Cline and Trae AI, extend into terminal and plugin-based workflows, supporting Git integration, shell execution, and modular agent control.

Finally, autonomous coding agents—notably Devin AI and All Hands AI—aim to handle entire software lifecycles: building, testing, debugging, and deploying with minimal human intervention.

Ray’s survey suggests that these tools do not converge on a single model or interface. Instead, they reflect a broader shift: from programming as manual construction to software as orchestrated dialogue between developer intent and agentic execution. Read the complete paper.

That’s all for today. Thank you for reading this special issue of Deep Engineering. We’re just getting started, and your feedback will help shape what comes next. Just reply to this email to tell us what you think.

Stay awesome,
Divya Anne Selvaraj
Editor-in-Chief, Deep Engineering

If your company is interested in reaching an audience of developers, software engineers, and tech decision makers, you may want to advertise with us.
