New Tools Showboat and Rodney Help AI Coding Agents Demonstrate Their Work
Developer Simon Willison has released two open-source tools designed to solve a critical challenge in AI-assisted programming: getting coding agents to effectively demonstrate and prove their work. Showboat creates automated documentation, while Rodney provides browser automation, allowing AI systems to showcase functional software.

February 10, 2026 – As artificial intelligence coding agents become increasingly sophisticated, a fundamental challenge has emerged: how can these AI systems effectively demonstrate that the software they've built actually works? According to developer Simon Willison, whose work on AI-assisted programming has gained significant attention, this problem goes beyond traditional automated testing and requires new approaches to verification.
Willison has just released two open-source tools aimed at this exact challenge: Showboat and Rodney. These tools represent a practical response to what he identifies as a critical gap in the AI development workflow.
The Demonstration Problem in AI-Assisted Development
According to Willison's analysis published on his weblog, the core issue isn't just about writing code that passes tests, but about creating artifacts that clearly demonstrate progress and functionality. "The more code we churn out with agents," he writes, "the more valuable tools are that reduce the amount of manual QA time we need to spend."
This challenge becomes particularly acute in environments where human code review is minimized or eliminated. Willison references the StrongDM software factory model, which maintains a policy that "code must not be reviewed by humans" and instead relies on "expensive swarms of QA agents" to exercise software through scenarios. While fascinating, this approach represents a significant investment that many developers and organizations cannot afford.
"I need tools that allow agents to clearly demonstrate their work to me," Willison explains, "while minimizing the opportunities for them to cheat about what they've done."
Showboat: Automated Documentation for AI-Generated Code
Showboat is a command-line tool written in Go that helps coding agents construct Markdown documents demonstrating exactly what their newly developed software can do. The tool isn't designed primarily for human use, but rather as a framework that AI agents can employ to create comprehensive demonstrations.
The workflow is straightforward: agents use commands like showboat init to start a document, showboat note to add explanatory text, showboat exec to run commands and capture their output, and showboat image to include screenshots. The result is a living document that shows both the commands executed and their actual outputs.
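To make the workflow concrete, the pattern behind those commands can be sketched as a small append-only Markdown builder. The following Python snippet is a hypothetical re-implementation of the idea, not Showboat's actual code (Showboat itself is written in Go); the function names simply mirror the subcommands described above.

```python
import subprocess

def init(path, title):
    # Start a new demo document with a title heading (like "showboat init").
    with open(path, "w") as f:
        f.write(f"# {title}\n\n")

def note(path, text):
    # Append explanatory prose (like "showboat note").
    with open(path, "a") as f:
        f.write(text + "\n\n")

def exec_cmd(path, command):
    # Run a shell command and append both the command and its captured
    # output (like "showboat exec"), so readers see what actually happened.
    result = subprocess.run(command, shell=True,
                            capture_output=True, text=True)
    with open(path, "a") as f:
        f.write(f"```\n$ {command}\n{result.stdout}```\n\n")

init("demo.md", "Feature demo")
note("demo.md", "The new endpoint returns a greeting.")
exec_cmd("demo.md", "echo hello world")
```

Because every step appends to the same file, an agent working through a demo produces a document that grows in real time, which is what makes the "watch it update as the agent works" experience described below possible.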
According to Willison's documentation, one of the key features is that Showboat provides comprehensive help text specifically designed for AI consumption. "The --help text acts a bit like a Skill," he notes, referring to the concept of specialized instructions for AI systems. "Your agent can read the help text and use every feature of Showboat to create a document that demonstrates whatever it is you need demonstrated."
Willison describes an interesting side benefit: developers can watch the demonstration document update in real-time as the agent works through the demo, creating an experience similar to "having your coworker talk you through their latest work in a screensharing session."
Rodney: Browser Automation for Web Interfaces
Many modern software projects involve web interfaces, and Rodney addresses the specific challenge of demonstrating these. Built as a CLI wrapper around Rod, a Go library for driving Chrome via the DevTools Protocol, Rodney allows coding agents to automate browser sessions and capture their results.
The tool enables agents to start Chrome, navigate to URLs, execute JavaScript, click elements, and capture screenshots—all through simple command-line instructions. Like Showboat, Rodney includes comprehensive help text designed specifically for AI consumption.
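The CLI-wrapper design behind Rodney can be illustrated with a minimal sketch. The snippet below is hypothetical: it does not use Rodney's real subcommand names or the Rod library. It only shows the general pattern of mapping agent-issued subcommands onto browser actions, with the browser stubbed out by a recording class so the dispatch logic is visible without Chrome.

```python
class StubBrowser:
    """Stand-in for a real browser driver (Rodney wraps Rod/Chrome);
    records actions so the dispatch pattern can be shown without Chrome."""
    def __init__(self):
        self.log = []

    def navigate(self, url):
        self.log.append(("navigate", url))

    def eval_js(self, expr):
        self.log.append(("eval", expr))

    def screenshot(self, path):
        self.log.append(("screenshot", path))

def dispatch(browser, argv):
    # Map a subcommand, as an agent would type it on the command line,
    # onto the corresponding browser action. Subcommand names here are
    # invented for illustration only.
    cmd, *args = argv
    handlers = {
        "open": browser.navigate,
        "js": browser.eval_js,
        "shot": browser.screenshot,
    }
    handlers[cmd](*args)

browser = StubBrowser()
dispatch(browser, ["open", "http://localhost:8000"])
dispatch(browser, ["shot", "home.png"])
```

The value of this shape for AI agents is that each action is a single, stateless-looking command invocation whose effect can be observed and logged, which is exactly what a demonstration document needs to capture.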
Willison provides examples of Rodney being used for accessibility testing, where agents can run automated audits of web pages and document the results. This demonstrates how the tool can be used for more than just simple demonstrations, extending into quality assurance workflows.
The Limitations of Test-Driven Development with AI
Willison, who describes himself as a former skeptic of test-first development, has come to appreciate test-driven development (TDD) as a methodology for constraining AI coding agents. "Telling the agents how to run the tests doubles as an indicator that tests on this project exist and matter," he observes.
However, he emphasizes that passing automated tests doesn't guarantee that software actually works as intended. "Anyone who's worked with tests will know that just because the automated tests pass doesn't mean the software actually works!" he writes. "That's the motivation behind Showboat and Rodney—I never trust any feature until I've seen it running with my own eyes."
A Development Trend: AI Tools Built by AI
In a notable reflection of current development trends, Willison reveals that both Showboat and Rodney were largely built using AI coding agents via the Claude iPhone app. "I'm still a little startled at how much of my coding work I get done on my phone now," he admits, estimating that "the majority of code I ship to GitHub these days was written for me by coding agents driven via that iPhone app."
This development approach aligns with broader industry trends toward AI-assisted programming. According to Microsoft's recent announcement about Project Opal, there's growing recognition that task-based AI assistance represents a significant shift in how work gets done across various domains, including software development.
Implications for Software Development Workflows
The release of Showboat and Rodney comes at a time when organizations are increasingly integrating AI coding assistants into their development pipelines. These tools address a practical concern that has emerged as AI-generated code becomes more prevalent: the need for transparent verification processes.
By providing structured ways for AI agents to demonstrate their work, these tools could help bridge the trust gap between human developers and AI coding assistants. They represent a pragmatic approach to quality assurance in an era where AI is responsible for an increasing share of code production.
As Willison notes, the tools are designed specifically for asynchronous coding agent environments, reflecting how many developers now interact with AI assistants—not as real-time pair programmers, but as collaborators who work on tasks independently and present their results.
Both Showboat and Rodney are available as open-source projects on GitHub, inviting other developers to experiment with and extend these approaches to AI-assisted software verification.


