Project 01 · Live at instantpotdoc.com
Built to solve a real problem: AI that gives the right answer the first time, every time, with no drift between sessions.
About the project
Why a recipe converter?
The choice was deliberate. Recipes are formulaic — clear inputs, defined steps, measurable outputs. There is no ambiguity about whether a conversion is correct. Either the liquid level is right or it is not. Either the timing works or it does not.
That made recipes the ideal test environment. The goal was not to build a cooking tool. It was to understand how a large language model behaves when given a structured, repeatable task — and what it takes to make the output reliable every time.
No corporate complexity. No competing priorities. No stakeholder interpretation. Just the technology, a well-defined problem, and a clear definition of what correct looks like.
That focus is what made the learning transferable.
Instant Pot Doc converts stovetop recipes for electric pressure cookers. You paste a recipe, it walks you through a strict sequence — cooker model, quart size, processing mode — and produces a fully adapted recipe with timing, liquid adjustments, and scaled quantities.
The fill limit logic is what separates it from a simple conversion. It calculates whether each scale option is safe for your specific pot size, flagging anything that would exceed two-thirds capacity for standard recipes or half capacity for foamy foods. Every output follows the same format, every time.
It runs as a single HTML file in any browser. Users supply their own Anthropic API key — there is no subscription, no account, and no cost beyond what the API itself charges.
Practitioner AI
Each post documents a phase of the build — published weekly as part of the six-part Practitioner AI series.
The Origin Story
How a practical experiment with recipe conversion became a lesson in AI consistency.
The Victory Lap That Wasn't
Prompt drift, ChatGPT diagnosing its own ceiling, and the pivot.
The Pivot
I brought everything I had built to a different AI. What happened next surprised me.
I Asked AI to QA Itself
The tool was live. But had I actually solved the consistency problem?
The Governance Question
AI needs the same discipline any well-run project needs.
The Wrap
Three layers, one recipe converter, and why AI is not an unattended panacea.
Behind the Build
Thirteen narrative stories about specific moments in the build — written in plain language for anyone who wants to go deeper than the surface story.
Instant Pot Doc — reference architecture
01
From Conversation to System
The moment the approach changed
My first attempts at using AI to convert recipes were just conversations. I described what I wanted and got back something useful — sometimes. But every session started from scratch. No memory of how I liked things done. No fixed sequence. No consistency.
That is when I made my first real shift. I stopped having conversations and started building a GPT — a customized version of ChatGPT with its own instructions, its own flow, its own rules. That shift changed how I thought about the problem. For the first time I was not asking questions. I was defining a system. I published the result to the ChatGPT GPT Store under the name Instant Pot Doc. For someone who had never built anything technical, that felt like a real milestone.
Then I actually started using it. Regularly. In my own kitchen. And something was wrong. The UI kept changing. Session to session the experience was different. I spent hours refining the prompt, adding guardrails, tightening every instruction. It got better. It never got reliable. At some point I recognized the feeling from every technology project I had ever run — diminishing returns. More effort going in, less improvement coming out.
So I asked ChatGPT directly: am I using the right tool for this? The answer was honest. Even with strict prompting it told me: I am still probabilistic, context-sensitive, and slightly variable by nature. You can reduce inconsistency. You cannot guarantee it. Then: you have outgrown prompt-only control. You are trying to build a tool, not just a chatbot.
I took the entire GPT prompt and started a new conversation with Claude. Within that first conversation Claude analyzed the prompt, identified exactly where the instructions were ambiguous, and rewrote the entire system. The drift problem disappeared.
Knowing when to switch tools is a leadership skill, not a technical one. And the work you did on the wrong tool is not wasted — it is the specification for what you build next.
01b
The State Machine Solution
How Claude rewrote everything — and why it worked
When I brought the GPT prompt to Claude and explained the drift problem, I expected suggestions. What I got was a diagnosis.
Claude read through the entire prompt and identified the problem immediately. The instructions were written as a list of rules the model was expected to follow. But a language model does not follow rules the way a computer program does. It interprets them. And interpretation, under enough conversational pressure, drifts.
The solution Claude proposed was a state machine. In plain English: instead of giving the model a list of things to remember, you give it a sequence of locked steps. Step one must complete before step two can begin. Step two must complete before step three. No skipping. No reordering. No interpretation of what comes next — because the structure itself determines what comes next.
I recognized the concept immediately. Every enterprise system I had ever implemented worked the same way. An order cannot be shipped before it is approved. A lab result cannot be released before it is verified. The sequence is the control.
Claude rewrote the entire prompt section by section, explaining each change as it went. Where the original said 'always ask about substitutions,' the new version said 'display the full ingredient list, then present exactly these three options, wait for selection, do not proceed until one is chosen.' Every step explicit. Every transition defined. Every recovery path specified. The difference in behavior was immediate.
The problem was never the AI. It was the instruction design. Vague instructions produce variable results. Precise sequences produce consistent ones. That principle applies to every system — human or artificial — that has ever been built.
01c
From File to Live Website in an Afternoon
How Claude built the app, recommended the tools, and walked every step of deployment
ChatGPT had told me I needed a front end. That sounded like a developer project. I do not write code. I have never built a web application. I had no idea where to start.
I described what I wanted to Claude in plain English. A shareable tool that anyone could open in a browser without a ChatGPT account. A clean landing page with my logo and my name. The full recipe conversion flow underneath. Within the same conversation — maybe forty minutes — Claude had produced a single HTML file. Branded landing page. Logo embedded. Green and cream color palette. The complete state machine conversation flow. One file. Fully working.
How it connects to Claude on the back end is simpler than it sounds. The app does not store anything. Every time someone uses it, their message goes directly to Anthropic's servers, Claude processes it against the system prompt, and the answer comes back. No middleman. No database. No backend server. The intelligence lives in the prompt. The file just provides the interface.
Then came deployment. Claude recommended two specific platforms — not categories, specific tools with specific reasons. Namecheap for domain registration: straightforward, affordable, with a promo code that brought the domain to $6.79 for the first year. Netlify for hosting: drag-and-drop deployment — literally drag the HTML file onto a webpage and the site goes live. Free tier. No configuration required.
Every step came with exact instructions. When the screen I saw did not match the instructions — which happened several times — I took a screenshot and shared it directly in the conversation. Every time, within seconds, Claude identified exactly where I was and what to do next.
The full sequence:
From the first message describing what I wanted to a branded tool live at a real domain: one conversation, one afternoon, $6.79.
The barrier is not technical. It is knowing what to ask for, being specific about what you want, and sharing a screenshot when reality does not match the instructions. Those are communication skills. Every executive already has them.
02
The Ingredient List Moment
One small change that made everything work
This one happened before the switch to Claude — back when I was still building in ChatGPT. I mention it here because it is the moment I understood something that has shaped every decision since.
I had added an ingredient substitution feature to the GPT. It seemed genuinely useful — before converting a recipe, suggest swaps for common ingredients. But every time I used it I had to stop, leave the chat, go find the original recipe, and remind myself what was actually in it. The feature that was supposed to make things easier was creating extra friction.
I described the problem. The suggestion came back immediately: show the ingredient list first. Before asking about substitutions, before doing anything else, display exactly what is in the recipe. Let the user see it. Then ask. No new model. No new capability. One additional step in the sequence. Five minutes to implement. The entire flow suddenly worked the way it was supposed to.
That moment carried straight through into the Claude rebuild. The ingredient check step stayed — not because it was technically impressive, but because it was right. The learning from the ChatGPT phase became the foundation of the Claude version.
UX matters more than raw output. A brilliant result delivered in the wrong order is still a bad tool. You do not need a UX designer to figure this out. You need to use your own tool honestly and describe what is frustrating you.
03
The QA Conversation
Asking AI to test itself
After the app was live I wanted to know if I had actually solved the drift problem — or just moved it from one platform to another. I asked Claude directly: can you test this for consistency?
Claude was immediately honest about its limitations. It could not run automated scheduled tests or persist between sessions. But within the same message it proposed a workaround — run the same recipe through the system prompt five times with identical inputs in a single session, compare every output element, and produce a drift report.
The report came back as a clean comparison table. Every critical element — timing, release method, output format order, fill level confirmation, manufacturer directions — was identical across all five runs. Minor variation appeared only in stylistic phrasing. Nothing that affected safety or accuracy. The state machine was working.
AI is not just a building tool. It is a testing tool. You do not need a QA team. You need a clear definition of what correct looks like — and the willingness to ask for a check.
04
The Fill Limit Problem
What the original tool had that the new one was missing
Weeks into using the rebuilt app I noticed something missing. The original ChatGPT GPT had asked for the size of my Instant Pot in quarts and used that to warn me when a scaled recipe would overfill the pot. The Claude version did not have this yet.
I described what the original had done. Claude's response was immediate and specific — it did not just add a quart size question, it designed a complete fill safety system. The quart size question became a mandatory step in the startup sequence. Fill limits were defined precisely: two thirds capacity for standard recipes, half capacity for foamy foods like beans and grains. The scaling step was redesigned to calculate estimated volume at every scale factor and label each option SAFE, OVER LIMIT, or UNDER MINIMUM before the user chose. The entire change happened in one conversation.
Describe the problem precisely and AI solves it precisely. Vague requests get vague results. The more specifically you can articulate what is missing, the more complete the solution you get back.
05
The Screenshot Method
The technique that made everything else possible
Every step-by-step guide assumes the screen you see matches the screen being described. It rarely does. Interfaces change. Options move. Buttons get renamed.
I discovered early in this process that the fastest way to resolve that gap was a screenshot — shared directly into the conversation. Not a typed description. Not an explanation of where I was stuck. A screenshot. During the domain and hosting setup alone I shared more than a dozen screenshots. Each one got an immediate, specific response: you are on the wrong tab, click Advanced DNS. That button is hidden under the dropdown on the right. That error means your API key needs credits — here is exactly where to add them.
The screenshot became my standard method for any technical process throughout this entire project. It works because AI can see what you see and respond to what is actually on your screen rather than what it assumes is there.
When instructions and reality diverge, do not type a description — take a screenshot. It is the single most effective technique a non-technical person can use when working with AI on any process that involves navigating an interface.
06
The Domain Decision
Six dollars and seventy-nine cents for a credential
Once the app was working I had a choice. Share it with a free Netlify URL — something forgettable and random — or buy a real domain. I asked Claude directly: does the domain actually matter?
The answer was unambiguous. For a personal project used privately, no. For something being referenced in a professional LinkedIn series aimed at senior executives, yes. A random Netlify URL signals prototype. A real domain signals intentional. Claude recommended Namecheap, provided the exact search to run, and flagged a promo code that brought the price to $6.79 for the first year. I checked availability, added it to cart, and completed the purchase in under five minutes.
Sometimes the six-dollar decision is the most important one. Commitment changes how you treat a project — and how others perceive it.
07
What ChatGPT Told Me
The most useful conversation I had with the tool I was leaving
After weeks of trying to fix the UI drift — tightening the prompt, adding guardrails, testing and adjusting — I recognized a feeling I had experienced many times in my career. Diminishing returns. More effort, less improvement.
So I asked ChatGPT directly: is GPT the right tool for this? The answer was honest and specific. Even with strict prompting the model told me it was still probabilistic, context-sensitive, and slightly variable by nature. You can reduce inconsistency. You cannot guarantee it. Then: you have outgrown prompt-only control. You are trying to build a tool, not just a chatbot. You need a front end.
That answer led directly to everything that came next — the switch to Claude, the web app, the domain, the LinkedIn series. A tool that honestly diagnosed its own ceiling was more valuable in that moment than one that kept trying to be something it was not.
Ask direct questions. AI systems will often give you honest answers about their own limitations if you ask plainly. That honesty is a feature, not a flaw.
08
The First LinkedIn Post
1,299 impressions, 780 members reached, and a first post ever
I had been on LinkedIn for years. I had never posted original content. Not once.
Claude drafted the full post from the bullet outline we built together, then produced a pull quote graphic — a clean image with a green bar, italic serif type, and the sharpest line from the whole story: Chat is good for answers. It is not good for consistency. The graphic was built as an HTML file, opened in a browser, and screenshotted for LinkedIn — because Claude flagged that LinkedIn does not accept SVG files and solved the format problem in the same conversation.
Even the posting process required real-time help. LinkedIn collapses blank lines on paste. Bullet points from external sources disappear. Every spacing decision has to be made manually inside the editor. I shared screenshots of the draft at each stage and got immediate formatting corrections.
By the end of the first week: 1,299 impressions, 780 members reached, 45 reactions, 5 comments, 4 saves. The audience was 26% senior level, 22% IT services and consulting, with CEOs in the mix. For a first post ever with no prior content history and no ad spend, the algorithm had picked it up and pushed it to exactly the right people.
AI can help you publish, not just build. From drafting to formatting to graphics to real-time troubleshooting — the same tool that built the app helped get the story of building it in front of the right audience.
09
One Chat Spawns Another
How to manage sprawl and keep AI conversations focused
A single AI conversation has limits — not just technical ones, but practical ones. As a chat grows longer it carries more context, more history, more accumulated decisions. At some point the history is longer than the task.
The Instant Pot Doc conversation eventually held everything — the app build, the deployment, the LinkedIn series, the website concept, the briefing documents, the snippets. When it came time to build davidsevans.com, starting a fresh chat with a clean specific mandate was the right move.
Claude was asked to extract the highlights from the Instant Pot Doc conversation, identify what a website would need, and write a structured briefing document. That document became the instruction set for a completely separate session — which built this site without needing to know everything that came before. Claude wrote the requirements. A second Claude built the site.
One chat to think. One chat to build. One chat to publish. The sprawl stays in the original conversation. The new chat gets only what it needs.
Summarize, brief, and spawn. When a conversation gets heavy, take its output and hand it to a new one with a clean specific mandate. That is a workflow pattern any executive can use today.
10
Why I Didn't Use Skills
And why the state machine was the right call
Someone suggested I should have used skills to build Instant Pot Doc. Skills are modular AI capabilities — discrete functions that each do one specific thing, chained together into a workflow. A recipe parsing skill. A scaling skill. A substitution skill. You connect them in sequence.
In theory it would work. In practice it would have required either a developer or a low-code platform with a significant learning curve. It would have added infrastructure without adding capability.
What I built instead — a single system prompt acting as a state machine — achieves the same result with far less complexity. The prompt is the skill chain. Each step in the state machine is effectively a discrete function running in sequence, enforced by the structure of the instructions rather than by code.
The skills approach makes sense when you need enterprise scale, team collaboration, or reusability across multiple applications. For a single-user tool built in one conversation, it would have been the right architecture for the wrong problem.
The most sophisticated solution is not always the most appropriate one. Match the architecture to the scale of the problem.
11
When the App Got It Wrong
The safety error that exposed a gap in the system
The QA test proved the state machine was consistent. What it did not prove was that the logic inside each step was correct. That gap surfaced in a real conversion session with a Korean short ribs recipe.
The recipe had only a quarter cup of soy sauce as its liquid. The Instant Pot requires at least one cup of liquid to build pressure safely. The app produced a complete, confident conversion without flagging the deficit. It scaled the recipe, annotated the scaling options as SAFE, and delivered a finished output — all without catching a fundamental safety error.
The error was caught by the user — not the system. When challenged, the app acknowledged the mistake immediately and corrected it. But it should never have reached the output stage with insufficient liquid in the first place.
The root cause was that the liquid check was calculating fill volume — whether the recipe would overflow the pot — but not separately auditing whether the total pourable liquid met the minimum threshold. Grated apple was being counted toward liquid. Solid ingredients were inflating the estimate. A SAFE label was being applied based on incomplete logic.
The fix required rewriting the liquid audit rules with explicit definitions: count only pourable liquids — water, broth, soy sauce, wine, vinegar, juice. Do not count grated fruit, vegetables, or anything not pourable regardless of moisture content. Apply the audit proactively at the ingredient check, not reactively at scaling. A SAFE label now requires passing both the fill limit check and the minimum liquid check simultaneously.
A consistent system is not the same as a correct system. QA proves reliability. It does not prove the logic is right. Both checks are necessary — and the user is often the last line of defense.
12
The Series Was Written With AI
A practitioner using the tool to document the practice
The six LinkedIn posts in the Practitioner AI series were not written independently. They were developed with AI reviewing the conversation that built Instant Pot Doc and helping identify what the story arc should be, what the key moments were, and how to structure each post.
Claude was asked to extract the highlights from the build conversation, identify the narrative arc, suggest a six-post structure, draft outlines for each post, and then draft the posts themselves. The voice, the decisions about what to include and what to leave out, the revisions — those came from David. But the scaffolding was AI-assisted throughout.
This is what prompt engineering looks like in practice. Not generating content and publishing it unchanged. Using AI as a thinking partner to structure, draft, and refine — then applying human judgment to everything it produces. The content did not write itself. Someone still had to decide what mattered, what to say, and how to say it.
AI is not an unattended panacea. The most useful application of it is as a structured collaborator — one that does the scaffolding work so the human can focus on the judgment work.