Co-creating Code with LLMs: A Practical Workflow
In the rapidly evolving field of AI engineering, we're constantly seeking ways to move beyond simple code completion and leverage Large Language Models (LLMs) for more complex, project-specific development tasks. The challenge isn't just about generating snippets of code; it's about integrating the LLM into our development lifecycle as a true collaborator.

For a recent project, we were tasked with building a system to help streamline the review of social security benefit applications. This process involves complex business rules, detailed evidence review, and strict data consistency requirements. Manually building the boilerplate for data models, APIs, and tests for each type of benefit and condition would be time-consuming and error-prone. This challenge was a perfect candidate for our LLM co-creation workflow, allowing us to rapidly generate robust and consistent code based on a set of well-defined patterns.
This post introduces a structured, iterative workflow that positions the engineer as an architect and the LLM as a highly skilled pair programmer. The core of this collaboration hinges on a simple but powerful mantra: "Apply Pattern X to our specific Context Y." By defining clear patterns and providing focused context, we can unlock a new level of productivity and shift our focus from writing boilerplate to designing elegant systems.
This workflow is a "happy path" that has worked well for us. We encourage you to experiment with it and adapt it to your own needs.
Our Development Workflow: A 5-Step Cycle
We've distilled our LLM-driven development process into a five-step cycle. This cycle is designed to take an idea from a rough concept to a fully implemented and tested feature, all in close collaboration with an LLM.
- Create a Blueprint: Translate project requirements into a formal, machine-readable schema.
- Define the Data Model: Apply a consistent architectural pattern to generate data models from the schema.
- Implement the Endpoint: Use the data models to create API endpoints.
- Implement the Test: Generate property-based tests to ensure the endpoint is correct and robust.
- Iterate and Expand: Re-apply the established patterns to build out the rest of the application.
Let's walk through each step.
Step 1: Create a Blueprint From The Key Ideas (The "Memory")
Every project starts with ideas, often scattered across meeting notes, design documents, or whiteboard sketches. The first step is to consolidate these unstructured thoughts into a formal database schema. This schema becomes the foundational "memory" document for all subsequent steps.
- Goal: Translate project requirements into a formal database schema.
- Pattern: We use a general LLM capability: generating an entity-relationship diagram (ERD) in Mermaid-markdown from unstructured text.
- Context: Our project-specific notes about the required entities and their relationships.
- Example Prompt: "Given these notes about benefit types, applications, and eligibility criteria, generate a Mermaid ERD for a database schema."
The outcome is a `database_schema.md` file that serves as a single source of truth for our data structures.
Example Blueprint: The Schema
The machine-readable version:
erDiagram
    BENEFIT_TYPES {
        UUID id PK
        STRING name
        TEXT description
    }
    APPLICATIONS {
        UUID id PK
        UUID benefit_type_id FK
        ENUM status
        TIMESTAMP submitted_at
    }
    ELIGIBILITY_CRITERIA {
        UUID id PK
        UUID benefit_type_id FK
        STRING criterion
    }
    APPLICATION_DATA {
        UUID id PK
        UUID application_id FK
        STRING data_type
        STRING value
    }
    BENEFIT_TYPES ||--o{ APPLICATIONS : "are for"
    BENEFIT_TYPES ||--o{ ELIGIBILITY_CRITERIA : "have"
    APPLICATIONS ||--o{ APPLICATION_DATA : "contain"
The human-readable version is this same schema rendered as a Mermaid ERD diagram.
Step 2: Define The Data Model Using The Command-Query Model Pattern
With a blueprint in hand, the next step is to implement the database models. To ensure consistency and quality, we use a predefined architectural pattern for our SQLModel classes.
- Goal: Implement a single, high-quality database model.
- Pattern: The Command-Query Model Pattern (`Base`, `Create`, `Table`, `Update`), an architectural pattern we defined for SQLModel.
- Context: A table definition from our `database_schema.md`.
- Example Prompt: "Using the 'EligibilityCriteria' table from `@database_schema.md` and our documented Command-Query model pattern, generate the corresponding SQLModel classes in `@models.py`."
For this to work, the LLM needs access to both the database schema and a clear explanation of our model pattern. This pattern is heavily inspired by the official SQLModel tutorial on multiple models with FastAPI. This tutorial is a perfect document to provide as context for the LLM.
Example Command-Query Model
The LLM might initially produce code that is correct but verbose. For example, it might place fields like `id`, `created_at`, and `updated_at` directly into each model.
LLM's First Pass
# LLM's first pass is functional, but repetitive.
import uuid
from datetime import UTC, datetime

from sqlmodel import Field, SQLModel


class EligibilityCriterionBase(SQLModel):
    criterion: str
    benefit_type_id: uuid.UUID = Field(foreign_key="benefit_types.id")


class EligibilityCriterionCreate(EligibilityCriterionBase):
    pass


class EligibilityCriterion(EligibilityCriterionCreate, table=True):
    __tablename__: str = "eligibility_criteria"

    id: uuid.UUID = Field(default_factory=uuid.uuid4, primary_key=True, index=True, nullable=False)
    created_at: datetime | None = Field(default_factory=lambda: datetime.now(UTC), nullable=False)
    updated_at: datetime | None = Field(default_factory=lambda: datetime.now(UTC), nullable=False)


class EligibilityCriterionUpdate(SQLModel):
    """For a PATCH request, all fields should be optional."""

    criterion: str | None = None
    benefit_type_id: uuid.UUID | None = None
This works, but it violates the Don't Repeat Yourself (DRY) principle. If we need to change how IDs or timestamps are handled, we'd have to edit every model. A better approach is to extract these common fields into a `BaseRecord` model. We can instruct the LLM to do this, or do it ourselves.
Human-Refined Code
Here's an example of the pattern applied to the `EligibilityCriterion` model. It separates the base fields, the creation schema, the database table model, and the update schema.
class BaseRecord(SQLModel):
    id: uuid.UUID = Field(default_factory=uuid.uuid4, primary_key=True, index=True, nullable=False)
    created_at: datetime | None = Field(default_factory=lambda: datetime.now(UTC), nullable=False)
    updated_at: datetime | None = Field(default_factory=lambda: datetime.now(UTC), nullable=False)


class EligibilityCriterionBase(SQLModel):
    criterion: str
    benefit_type_id: uuid.UUID = Field(foreign_key="benefit_types.id")


class EligibilityCriterionCreate(EligibilityCriterionBase):
    pass


class EligibilityCriterion(EligibilityCriterionCreate, BaseRecord, table=True):
    __tablename__: str = "eligibility_criteria"


class EligibilityCriterionUpdate(SQLModel):
    """For a PATCH request, all fields should be optional."""

    criterion: str | None = None
Step 3: Building the Endpoints
Once we have our data models, we can create the API endpoints to interact with them. This step also follows a standard pattern.
- Goal: Create generic CRUD functions and a specific API endpoint for a model.
- Pattern: A standard FastAPI router with GET, POST, PATCH, and DELETE endpoints.
- Context: Our Command-Query `EligibilityCriterion` models from `models.py`.
- Example Prompt: "Implement standard RESTful CRUD operations in a FastAPI router for the 'EligibilityCriterion' table. Use the appropriate Schemas from `@models.py`."
The SQLModel tutorial on integrating with FastAPI provides a complete example of how to connect these models to CRUD endpoints. Of course, this requires some initial setup, like a database dependency for FastAPI. We recommend using an in-memory SQLite database during development for speed and simplicity.
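To keep the snippets below self-contained, here is a minimal sketch of what that setup could look like. The names `get_db` and `AsyncSessionDep` match the ones used later in this post; the `aiosqlite` driver and the `StaticPool` setting are our assumptions for sharing a single in-memory database, not requirements from the tutorial.

# db.py (illustrative): an async in-memory SQLite engine plus a session dependency.
from collections.abc import AsyncGenerator
from typing import Annotated

from fastapi import Depends
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import StaticPool
from sqlmodel.ext.asyncio.session import AsyncSession

# StaticPool keeps every session on the same connection, so the single
# in-memory database is shared across requests during development.
engine = create_async_engine("sqlite+aiosqlite:///:memory:", poolclass=StaticPool)


async def get_db() -> AsyncGenerator[AsyncSession, None]:
    async with AsyncSession(engine) as session:
        yield session


# Dependency alias used in the endpoint signatures below.
AsyncSessionDep = Annotated[AsyncSession, Depends(get_db)]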
Example CRUD-Router: From Specific to Generic
The LLM's initial output may define the API endpoints but leave their implementations missing. When asked to implement them, it might then generate specific, verbose logic for each function.
LLM's First Pass: Specific CRUD Logic
@criterion_router.post("/")
async def create_criterion(
    session: AsyncSessionDep, criterion: EligibilityCriterionCreate
) -> EligibilityCriterion:
    """Creates a new criterion. This is verbose and will be repeated for every model."""
    db_criterion = EligibilityCriterion.model_validate(criterion)
    session.add(db_criterion)
    await session.commit()
    await session.refresh(db_criterion)
    return db_criterion


# ... (and imagine similar verbose implementations for read, update, and delete)
This is highly repetitive. A much cleaner approach is to define generic functions that can handle CRUD operations for any SQLModel table. The human programmer can create those generics.
Human-Refined Code: Generic CRUD Functions
The programmer can write a set of generic functions to handle the core operations, and then use them in the specific endpoints.
from typing import TypeVar

from fastapi import HTTPException
from pydantic import BaseModel, ValidationError
from sqlmodel import SQLModel
from sqlmodel.ext.asyncio.session import AsyncSession

T = TypeVar("T", bound=SQLModel)


async def generic_create(schema: type[T], data: BaseModel, session: AsyncSession) -> T:
    new_data = data.model_dump(exclude_unset=True)
    try:
        insert = schema.model_validate(new_data)
    except ValidationError as e:
        raise HTTPException(status_code=422, detail=str(e)) from e
    session.add(insert)
    await session.commit()
    await session.refresh(insert)
    return insert


# ... imagine similar generics for Read, Update, Delete ...
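For illustration, here is one possible shape for those remaining generics. It is a sketch under the same assumptions as the snippet above (an async SQLModel session and HTTP-friendly error handling), not the project's exact implementation.

from uuid import UUID

from sqlmodel import select


async def generic_get_all(schema: type[T], session: AsyncSession, skip: int = 0, limit: int = 100) -> list[T]:
    result = await session.exec(select(schema).offset(skip).limit(limit))
    return list(result.all())


async def generic_get(schema: type[T], record_id: UUID, session: AsyncSession) -> T:
    record = await session.get(schema, record_id)
    if record is None:
        raise HTTPException(status_code=404, detail=f"{schema.__name__} not found")
    return record


async def generic_update(schema: type[T], record_id: UUID, data: BaseModel, session: AsyncSession) -> T:
    record = await generic_get(schema, record_id, session)
    # PATCH semantics: only apply the fields the client actually sent.
    for key, value in data.model_dump(exclude_unset=True).items():
        setattr(record, key, value)
    session.add(record)
    await session.commit()
    await session.refresh(record)
    return record


async def generic_delete(schema: type[T], record_id: UUID, session: AsyncSession) -> T:
    record = await generic_get(schema, record_id, session)
    await session.delete(record)
    await session.commit()
    return record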
With these generics in place, the API router becomes much simpler and easier to maintain.
Refactored Router
criterion_router = APIRouter(prefix="/criteria", tags=["criteria"])


@criterion_router.post("/")
async def create_criterion(
    session: AsyncSessionDep, criterion: EligibilityCriterionCreate
) -> EligibilityCriterion:
    return await generic_create(EligibilityCriterion, criterion, session)


@criterion_router.get("/")
async def read_criteria(
    session: AsyncSessionDep, skip: int = 0, limit: int = 100
) -> list[EligibilityCriterion]:
    return await generic_get_all(EligibilityCriterion, session, skip, limit)


@criterion_router.get("/{criterion_id}")
async def read_criterion(
    session: AsyncSessionDep, criterion_id: UUID
) -> EligibilityCriterion:
    return await generic_get(EligibilityCriterion, criterion_id, session)


@criterion_router.patch("/{criterion_id}")
async def update_criterion(
    session: AsyncSessionDep,
    criterion_id: UUID,
    criterion_update: EligibilityCriterionUpdate,
) -> EligibilityCriterion:
    return await generic_update(EligibilityCriterion, criterion_id, criterion_update, session)


@criterion_router.delete("/{criterion_id}")
async def delete_criterion(
    session: AsyncSessionDep, criterion_id: UUID
) -> EligibilityCriterion:
    return await generic_delete(EligibilityCriterion, criterion_id, session)
Step 4: Implement High-Fidelity, Property-Based Tests
A robust API needs robust tests. We use property-based testing to ensure our endpoints are reliable and correct under a wide range of inputs.
- Goal: Ensure the API is robust, reliable, and correct.
- Pattern: Property-based testing using `pytest` and `Hypothesis`.
- Context: The 'create criterion' endpoint, its `EligibilityCriterionCreate` schema, examples of using `httpx.AsyncClient`, and examples of `Hypothesis`.
- Example Prompt: "Write tests for the criterion router as defined in `@crud.py`. Use Hypothesis to generate test data based on the Schemas defined in `@models.py` and use the async `test_client` to call the API."
This step relies on having some testing infrastructure in place, like a `pytest` fixture that provides an `httpx.AsyncClient` as a test client.
Example Property-Based Testing
"""Generate Objects to Match the `Create` Model"""
@st.composite
def criterion_strategy(draw: st.DrawFn) -> dict[str, Any]:
"""Generate valid EligibilityCriterion data."""
return {
"criterion": draw(st.text(min_size=1, max_size=200)),
"benefit_type_id": uuid.uuid4(),
}
"""
In a production test suite, we'd replace uuid.uuid4()
with a pytest fixture that creates a BENEFIT_TYPES record
and provides its ID, ensuring our foreign key constraint
is always satisfied during testing.
"""
"""Supply the Object factory to the test to quickly test all properties"""
@given(new_criterion=criterion_strategy())
@pytest.mark.asyncio
async def test_create_criterion(
self, new_criterion: dict[str, Any], test_client: AsyncClient
) -> None:
"""Test creating an EligibilityCriterion."""
# In a real test, you'd ensure the benefit_type_id exists.
validated_input = models.EligibilityCriterionCreate.model_validate(new_criterion)
response = await test_client.post(
"/criteria/", content=validated_input.model_dump_json()
)
assert response.status_code == HTTPStatus.OK
created = response.json()
assert "id" in created
assert "created_at" in created
assert "updated_at" in created
assert created["criterion"] == new_criterion["criterion"]
assert created["benefit_type_id"] == str(new_criterion["benefit_type_id"])
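As flagged in the comment above, the clean way to satisfy the foreign-key constraint is a fixture that creates the parent record first. Here is a hypothetical sketch, using the `test_client` fixture from the next section and assuming a `/benefit-types/` router built with the same CRUD pattern (neither the route nor the payload appears earlier in this post):

import uuid

import pytest_asyncio
from httpx import AsyncClient


@pytest_asyncio.fixture
async def benefit_type_id(test_client: AsyncClient) -> uuid.UUID:
    """Create a BenefitType through the (assumed) API and return its id."""
    response = await test_client.post(
        "/benefit-types/",
        json={"name": "Disability", "description": "Example benefit type"},
    )
    return uuid.UUID(response.json()["id"])

The strategy would then draw this fixture's value instead of a random `uuid.uuid4()`, so every example Hypothesis generates points at a real row.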
Example Test Client Fixture
from collections.abc import AsyncGenerator

import pytest_asyncio
from httpx import ASGITransport, AsyncClient

from main import app, get_db

# ... other fixtures or test DB setup etc. ...


@pytest_asyncio.fixture
async def test_client() -> AsyncGenerator[AsyncClient, None]:
    # Swap the real database dependency for the test one (sketched below).
    app.dependency_overrides[get_db] = get_test_db
    transport = ASGITransport(app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        yield client
    app.dependency_overrides.clear()
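The `get_test_db` override itself isn't part of the snippet above. Reusing the in-memory SQLite idea from Step 3, a minimal sketch might look like this; the `test_engine` name and the on-demand `create_all` call are assumptions about setup that would normally live in `conftest.py`.

from collections.abc import AsyncGenerator

from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import StaticPool
from sqlmodel import SQLModel
from sqlmodel.ext.asyncio.session import AsyncSession

# A dedicated engine so tests never touch the development database.
test_engine = create_async_engine("sqlite+aiosqlite:///:memory:", poolclass=StaticPool)


async def get_test_db() -> AsyncGenerator[AsyncSession, None]:
    # Create the tables on demand; cheap for an in-memory database.
    async with test_engine.begin() as conn:
        await conn.run_sync(SQLModel.metadata.create_all)
    async with AsyncSession(test_engine) as session:
        yield session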
Step 5: LLM Goes Brrrr (Iterate and Expand)
Here's where the magic happens. Having established our patterns by building one high-quality implementation, we can scale incredibly efficiently: we now ask the LLM to apply those same patterns to the other tables in our schema.
Prompt 1: Create Models
- Pattern: Our Command-Query Model Pattern.
- Context: All other models in `@database_schema.md`, using `@models.py` as an example.
- Prompt: "Based on the examples in `@models.py`, implement all other models as defined in `@database_schema.md`."
Prompt 2: Create CRUD Logic
- Pattern: Our FastAPI router pattern.
- Context: All other models in `@models.py`, using `@crud.py` as an example.
- Prompt: "Based on the examples in `@crud.py`, implement CRUD logic for all models defined in `@models.py`."
Prompt 3: Create Tests
- Pattern: Our property-based testing strategy.
- Context: All new endpoints in `@crud.py`, using `@test_crud.py` as an example.
- Prompt: "Based on the examples in `@test_crud.py`, implement tests for all endpoints defined in `@crud.py`."
A few well-crafted prompts can generate a significant portion of our application's boilerplate, giving us more time to focus on more complex logic.
Guiding Principles for LLM Co-Creation
This workflow is supported by a few key principles that we've found essential for success.
Design Code, Don't Write It
Your role shifts from writing code to designing it. By establishing a high-quality example of a pattern ("First One, Then Many"), you provide the LLM with a template to follow. Your job becomes reviewing, refining, and guiding the LLM, rather than typing out boilerplate. You spend more time thinking about architecture and robust patterns.
Control Context and Use Sources
LLMs perform best with small, focused contexts. Start each new step with a "clean room" context window. Use external documents (like our `database_schema.md`) as a persistent "memory" that you can feed to the LLM. When implementing a library, feed the LLM its official documentation and tutorials. For instance, when asking it to generate SQLModel classes and FastAPI endpoints, we provide it with the official tutorial on multiple models. And don't be afraid to restart a chat if the LLM gets sidetracked.
Let the AI Clean Its Own Mess
Use static analysis and automated testing to your advantage. An LLM can generate code that looks right but fails under scrutiny.
- DLTLLMRI: Don't Let the LLM Repeat Itself. If you see the LLM writing repetitive code, stop and work with it to create a generic function or a factory instead.
- Instruct the LLM to verify its own work. A simple prompt addition like "Verify your work by running `make ci`" can work wonders.

Here's the `Makefile` we use to run our static analysis and test suite:
.PHONY: static_analysis
static_analysis:
	@uvx ruff format .
	@uvx ruff check . --fix
	@uvx complexipy --details low src/app tests/
	@uv run --all-groups --with pip-audit pip-audit -l
	@uv run --all-groups --with pyright pyright src/app tests/

.PHONY: test
test:
	@uv run --group testing pytest -n auto -m "not slow" --cov=src/app --cov-report=xml
	@uvx diff-cover coverage.xml --fail-under=80 --compare-branch=main

.PHONY: ci
ci: | static_analysis test
This Makefile lets you run all checks at once with `make ci`, but it also lets you run only the static analysis with `make static_analysis`, which is useful when the tests are not written yet. Note that we make use of `uv` and its tool runner `uvx`, but this can just as easily be done with equivalent tools.
Conclusion
This 5-step workflow for co-creating code with LLMs represents a significant shift in the development process. By acting as architects who define patterns and guide the LLM, we can automate the generation of high-quality boilerplate code. This allows us to dedicate our expertise to the most complex and unique aspects of our applications.
Adopting a structured approach like this one transforms the LLM from a simple code completion tool into a powerful and productive development partner, bridging the gap between AI capabilities and the demands of real-world software engineering.
“This Vincent guy really, really knows his shit!”
As stated by one happy customer