
Part 5: Case Studies and Templates

21. Case Study: AI-Powered Mockup-to-Code Tool

“A mockup isn’t just a picture.
It’s a latent interface waiting to be decoded.”


Overview

The AI-Powered Mockup-to-Code Tool is a frontend-first AI application that transforms UI mockups (Figma exports, PNGs, screenshots) into structured HTML/CSS or React components.

The pipeline involves the following steps (sketched as code just after the list):

  • Uploading a mockup image
  • Performing layout analysis using vision models
  • Extracting UI elements and classifying them (buttons, input fields, headers, etc.)
  • Generating code (HTML/CSS or React + Tailwind) via GPT
  • Displaying a live preview and downloadable scaffold
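
Wired together, these steps reduce to one thin endpoint. A minimal FastAPI sketch; the service modules mirror the final backend structure shown later in this chapter, but the function names (detect_layout, build_prompt, generate_code) are illustrative assumptions, not the project's actual API:

```python
# app/api/codegen.py (illustrative sketch, not the project's actual code)
from fastapi import FastAPI, UploadFile

# Service modules from the final structure shown later in this chapter;
# the function names are assumptions for illustration.
from app.services.layout_detector import detect_layout
from app.services.gpt_prompt_builder import build_prompt
from app.services.html_generator import generate_code

app = FastAPI()

@app.post("/codegen")
async def mockup_to_code(file: UploadFile, framework: str = "react-tailwind"):
    image_bytes = await file.read()            # 1. upload
    regions = detect_layout(image_bytes)       # 2-3. CV detection + labels
    prompt = build_prompt(regions, framework)  # 4. prompt assembly
    code = generate_code(prompt)               # 4. GPT call + post-processing
    return {"regions": regions, "code": code}  # 5. preview/download payload
```

Keeping each step behind its own service function is what makes the vision pipeline and the code generation independently testable and swappable.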

This project demonstrates:

✅ Cross-domain AI integration (CV + LLM)
✅ Highly interactive frontend logic
✅ Modular and layered backend orchestration
✅ Clear separation of vision pipeline and code generation


Initial Modular Structure

The project started with a modular, feature-based structure to support experimentation, especially to isolate layout detection from prompt engineering.

Frontend (React + Vite)

src/
└── features/
    ├── upload/
    ├── preview/
    └── codegen/

Backend (FastAPI)

app/
└── features/
    ├── layout/
    ├── codegen/
    └── upload/

Final Scalable Hybrid Structure

As accuracy requirements and user demand grew, shared services emerged across modules, prompting a shift to a hybrid structure.

Backend:

app/
├── api/
│   ├── upload.py
│   ├── layout.py
│   └── codegen.py
├── services/
│   ├── image_processing.py
│   ├── layout_detector.py
│   ├── gpt_prompt_builder.py
│   └── html_generator.py
├── schemas/
│   ├── layout.py
│   └── codegen.py
└── core/
    └── config.py
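
Under this split, route modules in api/ stay thin and delegate to services/, while core/config.py centralizes environment-driven settings they all share. A minimal sketch of that config module using pydantic-settings; the field names are assumptions for illustration:

```python
# app/core/config.py (illustrative sketch; field names are assumptions)
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    """Environment-driven settings shared across api/ and services/."""
    openai_api_key: str = ""
    replicate_api_token: str = ""
    detector_conf_threshold: float = 0.5  # min confidence for layout boxes

settings = Settings()  # values are read from environment variables
```
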

Frontend:

src/
├── components/
│   ├── MockupCanvas.tsx
│   ├── LayoutBoundingBox.tsx
│   └── CodePreview.tsx
├── services/
│   └── codegenClient.ts
├── hooks/
│   └── useCodeGeneration.ts
└── features/
    ├── upload/
    ├── detect/
    └── render/

Core AI Flow

[Image Upload]
    ↓
CV Layout Detection (YOLO or Detectron2)
    ↓
Component Classification (headers, buttons, divs)
    ↓
Prompt Assembly → GPT-4
    ↓
Code Output (React + Tailwind / HTML + CSS)
    ↓
[Live Preview + Download]
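
The prompt-assembly step in the middle of this flow is where detected regions become LLM input. A minimal sketch, assuming regions arrive as labeled bounding boxes in pixel coordinates (the Region shape and the prompt template are illustrative, not the project's exact ones):

```python
# app/services/gpt_prompt_builder.py (illustrative sketch)
from dataclasses import dataclass

@dataclass
class Region:
    label: str                      # e.g. "button", "input", "header"
    box: tuple[int, int, int, int]  # x1, y1, x2, y2 in pixels

def build_prompt(regions: list[Region], framework: str) -> str:
    # Sort top-to-bottom, then left-to-right, so the LLM sees document order.
    ordered = sorted(regions, key=lambda r: (r.box[1], r.box[0]))
    lines = [
        f"- {r.label} at x={r.box[0]}, y={r.box[1]}, "
        f"w={r.box[2] - r.box[0]}, h={r.box[3] - r.box[1]}"
        for r in ordered
    ]
    return (
        f"Generate {framework} code for a UI with these components, "
        "preserving relative layout:\n" + "\n".join(lines)
    )
```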

CV + GPT Model Integration

| Step | Tech |
| --- | --- |
| Object detection | YOLOv8 or Detectron2 (trained on UI screenshots) |
| Component label classification | Custom CV classifier |
| Prompt assembly + completion | GPT-4 via OpenAI API |
| Code formatting | Prettier (client-side) |
| Model inference | Hosted on Replicate (for YOLO) or local PyTorch |
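
For local Torch inference, the detection step might look like the sketch below, using the ultralytics YOLOv8 API; "ui-detector.pt" is a placeholder for a UI-trained checkpoint, and the output dict shape is an assumption:

```python
# Minimal YOLOv8 inference sketch (assumes a UI-trained checkpoint).
from ultralytics import YOLO

model = YOLO("ui-detector.pt")  # placeholder path to custom weights

def detect_layout(image_path: str, min_conf: float = 0.5) -> list[dict]:
    results = model(image_path)[0]  # one image -> first Results object
    regions = []
    for box in results.boxes:
        conf = float(box.conf)
        if conf < min_conf:
            continue  # drop low-confidence detections (see fixes below)
        x1, y1, x2, y2 = (float(v) for v in box.xyxy[0])
        regions.append({
            "label": results.names[int(box.cls)],
            "box": (x1, y1, x2, y2),
            "conf": conf,
        })
    return regions
```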

Testing Approach

| Layer | Strategy |
| --- | --- |
| Layout detector | Snapshot-based CV tests (bounding boxes) |
| GPT prompt | Deterministic unit tests with static inputs |
| Frontend | Visual regression tests (Storybook + Playwright) |
| Upload/codegen endpoints | End-to-end integration with sample assets |
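
The "deterministic unit tests with static inputs" row is worth making concrete: because prompt assembly is pure string-building, it can be tested without calling the model. A pytest sketch against the build_prompt example above (the import path is assumed):

```python
# tests/test_prompt_builder.py (illustrative, pairs with the sketch above)
from app.services.gpt_prompt_builder import Region, build_prompt

def test_prompt_orders_components_top_to_bottom():
    regions = [
        Region(label="button", box=(10, 300, 110, 340)),
        Region(label="header", box=(0, 0, 400, 60)),
    ]
    prompt = build_prompt(regions, framework="react-tailwind")
    # The header comes first because it sits higher on the canvas.
    assert prompt.index("header") < prompt.index("button")

def test_prompt_mentions_requested_framework():
    prompt = build_prompt([], framework="html-css")
    assert "html-css" in prompt
```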

Key Problems & Fixes

| Problem | Resolution |
| --- | --- |
| LLM-generated code sometimes invalid | Added a static HTML validator + auto-fix layer |
| Misclassified layout regions | Introduced confidence thresholding + post-processing rules |
| Slow image inference on render | Added async loading + placeholder thumbnails |
| GPT prompt bloated for large mockups | Chunked component zones and stitched the final code |
| Loss of user layout intent | Let users manually adjust box types before codegen |
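
The confidence-thresholding fix can be a small, pure post-processing pass over the detections. A sketch with example threshold values and one illustrative rule (per-label duplicate suppression); the real rule set is project-specific:

```python
# Illustrative post-processing: confidence thresholding plus a simple
# per-label non-maximum suppression. Thresholds are example values.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union

def postprocess(regions, min_conf=0.6, dup_iou=0.8):
    # 1. Confidence thresholding: drop uncertain detections outright.
    kept = sorted((r for r in regions if r["conf"] >= min_conf),
                  key=lambda r: r["conf"], reverse=True)
    result = []
    for r in kept:
        # 2. Rule: suppress near-duplicate boxes carrying the same label,
        #    keeping the higher-confidence detection (seen first).
        dup = any(o["label"] == r["label"] and iou(o["box"], r["box"]) >= dup_iou
                  for o in result)
        if not dup:
            result.append(r)
    return result
```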

Frontend Interaction Flow

  1. User uploads a mockup image
  2. Bounding boxes appear with labels (adjustable)
  3. User selects the preferred framework (HTML/CSS or React/Tailwind)
  4. “Generate Code” triggers the GPT prompt call (request shape sketched below)
  5. Generated code appears in the preview pane and the download button is enabled
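
Behind step 4 sits a small request/response contract between frontend and backend. A sketch of what the models in schemas/codegen.py might contain; the field names and the warnings field are assumptions:

```python
# app/schemas/codegen.py (illustrative sketch of the codegen contract)
from typing import Literal
from pydantic import BaseModel

class RegionIn(BaseModel):
    label: str                              # possibly user-corrected label
    box: tuple[float, float, float, float]  # x1, y1, x2, y2

class CodegenRequest(BaseModel):
    regions: list[RegionIn]
    framework: Literal["html-css", "react-tailwind"] = "react-tailwind"

class CodegenResponse(BaseModel):
    code: str
    warnings: list[str] = []  # e.g. notes from the HTML validator layer
```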

Tech Stack Summary

| Layer | Tool |
| --- | --- |
| CV inference | YOLOv8 (Replicate or local PyTorch) |
| GPT codegen | GPT-4 (OpenAI API) |
| Frontend | React + Tailwind CSS |
| State | Zustand (for image + code state) |
| Formatter | Prettier (optional: ESLint) |
| Deployment | Netlify + Railway |
| Previews | Monaco Editor + live HTML iframe |

Outcomes

  • Converted mockups to usable React code in <15 seconds
  • Allowed manual overrides on labels before GPT input
  • Reduced layout hallucination via hybrid prompt template
  • Supported 2 output modes: HTML/CSS and React + Tailwind
  • Used by junior frontend devs as a scaffolding tool

What This Case Demonstrates

✅ Vision pipeline and GPT codegen must be decoupled
✅ Hybrid prompting is more stable than pure freeform GPT
✅ Component previewing and user correction are vital
✅ Modular frontend separation (upload / detect / render) improves UX
✅ Shared CV and GPT logic should live in services/, not features/