Part 5: Case Studies and Templates¶

19. Case Study: Smart Receipt / Invoice Analyzer¶

“OCR is messy. Accounting rules are worse.
But structure makes sense of both.”

Overview¶

The Smart Receipt/Invoice Analyzer is a full-stack AI application that:

Accepts images or PDFs of receipts and invoices
Performs OCR (Optical Character Recognition)
Extracts structured fields (merchant, total, date, tax)
Parses text into key-value data using GPT
Allows frontend edits + corrections
Stores parsed data in a database

This is a perfect case to demonstrate:

Modular architecture for isolated AI pipelines
Clean foldering for OCR, GPT, and DB services
How to evolve from modular → hybrid structure

Initial Modular Structure¶

We began with a feature-first modular approach.

Frontend (React + Vite)¶

frontend/
└── src/
    └── features/
        └── invoice/
            ├── components/
            │   ├── UploadForm.tsx
            │   └── ParsedPreview.tsx
            ├── hooks/
            │   └── useUpload.ts
            ├── services/
            │   └── invoiceApi.ts
            └── types/
                └── InvoiceFields.ts

Backend (FastAPI)¶

backend/
└── app/
    └── features/
        └── invoice/
            ├── routes.py
            ├── services/
            │   ├── ocr_service.py
            │   └── gpt_parser.py
            ├── schemas/
            │   └── invoice.py
            └── models.py

Each feature folder handled everything it needed:

Frontend: form, preview, upload logic
Backend: OCR, parsing, validation, and response shaping

Evolution to Hybrid Structure¶

Once we added:

Chatbot module that also needed OCR
Document classification logic for PDFs

We extracted shared logic into global services.

New Shared Backend Structure¶

backend/
├── app/
│   ├── features/
│   │   └── invoice/
│   ├── services/
│   │   ├── ocr_engine.py         # Shared OCR logic (Tesseract, PaddleOCR)
│   │   ├── gpt_client.py         # Central GPT communication layer
│   │   └── file_utils.py
│   └── schemas/
│       └── shared.py

Now:

Any feature can call ocr_engine.extract_text(file)
GPT prompts are templated and reused across modules

Testing Breakdown¶

Unit Tests¶

tests/
└── invoice/
    ├── test_parser.py
    ├── test_ocr.py
    └── test_routes.py

Mocks¶

shared/test_helpers/
├── fake_invoice_image.png
└── mock_gpt.py

Mocking GPT + OCR was essential to keep test speed fast and predictable.

Tech Stack Snapshot¶

Layer	Tool
OCR	Tesseract / PaddleOCR
LLM	OpenAI GPT-3.5 / GPT-4
Backend	FastAPI
Frontend	React + Vite
State	React Context
Deployment	Railway + Vercel
DB	Supabase (PostgreSQL)
Auth	JWT (Backend) + Supabase Auth (Frontend)

Key Lessons¶

Challenge	Resolution
GPT prompts reused in multiple modules	Moved prompt logic to `services/gpt_client.py`
OCR used across invoice/chatbot features	Created `ocr_engine.py` in shared service layer
Uploads had differing file handling logic	Created `file_utils.py` with unified image/PDF handlers
Type duplication between front and back	Introduced `packages/shared-schemas/` in monorepo
Components growing too large	Split into `UploadForm`, `FieldPreview`, `ErrorBanner`
Test bloat in one location	Adopted hybrid test structure (feature + root tests)

Final Hybrid Folder Snapshot¶

Backend:¶

app/
├── features/
│   └── invoice/
│       ├── routes.py
│       ├── services/
│       └── schemas/
├── services/
│   ├── gpt_client.py
│   ├── ocr_engine.py
│   └── file_utils.py
├── schemas/
│   └── shared.py

Frontend:¶

src/
├── features/
│   └── invoice/
│       ├── components/
│       ├── services/
│       └── hooks/
├── shared/
│   └── utils/
│       └── filePreview.ts

Project Outcomes¶

100+ invoices processed in a demo database
Accurate OCR + GPT parsing of total/tax/date in 90% of test cases
Scalable architecture allowed new features (chatbot, classifier) to be added without breaking invoice logic
Deployed to Vercel + Railway with monorepo CI/CD

What This Case Proves¶

This case demonstrates:

✅ How to bootstrap modularly with minimal complexity
✅ When and how to extract shared AI logic (OCR, GPT)
✅ How to support multi-feature AI pipelines in a clean structure
✅ The power of hybrid folder strategies for AI-driven apps