Part 5: Case Studies and Templates

19. Case Study: Smart Receipt / Invoice Analyzer

“OCR is messy. Accounting rules are worse.
But structure makes sense of both.”


Overview

The Smart Receipt/Invoice Analyzer is a full-stack AI application that:

  • Accepts images or PDFs of receipts and invoices
  • Performs OCR (Optical Character Recognition)
  • Extracts structured fields (merchant, total, date, tax)
  • Parses text into key-value data using GPT
  • Lets users edit and correct the parsed fields in the frontend
  • Stores parsed data in a database
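
To make the "structured fields" step concrete, here is a minimal Python sketch of turning the model's key-value reply into typed data. It assumes the model is asked to answer with a JSON object whose keys are merchant, total, date and tax, with the date in ISO format. `InvoiceFields` and `parse_gpt_reply` are illustrative names, not the project's actual code.

```python
import datetime as dt
import json
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class InvoiceFields:
    """Typed view of the extracted fields (hypothetical shape)."""
    merchant: str
    total: Decimal
    date: dt.date
    tax: Decimal


def parse_gpt_reply(reply: str) -> InvoiceFields:
    """Validate a JSON reply from the LLM and convert it to typed fields.

    Raises ValueError if the reply is not valid JSON, a key is missing,
    or a value cannot be converted.
    """
    try:
        raw = json.loads(reply)
        return InvoiceFields(
            merchant=str(raw["merchant"]).strip(),
            # Go through str() so floats like 1.05 do not pick up binary noise.
            total=Decimal(str(raw["total"])),
            date=dt.date.fromisoformat(raw["date"]),
            tax=Decimal(str(raw["tax"])),
        )
    except (KeyError, TypeError, ValueError, InvalidOperation) as exc:
        raise ValueError(f"Unusable GPT reply: {exc!r}") from exc
```

Money is kept as `Decimal` rather than `float` so that totals and tax compare exactly once they reach the database.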

This is a perfect case to demonstrate:

  • Modular architecture for isolated AI pipelines
  • Clean foldering for OCR, GPT, and DB services
  • How to evolve from modular → hybrid structure

Initial Modular Structure

We began with a feature-first modular approach.

Frontend (React + Vite)

frontend/
└── src/
    └── features/
        └── invoice/
            ├── components/
            │   ├── UploadForm.tsx
            │   └── ParsedPreview.tsx
            ├── hooks/
            │   └── useUpload.ts
            ├── services/
            │   └── invoiceApi.ts
            └── types/
                └── InvoiceFields.ts

Backend (FastAPI)

backend/
└── app/
    └── features/
        └── invoice/
            ├── routes.py
            ├── services/
            │   ├── ocr_service.py
            │   └── gpt_parser.py
            ├── schemas/
            │   └── invoice.py
            └── models.py

Each feature folder handled everything it needed:

  • Frontend: form, preview, upload logic
  • Backend: OCR, parsing, validation, and response shaping
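
The backend flow described above (OCR, parsing, validation, response shaping) can be sketched as one function with the two AI stages passed in as callables. This is a sketch under assumptions: `analyze_invoice`, the stage signatures and the response keys are hypothetical, and in the real feature the callables would wrap `ocr_service.py` and `gpt_parser.py`.

```python
from typing import Callable, Dict

# Hypothetical stage signatures for illustration only.
OcrFn = Callable[[bytes], str]
ParseFn = Callable[[str], Dict[str, str]]

REQUIRED_FIELDS = ("merchant", "total", "date", "tax")


def analyze_invoice(file_bytes: bytes, ocr: OcrFn, parse: ParseFn) -> dict:
    """Run OCR -> parsing -> validation -> response shaping for one upload."""
    text = ocr(file_bytes)
    fields = parse(text)
    # Validation: report required fields that are absent or empty.
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    # Response shaping: the frontend can highlight gaps for manual correction.
    return {
        "fields": fields,
        "missing": missing,
        "needs_review": bool(missing),
    }
```

Passing the stages in as arguments keeps the feature self-contained while leaving room to swap in shared services later.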

Evolution to Hybrid Structure

The modular layout held until we added two things:

  • A chatbot module that also needed OCR
  • Document classification logic for PDFs

At that point we extracted the shared logic into global services.

New Shared Backend Structure

backend/
└── app/
    ├── features/
    │   └── invoice/
    ├── services/
    │   ├── ocr_engine.py         # Shared OCR logic (Tesseract, PaddleOCR)
    │   ├── gpt_client.py         # Central GPT communication layer
    │   └── file_utils.py
    └── schemas/
        └── shared.py

Now:

  • Any feature can call ocr_engine.extract_text(file)
  • GPT prompts are templated and reused across modules
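
A minimal sketch of what these two shared services could look like follows. It assumes OCR backends are registered as plain callables and that prompts are stored as `string.Template` objects. The registry, `register_backend`, `render_prompt`, the `backend` parameter and the prompt wording are all hypothetical; only the `extract_text` entry point is named in the text above.

```python
from string import Template
from typing import Callable, Dict

# Registry of OCR backends. Real entries would wrap Tesseract or PaddleOCR;
# the callable signature here is an assumption for illustration.
OcrBackend = Callable[[bytes], str]
_BACKENDS: Dict[str, OcrBackend] = {}


def register_backend(name: str, backend: OcrBackend) -> None:
    """Make an OCR implementation available under a short name."""
    _BACKENDS[name] = backend


def extract_text(file_bytes: bytes, backend: str = "tesseract") -> str:
    """Single entry point that any feature can call."""
    try:
        run = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unknown OCR backend: {backend!r}") from None
    return run(file_bytes).strip()


# Prompt templates live in one place so every feature shares the same wording.
PROMPTS = {
    "invoice_fields": Template(
        "Extract merchant, total, date and tax as JSON from this text:\n$ocr_text"
    ),
}


def render_prompt(name: str, **values: str) -> str:
    """Fill a named template; raises KeyError if a placeholder is missing."""
    return PROMPTS[name].substitute(**values)
```

With this shape, the invoice and chatbot features depend on `extract_text` and `render_prompt` only, so swapping the OCR engine or adjusting a prompt happens in one file.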

Testing Breakdown

Unit Tests

tests/
└── invoice/
    ├── test_parser.py
    ├── test_ocr.py
    └── test_routes.py

Mocks

shared/test_helpers/
├── fake_invoice_image.png
└── mock_gpt.py

Mocking GPT and OCR was essential to keep the tests fast and predictable.
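
As an illustration of that mocking approach, the sketch below replaces both services with `unittest.mock.Mock` objects so the test never touches an OCR engine or the network. `parse_invoice` is a stand-in for the code under test, and the `extract_text` and `parse` method names are assumptions rather than the project's real interfaces.

```python
import unittest
from unittest.mock import Mock


def parse_invoice(file_bytes, ocr, gpt):
    """Stand-in for the code under test: OCR the file, then ask GPT to parse it."""
    text = ocr.extract_text(file_bytes)
    return gpt.parse(text)


class ParseInvoiceTest(unittest.TestCase):
    def test_ocr_text_is_passed_to_gpt(self):
        # Canned responses replace the slow, non-deterministic services.
        ocr = Mock()
        ocr.extract_text.return_value = "ACME TOTAL 12.50"
        gpt = Mock()
        gpt.parse.return_value = {"merchant": "ACME", "total": "12.50"}

        result = parse_invoice(b"fake-image-bytes", ocr, gpt)

        self.assertEqual(result["total"], "12.50")
        ocr.extract_text.assert_called_once_with(b"fake-image-bytes")
        gpt.parse.assert_called_once_with("ACME TOTAL 12.50")
```

Because the mocks return fixed values, a failure points at the glue code rather than at OCR quality or model drift.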


Tech Stack Snapshot

Layer        Tool
OCR          Tesseract / PaddleOCR
LLM          OpenAI GPT-3.5 / GPT-4
Backend      FastAPI
Frontend     React + Vite
State        React Context
Deployment   Railway + Vercel
DB           Supabase (PostgreSQL)
Auth         JWT (Backend) + Supabase Auth (Frontend)

Key Lessons

Challenge                                    Resolution
GPT prompts reused in multiple modules       Moved prompt logic to services/gpt_client.py
OCR used across invoice/chatbot features     Created ocr_engine.py in shared service layer
Uploads had differing file handling logic    Created file_utils.py with unified image/PDF handlers
Type duplication between front and back      Introduced packages/shared-schemas/ in monorepo
Components growing too large                 Split into UploadForm, FieldPreview, ErrorBanner
Test bloat in one location                   Adopted hybrid test structure (feature + root tests)
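
One way a unified image/PDF handler can start is by classifying an upload from its leading bytes instead of trusting the file extension. The sketch below is an assumption about how such a helper might look, not the contents of the project's file_utils.py. `detect_kind` is a hypothetical name, and only PDF, PNG and JPEG signatures are covered.

```python
# Well-known file signatures ("magic bytes") at the start of each format.
PDF_MAGIC = b"%PDF-"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def detect_kind(data: bytes) -> str:
    """Classify an upload as 'pdf' or 'image' from its first bytes."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith((PNG_MAGIC, JPEG_MAGIC)):
        return "image"
    raise ValueError("Unsupported upload: expected a PDF, PNG or JPEG")
```

Callers can then branch once, for example rasterizing PDF pages before OCR while sending images straight through.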

Final Hybrid Folder Snapshot

Backend:

app/
├── features/
│   └── invoice/
│       ├── routes.py
│       ├── services/
│       └── schemas/
├── services/
│   ├── gpt_client.py
│   ├── ocr_engine.py
│   └── file_utils.py
└── schemas/
    └── shared.py

Frontend:

src/
├── features/
│   └── invoice/
│       ├── components/
│       ├── services/
│       └── hooks/
└── shared/
    └── utils/
        └── filePreview.ts

Project Outcomes

  • 100+ invoices processed in a demo database
  • Accurate OCR + GPT parsing of total/tax/date in 90% of test cases
  • Scalable architecture allowed new features (chatbot, classifier) to be added without breaking invoice logic
  • Deployed to Vercel + Railway with monorepo CI/CD

What This Case Proves

This case demonstrates:

✅ How to bootstrap modularly with minimal complexity
✅ When and how to extract shared AI logic (OCR, GPT)
✅ How to support multi-feature AI pipelines in a clean structure
✅ The power of hybrid folder strategies for AI-driven apps