1.1 Understanding of Problem & Objectives
Theme Statement
Automated AI-Based Financial Transaction Categorisation
Background/Motivation
Modern financial applications—ranging from personal budgeting tools to business accounting platforms—require robust systems for classifying raw transaction strings (such as "Starbucks," "Amazon.com," or "Shell Gas") into meaningful categories ("Coffee/Dining," "Shopping," "Fuel") for budgeting, analytics, or reporting purposes.
Today, many developers rely on expensive, third-party APIs to achieve this, resulting in:
- High scaling costs
- Limited flexibility
- Suboptimal user experience
There is a pressing need for cost-effective, in-house AI solutions that empower developers with:
- Rapid transaction categorisation
- Enhanced control
- Full customisability
Problem Statement
Building a scalable transaction categorisation system is essential for seamless financial management. Reliance on external APIs introduces:
- Recurring costs
- Network latency
- Limits on customising the categorisation logic
Developing an internal AI or ML-based solution enables:
- Granular control
- Cost savings
- Improved responsiveness
However, this also raises new challenges:
- Need for high accuracy
- Adaptability to user-defined categories
- Rigorous evaluation
- Explainable outcomes
Challenge
Build a standalone, high-performance transaction categorisation system that achieves business-grade accuracy and transparency while eliminating external service dependencies.
Key Considerations
1. End-to-End Autonomous Categorisation
- The system must ingest raw financial transaction data and output a category and confidence score based on a predefined, user-configurable taxonomy
- All categorisation logic and inference must take place within the team's environment—no third-party API calls
2. Accuracy & Evaluation
- Deliver a macro F1-score of at least 0.90 on the dataset used for demonstration
- Submissions should include a detailed evaluation report with:
  - Confusion matrix
  - Macro and per-class F1 scores
- End-to-end reproducibility from data processing to inference
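The evaluation metrics listed above can be computed directly from pairs of true and predicted labels. The following is a minimal sketch in plain Python, assuming string labels; the function names and the sample labels are illustrative and are not taken from the project's code.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (labels, matrix); matrix[i][j] counts true labels[i] predicted as labels[j]."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


def per_class_f1(y_true, y_pred):
    """F1 per label, using F1 = 2*TP / (2*TP + FP + FN)."""
    labels, matrix = confusion_matrix(y_true, y_pred)
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # rest of the row
        fp = sum(row[i] for row in matrix) - tp  # rest of the column
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Illustrative labels only, not project data.
y_true = ["Fuel", "Fuel", "Shopping", "Coffee/Dining"]
y_pred = ["Fuel", "Shopping", "Shopping", "Coffee/Dining"]
labels, matrix = confusion_matrix(y_true, y_pred)
print(labels, matrix, round(macro_f1(y_true, y_pred), 4))
```

Because macro F1 averages the per-class scores without weighting, a rarely seen category counts as much as a frequent one, which makes it a stricter target than plain accuracy on imbalanced data.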
3. Customisable & Transparent
- The category taxonomy must be easily updated via a configuration file (e.g., JSON, YAML)
- Support admin-driven changes without direct code edits
- Bonus points for explainability: Provide insights or feature attributions explaining classification decisions
- Incorporate a simple feedback loop mechanism for users to review and correct low-confidence predictions
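As a rough illustration of the configuration and feedback requirements above, the sketch below loads a taxonomy from JSON, flags low-confidence predictions for review, and records a user correction. The key names (`categories`, `review_threshold`), the threshold value, and all function names are assumptions made for this example; JSON is used only because it needs no third-party parser.

```python
import json

# Hypothetical taxonomy file contents; the key names are illustrative.
TAXONOMY_JSON = """
{
  "review_threshold": 0.6,
  "categories": ["Coffee/Dining", "Shopping", "Fuel"]
}
"""


def load_taxonomy(text):
    """Parse and validate a taxonomy config so admins can edit it without code changes."""
    config = json.loads(text)
    categories = config.get("categories")
    if not categories:
        raise ValueError("taxonomy must define at least one category")
    threshold = float(config.get("review_threshold", 0.5))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("review_threshold must lie in [0, 1]")
    return {"categories": list(categories), "review_threshold": threshold}


def needs_review(confidence, taxonomy):
    """True when a prediction's confidence falls below the configured threshold."""
    return confidence < taxonomy["review_threshold"]


def record_correction(corrections, description, category, taxonomy):
    """Store a user correction, rejecting categories missing from the taxonomy."""
    if category not in taxonomy["categories"]:
        raise ValueError(f"unknown category: {category}")
    corrections[description.strip().lower()] = category
    return corrections


taxonomy = load_taxonomy(TAXONOMY_JSON)
corrections = {}
if needs_review(0.41, taxonomy):
    record_correction(corrections, "  SHELL GAS 0042 ", "Fuel", taxonomy)
```

Validating corrections against the loaded taxonomy keeps the feedback store consistent when an admin later renames or removes a category.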
4. Robustness & Responsible AI
- Handle noisy, variable transaction strings robustly
- Address ethical AI aspects, particularly in mitigating biases (e.g., based on merchant, region, or transaction amount)
- Adhere strictly to applicable data-protection regulations (e.g., GDPR, CCPA) and industry standards (e.g., SOC 2), and preserve data sovereignty by ensuring that no sensitive financial data leaves the team's controlled infrastructure
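One common way to handle noisy, variable transaction strings is to normalise them before classification. The sketch below is a simple assumption-laden example (lowercasing, dropping digits and punctuation, collapsing whitespace); the function name `normalise` and the sample strings are hypothetical, and a real pipeline may need gentler rules, for example to keep store numbers.

```python
import re


def normalise(raw):
    """Lowercase, replace anything that is not a-z or whitespace, collapse spaces."""
    text = raw.lower()
    text = re.sub(r"[^a-z\s]", " ", text)    # digits and punctuation become spaces
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text


print(normalise("SQ *STARBUCKS #1234 SEATTLE WA"))  # sq starbucks seattle wa
print(normalise("AMAZON.COM*MK1234"))               # amazon com mk
```

Dropping digits also removes reference numbers that vary between otherwise identical merchants, so variants of the same merchant string map to one key.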
Annexure
Out of Scope
- Full production deployments
- CI/CD pipelines
- Extensive user interfaces
- Real-time streaming
- Fraud/anomaly detection
- Financial advice features
Performance Measurement
- Extra credit for providing throughput and latency benchmarks
- Transparent measurement notes
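A throughput and latency benchmark of the kind mentioned above could be gathered with a small timing harness. This is a minimal sketch using only the standard library; `benchmark`, its `warmup` parameter, and the stand-in classifier are assumptions for illustration, not the project's measurement code.

```python
import math
import statistics
import time


def benchmark(classify, transactions, warmup=10):
    """Time classify() per transaction; report mean/p95 latency and throughput."""
    if not transactions:
        raise ValueError("need at least one transaction to benchmark")
    for text in transactions[:warmup]:   # warm caches before measuring
        classify(text)
    latencies_ms = []
    start = time.perf_counter()
    for text in transactions:
        t0 = time.perf_counter()
        classify(text)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)  # nearest-rank percentile
    return {
        "count": len(ordered),
        "mean_ms": statistics.fmean(ordered),
        "p95_ms": ordered[p95_index],
        "throughput_per_s": len(ordered) / elapsed if elapsed > 0 else float("inf"),
    }


# Stand-in classifier; replace with the real inference function.
stats = benchmark(str.lower, ["Starbucks #123"] * 1000)
print(stats)
```

When reporting such numbers, the measurement notes should state the hardware, the batch size, and whether model loading time is included.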
Resources
- No official dataset is provided
- Teams should source or generate their own data (e.g., public datasets or synthetic data)
- Document the data acquisition process clearly
Deliverables
Required
- Source code repository
- README with setup instructions
- Dataset documentation
- Metrics report
  - Macro/per-class F1 scores
  - Confusion matrix
- Accuracy metrics
- Short demo
  - Pipeline execution
  - Evaluation results
  - Sample predictions with confidence scores
  - Demo of taxonomy modification via config
Bonus Objectives
- Explainability UI
- Robustness to input noise
- Batch inference performance metrics
- Simple human-in-the-loop feedback
- Bias mitigation discussion
Our Solution Approach
This project implements a hybrid ensemble system that combines several classification methods to achieve high accuracy while keeping each decision attributable to the method that produced it:
Architecture Components
- MCC (Merchant Category Code) Classifier - ISO 18245 standard codes
- Rule-based Engine - Deterministic pattern matching
- ML Classifier - LightGBM with sentence-transformers embeddings
- LLM Fallback - Ollama (local) for edge cases
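This section does not say how the four components are combined. One plausible arrangement, shown here purely as a sketch, is a confidence-gated cascade that tries the cheapest and most deterministic method first and falls through to the next when confidence is too low. The stage order, thresholds, confidence values, keyword rules, and stub functions are all assumptions; the ML and LLM stages are placeholders rather than real LightGBM or Ollama calls.

```python
# Illustrative ISO 18245 codes: 5541 = service stations, 5814 = fast food restaurants.
MCC_TO_CATEGORY = {"5541": "Fuel", "5814": "Coffee/Dining"}
RULES = [("amazon", "Shopping"), ("shell", "Fuel")]  # hypothetical keyword rules


def mcc_stage(txn):
    category = MCC_TO_CATEGORY.get(txn.get("mcc"))
    return (category, 0.99) if category else None


def rule_stage(txn):
    text = txn["description"].lower()
    for keyword, category in RULES:
        if keyword in text:
            return (category, 0.95)
    return None


def ml_stage(txn):
    # Placeholder for a trained classifier over text embeddings.
    return ("Shopping", 0.40)


def llm_stage(txn):
    # Placeholder for a locally hosted LLM used only for edge cases.
    return ("Other", 0.50)


# (method name, stage function, minimum confidence to accept) - assumed values.
STAGES = [
    ("mcc", mcc_stage, 0.90),
    ("rules", rule_stage, 0.90),
    ("ml", ml_stage, 0.70),
    ("llm", llm_stage, 0.0),
]


def categorise(txn):
    """Return the first stage result that meets its confidence threshold."""
    for method, stage, min_confidence in STAGES:
        result = stage(txn)
        if result is not None and result[1] >= min_confidence:
            category, confidence = result
            return {"category": category, "confidence": confidence, "method": method}
    return {"category": "Uncategorised", "confidence": 0.0, "method": "none"}


print(categorise({"description": "SHELL GAS 0042", "mcc": "5541"}))
print(categorise({"description": "Amazon.com"}))
print(categorise({"description": "XYZ HOLDINGS"}))
```

Returning the `method` alongside the category gives each prediction a simple form of attribution, since a reviewer can see which stage produced it.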
Key Features
- 98.43% validation accuracy and 0.9842 macro F1 (exceeds the 0.90 macro F1 requirement)
- Configurable taxonomy via YAML (28 balanced categories)
- Explainable results with method attribution and confidence scores
- Feedback loop with corrections mechanism
- Zero external API costs (fully autonomous)
- Production-ready with Docker deployment
Performance Metrics
- Macro F1 Score: 0.9842 (98.42%)
- Accuracy: 98.43%
- Average Latency: <100ms per transaction
- Batch Processing: 1000+ transactions supported
See detailed implementation documentation in subsequent sections.