Expense Classification. How it started. How it's going.
A full-blown tool might not be the best approach. Let's stick to spreadsheets instead.
Yeah, I’m still going down the expense classification rabbit hole. I do it every month, so simplifying the process is a problem I really want to solve.
I’ve slightly changed my mind about what I want to build since writing my last piece.
What is it that I’m trying to do?
Every month I export a bunch of transactions from my credit card and bank accounts and want to know which category each one falls into. I’m really happy with my current process. It’s rules-based, meaning that I define something like *Coles* = Groceries, i.e. if it finds Coles in the description, it sets the category to Groceries.
The problem? If it’s your first time at KMART, no existing rule matches and you’ll have to set up a new one.
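To make the rules-based idea concrete, here is a minimal Python sketch. It assumes rules are plain substring matches; the rule table, the `classify_by_rules` name and the sample descriptions are made up for illustration and are not the actual spreadsheet setup.

```python
from typing import Dict, Optional

# Hypothetical rule table: keyword to look for -> category to assign.
RULES: Dict[str, str] = {
    "COLES": "Groceries",
    "UBER": "Transport",
}


def classify_by_rules(description: str, rules: Dict[str, str] = RULES) -> Optional[str]:
    """Return the category of the first rule whose keyword occurs in the description."""
    text = description.upper()  # match case-insensitively
    for keyword, category in rules.items():
        if keyword in text:
            return category
    return None  # no rule matched, so a new rule has to be added by hand


print(classify_by_rules("Coles Supermarket 1234 Sydney"))
print(classify_by_rules("KMART 1045 MELBOURNE"))
```

The second call comes back empty, which is exactly the first-time-at-KMART situation: nothing happens until someone writes the rule.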
Sounds like AI could help, so I started using some language models and built this custom app (details here).
On second thought, though
I realised that these kinds of custom apps are not something I’d want to use.
Copilot, RocketMoney, MonarchMoney, etc. all lock you in. It’s often hard to move to a different service and even harder to import legacy data manually, and most of them are so overwhelming that I feel only an accountant can stomach the amount of numbers.
So if I wouldn’t want to use my own tool, then why build it at all?
What I do like using is a spreadsheet. Ideally, I’d keep the spreadsheet and make my current classification smarter.
That’s what Tiller does: it lets you keep a spreadsheet and automatically imports transaction data from your bank and credit card accounts.
It’s a nice little spreadsheet add-on that lets you keep what you know - the spreadsheet - and just makes it smarter.
The problem: they do expense classification exactly the same way I do, with rules.
Anyway, the idea that stuck with me was the spreadsheet add-on.
Building a Google Sheets add-on
I quickly found this great repo for a React-based Sheets add-on.
This should allow me to call some sort of API/backend service to categorise transactions.
Transaction classification API
As always, there is no need to reinvent the wheel. This guy here did something very similar to what I’m trying to do. I rewrote my sentence transformer BERT script to run as a Google Cloud Function and got it to work with a small dataset.
The whole function is currently very slow, and I had to give it about 2 GB of memory for it to work at all.
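The script itself isn't shown here, so the following is only a sketch of one common way to classify with embeddings: embed each transaction description, compare it to already labelled examples, and take the category of the most similar one. Everything named here (`toy_embed`, `nearest_category`, the sample data) is hypothetical. The toy embedding just counts character trigrams so that the example runs without downloading a model; in a real setup `embed` would be replaced by a sentence-transformer model's encoding step.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding: character trigram counts. NOT a real sentence embedding."""
    padded = f"  {text.lower()}  "
    return dict(Counter(padded[i:i + 3] for i in range(len(padded) - 2)))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_category(
    description: str,
    labelled: List[Tuple[str, str]],
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[str, float]:
    """Return (category, similarity) of the labelled example closest to the description."""
    query = embed(description)
    scored = [(cosine(query, embed(text)), category) for text, category in labelled]
    score, category = max(scored)
    return category, score


# Hypothetical labelled history, e.g. last month's already categorised rows.
history = [
    ("COLES SUPERMARKET", "Groceries"),
    ("WOOLWORTHS METRO", "Groceries"),
    ("UBER TRIP", "Transport"),
    ("KMART ONLINE", "Shopping"),
]
print(nearest_category("KMART 1045 MELBOURNE", history))
```

Note that the toy embedding only captures spelling overlap, so it still needs a similar-looking merchant in the history. The returned similarity score could be used to flag low-confidence rows for manual review instead of guessing.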
Here is what the current setup looks like.
There are still a bunch of problems to be solved, which I’ll hopefully be able to work through soon.
What’s next?
The model is still quite slow, so I’ll have to find out how to speed it up. I found some resources here:
The part of the architecture that is still missing is the connection between the Google Sheets add-on and the Google Cloud Function.
I also stumbled across this interesting model on Replicate, which is a redeployed SentenceTransformer. Going to try this out as well.
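For the missing link between the add-on and the Cloud Function, one option is a tiny JSON contract: the add-on posts a list of descriptions and gets back a list of categories in the same order. The sketch below shows only the request-handling logic as a plain Python function. The payload shape (`descriptions` in, `categories` out) and the `handle_body` name are assumptions for illustration, not something that exists yet, and the classifier is passed in so any model can be plugged in behind it.

```python
import json
from typing import Callable


def handle_body(body: str, classify: Callable[[str], str]) -> str:
    """Turn a JSON request body into a JSON response body.

    Assumed request:  {"descriptions": ["COLES 123", ...]}
    Assumed response: {"categories": ["Groceries", ...]}
    """
    try:
        descriptions = json.loads(body)["descriptions"]
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, a missing key, and a payload that is not an object.
        return json.dumps({"error": "expected a JSON object with a 'descriptions' list"})
    if not isinstance(descriptions, list):
        return json.dumps({"error": "'descriptions' must be a list"})
    # Keep the output order identical to the input order so rows line up in the sheet.
    return json.dumps({"categories": [classify(str(d)) for d in descriptions]})


# Usage with a dummy classifier standing in for the real model.
def dummy(description: str) -> str:
    return "Groceries" if "COLES" in description.upper() else "Unknown"


print(handle_body('{"descriptions": ["COLES 123", "KMART 1045"]}', dummy))
```

Sending a whole batch of rows per request rather than one row at a time should also help with the speed problem, since the model only needs to be loaded once per call.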