jb55 / Will
npub1xts…kk5s
2025-02-20 20:58:17

big parser doesn't want you to know that building parsers is actually incredibly easy. for some reason parsers are scary to programmers.

here's a parser I just built that takes a list of text tokens out of visa pdf statements and converts them to ledger-cli (plain text accounting) format... because apparently a csv export from my bank is too much to ask for.

the basic idea is that you have one large array of strings extracted from the pdf. this is an unreadable stream of unstructured text chunks called tokens.

at each position you see if you can parse the next N tokens into a transaction line item. if it fails you just advance the parser by one token and try again. if it succeeds then the parser has already eaten all of the tokens it needed, so it can continue from there.
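a minimal python sketch of that loop, not the actual parser from this post. the token layout is an assumption made up for illustration: a line item is three consecutive tokens (an ISO date, a payee, an amount). the names `Transaction`, `parse_transaction`, `parse_statement`, `to_ledger` and the account names `Expenses:Unknown` and `Liabilities:Visa` are all hypothetical too. a real statement will need different patterns.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# ASSUMPTION: a line item is three consecutive tokens, e.g.
# "2025-01-15", "COFFEE SHOP", "4.50". Real statements will differ.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
AMOUNT_RE = re.compile(r"-?\d+\.\d{2}")


@dataclass
class Transaction:
    date: str
    payee: str
    amount: str


def parse_transaction(tokens: List[str], pos: int) -> Optional[Tuple[Transaction, int]]:
    """Try to parse one line item starting at pos.

    Returns (transaction, next_pos) on success, or None on failure.
    """
    if pos + 3 > len(tokens):
        return None  # not enough tokens left for a full line item
    date, payee, amount = tokens[pos:pos + 3]
    if not DATE_RE.fullmatch(date) or not AMOUNT_RE.fullmatch(amount):
        return None
    return Transaction(date, payee, amount), pos + 3


def parse_statement(tokens: List[str]) -> List[Transaction]:
    """Scan the token stream, collecting every line item that parses."""
    transactions = []
    pos = 0
    while pos < len(tokens):
        result = parse_transaction(tokens, pos)
        if result is None:
            pos += 1  # no line item here: advance by one token and retry
        else:
            txn, pos = result  # success: continue after the eaten tokens
            transactions.append(txn)
    return transactions


def to_ledger(txn: Transaction) -> str:
    """Format one transaction as a ledger-cli entry (hypothetical accounts)."""
    return (
        f"{txn.date} {txn.payee}\n"
        f"    Expenses:Unknown  ${txn.amount}\n"
        f"    Liabilities:Visa\n"
    )


if __name__ == "__main__":
    tokens = ["VISA", "Page 1", "2025-01-15", "COFFEE SHOP", "4.50",
              "junk", "2025-01-16", "BOOKSTORE", "12.00"]
    for txn in parse_statement(tokens):
        print(to_ledger(txn))
```

the failure case is the whole trick: `parse_transaction` never consumes anything unless the full line item matches, so skipping one token and retrying is always safe.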

you can build this kind of parser in *any* language and it would look identical. it's just a few simple functions and data.

after doing these kinds of things at a record label for 6 years I'm convinced that accounting is more of a software engineering job. I can't imagine going through pdfs manually each month.

I'm not sure what the point of this post was other than to say I like parsing things and that banks suck.

carry on.

Author Public Key
npub1xtscya34g58tk0z605fvr788k263gsu6cy9x0mhnm87echrgufzsevkk5s