mleku on Nostr: i'm not talking about recognising blabla.com i'm saying a correctly formed URL, very ...
i'm not talking about recognising blabla.com, i'm saying a correctly formed URL. those are very easy to write a regex for, but i don't have to because the stdlib in golang already has a full URL parsing library in net/url
i think you haven't read the RFC for it, because the accepted characters in a URL are very clear, as is the structure: proto://someth-ing.some_thing.sometld/path/to/thing?something, and occasionally the same thing with a # based section identifier. it's pretty clear and simple, and the only character in standard printable ascii not permitted in parameter keys and values is mostly just the & separator
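for illustration, a minimal sketch of leaning on net/url instead of a regex. the helper name splitURL and the rule "must have a scheme and a host" are assumptions made for this example, not something from the post; url.Parse on its own also accepts relative references, so a bare blabla.com parses without error and ends up in the path.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitURL is a hypothetical helper: it parses raw with net/url and returns
// the main components. Inputs without both a scheme and a host are rejected,
// because url.Parse alone also accepts relative references like "blabla.com".
func splitURL(raw string) (scheme, host, path, query, fragment string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", "", "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", "", "", "", "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	return u.Scheme, u.Host, u.Path, u.RawQuery, u.Fragment, nil
}

func main() {
	// Example input modelled on the structure described in the post.
	scheme, host, path, query, fragment, err := splitURL(
		"https://someth-ing.some_thing.sometld/path/to/thing?a=1&b=2#section")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(scheme, host, path, query, fragment)
}
```

the query string comes back raw here; u.Query() would split it into keys and values on the & separator.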
i'm quite sure there is a javascratch library for doing this correctly according to every possible nonsense that might show up in an RFC compliant implementation
the most efficient solution will be a state machine parser, but they are a little bit of work to get right... i basically wrote such things for json last year but part of the trick to why it saves time is it assumes it's minified and a couple other things, there is a whole swathe of invalid constructions it would let past but miraculously nobody bothers to mangle them anyhow (i don't even handle whitespace except to skip it until the next expected type of token)
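a rough sketch of what a state machine scanner for the URL shape above could look like. this is not the author's parser; the names scanURL, parts and the state constants are made up for the example, and like the json scanner described it is deliberately lax: it only switches state on delimiters, does no percent-decoding, leaves any port or userinfo inside the host, and lets plenty of invalid input through.

```go
package main

import "fmt"

type state int

const (
	inScheme state = iota
	inHost
	inPath
	inQuery
	inFragment
)

// parts holds the raw substrings found by the scanner (hypothetical type).
type parts struct {
	Scheme, Host, Path, Query, Fragment string
}

// isSchemeChar reports whether c may appear in a scheme (lax: position is not checked).
func isSchemeChar(c byte) bool {
	return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' ||
		c >= '0' && c <= '9' || c == '+' || c == '-' || c == '.'
}

// scanURL walks s once, changing state only when it meets a delimiter.
// It returns false when there is no "scheme://" prefix or the host is empty.
func scanURL(s string) (parts, bool) {
	var p parts
	st, start := inScheme, 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch st {
		case inScheme:
			switch {
			case c == ':':
				if i == 0 || i+2 >= len(s) || s[i+1] != '/' || s[i+2] != '/' {
					return p, false
				}
				p.Scheme = s[:i]
				i += 2 // skip the "//" after the colon
				start, st = i+1, inHost
			case !isSchemeChar(c):
				return p, false
			}
		case inHost:
			switch c {
			case '/':
				p.Host, start, st = s[start:i], i, inPath // path keeps its leading slash
			case '?':
				p.Host, start, st = s[start:i], i+1, inQuery
			case '#':
				p.Host, start, st = s[start:i], i+1, inFragment
			}
		case inPath:
			switch c {
			case '?':
				p.Path, start, st = s[start:i], i+1, inQuery
			case '#':
				p.Path, start, st = s[start:i], i+1, inFragment
			}
		case inQuery:
			if c == '#' {
				p.Query, start, st = s[start:i], i+1, inFragment
			}
		}
	}
	// Flush whatever segment was still open when the input ended.
	switch st {
	case inScheme:
		return p, false
	case inHost:
		p.Host = s[start:]
	case inPath:
		p.Path = s[start:]
	case inQuery:
		p.Query = s[start:]
	case inFragment:
		p.Fragment = s[start:]
	}
	return p, p.Host != ""
}

func main() {
	p, ok := scanURL("proto://someth-ing.some_thing.sometld/path/to/thing?something#frag")
	fmt.Printf("%+v %v\n", p, ok)
}
```

the single pass with no allocation beyond substrings is where the speed would come from; the cost is that correctness now depends on getting every transition right by hand.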