Pavel Korytov :emacs:☮️ /
npub15zl…frsl
2024-11-09 21:01:01

I've tried making a full-text #RSS feed for the websites of ScienceX, the parent org for Phys.org and Tech Xplore.

The webpages are very straightforward, so the bridge (for #rssbridge) took only about 200 LoC. But! #CloudFlare is super zealous there.

Even with the following parameters:
- 3 feeds
- fetch every hour
- cache webpages for 7 days (= fetch each webpage only once, for all intents and purposes)
I already got 429'ed. I'll try fetching every 4 hours, I guess...
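
For reference, the caching and back-off policy described above can be sketched roughly as follows. This is not the actual bridge (RSS-Bridge bridges are written in PHP), just a minimal Python illustration. `CachedFetcher`, `RateLimited`, the injected `fetch` callable and the 4-hour default pause are all hypothetical names and values chosen for the sketch.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL = 7 * 24 * 3600      # cache each webpage for 7 days (from the post)
DEFAULT_PAUSE = 4 * 3600       # assumed pause after a 429 with no Retry-After


class RateLimited(Exception):
    """Hypothetical error the injected fetch raises on HTTP 429."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("HTTP 429 Too Many Requests")
        self.retry_after = retry_after  # seconds, if the server sent one


class CachedFetcher:
    """Fetch each URL at most once per TTL and stop entirely after a 429."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl: float = CACHE_TTL,
                 clock: Callable[[], float] = time.time):
        self.fetch = fetch            # does the real HTTP request
        self.ttl = ttl
        self.clock = clock            # injectable so the logic is testable
        self.cache: Dict[str, Tuple[float, str]] = {}
        self.blocked_until = 0.0      # no requests at all before this time

    def get(self, url: str) -> Optional[str]:
        now = self.clock()
        hit = self.cache.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]             # fresh cache entry, no request made
        stale = hit[1] if hit is not None else None
        if now < self.blocked_until:
            return stale              # still backing off; serve stale or nothing
        try:
            body = self.fetch(url)
        except RateLimited as err:
            pause = err.retry_after if err.retry_after is not None else DEFAULT_PAUSE
            self.blocked_until = now + pause
            return stale
        self.cache[url] = (now, body)
        return body
```

With 3 feeds fetched hourly, the feed index pages alone account for 72 requests a day; the cache only removes repeat fetches of article pages, so every new article still costs one request.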

W-why such extreme measures to prevent parsing? I'm sure #AI corps or whoever needs their data will just hire a bunch of people to solve CloudFlare's CAPTCHAs, but everyone else will be left behind.

Just give me the damn full-text RSS, I'd even pay for it... if I could sign up. The signup form returns 503 for me.
Author Public Key
npub15zlt94rw03ze79fe2r8n4u7xu2d6r5ck6zxeaykfx97qdp7fnd2svxfrsl