Ric on Nostr: nprofile1q…86naz GPTBot is the most aggressive content scraper I've come across in ...
nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpq5w5xwgp4jy94hc7d5pt6hm7dgeg460csypvu34e7ygcejnw5eupqn86naz (nprofile…6naz) GPTBot is the most aggressive content scraper I've come across in decades of server management. It totally ignores any crawl limits that you set in your robots.txt, and it operates on enough IPs to make even nginx-configured rate limiting a bit futile.
You can, though, block them (and others) by their user-agent string. Add this to your .htaccess to block both GPTBot and ClaudeBot, for example:
SetEnvIfNoCase ^User-Agent$ .*(ClaudeBot|GPTBot) BADBOTHAMMER
Deny from env=BADBOTHAMMER
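The snippet above uses the Apache 2.2 access-control syntax (Deny from), which on Apache 2.4 only works if mod_access_compat is loaded. Here is a rough equivalent for 2.4, as a sketch rather than a drop-in: it assumes mod_setenvif and mod_authz_core are enabled and that your host allows these directives in .htaccess (AllowOverride FileInfo AuthConfig). The variable name is the same arbitrary one as above.

```apache
# Flag any request whose User-Agent contains one of the bot names (case-insensitive).
SetEnvIfNoCase User-Agent "(ClaudeBot|GPTBot)" BADBOTHAMMER

# Allow everyone except requests carrying the flag set above.
<RequireAll>
    Require all granted
    Require not env BADBOTHAMMER
</RequireAll>
```

Flagged requests should get a 403 Forbidden. Add more bot names to the alternation as needed, separated by |.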