Fabio Manganiello on Nostr:
nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpqjl0wt67um0zeynsj434uwxul8vrxlhm4xl3n832w60d9nyayxzhs35v5sj my doctrine is that abuse shouldn’t prevent legitimate use.
The vision behind the (original) Web 3.0 was to have a Web that was as machine-readable as it was human-readable. Giving up on that vision because of a small percentage of abusers is like giving up on hosting a website because it may be subject to DDoS attacks.
From a more practical perspective, as someone who has been hosting sites for a couple of decades (without even having Cloudflare and friends in front of them), I’ve noticed two main scraping abuse patterns:
Genuinely misconfigured scripts (it has also happened a few times with Fediverse instances). In those cases, it’s usually quite easy to find the culprit and urge them to fix their logic.
Malicious actors: in that case it has nothing to do with scraping (they don’t want your content, they just want to take your website down).
Some mitigation actions that I’ve found useful:
Cache your static content.
Provide your content in a machine-readable format too (RSS, JSON-LD, RDF, some API…). Legitimate scrapers much prefer data that is already machine-ready to implementing their own brittle HTML scraper. And non-HTML content is usually also much lighter in terms of payload size and server load (a minimal sketch of both points follows below).
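To make those two points concrete, here is a minimal sketch of a server that serves static pages with long-lived Cache-Control headers and exposes the same posts as a small JSON feed that scrapers can consume instead of parsing HTML. It uses only the Go standard library; the ./public directory, the /feed.json path, and the Post struct are illustrative placeholders, not a description of any actual setup.

```go
// Minimal sketch: cached static content plus a machine-readable feed.
// Paths, the Post struct, and ./public are illustrative assumptions.
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// Post is a hypothetical content item; shape it however your site stores articles.
type Post struct {
	Title     string    `json:"title"`
	URL       string    `json:"url"`
	Published time.Time `json:"published"`
}

var posts = []Post{
	{Title: "Example article", URL: "https://example.org/example-article", Published: time.Now()},
}

// withCache adds a long-lived Cache-Control header so browsers and any
// reverse proxy or CDN in front of the site can absorb repeat requests.
func withCache(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "public, max-age=86400")
		h.ServeHTTP(w, r)
	})
}

func main() {
	// Static HTML/CSS/images from ./public, served with cache headers.
	http.Handle("/", withCache(http.FileServer(http.Dir("./public"))))

	// The same content in machine-readable form: scrapers can poll this
	// instead of parsing HTML, and the payload is much smaller.
	http.HandleFunc("/feed.json", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("Cache-Control", "public, max-age=300")
		if err := json.NewEncoder(w).Encode(posts); err != nil {
			log.Println("encode:", err)
		}
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With something like this in place, caches soak up most of the repeat traffic, and well-behaved scrapers get a stable, lightweight endpoint to hit rather than hammering the HTML pages.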