Jim Winstead on Nostr: Not to defend the shitstorm that is going to descend on them, but it sounds like what ...
Not to defend the shitstorm that is going to descend on them, but it sounds like what Automattic/WordPress.com was trying to do is provide feeds/dumps of content being posted publicly rather than just making OpenAI crawl everything. (Sounds like they screwed up by providing non-public content, which is very bad if true.)
This has echoes of when Google started to do more real-time integration of content into their search indexes instead of just periodic crawls. I don’t know much about where LLM tech is going, but I assume that feeding additional data into the models closer to real-time is on the horizon.
Has anyone looked at what crawlers come to your blog when it pings blo.gs or you use pingomatic.com?
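One rough way to answer that question is to note when the ping went out and tally the user agents that hit the blog shortly afterwards. The sketch below assumes the web server writes Combined Log Format access logs; the function name `agents_after_ping`, the ten-minute window, and the sample log line are all made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed format (Combined Log Format):
# ip - - [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def agents_after_ping(lines, ping_time, window=timedelta(minutes=10)):
    """Count user agents seen within `window` after `ping_time`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        seen = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if ping_time <= seen <= ping_time + window:
            counts[match.group("agent")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log line and ping time, for illustration only.
    sample = [
        '203.0.113.5 - - [01/Mar/2024:12:00:30 +0000] '
        '"GET /feed/ HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    ]
    ping = datetime(2024, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
    for agent, hits in agents_after_ping(sample, ping).most_common():
        print(hits, agent)
```

Comparing the tally for a window after a ping against a window of the same length on a quiet day would give a crude sense of which visitors are reacting to the ping rather than crawling on their own schedule.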
(Firehoses of data are valuable, and sometimes I am curious about how much of a money spigot I sold to Yahoo. The firehose is one of the distinguishing architectural differences between Bluesky and ActivityPub.)