Alex Gantman on Nostr: The fundamental conclusion emerging from AI security research seems to be that the ...
The fundamental conclusion emerging from AI security research seems to be that the system's output should be considered as sensitive as the most sensitive data in the training set (everything leaks) and as untrustworthy as the least trustworthy data in the training set (everything contaminates). This has echoes of the Bell-LaPadula and Biba models. Of course, these models failed because even if they were theoretically sound, they were unworkable in practice. The workaround was to insert humans in the loop to make authorization decisions. But the whole point of AI is to take the human out of the loop. Interesting dilemma.
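A minimal sketch of the propagation rule the post describes, assuming a simple ordered label scale (all names here are illustrative, not drawn from any real framework): an output inherits the maximum confidentiality (Bell-LaPadula-style sensitivity) and the minimum integrity (Biba-style trustworthiness) of the data that produced it.

from dataclasses import dataclass

# Hypothetical label lattices: later entries are more sensitive / more trusted.
CONFIDENTIALITY = ["public", "internal", "secret"]
INTEGRITY = ["untrusted", "curated", "verified"]

@dataclass
class Label:
    confidentiality: str  # Bell-LaPadula-style sensitivity level
    integrity: str        # Biba-style trust level

def derive_output_label(source_labels: list[Label]) -> Label:
    """Label a system's output conservatively:
    as sensitive as the most sensitive source (everything leaks),
    as untrustworthy as the least trustworthy source (everything contaminates)."""
    most_sensitive = max(source_labels, key=lambda l: CONFIDENTIALITY.index(l.confidentiality))
    least_trusted = min(source_labels, key=lambda l: INTEGRITY.index(l.integrity))
    return Label(most_sensitive.confidentiality, least_trusted.integrity)

# Example: mixing a secret-but-verified corpus with a public-but-untrusted scrape
# forces the output to be handled as both secret and untrusted.
print(derive_output_label([
    Label("secret", "verified"),
    Label("public", "untrusted"),
]))  # Label(confidentiality='secret', integrity='untrusted')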
Published at 2024-03-22 17:16:44
Event JSON
{
  "id": "30011cdcdb18d51880673927f047ff2bf3f087fc4e3a7a1a16af91e8475a85e7",
  "pubkey": "48208ce6ed49a196d88d90a06e5fce17907d057e4d8966b0fb62a34b8f033298",
  "created_at": 1711127804,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://ioc.exchange/users/againsthimself/statuses/112140471778437282",
      "activitypub"
    ]
  ],
  "content": "The fundamental conclusion emerging from AI security research seems to be that the system's output should be considered as sensitive as the most sensitive data in the training set (everything leaks) and as untrustworthy as the least trustworthy data in the training set (everything contaminates). This has echoes of the Bell-LaPadula and Biba models. Of course, these models failed because even if they were theoretically sound, they were unworkable in practice. The workaround was to insert humans in the loop to make authorization decisions. But the whole point of AI is to take the human out of the loop. Interesting dilemma.",
  "sig": "fcbae2785fd7379d6ee07ebb96f3b1737a06b4ecfe346643bdb2190682314e51f3d50c7317ea89aa561eb7d0d29e303e731ee50f0759ae31abe439d64cc9ea14"
}