katrintheresa on Nostr:
Lunch quote: on lunch break and bored, so I'm rereading a section of Bostrom's book. I'm obviously approaching this from a philosophical point of view rather than as someone who creates in the field, but I find it deeply engrossing.
“Another point, which counts against some types of oracles and genies, is that there are risks involved in designing a superintelligence to have a final goal that does not fully match the outcome that we ultimately seek to attain. For example, if we use a domesticity motivation to make the superintelligence want to minimize some of its impacts on the world, we might thereby create a system whose preference ranking over possible outcomes differs from that of the sponsor. The same will happen if we build the AI to place a peculiarly high value on answering questions correctly, or on faithfully obeying individual commands. Now, if sufficient care is taken, this should not cause any problems: there would be sufficient agreement between the two rankings—at least insofar as they pertain to possible worlds that have a reasonable chance of being actualized—that the outcomes that are good by the AI’s standard are also good by the principal’s standard. But perhaps one could argue for the design principle that it is unwise to introduce even a limited amount of disharmony between the AI’s goals and ours. (The same concern would of course apply to giving sovereigns goals that do not completely harmonize with ours.)”
https://a.co/ds1tjc9