John Regehr on Nostr:
sigh.... my group's big machine has started rebooting every couple days with a fatal CPU error. I guess I should run a memory checker on it for a while? also maybe clock down the RAM and see if that helps?
https://gist.github.com/regehr/41c99b95a2e1fec6ac2ef31d2c340024