Stefano Marinelli on Nostr:
Monitoring shouts at me: "This server is DOWN!"
I immediately check - it doesn’t respond to ping requests. I try to reboot it remotely - no luck.
I attempt to request a remote console; after more than 45 minutes, there’s still no reply.
I check the logs: the last ZFS send/receive based backup occurred just 23 minutes before the outage (it's an hourly backup).
I call the client to explain the situation: we can either wait or restore from a backup. They express a preference to get back to work after lunch (13:30).
I set up a VPS, install FreeBSD and some packages, then connect to the backup server:
zfs send -RLvw [mybckdataset]/bastille@lastSnap | pigz - | mbuffer -m512M | ssh destserver "pigz -d - | zfs receive -x canmount -x readonly zroot/bastille"
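For readers less familiar with the pipeline above, here is a minimal annotated sketch of how it is put together. It only prints the command rather than running it. The function name build_pipeline and the example dataset name tank/backup/bastille are illustrative assumptions, not values from the post.

```sh
# Sketch only: prints the transfer pipeline instead of executing it.
# build_pipeline and its three arguments are hypothetical names for illustration.
build_pipeline() {
  src_snap="$1"   # snapshot on the backup server, e.g. pool/dataset@snap
  dest_host="$2"  # SSH name of the new server
  dest_ds="$3"    # dataset to create on the new server

  # zfs send flags:
  #   -R  replicate the dataset with its descendants, snapshots and properties
  #   -L  allow large blocks in the stream
  #   -v  print progress information
  #   -w  raw send, so encrypted data stays encrypted in the stream
  # pigz compresses in parallel; mbuffer adds a 512 MB memory buffer.
  # zfs receive -x keeps the received canmount/readonly values from being applied.
  printf '%s | %s | %s | ssh %s "%s | %s"\n' \
    "zfs send -RLvw $src_snap" \
    "pigz -" \
    "mbuffer -m512M" \
    "$dest_host" \
    "pigz -d -" \
    "zfs receive -x canmount -x readonly $dest_ds"
}

# Example call with placeholder names:
build_pipeline tank/backup/bastille@lastSnap destserver zroot/bastille
```

Printing the command first makes it easy to review the dataset names before anything is sent over the wire.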
After a few minutes (50 GB later):
zfs load-key -r zroot/bastille   # the datasets are encrypted
zfs mount -a
service bastille start
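The three steps above could be wrapped in a small script, roughly like this sketch. The helper names run and bring_up and the DRY_RUN variable are assumptions made for illustration. By default it only prints the commands, and it stops at the first step that fails when run for real.

```sh
# Sketch only: run and bring_up are hypothetical helpers, not part of any tool.
# run prints the command when DRY_RUN is 1 (the default here), otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

bring_up() {
  ds="$1"
  # Load the encryption keys for the dataset and all of its children.
  run zfs load-key -r "$ds" || return 1
  # Mount every dataset that can now be mounted.
  run zfs mount -a || return 1
  # Start the jails managed by Bastille.
  run service bastille start
}

# Dry run with the dataset name from the post:
bring_up zroot/bastille
```

Setting DRY_RUN=0 would execute the commands instead of printing them.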
Everything's up and running. DNS record changed - disaster recovered. Time: 12:48.
I call the client and say, "Hey, you’re back up. Now we’ll wait for the original server to come back, and then we’ll resync the datasets."
The customer, with a witty remark that cleverly shows gratitude without being direct, replies, "Oh come on, and I was hoping to extend my lunch break! 😆"
FreeBSD, jails, and ZFS have, once again, done an excellent job.
Now, I can have my lunch.
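The post does not say how the later resync is done. One possible approach is an incremental raw send from the temporary VPS back to the original server, sketched below. This assumes the original server still holds the common snapshot. The function name build_resync, the snapshot name resync and the host name origserver are placeholders. The sketch only prints the command.

```sh
# Hypothetical sketch: prints an incremental resync command; nothing is executed.
build_resync() {
  ds="$1"         # dataset on the temporary server
  base_snap="$2"  # last snapshot both sides have in common
  new_snap="$3"   # fresh snapshot taken on the temporary server
  orig_host="$4"  # SSH name of the recovered original server
  orig_ds="$5"    # dataset on the original server

  # -I sends every snapshot between base_snap and new_snap.
  # -w keeps the stream raw, so encrypted data is not decrypted for transfer.
  # zfs receive -F rolls the target back to its latest snapshot before receiving.
  printf 'zfs send -RLw -I @%s %s@%s | ssh %s "zfs receive -F %s"\n' \
    "$base_snap" "$ds" "$new_snap" "$orig_host" "$orig_ds"
}

# Example call with placeholder names:
build_resync zroot/bastille lastSnap resync origserver zroot/bastille
```

Because receive -F discards changes on the target made after the common snapshot, it is worth checking what the original server wrote in its last minutes before running anything like this.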
#FreeBSD #ZFS #jails #RunBSD #IT #SysAdmin #DisasterRecovery