Whoops, broke the server 13/3/16
9 years 3 weeks ago - 9 years 3 weeks ago #1926
by mw0uzo
Hi all, sorry for the outage over Sunday evening and Monday morning.
I needed to power the machine down to move a power cable, so I took the chance to do some maintenance: removing old kernels that were filling up /boot and updating some packages. I didn't check at the end that the latest kernel was still installed. It was missing!
So after reboot it didn't come back up.
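For anyone wanting to avoid the same trap, here is a rough sketch of a pre-reboot check. It assumes Debian/Ubuntu-style file names in /boot (`vmlinuz-<version>` and `initrd.img-<version>`); other distros name these differently, and the function names are made up for illustration.

```python
from pathlib import Path


def installed_kernels(boot_dir="/boot"):
    """Return the versions that have a vmlinuz-* image in boot_dir.

    Assumes Debian/Ubuntu-style naming. The list is sorted as plain
    strings, which is not necessarily version order.
    """
    boot = Path(boot_dir)
    prefix = "vmlinuz-"
    return sorted(p.name[len(prefix):] for p in boot.glob(prefix + "*"))


def bootable_kernels(boot_dir="/boot"):
    """Return the versions that have both a kernel image and an initrd."""
    boot = Path(boot_dir)
    return [
        version
        for version in installed_kernels(boot_dir)
        if (boot / f"initrd.img-{version}").exists()
    ]


def safe_to_reboot(boot_dir="/boot"):
    """True if at least one kernel in boot_dir has a matching initrd."""
    return bool(bootable_kernels(boot_dir))
```

Running `safe_to_reboot()` after a kernel cleanup and before the reboot would have flagged the empty /boot here. It only looks at files, so it says nothing about whether the bootloader config points at them.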
Installing the kernel again turned out to be a nasty crash course in reassembling RAID, mounting the partitions in the right place, enabling networking, chrooting and installing the kernel. Urrrghgh.
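Roughly, the sequence looks like the sketch below. It only builds the list of commands and does not run anything. The device names, the mount point and the install command are placeholders, not the ones from this server, and the networking step is left out because it depends entirely on the rescue environment.

```python
def recovery_plan(root_dev, boot_dev, install_cmd, target="/mnt"):
    """Build the rescue-mode command list as argv lists (dry run only).

    root_dev, boot_dev and install_cmd are supplied by the caller;
    nothing here is specific to any one machine or distro.
    """
    plan = [
        # Reassemble the RAID arrays found on the attached disks.
        ["mdadm", "--assemble", "--scan"],
        # Mount the root filesystem, then /boot inside it.
        ["mount", root_dev, target],
        ["mount", boot_dev, f"{target}/boot"],
    ]
    # Bind-mount the pseudo-filesystems the chroot needs.
    for pseudo in ("dev", "proc", "sys"):
        plan.append(["mount", "--bind", f"/{pseudo}", f"{target}/{pseudo}"])
    # Run the kernel install inside the chroot.
    plan.append(["chroot", target, *install_cmd])
    return plan
```

For example, `recovery_plan("/dev/md1", "/dev/md0", ["apt-get", "install", "linux-image-amd64"])` returns seven commands to review before running them by hand. Those arguments are hypothetical.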
Then, while installing the kernel, one of the drives suffered an error. That triggered a RAID rebuild in recovery mode, which had to be left to complete before rebooting.
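A small sketch for watching a rebuild like that: it parses the text of /proc/mdstat and reports any array that is still recovering or resyncing. The function names are my own, and the parsing is based on the usual mdstat layout, so treat it as a starting point rather than a robust tool.

```python
import re

# An array line looks like "md0 : active raid1 sdb1[1] sda1[0]".
DEVICE_RE = re.compile(r"^(md\w+)\s*:")
# A progress line contains e.g. "recovery = 12.6% (...)".
PROGRESS_RE = re.compile(r"(recovery|resync)\s*=\s*([\d.]+)%")


def rebuilds_in_progress(mdstat_text):
    """Map each rebuilding array name to its percent complete."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        device = DEVICE_RE.match(line)
        if device:
            current = device.group(1)
            continue
        found = PROGRESS_RE.search(line)
        if found and current:
            progress[current] = float(found.group(2))
    return progress


def raid_idle(path="/proc/mdstat"):
    """True if no array listed in the mdstat file is rebuilding."""
    with open(path) as f:
        return not rebuilds_in_progress(f.read())
```

Checking `raid_idle()` before rebooting avoids interrupting a rebuild part way through.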
The server is back up now.
But there is still a problem to be tracked down with one of the drives. Its SMART status is OK, with no reallocated sectors, so it could be the SATA cable or the drive interface electronics.
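That reasoning can be written down as a rule of thumb. The sketch below takes SMART raw values keyed by the attribute names smartctl prints and guesses where the fault lies. Looking at `UDMA_CRC_Error_Count` and `Current_Pending_Sector` is my own addition, not something checked in this post, and the result is a hint, not a diagnosis.

```python
def likely_fault(attrs):
    """Guess the fault location from SMART raw values.

    attrs maps smartctl attribute names to raw integer values;
    missing attributes are treated as zero.
    Returns "media", "link" or "unknown".
    """
    reallocated = attrs.get("Reallocated_Sector_Ct", 0)
    pending = attrs.get("Current_Pending_Sector", 0)
    crc_errors = attrs.get("UDMA_CRC_Error_Count", 0)

    # Reallocated or pending sectors point at the platters themselves.
    if reallocated > 0 or pending > 0:
        return "media"
    # CRC errors with clean media usually mean cable, connector or backplane.
    if crc_errors > 0:
        return "link"
    # Clean SMART data does not rule out the interface electronics.
    return "unknown"
```

With no reallocated sectors, as here, the answer is either "link" or "unknown" depending on the CRC counter, which matches the cable-or-electronics suspicion.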
I expect the easiest way to solve the problem is to get an identical drive and swap it in. The drive is £80, so any donations would be very welcome at this point.
Last edit: 9 years 3 weeks ago by mw0uzo.
9 years 2 weeks ago #1927
by ThibmoRozier
Happens to the best of us.
I am going to start using LXC on my own server soon to prevent oopsies like this in future, as mine also went dark for about 5 hours today.
Might be worth checking out for you?
https://linuxcontainers.org/
9 years 2 weeks ago #1931
by mw0uzo
Thanks for the info, Thibmo.
There haven't been any more errors while the server has been running. I wonder if the problem is just the cable, or the contacts, socket or soldering through the HD backplane.
Time will tell.
Moderators: Gamma-Man