In the early hours of Monday morning, Gray posted an update shedding some light on the current situation with the server, which has been offline since Saturday night.
Broken down, the important details:
- For those who remember: a few weeks ago, the server was down for about a week while Gray was out of town, after a Seagate Barracuda 2TB drive in the hardware RAID failed.
- In the process of trying to populate the new drive with data, it was determined that the existing RAID configuration and technology weren't going to work, and a new approach was needed.
- Instead, Gray switched to an mdadm software RAID, which was decidedly better because drives can be added to or removed from an array at any time.
- For some time, it worked well.
- On Saturday night, the server ran out of memory and crashed.
- Gray says he suspects a memory leak, as this has never happened before and should not have, since the server had more than enough RAM.
- Gray attempted a reboot, but it didn't work.
- This meant one of two things had happened: either another drive had failed (which was disproven when Gray went to the physical server at the data centre), or, as it turned out, one of several updates applied during the reboot process had broken the grub2 configuration.
- Gray decided to copy off all the data and do a fresh install of the operating system.
- All the data was properly backed up, so Gray stresses there will not be another flash.
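For readers unfamiliar with why mdadm makes drive swaps easier, the replacement workflow from the earlier outage looks roughly like the following. This is a generic sketch, not Gray's actual commands; the array name `/dev/md0` and the device names `/dev/sdb1` and `/dev/sdc1` are hypothetical and would need to match the real hardware.

```shell
# Mark the failed drive as faulty and remove it from the array
# (requires root; device names here are placeholders):
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# Add the replacement drive; mdadm begins rebuilding onto it automatically:
mdadm --manage /dev/md0 --add /dev/sdc1

# Monitor rebuild progress:
cat /proc/mdstat
```

With a hardware RAID controller the equivalent steps depend on vendor tooling, which is part of why a software array is more flexible to manage remotely.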
In conclusion: the server's boot configuration was broken by an update. Gray has copied his backups and will reinstall the operating system today, which should have the server back in operation by tonight.
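As an aside, a broken grub2 configuration of the kind described above is usually repairable from a rescue environment without a full reinstall. A generic sketch follows; the mount point `/mnt`, the root device `/dev/md0`, and the boot disk `/dev/sda` are assumptions for illustration, and the exact commands vary by distribution (requires root).

```shell
# From a live/rescue system, mount the installed root filesystem:
mount /dev/md0 /mnt
mount --bind /dev  /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys  /mnt/sys
chroot /mnt

# Reinstall GRUB to the boot disk and regenerate its configuration:
grub-install /dev/sda
update-grub   # on some distributions: grub2-mkconfig -o /boot/grub2/grub.cfg
```

Given that Gray had already decided on a fresh install for other reasons, wiping and restoring from backups is a reasonable alternative to this repair path.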