Server issues the past week
Posted: Fri Jul 26, 2013 10:08 pm
I just wanted to make this thread to fill you all in on some of the problems we are facing.
A few days ago the main SSD in the server failed. I ordered a new SSD, but I have not put it into the server yet.
The reason is that the SSD in my desktop system failed at the same time, so I used the one I ordered for the server in my desktop instead.
Now that my desktop is not crashing any more, I have requested an RMA number for the failed server SSD and I am just going to get them to replace it. That could take a few days.
Today the server was down for several hours because the modem that supplies internet to our server locked up. I was not around to physically restart it, as I was at the hospital with my daughter (she has an ear infection and we had to get her seen and pick up some meds).
When I got home, the server had reached 50 days of uptime, which brings up another issue. On Windows, a lot of software uses a 32-bit tick counter. In simple terms, this is a clock in the operating system that counts up from 0. Because it is 32-bit, the clock runs out of room at 49.7 days and resets to 0. Why is this a problem? Because lots of software relies on this counter to keep incrementing, and when it goes back to 0 lots of things fuck up.
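For anyone curious where the 49.7 days comes from, here is a rough sketch in Python. It assumes the counter ticks in milliseconds (which is how the Windows tick count works), and the function names are made up for illustration. It also shows why a program that just subtracts two readings gets nonsense after the reset, and one common way to avoid that. This is not the actual code of any of the programs involved.

```python
# Sketch of a 32-bit millisecond tick counter wrapping around.
# Assumption: the counter counts milliseconds since boot.

TICK_MODULUS = 2 ** 32            # a 32-bit counter holds 0 .. 2**32 - 1
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def wrap_days() -> float:
    """Days of uptime before the counter resets to 0."""
    return TICK_MODULUS / MS_PER_DAY


def naive_elapsed(start: int, now: int) -> int:
    """Plain subtraction: goes negative if the counter wrapped in between."""
    return now - start


def wrap_safe_elapsed(start: int, now: int) -> int:
    """Subtraction modulo 2**32: correct across a single wrap."""
    return (now - start) % TICK_MODULUS


# A reading taken 500 ms before the reset and another 1500 ms after it.
start = TICK_MODULUS - 500
now = 1500

print(round(wrap_days(), 2))          # 49.71
print(naive_elapsed(start, now))      # -4294965296
print(wrap_safe_elapsed(start, now))  # 2000
```

The real gap between the two readings is 2000 ms. Software that does the plain subtraction sees a huge negative number instead, and timeouts or retry timers built on it can misbehave.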
Google Drive Sync, which I have installed on the server, is one such program. It lost connectivity with Google's servers, and instead of just shutting down it tried to open a new socket every second, each on a different port number. Once it reached the ports our servers (and our webserver) use, those services stopped accepting new connections. It took me about 30 minutes to diagnose this, which is why you may not have been able to access the game server and saw an error like "Authentication Servers are Down".
As if that wasn't enough, we now have yet another disk failure. Not an SSD this time, but a 2TB hard disk inside the server that is about 2.7 years old.
So that is everything. I apologize for all the shutdowns and craziness recently. I am working to get us back on track, stable and functional. It may take a few more weeks, as I'm going to be very busy with work, but I'm committed to fixing everything.
Thanks for reading.