Topic: Master Server Outage

Throughout the day, the master server has repeatedly gone down for about half an hour at a time. For unknown reasons, this has happened 4 times in the same day.

Times:
12:00 - 12:20
01:30 - 01:50
03:00 - 03:30
17:40 - 21:30 on and off
(times are in EST)
« Last Edit: October 20, 2010, 10:20:43 PM by Kalphiter »

IRC spam
IRC spam everywhere.

Throughout the day, the master server has repeatedly gone down for about half an hour at a time. For unknown reasons, this has happened 4 times in the same day.

Times:
12:00 - 12:20
01:30 - 01:50
03:00 - 03:30
18:40 - 19:15
I just heard about it from Kiwi..
   New Updates? Crashes? What.

Huh. I wonder why.

But damn, IRC was going nuts.

IRC was fun while it was down, though.

2012 Preparation Day  :cookieMonster:

Plus a chance to view intense IRC spam in progress.

This will probably get about 10 pages within 5 minutes.


Hooray! There are only 15 servers up, though...

This will probably get about 10 pages within 5 minutes.
nah, it's not going that fast

The website (and forums) were down just a couple of minutes ago. Did this happen during the other master server crashes? I guess all of Badspot's other websites went down too.

The website (and forums) were down just a couple of minutes ago. Did this happen during the other master server crashes? I guess all of Badspot's other websites went down too.
No, it's caused by high CPU usage for some reason.
And this happened once before: it crashed once and was fine from then on.

About time the Master server is back up.

Badspot

  • Administrator
No, it's caused by high CPU usage for some reason.
And this happened once before: it crashed once and was fine from then on.

I don't know what the problem is, but it's not CPU usage. The server has like 8 cores and I've never seen it go over 10% usage. Most freezing on the forum is related to SQL usage and the general crappiness of SMF.

There are a large number of connections open on the server but I don't know what the normal numbers are like.   

I randomly poked some numbers in the Apache config and now it's working, but that could easily be a coincidence considering it's been up and down all day.

There are a large number of connections open on the server but I don't know what the normal numbers are like.
Could you describe how many? Could you describe what Apache changes were made?

I would say anything over 2,000 is too many, although counts that high are common when connections are long-lived.
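
For anyone who wants to put a number on "a large number of connections", here is a rough sketch that tallies TCP connections by state. It assumes a Linux box that exposes the kernel's connection table at /proc/net/tcp; nothing in this thread says what the master server actually runs, and the names `count_tcp_states` and `read_proc_net_tcp` are made up for illustration.

```python
from collections import Counter

# State codes used in the "st" column of Linux's /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(table_text, local_port=None):
    """Count connections per TCP state in /proc/net/tcp-style text.

    If local_port is given (e.g. 80 for a web server), only sockets
    bound to that local port are counted.
    """
    counts = Counter()
    for line in table_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_addr, state_hex = fields[1], fields[3]
        # local_address looks like "0100007F:0050"; the port is hex.
        port = int(local_addr.rsplit(":", 1)[1], 16)
        if local_port is not None and port != local_port:
            continue
        counts[TCP_STATES.get(state_hex.upper(), state_hex)] += 1
    return counts


def read_proc_net_tcp(path="/proc/net/tcp"):
    """Read the IPv4 TCP table (assumption: Linux; IPv6 is in /proc/net/tcp6)."""
    with open(path) as f:
        return f.read()
```

On a Linux server, `count_tcp_states(read_proc_net_tcp(), local_port=80)` would give the per-state totals for port 80. Logging that a few times on a normal day gives a baseline to compare against during an outage.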

I assumed it was high CPU usage because this has happened before, and that was the cause then. It would also explain why pings were perfect while HTTP requests were a mess.
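
The "pings fine, HTTP broken" symptom can be checked by timing the network layer and the web server separately. A minimal sketch, with assumptions: it uses a plain TCP connect as a stand-in for ping (real ICMP pings need raw sockets), it speaks plain HTTP only, and `probe` is a hypothetical helper name, not anything the master server provides.

```python
import http.client
import socket
import time


def probe(host, port=80, path="/", timeout=5.0):
    """Return (tcp_connect_seconds, http_seconds, http_status).

    http_seconds and http_status are None if the HTTP request fails or
    times out. An OSError is raised if even the TCP connect fails.
    """
    # Step 1: can we reach the machine at all, and how quickly?
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        tcp_seconds = time.monotonic() - start

    # Step 2: does the web server actually answer a request in time?
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    start = time.monotonic()
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        return tcp_seconds, time.monotonic() - start, status
    except (OSError, http.client.HTTPException):
        return tcp_seconds, None, None
    finally:
        conn.close()
```

A fast TCP connect combined with a slow or missing HTTP response means the network path is fine and the delay is in the web server or whatever sits behind it. On its own it cannot tell CPU load apart from exhausted worker slots or a slow database.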