| Subject: Re: seti groups |
| From: "Mike Bader" <mdbader@flash.net> |
| Date: 14/05/2004, 19:43 |
May 5, 2004
Re: Chronic "500" errors when using our web site.
The SETI@home web site gets over two million hits a day. A large chunk of
these hits are statistics queries which load down the server. At peak times
(usually in the middle of the U.S. work day) the server can't gracefully
handle the load and requests fail with an ugly "500" internal server error
(which shows up in different ways depending on your browser).
To help alleviate this problem, we set up a second web server just to handle
stats requests (called setiathome2). However, a lot of people (and pieces of
software) have setiathome.ssl.berkeley.edu bookmarked as the site to look up
statistics. These links still work, but the first server still occasionally
gets overloaded.
So, if you are getting such errors, check your URL and make sure that it
begins with http://setiathome2.ssl.berkeley.edu, and not
http://setiathome.ssl.berkeley.edu.
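For anyone with the old hostname hardwired into a script, the check above can be sketched in a few lines of Python. This is only an illustration: the function name `use_stats_server` and the example path are made up here, and only the two hostnames come from the notice.

```python
from urllib.parse import urlsplit, urlunsplit

# The two hostnames named in the notice.
OLD_HOST = "setiathome.ssl.berkeley.edu"
NEW_HOST = "setiathome2.ssl.berkeley.edu"


def use_stats_server(url: str) -> str:
    """Return url with the old hostname swapped for the stats server.

    Hypothetical helper: URLs that point anywhere else are returned unchanged.
    Any user:password@ prefix in the URL is dropped by this simple version.
    """
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    # Keep an explicit port if the original URL carried one.
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Example with a made-up path (not a real SETI@home page).
print(use_stats_server("http://setiathome.ssl.berkeley.edu/some/stats/page?id=1"))
```

The scheme, path, and query string are left alone, so an existing bookmark keeps pointing at the same page on the second server.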
Q: Why do you use two separate names (setiathome and setiathome2) as opposed
to using some form of automatic load balancing?
A: Well, because in theory we want these web servers to do two entirely
different things. The first (setiathome) would host the home page (and all
the other static pages). The second (setiathome2) would host all the
dynamically generated pages, handle forms, etc. When everything was on one
server, heavy loads due to excess queries prevented other people from
loading the home page and downloading the software, and this was quite
embarrassing. The two-server scheme would confine the query load to the
second server, so the informational pages on the first would always load
(and load quickly).
In practice, however, this proved difficult. As stated above, many people
bookmarked (and pieces of software hardwired) the first server as the stats
server, and we couldn't just break all those links. So to be nice we left
those channels open on the first server, but enough activity (via links on
the web page) was shoveled over to the second server to vastly reduce the
load.
We are working on the next generation website (to be launched in tandem with
the next generation of software) where almost everything is dynamically
generated. At that point we'll (hopefully) throw more web servers into the
mix and use some automatic load balancing technique which won't require us
to advertise different server names for different kinds of web activity.
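The notice doesn't say which load balancing technique is planned. As one common possibility, a round-robin scheme hands each incoming request to the next server in a fixed rotation. A minimal sketch, with a made-up class name and made-up backend names:

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical round-robin picker; not the SETI@home implementation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly, in order.
        self._rotation = cycle(backends)

    def pick(self):
        """Return the backend that should handle the next request."""
        return next(self._rotation)


# Placeholder backend names, assumed for illustration only.
balancer = RoundRobinBalancer(["web-a", "web-b", "web-c"])
for _ in range(5):
    print(balancer.pick())
```

With something like this sitting behind a single public name, visitors never need to know which machine answered their request.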