Subject: Re: one MB and two MB L2 cache processors
From: f/f george
Date: 08/10/2004, 20:40
Newsgroups: alt.sci.seti

On Fri, 8 Oct 2004 16:16:40 +0100, John
<fredclark@consltec.demon.co.uk> wrote:

In article <8kqcm096hph7outtl7t5vagq7hqfvlockj@4ax.com>, f/f george
<george@yourplace.com> writes
Please do the math on this.

I have 2 CPUs, *EACH* of which crunches a single WU in 6 hours+. (Known fact -
that was what I got *before* installing SETI Driver.)

So, each CPU does 4 WU (average) a day. Therefore 2 CPUs do 8 WU a day,
which is what I get!!

But your implication is that each CPU with 1 MB cache can do 2 WUs
(processes) at once. So, instead of 4 WU per day, each CPU would do 8
(assuming a 100% perfect world, which it isn't!!)

So a dual CPU with 4 processes should do 16 WU a day.

But I only get (at most) 8.
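
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `wu_per_day` is made up for illustration, and the flat 6.0 hours per WU is an assumption (the post says "6 hours+").

```python
def wu_per_day(hours_per_wu, cpus, processes_per_cpu=1):
    """Daily WUs if every process finishes one WU in hours_per_wu."""
    return 24.0 / hours_per_wu * cpus * processes_per_cpu

# One process per CPU: what is actually observed.
observed = wu_per_day(6.0, cpus=2)
# Two processes per CPU with no slowdown at all: the implied best case.
implied = wu_per_day(6.0, cpus=2, processes_per_cpu=2)

print(f"observed: {observed:.0f} WU/day, implied: {implied:.0f} WU/day")
```

The implied figure only holds if running two processes on one CPU does not lengthen the time each WU takes, which is the point being disputed.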

Sure, up the numbers in SetiDriver to reflect 1 unit for each 512 meg
of RAM. If you have 2 CPUs you can do 2 at one time. If each CPU has
an L2 cache of 1 meg you can do 4 at one time. If each CPU has an L2
cache of 2 meg then each CPU can do 4 at one time.
This is set in the "maximum processes" setting. You MUST also have at
least that number of units in your "desired cache size". It is
preferred to have double the number of currently running units in the
"desired cache size".

The key is the L2 cache size. If you only have 512 KB then you are
limited to running one unit per CPU. If you have 1 meg of L2
cache then you can double it to 2 per CPU at one time. If you have 2
meg of L2 cache per CPU, then you can run 4 units at one time per
CPU. You MUST be running the CLI version for this to work, too. The
Screen Saver version will not do this.
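
The rule of thumb above can be written down as a small lookup. This is a sketch of the claim as stated in the post, not of anything SetiDriver does internally: the function `suggested_settings` and the table `PROCESSES_PER_CPU` are hypothetical names, and the remark about 512 meg of RAM per unit is not modelled.

```python
# Processes per CPU by L2 cache size in KB, as claimed in the post.
PROCESSES_PER_CPU = {512: 1, 1024: 2, 2048: 4}

def suggested_settings(cpus, l2_kb):
    """Return the two settings the post talks about for a given machine."""
    max_processes = cpus * PROCESSES_PER_CPU[l2_kb]
    return {
        "maximum processes": max_processes,
        # Must be at least the number of running units...
        "desired cache size (minimum)": max_processes,
        # ...and is preferably double that.
        "desired cache size (preferred)": 2 * max_processes,
    }

for l2_kb in (512, 1024, 2048):
    print(l2_kb, "KB L2, 2 CPUs:", suggested_settings(2, l2_kb))
```

For a dual-CPU box this gives 2, 4 and 8 processes for 512 KB, 1 MB and 2 MB of L2 per CPU respectively.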


Certainly the key to "greatest output per day" seems to be the size of
the L2 cache, as you explain just above this comment.

However, there are other considerations, including the efficiency of the
FPU, CPU pipeline length and CPU/FSB clock speed.

I sport 2 dual processor systems in my SETI smallholding. Both systems
have processors with 512 KB L2.

My oldest system is a dual 933 MHz P3 system, and I run 2 CLI WUs at a
time. I control this through SETI Driver, with WUs locked to each
processor, cache size set to 2 and processes set to 2.

Originally, when I ran the system, a single processor would finish a WU
in an average of 6 - 6.25 hours. But because the RAM and FSB are shared,
2 WUs running together each take 7.33 - 7.75 hours to complete. SETI
Queue tells me this PC averages about 6.25 - 6.33 WUs per day.
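
As a rough cross-check of those P3 figures, the slowdown from sharing and the resulting daily output can be worked out as below. This is a sketch only: `dual_cpu_estimate` is a made-up helper, and pairing the low ends and high ends of the two quoted ranges is an assumption.

```python
def dual_cpu_estimate(single_hours, shared_hours, cpus=2):
    """Return (slowdown factor, WUs per day) when each CPU runs one WU."""
    slowdown = shared_hours / single_hours
    per_day = cpus * 24.0 / shared_hours
    return slowdown, per_day

# (hours per WU running alone, hours per WU when both CPUs are busy)
for single, shared in [(6.0, 7.33), (6.25, 7.75)]:
    slowdown, per_day = dual_cpu_estimate(single, shared)
    print(f"{single} h alone, {shared} h shared: "
          f"{slowdown:.2f}x slower, {per_day:.2f} WU/day")
```

The 6.25 - 6.33 WUs per day reported by SETI Queue falls inside the range this estimate produces.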

My newest system is a dual 2.8 GHz Xeon with hyperthreading active. So,
despite the 512 KB L2 cache, this machine processes 4 WUs at a time. But,
again, because of the poor FPU (compared to an AMD processor), long pipeline
(compared to AMD CPUs), and shared RAM and FSB, I get the following results -

1 WU on 1 processor averages 2.35 hours:        daily output = 10.2;
2 WUs, 1 on each proc, averages 3.35 hours:     daily output = 14.3;
4 WUs, 1 per proc + 1 per HT, av. 5.25 hours:   daily output = 18.2.
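
The daily output column is just concurrency x 24 / hours per WU, which a short script can reproduce. This is a sketch; `daily_output` and `RUNS` are names chosen here for illustration, and the results agree with the quoted figures to within rounding.

```python
# (WUs running at once, average hours per WU) as quoted in the post.
RUNS = [(1, 2.35), (2, 3.35), (4, 5.25)]

def daily_output(concurrent, hours_per_wu):
    """WUs finished per day when `concurrent` WUs each take hours_per_wu."""
    return concurrent * 24.0 / hours_per_wu

baseline = daily_output(*RUNS[0])
for concurrent, hours in RUNS:
    out = daily_output(concurrent, hours)
    print(f"{concurrent} at once, {hours} h each: "
          f"{out:.2f} WU/day ({out / baseline:.2f}x the single-WU rate)")
```

The last column makes the diminishing return visible: four WUs at once give well under twice the output of one.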

I started this thread with a speculation on a dual processor Xeon system
with HT active and 2 MB L2 cache. Such a system, if clocking at 3.6 GHz,
should be able to crunch 8 WUs simultaneously and average the same time
per WU as I get now (5.25 hours). This suggests a daily output of -

       24/5.25 x 8 = 4.57 x 8 = 36.5 WUs per day

Speculating even further into the future (say 18 months ahead), to when
the dual core 800 FSB 4.0 GHz Xeons are available, I assume these will
have HT active (only possibly) and at least 2 MB L2 cache.

Now assuming a dual processor 4 GHz dual core Xeon with HT active and 2
MB L2 available, such a monster will be able to run -

   2 SETI Classic WUs on each Xeon CPU core
       (remember there are 2 CPUs, which means 4 WUs just here)
   2 SETI Classic WUs on each Xeon CPU HT
       (remember there are 2 cores & 2 CPUs, which means 4 WUs here)
   2 SETI Classic WUs on each Xeon CPU core per 2 MB L2 cache
       (remember there are 2 CPUs with 2 cores each = 8 here)

This argument, if true, suggests this system will be able to run 16 WUs
at the same time.

I will assume an improved FPU, higher clock speed, shorter pipeline, but
still a lot of sharing of FSB and RAM. This will slow the average WU
crunch time down to about 5.5 hours. Assuming this is a given, then the
system daily WU output could be = 24/5.5 x 16 = 69.8 WUs per day.
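
Both projections in this post follow the same pattern, which can be sketched as below. The helper names are hypothetical, and the doubling factor for 2 MB of L2 (`l2_factor=2`) is this post's assumption, not a measured value.

```python
def concurrent_wus(cpus, cores_per_cpu, threads_per_core, l2_factor):
    """WUs assumed to run at once: every multiplier doubles the count."""
    return cpus * cores_per_cpu * threads_per_core * l2_factor

def wus_per_day(hours_per_wu, concurrent):
    """Daily output if each of `concurrent` WUs takes hours_per_wu."""
    return 24.0 / hours_per_wu * concurrent

# Dual 3.6 GHz Xeon, single core, HT, 2 MB L2, 5.25 h per WU (assumed).
near = concurrent_wus(2, 1, 2, 2)
# Dual 4 GHz dual core Xeon, HT, 2 MB L2, 5.5 h per WU (assumed).
future = concurrent_wus(2, 2, 2, 2)

print(f"{near} WUs at once: {wus_per_day(5.25, near):.1f} per day")
print(f"{future} WUs at once: {wus_per_day(5.5, future):.1f} per day")
```

Everything hinges on the assumed hours per WU; if contention for the FSB and RAM pushes that figure up, the daily total drops in proportion.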

Any thoughts about this being possible, but at what cost?
I did a lookup a couple of days ago on www.pricewatch.com and they
are selling 3.2 GHz Xeon processors with the 800 MHz FSB and 2 meg L3
cache (L2 and L3 work similarly). They were just over $1000.00 EACH,
processor ONLY!