| Subject: Re: Algo question |
| From: david@djwhome.demon.co.uk |
| Date: 25/09/2005, 15:15 |
in article <1127600441.999886.123200@g49g2000cwa.googlegroups.com>,
ccristian@gmail.com wrote:
Can anyone point me to where I can find detailed informations about
seti algorithm?
I assume you meant the SETI@Home client algorithms, not the generic Search
for Extra Terrestial Intellgence one, or even the total S@H project one.
(The general SETI process can monitor any physical parameter for
variations from random that are unlikely to be produced by natural
mechanisms, and where those variations are originating from somewhere
not under human control.)
The source code for the BOINC version is published; that is your best place
for *details* of the algorithm. This is essentially the same as S@H classic
but may have some improvements. Some of the server algorithms, for BOINC,
will be there, but some parts of the process are manual and some will be
adapted as time progresses.
There are also papers on the setiathome web site that cover the various
detection strategies at a high level.
At a high level the client algorithm is (excluding data communication
related aspects):
Convert to frequency domain
High pass filter *frequency* domain spectrum (I think I have this right -
note this is not high pass filtering the time domain signal by manipulating
it in the frequency domain)
Convert back to time domain.
For each chirp rate that produces sufficiently different results
Multiply complex time domain signal by complex time domain chirp
For each power of 2 data sample length from 128K down to where the result
is not significantly different from the last for that length
Compute set of, non-overlapping in time, power spectrum estimates
For each run of data and each frequency within it
If estimated power > 22 * work unit mean power for such a combination
Report a spike
End if
Next frequency / run combination
If sufficient number of runs across beam transit time
For each possible complete beam transit
For each frequency
Try to match a gaussian to the power spectrum
If there is a sufficiently good combination of
low mean square error and high peak
(Sufficiently here means that there is a fairly
even spread of false positives and a whole work
unit produces very roughly one false positive.)
Report a gaussian
End if
Next frequency
End if
If sufficient number of runs across transit time
For each frequency
If there are three sufficiently large samples for runs which
are equally spaced in time
(It's possible that they are also checked for being within
as single beam transit width.)
Report a triplet
End if
Next frequency
End if
If sufficient number of runs across transit time
For a sufficient number of intervals
If there are a sufficient number of that interval in the
beam transit time
For each frequency
For each distinct set of runs with a spacing of that
interval (I think the actual algorithm interpolates
and doesn't require runs actually exist at each step)
Average the power
If the average is sufficiently large for one
starting point (relative to some mean)
Report a pulse
(The actual algorithm is more efficient than
this, and it is possible that the possible
transit boundaries are taken into account)
End if
Next set of runs
Next frequency
End if
Next inteval length
End if
Next frequency resolution
Next chirp correction
This is largely from memory. The actual order of operations may differ
and some algorithms are computationally more efficient than as described.
You will need to see the source code for the exact definitions of
"sufficiently", but the basic principle is to generate about one false
positive per type per work unit and to spread the false positives, in
some way, evenly across the parameter space.