Mail::SpamAssassin::BayesStore::SQL(3pm)
NAME
Mail::SpamAssassin::BayesStore::SQL - SQL Bayesian Storage Module Implementation
SYNOPSIS
DESCRIPTION
This module implements an SQL-based Bayesian storage module.
METHODS
new
public class (Mail::SpamAssassin::BayesStore::SQL) new (Mail::SpamAssassin::Bayes $bayes)
Description: This method creates a new instance of the Mail::SpamAssassin::BayesStore::SQL object. It expects to be passed an instance of
the Mail::SpamAssassin::Bayes object, which is passed into the Mail::SpamAssassin::BayesStore parent object.
This method sets up the database connection and determines the username
to use in queries.
tie_db_readonly
public instance (Boolean) tie_db_readonly ()
Description: This method ensures that the database connection is properly set up and working. If necessary, it will initialize a user's bayes
variables so that they can begin using the database immediately.
tie_db_writable
public instance (Boolean) tie_db_writable ()
Description: This method ensures that the database connection is properly set up and working. If necessary, it will initialize a user's bayes
variables so that they can begin using the database immediately.
untie_db
public instance () untie_db ()
Description: This method is unused for the SQL-based implementation.
calculate_expire_delta
public instance (%) calculate_expire_delta (Integer $newest_atime, Integer $start, Integer $max_expire_mult)
Description: This method performs a calculation on the data to determine the optimum atime for token expiration.
token_expiration
public instance (Integer, Integer, Integer, Integer) token_expiration (\% $opts, Integer $newdelta, @ @vars)
Description: This method performs the database-specific expiration of tokens based on the passed-in $newdelta and @vars.
sync_due
public instance (Boolean) sync_due ()
Description: This method determines if a database sync is currently required.
Unused for SQL-based implementation.
seen_get
public instance (String) seen_get (string $msgid)
Description: This method retrieves the stored value, if any, for $msgid. The return value is the stored string ('s' for spam and 'h' for ham) or undef if $msgid is not found.
seen_put
public (Boolean) seen_put (string $msgid, char $flag)
Description: This method records $msgid as the type given by $flag. $flag is one of two values: 's' for spam and 'h' for ham.
seen_delete
public instance (Boolean) seen_delete (string $msgid)
Description: This method removes $msgid from the database.
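Taken together, seen_get, seen_put and seen_delete behave like a small key/value store keyed by message ID. The following is a minimal sketch of that behaviour in Python with an in-memory SQLite database; it is not the module's Perl code. The table name `seen`, its columns, the `user_id` parameter and the flag validation are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one row per (user id, message id) with a one-character flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seen (id INTEGER, msgid TEXT, flag TEXT, PRIMARY KEY (id, msgid))"
)

def seen_put(conn, user_id, msgid, flag):
    """Record msgid as spam ('s') or ham ('h'); return True on success."""
    if flag not in ("s", "h"):
        return False  # assumption: reject anything but the two documented flags
    conn.execute(
        "INSERT OR REPLACE INTO seen (id, msgid, flag) VALUES (?, ?, ?)",
        (user_id, msgid, flag),
    )
    return True

def seen_get(conn, user_id, msgid):
    """Return the stored flag, or None (Perl's undef) if msgid is unknown."""
    row = conn.execute(
        "SELECT flag FROM seen WHERE id = ? AND msgid = ?", (user_id, msgid)
    ).fetchone()
    return row[0] if row else None

def seen_delete(conn, user_id, msgid):
    """Remove msgid for this user; return True on success."""
    conn.execute("DELETE FROM seen WHERE id = ? AND msgid = ?", (user_id, msgid))
    return True
```

Scoping every statement by a user ID mirrors the per-user storage the rest of this page describes; how the real module keys its rows may differ.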
get_storage_variables
public instance (@) get_storage_variables ()
Description: This method retrieves the various administrative variables used by the Bayes process and database.
The values returned in the array are in the following order:
0: scan count base
1: number of spam
2: number of ham
3: number of tokens in db
4: last expire atime
5: oldest token in db atime
6: db version value
7: last journal sync
8: last atime delta
9: last expire reduction count
10: newest token in db atime
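Because the values come back as a positional array, callers have to remember which index means what. As an illustration only (written in Python rather than the module's Perl), the sketch below pairs each documented position with a label. The label names and the helper `name_storage_variables` are hypothetical; only the ordering comes from the list above.

```python
# Illustrative labels for the documented positions 0..10, in order.
STORAGE_VARIABLE_NAMES = (
    "scan_count_base",
    "num_spam",
    "num_ham",
    "num_tokens",
    "last_expire_atime",
    "oldest_token_atime",
    "db_version",
    "last_journal_sync",
    "last_atime_delta",
    "last_expire_reduction_count",
    "newest_token_atime",
)

def name_storage_variables(values):
    """Pair each positional value with its label; reject lists of the wrong length."""
    if len(values) != len(STORAGE_VARIABLE_NAMES):
        raise ValueError("expected %d values" % len(STORAGE_VARIABLE_NAMES))
    return dict(zip(STORAGE_VARIABLE_NAMES, values))
```

A caller could then read `named["num_spam"]` instead of indexing position 1 directly.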
dump_db_toks
public instance () dump_db_toks (String $template, String $regex, Array @vars)
Description: This method loops over all tokens, computing the probability for each token and then printing it out according to the passed-in template.
set_last_expire
public instance (Boolean) set_last_expire (Integer $time)
Description: This method sets the last expire time.
get_running_expire_tok
public instance (String $time) get_running_expire_tok ()
Description: This method determines if an expire is currently running and returns the last time set.
There can be multiple times, so we just pull the greatest (most recent) value.
set_running_expire_tok
public instance (String $time) set_running_expire_tok ()
Description: This method sets the time that an expire starts running.
remove_running_expire_tok
public instance (Boolean) remove_running_expire_tok ()
Description: This method removes the row in the database that indicates that an expire is currently running.
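The three running-expire methods describe a simple marker-row protocol: insert a row when an expire starts, read back the most recent time to see whether one is running, and delete the row when done. A minimal sketch of that protocol follows, in Python with an in-memory SQLite database. The table name `expire`, its columns, and the `user_id` and `now` parameters are assumptions for illustration, not the module's actual schema or signatures.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per expire run in progress for a user.
conn.execute("CREATE TABLE expire (id INTEGER, runtime INTEGER)")

def set_running_expire_tok(conn, user_id, now=None):
    """Insert a marker row and return the time that was recorded."""
    runtime = int(time.time()) if now is None else now
    conn.execute("INSERT INTO expire (id, runtime) VALUES (?, ?)", (user_id, runtime))
    return runtime

def get_running_expire_tok(conn, user_id):
    """Return the greatest (most recent) marker time, or None if there is none."""
    row = conn.execute(
        "SELECT max(runtime) FROM expire WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]  # max() over zero rows yields NULL, which sqlite3 maps to None

def remove_running_expire_tok(conn, user_id):
    """Delete every marker row for the user."""
    conn.execute("DELETE FROM expire WHERE id = ?", (user_id,))
    return True
```

Taking `max(runtime)` reflects the note above that several marker rows may exist and only the most recent one matters.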
tok_get
public instance (Integer, Integer, Integer) tok_get (String $token)
Description: This method retrieves a specified token ($token) from the database and returns its spam_count, ham_count and last access time.
tok_get_all
public instance (\@) tok_get_all (@ $tokens)
Description: This method retrieves the specified tokens ($tokens) from storage and returns an array ref of arrays, each holding the spam count, ham count and last access time.
tok_count_change
public instance (Boolean) tok_count_change (Integer $spam_count, Integer $ham_count, String $token, String $atime)
Description: This method takes a $spam_count and $ham_count and adds them to $token, along with updating $token's atime with $atime.
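To make the "add counts and update atime" behaviour concrete, here is a minimal update-then-insert sketch in Python with an in-memory SQLite database. It is not the module's implementation: the table name `token`, its columns and the `user_id` parameter are assumptions, and the real code may handle details (such as negative counts or atime ordering) that this sketch ignores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-user token table.
conn.execute(
    "CREATE TABLE token (id INTEGER, token TEXT, spam_count INTEGER, "
    "ham_count INTEGER, atime INTEGER, PRIMARY KEY (id, token))"
)

def tok_count_change(conn, user_id, spam_count, ham_count, token, atime):
    """Add the given counts to an existing token, or create it if missing."""
    cur = conn.execute(
        "UPDATE token SET spam_count = spam_count + ?, "
        "ham_count = ham_count + ?, atime = ? WHERE id = ? AND token = ?",
        (spam_count, ham_count, atime, user_id, token),
    )
    if cur.rowcount == 0:
        # No existing row was updated, so this is the first sighting of the token.
        conn.execute(
            "INSERT INTO token (id, token, spam_count, ham_count, atime) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, token, spam_count, ham_count, atime),
        )
    return True
```

Trying the UPDATE first keeps the common case (a token that already exists) to a single statement.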
multi_tok_count_change
public instance (Boolean) multi_tok_count_change (Integer $spam_count, Integer $ham_count, \% $tokens, String $atime)
Description: This method takes a $spam_count and $ham_count and adds them to all of the tokens in the $tokens hash ref, along with updating each token's atime with $atime.
nspam_nham_get
public instance ($spam_count, $ham_count) nspam_nham_get ()
Description: This method retrieves the total number of spam and the total number of ham learned.
nspam_nham_change
public instance (Boolean) nspam_nham_change (Integer $num_spam, Integer $num_ham)
Description: This method updates the number of spam and the number of ham in the database.
tok_touch
public instance (Boolean) tok_touch (String $token, String $atime)
Description: This method updates the given token's ($token) atime.
The assumption is that the token already exists in the database.
tok_touch_all
public instance (Boolean) tok_touch_all (\@ $tokens, String $atime)
Description: This method does a mass update of the given list of tokens, $tokens, if the existing token atime is < $atime.
The assumption is that the tokens already exist in the database.
We should never be touching more than N_SIGNIFICANT_TOKENS, so we can make some assumptions about how to handle the data (i.e. no need to batch like we do in tok_get_all).
cleanup
public instance (Boolean) cleanup ()
Description: This method performs any cleanup necessary before moving on to the next operation.
get_magic_re
public instance (String) get_magic_re ()
Description: This method returns a regexp which indicates a magic token.
Unused in SQL implementation.
sync
public instance (Boolean) sync (\% $opts)
Description: This method performs a sync of the database.
perform_upgrade
public instance (Boolean) perform_upgrade (\% $opts)
Description: Performs an upgrade of the database from one version to another. Not currently used in this implementation.
clear_database
public instance (Boolean) clear_database ()
Description: This method deletes all records for a particular user.
Callers should be aware that any errors returned by this method could cause the database to be inconsistent for the given user.
backup_database
public instance (Boolean) backup_database ()
Description: This method will dump the user's database in a machine-readable format.
restore_database
public instance (Boolean) restore_database (String $filename, Boolean $showdots)
Description: This method restores a database from the given filename, $filename.
Callers should be aware that any errors returned by this method could cause the database to be inconsistent for the given user.
db_readable
public instance (Boolean) db_readable ()
Description: This method returns a boolean value indicating if the database is in a readable state.
db_writable
public instance (Boolean) db_writable ()
Description: This method returns a boolean value indicating if the database is in a writable state.
Private Methods
_connect_db
private instance (Boolean) _connect_db ()
Description: This method connects to the SQL database.
_get_db_version
private instance (Integer) _get_db_version ()
Description: Gets the current version of the database from the special
global vars tables.
_initialize_db
private instance (Boolean) _initialize_db ()
Description: This method will check to see if a user has had their
bayes variables initialized. If not then it will perform this initialization.
_put_token
private instance (Boolean) _put_token (string $token, integer $spam_count, integer $ham_count, string $atime)
Description: This method performs the work of either inserting or updating a token in the database.
_put_tokens
private instance (Boolean) _put_tokens (\% $tokens, integer $spam_count, integer $ham_count, string $atime)
Description: This method performs the work of either inserting or updating tokens in the database.
_get_oldest_token_age
private instance (Integer) _get_oldest_token_age ()
Description: This method finds the atime of the oldest token in the database.
The use of min(atime) in the SQL is ugly, but it is really the most efficient way of getting the oldest_token_age after we've done a mass expire. It should only be called at expire time.
_get_num_hapaxes
private instance (Integer) _get_num_hapaxes ()
Description: This method gets the total number of hapaxes (spam_count + ham_count == 1) in the token database for a user.
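The hapax condition translates directly into a SQL WHERE clause. The sketch below shows one way to express it, using Python and an in-memory SQLite database purely for illustration. The table name `token`, its columns, the sample rows and the `user_id` parameter are assumptions; only the condition spam_count + ham_count == 1 comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-user token table with made-up sample rows.
conn.execute(
    "CREATE TABLE token (id INTEGER, token TEXT, spam_count INTEGER, ham_count INTEGER)"
)
conn.executemany(
    "INSERT INTO token VALUES (?, ?, ?, ?)",
    [
        (1, "alpha", 1, 0),  # hapax: seen once, in spam
        (1, "beta", 0, 1),   # hapax: seen once, in ham
        (1, "gamma", 5, 3),  # seen eight times, not a hapax
        (2, "alpha", 1, 0),  # hapax, but belongs to a different user
    ],
)

def get_num_hapaxes(conn, user_id):
    """Count tokens seen exactly once (spam_count + ham_count = 1) for one user."""
    row = conn.execute(
        "SELECT count(*) FROM token WHERE id = ? AND spam_count + ham_count = 1",
        (user_id,),
    ).fetchone()
    return row[0]
```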
_get_num_lowfreq
private instance (Integer) _get_num_lowfreq ()
Description: This method gets the total number of lowfreq tokens (spam_count < 8 and ham_count < 8) in the token database for a user.
_token_select_string
private instance (String) _token_select_string ()
Description: This method returns the string to be used in SELECT statements to represent the token column.
The default is to use the RPAD function to pad the token out to 5 characters.
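As a rough illustration of what padding "out to 5 characters" means, the Python sketch below builds an RPAD expression as a string and mimics the usual SQL RPAD semantics (pad on the right, truncate if longer) in plain Python. The exact string the module returns is not stated here, so the output of `token_select_string`, the default column name `token`, and both helper names are assumptions.

```python
def token_select_string(column="token", width=5):
    """Build a SQL expression that right-pads the given column with spaces."""
    return "RPAD(%s, %d, ' ')" % (column, width)

def rpad(value, width=5, pad=" "):
    """Mimic common SQL RPAD behaviour: pad on the right, truncate if longer."""
    if len(value) >= width:
        return value[:width]
    return value + pad * (width - len(value))
```

Padding in the SELECT ensures that a short token comes back at its full fixed width even if the database has stripped trailing spaces from the stored value.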