.TH mailvisa 1 2005-10-23 mailvisa "Mailvisa Documentation"
.SH NAME
.B mailvisa
\-
simple Bayesian spam filter
.SH SYNOPSIS
.PP
.B mailvisa
.I command
[\fIoptions\fR]
.SH DESCRIPTION
.PP
Mailvisa is a simple but effective Bayesian spam filter,
inspired by Paul Graham's \fIA Plan For Spam\fR. Its main features are 
simplicity (so it's easy to tune), accuracy (high percentage of spam 
caught, no false positives), and speed, listed in order of priority.
.SH OPERATION
.PP
The basic usage of \fBmailvisa\fR is checking whether a message is spam 
or not. By default, \fBmailvisa\fR reads a message on standard input, 
and writes it out on standard output with an \fBX-Spam:\fR header 
prepended that is set to \fBtrue\fR when Mailvisa thinks the message is 
spam, and to \fBfalse\fR otherwise; also, the exit status will be 0 for 
non-spam, 1 if an error occurred, and 160 if the message is spam.
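.PP
A calling script can branch on these exit statuses. The following is a
minimal sketch: the function name \fBmailvisa_verdict\fR is hypothetical,
and treating every status other than 0 and 160 as an error is an
assumption.
.nf
```shell
# mailvisa_verdict STATUS
# Translate the exit status of "mailvisa check" into a word.
# 0 and 160 are the documented values for non-spam and spam;
# anything else (including the documented 1) is reported as an error.
mailvisa_verdict() {
    case "$1" in
        0)   echo ham ;;
        160) echo spam ;;
        *)   echo error ;;
    esac
}

# Typical use, with the message in the file foo:
#   mailvisa check -q < foo; mailvisa_verdict $?
```
.fi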
.LP
Internally, Mailvisa works by maintaining a database of good words, a 
database of 
bad words, and a score file, listing the scores for each word. The score 
file is generated from the word databases by calculating scores based on 
how often words occur in good and in bad messages. Messages are 
classified as spam or ham based on the score file.
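.PP
The exact scoring formula is not given here. As an illustration only,
the sketch below follows the per-word formula from Graham's essay. The
function name \fBword_score\fR and its arguments are hypothetical, and
it is an assumption that Mailvisa computes its scores this way. The
optional multiplier mirrors the role of the \fB-m\fR option of
\fBmailvisa calculate\fR.
.nf
```shell
# word_score GOOD BAD NGOOD NBAD [MULT]
# GOOD and BAD are the occurrence counts of one word, NGOOD and NBAD
# the numbers of good and bad messages (both must be greater than 0).
# Prints a spam probability between 0.01 and 0.99.
# Assumption: this is Graham's formula, not necessarily Mailvisa's.
word_score() {
    awk -v g="$1" -v b="$2" -v ng="$3" -v nb="$4" -v m="${5:-1.0}" 'BEGIN {
        gr = (g * m) / ng; if (gr > 1) gr = 1   # weighted good ratio
        br = b / nb;       if (br > 1) br = 1   # bad ratio
        if (gr + br == 0) {                     # word never seen:
            print "0.40"; exit                  # the value Graham suggests
        }
        p = br / (gr + br)
        if (p < 0.01) p = 0.01                  # clamp both ends
        if (p > 0.99) p = 0.99
        print sprintf("%.2f", p)
    }'
}

# A word seen once in 100 good and 9 times in 100 bad messages:
#   word_score 1 9 100 100
```
.fi
A multiplier above 1.0 increases the weight of good occurrences and so
lowers the printed probability for the same counts.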
.LP
To gain performance, 
the score file is not loaded once for every invocation of 
\fBmailvisa\fR, but rather loaded once by a daemon process. To score a 
message, \fBmailvisa\fR connects to the daemon process, which does the 
actual scoring.
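.PP
Because checking depends on the daemon, a wrapper script may want to
test whether the daemon is alive before scoring a message. The
following is a minimal sketch, assuming that the pid file written by
\fBmailvisa start\fR contains just the process id. The function name
\fBdaemon_running\fR is hypothetical.
.nf
```shell
# daemon_running PIDFILE
# Succeeds if PIDFILE is readable and names a live process.
# mailvisad.pid is the documented default file name; the directory
# it lives in depends on the -c option.
daemon_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests that the process exists.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Typical use:
#   daemon_running mailvisad.pid || mailvisa start
```
.fi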
.LP
The following section details the commands you can use to manage the 
word databases, generate the score file, start the daemon, and check a 
message.
.SH COMMANDS
.PP
To use \fBmailvisa\fR, you invoke it with a command indicating the 
action to perform. Available commands are:
.TP
\fBadd\fR
Add messages to a word database.
This command is used to add the words from a message to either the list 
of good words or the list of bad words.
.TP
\fBcalculate\fR
Calculate scores and update score file.
This command is used to calculate the scores for each word in the good 
and bad databases, and store the scores in a score file.
.TP
\fBcheck\fR
Check if a message is spam.
This is probably the command you will end up using most often.
.TP
\fBhelp\fR
Display a help message.
.TP
\fBremove\fR
Remove messages from a database.
This command scans the given messages for words, and decrements the 
count for these words in the given database. This can be used to negate
the effects of a previous \fBadd\fR command (e.g. if you accidentally 
added words to the wrong database).
.TP
\fBstart\fR
Start the daemon process.
The daemon process must be started before the \fBcheck\fR command can be 
used successfully.
.TP
\fBview\fR
View the spam scores associated with words.
This command can be used to find out which words have high spam scores, 
and which ones have low spam scores.
.LP
Each of the commands can be followed by the \fB-h\fR option to get a 
list of available options for that command.
.SH OPTIONS
.PP
This section lists the options that can be passed to \fBmailvisa\fR. 
Some options are common to all commands, whereas others only apply to 
specific commands.
.SS Common Options
.TP
\fB-c\fR \fIpath\fR
Look for configuration files in \fIpath\fR. This includes the word 
lists, the score file, the socket to connect to the daemon, and the pid 
file. All these will be looked for in the directory specified by 
\fIpath\fR, unless the specified filenames contain slashes.
.TP
.B -h
List the options specific to the given command.
.SS Options to mailvisa add
.TP
.B -i
Include \fBX-Spam:\fR headers in the analysis of messages. Normally, 
these headers are skipped when analyzing messages.
.TP
\fB-w\fR \fInum\fR
Weed the wordlist every \fInum\fR words. This removes rare words from 
the list, so that it doesn't become polluted with useless items (such as 
message ids, for example). A value of \fB0\fR disables weeding.
.TP
\fB-t\fR \fInum\fR
Weed words that occur fewer than \fInum\fR times. The default is 
\fB1\fR.
.SS Options to mailvisa calculate
.TP
\fB-g\fR \fIfile\fR
Load good words from \fIfile\fR (the default is \fBgood\fR).
.TP
\fB-b\fR \fIfile\fR
Load bad words from \fIfile\fR (the default is \fBbad\fR).
.TP
\fB-f\fR \fIfile\fR
Write scores to \fIfile\fR (the default is \fBscores\fR).
.TP
\fB-m\fR \fInum\fR
Multiply the number of good occurrences by \fInum\fR. This can be used 
to bias the scores towards judging a message ham (for multipliers > 1.0) 
or spam (for multipliers < 1.0). The default is \fB1.0\fR.
.SS Options to mailvisa check
.TP
.B -q
Do not output the message or \fBX-Spam:\fR header. Only indicate the 
decision in the exit status (0 for ham, 160 for spam).
.TP
.B -e
Do not indicate whether a message is spam in the exit status.
.TP
\fB-b\fR \fInum\fR
Read \fInum\fR bytes at a time. The default is 16384.
.TP
\fB-t\fR \fInum\fR
Threshold for flagging messages as spam. This can be used to bias the 
check in favor of judging messages as spam (for values < 0.5) or ham 
(for values > 0.5). Useful values range between 0.0 and 1.0; the default 
is 0.5.
.TP
\fB-m\fR \fIcommand\fR
Pipe the output to \fIcommand\fR (analogous to \fBfetchmail(1)\fR's 
option of the same name).
.TP
\fB-s\fR \fIpath\fR
Use \fIpath\fR to connect to the daemon. The default is 
\fBmailvisad.sock\fR.
.SS Options to mailvisa remove
.TP
.B -i
Include \fBX-Spam:\fR headers in the analysis of messages. Normally, 
these headers are skipped when analyzing messages.
.SS Options to mailvisa start
.TP
\fB-f\fR \fIfile\fR
Use \fIfile\fR as the score file. Defaults to \fBscores\fR.
.TP
\fB-l\fR \fIfile\fR
Log to \fIfile\fR. Default: \fBmailvisad.log\fR.
.TP
\fB-p\fR \fIfile\fR
Use \fIfile\fR to store the pid (process id) of mailvisad. Defaults to 
\fBmailvisad.pid\fR.
.TP
\fB-s\fR \fIpath\fR
Open a socket for \fBmailvisa check\fR at \fIpath\fR. The default is 
\fBmailvisad.sock\fR.
.SS Options to mailvisa view
.TP
\fB-f\fR \fIfile\fR
Use \fIfile\fR as the score file. Defaults to \fBscores\fR.
.SH EXAMPLES
.PP
Add all messages from the directory \fBmail/inbox\fR to the database of 
good words:
.IP
.B mailvisa add good mail/inbox/*
.PP
Add all messages from the directory \fBmail/spam\fR to the database of 
bad words:
.IP
.B mailvisa add bad mail/spam/*
.PP
Calculate word scores and store them in the score file (using the 
defaults of bad, good, and scores for the files containing bad words, 
good words, and word scores, respectively):
.IP
.B mailvisa calculate
.PP
Start the daemon:
.IP
.B mailvisa start
.PP
Check whether the message stored in \fBfoo\fR is spam:
.IP
.B mailvisa check < foo
.PP
The same, but suppressing the exit code:
.IP
.B mailvisa check -e < foo
.PP
The same, but suppressing the output (\fBX-Spam:\fR header and message) 
instead:
.IP
.B mailvisa check -q < foo 
.PP
Spam check a message from standard input and send it to 
\fBprocmail(1)\fR for further processing (suppressing the exit code):
.IP
.B mailvisa check -e -m procmail
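.PP
Correct a message that was added to the wrong database and regenerate
the score file. This is a sketch only: the helper name
\fBreclassify\fR and the \fBMAILVISA\fR variable are hypothetical, and
it assumes that \fBremove\fR takes the same arguments as \fBadd\fR.
.nf
```shell
# reclassify FROM TO MESSAGE...
# Remove the words of each MESSAGE from database FROM, add them to
# database TO, then recalculate the scores. Stops at the first
# command that fails. Set MAILVISA to run a different program.
reclassify() {
    from=$1 to=$2
    shift 2
    "${MAILVISA:-mailvisa}" remove "$from" "$@" &&
    "${MAILVISA:-mailvisa}" add "$to" "$@" &&
    "${MAILVISA:-mailvisa}" calculate
}

# Typical use, for a good message that was filed as spam:
#   reclassify bad good mail/inbox/misfiled
```
.fi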
.SH COPYRIGHT
.PP
Mailvisa is open source, under the terms of the MIT license. A copy of 
this license is contained in the file LICENSE in the source 
distribution. Mailvisa was written by Robbert Haarman. See 
\fIhttp://inglorion.net/\fR for contact information.