Project:
Mailvisa
Code Location:
git://repo.or.cz/mailvisa.git (master)
mailvisa.1
.TH mailvisa 1 2005-10-23 mailvisa "Mailvisa Documentation"
.SH NAME
.B mailvisa
\- simple Bayesian spam filter
.SH SYNOPSIS
.PP
.B mailvisa
.I command
[\fIoptions\fR]
.SH DESCRIPTION
.PP
Mailvisa is a simple but effective Bayesian spam filter, inspired by
Paul Graham's \fIA Plan for Spam\fR.
Its main features are simplicity (so it's easy to tune), accuracy
(high percentage of spam caught, no false positives), and speed,
listed in order of priority.
.SH OPERATION
.PP
The basic usage of \fBmailvisa\fR is checking whether a message is spam
or not.
By default, \fBmailvisa\fR reads a message on standard input and writes
it to standard output with an \fBX-Spam:\fR header prepended.
The header is set to \fBtrue\fR when Mailvisa thinks the message is spam,
and to \fBfalse\fR otherwise.
The exit status is 0 for non-spam, 1 if an error occurred, and 160 if
the message is spam.
.LP
Internally, Mailvisa works by maintaining a database of good words,
a database of bad words, and a score file listing the score for each word.
The score file is generated from the word databases by calculating scores
based on how often words occur in good and in bad messages.
Messages are classified as spam or ham based on the score file.
.LP
To gain performance, the score file is not loaded on every invocation
of \fBmailvisa\fR, but rather loaded once by a daemon process.
To score a message, \fBmailvisa\fR connects to the daemon process,
which does the actual scoring.
.LP
The following section details the commands you can use to manage the word
databases, generate the score file, start the daemon, and check a message.
.SH COMMANDS
.PP
To use \fBmailvisa\fR, you invoke it with a command indicating the action
to perform.
Available commands are:
.TP
\fBadd\fR
Add messages to a message database.
This command is used to add the words from a message to either the list of
good words or the list of bad words.
.TP
\fBcalculate\fR
Calculate scores and update the score file.
This command is used to calculate the scores for each word in the good and
bad databases, and store the scores in a score file.
.TP
\fBcheck\fR
Check if a message is spam.
This is probably the command you will end up using most often.
.TP
\fBhelp\fR
Display a help message.
.TP
\fBremove\fR
Remove messages from a database.
This command scans the given messages for words, and decrements the count
for these words in the given database.
This can be used to negate the effects of a previous \fBadd\fR command
(e.g. if you accidentally added words to the wrong database).
.TP
\fBstart\fR
Start the daemon process.
The daemon process must be started before the \fBcheck\fR command can be
used successfully.
.TP
\fBview\fR
View the spam scores associated with words.
This command can be used to find out which words have high spam scores,
and which ones have low spam scores.
.LP
Each of the commands can be followed by the \fB-h\fR option to get a list
of available options for that command.
.SH OPTIONS
.PP
This section lists the options that can be passed to \fBmailvisa\fR.
Some options are common to all commands, whereas others only apply to
specific commands.
.SS Common Options
.TP
\fB-c\fR \fIpath\fR
Look for configuration files in \fIpath\fR.
This includes the word lists, the score file, the socket to connect to
the daemon, and the pid file.
All these will be looked for in the directory specified by \fIpath\fR,
unless the specified filenames contain slashes.
.TP
.B -h
List the options specific to the given command.
.SS Options to mailvisa add
.TP
.B -i
Include \fBX-Spam:\fR headers in the analysis of messages.
Normally, these headers are skipped when analyzing messages.
.TP
\fB-w\fR \fInum\fR
Weed the wordlist every \fInum\fR words.
This removes rare words from the list, so that it doesn't become polluted
with useless items (such as message ids).
A value of \fB0\fR disables weeding.
.TP
\fB-t\fR \fInum\fR
Weed words that occur fewer than \fInum\fR times.
The default is \fB1\fR.
.SS Options to mailvisa calculate
.TP
\fB-g\fR \fIfile\fR
Load good words from \fIfile\fR (the default is \fBgood\fR).
.TP
\fB-b\fR \fIfile\fR
Load bad words from \fIfile\fR (the default is \fBbad\fR).
.TP
\fB-f\fR \fIfile\fR
Write scores to \fIfile\fR (the default is \fBscores\fR).
.TP
\fB-m\fR \fInum\fR
Multiply the number of good occurrences by \fInum\fR.
This can be used to bias the scores towards judging a message ham
(for multipliers > 1.0) or spam (for multipliers < 1.0).
The default is \fB1.0\fR.
.SS Options to mailvisa check
.TP
.B -q
Do not output the message or \fBX-Spam:\fR header.
Only indicate the decision in the exit status (0 for ham, 160 for spam).
.TP
.B -e
Do not indicate whether a message is spam in the exit status.
.TP
\fB-b\fR \fInum\fR
Read \fInum\fR bytes at a time.
The default is 16384.
.TP
\fB-t\fR \fInum\fR
Threshold for flagging messages as spam.
This can be used to bias the check in favor of judging messages as spam
(for values < 0.5) or ham (for values > 0.5).
Useful values range between 0.0 and 1.0; the default is 0.5.
.TP
\fB-m\fR \fIcommand\fR
Pipe the output to \fIcommand\fR (analogous to \fBfetchmail(1)\fR's option
of the same name).
.TP
\fB-s\fR \fIpath\fR
Use \fIpath\fR to connect to the daemon.
The default is \fBmailvisad.sock\fR.
.SS Options to mailvisa remove
.TP
.B -i
Include \fBX-Spam:\fR headers in the analysis of messages.
Normally, these headers are skipped when analyzing messages.
.SS Options to mailvisa start
.TP
\fB-f\fR \fIfile\fR
Use \fIfile\fR as the score file.
Defaults to \fBscores\fR.
.TP
\fB-l\fR \fIfile\fR
Log to \fIfile\fR.
Defaults to \fBmailvisad.log\fR.
.TP
\fB-p\fR \fIfile\fR
Use \fIfile\fR to store the pid (process id) of mailvisad.
Defaults to \fBmailvisad.pid\fR.
.TP
\fB-s\fR \fIpath\fR
Open a socket for \fBmailvisa check\fR at \fIpath\fR.
The default is \fBmailvisad.sock\fR.
.SS Options to mailvisa view
.TP
\fB-f\fR \fIfile\fR
Use \fIfile\fR as the score file.
Defaults to \fBscores\fR.
.SH EXAMPLES
.PP
Add all messages from the directory \fBmail/inbox\fR to the database of
good words:
.IP
.B mailvisa add good mail/inbox/*
.PP
Add all messages from the directory \fBmail/spam\fR to the database of
bad words:
.IP
.B mailvisa add bad mail/spam/*
.PP
Calculate word scores and store them in the score file (using the defaults
of bad, good, and scores for the files containing bad words, good words,
and word scores, respectively):
.IP
.B mailvisa calculate
.PP
Start the daemon:
.IP
.B mailvisa start
.PP
Check whether the message stored in \fBfoo\fR is spam:
.IP
.B mailvisa check < foo
.PP
The same, but suppressing the exit code:
.IP
.B mailvisa check -e < foo
.PP
The same, but suppressing the output (\fBX-Spam:\fR header and message)
instead:
.IP
.B mailvisa check -q < foo
.PP
Spam check a message from standard input and send it to \fBprocmail(1)\fR
for further processing (suppressing the exit code):
.IP
.B mailvisa check -e -m procmail
.SH COPYRIGHT
.PP
Mailvisa is open source, under the terms of the MIT license.
A copy of this license is contained in the file LICENSE in the source
distribution.
Mailvisa was written by Robbert Haarman.
See \fIhttp://inglorion.net/\fR for contact information.
