Message classifier
This article describes the MessageClassifier from the 2nd edition of the Data Mining book by Witten and Frank.
Source code
Download the version of the source code that matches your edition of the book (this article is based on the 2nd edition):
- 1st Edition: MessageClassifier
- 2nd Edition: MessageClassifier (book, stable-3.8, developer)
Compiling
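- If the weka.jar is already in your CLASSPATH environment variable, compile the source code with javac and no extra options.
- Otherwise, pass the jar to javac explicitly via its classpath option (replace /path/to/ with the correct path on your system).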
Note: The classpath handling is omitted from here on.
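The two compile cases above can be sketched as a small dry-run shell function. It only prints the javac call that fits the current environment; run the printed command to compile. The source file name MessageClassifier.java is an assumption based on the class name, and /path/to/ is left as the placeholder used above.

```shell
# Placeholder location of weka.jar; replace /path/to/ with the correct path.
WEKA_JAR="/path/to/weka.jar"

# Print the javac call that matches the current environment.
# Assumption: the downloaded source file is named MessageClassifier.java.
compile_cmd() {
  case "${CLASSPATH:-}" in
    # weka.jar is already listed in CLASSPATH: no extra option needed
    *weka.jar*) echo "javac MessageClassifier.java" ;;
    # otherwise pass the jar explicitly with -cp
    *) echo "javac -cp $WEKA_JAR MessageClassifier.java" ;;
  esac
}

compile_cmd
```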
Training
If you run the MessageClassifier for the first time, you need to provide labeled examples to build a classifier from, i.e., messages ("-m") and the corresponding classes ("-c"). Since the data and the model are kept for future use, you also have to specify the name of the file that the MessageClassifier is serialized to ("-t").
Here's an example that labels the message email1.txt as miss:
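A minimal sketch of such a training call, written as a dry-run helper that prints the command instead of executing it. The options "-m", "-c" and "-t" are the ones described above; the model file name messages.model is hypothetical, and the sketch assumes the compiled class sits in the default package so that it can be started with java MessageClassifier.

```shell
# Hypothetical file that the MessageClassifier is serialized to.
MODEL="messages.model"

# Print the training call for one labeled message.
#   $1 = message file, $2 = class label
train_cmd() {
  printf 'java MessageClassifier -m %s -c %s -t %s\n' "$1" "$2" "$MODEL"
}

# Label email1.txt as miss
train_cmd email1.txt miss
```

Calling train_cmd once per message and label produces one training call for each labeled example.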
Repeat this for all the messages you want to have classified.
Classifying
Classifying an unseen message is straightforward: just omit the class option ("-c"). The following call