Using the MathExpression filter
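The filter MathExpression can be found in this package:
weka.filters.unsupervised.attribute
The following operators and functions can be used in an expression:
+, -, *, /, (, ),
pow, log, abs, cos, exp, sqrt, tan, sin, ceil, floor, rint,
MEAN, MAX, MIN, SD, COUNT, SUM, SUMSQUARED, ifelse
Within an expression, the value of the attribute being processed is referred to as A.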
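To illustrate how A combines with the statistics operators, here is a minimal plain-Python sketch of what an expression such as (A-MIN)/(MAX-MIN) would compute for a single numeric attribute. It does not call Weka. The function name and the sample values are hypothetical, and it assumes that MIN and MAX stand for the smallest and largest value of the attribute in the dataset.

```python
def min_max_scale(values):
    """Mimic the expression (A-MIN)/(MAX-MIN) for one numeric column.

    Hypothetical helper for illustration only; it is not part of Weka.
    """
    lo, hi = min(values), max(values)  # play the role of MIN and MAX
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid a
        # division by zero. Weka's own behavior may differ.
        return [0.0 for _ in values]
    # Each element plays the role of A in the expression.
    return [(a - lo) / (hi - lo) for a in values]


column = [10.0, 20.0, 40.0]  # made-up attribute values
scaled = min_max_scale(column)
print(scaled)
```

The smallest value ends up at 0 and the largest at 1, with everything else scaled linearly in between.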
Manual discretization
One can even use the filter for manually discretizing numeric attributes, if the other Discretize filters (supervised and unsupervised) cannot be used. This works thanks to the ifelse operator.
It is basically a two-step process:
- run MathExpression to turn all the values into discrete ones
- run NumericToNominal to turn the numeric values into nominal labels
Here's an example:
- a dataset where the first attribute needs to be discretized into 3 bins
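- the bins need to be as follows (as implied by the expression used in the next step): values of at most 20, values above 20 and at most 80, values above 80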
- using MathExpression to create discrete values
weka.filters.unsupervised.attribute.MathExpression \
-E "ifelse(A>20, ifelse(A>80, 3, 2), 1)" \
-V \
-R 1
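Note: -R lists the attributes to skip and -V inverts that selection, so -V -R 1 means we only want to transform the first attribute. Without -V, the first attribute would be skipped and all the other numeric attributes would be transformed according to this expression.
- this results in the following transformation: A <= 20 becomes 1, 20 < A <= 80 becomes 2, A > 80 becomes 3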
- using NumericToNominal to create a nominal attribute from the numeric one
- optional: if one wants to rename those labels, one can use the class listed in the Rename Attribute Values article for that
Note: the "\" at the end of a line tells a *nix shell such as bash that the command continues on the next line.
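The two steps can also be traced by hand. The following plain-Python sketch mirrors the nested ifelse expression from the example and then maps the resulting numbers to labels, which is roughly what the numeric-to-nominal step achieves. It does not call Weka; the helper names, the sample values, and the label names are hypothetical. It assumes that ifelse(condition, x, y) yields x when the condition holds and y otherwise.

```python
def ifelse(condition, if_true, if_false):
    """Stand-in for the ifelse operator (assumed semantics, not Weka code)."""
    return if_true if condition else if_false


def to_bin(a):
    # Mirrors the expression: ifelse(A>20, ifelse(A>80, 3, 2), 1)
    return ifelse(a > 20, ifelse(a > 80, 3, 2), 1)


# Hypothetical label names; renaming the labels is the optional last step.
LABELS = {1: "low", 2: "medium", 3: "high"}

values = [5, 20, 21, 80, 81]             # made-up attribute values
bins = [to_bin(a) for a in values]       # step 1: numeric bin codes
labels = [LABELS[b] for b in bins]       # step 2: nominal labels

for a, b, label in zip(values, bins, labels):
    print(a, b, label)
```

Note that both comparisons are strict, so the boundary values 20 and 80 fall into the lower of the two neighboring bins.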