Using the MathExpression filter
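The MathExpression filter can be found in the package weka.filters.unsupervised.attribute.
An expression can use the following operators, functions, and variables:
+, -, *, /, (, ),
pow, log, abs, cos, exp, sqrt, tan, sin, ceil, floor, rint,
MEAN, MAX, MIN, SD, COUNT, SUM, SUMSQUARED, ifelse
Here A refers to the value of the attribute being processed.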
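To make the role of these variables more concrete, here is a minimal plain-Python sketch of what an expression such as (A-MIN)/(MAX-MIN) would compute for one numeric attribute. It assumes that A stands for the current attribute value and that MIN and MAX stand for the minimum and maximum of that attribute. The function name min_max_scale is hypothetical and the code does not use the Weka API; it only mimics the arithmetic.

```python
def min_max_scale(values):
    """Apply the illustrative expression (A-MIN)/(MAX-MIN) to every value."""
    # MIN and MAX are statistics of the whole attribute (assumption, see above).
    minimum, maximum = min(values), max(values)
    if maximum == minimum:
        raise ValueError("MAX equals MIN; the expression would divide by zero")
    # 'a' plays the role of A, the value currently being processed.
    return [(a - minimum) / (maximum - minimum) for a in values]


print(min_max_scale([10.0, 20.0, 30.0]))  # [0.0, 0.5, 1.0]
```

The same expression is evaluated once per value, with only A changing between evaluations.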
Manual discretization
One can even use the filter for manually discretizing numeric attributes, if the other Discretize
filters (supervised and unsupervised) cannot be used. This works thanks to the ifelse
operator.
It is basically a two-step process:
- run MathExpression to turn all the values into discrete ones
- run NumericToNominal to turn the numeric values into nominal labels
Here's an example:
- a dataset where the first attribute needs to be discretized into 3 bins
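- the bins need to be as follows (as encoded by the expression in the command): bin 1 for A <= 20, bin 2 for 20 < A <= 80, bin 3 for A > 80
- using MathExpression to create discrete values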
weka.filters.unsupervised.attribute.MathExpression \
-E "ifelse(A>20, ifelse(A>80, 3, 2), 1)" \
-V \
-R 1
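Note: -V -R 1 means we only want to transform the first attribute: -R 1 selects the first attribute and -V inverts the selection, so only that attribute is modified. Without -V, the first attribute would be skipped and all the other numeric attributes would be transformed according to this expression.
- this turns every value of the first attribute into 1, 2, or 3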
- using NumericToNominal to create a nominal attribute from the numeric one
- optional: if one wants to rename those labels, one can use the class listed in the Rename Attribute Values article for that
Note: the "\" at the end of a line tells a *nix shell such as bash to continue the command on the next line.
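The two steps can be sketched in plain Python to show how the nested ifelse behaves at the bin boundaries. This is only an illustration of the logic, not Weka code: the helper names ifelse, discretize, and to_nominal are hypothetical, and the label format (the numeric code written as a string) is an assumption.

```python
def ifelse(condition, if_true, if_false):
    """Mimic the ifelse operator: pick one of two values based on a condition."""
    return if_true if condition else if_false


def discretize(a):
    """Step 1: the expression ifelse(A>20, ifelse(A>80, 3, 2), 1)."""
    return ifelse(a > 20, ifelse(a > 80, 3, 2), 1)


def to_nominal(codes):
    """Step 2: turn the numeric codes into nominal (string) labels."""
    return [str(code) for code in codes]


values = [5, 20, 21, 80, 81]
codes = [discretize(a) for a in values]
print(codes)              # [1, 1, 2, 2, 3]
print(to_nominal(codes))  # ['1', '1', '2', '2', '3']
```

Because both comparisons are strict, the boundary values 20 and 80 fall into the lower bin (1 and 2 respectively).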