Unofficial

There are a number of packages for WEKA 3.8 on the internet that are not listed in the "official" WEKA package repository. These packages can nevertheless be easily installed via the package manager in WEKA 3.8 (available via the Tools menu in WEKA's GUIChooser) by providing the URL for the package .zip file.
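The same installation can also be scripted from the command line, since `weka.core.WekaPackageManager` accepts an `-install-package` option that takes a package name, a local .zip file or a URL. The following is a minimal sketch that only builds and prints such a command. The class name `InstallUnofficialPackage`, the `weka.jar` location and the example URL are placeholders chosen for illustration, not real package locations.

```java
import java.util.List;

class InstallUnofficialPackage {

    // Builds the command line that asks WEKA's command-line package manager
    // to install a package from the URL of its .zip file.
    static List<String> installCommand(String wekaJar, String packageZipUrl) {
        return List.of(
            "java", "-cp", wekaJar,
            "weka.core.WekaPackageManager",
            "-install-package", packageZipUrl);
    }

    public static void main(String[] args) {
        // Both defaults are placeholders (assumptions), not real locations.
        String wekaJar = args.length > 0 ? args[0] : "weka.jar";
        String url = args.length > 1 ? args[1] : "https://example.org/some-package.zip";

        List<String> cmd = installCommand(wekaJar, url);
        System.out.println(String.join(" ", cmd));

        // To actually run the installation instead of printing the command:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Printing the command first makes it easy to check the jar path and URL before anything is downloaded.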

Below is an (incomplete) list of packages that are available.

Input/output#

Preprocessing#

  • dataset-weights -- filters for setting attribute and instance weights using various methods.
  • missing-values-imputation -- various methods for imputing missing values using a filter.
  • mxexpression -- filter for updating a target attribute using a mathematical expression.

Classification#

  • Java neural network package -- Java (convolutional or fully-connected) neural network implementation with plugin for Weka. Uses dropout and rectified linear units. Implementation is multithreaded and uses MTJ matrix library with native libs for performance.
  • HMMWeka -- This library makes Hidden Markov Model machine learning available in Weka.
  • Collective classification -- Algorithms around semi-supervised learning and collective classification.
  • Bagging ensemble selection -- a new ensemble learning strategy.
  • DataSqueezer -- Efficient rule builder that generates a set of production rules from labeled input data. It can handle missing data and has log-linear asymptotic complexity in the number of training examples.
  • miDS -- mi-DS is a supervised multiple-instance learning algorithm based on the DataSqueezer algorithm.
  • LibD3C -- Ensemble classifiers with a clustering and dynamic selection strategy.
  • ICRM -- An Interpretable Classification Rule Mining Algorithm.
  • tclass -- TClass is a supervised learner for multivariate time series, originally developed by Waleed Kadous.
  • wekaclassalgos -- collection of artificial neural network (ANN) algorithms and artificial immune system (AIS) algorithms, originally developed by Jason Brownlee.
  • mxexpression -- classifier for making predictions using a mathematical expression.

Clustering#

  • APCluster -- Affinity propagation algorithm for clustering, used especially in bioinformatics and computer vision.
  • Fast Optics -- Fast implementation of the OPTICS algorithm using random projections for Euclidean distances.

Similarity functions#

  • wekabiosimilarity -- implements several measures for comparing binary feature vectors and additionally extends those measures to work with multi-value, string and numerical feature vectors.

Discretization#

  • ur-CAIM -- Improved CAIM Discretization for Unbalanced and Balanced Data.
  • CAIM -- Class-Attribute Interdependence Maximization algorithm: discretizes a continuous feature into a number of intervals. This is done by using class information, without requiring the user to provide this number.

Feature selection#

Frequent pattern mining#

  • XApriori -- Available-case analysis modification of the Apriori frequent pattern mining algorithm.

Stemming#

Text mining#

  • nlp -- Contains components for natural language processing, e.g., a part-of-speech tagging filter and a Penn Treebank tokenizer. Makes use of the Stanford Parser (parser models need to be downloaded separately).

Visualization#

  • graphviz-treevisualize -- Generates graphs in the Explorer from trees (e.g., J48) using the GraphViz executables.
  • confusionmatrix -- Various visualizations of confusion matrices in the Explorer.
  • serialized-model-viewer -- Adds a standalone tab to the Explorer that allows the user to load a serialized model and view its content as text (simply uses the objects' toString() method).
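The idea behind a viewer like serialized-model-viewer can be illustrated with plain Java serialization: read an object back from a file and print whatever its `toString()` method returns. This is a minimal sketch, not the package's actual code. It assumes the file holds an object written with standard Java serialization, and the class name `SerializedObjectViewer` and the stand-in `ArrayList` are made up for illustration. Loading a real WEKA model this way would additionally require the WEKA classes on the classpath.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class SerializedObjectViewer {

    // Reads the first Java-serialized object from the file and returns
    // its class name followed by the text produced by its toString() method.
    static String describe(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            Object obj = in.readObject();
            return obj.getClass().getName() + ":\n" + obj;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a model: any Serializable object works for the demo.
        File file = File.createTempFile("demo-object", ".ser");
        file.deleteOnExit();
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(new ArrayList<>(List.of("first", "second")));
        }

        System.out.println(describe(file));
    }
}
```

Because Java deserialization can execute code from the stream, a sketch like this should only be pointed at files from a trusted source.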

Parameter optimization#

  • multisearch -- Meta-classifier similar to GridSearch, but for optimizing an arbitrary number of parameters.

Others#

  • screencast4j -- Allows you to record sound, webcam and screen feeds, storing them in separate files to be combined into a screencast using a video editor. You can then share the screencast on YouTube, for instance.
  • command-to-code -- Turns command-lines (e.g., of classifiers or filters) into various Java code snippets.
  • jshell-scripting -- Allows scripting in Java, using jshell.