Oversampling and Undersampling
Weka users frequently ask how to implement oversampling or undersampling, two common strategies for dealing with imbalanced classes in classification problems. This post provides some explanation.
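As a rough illustration of the two strategies, here is a minimal sketch in plain Java. It does not use Weka's filter API; the class name `Resampling` and its methods are hypothetical, and it assumes the simplest random variants: oversampling draws rows with replacement from the smaller classes until every class matches the largest one, while undersampling keeps a random subset of each class the size of the smallest one. Both methods return indices into the original data rather than copies of the rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper class, for illustration only (not part of Weka).
class Resampling {

    /** Groups row indices by class label, keeping labels in first-seen order. */
    static Map<String, List<Integer>> indicesByClass(List<String> labels) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            groups.computeIfAbsent(labels.get(i), k -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    /** Random oversampling: every class is grown to the size of the largest class. */
    static List<Integer> oversample(List<String> labels, long seed) {
        Random rnd = new Random(seed);
        Map<String, List<Integer>> groups = indicesByClass(labels);
        int target = 0;
        for (List<Integer> group : groups.values()) {
            target = Math.max(target, group.size());
        }
        List<Integer> result = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            result.addAll(group); // keep every original row once
            for (int i = group.size(); i < target; i++) {
                // fill the gap by drawing existing rows with replacement
                result.add(group.get(rnd.nextInt(group.size())));
            }
        }
        return result;
    }

    /** Random undersampling: every class is cut down to the size of the smallest class. */
    static List<Integer> undersample(List<String> labels, long seed) {
        Random rnd = new Random(seed);
        Map<String, List<Integer>> groups = indicesByClass(labels);
        int target = Integer.MAX_VALUE;
        for (List<Integer> group : groups.values()) {
            target = Math.min(target, group.size());
        }
        List<Integer> result = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            List<Integer> shuffled = new ArrayList<>(group);
            Collections.shuffle(shuffled, rnd); // random subset without replacement
            result.addAll(shuffled.subList(0, target));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two "yes" rows and six "no" rows: an imbalanced toy data set.
        List<String> labels = List.of("yes", "no", "no", "no", "no", "yes", "no", "no");
        System.out.println("oversampled rows:  " + oversample(labels, 1L));
        System.out.println("undersampled rows: " + undersample(labels, 1L));
    }
}
```

Whichever strategy is used, it should normally be applied to the training data only, so that the test data keeps the original class distribution.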
One role of the Weka software is to provide users with the opportunity to implement machine learning algorithms without having to deal with data import and evaluation issues: when a classifier has been written as a Java class that implements a couple of standard methods defined in the Weka framework, all the goodies that come with Weka are automatically applicable to it, and it will automatically show up in Weka's graphical user interfaces. To see what needs to be done, read on!
Kotlin is a statically typed programming language for modern multiplatform applications running on the Java Virtual Machine. Its design philosophy is to be a concise and type-safe (e.g. support for non-nullable types) programming language. It supports both object-oriented and functional constructs. Other features include smart casting, operator overloading, higher-order functions, lambdas, and extensions. Extensions in particular have led to a group of Kotlin extension libraries that primarily focus on making other libraries more pleasant to use syntactically. A key example is the Android KTX library, developed by Google, which provides extensions for the Android framework. Its purpose is to make Android development with Kotlin more concise, pleasant and idiomatic.
To make Weka more accessible to people who like having access to the latest code, we have been looking into ways of bringing Weka to GitHub, especially since our faculty already has an organizational account there.
However, Weka was originally stored in a CVS repository and later migrated to Subversion. It also has a lot of branches, and not all of them are publicly available (the commercial ones, for example). All of this complicated the migration to GitHub.