DataSink#
The easiest way to save a weka.core.Instances object to a file is to use the weka.core.converters.ConverterUtils.DataSink class:
import weka.core.converters.ConverterUtils.DataSink;
import weka.core.Instances;
Instances dataset = ...
String outputFilename = ...
try {
  DataSink.write(outputFilename, dataset);
}
catch (Exception e) {
  System.err.println("Failed to save data to: " + outputFilename);
  e.printStackTrace();
}
Converter#
You can use the ArffSaver class (weka.core.converters.ArffSaver) to save a weka.core.Instances object to a file. Here is a snippet:
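import java.io.File;
import weka.core.Instances;
import weka.core.converters.ArffSaver;
...
Instances dataSet = ...
ArffSaver saver = new ArffSaver();
saver.setInstances(dataSet);
saver.setFile(new File("./data/test.arff"));
saver.writeBatch();  // writes the complete dataset in one call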
Notes:
- Using the converter approach, one can easily swap the ArffSaver with another saver, e.g., the CSVSaver, to output the data in a different format.
- The Weka Examples collection dedicates quite a few examples to the use of converters in the wekaexamples.core.converters package (stable, developer).
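The first note can be illustrated with a minimal sketch that writes the same dataset once as ARFF and once as CSV. The class name SaverSwapSketch, the helper methods save() and demoData(), the tiny two-attribute dataset, and the output file names are all assumptions made for illustration. The sketch assumes a Weka 3.7 or later jar on the classpath.

```java
import java.io.File;
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.core.converters.AbstractFileSaver;
import weka.core.converters.ArffSaver;
import weka.core.converters.CSVSaver;

public class SaverSwapSketch {

  // Saves the dataset with whichever file-based saver is passed in.
  // (Hypothetical helper; it only wraps the three calls from the snippet above.)
  public static void save(AbstractFileSaver saver, Instances data, File target) throws Exception {
    saver.setInstances(data);
    saver.setFile(target);
    saver.writeBatch();
  }

  // Builds a tiny numeric dataset so that the example is self-contained.
  public static Instances demoData() {
    ArrayList<Attribute> attributes = new ArrayList<>();
    attributes.add(new Attribute("x"));
    attributes.add(new Attribute("y"));
    Instances data = new Instances("demo", attributes, 0);
    data.add(new DenseInstance(1.0, new double[] {1.0, 2.0}));
    return data;
  }

  public static void main(String[] args) throws Exception {
    Instances data = demoData();
    // Only the saver instance changes; the calling code stays the same.
    save(new ArffSaver(), data, new File("demo.arff")); // hypothetical output paths
    save(new CSVSaver(), data, new File("demo.csv"));
  }
}
```

Because both savers share the AbstractFileSaver base class, the output format is decided solely by the object handed to save().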
Java I/O#
You can also save the weka.core.Instances object directly using Java I/O classes:
import java.io.BufferedWriter;
import java.io.FileWriter;
import weka.core.Instances;
...
Instances dataSet = ...
// try-with-resources flushes and closes the writer, even if write() throws
try (BufferedWriter writer = new BufferedWriter(new FileWriter("./data/test.arff"))) {
  writer.write(dataSet.toString());
}
Note: using the toString() method of weka.core.Instances does not scale well for large datasets, since the complete string has to fit into memory. It is best to use a converter, as described in the previous section, which writes the dataset to disk incrementally.
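To make the memory argument concrete, here is a minimal plain-Java sketch that is independent of Weka. The class name IncrementalWriteSketch, its helper methods, and the simplified numeric-only ARFF-like layout are assumptions for illustration, not how Weka's savers are implemented. It contrasts building the complete output as one string with writing one row at a time.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class IncrementalWriteSketch {

  // Formats the header of a simplified, numeric-only ARFF-like file.
  public static String header(String relation, List<String> attributes) {
    StringBuilder sb = new StringBuilder("@relation " + relation + "\n\n");
    for (String name : attributes) {
      sb.append("@attribute ").append(name).append(" numeric\n");
    }
    return sb.append("\n@data\n").toString();
  }

  // Formats one data row as comma-separated values.
  public static String row(double[] values) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < values.length; i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append(values[i]);
    }
    return sb.append('\n').toString();
  }

  // Whole-string approach: the complete output is built in memory first.
  public static String renderAll(String relation, List<String> attributes, List<double[]> rows) {
    StringBuilder sb = new StringBuilder(header(relation, attributes));
    for (double[] r : rows) {
      sb.append(row(r));
    }
    return sb.toString();
  }

  // Incremental approach: only one formatted row is held at a time.
  public static void writeIncrementally(Writer out, String relation, List<String> attributes,
      Iterable<double[]> rows) throws IOException {
    out.write(header(relation, attributes));
    for (double[] r : rows) {
      out.write(row(r));
    }
    out.flush();
  }

  public static void main(String[] args) throws IOException {
    List<String> attributes = Arrays.asList("x", "y");
    List<double[]> rows = Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 4.0});
    StringWriter out = new StringWriter(); // stands in for a FileWriter
    writeIncrementally(out, "demo", attributes, rows);
    // Both approaches produce the same text; they differ only in peak memory use.
    System.out.println(out.toString().equals(renderAll("demo", attributes, rows)));
  }
}
```

With a file-backed writer, the incremental variant needs memory for a single row plus the writer's buffer, whereas renderAll() needs memory proportional to the whole dataset.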