
WekaGo 1.0

A simple Go wrapper around the Weka Java command-line interface supporting basic classification tasks.

The library abstracts the use of ARFF files and allows users to integrate some of Weka's features programmatically and directly within a Go project.

Installation

To install WekaGo, retrieve it with:

$ go get github.com/flyingsparx/wekago

and then import it into your project:

import "github.com/flyingsparx/wekago"

Dependencies

No further Go libraries are required, but you will need working Go and Java environments.

You will also need the Weka jar file, which can be downloaded from the Weka website. WekaGo currently looks for weka.jar in the root of your Go project and will fail if it cannot find it.
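Because a missing jar only shows up as a failure later on, it can help to check for it up front. The following is a minimal sketch, not part of WekaGo: the helper name checkWekaJar is hypothetical, and it assumes that "the root of your Go project" is the directory the program is run from.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// checkWekaJar returns nil if a non-directory file exists at path.
// The name is illustrative; WekaGo does not provide this function.
func checkWekaJar(path string) error {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("%s not found: download the Weka jar and place it there", path)
	}
	if err != nil {
		return err
	}
	if info.IsDir() {
		return fmt.Errorf("%s is a directory, expected a jar file", path)
	}
	return nil
}

func main() {
	// Assumption: the program runs from the project root.
	if err := checkWekaJar("weka.jar"); err != nil {
		fmt.Println("cannot use WekaGo yet:", err)
		return
	}
	fmt.Println("weka.jar found")
}
```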

Using

Getting started

Most of WekaGo's functionality is exposed through functions centred around a Weka model. Create a model with the NewModel() function, passing it your desired classifier:

model := wekago.NewModel("bayes.BayesNet")

Model training

To train a model, supply it with a series of feature instances. Create each feature from a feature name, a value, and the ARFF representation of the feature's data type (e.g. 'real', '{true, false}', etc.):

feature1 := wekago.NewFeature("rain", "true", "{true, false}")
feature2 := wekago.NewFeature("grass", "wet", "{wet, dry}")
...

You can then declare the features that belong together as an instance. The outcome feature should be added last:

instance1 := wekago.NewInstance()
instance1.AddFeature(feature1)
instance1.AddFeature(feature2)

instance2 := wekago.NewInstance()
...
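To see why each feature carries an ARFF type and why the outcome goes last, it may help to look at the kind of ARFF text an instance corresponds to. The sketch below is an illustration of the ARFF format only, not WekaGo's internal code: the feature type, the toARFF function, and the relation name "weather" are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// feature mirrors the three arguments passed to wekago.NewFeature.
// It is a local stand-in type, not WekaGo's own struct.
type feature struct {
	name, value, arffType string
}

// toARFF renders instances as ARFF text. All instances are assumed to share
// the same features in the same order; the header is built from the first one.
func toARFF(relation string, instances [][]feature) string {
	var b strings.Builder
	fmt.Fprintf(&b, "@relation %s\n\n", relation)
	if len(instances) > 0 {
		for _, f := range instances[0] {
			fmt.Fprintf(&b, "@attribute %s %s\n", f.name, f.arffType)
		}
	}
	b.WriteString("\n@data\n")
	for _, inst := range instances {
		values := make([]string, len(inst))
		for i, f := range inst {
			values[i] = f.value
		}
		b.WriteString(strings.Join(values, ",") + "\n")
	}
	return b.String()
}

func main() {
	// The outcome feature ("grass") comes last: Weka's command line
	// treats the last attribute as the class unless told otherwise.
	instance := []feature{
		{"rain", "true", "{true, false}"},
		{"grass", "wet", "{wet, dry}"},
	}
	fmt.Print(toARFF("weather", [][]feature{instance}))
}
```

Each feature becomes one @attribute line in the header and one comma-separated value in the @data row.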

Once you have a series of training feature instances, they can be added directly to the model:

model.AddTrainingInstance(instance1)
model.AddTrainingInstance(instance2)
...

Then training the model is simple:

err := model.Train()

Any error is returned in err, which will be nil if training succeeded.

Alternatively, if you have a previously built model, you can load it instead:

model.LoadModel("/path/to/model")

Loading models directly means you do not need to create and add training feature instances.

Model testing

Adding test features works almost exactly like adding training features. If the value of the outcome feature (the last one added to an instance) is unknown, use "?" as its value. Then use the model's AddTestingInstance() method to add the test instances:

testFeature1 := wekago.NewFeature("rain", "true", "{true, false}")
testFeature2 := wekago.NewFeature("grass", "?", "{wet, dry}")
...

testInstance1 := wekago.NewInstance()
testInstance1.AddFeature(testFeature1)
testInstance1.AddFeature(testFeature2)
...

model.AddTestingInstance(testInstance1)
...

Finally, the feature instances can be classified by calling Test():

err := model.Test()

As before, any errors will be included in err.

Testing the model populates its Predictions slice.
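For orientation, training and testing correspond roughly to two runs of the Weka command line. The sketch below builds those argument lists without executing them. It is an assumption about the general shape of the calls, not a copy of what WekaGo runs: the helper names and file names are hypothetical, and it assumes the classifier string is resolved relative to the weka.classifiers package. The flags themselves (-t, -d, -l, -T, -p) are standard Weka classifier options.

```go
package main

import (
	"fmt"
	"strings"
)

// trainArgs builds java arguments that train a classifier on an ARFF file
// (-t) and save the resulting model to a file (-d).
func trainArgs(jar, classifier, trainARFF, modelOut string) []string {
	return []string{"-cp", jar, "weka.classifiers." + classifier,
		"-t", trainARFF, "-d", modelOut}
}

// testArgs builds java arguments that load a saved model (-l), read test
// instances from an ARFF file (-T), and print one prediction per instance
// without extra attribute columns (-p 0).
func testArgs(jar, classifier, modelIn, testARFF string) []string {
	return []string{"-cp", jar, "weka.classifiers." + classifier,
		"-l", modelIn, "-T", testARFF, "-p", "0"}
}

func main() {
	// File names are placeholders. To actually run a command, pass the
	// slice to exec.Command("java", args...) from the os/exec package.
	train := trainArgs("weka.jar", "bayes.BayesNet", "train.arff", "bayes.model")
	test := testArgs("weka.jar", "bayes.BayesNet", "bayes.model", "test.arff")
	fmt.Println("java", strings.Join(train, " "))
	fmt.Println("java", strings.Join(test, " "))
}
```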

Examining classification output

Once you've classified your test set, you can examine the predictions made by your chosen classifier through your model's Predictions slice:

for _, prediction := range model.Predictions {
    fmt.Printf("%s\n", prediction.Predicted_value)
}

Each prediction struct has the following fields:

  • Index - the number of the prediction, corresponding to the position of the matching test feature instance in the input
  • Observed_value - the observed value of the corresponding test instance (equal to "?" where this was unknown)
  • Predicted_value - the predicted value of this instance
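When some test instances have known outcomes, these fields are enough to compute a simple accuracy figure. The sketch below uses a local Prediction type with the three documented field names; the field types (in particular Index as an int) are assumptions, so check wekago.go for the real definitions. The accuracy helper is hypothetical and not part of WekaGo.

```go
package main

import "fmt"

// Prediction is a local stand-in with the three documented fields.
// The field types are assumed for this example.
type Prediction struct {
	Index           int
	Observed_value  string
	Predicted_value string
}

// accuracy returns the fraction of correct predictions among instances whose
// observed value is known, together with the number of such instances.
func accuracy(predictions []Prediction) (float64, int) {
	known, correct := 0, 0
	for _, p := range predictions {
		if p.Observed_value == "?" {
			continue // outcome was unknown, nothing to compare against
		}
		known++
		if p.Observed_value == p.Predicted_value {
			correct++
		}
	}
	if known == 0 {
		return 0, 0
	}
	return float64(correct) / float64(known), known
}

func main() {
	// Made-up predictions for illustration only.
	predictions := []Prediction{
		{1, "wet", "wet"},
		{2, "dry", "wet"},
		{3, "?", "dry"},
	}
	acc, n := accuracy(predictions)
	fmt.Printf("accuracy %.2f over %d labelled instances\n", acc, n)
}
```

With the real library, the same loop would range over model.Predictions instead of a hand-built slice.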

Licensing

The software in this repository is released under the terms of the Apache License Version 2.0. For the full text of the license, please see apache.org/licenses/LICENSE-2.0.