Naive Bayes

In this project you will implement the Naive Bayes algorithm.

Step 1: Implement Naive Bayes

Implement Naive Bayes as discussed in class and in the textbook. You only need to handle categorical features (both descriptive and target), such as those in the golf dataset.
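
As a rough illustration of the overall structure (not a required design), the sketch below counts class and feature-value frequencies at training time and predicts by multiplying the class prior with the per-feature likelihoods. The class name CategoricalNB and its fit/predict methods are my own choices, and Python is used only because the handout does not fix a language (the example run uses an .exe, so your actual submission may well be in C# or something else). Smoothing is deliberately left out here; it is added in Step 2.

from collections import Counter, defaultdict

class CategoricalNB:
    """Naive Bayes for categorical features (unsmoothed; see Step 2)."""

    def fit(self, rows, labels):
        # rows: list of feature-value lists; labels: list of target values
        self.n = len(labels)
        self.class_counts = Counter(labels)
        # counts[c][j][v] = number of class-c examples whose feature j has value v
        self.counts = {c: defaultdict(Counter) for c in self.class_counts}
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1
        return self

    def predict(self, row):
        best_class, best_prob = None, -1.0
        for c, nc in self.class_counts.items():
            p = nc / self.n                      # prior Pr[c]
            for j, v in enumerate(row):
                p *= self.counts[c][j][v] / nc   # estimate of Pr[feature j = v | c]
            if p > best_prob:
                best_class, best_prob = c, p
        return best_class

Note that a feature value never seen with a given class makes the whole product zero, which is exactly the problem Step 2 addresses.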

Step 2: Equivalent sample size / Laplace Smoothing

To deal with the potential problem of probability estimates equal to zero, use an additive term as discussed in class (the textbook calls this Laplace smoothing). The equivalent sample size should be small. For example, instead of using 3/20 as the estimate for Pr[Temperature = cold | PlayGolf = "Yes"], you would use (3 + α)/(20 + 4α), where α is a small constant that you choose (perhaps α = 1) and 4 is the number of possible values of Temperature.
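
For concreteness, here is a small hedged helper (the name smoothed_likelihood and the default α = 1 are illustrative, not prescribed) that computes the smoothed estimate and reproduces the example above:

def smoothed_likelihood(count_vc, count_c, num_values, alpha=1.0):
    # (count of value within class + alpha) / (class count + alpha * number of values)
    return (count_vc + alpha) / (count_c + num_values * alpha)

# Example above: 3 of 20 PlayGolf = "Yes" days are cold and Temperature has
# 4 possible values, so with alpha = 1 the estimate is (3 + 1)/(20 + 4) = 1/6.
print(smoothed_likelihood(3, 20, 4, alpha=1.0))   # 0.1666...

In the classifier from Step 1, this replaces the raw ratio used for Pr[feature j = v | c], so unseen feature values no longer force the product to zero.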

Step 3: Main

Your main program should be able to run your Naive Bayes algorithm on any given .arff dataset, provided that all features (including the target feature) are categorical. You should be able to run your program in a manner similar to this:

Assignment8.exe golfTrain.arff golfTest.arff

The program then trains on golfTrain.arff, tests on golfTest.arff, and outputs its accuracy on the test set.
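
One possible shape for the driver, again sketched in Python under stated assumptions: load_arff is a deliberately minimal reader that ignores everything before @data and assumes well-formed, comma-separated categorical values with no quoting or missing entries, and CategoricalNB refers to the Step 1 sketch (with the Step 2 smoothing folded in for a real submission).

import sys

def load_arff(path):
    # Minimal ARFF reader: skip comments and header lines, then treat each line
    # after @data as comma-separated values; the last column is the target.
    rows, labels, in_data = [], [], False
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("%"):
                continue
            if line.lower().startswith("@data"):
                in_data = True
            elif in_data:
                values = [v.strip() for v in line.split(",")]
                rows.append(values[:-1])
                labels.append(values[-1])
    return rows, labels

def main():
    train_path, test_path = sys.argv[1], sys.argv[2]
    rows, labels = load_arff(train_path)
    model = CategoricalNB().fit(rows, labels)   # Step 1 sketch; add Step 2 smoothing
    test_rows, test_labels = load_arff(test_path)
    correct = sum(model.predict(r) == y for r, y in zip(test_rows, test_labels))
    print(f"Test accuracy: {correct / len(test_labels):.3f}")

if __name__ == "__main__":
    main()

Such a script would be run analogously to the example above, e.g. python assignment8.py golfTrain.arff golfTest.arff.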

Step 4: Bonus features

You will get bonus credit for implementing neat extras. The following are some suggestions.

Step 5: Handing it in

In Educat, hand in the following: