In this project you will implement the Naive Bayes algorithm.
Implement Naive Bayes as discussed in class and in the textbook. You only need to handle categorical (descriptive and target) features (such as in the golf dataset).
To avoid probability estimates equal to zero, use an additive term as discussed in class (the textbook calls this Laplace smoothing). Keep the added pseudo-count small. For example, instead of using 3/20 as the probability estimate for Pr[cold | PlayGolf = "Yes"], you would use (3 + α)/(20 + 4α), where α is a small constant that you choose (perhaps α = 1) and 4 is the number of possible values of Temperature.
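As a quick illustration, the smoothed estimate above can be computed with a one-line helper (this is just a sketch in Python; the function name and signature are my own, not part of the assignment):

```python
def smoothed_prob(count, class_total, num_values, alpha=1.0):
    """Laplace-smoothed estimate of Pr[value | class].

    count       -- times this feature value occurred within the class
    class_total -- total instances of the class
    num_values  -- number of possible values of the feature
    alpha       -- small smoothing constant (pseudo-count per value)
    """
    return (count + alpha) / (class_total + alpha * num_values)

# The example from the text: 3 "cold" days out of 20 "Yes" days,
# Temperature has 4 possible values, alpha = 1:
p = smoothed_prob(3, 20, 4, alpha=1.0)   # (3 + 1) / (20 + 4*1) = 4/24
```

Note that even a value never seen in training gets a nonzero probability, which is the whole point of the smoothing.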
Your main program should be able to run your Naive Bayes algorithm on any given .arff dataset, provided that all features (including the target feature) are categorical. You should be able to run your program in a manner similar to this:
Assignment8.exe golfTrain.arff golfTest.arff
The program then trains on golfTrain.arff, classifies each instance in golfTest.arff, and outputs the accuracy on the test set.
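The overall train-then-test flow might look like the sketch below. This is only one possible shape (the assignment does not prescribe a language or structure): it assumes a simple ARFF file with only categorical attributes declared as {v1,v2,...}, no missing values, and the target feature last, which matches the golf dataset.

```python
import sys
from collections import Counter, defaultdict

def parse_arff(path):
    """Minimal ARFF reader: categorical attributes only, no missing values."""
    attrs, rows, in_data = [], [], False
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('%'):
                continue
            low = line.lower()
            if low.startswith('@attribute'):
                name = line.split(None, 2)[1]
                values = line[line.index('{') + 1:line.rindex('}')]
                attrs.append((name, [v.strip() for v in values.split(',')]))
            elif low.startswith('@data'):
                in_data = True
            elif in_data:
                rows.append([v.strip() for v in line.split(',')])
    return attrs, rows

def train(attrs, rows):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(r[-1] for r in rows)
    feat_counts = defaultdict(Counter)  # (attr_index, class) -> value counts
    for r in rows:
        c = r[-1]
        for i, v in enumerate(r[:-1]):
            feat_counts[(i, c)][v] += 1
    return class_counts, feat_counts

def predict(attrs, model, row, alpha=1.0):
    """Return the class maximizing Pr[class] * prod Pr[value | class]."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for c, n_c in class_counts.items():
        p = n_c / total
        for i, v in enumerate(row[:-1]):
            k = len(attrs[i][1])  # number of possible values of attribute i
            p *= (feat_counts[(i, c)][v] + alpha) / (n_c + alpha * k)
        if p > best_p:
            best, best_p = c, p
    return best

if __name__ == '__main__':
    attrs, train_rows = parse_arff(sys.argv[1])
    _, test_rows = parse_arff(sys.argv[2])
    model = train(attrs, train_rows)
    correct = sum(predict(attrs, model, r) == r[-1] for r in test_rows)
    print('Accuracy: %.4f' % (correct / len(test_rows)))
```

Run as, e.g., `python assignment8.py golfTrain.arff golfTest.arff`. Multiplying many small probabilities can underflow on large datasets; summing log-probabilities instead is a standard refinement you may want to make.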
You will get bonus credit for implementing neat extras. The following are some suggestions.
In Educat, hand in the following:

READ_ME.txt — a file indicating what bonus features you are claiming, what language (and version of that language) you used, how to compile and run your program, and any other information that you want to include.