CS 470 FINAL PROJECT

Start with the notebook titled "2.1-a-first-look-at-a-neural-network.ipynb", which you should be able to find in your DeepTeacher account in ~userid/Documents/CS470. (Here "~userid" is your home directory/folder; its absolute path is /home/userid, where userid is your NMU userid, which I reused as your username on the server.)

Make the following modifications (Tasks 1-5) to the notebook, save it, and submit that file to me. Actually, you can just tell me the name of the file (and where it is, if you moved it from ~userid/Documents/CS470). I'll find it on the server! (To be clear, I want you to copy the code snippets from this document and paste them into your notebook.)

(1) First task: Run all the cells in the notebook as is. Make sure every code cell gets run, down to the last one, which has the code: print('test_acc:', test_acc). (You'll know a cell has run if you see input and output numbers for it.)

(2) Next, add a new cell to the notebook, anywhere you'd like, to look at a particular example image, preferably one of the TEST images (in the test_images array), by pasting in the following code and running it.

Note: pick your own index into the test set. (The index in the example above is i=20.) The test set has 10,000 images, so choose your index from 0 to 9999. Note also that the test images must be in the raw format, not the preprocessed, reshaped format they are converted to just before the network is trained and tested. To make sure the test images are in their original raw format, just read them in (again) with the code below:

(3) Great, now you've looked at the inputs (picking one in particular). Next I want you to modify the network architecture, designing your own network. Locate the following code in the notebook:

This code creates a neural network with one input layer of 784 neurons, implicit in the code "input_shape=(28 * 28,)", followed by a hidden layer with 512 neurons, densely connected (i.e., completely connected) to all of the input neurons, followed by an output layer of ten neurons, each representing a class or category (digits 0..9). I want you to make this a deeper network by adding at least TWO layers between the hidden layer and the output. (Technically your layers will also be "hidden" layers.) Do this using "network.add()", but note that you do not have to specify the input shape. The "shape" of a layer's input (basically the number of incoming connections) is determined by the shape (number of outputs) of the previous layer. So your network.add() call should look like this:

network.add(layers.Dense(350, activation='relu'))

Note that you can vary the number of neurons. As a matter of fact, I WANT you to do this. I want your added layers to have progressively FEWER neurons to try to force greater generalization (learning of deeper features), but be careful not to go with too few or your accuracy on the test set will really suffer! I would NOT recommend experimenting with the activation function (i.e., keep it "relu", for rectified linear unit), at least for this project! (I think you will find that the more layers we add, the higher the accuracy on the training set but the lower the accuracy on the test set! That is OK. We'd have to change the optimizing/learning routine ("rmsprop"), as the backprop error signal fades with depth, but that is for another day! Just try to keep accuracy on the test set above 95%.)

Go ahead and run the cells to compile your newly designed network, then make sure the training and test sets are preprocessed, train the network (with "fit"), and check the accuracy on the training and test sets, all by just running (re-running) the cells already in the notebook.

(4) Next I want you to use some tools to visualize your new network.

(4a) First paste this code into a new cell somewhere and run it:

   import matplotlib.pyplot as plt
   network.summary()   # prints the layer table itself; wrapping it in print() adds a stray "None"
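
The "Param #" column that summary() prints can be checked by hand: a Dense layer with n inputs and m neurons has n*m weights plus m biases. The sketch below does that arithmetic in plain Python, with no Keras required. The function name dense_param_counts and the width 128 in the second example are made up for illustration; substitute the widths of your own layers.

```python
def dense_param_counts(widths):
    """Return the parameter count of each Dense layer.

    widths[0] is the input size; every later entry is the number of
    neurons in one Dense layer. Each layer has (inputs * neurons)
    weights plus one bias per neuron.
    """
    counts = []
    for n_in, n_out in zip(widths, widths[1:]):
        counts.append(n_in * n_out + n_out)
    return counts


# The original notebook network: 784 inputs -> 512 hidden -> 10 outputs.
original = dense_param_counts([784, 512, 10])
print(original, sum(original))

# A hypothetical deeper network with progressively fewer neurons.
deeper = dense_param_counts([784, 512, 350, 128, 10])
print(deeper, sum(deeper))
```

If the numbers summary() reports for your network differ from this arithmetic, the layer widths are probably not what you intended.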

(4b) Next paste in and run this code:

   from keras.utils import plot_model   # older Keras also exposes it as keras.utils.vis_utils.plot_model
   plot_model(network, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

(5) Finally, the last thing I ask from you is to actually run your trained model (network) on a single example and examine the actual output. Use the test image you chose above for Task 2, since you now know what that looks like and what the correct category/class/label is. The following code picks out the i_th test image from the set of 10K:

(Study this code; there is a lot to see here! Note that there is a method called "predict()" that your network object inherits from the "Model" class. predict() wants an array of images, so "test_images[i:i+1]" is an array of ONE image, namely test_images[i] in an array of length one. (The syntax "[j:k]" in Python selects the slice of an array from the element at index j to the element at index k-1; in other words, from j to just before k.) Note that the output of softmax is an array of ten probabilities, printed in floating point (scientific) notation. You should see that most of the probabilities are so small that they are essentially zero. One of them should be much larger than the others. This is your network's prediction. The "argmax()" function takes the array of ten numbers and returns the INDEX of the largest one, 0 to 9. This is the DIGIT that your trained network predicts.)
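
The slicing and argmax() steps described above can be tried on their own, without a trained network. This is a minimal sketch that assumes NumPy and uses stand-in data: fake_images is an array of zeros in the preprocessed shape and fake_output is a made-up softmax result, so neither comes from the real MNIST set or from your model.

```python
import numpy as np

# Stand-in for the preprocessed test set: 100 flattened 28x28 images.
fake_images = np.zeros((100, 28 * 28), dtype='float32')

i = 20
single = fake_images[i]        # one image, no batch dimension
batch = fake_images[i:i + 1]   # an array holding just that one image
print(single.shape)            # (784,)
print(batch.shape)             # (1, 784)

# Made-up output in the shape predict() returns for one image:
# a single row of ten numbers, one per digit.
fake_output = np.array([[1e-9, 2e-8, 3e-7, 1e-6, 5e-9,
                         2e-7, 1e-10, 0.999, 4e-6, 9.9e-4]])

# argmax() returns the position of the largest value in the row.
predicted_digit = int(np.argmax(fake_output[0]))
print(predicted_digit)         # 7
```

The extra leading dimension in batch is the reason for writing test_images[i:i+1] rather than test_images[i] when calling predict().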

So, does it match the LABEL for your chosen test image? __

If so, you are done! Save your notebook and let me know where to find it.

(One last note. For the code above to work the test_images must be in their PREPROCESSED, reshaped format (and NOT in the RAW format from when they are first loaded). If you have recently re-loaded the training and test data (probably in order to run the code in Task 2 above) then you'll need to re-run the code below (which you can find in the original code for the notebook).)
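
A minimal sketch of what that preprocessing does, assuming it is the usual reshape-and-rescale step for this dataset: each 28x28 image of integers 0..255 is flattened to 784 values and scaled to floats between 0 and 1. It uses NumPy and a small random stand-in array (raw_images, a hypothetical name) rather than the real test set, so treat the cell in the original notebook as the authoritative version.

```python
import numpy as np

# Stand-in for freshly loaded raw images: 5 images, 28x28, integers 0..255.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each image to one row of 784 values, then scale to [0, 1].
prepped = raw_images.reshape((raw_images.shape[0], 28 * 28))
prepped = prepped.astype('float32') / 255

print(raw_images.shape, raw_images.dtype)   # (5, 28, 28) uint8
print(prepped.shape, prepped.dtype)         # (5, 784) float32
```

Checking test_images.shape is a quick way to tell which format the array is currently in: three dimensions means raw, two means preprocessed.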