640 P1 report

CS 640 Programming assignment 1
Kaihong Wang
Date: 3/2/2019


Problem Definition

In this assignment, we need to implement and analyze a neural network on three different datasets (a linear dataset, a nonlinear dataset, and a hand-written digit dataset) and observe the influence of the learning rate and regularization. The crucial parts of this assignment include: reading data from the datasets and transforming it into a data structure usable by the neural network, implementing the forward and backward propagation algorithms, and collecting statistics to analyze the performance of the neural network.


Method and Implementation

We need to implement a neural network with one hidden layer from scratch. We first use a CSV file reader to read the data from the datasets and convert it into numpy arrays. Then we use the computation functions in the numpy module to implement the computational parts of the forward and backward propagation algorithms based on their mathematical derivation. Finally, we compute the confusion matrix to evaluate the performance of the neural network using five-fold cross validation.
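
Below is a minimal sketch of this data-loading step in Python; the file names and the column layout are placeholders for illustration rather than the exact files from the course website.

    # Minimal data-loading sketch (file names and column layout are assumptions).
    import csv
    import numpy as np

    def load_csv(data_path, label_path):
        """Read a feature CSV and a label CSV and return numpy arrays."""
        with open(data_path) as f:
            X = np.array([[float(v) for v in row] for row in csv.reader(f)])
        with open(label_path) as f:
            y = np.array([int(float(row[0])) for row in csv.reader(f)])
        return X, y

    # Example usage (hypothetical file names):
    # X, y = load_csv('LinearX.csv', 'LinearY.csv')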

I use Python to implement the neural network. The code consists of a class called nn, which contains a constructor that initializes the hyperparameters of the network, a compute_cost function that computes the cost and gradients, activation and de_activation functions that compute the output and derivative of the activation function, a train function that launches the training process (computing the cost and updating the weights), and a run function that reads the dataset, starts training, and evaluates the performance using five-fold cross validation. In this assignment, I set up a neural network with one hidden layer of 10 neurons for the linear and nonlinear datasets. For the hand-written digit dataset, I built a neural network with a hidden layer of 100 neurons.
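
A simplified sketch of the nn class described above is shown below (sigmoid activation, cross-entropy cost, gradient descent updates, optional L2 regularization); the exact signatures and initialization details in my submitted code may differ slightly.

    import numpy as np

    class nn:
        def __init__(self, n_in, n_hidden, n_out, lr=0.1, lam=0.0):
            # hyperparameters and randomly initialized weights and biases
            self.lr, self.lam = lr, lam
            self.W1 = np.random.randn(n_in, n_hidden) * 0.01
            self.b1 = np.zeros(n_hidden)
            self.W2 = np.random.randn(n_hidden, n_out) * 0.01
            self.b2 = np.zeros(n_out)

        def activation(self, z):
            return 1.0 / (1.0 + np.exp(-z))      # sigmoid

        def de_activation(self, a):
            return a * (1.0 - a)                 # sigmoid derivative, given the activation a

        def compute_cost(self, X, Y):
            m = X.shape[0]
            # forward propagation
            a1 = self.activation(X @ self.W1 + self.b1)
            a2 = self.activation(a1 @ self.W2 + self.b2)
            # cross-entropy cost plus an L2 regularization term
            cost = -np.sum(Y * np.log(a2) + (1 - Y) * np.log(1 - a2)) / m
            cost += self.lam / (2 * m) * (np.sum(self.W1 ** 2) + np.sum(self.W2 ** 2))
            # backward propagation
            d2 = (a2 - Y) / m
            dW2 = a1.T @ d2 + self.lam / m * self.W2
            db2 = d2.sum(axis=0)
            d1 = (d2 @ self.W2.T) * self.de_activation(a1)
            dW1 = X.T @ d1 + self.lam / m * self.W1
            db1 = d1.sum(axis=0)
            return cost, (dW1, db1, dW2, db2)

        def train(self, X, Y, epochs=1000):
            # gradient descent: compute cost and gradients, then update the weights
            costs = []
            for _ in range(epochs):
                cost, (dW1, db1, dW2, db2) = self.compute_cost(X, Y)
                self.W1 -= self.lr * dW1; self.b1 -= self.lr * db1
                self.W2 -= self.lr * dW2; self.b2 -= self.lr * db2
                costs.append(cost)
            return costs

Under this sketch, the linear and nonlinear datasets would use a network such as nn(n_features, 10, 2) and the hand-written digit dataset one such as nn(n_features, 100, 10), matching the hidden-layer sizes stated above.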


Experiments

I conducted four parts of experiments in this assignment. The first part evaluates the performance of the neural network on the linear and nonlinear datasets using five-fold cross validation; the confusion matrix and accuracy are provided, and a sketch of the evaluation loop is given after this paragraph.
The second part evaluates the performance of the neural network with different learning rates.
The third part examines the effect of different regularization strengths.
The last part evaluates the performance of the neural network on the hand-written digit dataset. To reduce the effect of randomness during training, I repeat training 5 times and sum up the confusion matrices.
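
The evaluation loop sketched below illustrates this procedure: five-fold cross validation with the confusion matrices of all folds summed together. It reuses the nn class sketched in the previous section; the fold-splitting and prediction details here are an approximation of my actual code, not an exact copy.

    import numpy as np

    def cross_validate(X, y, n_classes, n_hidden=10, folds=5, **net_kwargs):
        idx = np.random.permutation(X.shape[0])
        splits = np.array_split(idx, folds)
        confusion = np.zeros((n_classes, n_classes), dtype=int)
        for k in range(folds):
            test_idx = splits[k]
            train_idx = np.concatenate([splits[j] for j in range(folds) if j != k])
            Y_train = np.eye(n_classes)[y[train_idx]]    # one-hot labels
            net = nn(X.shape[1], n_hidden, n_classes, **net_kwargs)
            net.train(X[train_idx], Y_train)
            # forward pass on the held-out fold and accumulation of the confusion matrix
            a1 = net.activation(X[test_idx] @ net.W1 + net.b1)
            pred = np.argmax(net.activation(a1 @ net.W2 + net.b2), axis=1)
            for t, p in zip(y[test_idx], pred):
                confusion[p, t] += 1    # rows: predicted value, columns: ground truth
        accuracy = np.trace(confusion) / confusion.sum()
        return confusion, accuracy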

For the first part, the confusion matrix and accuracy from five-fold cross validation are provided to evaluate the performance of the neural network.
For the second part, a plot of the cost value during the training process is shown to analyze the influence of the learning rate; a plotting sketch is given below.
For the third part, the confusion matrix and accuracy for different values of the regularization parameter are provided.
For the fourth part, the confusion matrix and accuracy on the test data are provided.
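
The cost curves used in the second part could be produced with a loop like the following sketch (matplotlib-based; the learning-rate values shown here are illustrative, not necessarily the exact grid used for the figures in the Results section).

    import numpy as np
    import matplotlib.pyplot as plt

    # X, y and the nn class are assumed to come from the sketches above
    Y = np.eye(2)[y]                         # one-hot labels for the nonlinear dataset
    for lr in (0.01, 0.1, 1.0):              # illustrative learning rates
        net = nn(X.shape[1], 10, 2, lr=lr)
        costs = net.train(X, Y, epochs=1000)
        plt.plot(costs, label='learning rate = {}'.format(lr))
    plt.xlabel('training epoch')
    plt.ylabel('cost')
    plt.legend()
    plt.show()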


Results

1) Confusion matrix and accuracy of the neural network on the linear and nonlinear datasets.

Confusion Matrix for the linear dataset

                     Ground Truth
Predicted Value        0       1
              0      998       1
              1        2     999

Accuracy for the linear dataset = 99.85%

Confusion Matrix for the nonlinear dataset

                     Ground Truth
Predicted Value        0       1
              0      784      13
              1      216     987

Accuracy for the nonlinear dataset = 88.55%

2) Cost with respect to training epoch on the nonlinear dataset with different learning rates.





3) Test performance of the neural network with different values of the regularization parameter lambda.

Confusion Matrix for the nonlinear dataset with lambda = 0

                     Ground Truth
Predicted Value        0       1
              0      195      11
              1        4     190

Accuracy for the nonlinear dataset = 96.25%

Confusion Matrix for the nonlinear dataset with lambda = 0.01

                     Ground Truth
Predicted Value        0       1
              0      194      10
              1        5     191

Accuracy for the nonlinear dataset = 96.25%

Confusion Matrix for the nonlinear dataset with lambda = 0.1

                     Ground Truth
Predicted Value        0       1
              0      195      10
              1        4     191

Accuracy for the nonlinear dataset = 96.50%

Confusion Matrix for the nonlinear dataset with lambda = 1

                     Ground Truth
Predicted Value        0       1
              0      197       3
              1        2     198

Accuracy for the nonlinear dataset = 98.75%

Confusion Matrix for the nonlinear dataset with lambda = 10

                     Ground Truth
Predicted Value        0       1
              0      194      10
              1        5     191

Accuracy for the nonlinear dataset = 96.25%

4) Confusion matrix and accuracy of the neural network on the hand-written digits dataset.

Confusion Matrix for the hand-written digits dataset
(rows: predicted label, columns: ground truth label)

label     0     1     2     3     4     5     6     7     8     9
    0   428     0     5     0     2     0     0     0     0     0
    1     0   374     0     6    10     0     5     0    15     0
    2     0     0   400     5     0     0     0     0     1     0
    3     0     5    19   391     0     0     0     0    15     5
    4     5     5     0     0   416     0     0     0     6     0
    5     7     5     0    15     0   433     0     0    25    19
    6     0     5     0     0    22     5   450     0     0     0
    7     0     0     0    25     0     0     0   431     5     0
    8     0    10     3    13     0     0     0    14   353     0
    9     0    51     3     0    10    17     0     0    20   436

Accuracy for the hand-written digits dataset = 91.68%


Discussion


Conclusions

In this assignment, I implemented two neural networks from scratch, with sizes chosen to be reasonable with respect to the input and output sizes of each task, so that they are less likely to suffer from underfitting or overfitting.
We can also conclude that it is important to choose reasonable hyperparameters such as the learning rate, the regularization parameter, and the number of neurons in the hidden layer. Other, more advanced approaches to training a neural network, such as using the ReLU activation function, dropout, etc., are also recommended; a small sketch of the ReLU alternative is given below.
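
For illustration, a ReLU activation and its derivative could look like the sketch below; this is not part of the submitted code, and note that this derivative is taken with respect to the pre-activation input rather than the activation output.

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)

    def de_relu(z):
        return (z > 0).astype(float)    # derivative of ReLU with respect to its input z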


Credits and Bibliography

Datasets used in this assignment were downloaded from the course website of CS 640.