Day 14 (7/25/19): Running the Experiments

The models are still training today, although some results are already visible in the early training batches. As expected, catastrophic forgetting is very noticeable in the offline model, which also causes a drastic drop in out-of-distribution (OOD) performance. I am excited to see how the other models, which have stronger incremental learning capabilities, turn out with regard to OOD performance.
This graph shows the model's accuracy on the training data across 3 batches. Notice how performance drops off significantly with each new batch of classes as the model struggles to fit the new data.

Tomorrow I hope to extend the experiments to include the Mahalanobis (distance in feature space) inference method in addition to the softmax thresholding method.
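To make the two inference methods concrete, here is a minimal NumPy sketch of how each could score a sample for novelty. It is an illustration under my own assumptions, not the code used in these experiments: the function names are hypothetical, the Mahalanobis variant assumes one mean per class with a single covariance matrix shared across classes, and the threshold that turns a score into an OOD decision is left to the user.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_ood_score(logits):
    """Softmax thresholding: 1 - max class probability (higher = more novel)."""
    return 1.0 - softmax(logits).max(axis=1)

def fit_mahalanobis(features, labels):
    """Estimate per-class means and a shared precision matrix (assumed setup)."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center each sample on its own class mean before pooling the covariance.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    precision = np.linalg.pinv(cov)  # pseudo-inverse in case cov is singular
    return means, precision

def mahalanobis_ood_score(features, means, precision):
    """Squared Mahalanobis distance to the closest class mean (higher = more novel)."""
    diffs = features[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    d2 = np.einsum("nkd,de,nke->nk", diffs, precision, diffs)
    return d2.min(axis=1)
```

In both cases a sample would be flagged as OOD when its score exceeds a threshold chosen on held-out data. The softmax score only needs the network's logits, while the Mahalanobis score needs feature vectors from an intermediate layer plus the class statistics from `fit_mahalanobis`.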

Popular posts from this blog

Day 29 (8/15/19): Final Day Before Presentations

Most of today was also spent practicing and editing my presentation to make it as professional as I can. I'm really looking forward to the opportunity to present my work to faculty and friends tomorrow. Here is a link to the slides for my final presentation: Novelty Detection in Streaming Learning using Neural Networks

Day 28 (8/14/19): Presentation Dry Run

In the morning, all of us interns got the chance to practice our presentations in front of each other in the auditorium. I was pretty happy with how mine went overall but the experience was definitely valuable in identifying typos or slight adjustments that should be made. Throughout the rest of the day, I tried to implement these changes and clean up a few plots that I want to include for Friday.

Day 9 (7/18/19): Incrementally Learning CUB200

Today I continued learning about incremental learning models by testing different strategies on the CUB200 dataset. From what I understand from reading various articles, there seem to be five different approaches to mitigating catastrophic forgetting in lifelong learning models. These are regularization methods (adding constraints to a network's weights), ensemble methods (training multiple classifiers and combining them), rehearsal methods (mixing old data with data from the current session), dual-memory methods (modeled on the human brain, with a fast learner and a slow learner), and sparse-coding methods (reducing interference with previously learned representations). All of these methods have their constraints, and I don't believe it is yet clear which method (or which combination of methods) is best. Full rehearsal obviously seems to be the most effective at making the model remember what it had previously learned but given that all training exam...