
Showing posts from July, 2019

Day 18 (7/31/19): Obtaining Results from Early Experiments

Today I reviewed some of the first true results for the early rounds of experiments I performed. For the offline model (intended to be used as the baseline for calculating the omega values of the incrementally learned models), the final batch of 20 classes yielded an accuracy of 81.20%, an AUROC of 0.99 for Gaussian noise OOD, 0.82 for inter-dataset OOD, and 0.80 for intra-dataset OOD. It is also important to note that I switched the learning rate scheduler to an exponentially defined decay rather than dropping the learning rate in a single step once it reaches 2/3 of the batch iterations. The full rehearsal model, as expected, performed almost as well as the offline model, achieving an accuracy of 78.12%, a Gaussian noise OOD Omega AUROC of 0.89, an inter-dataset OOD Omega AUROC of 0.92, and an intra-dataset OOD Omega AUROC of 0.98. It will be interesting to see how these results compare to future models. Most likely, these less memory-intensive models will perform
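For reference, the scheduler change amounts to swapping a one-time step decay for an exponentially defined decay; here is a minimal PyTorch sketch (the model, learning rate, and gamma values are placeholders, not the ones used in the experiments):

```python
import torch

# Placeholder model and optimizer, just to illustrate the scheduler swap
model = torch.nn.Linear(512, 20)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Old approach: decay the learning rate once, at ~2/3 of the batch iterations
# scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

# New approach: exponentially defined decay applied every epoch
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

for epoch in range(60):
    # ... training loop (forward, backward, optimizer.step()) would go here ...
    scheduler.step()  # lr_{t+1} = lr_t * gamma
```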

Day 17 (7/30/19): Adding Bounding Boxes in Experiments

Today was very successful, as we finally finished debugging the script we were using to train our experiment models. Previously, I was seeing the loss diverge to NaN during the second batch, and the problem turned out to be that I had forgotten to shuffle the data with each new batch. After cleaning up much of the script, I reorganized the code to be more general purpose for future experiments. I ran the new code to train an offline model so I could generate some baseline numbers to use for the omega values in the experiments. However, I am still adjusting a few hyperparameters (namely the patience counter for early stopping to prevent overfitting) to see what will yield the highest offline accuracy with bounding boxes implemented. The full rehearsal model was fairly simple to implement after the offline model trained, so I plan to start evaluating Mahalanobis OOD and more complex models tomorrow.
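The shuffling fix itself is small: the training DataLoader just needs shuffle=True so each batch of classes is trained on data in a fresh order. A minimal sketch, with placeholder tensors standing in for the current 20-class subset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the current 20-class subset (placeholder images and labels)
images = torch.randn(64, 3, 64, 64)
labels = torch.randint(0, 20, (64,))
train_subset = TensorDataset(images, labels)

# The fix: shuffle=True so each epoch (and each new class batch) sees the data in a new order
train_loader = DataLoader(train_subset, batch_size=32, shuffle=True, num_workers=0)
```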

Day 16 (7/29/19): Achieving Better Results by Adjusting Learning Rate

This morning I reviewed the results of the experiments I ran over the weekend. However, the TensorBoard graphs showed that after a certain number of batches, every model appeared to stop learning on the training data, essentially stagnating toward the end. Below is a graphical representation of those models over the weekend. I realized that the training script I was using didn't explicitly reset the optimizer and learning rate scheduler after each batch as I intended. Therefore, as the learning rate kept shrinking, the model was no longer learning in the later batches. I adjusted my script to reset the parameters of both the optimizer and the scheduler after each batch of 20 classes and restarted many of the experiments. I also added more capabilities to load and save the models by creating dictionaries to track how the accuracy and omega values (against the default values of the offline model) change as the number of batches and classes increases. I hope to h
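Concretely, the reset amounts to re-creating the optimizer and scheduler at the start of every class batch so the learning rate starts from its full value again; a minimal sketch (placeholder model and hyperparameters), along with dictionaries of the kind used to track metrics per batch:

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 200)  # placeholder for the actual network

def fresh_optimizer_and_scheduler(model, lr=0.01):
    """Re-create the optimizer and LR scheduler so each class batch starts
    from the full learning rate instead of the decayed one."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
    return optimizer, scheduler

results = {"accuracy": {}, "omega": {}}   # track metrics as classes are added

for batch_idx in range(10):               # 10 batches of 20 classes each
    optimizer, scheduler = fresh_optimizer_and_scheduler(model)
    # ... train on the current batch of classes, stepping the scheduler each epoch ...
    results["accuracy"][batch_idx] = None  # filled in after evaluating this batch
```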

Day 15 (7/26/19): Testing Models with Rehearsal and L2-SP Regularization

Today I continued the experiments from yesterday, along with implementing an L2-SP model and partial rehearsal with baseline OOD. So far it seems that the performance of every model (both accuracy and area under the ROC curve) drops significantly as the number of classes learned increases. Implementing the more complex models such as SLDA, S-SVM, and L2-SP (EWC), as well as more accurate inference methods such as Mahalanobis, will be a challenge, but it will also be interesting to see how well they perform. The blue line represents the L2-SP model and the red line represents the full rehearsal model. These have only been trained for around three batches of 20 classes and will continue to learn overnight. However, the performance trend will most likely continue, with accuracy dropping as new classes are added. I also finished my presentation outline today, which can be found at this link: RIT Presentation Outline
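As a rough illustration of the L2-SP idea, the loss gets an extra term that pulls the current weights toward the weights learned on the previous batch (the "starting point") instead of toward zero; a minimal sketch with a placeholder network and an illustrative penalty weight:

```python
import torch
import torch.nn.functional as F

def l2sp_penalty(params, anchor_params, alpha=0.01):
    """Penalty that pulls current weights toward the previous batch's weights."""
    return alpha * sum(((p - p0) ** 2).sum() for p, p0 in zip(params, anchor_params))

model = torch.nn.Linear(512, 20)                            # placeholder network
anchor = [p.detach().clone() for p in model.parameters()]   # weights saved after the previous batch

logits = model(torch.randn(8, 512))
loss = F.cross_entropy(logits, torch.randint(0, 20, (8,)))
loss = loss + l2sp_penalty(model.parameters(), anchor)      # classification loss + L2-SP term
loss.backward()
```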

Day 14 (7/25/19): Running the Experiments

The models are still training today, although some results can already be seen in the early training batches. As expected, catastrophic forgetting is occurring very noticeably in the offline model, which is resulting in a drastic decrease in OOD performance as well. I am excited to see how the other models with more incremental learning capabilities turn out with regard to OOD performance. This graph shows the accuracy of the model on the training data through 3 batches. Notice how with each new batch of classes, performance drops off significantly as the model struggles to fit the new data. Tomorrow I hope to extend the experiments to include the Mahalanobis (distance in feature space) inference method in addition to the softmax thresholding method.
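For reference, the baseline softmax thresholding method scores each sample by its maximum softmax probability and flags low-confidence samples as unknown; a minimal sketch with placeholder logits (the threshold value is illustrative, not the one used in the experiments):

```python
import torch
import torch.nn.functional as F

def softmax_threshold_scores(logits):
    """Baseline OOD score: the maximum softmax probability.
    Low confidence suggests the sample may be out-of-distribution."""
    return F.softmax(logits, dim=1).max(dim=1).values

# Placeholder logits for in-distribution and OOD inputs
in_dist_logits = torch.randn(16, 20) * 3.0
ood_logits = torch.randn(16, 20)

scores_in = softmax_threshold_scores(in_dist_logits)
scores_ood = softmax_threshold_scores(ood_logits)

threshold = 0.5  # illustrative value; in practice chosen from a validation set
print((scores_ood < threshold).float().mean())  # fraction of OOD samples flagged as unknown
```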

Day 13 (7/24/19): Offline and Rehearsal Model Experiments

Today I began some experiments that I hope to include in my final project presentation. The main objective as of now is to figure out which incremental learning strategies yield the best out-of-distribution (OOD) performance. For the experiments I performed today, I trained all layers of the models in batches of 20 classes (10 batches for the 200 species in the CUB200 dataset) and evaluated OOD using a baseline softmax thresholding method. The performance metrics I hope to obtain are Omega alpha (how accurate the model is compared to the offline model) and Omega OOD (how accurate the model is at novelty detection compared to the offline model). *These models are currently still training, so I should have the results in the morning. During lunch I went to the seminar, which discussed ASL and specifically its importance here at RIT. I found the talk very interesting and even learned a few signs which might be useful someday. Tomorrow I hope to continue my work on this project and expand the
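For context on the setup, splitting the 200 CUB200 classes into 10 incremental batches of 20 is straightforward; a minimal sketch, assuming class labels 0-199:

```python
import numpy as np

num_classes, classes_per_batch = 200, 20
class_order = np.arange(num_classes)      # could also be a fixed random permutation
class_batches = np.split(class_order, num_classes // classes_per_batch)

# class_batches[0] holds the first 20 class labels, class_batches[1] the next 20, and so on
for i, classes in enumerate(class_batches):
    print(f"batch {i}: classes {classes[0]}..{classes[-1]}")
```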

Day 12 (7/23/19): Bounding Classifiers

Today I experimented with different bounded classifiers for open set recognition. A bounding classifier is essentially a type of model that can detect out-of-distribution (OOD) samples, i.e. when presented with an image of a class it has not been trained on. In the image below, a bounded classifier would identify images found in the less dense regions as unknowns rather than try to fit them to a previously learned class. I performed my experiment evaluating different bounding classifiers by testing a ResNet50's accuracy in detecting out-of-distribution samples (the original dataset is CUB200) drawn either from generated Gaussian noise or from the Oxford Flowers dataset. Here are two sample images from those respective dataloaders: The results I achieved were very similar to those shown in this table (the third row is the CUB200 dataset): Tomorrow I hope to finally begin to look into the intersection of incremental learning and open set recognition (having experimented with b
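To evaluate this kind of detector, the usual approach is to compute the AUROC over confidence scores from in-distribution and OOD samples; below is a minimal sketch with made-up scores (the real scores would come from the ResNet50, and the Gaussian-noise batch just shows how such inputs can be generated):

```python
import numpy as np
import torch
from sklearn.metrics import roc_auc_score

# A batch of Gaussian-noise "images" like the ones fed to the OOD dataloader
noise_batch = torch.randn(8, 3, 224, 224)

# Placeholder confidence scores: higher = more likely in-distribution (CUB200)
scores_cub = np.random.uniform(0.6, 1.0, size=500)    # in-distribution test images
scores_noise = np.random.uniform(0.0, 0.5, size=500)  # Gaussian-noise images

labels = np.concatenate([np.ones_like(scores_cub), np.zeros_like(scores_noise)])
scores = np.concatenate([scores_cub, scores_noise])
print("AUROC:", roc_auc_score(labels, scores))
```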

Day 11 (7/22/19): Partial Rehearsal and Out-Of-Distribution Recognition

This morning I reviewed the results of the different incremental learning models I started training over the weekend. Here are the results for five different methods. *Accuracy is computed as the average of the batch 1 and batch 2 accuracies to represent the overall test set. Omega is the average ratio between the accuracy of the incremental model vs. the offline model (offline accuracy with default hyperparameters: 0.7330).

No Regularization: Batch 1 --> Testing 45/45, Accuracy: 0.0241; Batch 2 --> Testing 46/46, Accuracy: 0.7702. Accuracy: 0.3972, Omega: 0.5418

L2 Regularization: Testing 45/45, Accuracy: 0.0132; Testing 46/46, Accuracy: 0.7868. Accuracy: 0.4000, Omega: 0.5457

L2-SP Regularization: Batch 1 --> Testing 47/47, Accuracy: 0.0000; Batch 2 --> Testing 45/45, Accuracy: 0.8185. Accuracy: 0.4093, Omega: 0.5583

Pseudo-Rehearsal (w/ random sampling of ten images per batch-1 class): Testing 45/45, Accuracy: 0.4166; Testing 46/46, Accuracy: 0.7988
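As a quick sanity check on these numbers, the overall accuracy and Omega for the no-regularization model follow directly from the two batch accuracies; a minimal sketch of the arithmetic, using the values reported above:

```python
offline_accuracy = 0.7330                 # offline baseline with default hyperparameters
batch_accuracies = [0.0241, 0.7702]       # no-regularization model, batch 1 and batch 2

overall = sum(batch_accuracies) / len(batch_accuracies)  # average over the two batches
omega = overall / offline_accuracy                       # ratio to the offline model
print(f"accuracy={overall:.4f}, omega={omega:.4f}")      # ~0.3972 and ~0.5418
```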

Day 10 (7/19/19): 2-Batch Learning with Regularization and Rehearsal

Today I was given the task of trying incremental learning on the CUB200 dataset in four different ways: 1) with no special regularization, 2) with L2-SP regularization, where the parameters learned on the first batch become the initialized parameters for the second batch, 3) with full rehearsal, and 4) with partial rehearsal, implementing the herding method to find a fixed buffer of k samples after training on the first batch. Using default values for the ResNet18, I first evaluated the offline learning accuracy to be around 73%. Then, I used an adjusted dataloader function to split the dataset into two batches to be trained incrementally. After setting the regularization term weights to 0, I trained the new incremental model with no special regularization and got an accuracy of 39.72%. This drop in accuracy demonstrates the phenomenon of catastrophic forgetting associated with incremental learning, as this accuracy yields an Omega value of 0.5418 using the metric definition below. N

Day 9 (7/18/19): Incrementally Learning CUB200

Today I continued my work learning about incremental learning models by testing out different strategies on the CUB200 dataset. From what I understand from reading various articles, there seem to be five different approaches to mitigating catastrophic forgetting in lifelong learning models. These are regularization methods (adding constraints to a network's weights), ensemble methods (training multiple classifiers and combining them), rehearsal methods (mixing old data with data from the current session), dual-memory methods (based on the human brain, with a fast learner and a slow learner), and sparse-coding methods (reducing the interference with previously learned representations). All of these methods have their constraints, and I don't believe it is yet clear which method (or which combination of different methods) is best. Full rehearsal obviously seems to be the most effective at making the model remember what it had previously learned, but given that all training exam
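To make the rehearsal idea concrete, here is a minimal sketch of mixing a stored buffer of old examples with the current session's data before training; the data here are just placeholder (image, label) pairs:

```python
import random

# Partial rehearsal: keep a small buffer of old examples and mix it with the
# current session's data before training (placeholder (image, label) pairs).
old_exemplar_buffer = [(f"old_img_{i}", i % 20) for i in range(200)]      # stored from earlier batches
new_session_data = [(f"new_img_{i}", 20 + i % 20) for i in range(1000)]   # current batch of classes

mixed_training_set = old_exemplar_buffer + new_session_data
random.shuffle(mixed_training_set)  # interleave old and new so training batches contain both
print(len(mixed_training_set))      # 1200 examples fed to the next training session
```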

Day 8 (7/17/19): Incremental Lifelong Learning

I spent the majority of today reading recent papers outlining various approaches to achieving effective incremental learning in deep learning models. There seems to be a wide variety of proposed systems, with no general consensus on which is best or how to evaluate the different models. In fact, incremental learning does not always mean the same thing in different papers, because some models incrementally learn classes while others incrementally learn datasets or even stray from the batch setting altogether by learning from streaming data. As a result, there does not yet exist a standardized way of evaluating which models are actually best at achieving lifelong learning, because they are often tested on significantly different tasks. After lunch, my advisor and I went through a presentation made by Tyler Hayes, another PhD student in the lab. It discussed many of the problems and proposed solutions which I was reading about and explained the focus of a lot of the research the kLab is doi

Day 7 (7/16/19): Transfer Learning with Stanford Dogs

Today I continued my work with transfer learning from yesterday and applied it to new data: the Stanford Dogs dataset. This contains images representing 120 different species of dogs, and I trained a model to classify them that was similar to the one I used for CUB-200. I realized an error I had been making before: I was computing the cross-entropy loss while also passing the output of the fully connected layer through a log-softmax function. As a result, I was actually applying the log-softmax twice, which was affecting the training of my model's parameters. After fixing that structure, I achieved ~83% accuracy classifying the dog species using fixed feature extraction on ResNet18, although I hope to improve further in the future. There are a number of hyperparameters I can play around with, such as the transformations for data augmentation, the learning rate for the optimizer, the criterion and optimizer functions themselves, the batch size of the training samples, etc. Along with twe
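The pitfall here is that PyTorch's CrossEntropyLoss already applies a log-softmax internally, so adding an explicit LogSoftmax on top applies it twice; a minimal sketch of the buggy vs. fixed setup, with a placeholder final layer and random inputs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

fc = nn.Linear(512, 120)                 # placeholder final layer for the 120 dog classes
features = torch.randn(8, 512)
targets = torch.randint(0, 120, (8,))

logits = fc(features)

# Buggy version: CrossEntropyLoss already applies log-softmax internally,
# so passing log-softmax outputs into it applies the log-softmax twice.
buggy_loss = nn.CrossEntropyLoss()(F.log_softmax(logits, dim=1), targets)

# Fixed version: feed raw logits to CrossEntropyLoss
# (or, equivalently, pair log_softmax with NLLLoss).
loss = nn.CrossEntropyLoss()(logits, targets)
```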

Day 6 (7/15/19): Transfer Learning with CUB-200

Today I started practicing transfer learning with CNNs. This process involves pretraining a CNN on a very large dataset (e.g. ImageNet, which contains 1.2 million images in 1000 categories) and then using that pretrained model either as an initialization or as a fixed feature extractor for a new task (e.g. classifying images from a new dataset). This technique is actually much more widely used in practice than training a CNN from scratch, because it is rare to have a dataset of sufficient size for your task, and training a model on a dataset like ImageNet would take weeks. The dataset I started working with is called Caltech-UCSD Birds 200 (CUB-200), which contains images representing 200 different species of birds. First, I loaded the dataset and split the data into train and test sets. Using the transform function, I implemented data augmentation, a strategy to significantly increase the diversity of the data without actually collecting new data. Then, using PyTorch, I loaded a pre
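A minimal sketch of this fixed-feature-extraction setup, assuming torchvision's pretrained ResNet18 and illustrative augmentation choices (the exact transforms and hyperparameters I used may differ):

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Data augmentation for the training split (illustrative choices)
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Load an ImageNet-pretrained ResNet18 and freeze it as a fixed feature extractor
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for the 200 bird classes; only this layer is trained
model.fc = nn.Linear(model.fc.in_features, 200)
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
```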

Day 5 (7/12/19): Convolutional Neural Networks in Pytorch

I started off today by reading a few papers written on topics related to my project of combining open set recognition with incremental learning. I understand the goals and challenges associated with creating a CNN with these capabilities, but I am not yet sure exactly how I will implement them concretely in a model to experiment with. I spent the rest of the morning going through Andrew Ng's course on Convolutional Neural Networks. I feel like I learned a lot about how convolutions work, how to control padding and stride, and the structure of notable case studies such as LeNet-5, AlexNet, VGGNet, and ResNet. These open-source networks are helpful places to start when I implement my own CNNs because they come with decent hyperparameters already chosen, which can then be fine-tuned to fit a specific problem. (Figure: visualization of the VGG-19 vs. plain network vs. residual network structure.) Around noon, the other interns and I listened to a seminar given by David Messin
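One formula worth writing down from the course: the spatial output size of a convolution with input size n, filter size f, padding p, and stride s is floor((n + 2p - f) / s) + 1. A tiny sketch:

```python
def conv_output_size(n, f, p, s):
    """Spatial output size of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# e.g. a 224x224 input with a 7x7 filter, padding 3, stride 2 (ResNet's first layer)
print(conv_output_size(224, 7, 3, 2))  # 112
```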

Day 4 (7/11/19): Convolutions and Image Enhancement

I started off today working through a few computer vision homework problems involving training classifiers on different datasets, discussing the limitations and potential solutions associated with the various classifiers, implementing histogram equalization for image enhancement (altering the contrast as well as the gamma value), and experimenting with linear filtering to apply different convolutions to images. (Figures: the original image; two images with diagonal convolutions applied, where the filters highlight the diagonal lines in the image in different directions; the original image manipulated by changing the contrast and gamma values; the graph on the right displays the histogram of pixel values separated by RGB values, and the graph on the left shows the normalized and cumulative histogram of t
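For reference, here is a minimal NumPy sketch of the two enhancement operations (gamma correction and histogram equalization via the normalized cumulative histogram), using a random placeholder image rather than the actual homework images:

```python
import numpy as np

def adjust_gamma(image, gamma=1.0):
    """Gamma correction on an image with values in [0, 255]."""
    normalized = image / 255.0
    return np.uint8(255 * normalized ** gamma)

def equalize_histogram(gray):
    """Histogram equalization via the normalized cumulative histogram."""
    hist, _ = np.histogram(gray.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf = 255 * cdf / cdf[-1]      # normalized cumulative histogram
    return np.uint8(cdf[gray])     # remap each pixel through the CDF

image = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # placeholder grayscale image
brighter = adjust_gamma(image, gamma=0.5)
equalized = equalize_histogram(image)
```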

Day 3 (7/10/19): Experimentation with Classifiers

Today I took a step toward a more specific focus on machine learning and neural networks. After quickly reviewing yesterday's material through DataCamp exercises on Python and NumPy, I spent most of the morning completing tutorials on implementing various classifiers using the machine learning library scikit-learn. First, using the simple K Nearest Neighbors classifier, I learned how to fit classifiers to data, use them to predict targets for new test data, and evaluate the accuracy and errors of each classifier using confusion matrices and classification reports. Then, I applied what I learned to both the Iris (classifying three types of flowers) and MNIST (classifying handwritten digits) datasets, implementing different types of models such as Support Vector Machine (both linear and RBF), Gaussian Process, and Multilayer Perceptron classifiers. In the afternoon, I watched a YouTube playlist which provided an introduction to neural networks, including a b
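A minimal sketch of that scikit-learn workflow on the Iris dataset: fit a K Nearest Neighbors classifier, predict on held-out data, and evaluate with a confusion matrix and classification report (split size and number of neighbors are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Load the Iris dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit a K Nearest Neighbors classifier and predict the held-out flowers
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```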

Day 2 (7/9/19): Mathematics and Data Science Review

After a brief meeting with the other interns in the morning (and receiving our prizes from the team building exercise yesterday), I started working in the Machine Vision Lab to establish a solid foundation in the math and programming packages that will be essential for this internship. After setting up my workspace and resolving a few operating system issues, we successfully installed all the necessary software, such as Anaconda, Spyder, NumPy, and PyTorch, for me to begin experimenting and using the learning resources. I started off reading a few articles and completing an edX course titled "Essential Math for Machine Learning - Python Edition". These materials reviewed the fundamentals of linear algebra (vectors, matrices, tensors, and their operations), calculus (multivariate differentiation, integration, etc.), and statistics/probability (measures of central tendency and variance, confidence intervals, sampling distributions, and hypothesis testing).

Day 1 (7/8/19): Introduction

Today was my first day participating in the summer internship program for high school students at the RIT Imaging Center. The morning was spent completing the necessary in-processing, such as setting up my computer account, ID, and parking permit. After we completed this, the next four hours were spent on a team building exercise consisting of a scavenger hunt all around campus. We created a final video product to present to three judges, and although we didn't achieve as many points as we had hoped, the experience was a lot of fun and a great way to get to know the other interns. Finally, the rest of the day was spent talking with our research lab mentors. I was introduced to the different topics I will spend the rest of this week learning, including Linear Algebra, NumPy, Computer Vision, Machine Learning, Deep Learning, and Neural Networks. Establishing a solid foundation in these topics will allow me to complete valuable research in the lab in the future. Overall,