The models are still training today, although some results are already visible from the early training batches. As expected, catastrophic forgetting is very noticeable in the offline model, which is also causing a drastic drop in OOD performance. I am excited to see how the other models, which have more incremental learning capabilities, turn out with regard to OOD performance.
This graph shows the model's accuracy on the training data across 3 batches. Notice how performance drops off significantly with each new batch of classes as the model struggles to fit the new data.
Tomorrow I hope to extend the experiments to include the Mahalanobis (distance in feature space) inference method in addition to the Softmax Thresholding method.
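To make the difference between the two inference methods concrete, here is a minimal NumPy sketch of how each OOD score could be computed. It is not the code from these experiments: the function names are hypothetical, and the shared (tied) covariance across classes is an assumption I am making for the Mahalanobis variant. The threshold applied to either score would still need to be chosen on held-out data.

```python
import numpy as np

def softmax_confidence(logits):
    """Max softmax probability per sample; low values suggest OOD."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.max(axis=1)

def fit_class_gaussians(features, labels):
    """Per-class means plus the inverse of a shared covariance (assumption)."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center every sample on the mean of its own class.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    cov += 1e-6 * np.eye(features.shape[1])  # small ridge so the inverse exists
    return means, np.linalg.inv(cov)

def mahalanobis_score(features, means, cov_inv):
    """Smallest squared Mahalanobis distance to any class mean; high suggests OOD."""
    diffs = features[:, None, :] - means[None, :, :]  # (samples, classes, dims)
    d2 = np.einsum("ncd,de,nce->nc", diffs, cov_inv, diffs)
    return d2.min(axis=1)
```

With Softmax Thresholding a sample is flagged as OOD when its confidence falls below a threshold, while with the Mahalanobis score a sample is flagged when its distance in feature space rises above one.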