I'm following @AshwinVijayakumar's instructions (https://movidius.github.io/blog/deploying-custom-caffe-models/) for training a neural network. Here is what I do:
- Split the images into training and validation sets (the Stanford Cars dataset from Kaggle, which identifies car models)
- Put the images in an LMDB file, scaled to 224x224
- Make some minor changes to the GoogLeNet training prototxt file (drop the number of classes from 1000 to 196)
- Start training. I'm not using transfer learning, although when I do, I see the same results as below.
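To make the prototxt change concrete, here is a minimal sketch of the kind of edit I mean. The layer fragment is an illustrative stand-in written in the style of the GoogLeNet training prototxt, not a copy of my actual file, and `set_num_classes` is a helper name I made up for this post. I'd expect the full file to need this change in each of its three classifier layers, but treat that count as my assumption.

```python
import re

# Illustrative fragment only (assumption): one classifier layer in the
# style of a GoogLeNet train_val.prototxt, not my real file.
SAMPLE = '''layer {
  name: "loss3/classifier"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier"
  inner_product_param {
    num_output: 1000
  }
}
'''

def set_num_classes(prototxt_text, old=1000, new=196):
    """Rewrite every 'num_output: <old>' to 'num_output: <new>'.

    Returns the patched text and the number of replacements made.
    """
    pattern = re.compile(r"(num_output:\s*)%d\b" % old)
    return pattern.subn(r"\g<1>%d" % new, prototxt_text)

patched, count = set_num_classes(SAMPLE)
print(count)
```

The replacement count is a quick sanity check that no classifier layer was left at 1000 outputs.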
I didn't change the image mean, since the calculated mean is about the same as the default. What I see is that loss3 quickly hits ~5 and stays there forever. This can't possibly be right. I'm training on an EC2 instance with a GPU, but there are 6k images in the set.
I0926 17:54:09.614917 110 solver.cpp:237] Train net output #0: loss1/loss1 = 5.30372 (* 0.3 = 1.59112 loss)
I0926 17:54:09.614935 110 solver.cpp:237] Train net output #1: loss2/loss2 = 5.24625 (* 0.3 = 1.57387 loss)
I0926 17:54:09.614946 110 solver.cpp:237] Train net output #2: loss3/loss3 = 5.26799 (* 1 = 5.26799 loss)
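One thing I noticed while checking these numbers: a softmax classifier that guesses uniformly across 196 classes has a cross-entropy loss of ln(196). A minimal sketch of that arithmetic, assuming the loss is reported in natural-log units (the function name is mine, not something from Caffe):

```python
import math

def chance_level_loss(num_classes):
    """Cross-entropy (natural log) of a uniform guess over num_classes."""
    return -math.log(1.0 / num_classes)

baseline = chance_level_loss(196)
print(round(baseline, 3))  # 5.278
```

So the reported loss3 of 5.26799 is essentially at chance level, which suggests the network isn't learning anything at all rather than learning slowly.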
I've been stuck on this for a couple of days now and I'm tearing out my hair. Any help, suggestions, or speculations are deeply appreciated.