Intel® Distribution for Python*
Engage in discussions with community peers related to Python* applications and core computational packages.

Training data with caffe and python in Colfax (terminal): Do you have an example?

CBell1
New Contributor II

Hello,

I am unsure how to define CAFFE_PATH= and the data path that I have to write in my Python program in order to train on data in my Colfax home directory.

Do you have an example?

Thank you

26 Replies
CBell1
New Contributor II

Hi Deepthi,

I tried to execute the solver (myapplication2.py previously attached) using the train_val.prototxt file saved in my caffe_path.

The error now is:

[...]

I0919 07:48:29.793191 150110 net.cpp:1238] data -> label

F0919 07:48:29.793673 150134 db_lmdb.hpp:53] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory

*** Check failure stack trace: ***

/var/spool/torque/mom_priv/jobs/165768.c009.SC: line 3: 150110 Aborted python3 myapplication2.py

I have attached the error log and the train_val.prototxt.

Thank you

Cosma

idata
Employee

Hi Cosma,

This error occurs because Caffe cannot find the LMDB at the path specified. Since you are using the train_val.prototxt from Intel Caffe GoogLeNet, the LMDB path points to the examples/imagenet/ folder, which is not present in ncappzoo.
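
A quick way to catch this before submitting a job is to check that every path named in the prototxt exists. The following is only a minimal sketch: it assumes each data layer names its database on a line of the form `source: "..."` and that an LMDB is stored as a directory. The function name `find_missing_sources` and the sample fragment are hypothetical and are not taken from the attached files.

```python
import os
import re

# Assumption: each data layer names its database on a line like
#   source: "examples/imagenet/ilsvrc12_train_lmdb"
SOURCE_RE = re.compile(r'^\s*source\s*:\s*"([^"]+)"', re.MULTILINE)


def find_missing_sources(prototxt_text, base_dir="."):
    """Return the `source` paths that are not existing directories under base_dir."""
    missing = []
    for path in SOURCE_RE.findall(prototxt_text):
        # Relative paths are resolved against the directory the job runs from.
        full_path = path if os.path.isabs(path) else os.path.join(base_dir, path)
        if not os.path.isdir(full_path):
            missing.append(path)
    return missing


# Hypothetical fragment in the style of a Caffe data layer (not the attached file).
SAMPLE = '''
layer {
  name: "data"
  type: "Data"
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    backend: LMDB
  }
}
'''

for path in find_missing_sources(SAMPLE):
    print("missing LMDB directory:", path)
```

Relative paths are resolved against `base_dir`, so it should be set to the directory from which the training script is launched.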

This can be resolved in two ways.

1. Use the train_val.prototxt available in ncappzoo.

2. If you want to use the train_val.prototxt from Intel Caffe GoogLeNet, set the correct train and test LMDB paths in train_val.prototxt.

Currently the paths are examples/imagenet/ilsvrc12_train_lmdb and examples/imagenet/ilsvrc12_val_lmdb. Please change them to data/dogsvscats_train_lmdb and data/dogsvscats_val_lmdb (the LMDBs generated using the Makefile).
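
The path change described above can also be scripted instead of edited by hand. This is a rough sketch rather than an official tool: the helper name `retarget_lmdb_paths` is hypothetical, and it assumes the paths appear in the prototxt as double-quoted strings. The old and new paths are the ones given in this reply.

```python
# Old -> new LMDB paths, as given in the reply above.
PATH_MAP = {
    "examples/imagenet/ilsvrc12_train_lmdb": "data/dogsvscats_train_lmdb",
    "examples/imagenet/ilsvrc12_val_lmdb": "data/dogsvscats_val_lmdb",
}


def retarget_lmdb_paths(prototxt_text, path_map=PATH_MAP):
    """Return the prototxt text with each quoted old path replaced by its new path."""
    for old, new in path_map.items():
        # Match the surrounding quotes so only whole path strings are replaced.
        prototxt_text = prototxt_text.replace(f'"{old}"', f'"{new}"')
    return prototxt_text


# Hypothetical one-line fragment; a real train_val.prototxt has many more fields.
sample = 'source: "examples/imagenet/ilsvrc12_train_lmdb"'
print(retarget_lmdb_paths(sample))
```

To apply it to a real file, read train_val.prototxt into a string, pass it through the function, and write the result to a new file so the original stays available for comparison.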

Hope this helps.

Regards,

Deepthi Raj
CBell1
New Contributor II

Hi Deepthi,

I used the local data/dogsvscats_train_lmdb and data/dogsvscats_val_lmdb paths in train_val.prototxt.

The job has been running for 2 hours now, so fingers crossed.

While waiting for the result, I have some questions:

What is the role of create-lmdb.sh, which the Makefile uses for dataset preparation, versus train_val.prototxt, which is used for training?

Should these files use the same image resize parameters?

How do I determine the memory size needed on a cluster for training?

Does the memory size depend on the image size parameters written in create-lmdb.sh and/or train_val.prototxt?

Can you suggest training courses to build practical skills in customizing dataset preparation and training with Caffe and TensorFlow on cluster nodes such as Colfax or on external resources?

(Obviously, I'll keep you informed when the job has finished.)

Thank you

Regards

Cosma

CBell1
New Contributor II

Hi Deepthi,

Mission accomplished!

Dataset preparation (Makefile + create-lmdb.sh) and training (solver.prototxt + train_val.prototxt) are done!

myjob2.py.e165929

...

I0920 05:39:27.024672 242270 solver.cpp:443] Optimization Done.

myjob2.py.o165929

...

#

# End of output for job 165929.c009

# Date: Thu Sep 20 05:39:27 PDT 2018

#

Many thanks for your great support!

Cosma

idata
Employee

Hi Cosma,

Happy to hear that it worked for you :)

It would be great if you could open a new thread for your other questions (such as those about create_lmdb and train_val.prototxt) so that we can close this one. That will help us assist you better, since the original query has been resolved.

Please feel free to contact us in case of any issues :)

Regards,

Deepthi Raj
CBell1
New Contributor II

Hi Deepthi,

OK! I'll open a dedicated case about my questions.

Please close this case as solved.

Many thanks again for your support.

Cosma
