Hi,
I decided to try to train a neural network to recognize hand-written symbols using the sample from https://software.intel.com/en-us/node/682103#DAAL-EXAMPLE-CPP-NEURAL_NETWORK_BATCH,
and I have several questions:
1. Can you give me an example of how to set the input data for this call:
net.input.set(training::data, trainingData);
from images in standard formats like ".jpg", ".png", etc.? I want to use my own image collection, not MNIST.
2. How can I set labels (answers) for the testing and training image collections?
3. Is it possible to send images for training one by one instead of the whole batch at once?
4. Is it possible to decrease the learning rate at every training iteration?
Best regards,
Alexander Smirnov
Hi Alexander,
The DAAL examples provide sample code and data sets under the DAAL install folder; you may refer to them directly.
In the original example the training data comes from *.csv files:
/* Read training data set from a .csv file and create a tensor to store input data */
TensorPtr trainingData = readTensorFromCSV(trainDatasetFile);
TensorPtr trainingGroundTruth = readTensorFromCSV(trainGroundTruthFile);
readTensorFromCSV() is defined in service.h.
So you essentially need a readTensorFromJPG(trainDatasetFile) of your own, or you can convert your JPG files to CSV files and then call the existing code unchanged.
What tools do you use to decode the JPG or PNG images? For example, in the webinar https://www.codeproject.com/Articles/1151612/A-Performance-Library-for-Data-Analytics-and-Machi we use a Python package to read and save PNG images. To get the image data, you could use one of the scripting tools to convert all JPG files to CSV files, then feed them into the net.
Alternatively, rewrite daal::services::SharedPtr<Tensor> readTensorFromCSV(const std::string &datasetFileName) as a readTensorFromJPG(): add JPG decoding code (for example, if you are using OpenCV, imread() should get you the image data), then decide what size and channels to put into the tensor array. See the sketch below.
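Here is a rough sketch of such a function (this is not code from the DAAL samples; the OpenCV calls and the 2D tensor layout are assumptions, and error handling is omitted):
#include <opencv2/opencv.hpp> /* assumption: OpenCV 3 is available for decoding */
/* Sketch: decode one grayscale JPG/PNG with OpenCV and copy its pixels
   into a 2D DAAL tensor of shape (rows, columns) */
daal::services::SharedPtr<Tensor> readTensorFromJPG(const std::string &datasetFileName)
{
    cv::Mat img = cv::imread(datasetFileName, cv::IMREAD_GRAYSCALE);
    daal::services::Collection<size_t> dims;
    dims.push_back((size_t)img.rows);
    dims.push_back((size_t)img.cols);
    HomogenTensor<float> *tensor = new HomogenTensor<float>(dims, Tensor::doAllocate);
    float *tensorData = tensor->getArray();
    for (int r = 0; r < img.rows; r++)
    {
        for (int c = 0; c < img.cols; c++)
        {
            tensorData[r * img.cols + c] = (float)img.at<uchar>(r, c);
        }
    }
    return daal::services::SharedPtr<Tensor>(tensor);
}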
Here is an example of one image's feature data:
27,51,0,-73,21,47,64,-88,84,70,-43,96,75,-34,97,-37,-22,-49,67,79
2. How can I set labels (answers) for testing and training image collections?
You may encode the labels (answers) as integers, as shown below; for example, use a Python script to write them out while you read the images. Example labels:
0,
0,
1,
0,
0,
3. Is it possible to send images for training one by one instead of the whole batch at once?
It is possible: you may write a loop outside of the net and process images one by one, much like the distributed (streaming) model. But for good performance, if you have all your images ready, process them in batches:
for (/* each image */)
{
    net.input.set(training::data, trainingData);
    net.input.set(training::groundTruth, trainingGroundTruth);
    /* Run the neural network training */
    net.compute();
}
4. Is it possible to decrease the learning rate at every training iteration?
We will check on this and follow up later.
Best Regards,
Ying
# From the webinar demo (a wxPython drawing app): save the doodle to a PNG
# file, then load it back as a binarized (0/1) grayscale array.
import numpy as np
import wx
from PIL import Image

def processImage(self):
    ## save png file to binaries: 0, 1
    self.doodle.buffer.SaveFile('./test_digit.png', wx.BITMAP_TYPE_PNG)
    # Load as 8-bit grayscale ('L'); ink pixels become 1, background 0
    self.im = np.array(Image.open('./test_digit.png').convert('L'))
    self.im = 1 * np.logical_not(self.im)
daal::services::SharedPtr<Tensor> readTensorFromCSV(const std::string &datasetFileName)
{
    FileDataSource<CSVFeatureManager> dataSource(datasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext);
    dataSource.loadDataBlock();
    daal::services::SharedPtr<HomogenNumericTable<double> > ntPtr =
        daal::services::staticPointerCast<HomogenNumericTable<double>, NumericTable>(dataSource.getNumericTable());
    daal::services::Collection<size_t> dims;
    dims.push_back(ntPtr->getNumberOfRows());
    size_t size = dims[0];
    if (ntPtr->getNumberOfColumns() > 1)
    {
        dims.push_back(ntPtr->getNumberOfColumns());
        size *= dims[1];
    }
    HomogenTensor<float> *tensor = new HomogenTensor<float>( dims, Tensor::doAllocate );
    float *tensorData = tensor->getArray();
    double *ntData = ntPtr->getArray();
    /* Copy and convert each element; the forum paste had dropped the [i] indices */
    for(size_t i = 0; i < size; i++)
    {
        tensorData[i] = (float)ntData[i];
    }
    daal::services::SharedPtr<Tensor> tensorPtr(tensor);
    return tensorPtr;
}
Hi Ying,
Thanks for your reply.
I tried to set the input training and testing data from CSV using this function:
daal::services::SharedPtr<Tensor> readTensorFromCSV(const std::string &datasetFileName)
{
    FileDataSource<CSVFeatureManager> dataSource(datasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext);
    dataSource.loadDataBlock();
    daal::services::SharedPtr<HomogenNumericTable<double> > ntPtr =
        daal::services::staticPointerCast<HomogenNumericTable<double>, NumericTable>(dataSource.getNumericTable());
    daal::services::Collection<size_t> dims;
    dims.push_back(ntPtr->getNumberOfRows());
    size_t size = dims[0];
    if (ntPtr->getNumberOfColumns() > 1)
    {
        dims.push_back(ntPtr->getNumberOfColumns());
        size *= dims[1];
    }
    HomogenTensor<float> *tensor = new HomogenTensor<float>( dims, Tensor::doAllocate );
    float *tensorData = tensor->getArray();
    double *ntData = ntPtr->getArray();
    for(size_t i = 0; i < size; i++)
    {
        tensorData[i] = (float)ntData[i];
    }
    daal::services::SharedPtr<Tensor> tensorPtr(tensor);
    return tensorPtr;
}
I created a training CSV data file from some 28x28, 1-channel images. For example, the training data looks like this:
1 row: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,97,255,164,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,185,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,254,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,254,254,107,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100,254,254,28,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,110,254,254,28,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,110,254,254,80,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,110,254,254,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,197,254,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,14,218,254,254,210,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,254,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,254,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,254,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,19,236,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,202,254,189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,202,254,238,12,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,188,254,254,16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,191,254,254,16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,202,254,254,16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,61,215,166,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2 row: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,226,153,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,59,249,223,12,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,135,254,254,20,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,14,243,255,217,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,254,254,106,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,118,254,242,37,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,44,244,254,124,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,169,254,254,85,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,83,254,254,228,42,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,106,254,246,37,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,23,206,254,198,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,150,254,240,68,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,89,237,254,149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,212,254,223,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,178,255,250,96,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,114,250,254,173,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,179,254,254,190,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,111,254,254,240,39,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,220,254,239,59,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,153,231,37,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...
and the training labels like this:
Row 1: 0,
Row 2: 2,
...
The number of rows in the two files is equal.
For testing I did the same.
Python script for generating the CSV files:
import os
from PIL import Image
from array import *
from random import shuffle
from __builtin__ import list

# Load from and save to
Names = [['./training-images', 'train'], ['./test-images', 'test']]

for name in Names:
    FileList = []
    for dirname in os.listdir(name[0]):
        path = os.path.join(name[0], dirname)
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append(os.path.join(name[0], dirname, filename))
    shuffle(FileList)
    for filename in FileList:
        # The directory name is the label, e.g. './training-images\3\img.png' -> 3
        label = int(filename.split('\\')[1])
        Im = Image.open(filename)
        pixel = Im.load()
        width, height = Im.size
        pixels = []
        for x in range(0, width):
            for y in range(0, height):
                pixels.append(pixel[y, x])
        with open(name[1] + '.csv', 'a+') as outfile:
            outfile.write(','.join([str(i) for i in pixels]) + "\n")
        with open(name[1] + '_labels.csv', 'a+') as outfile:
            outfile.write(str(label) + ',' + "\n")
So after that I set the input data:
_trainingData = readTensorFromCSV(datasetFileNamesCSV[0]);
_testingData = readTensorFromCSV(datasetFileNamesCSV[2]);
_trainingGroundTruth = readTensorFromCSV(datasetFileNamesCSV[1]);
_testingGroundTruth = readTensorFromCSV(datasetFileNamesCSV[3]);
My problem is an exception, daal::services::interface1::Exception at memory location 0x00F3FAC0,
thrown at the net.compute(); call in this function:
void train()
{
    const size_t _batchSize = 1;
    double learningRate = 0.01;
    SharedPtr<optimization_solver::sgd::Batch<float> > sgdAlgorithm(new optimization_solver::sgd::Batch<float>());
    (*(HomogenNumericTable<double>::cast(sgdAlgorithm->parameter.learningRateSequence)))[0][0] = learningRate;
    training::TopologyPtr topology = configureNet();
    training::Batch<> net;
    net.parameter.batchSize = _batchSize;
    net.parameter.optimizationSolver = sgdAlgorithm;
    //net.parameter.optimizationSolver->parameter->nIterations = 1;
    net.initialize(_trainingData->getDimensions(), *topology);
    net.input.set(training::data, _trainingData);
    net.input.set(training::groundTruth, _trainingGroundTruth);
    net.compute();
    _predictionModel = net.getResult()->get(training::model)->getPredictionModel<double>();
}
What am I doing wrong?
Hi Alexander,
Can you please clarify the NN topology you use in the example?
In your code snippet for reading the CSV file, the output tensor has two dimensions, with shape = (rows, columns). If the first layer of your topology is a convolution, the input to this layer should have four dimensions, with shape = (batch_size, channels, rows, columns), so you need to "reshape" the tensor you get from the CSV.
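For illustration, here is a rough sketch of a CSV reader that produces the four-dimensional tensor directly, assuming each CSV row holds channels*height*width pixel values for one image (the name readImageTensorFromCSV is ours, not part of service.h):
/* Sketch: read nImages rows from CSV and produce a tensor of shape
   (nImages, channels, rows, columns), suitable as input to a convolution layer */
daal::services::SharedPtr<Tensor> readImageTensorFromCSV(const std::string &datasetFileName,
                                                         size_t channels, size_t height, size_t width)
{
    FileDataSource<CSVFeatureManager> dataSource(datasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext);
    dataSource.loadDataBlock();
    daal::services::SharedPtr<HomogenNumericTable<double> > ntPtr =
        daal::services::staticPointerCast<HomogenNumericTable<double>, NumericTable>(dataSource.getNumericTable());
    const size_t nImages = ntPtr->getNumberOfRows();
    /* Four dimensions: (batch_size, channels, rows, columns) */
    daal::services::Collection<size_t> dims;
    dims.push_back(nImages);
    dims.push_back(channels);
    dims.push_back(height);
    dims.push_back(width);
    HomogenTensor<float> *tensor = new HomogenTensor<float>(dims, Tensor::doAllocate);
    float *tensorData = tensor->getArray();
    double *ntData = ntPtr->getArray();
    const size_t size = nImages * channels * height * width;
    for (size_t i = 0; i < size; i++)
    {
        tensorData[i] = (float)ntData[i];
    }
    return daal::services::SharedPtr<Tensor>(tensor);
}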
Please let me know if you have more questions on this topic, and we will gladly help you.
Ruslan
Hi Alexander,
This is the answer to:
4. Is it possible to decrease the learning rate at every training iteration?
With the present version of Intel DAAL you can modify the learning rate on each iteration of the solver by running a loop over the set of batches in the dataset, as demonstrated in the code snippet below:
…
SharedPtr<optimization_solver::sgd::Batch<float> > sgdAlgorithm(new optimization_solver::sgd::Batch<float>());
net.parameter.optimizationSolver = sgdAlgorithm;
…
for (size_t i = 0; i < nBatches; i++)
{
    /* Fill in trainingDataArray with the next batch of data and trainingGroundTruthArray
       with the respective values of the ground truth for that batch */
    …
    net.input.set(training::data, trainingDataArray);
    net.input.set(training::groundTruth, trainingGroundTruthArray);
    (*(HomogenNumericTable<double>::cast(sgdAlgorithm->parameter.learningRateSequence)))[0][0] = nextLearningRateValue;
    net.compute();
}
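For example, nextLearningRateValue could be computed inside the loop body with a simple inverse-time decay schedule (the schedule itself is just an illustration; Intel DAAL does not prescribe one):
/* Sketch: decay the learning rate on every iteration,
   e.g. 0.01, 0.005, 0.00333, ... for initialRate = 0.01, decay = 1.0 */
const double initialRate = 0.01;
const double decay = 1.0;
double nextLearningRateValue = initialRate / (1.0 + decay * (double)i);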
In the longer term we are considering options to extend the library to hide this logic inside the compute() method of the neural network object.
Please let us know if this answers your question.
