When I add hidden layers or change the number of neurons in the last layer in
https://github.com/01org/daal/blob/daal_2018_beta_update1/examples/java/com/intel/daal/examples/neural_networks/NeuralNetConfiguratorDistr.java
I get NaN values.
I am running this Java file:
https://github.com/01org/daal/blob/daal_2018_beta_update1/examples/java/com/intel/daal/examples/neural_networks/NeuralNetDenseDistr.java
I am testing the neural network on MNIST data, which contains 10 labels, so I am changing the number of neurons to 10 on line 60 of NeuralNetConfiguratorDistr.java.
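For reference, the change described amounts to creating the last fully-connected layer with 10 output neurons, roughly as follows (names follow the example's conventions; a sketch, not the exact contents of line 60):

    /* Last fully-connected layer sized to the number of MNIST labels (10) */
    FullyConnectedBatch fullyconnectedLayer3 =
        new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 10);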
Hello Mayank J,
I assume this is the same issue that was recently reported on the Intel® DAAL GitHub (https://github.com/01org/daal/issues/18).
We are analyzing the issue on our side. Please give us several days and we will get back to you with the results.
Best regards,
Victoriya
Per our analysis, there is a bug in the distributed training of neural networks in the current version of Intel DAAL: weights and biases are not properly initialized at the beginning of the computations.
We plan to fix it in a future version of the library.
As a workaround, add the following piece of code to the example NeuralNetDenseDistr.java at line 156:

    if (i == 0) {
        /* Retrieve training model of the neural network on master node */
        TrainingModel trainingModelOnMaster = net.getResult().get(TrainingResultId.model);

        /* Retrieve training model of the neural network on local node */
        TrainingModel trainingModelOnLocal = netLocal[0].input.get(DistributedStep1LocalInputId.inputModel);

        /* Set weights and biases on master node using the weights and biases from local node */
        trainingModelOnMaster.setWeightsAndBiases(trainingModelOnLocal.getWeightsAndBiases());

        /* Set initialization flag parameter as true in all forward layers of the training model on master node */
        ForwardLayers forwardLayers = trainingModelOnMaster.getForwardLayers();
        for (int j = 0; j < forwardLayers.size(); j++) {
            forwardLayers.get(j).getLayerParameter().setWeightsAndBiasesInitializationFlag(true);
        }
    }
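For context: judging by the "if (i == 0)" guard, the snippet is presumably meant to sit inside the example's loop over training iterations (with i as the iteration index), so the master's weights and biases are synchronized with a local node's model exactly once, before the first distributed update.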
Thanks, it is working now.
@Victoriya Have you tried increasing the number of hidden layers?
I am getting NaN values when I add a hidden layer.
I am attaching the code of my NeuralNetConfiguratorDistr.java file:
    /* file: NeuralNetConfiguratorDistr.java */
    /*******************************************************************************
    * Copyright 2014-2017 Intel Corporation
    *
    * Licensed under the Apache License, Version 2.0 (the "License");
    * you may not use this file except in compliance with the License.
    * You may obtain a copy of the License at
    *
    *     http://www.apache.org/licenses/LICENSE-2.0
    *
    * Unless required by applicable law or agreed to in writing, software
    * distributed under the License is distributed on an "AS IS" BASIS,
    * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    * See the License for the specific language governing permissions and
    * limitations under the License.
    *******************************************************************************/

    /*
    // Content:
    //     Java example of neural network configurator
    ////////////////////////////////////////////////////////////////////////////////
    */

    package com.intel.daal.examples.neural_networks;

    import com.intel.daal.algorithms.neural_networks.*;
    import com.intel.daal.algorithms.neural_networks.initializers.uniform.*;
    import com.intel.daal.algorithms.neural_networks.training.TrainingTopology;
    import com.intel.daal.algorithms.neural_networks.layers.fullyconnected.*;
    import com.intel.daal.algorithms.neural_networks.layers.softmax_cross.*;
    import com.intel.daal.algorithms.neural_networks.layers.LayerDescriptor;
    import com.intel.daal.algorithms.neural_networks.layers.NextLayers;
    import com.intel.daal.algorithms.neural_networks.layers.ForwardLayer;
    import com.intel.daal.algorithms.neural_networks.layers.BackwardLayer;
    import com.intel.daal.examples.utils.Service;
    import com.intel.daal.services.DaalContext;

    /**
     * <a name="DAAL-EXAMPLE-JAVA-NEURALNETWORKCONFIGURATORDISTR">
     * @example NeuralNetConfiguratorDistr.java
     */
    class NeuralNetConfiguratorDistr {
        public static TrainingTopology configureNet(DaalContext context) {
            /* Create layers of the neural network */
            /* Create fully-connected layer and initialize layer parameters */
            FullyConnectedBatch fullyconnectedLayer1 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 20);
            fullyconnectedLayer1.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, -0.001, 0.001));
            fullyconnectedLayer1.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0, 0.5));

            /* Create fully-connected layer and initialize layer parameters */
            FullyConnectedBatch fullyconnectedLayer2 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 40);
            fullyconnectedLayer2.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));
            fullyconnectedLayer2.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));

            FullyConnectedBatch fullyconnectedLayerTest = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 40);
            fullyconnectedLayerTest.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));
            fullyconnectedLayerTest.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));

            /* Create fully-connected layer and initialize layer parameters */
            FullyConnectedBatch fullyconnectedLayer3 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 2);
            fullyconnectedLayer3.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, -0.005, 0.005));
            fullyconnectedLayer3.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0, 1));

            /* Create softmax cross-entropy loss layer and initialize layer parameters */
            SoftmaxCrossBatch softmaxCrossEntropyLayer = new SoftmaxCrossBatch(context, Float.class, SoftmaxCrossMethod.defaultDense);

            /* Create topology of the neural network */
            TrainingTopology topology = new TrainingTopology(context);

            /* Add layers to the topology of the neural network */
            long fc1 = topology.add(fullyconnectedLayer1);
            long fc2 = topology.add(fullyconnectedLayer2);
            long fcTest = topology.add(fullyconnectedLayerTest);
            long fc3 = topology.add(fullyconnectedLayer3);
            long sm = topology.add(softmaxCrossEntropyLayer);
            topology.addNext(fc1, fc2);
            topology.addNext(fc2, fcTest);
            topology.addNext(fcTest, fc3);
            topology.addNext(fc3, sm);
            return topology;
        }
    }
I have not yet tried to increase the number of hidden layers in the example.
Please give me a couple of days to run the analysis with your code; after that I will provide you with the results.
The investigation showed that the SGD algorithm used at the training stage starts to diverge in the example after one more hidden layer is added. You can check this yourself by printing the numeric table of weights and biases (wb) on each iteration: the weights and biases grow very fast and eventually become NaNs. This unbounded growth of the weights and biases shows that the SGD algorithm diverges.
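A minimal debugging sketch of that check, assuming it is placed inside the training loop of NeuralNetDenseDistr.java (with i as the iteration index, and Service being the printing helper from com.intel.daal.examples.utils used throughout the DAAL examples; NumericTable is from com.intel.daal.data_management.data):

    /* Print the model's weights and biases on each iteration to watch for divergence */
    TrainingModel trainingModel = net.getResult().get(TrainingResultId.model);
    NumericTable wb = trainingModel.getWeightsAndBiases();
    Service.printNumericTable("Weights and biases at iteration " + i + ":", wb);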
There are several options available to make the optimization solver converge:
- Make the learning rate of the SGD smaller. A learning rate of 0.00001 should work fine (see the sketch after this list).
- Use another optimization solver. Try AdaGrad, or the other SGD methods: mini-batch and momentum.
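A sketch of the first option, assuming the example wires its solver through net.parameter.setOptimizationSolver as the DAAL neural network examples do (the exact configuration lines in NeuralNetDenseDistr.java may differ; HomogenNumericTable is from com.intel.daal.data_management.data):

    /* Create the SGD solver (default dense method) */
    com.intel.daal.algorithms.optimization_solver.sgd.Batch sgdAlgorithm =
        new com.intel.daal.algorithms.optimization_solver.sgd.Batch(
            context, Double.class,
            com.intel.daal.algorithms.optimization_solver.sgd.Method.defaultDense);

    /* Use a constant learning-rate sequence of 0.00001 instead of the default */
    double[] learningRate = { 0.00001 };
    sgdAlgorithm.parameter.setLearningRateSequence(
        new HomogenNumericTable(context, learningRate, 1, 1));

    /* Attach the solver to the neural network training algorithm */
    net.parameter.setOptimizationSolver(sgdAlgorithm);

Switching solvers should be analogous: an AdaGrad solver from com.intel.daal.algorithms.optimization_solver.adagrad can be constructed and attached the same way, and the mini-batch and momentum SGD variants are selected via the Method argument of the SGD constructor.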
Best regards,
Victoriya
@Victoriya
I was already checking the numeric table on each iteration, but I was not sure that the problem was caused by the optimization solver.
Thank you very much. It is working now.
FYI: the fix for this problem is available in DAAL 2018 (released in mid-September 2017).