Mayank_Jindal
Beginner

JAVA interface of neural network is giving NaN values

When I add hidden layers or change the number of neurons in the last layer in
https://github.com/01org/daal/blob/daal_2018_beta_update1/examples/java/com/intel/daal/examples/neur...

I get NaN values.
I am running this Java file -
https://github.com/01org/daal/blob/daal_2018_beta_update1/examples/java/com/intel/daal/examples/neur...

I am testing the neural network on MNIST data, which contains 10 labels, so I changed the number of neurons to 10 in line 60 of NeuralNetConfiguratorDistr.java.
 


Accepted Solutions

Per our analysis, there is a bug in the distributed training of neural networks in the current version of Intel DAAL: weights and biases are not properly initialized at the beginning of the computation.
We plan to fix it in a future version of the library.

The workaround for this bug is to add the following piece of code

if (i == 0) {
    /* Retrieve training model of the neural network on master node */
    TrainingModel trainingModelOnMaster = net.getResult().get(TrainingResultId.model);
    /* Retrieve training model of the neural network on local node */
    TrainingModel trainingModelOnLocal  = netLocal[0].input.get(DistributedStep1LocalInputId.inputModel);

    /* Set weights and biases on master node using the weights and biases from local node */
    trainingModelOnMaster.setWeightsAndBiases(trainingModelOnLocal.getWeightsAndBiases());

    /* Set initialization flag parameter as true in all forward layers of the training model on master node */
    ForwardLayers forwardLayers = trainingModelOnMaster.getForwardLayers();
    for (int j = 0; j < forwardLayers.size(); j++) {
        forwardLayers.get(j).getLayerParameter().setWeightsAndBiasesInitializationFlag(true);
    }
}

into the example NeuralNetDenseDistr.java, line 156.


8 Replies

Hello Mayank J,

I assume this is the same issue that was reported recently on Intel® DAAL GitHub (https://github.com/01org/daal/issues/18).

We are analyzing the issue on our side. Please give us several days, and we will get back to you with the results.

Best regards,

Victoriya


Mayank_Jindal
Beginner

Thanks, it is working now.

Mayank_Jindal
Beginner

@Victoriya Have you tried increasing the number of hidden layers?

I am getting NaN values when I add a hidden layer.

 

I am attaching the code of my NeuralNetConfiguratorDistr.java file -

/* file: NeuralNetConfiguratorDistr.java */
/*******************************************************************************
* Copyright 2014-2017 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
*     http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*******************************************************************************/

/*
 //  Content:
 //     Java example of neural network configurator
 ////////////////////////////////////////////////////////////////////////////////
 */

package com.intel.daal.examples.neural_networks;

import com.intel.daal.algorithms.neural_networks.*;
import com.intel.daal.algorithms.neural_networks.initializers.uniform.*;
import com.intel.daal.algorithms.neural_networks.training.TrainingTopology;
import com.intel.daal.algorithms.neural_networks.layers.fullyconnected.*;
import com.intel.daal.algorithms.neural_networks.layers.softmax_cross.*;
import com.intel.daal.algorithms.neural_networks.layers.LayerDescriptor;
import com.intel.daal.algorithms.neural_networks.layers.NextLayers;
import com.intel.daal.algorithms.neural_networks.layers.ForwardLayer;
import com.intel.daal.algorithms.neural_networks.layers.BackwardLayer;
import com.intel.daal.examples.utils.Service;
import com.intel.daal.services.DaalContext;

/**
 * <a name="DAAL-EXAMPLE-JAVA-NEURALNETWORKCONFIGURATORDISTR">
 * @example NeuralNetConfiguratorDistr.java
 */
class NeuralNetConfiguratorDistr {
    public static TrainingTopology configureNet(DaalContext context) {
        /* Create layers of the neural network */
        /* Create fully-connected layer and initialize layer parameters */
        FullyConnectedBatch fullyconnectedLayer1 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 20);

        fullyconnectedLayer1.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, -0.001, 0.001));

        fullyconnectedLayer1.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0, 0.5));

        /* Create fully-connected layer and initialize layer parameters */
        FullyConnectedBatch fullyconnectedLayer2 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 40);

        fullyconnectedLayer2.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));

        fullyconnectedLayer2.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));


        FullyConnectedBatch fullyconnectedLayerTest = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 40);

        fullyconnectedLayerTest.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));

        fullyconnectedLayerTest.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0.5, 1));


        /* Create fully-connected layer and initialize layer parameters */
        FullyConnectedBatch fullyconnectedLayer3 = new FullyConnectedBatch(context, Float.class, FullyConnectedMethod.defaultDense, 2);

        fullyconnectedLayer3.parameter.setWeightsInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, -0.005, 0.005));

        fullyconnectedLayer3.parameter.setBiasesInitializer(new UniformBatch(context, Float.class, UniformMethod.defaultDense, 0, 1));

        /* Create softmax cross-entropy loss layer and initialize layer parameters */
        SoftmaxCrossBatch softmaxCrossEntropyLayer = new SoftmaxCrossBatch(context, Float.class, SoftmaxCrossMethod.defaultDense);

        /* Create topology of the neural network */
        TrainingTopology topology = new TrainingTopology(context);

        /* Add layers to the topology of the neural network */
        long fc1 = topology.add(fullyconnectedLayer1);
        long fc2 = topology.add(fullyconnectedLayer2);
        long fcTest = topology.add(fullyconnectedLayerTest);
        long fc3 = topology.add(fullyconnectedLayer3);
        long sm = topology.add(softmaxCrossEntropyLayer);
        topology.addNext(fc1, fc2);
        topology.addNext(fc2, fcTest);
        topology.addNext(fcTest, fc3);
        topology.addNext(fc3, sm);
        return topology;
    }
}

 


I have not yet tried to increase the number of hidden layers in the example.

Please give me a couple of days to run the analysis with your code. After that, I will provide the results to you.


The investigation showed that the SGD algorithm used at the training stage starts to diverge in the example after one more hidden layer is added. You can check this yourself by printing the numeric table of weights and biases (wb) on each iteration. The weights and biases grow very quickly and eventually become NaNs. This unbounded growth of the weights and biases shows that the SGD algorithm diverges.
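As a generic sketch in plain Java (not part of the DAAL API; the class and method names here are hypothetical), a quick way to detect this kind of divergence while iterating is to scan the flattened weights-and-biases values for NaNs or runaway magnitudes:

```java
public class WeightMonitor {
    /* Return true if any value is NaN or exceeds the given magnitude bound,
       which signals that the optimization solver is diverging */
    public static boolean isDiverging(double[] weightsAndBiases, double bound) {
        for (double w : weightsAndBiases) {
            if (Double.isNaN(w) || Math.abs(w) > bound) {
                return true;
            }
        }
        return false;
    }
}
```

Calling such a check once per iteration makes the blow-up visible long before the loss itself turns into NaN.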

There are several options available to make the optimization solver converge:

  • Make the learning rate of the SGD smaller. A learning rate of 0.00001 should work fine.
  • Use another optimization solver. Try AdaGrad, or other variants of SGD: mini-batch and momentum.
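The effect of the learning rate on SGD stability can be illustrated with a minimal sketch in plain Java (this is a generic gradient-descent demo, not the DAAL API): on the quadratic f(w) = 50·w², any step size above the stability threshold 2/L = 0.02 (where L = 100 is the curvature) makes the iterates grow without bound until they overflow and become NaN, while a small step size shrinks them toward the minimum.

```java
public class SgdDivergenceDemo {
    /* Run plain gradient descent on f(w) = 50 * w^2, whose gradient is
       100 * w, starting from w = 1.0, and return the final iterate. */
    public static double run(double learningRate, int iterations) {
        double w = 1.0;
        for (int i = 0; i < iterations; i++) {
            double gradient = 100.0 * w;
            w -= learningRate * gradient;
        }
        return w;
    }

    public static void main(String[] args) {
        /* Step size above 2/L = 0.02: iterates alternate in sign, grow by a
           factor of 9 per step, overflow, and end up as NaN */
        System.out.println("large rate: " + run(0.1, 400));
        /* Small step size: iterates shrink toward the minimum at 0 */
        System.out.println("small rate: " + run(0.00001, 400));
    }
}
```

The same mechanism is at work in the example above: an extra hidden layer effectively increases the curvature of the loss, so the previously safe learning rate now exceeds the stability threshold.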

Best regards,

Victoriya

Mayank_Jindal
Beginner

@Victoriya

I was already checking the numeric table on each iteration, but I was not sure that it was happening due to the optimization solver.

Thank you very much. It is working now.

Gennady_F_Intel
Moderator

FYI - the fix for this problem is available in DAAL 2018 (released in mid-September 2017).