Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision-related on Intel® platforms.

Direct Gradient Extraction in OpenVINO IR Format

Xanph
Beginner

Context:

A convolutional network predicts whether motion is occurring on a live stream. It runs on a lightweight (resource-constrained) edge device.

Problem & Goal:

I need the model to pinpoint where in the frame the 'motion' class occurs, i.e. which regions influenced the model's prediction.

Targeted Possible Solution:

Use the gradients of the motion score with respect to the third convolution layer's feature maps to generate Grad-CAM visualisations (a heatmap over the image showing where the motion class is activated).
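For reference, the Grad-CAM step itself is small once the feature maps and the gradients of the motion score with respect to them are available. The sketch below is a minimal NumPy version under that assumption; `grad_cam`, the array shapes, and the random toy data are illustrative and not taken from the model in this thread.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap for a single frame.

    feature_maps, gradients: arrays of shape (H, W, C), where gradients holds
    d(score)/d(feature_maps) for the class of interest.
    """
    # Channel weights: average the gradients over the spatial axes.
    weights = gradients.mean(axis=(0, 1))                        # shape (C,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))   # shape (H, W)
    # ReLU keeps only the regions that push the score up.
    cam = np.maximum(cam, 0.0)
    # Normalise to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy data standing in for one frame's feature maps and gradients.
rng = np.random.default_rng(0)
toy_features = rng.random((4, 5, 8))
toy_gradients = rng.standard_normal((4, 5, 8))
heatmap = grad_cam(toy_features, toy_gradients)
```

The heatmap has the spatial size of the feature maps, so it still has to be resized to the frame before it can be overlaid.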

 

---

The problem, however, is that OpenVINO IR does not allow direct access to gradients (based on my research), understandably, because the format is designed for efficient inference.

As a result, the original Keras model has been modified to expose the third convolution layer as a second output. We now have a binary output for motion and a feature-map output from the third conv layer.

I then need to approximate the gradients numerically using the central difference scheme. This requires perturbing the input and running inference twice per perturbation (once with +h, once with -h) to estimate the gradient, like so:

 

(python)

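import numpy as np

def numerical_gradient(infer_request, input_data, output_index, h=1e-3):
    # Central-difference estimate of d(output)/d(input), one input element at a time.
    # input_layer and dense_output_layer are expected to be defined in the enclosing scope;
    # output_index is currently unused.
    grad = np.zeros_like(input_data)
    for i in range(input_data.size):
        input_data_plus_h = np.copy(input_data)
        input_data_minus_h = np.copy(input_data)
        input_data_plus_h.flat[i] += h
        input_data_minus_h.flat[i] -= h

        # Run inference with perturbed inputs. Copy each result, since .data can be
        # a view onto the output tensor that the next infer() call overwrites.
        infer_request.infer(inputs={input_layer: input_data_plus_h})
        f_x_plus_h = infer_request.get_tensor(dense_output_layer).data.copy()
        infer_request.infer(inputs={input_layer: input_data_minus_h})
        f_x_minus_h = infer_request.get_tensor(dense_output_layer).data.copy()

        # For a (batch, time, 1) output, keep the score of the last time step.
        if f_x_plus_h.ndim == 3:
            f_x_plus_h = f_x_plus_h[0, -1, 0]

        if f_x_minus_h.ndim == 3:
            f_x_minus_h = f_x_minus_h[0, -1, 0]

        grad.flat[i] = (f_x_plus_h - f_x_minus_h) / (2 * h)
    return grad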

 

 

So now the problem is that this implementation involves three kinds of inference step for each 50-frame LSTM sequence on a live stream, and the loop repeats the two perturbed ones for every input element, which is not very efficient:

 

  • First Inference: The initial inference to get the prediction.
  • Second Inference: A perturbed input with a small positive perturbation for gradient calculation.
  • Third Inference: A perturbed input with a small negative perturbation for gradient calculation.
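To put a rough number on the cost: `numerical_gradient` as posted loops over every input element, so the pass count grows with the input size. The arithmetic below assumes an input of shape (1, 50, 140, 250, 3), i.e. one sequence of 50 RGB frames at 140x250; `central_difference_passes` is a hypothetical helper name used only for this illustration.

```python
import math

def central_difference_passes(input_shape):
    """Inference passes needed to estimate the gradient for every input element."""
    n_elements = math.prod(input_shape)
    # One initial prediction, plus a +h and a -h pass per element.
    return 1 + 2 * n_elements

# Assumed input shape: (batch, frames, height, width, channels).
total_passes = central_difference_passes((1, 50, 140, 250, 3))
print(total_passes)  # 10500001
```

Perturbing coarser groups of elements together (for example whole patches of pixels) would shrink that count proportionally, at the price of a coarser gradient estimate.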

As an alternative, I could try forward finite differences, which reuse the initial prediction and so need only one extra inference per perturbation, but this still isn't ideal for an edge device with limited GPU resources.
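The trade-off between the two schemes can be seen on a toy function. This is a minimal sketch in which `f` stands in for the model's scalar output and is not the real network; `forward_diff` and `central_diff` are illustrative names.

```python
import numpy as np

def forward_diff(f, x, i, h=1e-3, f_x=None):
    """Forward difference for element i; reuses f_x if the caller already has f(x)."""
    x_plus = x.copy()
    x_plus.flat[i] += h
    base = f(x) if f_x is None else f_x
    return (f(x_plus) - base) / h

def central_diff(f, x, i, h=1e-3):
    """Central difference for element i; always needs two evaluations."""
    x_plus, x_minus = x.copy(), x.copy()
    x_plus.flat[i] += h
    x_minus.flat[i] -= h
    return (f(x_plus) - f(x_minus)) / (2 * h)

def f(x):
    # Toy stand-in for the model: the exact gradient is 3 * x**2.
    return float(np.sum(x ** 3))

x = np.array([1.0, 2.0, 3.0])
exact = 3 * x[1] ** 2
forward_error = abs(forward_diff(f, x, 1, f_x=f(x)) - exact)
central_error = abs(central_diff(f, x, 1) - exact)
print(forward_error, central_error)
```

On this function the forward estimate is off by roughly 6h while the central one is off by roughly h squared, so halving the number of extra inferences comes with a noticeably larger error for the same step size.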

 

I wanted to see if there are any other suggestions for how I can reach my goal of getting the model to state where the motion class appears in the scene. Alternatively, I can fall back on background subtraction to get the positions.
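For comparison, the background-subtraction fallback can be sketched in plain NumPy. This is a simplified illustration using a running-average background and a fixed threshold, not a library implementation; `motion_bbox`, `update_background`, the threshold of 25, and the synthetic frames are all assumptions made for the example.

```python
import numpy as np

def motion_bbox(frame, background, threshold=25):
    """Return (top, left, bottom, right) around pixels that differ from the background, or None."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    if diff.ndim == 3:
        diff = diff.max(axis=2)          # strongest change across colour channels
    mask = diff > threshold
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into a running-average background model."""
    return (1 - alpha) * background + alpha * frame

# Synthetic 140x250 RGB frames: an empty background and a frame with one bright block.
background = np.zeros((140, 250, 3), dtype=np.uint8)
frame = background.copy()
frame[40:60, 100:130] = 200
bbox = motion_bbox(frame, background)
print(bbox)  # (40, 100, 60, 130)
```

This gives positions without any extra inference, but it reports every pixel change rather than the regions the network actually relied on.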

 

I also appreciate that there's quite a lot of math in this, which I am learning on the spot too!

 

Many thanks,

Xanph

 

(Using OpenVINO Nightly)

Vipin_S_Intel
Moderator

Hi Flynn, could you please provide us with the following details?

 

  • The exact name and build version of the Intel® Toolkit you’re using.
  • The operating system and its build version.
  • Whether the product has been installed.
  • A detailed explanation of your query, along with a screenshot if possible.

 

To assist you further, we would require these details.


Iffa_Intel
Moderator

Hi,


If you still need help with this issue, please clarify and share:

  1. Your model framework (e.g. TensorFlow, ONNX, etc.)
  2. Is the model a custom model? If so, could you elaborate on the custom part?
  3. Your conversion commands
  4. Relevant model files
  5. Which OpenVINO sample app did you use for inferencing? (If custom, please share the specific code.)
  6. Your issue right now is that you are not satisfied with the model's inferencing results and wish to improve them, am I right?

 

 

 

Cordially,

Iffa


Xanph
Beginner

Apologies for the delay, Iffa.

 

I'm using TensorFlow 2.19 now, but was using 2.17 at the time. The model is the Sequential model below, built with the Keras API:

 

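# Imports assumed by this snippet
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, Bidirectional, Conv3D,
                                     Dense, Flatten, LSTM, MaxPool3D, TimeDistributed)
from tensorflow.keras.optimizers import Adam

model = Sequential(name=f"{model_version}")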

model.add(Conv3D(64, kernel_size=3, input_shape=(SEQUENCE_LENGTH, 140, 250, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(1, 2, 2)))

model.add(Conv3D(128, kernel_size=3, padding='same'))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(1, 2, 2)))

model.add(Conv3D(128, kernel_size=3, padding='same', name="last_conv3d"))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(1, 2, 2)))

model.add(TimeDistributed(Flatten()))

# See pattern recognition from frame 0 -> 49 and 49 -> 0.
model.add(Bidirectional(LSTM(32, return_sequences=True)))

model.add(BatchNormalization())

model.add(Dense(1, activation='sigmoid', dtype='float32'))

model.compile(
    optimizer=Adam(learning_rate=0.0001),
    loss='binary_crossentropy',
    metrics=['accuracy', tf.keras.metrics.AUC()]
)

 

This is the code I used to convert the model. At the time I was using an OpenVINO 2024 nightly build:


# Imports assumed by this snippet
import openvino as ov
from tensorflow import keras

model_name = f"{version}"
model_path = f'models/best_keras/{model_name}.h5'
model = keras.models.load_model(model_path)

# Save the model in the SavedModel format
saved_model_dir = f'models/tensorflow/{model_name}_tf'
model.save(saved_model_dir, save_format='tf')

# Convert the SavedModel to OpenVINO IR format with multiple outputs
ir_model = ov.convert_model(
    saved_model_dir,
    input={"conv3d_input": [1, SEQUENCE_LENGTH, 140, 250, 3]},
    output=["last_conv3d", "dense"]
)

# Save the converted IR model
output_dir = "models/intermediate_representation"
ov.save_model(ir_model, f"{output_dir}/ir_{version}.xml")

 

As for the answer to number 5, I was just using the OpenVINO package directly for inference.

For 6, my goal is to extract the Grad-CAM from the last_conv3d layer of the model, in addition to the last dense layer (the prediction).
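Whatever produces the per-frame heatmap, it will have the spatial size of the last_conv3d output rather than of the frame. Assuming the two (1, 2, 2) pooling layers before last_conv3d reduce a 140x250 frame to 35x62, a nearest-neighbour resize back to frame resolution could look like the sketch below; `upsample_nearest` and the random heatmap are illustrative only.

```python
import numpy as np

def upsample_nearest(heatmap, out_h, out_w):
    """Resize a 2D heatmap to (out_h, out_w) by nearest-neighbour lookup."""
    # Map every output row/column back to the source row/column it falls in.
    rows = np.arange(out_h) * heatmap.shape[0] // out_h
    cols = np.arange(out_w) * heatmap.shape[1] // out_w
    return heatmap[rows[:, None], cols[None, :]]

# Toy heatmap at the assumed feature-map resolution.
small = np.random.default_rng(0).random((35, 62)).astype(np.float32)
overlay = upsample_nearest(small, 140, 250)
print(overlay.shape)  # (140, 250)
```

A smoother interpolation would look nicer, but nearest-neighbour keeps the example dependency-free and is cheap enough for an edge device.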

 

Many thanks, and I now have notifications turned on.

Iffa_Intel
Moderator

Hi,


Thank you for your question. If you need any additional information from Intel, please submit a new question as Intel is no longer monitoring this thread. 



Cordially,

Iffa

