Application Acceleration With FPGAs
Programmable Acceleration Cards (PACs), DCP, FPGA AI Suite, Software Stack, and Reference Designs

Eltwise_mult with broadcasting on FPGA

ThanhN
Beginner

Hi, 

 

I am using FPGA AI Suite 2025.1 and OpenVINO 2024.6.0 with the DE10-Agilex dev kit. In the provided ".arch" files, such as AGX7_Performance.arch, I see "enable_eltwise_mult : true". However, it does not seem to support broadcasting.

 

What I want is to perform an element-wise multiplication between two tensors of shapes [1, 1, H, W] and [1, C, H, W], resulting in an output of shape [1, C, H, W], and have this operation executed on the FPGA.
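For clarity, here is the intended broadcast behavior sketched with NumPy (purely an illustration of the desired semantics, not FPGA code; C, H, and W below are placeholder sizes):

```python
import numpy as np

# Placeholder sizes; the actual C, H, W come from the model.
C, H, W = 96, 8, 8

mask = np.random.rand(1, 1, H, W).astype(np.float32)   # shape [1, 1, H, W]
feat = np.random.rand(1, C, H, W).astype(np.float32)   # shape [1, C, H, W]

# Broadcasting expands the size-1 channel axis of `mask` across all C channels.
out = mask * feat
print(out.shape)  # (1, 96, 8, 8)
```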

 

I'm wondering if there's a way to do this, or if I'm missing something. I'd appreciate any help.

 

Best regards.

7 Replies
JohnT_Intel
Employee

Hi,


May I know what you mean by "enable_eltwise_mult : true" not working? May I know if you have an FPGA AI Suite license to build a customized bitstream?


ThanhN
Beginner

Hi @JohnT_Intel 

 

Some AGX7_**.arch files are included with the FPGA AI Suite 2025.1. In some of these .arch files, the option "enable_eltwise_mult: true" is defined, while in others it is not.

When "enable_eltwise_mult" is set to "true", the FPGA supports element-wise multiplication for tensors with matching shapes, such as [1, 1, H, W] × [1, 1, H, W]. However, it does not support broadcasting; for example, multiplying tensors of shapes [1, 1, H, W] × [1, channel, H, W] is not supported on the FPGA. This is what I meant when I said "enable_eltwise_mult : true" is not working.

As for the FPGA AI Suite license, I believe it refers to the Quartus Pro license required for building custom bitstreams (.sof files). I do not have a Quartus Pro 2025 license, but I do have a license for Quartus Pro 2021, which I believe was provided by Terasic with the DE10-Agilex development kit.

 

Thanks.

JohnT_Intel
Employee

Hi,


The enable_eltwise_mult option is used for the MobileNetV3 network architecture. May I know what type of broadcast network architecture you are using? If the implementation differs, then it will not support broadcasting.


ThanhN
Beginner

Hi,

 

I'm working with a customized model, and one of the steps involves applying a mask with shape [1, 1, H, W] across all 96 channels of a feature-level tensor of shape [1, 96, H, W], which is the output of a convolution layer. In PyTorch, this can be done using torch.mul(tensor1, tensor2), where broadcasting handles the channel mismatch and performs element-wise multiplication.
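One thing I considered: explicitly tiling the mask to [1, 96, H, W] before the multiply (in PyTorch, via mask.repeat(1, C, 1, 1) or mask.expand(1, C, H, W)), so the element-wise multiplication sees two identically shaped operands. The sketch below demonstrates with NumPy that tiling gives the same result as broadcasting; whether the FPGA AI Suite compiler would then map the operation onto the FPGA is an assumption I have not verified.

```python
import numpy as np

C, H, W = 96, 4, 4  # illustrative sizes

mask = np.random.rand(1, 1, H, W).astype(np.float32)
feat = np.random.rand(1, C, H, W).astype(np.float32)

# What torch.mul does implicitly via broadcasting.
out_broadcast = mask * feat

# Explicitly tiling the mask so both operands have matching shapes.
# In a PyTorch model this would be mask.repeat(1, C, 1, 1) or
# mask.expand(1, C, H, W) before the multiplication.
mask_tiled = np.tile(mask, (1, C, 1, 1))
out_tiled = mask_tiled * feat

# The two results are identical element for element.
assert np.array_equal(out_broadcast, out_tiled)
```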

 

Is there a way to make this operation work directly on the FPGA? I want the entire model computation to be executed on the FPGA without involving the CPU. It seems to me that "enable_eltwise_mult: true" does not perform the broadcasting needed to handle the channel mismatch, but I may be wrong.

 

Thanks!

JohnT_Intel
Employee

Hi,


You will need to customize the FPGA AI Suite design in order for it to support this feature.


You will need to contact the local sales team (https://www.altera.com/contact.html#4257225834-4085749461).


Thanks


JohnT_Intel
Employee

Hi,


May I know if you have any other queries?


ThanhN
Beginner
Thanks, John.

That should be all.