
OpenSource OpenCL design of CNNs Accelerator on FPGAs for everyone :)

Altera_Forum

I have designed an OpenCL-based accelerator for convolutional neural networks (CNNs) on FPGAs. It is available at github.com/doonny/PipeCNN.

 

So far, the AlexNet and VGG-16 models have been tested on DE5-Net boards.

 

I hope some of you will be interested in joining us to improve the design.

 

Cheers,
Altera_Forum

I'm very interested in your work, and thank you for the code, but I can't compile conv_pipe.cl because I can't find the RTL folder.

Altera_Forum

 

--- Quote Start ---  

I'm very interested in your work, and thank you for the code, but I can't compile conv_pipe.cl because I can't find the RTL folder.

--- Quote End ---  

 

Hi, I found the RTL directory under the project -> device directory. 

I did manage to run AlexNet on a DE10-Standard board, with a total kernel run time of 298 ms.

J
Altera_Forum

Hi, 

Thanks for providing the code.

 

I was able to run the demo using OpenCV, with the following output for AlexNet:

Loading kernel/binary from file project/conv.aocx 

Reprogramming device [0] with handle 1 

61063552 total weights read  

1024 total output reference read  

Loading picture ./data/ILSVRC2012_val_00000001.JPEG ..... 

Executing Layer 1: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching single work-item kernel Pooling 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 27, 27, 96) 

Launching kernel lrn with local size: 1, 1, 24 (global size: 27, 27, 24) 

Executing Layer 2: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching single work-item kernel Pooling 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 13, 13, 256) 

Launching kernel lrn with local size: 1, 1, 64 (global size: 13, 13, 64) 

Executing Layer 3: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 13, 13, 384) 

Executing Layer 4: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 13, 13, 384) 

Executing Layer 5: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching single work-item kernel Pooling 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 6, 6, 256) 

Executing Layer 6: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 1, 1, 4096) 

Executing Layer 7: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 1, 1, 4096) 

Executing Layer 8: 

Launching single work-item kernel winbuffer 

Launching single work-item kernel Conv 

Launching kernel MemWr with local size: 1, 1, 16 (global size: 1, 1, 1024) 

Done! Inference time is 0.045396s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n01737021 water snake (the prob is 37.00)  

False: True_label = 970 Inferred_label = 58 

Current Top-1 accuracy = 0.000 

Current Top-5 accuracy = 1.000 

Loading picture ./data/ILSVRC2012_val_00000002.JPEG ..... 

Done! Inference time is 0.044439s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n04228054 ski (the prob is 46.00)  

False: True_label = 230 Inferred_label = 795 

Current Top-1 accuracy = 0.000 

Current Top-5 accuracy = 1.000 

Loading picture ./data/ILSVRC2012_val_00000003.JPEG ..... 

Done! Inference time is 0.044546s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n02105855 Shetland sheepdog, Shetland sheep dog, Shetland (the prob is 99.00)  

Current Top-1 accuracy = 0.333 

Current Top-5 accuracy = 1.000 

Loading picture ./data/ILSVRC2012_val_00000004.JPEG ..... 

Done! Inference time is 0.044284s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n07920052 espresso (the prob is 81.00)  

False: True_label = 516 Inferred_label = 967 

Current Top-1 accuracy = 0.250 

Current Top-5 accuracy = 0.750 

Loading picture ./data/ILSVRC2012_val_00000005.JPEG ..... 

Done! Inference time is 0.044279s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n03125729 cradle (the prob is 34.00)  

Current Top-1 accuracy = 0.400 

Current Top-5 accuracy = 0.800 

Loading picture ./data/ILSVRC2012_val_00000006.JPEG ..... 

Done! Inference time is 0.044487s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n01755581 diamondback, diamondback rattlesnake, Crotalus adamanteus (the prob is 51.00)  

False: True_label = 334 Inferred_label = 67 

Current Top-1 accuracy = 0.333 

Current Top-5 accuracy = 0.833 

Loading picture ./data/ILSVRC2012_val_00000007.JPEG ..... 

Done! Inference time is 0.044391s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n02346627 porcupine, hedgehog (the prob is 99.00)  

Current Top-1 accuracy = 0.429 

Current Top-5 accuracy = 0.857 

Loading picture ./data/ILSVRC2012_val_00000008.JPEG ..... 

Done! Inference time is 0.044590s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n03063599 coffee mug (the prob is 56.00)  

False: True_label = 674 Inferred_label = 504 

Current Top-1 accuracy = 0.375 

Current Top-5 accuracy = 0.750 

Loading picture ./data/ILSVRC2012_val_00000009.JPEG ..... 

Done! Inference time is 0.044468s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n03201208 dining table, board (the prob is 26.00)  

False: True_label = 332 Inferred_label = 532 

Current Top-1 accuracy = 0.333 

Current Top-5 accuracy = 0.667 

Loading picture ./data/ILSVRC2012_val_00000010.JPEG ..... 

Done! Inference time is 0.044281s  

Copied all batched results from fc_2 buffers. 

Selected item = 0 from the combined batch results in fc buffers 

The inference result is n04399382 teddy, teddy bear (the prob is 36.00)  

False: True_label = 0 Inferred_label = 850 

Current Top-1 accuracy = 0.300 

Current Top-5 accuracy = 0.700 

Demo exited !!! 

Total number of 10 pictures have been processed. 

Final Top-1 accuracy = 0.300 

Final Top-5 accuracy = 0.700 
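
For anyone who wants to compare runs, here is a rough Python sketch that summarizes a log like the one above. It is not part of PipeCNN; the function name `summarize` and the `SAMPLE_LOG` excerpt are made up for illustration, and the parsing assumes the line formats shown in this post. It also assumes that a picture counts as a Top-1 hit when no "False:" line follows it, which matches the running accuracy printed in the log.

```python
import re

# Excerpt of the first three pictures from the log above (trimmed for brevity).
SAMPLE_LOG = """\
Loading picture ./data/ILSVRC2012_val_00000001.JPEG .....
Done! Inference time is 0.045396s
False: True_label = 970 Inferred_label = 58
Loading picture ./data/ILSVRC2012_val_00000002.JPEG .....
Done! Inference time is 0.044439s
False: True_label = 230 Inferred_label = 795
Loading picture ./data/ILSVRC2012_val_00000003.JPEG .....
Done! Inference time is 0.044546s
"""


def summarize(log_text):
    """Summarize a demo log; the line formats are assumed from the post above."""
    # Per-picture inference times, in seconds.
    times = [float(t) for t in re.findall(r"Inference time is ([0-9.]+)s", log_text)]
    # One "Loading picture" line per processed image.
    images = len(re.findall(r"^Loading picture ", log_text, flags=re.MULTILINE))
    # Assumption: a "False:" line is printed only when the Top-1 label is wrong.
    misses = len(re.findall(r"^False: True_label", log_text, flags=re.MULTILINE))
    mean_time = sum(times) / len(times)
    return {
        "images": images,
        "mean_time_s": mean_time,
        "images_per_s": 1.0 / mean_time,
        "top1": (images - misses) / images,
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LOG).items():
        print(key, value)
```

Applied to all ten pictures in the log above, the same calculation gives a mean inference time of about 44.5 ms, or roughly 22 images per second, and a Top-1 accuracy of 0.300, in line with the final figure the demo prints.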

 

Any suggestions for improving these results with optimization techniques?

 

Thanks