The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Conventional sequential processing in software on a general-purpose CPU has become clearly insufficient for certain heavy computations, which demand substantial processing power to deliver adequate throughput and performance. For many reasons, there is strong interest in high-performance real-time video processing on embedded systems. However, embedded processing platforms with limited performance can hardly meet the processing demands of several such intensive computations in the computer vision domain. Hardware acceleration is therefore an attractive solution: process-intensive computations can be accelerated using application-specific hardware integrated with a general-purpose CPU. In this research we focus on building a parallelized, high-performance, application-specific architecture for such a hardware accelerator for the HOG-SVM computation, implemented on a Zynq 7000 FPGA. The Histogram of Oriented Gradients (HOG) technique combined with a Support Vector Machine (SVM) based classifier is versatile and extremely popular in the computer vision domain, despite its high demand for processing power. Because of this popularity and versatility, various previous studies have attempted to obtain adequate throughput on HOG-SVM. This work achieves a throughput of 240 FPS at a single scale on VGA frames of 640x480, outperforming the best-case single-scale performance of previous research by a factor of approximately 3-4. It is also an approximately 15x speedup over the GPU-accelerated software version at the same accuracy. This research explores a novel architecture based on deep pipelining, parallel processing, and BRAM structures to achieve high performance on the HOG-SVM computation.
Furthermore, the VPU (video processing unit) developed above, which acts as a hardware accelerator, will be integrated as a co-processing peripheral to a host CPU using a novel custom accelerator structure with on-chip buses in a System-on-Chip (SoC) fashion. This allows the heavy, redundant video stream processing computations to be offloaded to the VPU, while the CPU's processing power is preserved for running lightweight applications. This research focuses mainly on the architectural techniques used to achieve higher performance on the hardware accelerator and on the novel accelerator structure used to integrate the accelerator with the host CPU.
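As a rough sanity check on the figures quoted in the abstract, the short Python sketch below works out what 240 FPS on 640x480 frames means as a sustained pixel rate, and what baseline frame rates the quoted speedup factors would imply. Only the frame size, frame rate, and speedup factors come from the abstract. The function names and the `pixels_per_clock` parameter are hypothetical, and the one-pixel-per-clock figure is purely an illustrative assumption about a streaming pipeline, not something the abstract states about the actual design.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Function names and pixels_per_clock are illustrative assumptions.

FRAME_WIDTH = 640               # VGA width, from the abstract
FRAME_HEIGHT = 480              # VGA height, from the abstract
FPS = 240                       # reported single-scale throughput
SPEEDUP_VS_PRIOR = (3, 4)       # "approximately a factor of 3-4"
SPEEDUP_VS_GPU = 15             # "approximately 15x"


def pixel_rate(width, height, fps):
    """Pixels that must be consumed per second to sustain the frame rate."""
    return width * height * fps


def min_clock_hz(pixels_per_second, pixels_per_clock=1):
    """Lowest clock that sustains the pixel rate, ignoring blanking and stalls.

    pixels_per_clock is an assumption for illustration only.
    """
    return pixels_per_second / pixels_per_clock


def implied_baseline_fps(fps, speedup):
    """Frame rate a baseline would have if `fps` is `speedup` times faster."""
    return fps / speedup


if __name__ == "__main__":
    rate = pixel_rate(FRAME_WIDTH, FRAME_HEIGHT, FPS)
    print(f"Sustained pixel rate: {rate:,} pixels/s")  # 73,728,000
    print(f"Clock needed at 1 pixel/clock: {min_clock_hz(rate) / 1e6:.3f} MHz")
    for s in SPEEDUP_VS_PRIOR:
        print(f"Implied prior best at {s}x: {implied_baseline_fps(FPS, s):.0f} FPS")
    print(f"Implied GPU version: {implied_baseline_fps(FPS, SPEEDUP_VS_GPU):.0f} FPS")
```

Under these assumptions, the quoted throughput corresponds to roughly 73.7 million pixels per second, and the speedup factors imply previous single-scale results in the range of about 60 to 80 FPS and a GPU-accelerated version at about 16 FPS. These are derived estimates, not measurements reported in the abstract.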
Piyumal RANAWAKA
University of Moratuwa
Mongkol EKPANYAPONG
Asian Institute of Technology
Adriano TAVARES
University of Minho
Mathew DAILEY
Asian Institute of Technology
Krit ATHIKULWONGSE
National Science and Technology Development Agency
Vitor SILVA
University of Minho
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Piyumal RANAWAKA, Mongkol EKPANYAPONG, Adriano TAVARES, Mathew DAILEY, Krit ATHIKULWONGSE, Vitor SILVA, "High Performance Application Specific Stream Architecture for Hardware Acceleration of HOG-SVM on FPGA" in IEICE TRANSACTIONS on Fundamentals,
vol. E102-A, no. 12, pp. 1792-1803, December 2019, doi: 10.1587/transfun.E102.A.1792.
Abstract: Conventional sequential processing on software with a general purpose CPU has become significantly insufficient for certain heavy computations due to the high demand of processing power to deliver adequate throughput and performance. Due to many reasons a high degree of interest could be noted for high performance real time video processing on embedded systems. However, embedded processing platforms with limited performance could least cater the processing demand of several such intensive computations in computer vision domain. Therefore, hardware acceleration could be noted as an ideal solution where process intensive computations could be accelerated using application specific hardware integrated with a general purpose CPU. In this research we have focused on building a parallelized high performance application specific architecture for such a hardware accelerator for HOG-SVM computation implemented on Zynq 7000 FPGA. Histogram of Oriented Gradients (HOG) technique combined with a Support Vector Machine (SVM) based classifier is versatile and extremely popular in computer vision domain in contrast to high demand for processing power. Due to the popularity and versatility, various previous research have attempted on obtaining adequate throughput on HOG-SVM. This research with a high throughput of 240FPS on single scale on VGA frames of size 640x480 out performs the best case performance on a single scale of previous research by approximately a factor of 3-4. Further it's an approximately 15x speed up over the GPU accelerated software version with the same accuracy. This research has explored the possibility of using a novel architecture based on deep pipelining, parallel processing and BRAM structures for achieving high performance on the HOG-SVM computation. 
Further the above developed (video processing unit) VPU which acts as a hardware accelerator will be integrated as a co-processing peripheral to a host CPU using a novel custom accelerator structure with on chip buses in a System-On-Chip (SoC) fashion. This could be used to offload the heavy video stream processing redundant computations to the VPU whereas the processing power of the CPU could be preserved for running light weight applications. This research mainly focuses on the architectural techniques used to achieve higher performance on the hardware accelerator and on the novel accelerator structure used to integrate the accelerator with the host CPU.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E102.A.1792/_p
@ARTICLE{e102-a_12_1792,
author={Piyumal RANAWAKA and Mongkol EKPANYAPONG and Adriano TAVARES and Mathew DAILEY and Krit ATHIKULWONGSE and Vitor SILVA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={High Performance Application Specific Stream Architecture for Hardware Acceleration of HOG-SVM on FPGA},
year={2019},
volume={E102-A},
number={12},
pages={1792-1803},
abstract={Conventional sequential processing on software with a general purpose CPU has become significantly insufficient for certain heavy computations due to the high demand of processing power to deliver adequate throughput and performance. Due to many reasons a high degree of interest could be noted for high performance real time video processing on embedded systems. However, embedded processing platforms with limited performance could least cater the processing demand of several such intensive computations in computer vision domain. Therefore, hardware acceleration could be noted as an ideal solution where process intensive computations could be accelerated using application specific hardware integrated with a general purpose CPU. In this research we have focused on building a parallelized high performance application specific architecture for such a hardware accelerator for HOG-SVM computation implemented on Zynq 7000 FPGA. Histogram of Oriented Gradients (HOG) technique combined with a Support Vector Machine (SVM) based classifier is versatile and extremely popular in computer vision domain in contrast to high demand for processing power. Due to the popularity and versatility, various previous research have attempted on obtaining adequate throughput on HOG-SVM. This research with a high throughput of 240FPS on single scale on VGA frames of size 640x480 out performs the best case performance on a single scale of previous research by approximately a factor of 3-4. Further it's an approximately 15x speed up over the GPU accelerated software version with the same accuracy. This research has explored the possibility of using a novel architecture based on deep pipelining, parallel processing and BRAM structures for achieving high performance on the HOG-SVM computation. 
Further the above developed (video processing unit) VPU which acts as a hardware accelerator will be integrated as a co-processing peripheral to a host CPU using a novel custom accelerator structure with on chip buses in a System-On-Chip (SoC) fashion. This could be used to offload the heavy video stream processing redundant computations to the VPU whereas the processing power of the CPU could be preserved for running light weight applications. This research mainly focuses on the architectural techniques used to achieve higher performance on the hardware accelerator and on the novel accelerator structure used to integrate the accelerator with the host CPU.},
keywords={},
doi={10.1587/transfun.E102.A.1792},
ISSN={1745-1337},
month={December},}
TY - JOUR
TI - High Performance Application Specific Stream Architecture for Hardware Acceleration of HOG-SVM on FPGA
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1792
EP - 1803
AU - Piyumal RANAWAKA
AU - Mongkol EKPANYAPONG
AU - Adriano TAVARES
AU - Mathew DAILEY
AU - Krit ATHIKULWONGSE
AU - Vitor SILVA
PY - 2019
DO - 10.1587/transfun.E102.A.1792
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E102-A
IS - 12
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - December 2019
AB - Conventional sequential processing on software with a general purpose CPU has become significantly insufficient for certain heavy computations due to the high demand of processing power to deliver adequate throughput and performance. Due to many reasons a high degree of interest could be noted for high performance real time video processing on embedded systems. However, embedded processing platforms with limited performance could least cater the processing demand of several such intensive computations in computer vision domain. Therefore, hardware acceleration could be noted as an ideal solution where process intensive computations could be accelerated using application specific hardware integrated with a general purpose CPU. In this research we have focused on building a parallelized high performance application specific architecture for such a hardware accelerator for HOG-SVM computation implemented on Zynq 7000 FPGA. Histogram of Oriented Gradients (HOG) technique combined with a Support Vector Machine (SVM) based classifier is versatile and extremely popular in computer vision domain in contrast to high demand for processing power. Due to the popularity and versatility, various previous research have attempted on obtaining adequate throughput on HOG-SVM. This research with a high throughput of 240FPS on single scale on VGA frames of size 640x480 out performs the best case performance on a single scale of previous research by approximately a factor of 3-4. Further it's an approximately 15x speed up over the GPU accelerated software version with the same accuracy. This research has explored the possibility of using a novel architecture based on deep pipelining, parallel processing and BRAM structures for achieving high performance on the HOG-SVM computation. 
Further the above developed (video processing unit) VPU which acts as a hardware accelerator will be integrated as a co-processing peripheral to a host CPU using a novel custom accelerator structure with on chip buses in a System-On-Chip (SoC) fashion. This could be used to offload the heavy video stream processing redundant computations to the VPU whereas the processing power of the CPU could be preserved for running light weight applications. This research mainly focuses on the architectural techniques used to achieve higher performance on the hardware accelerator and on the novel accelerator structure used to integrate the accelerator with the host CPU.
ER -