The 32nd International Conference on Electrical Engineering
Human Action Recognition in Still Images Using ConViT
Authors:
Seyed Rohollah Hosseyni (1)
Sanaz Seyedin (2)
Hassan Taheri (3)
1- Amirkabir University of Technology
2- Amirkabir University of Technology
3- Amirkabir University of Technology
Keywords:
Human action recognition, Still images, Convolutional Neural Network, Vision Transformer
Abstract:
Understanding the relationships between different parts of an image is crucial in a variety of applications, including object recognition, scene understanding, and image classification. Although Convolutional Neural Networks (CNNs) have demonstrated impressive results in classifying and detecting objects, they lack the capability to extract the relationships between different parts of an image, which is a crucial factor in Human Action Recognition (HAR). To address this problem, this paper proposes a new module that uses a Vision Transformer (ViT) and functions like a convolutional layer. In the proposed model, the ViT can complement a CNN in a variety of tasks by helping it effectively extract the relationships among the parts of an image. Compared to a simple CNN, the proposed model is shown to extract the meaningful parts of an image and suppress the misleading ones. Evaluated on the Stanford40 and PASCAL VOC 2012 action datasets, the proposed model achieves a mean Average Precision (mAP) of 95.5% and 91.5%, respectively, which is promising compared to other state-of-the-art methods.
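For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean Average Precision score can be computed for a multi-class action classifier. It uses the plain, non-interpolated definition of AP; the abstract does not state the exact evaluation protocol, so this should not be read as the authors' evaluation code. The function names, class names, and toy scores are all hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class.

    scores: confidence that each image shows the action.
    labels: 1 if the image truly shows the action, else 0.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            # Precision measured at the rank of each true positive.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(scores_by_class, labels_by_class):
    """Average the per-class AP values over all action classes."""
    aps = [
        average_precision(scores_by_class[name], labels_by_class[name])
        for name in scores_by_class
    ]
    return sum(aps) / len(aps)


# Toy data (hypothetical): two action classes, four images each.
toy_scores = {
    "riding_bike": [0.9, 0.8, 0.3, 0.1],
    "reading": [0.2, 0.7, 0.6, 0.4],
}
toy_labels = {
    "riding_bike": [1, 0, 1, 0],
    "reading": [0, 1, 1, 0],
}

if __name__ == "__main__":
    print(f"mAP = {mean_average_precision(toy_scores, toy_labels):.4f}")
```

A class whose positives are all ranked first gets an AP of 1.0, so mAP rewards classifiers that rank every true instance of each action above the distractors.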