32nd International Conference on Electrical Engineering
Human Action Recognition in Still Images Using ConViT
Authors:
Seyed Rohollah Hosseyni (1)
Sanaz Seyedin (2)
Hassan Taheri (3)
1- Amirkabir University of Technology
2- Amirkabir University of Technology
3- Amirkabir University of Technology
Keywords:
Human action recognition, Still images, Convolutional Neural Network, Vision Transformer
Abstract:
Understanding the relationships between different parts of an image is crucial in many applications, including object recognition, scene understanding, and image classification. Although Convolutional Neural Networks (CNNs) have demonstrated impressive results in classifying and detecting objects, they lack the ability to extract the relationships between different parts of an image, which is a crucial factor in Human Action Recognition (HAR). To address this problem, this paper proposes a new module that functions like a convolutional layer but is built on a Vision Transformer (ViT). In the proposed model, the ViT can complement a CNN in a variety of tasks by helping it extract the relationships among the various parts of an image. Compared with a simple CNN, the proposed model is shown to extract the meaningful parts of an image and suppress the misleading ones. Evaluated on the Stanford40 and PASCAL VOC 2012 action datasets, the proposed model achieves 95.5% and 91.5% mean Average Precision (mAP), respectively, which is promising compared with other state-of-the-art methods.
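The abstract does not describe the internals of the proposed module, so the following is only a minimal sketch of the general idea it points to: letting every spatial position of a CNN feature map attend to every other position, while returning an output of the same shape as the input so the block can sit where a convolutional layer would. Everything here is an assumption for illustration, not the authors' architecture: the function name `attention_over_feature_map`, the single attention head, the identity query/key/value projections, and the residual connection are all hypothetical choices. Pure Python is used so the example has no dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_over_feature_map(feature_map):
    """Apply single-head self-attention across all spatial positions.

    feature_map: H x W grid (list of rows) of C-dimensional feature vectors,
    e.g. the output of a convolutional stage.
    Returns a grid of the same H x W x C shape.

    Assumption: queries, keys and values are the raw features (identity
    projections); a trained module would learn these projections.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    # Flatten the grid into a sequence of tokens, one per spatial position.
    tokens = [feature_map[r][c] for r in range(height) for c in range(width)]
    scale = 1.0 / math.sqrt(channels)

    mixed = []
    for query in tokens:
        # Scaled dot-product similarity between this position and every position.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        # Each output vector is a weighted average over all positions.
        mixed.append([
            sum(w * value[ch] for w, value in zip(weights, tokens))
            for ch in range(channels)
        ])

    # Residual connection, then fold the sequence back into an H x W grid.
    out = [[x + m for x, m in zip(tok, mix)] for tok, mix in zip(tokens, mixed)]
    return [out[r * width:(r + 1) * width] for r in range(height)]


# Toy 2 x 2 feature map with 2 channels per position.
toy = [[[1.0, 0.0], [0.0, 1.0]],
       [[1.0, 0.0], [0.5, 0.5]]]
result = attention_over_feature_map(toy)
```

Unlike a convolution, whose output at a position depends only on a local neighbourhood, each output vector here depends on the whole feature map, which is the kind of long-range relationship the abstract says a plain CNN fails to capture.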