32nd International Conference on Electrical Engineering
Human Action Recognition in Still Images Using ConViT
Authors:
Seyed Rohollah Hosseyni (1), Sanaz Seyedin (2), Hassan Taheri (3)
1- Amirkabir University of Technology
2- Amirkabir University of Technology
3- Amirkabir University of Technology
Keywords:
Human action recognition, Still images, Convolutional Neural Network, Vision Transformer
Abstract:
Understanding the relationships between different parts of an image is crucial in a variety of applications, including object recognition, scene understanding, and image classification. Although Convolutional Neural Networks (CNNs) have demonstrated impressive results in classifying and detecting objects, they lack the capability to extract the relationships between different parts of an image, which is a crucial factor in Human Action Recognition (HAR). To address this problem, this paper proposes a new module that uses a Vision Transformer (ViT) and functions like a convolutional layer. In the proposed model, the Vision Transformer can complement a convolutional neural network in a variety of tasks by helping it extract the relationships among various parts of an image effectively. It is shown that the proposed model, compared to a simple CNN, can extract meaningful parts of an image and suppress the misleading parts. The proposed model was evaluated on the Stanford40 and PASCAL VOC 2012 action datasets and achieved 95.5% and 91.5% mean Average Precision (mAP), respectively, which is promising compared to other state-of-the-art methods.
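For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a mean Average Precision figure can be computed. It assumes the common non-interpolated definition of AP (the mean of the precision values at the rank of each positive sample); the abstract does not state the exact evaluation protocol, and PASCAL VOC tooling may use an interpolated variant. The function names, class names, scores, and labels below are hypothetical and serve only as an illustration.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class (an assumed definition, see lead-in).

    scores: classifier confidences, one per image.
    labels: 1 if the image truly shows the class, else 0.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision measured at the rank of this positive sample.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """per_class maps a class name to a (scores, labels) pair."""
    aps = [average_precision(s, l) for s, l in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data for two action classes (not taken from the paper).
toy = {
    "riding_a_bike": ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),
    "reading": ([0.7, 0.6, 0.2], [1, 1, 0]),
}
toy_map = mean_average_precision(toy)
```

Under this definition, a class whose positives are all ranked ahead of its negatives scores an AP of 1.0, and every negative ranked above a positive lowers it, so a high mAP means the model orders images well for every action class, not only for the frequent ones.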