32nd International Conference on Electrical Engineering
Human Action Recognition in Still Images Using ConViT
Authors:
Seyed Rohollah Hosseyni (1), Sanaz Seyedin (2), Hassan Taheri (3)
1- Amirkabir University of Technology
2- Amirkabir University of Technology
3- Amirkabir University of Technology
Keywords:
Human action recognition, Still images, Convolutional Neural Network, Vision Transformer
Abstract:
Understanding the relationships between different parts of an image is crucial in a variety of applications, including object recognition, scene understanding, and image classification. Although Convolutional Neural Networks (CNNs) have demonstrated impressive results in classifying and detecting objects, they lack the capability to extract the relationships between different parts of an image, which is a crucial factor in Human Action Recognition (HAR). To address this problem, this paper proposes a new module that functions like a convolutional layer and uses a Vision Transformer (ViT). In the proposed model, the Vision Transformer can complement a convolutional neural network in a variety of tasks by helping it to effectively extract the relationships among various parts of an image. It is shown that the proposed model, compared to a simple CNN, can extract meaningful parts of an image and suppress the misleading parts. The proposed model has been evaluated on the Stanford40 and PASCAL VOC 2012 action datasets, achieving a mean Average Precision (mAP) of 95.5% and 91.5%, respectively, which is promising compared to other state-of-the-art methods.
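For readers unfamiliar with the mAP metric reported above, the following is a minimal sketch of one common way to compute it: non-interpolated Average Precision per action class, averaged over classes. The abstract does not state the exact evaluation protocol (the PASCAL VOC benchmark, for instance, defines its own interpolated variant), so this should be read as an illustration of the idea rather than the authors' evaluation code. The function names, the class names, and the toy scores are all hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one action class (illustrative definition).

    scores: predicted confidence per image.
    labels: 1 if the image shows the action, else 0.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at each rank where a positive is retrieved.
            precisions.append(hits / rank)
    # Assumption: a class with no positive images contributes an AP of 0.0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (scores, labels)."""
    aps = [average_precision(s, l) for s, l in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical toy data: two action classes, four images each.
toy = {
    "riding_bike": ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),
    "reading": ([0.2, 0.7, 0.6, 0.4], [0, 1, 1, 0]),
}
toy_map = mean_average_precision(toy)
```

On this toy data, the first class has an AP of 5/6 (positives are retrieved at ranks 1 and 3) and the second an AP of 1.0, so `toy_map` is 11/12, roughly 0.917.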