32nd International Conference on Electrical Engineering
High-Resolution Remote Sensing Image Captioning Based on Structured Attention and SAM Network
Authors:
Yassin Riyazi (1)
Seyyed Mostafa Sadjadi (2)
Abbas Zohrevand (3)
Reshad Hosseini (4)
1- University of Tehran
2- University of Tehran
3- University of Tehran
4- University of Tehran
Keywords:
image captioning, image segmentation, remote sensing image, structured attention
Abstract:
Remote sensing image captioning (RSIC) has gained popularity in recent years because of its broad applications. However, it poses extra challenges because the images are low-resolution yet carry highly structured semantic content. This work extends the RSIC framework developed by Zhao et al. [1] by incorporating image labeling and segmentation. The method uses a structured attention module that highlights important semantic components while preserving their geometric, structured shape. Images from the UCM-captions dataset are upsampled to 512×512 pixels to improve their quality and emphasize edges. The Segment Anything Model (SAM) produces better region proposals, leading to higher accuracy than traditional techniques, and its promptability yields a balanced set of large- and small-object masks. By using the model's spatial structure to provide an all-encompassing attention map, the decoder can more easily learn a suitable statistical model. The work investigates the effects of several hyperparameters, including teacher forcing, the number of region proposals, and the DSR and AVR loss factors. Overall, combining image labeling and segmentation improves RSIC, and the results show that the structured attention module and SAM work well together to improve accuracy across the hyperparameter settings considered.
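The idea of attending over segmented regions rather than a uniform pixel grid can be illustrated with a small sketch. This is not the authors' implementation: it assumes binary region masks are already available (for example from a segmenter such as SAM), uses plain dot-product scoring against a decoder query, and all function names (`masked_average_pool`, `structured_attention`) are hypothetical.

```python
import math

def masked_average_pool(features, mask):
    """Average the feature vectors at pixels where mask is 1.
    features: H x W x C nested lists; mask: H x W nested lists of 0/1."""
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        return total  # empty mask: return a zero vector
    return [t / count for t in total]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def structured_attention(features, masks, query):
    """Score each region against the query (assumed dot-product scoring),
    then spread the region weights back over the pixels of each mask."""
    region_feats = [masked_average_pool(features, m) for m in masks]
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_feats]
    weights = softmax(scores)
    height, width = len(masks[0]), len(masks[0][0])
    attention_map = [
        [sum(w * m[i][j] for w, m in zip(weights, masks)) for j in range(width)]
        for i in range(height)
    ]
    context = [
        sum(w * feat[c] for w, feat in zip(weights, region_feats))
        for c in range(len(query))
    ]
    return weights, attention_map, context

# Toy 2x2 image with 2 feature channels and two regions (left and right column).
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 0.0], [0.0, 1.0]]]
masks = [[[1, 0], [1, 0]],   # left-column region
         [[0, 1], [0, 1]]]   # right-column region
query = [1.0, 0.0]           # hypothetical decoder state
weights, attention_map, context = structured_attention(features, masks, query)
```

Because every pixel in a region shares that region's weight, the resulting attention map keeps the shape of the segmented objects, which is the property the structured attention module is described as preserving.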