32nd International Conference on Electrical Engineering
High-Resolution Remote Sensing Image Captioning Based on Structured Attention and SAM Network
Authors:
Yassin Riyazi (1)
Seyyed Mostafa Sadjadi (2)
Abbas Zohrevand (3)
Reshad Hosseini (4)
1- University of Tehran
2- University of Tehran
3- University of Tehran
4- University of Tehran
Keywords:
image captioning, image segmentation, remote sensing image, structured attention
Abstract:
Remote sensing image captioning (RSIC) has gained popularity in recent years because of its broad applications. It is more challenging than general image captioning, however, because the images are low-resolution and their semantic content is highly structured. This work extends the RSIC framework of Zhao et al. [1] by incorporating image labeling and segmentation. The method uses a structured attention module that highlights important semantic components while preserving their geometric and structural shape. Images from the UCM-captions dataset are upsampled to 512×512 pixels to improve their quality and edge emphasis. The Segment Anything Model (SAM) produces better region proposals than traditional techniques, which leads to higher accuracy, and its promptability yields a balanced set of large- and small-object masks. By using spatial structure to provide a comprehensive attention map, the model makes it easier for the decoder to learn a suitable statistical model. The work also investigates the effect of several hyperparameters: teacher forcing, the number of region proposals, and the weights of the DSR and AVR loss terms. Overall, combining image labeling and segmentation improves remote sensing image captioning, and the results show that the structured attention module and SAM work well together to improve accuracy.
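The idea of attending over segmented regions rather than individual grid cells can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`region_pool`, `structured_attention`), the dot-product scoring, and the toy inputs are all assumptions made for illustration. In the setting described above, the binary masks would come from SAM and the feature grid from an image encoder, while the query would be the decoder state at the current caption step.

```python
import math


def region_pool(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask."""
    channels = len(features[0][0])
    total = [0.0] * channels
    area = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, inside in zip(feat_row, mask_row):
            if inside:
                area += 1
                for c in range(channels):
                    total[c] += vec[c]
    if area == 0:
        raise ValueError("mask selects no pixels")
    return [t / area for t in total], area


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def structured_attention(features, masks, query):
    """Attend over regions (hypothetical sketch, not the paper's exact module).

    features: H x W grid of C-dimensional vectors
    masks:    list of R binary H x W masks, one per region proposal
    query:    C-dimensional decoder state
    Returns region weights, the context vector, and a pixel-level attention map.
    """
    pooled = [region_pool(features, m) for m in masks]
    # Assumed scoring function: dot product between region feature and query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat, _ in pooled]
    weights = softmax(scores)
    context = [
        sum(w * feat[c] for w, (feat, _) in zip(weights, pooled))
        for c in range(len(query))
    ]
    height, width = len(features), len(features[0])
    attn_map = [[0.0] * width for _ in range(height)]
    # Spread each region's weight uniformly over its pixels, so the map
    # follows region boundaries and its entries sum to 1.
    for w, (_, area), mask in zip(weights, pooled, masks):
        for i in range(height):
            for j in range(width):
                if mask[i][j]:
                    attn_map[i][j] += w / area
    return weights, context, attn_map


# Toy 2 x 2 image with 2 channels and two regions (left and right column).
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 0.0], [0.0, 1.0]]]
masks = [[[1, 0], [1, 0]],
         [[0, 1], [0, 1]]]
query = [2.0, 0.0]
weights, context, attn_map = structured_attention(features, masks, query)
```

With this query the left-column region receives the larger weight, and every pixel inside a region shares the same attention value, which is the sense in which the map keeps the region's shape.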