CAliC: Accurate and Efficient Image-Text Retrieval via Contrastive Alignment and Visual Contexts Modeling

Gao, Hongyu; Zhu, Chao; Liu, Mengyin; Gu, Weibo; Wang, Hongfa; Liu, Wei; Yin, Xu-Cheng

doi:10.1145/3503161.3548320

Cited by 4 publications

(1 citation statement)

References 6 publications


“…Methods [4], [5], [7], [10], [13], [14], [15], [16], [17], [18] employ two separate encoders to independently extract features for visual and textual data. CLIP [10] effectively applies contrastive learning to learn image-language alignment from a large volume of noisy image-text pairs, achieving remarkable performance on vision-language tasks, as demonstrated in [19], [20], [21], [22], [23]. In VATT [15], the authors employ contrastive learning to align the videos, audios and texts, and achieve impressive performance on the downstream tasks.…”

Section: Vision-language Pre-training (mentioning)

confidence: 99%

Monitoring the Spatio-Temporal Changes of Non-Cultivated Land via Long-Time Series Remote Sensing Images in Xinghua

Zhang et al. 2022

IEEE Access


The amount of cultivated land per capita in China is relatively low, and the phenomenon of non-agricultural cultivated land (NACL) in recent years has negatively impacted the stability of grain production in China. In this study, long-time series images obtained via satellite remote sensing were used to monitor spatio-temporal changes in NACL at the county scale. Seven-phase images were acquired from 1990 to 2020 (every five years) using medium-resolution Landsat MSS, TM, ETM+, and Sentinel MSI. Vegetation indices and texture features were extracted for all images. Terrain features such as slope, aspect and elevation were extracted from the DEM data. Combining vegetation index features, texture features, terrain features and multispectral bands, image classification was performed using the random forest (RF) algorithm. The classification accuracy assessment indices included overall accuracy (OA) and multiclass F-scores (Fm). Zonal statistics were used to calculate the area of cultivated land in towns for 1990 and 2020, and to create grades for the reduction of cultivated land. Finally, indicators including land use dynamic degree (LUDD), land use type change (LUTC) and land use change rate (LUCR) were adopted to reflect the spatio-temporal changes of NACL in the study area. The results show that the RF classification algorithm achieves accurate and efficient land use extraction. The OA values were greater than 86%, and the Fm values were over 0.88. The cultivated land area in the study area showed a decreasing trend. From 1990 to 2020, the ratio of cultivated land decreased from 59.75% to 50.21%. Meanwhile, the dynamic degree of cultivated land increased annually. The conversion of cultivated land into construction land was dominant, accounting for 31.84% of the total change in cultivated land over the past 30 years. This study also reveals that NACL is highly related to local economic and land-use policies. Multi-source remote sensing data have been used to quantitatively analyse the spatio-temporal changes in cultivated land conversion, providing a reference for relevant land management departments to understand cultivated land use changes and adjust land management policies.

INDEX TERMS: Non-agricultural cultivated land, long-time series, random forest, spatio-temporal change, remote sensing.
