José Luis Martínez-Fernández scite author profile

Arabic is the most widely spoken language in the Arab World. Most people in the Islamic World understand Classical Arabic because it is the language of the Qur'an. Although the number of Arabic-speaking Internet users (in the Middle East and in North and East Africa) has increased considerably in the last decade, systems that analyze Arabic digital resources automatically are not as readily available as they are for English. Therefore, in this work, an attempt is made to build a real-time Named Entity Recognition system that can be used in web applications to detect the appearance of specific named entities and events in news written in Arabic. Arabic is a highly inflectional language, so we try to minimize the impact of Arabic affixes on the quality of the pattern recognition model applied to identify named entities. These patterns are built up by processing and integrating different gazetteers, from DBPedia (http://dbpedia.org/About, 2009) to GATE (A general architecture for text engineering, 2009) and ANERGazet


Validation of Soil Moisture Data Products from the NASA SMAP Mission

Colliander, Reichle, Crow et al. 2021

Preprint


NASA’s Soil Moisture Active Passive (SMAP) mission has been validating its soil moisture (SM) products since the start of data production on March 31, 2015. Prior to launch, the mission defined a set of criteria for core validation sites (CVS) that enable the testing of the key mission SM accuracy requirement (unbiased root-mean-square error <0.04 m³/m³). The validation approach also includes other (“sparse network”) in situ SM measurements, satellite SM products, model-based SM products, and field experiments. Over the past six years, the SMAP SM products have been analyzed with respect to these reference data, and the analysis approaches themselves have been scrutinized in an effort to best understand the products’ performance. Validation of the most recent SMAP Level 2 and 3 SM retrieval products (R17000) shows that the L-band (1.4 GHz) radiometer-based SM record continues to meet mission requirements. The products are generally consistent with SM retrievals from the ESA Soil Moisture Ocean Salinity mission, although there are differences in some regions. The high-resolution (3-km) SM retrieval product, generated by combining Copernicus Sentinel-1 data with SMAP observations, performs within expectations. Currently, however, there is limited availability of 3-km CVS data to support extensive validation at this spatial scale. The most recent (version 5) SMAP Level 4 SM data assimilation product providing surface and root-zone SM with complete spatio-temporal coverage at 9-km resolution also meets performance requirements. The SMAP SM validation program will continue throughout the mission life; future plans include expanding it to forested and high-latitude regions.
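The accuracy requirement quoted in this abstract is stated as an unbiased root-mean-square error (ubRMSE), which is the RMSE of the retrieval errors after their mean (the bias) has been removed. The following is a minimal sketch of that standard definition, not the mission's validation code; the function name `ubrmse`, the constant name, and the sample values are illustrative assumptions.

```python
import math
from typing import Sequence

# Threshold quoted in the abstract, in m^3/m^3.
UBRMSE_REQUIREMENT = 0.04


def ubrmse(retrieved: Sequence[float], in_situ: Sequence[float]) -> float:
    """Unbiased RMSE: RMSE of the errors after subtracting their mean (bias).

    Hypothetical helper for illustration; inputs are paired soil moisture
    values in m^3/m^3 (retrieval vs. reference measurement).
    """
    if len(retrieved) != len(in_situ) or len(retrieved) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [r - s for r, s in zip(retrieved, in_situ)]
    bias = sum(errors) / len(errors)
    # Equivalent to sqrt(RMSE^2 - bias^2).
    return math.sqrt(sum((e - bias) ** 2 for e in errors) / len(errors))


if __name__ == "__main__":
    # Made-up sample values, only to show the call.
    sat = [0.12, 0.08, 0.21, 0.30]
    ref = [0.10, 0.10, 0.20, 0.27]
    value = ubrmse(sat, ref)
    print(f"ubRMSE = {value:.4f} m^3/m^3, meets requirement: {value < UBRMSE_REQUIREMENT}")
```

Because the bias is subtracted first, a retrieval that is offset from the reference by a constant amount has a ubRMSE of zero; the metric only penalizes the random component of the error.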


A Preliminary Approach to the Automatic Extraction of Business Rules from Unrestricted Text in the Banking Industry

Martínez-Fernández, González, Villena et al.


This paper addresses the problem of extracting formal statements, in the form of business rules, from free-text descriptions of financial products or services. This automatic process is integrated into the banking software factory, allowing business analysts to formally specify, directly implement and quickly deploy new products. The system is fully integrated with the typical software methodologies and architectures used in the banking industry for conventional development of back-office or online applications.


MIRACLE at ImageCLEFphoto 2007: Evaluation of Merging Strategies for Multilingual and Multimedia Information Retrieval

Villena-Román, Lana-Serrano, Martínez-Fernández et al.


This paper describes the participation of the MIRACLE research consortium in the ImageCLEF Photographic Retrieval task of ImageCLEF 2007. For this campaign, the main purpose of our experiments was to thoroughly study different merging strategies, i.e. methods of combining textual and visual retrieval techniques. While we have applied all the well-known techniques which we had already used in previous campaigns, for both the textual and visual components of the system, our research has primarily focused on performing all possible combinations of those techniques in order to evaluate which ones offer the best results and to analyze whether the combined results improve (in terms of MAP) on the individual ones.

The system includes three main modules. On the one hand, apart from the search engine (Xapian or Lucene), the textual retrieval module includes parsers, stemming, stopword filtering, proper noun detection and semantic expansion components. On the other hand, the visual retrieval module is based on two well-known content-based engines: GIFT and FIRE. Finally, the merging module allows different operators (AND, OR, LEFT, RIGHT) to be used to combine the outputs of the two previous subsystems and to calculate the result relevance based on different metrics (max, min, avg, max-min).

We finally submitted 110 multilingual textual (text-based) runs, 22 visual (content-based) runs and 21 mixed runs. Results in general show poor performance for all groups, due to the characteristics of the image collection and the difficulty of the defined topics. The most interesting conclusion is that the defined merging strategies are successful, as our best mixed experiment outperforms both the textual and visual experiments on which it is based, using the LEFT operator for the combination along with the max-min metric for computing the relevance.
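The merging module described in this abstract can be illustrated with a small sketch. The abstract names the operators and metrics but does not define them, so the semantics below are assumptions: AND keeps documents returned by both subsystems, OR keeps documents returned by either, and LEFT or RIGHT keeps every document from the textual or visual list respectively. The function `merge` and the sample scores are hypothetical, and the max-min metric is left out because its definition is not given in the abstract.

```python
from typing import Callable, Dict

Scores = Dict[str, float]  # document id -> relevance score

# Metrics for documents present in both result lists (assumed semantics).
METRICS: Dict[str, Callable[[float, float], float]] = {
    "max": max,
    "min": min,
    "avg": lambda a, b: (a + b) / 2.0,
}


def merge(left: Scores, right: Scores, operator: str, metric: str) -> Scores:
    """Combine two scored result lists; a sketch, not the MIRACLE implementation."""
    combine = METRICS[metric]
    if operator == "AND":
        ids = left.keys() & right.keys()
    elif operator == "OR":
        ids = left.keys() | right.keys()
    elif operator == "LEFT":
        ids = set(left)
    elif operator == "RIGHT":
        ids = set(right)
    else:
        raise ValueError(f"unknown operator: {operator}")

    merged: Scores = {}
    for doc_id in ids:
        if doc_id in left and doc_id in right:
            merged[doc_id] = combine(left[doc_id], right[doc_id])
        else:
            # Present in only one list: keep that list's score unchanged.
            merged[doc_id] = left[doc_id] if doc_id in left else right[doc_id]
    return merged


if __name__ == "__main__":
    textual = {"img1": 0.9, "img2": 0.4}  # made-up scores
    visual = {"img2": 0.8, "img3": 0.5}
    ranked = sorted(merge(textual, visual, "LEFT", "max").items(),
                    key=lambda item: item[1], reverse=True)
    print(ranked)
```

Under these assumed semantics, LEFT lets the textual run decide which documents appear in the result while the visual scores only adjust their ranking.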
