Word representation has always been an important research area in natural language processing (NLP). Understanding complex text data is imperative, given that it is rich in information and can be used across a wide range of applications. In this survey, we explore different word representation models and their expressive power, from classical approaches to modern state-of-the-art (SOTA) language models (LMs). We describe the variety of text representation methods and model designs that have blossomed in NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations that capture the same semantic information. Such representations can then be used by various machine learning (ML) algorithms for a variety of NLP-related tasks. Finally, this survey briefly discusses commonly used ML- and deep learning (DL)-based classifiers, evaluation metrics, and the applications of these word embeddings in different NLP tasks.
Next-generation wireless communication networks, in particular densified 5G, will bring many developments to the existing telecommunications industry. The key benefits will be higher throughput and very low latency. In this context, the use of unmanned aerial vehicles (UAVs) is becoming a feasible option for deploying 5G services on demand. At the same time, the immense bandwidth potential of mmWave has strengthened its performance in radio communication. In this article, we provide a consolidated synthesis of the role of UAVs and mmWave in 5G, with emphasis on recent developments and challenges. The review focuses on UAV relay architectures and identifies the relevant problems and limitations in deploying UAVs that use mmWave in both access and backhaul links simultaneously. The optimum placement of UAVs as relays is examined critically, with a focus on the mmWave band. The distinctive, rich characteristics of mmWave propagation and scattering are presented. We also synthesize mmWave path loss models. Then, the scope of artificial intelligence and machine learning techniques as an efficient solution for handling the dynamic and complex nature of UAV-based cellular communication networks is discussed. Finally, security and privacy issues in UAV-based cellular networks are highlighted. We believe that the literature discussed and the findings reached in this article are of significant importance to researchers, application engineers and decision-makers in the design and deployment of UAV-supported 5G networks.
Sentiment analysis of Arabic texts is an important task in many commercial applications, such as Twitter. This study introduces a multi-criteria method to empirically assess and rank classifiers for Arabic sentiment analysis. Prominent machine learning algorithms were used to build classification models for Arabic sentiment analysis. Moreover, the performance measures of the top five machine learning classifiers were assessed in order to rank the classifiers. We integrated the top five ranking methods with evaluation metrics of machine learning classifiers such as accuracy, recall, precision, F-measure, CPU time, classification error, and area under the curve (AUC). The method was tested on Saudi Arabic product reviews to compare five popular classifiers. Our results suggest that deep learning and support vector machine (SVM) classifiers perform best, with accuracy 85.25% and 82.30%; precision 85.30% and 83.87%; recall 88.41% and 83.89%; F-measure 86.81% and 83.87%; classification error 14.75% and 17.70%; and AUC 0.93 and 0.90, respectively. They outperform decision tree, K-nearest neighbours (K-NN), and Naïve Bayes classifiers.
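As a rough illustration of how several of the metrics named above relate to one another, the following minimal sketch derives accuracy, precision, recall, F-measure, and classification error from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; AUC and CPU time are left out because they cannot be computed from a confusion matrix alone.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float
    classification_error: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. All names here are illustrative assumptions.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return BinaryMetrics(
        accuracy=accuracy,
        precision=precision,
        recall=recall,
        f_measure=f_measure,
        classification_error=1.0 - accuracy,  # error is the complement of accuracy
    )


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(binary_metrics(tp=40, fp=10, fn=5, tn=45))
```

Classification error is the complement of accuracy, which matches the figures reported above (for example, 85.25% accuracy alongside 14.75% error).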
An enormous amount of clinical free-text information, such as pathology reports, progress reports, clinical notes and discharge summaries, has been collected at hospitals and medical care clinics. These data provide an opportunity to develop many useful machine learning applications if they can be transferred into a learnable structure with appropriate labels for supervised learning. The annotation of these data has to be performed by qualified clinical experts, and the high cost of annotation therefore limits their use. Active learning (AL), an underutilised machine learning technique that can label new data, is a promising candidate for addressing the high cost of labelling. AL has been successfully applied to labelling for speech recognition and text classification; however, there is a lack of literature investigating its use for clinical purposes. We performed a comparative investigation of various AL techniques using machine learning (ML) and deep learning (DL) based strategies on three unique biomedical datasets. We investigated the Random Sampling (RS), Least Confidence (LC), Informative Diversity and Density (IDD), Margin, and Maximum Representativeness-Diversity (MRD) AL query strategies. Our experiments show that AL has the potential to significantly reduce the cost of manual labelling. Additionally, AL-assisted pre-annotation accelerates the de novo annotation process, reducing the annotation time required.
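To make two of the query strategies named above more concrete, the sketch below shows the textbook forms of least-confidence and margin sampling over a model's predicted class probabilities. It is an illustration under stated assumptions, not the study's implementation: the function names, the `select_queries` helper, and the example probabilities are all hypothetical, and IDD and MRD are omitted because the abstract does not define them.

```python
from typing import List, Sequence


def least_confidence_scores(probs: Sequence[Sequence[float]]) -> List[float]:
    """Uncertainty = 1 - probability of the most likely class (higher = more uncertain)."""
    return [1.0 - max(row) for row in probs]


def margin_scores(probs: Sequence[Sequence[float]]) -> List[float]:
    """Margin = gap between the two most likely classes (smaller = more uncertain)."""
    scores = []
    for row in probs:
        top, second = sorted(row, reverse=True)[:2]
        scores.append(top - second)
    return scores


def select_queries(
    probs: Sequence[Sequence[float]], k: int, strategy: str = "least_confidence"
) -> List[int]:
    """Return indices of the k unlabelled samples to send to an annotator.

    `probs` holds one row of class probabilities per unlabelled sample
    (at least two classes per row). This helper is hypothetical.
    """
    indices = range(len(probs))
    if strategy == "least_confidence":
        scores = least_confidence_scores(probs)
        # Query the samples the model is least confident about first.
        ranked = sorted(indices, key=lambda i: scores[i], reverse=True)
    elif strategy == "margin":
        scores = margin_scores(probs)
        # Query the samples with the smallest top-two margin first.
        ranked = sorted(indices, key=lambda i: scores[i])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical predicted probabilities for four unlabelled documents.
    pool = [
        [0.90, 0.05, 0.05],
        [0.40, 0.35, 0.25],
        [0.50, 0.48, 0.02],
        [0.70, 0.20, 0.10],
    ]
    print(select_queries(pool, k=2, strategy="least_confidence"))
    print(select_queries(pool, k=2, strategy="margin"))
```

The two strategies can rank the same pool differently: least confidence looks only at the top class probability, while margin looks at how close the runner-up is.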