“…The proposed model showed that the product level and user level are critical in fake review detection. More recently, Cao et al. [65] introduced a deceptive review detection framework that combines fine-grained and coarse-grained features to capture the implicit semantic information in reviews. The extracted features were learned with a coarse-grained concatenation of two neural network layers and Latent Dirichlet Allocation (LDA).…”
Section: Other Neural Network Methods
“…A second dataset is the "deception dataset" [100], constructed from TripAdvisor and Amazon Mechanical Turk for the city of Chicago, which contains 3,032 reviews from different domains (Hotel, Restaurant, and Doctor) collected through a crowdsourcing platform. This dataset has been used extensively in the literature, and it is a semi-real dataset [3,4,12,27,29,32,37,65]. For simplicity, we combined the reviews from these three domains at the current stage, and we leave the investigation of each domain separately (i.e., a multi-domain detection model) for future work.…”
In e-commerce, user reviews can play a significant role in determining the revenue of an organisation. Online users rely on reviews before making decisions about any product or service. As such, the credibility of online reviews is crucial for businesses and can directly affect companies' reputation and profitability. That is why some businesses pay spammers to post fake reviews. These fake reviews exploit consumer purchasing decisions. Consequently, techniques for detecting fake reviews have been extensively explored in the past twelve years. However, a survey that analyses and summarises the existing approaches is still lacking. To address this gap, this survey paper details the task of fake review detection, summarising the existing datasets and their collection methods. It analyses the existing feature extraction techniques. It also critically summarises and analyses the existing techniques to identify gaps, based on two groups: traditional statistical machine learning and deep learning methods. Further, we conduct a benchmark study to investigate the performance of different neural network models and transformers that have not yet been used for fake review detection. The experimental results on two benchmark datasets show that RoBERTa performs about 7% better than the state-of-the-art methods in a mixed domain for the deception dataset, with the highest accuracy of 91.2%, which can be used as a baseline for future studies. Finally, we highlight the current gaps in this research area and possible future directions.
INDEX TERMS: Fake review; Fake review detection; Feature engineering; Machine learning; Deep learning.
“…Topic models have been studied in various fields. In the realm of electronic commerce research, topic narratives (Bastani et al., 2019), topic distribution (Cao et al., 2020), topic features (Mou et al., 2019; Zhong and Schweidel, 2020) from customer reviews are verified to have a significant relationship with company performance. In the field of corporate management, topic features are used to investigate stock performance (Liu, 2020), stock market efficiency (Xu et al., 2020) and detect corporate fraud (Dong et al., 2018).…”
In this paper, we develop an intelligent approach to detect the default risk of FinTech lending platforms. Using China's peer‐to‐peer (P2P) lending market as an empirical application, we assemble a unique dataset of matched default and non‐default platforms. We apply state‐of‐the‐art techniques to extract sentiment and topic features from several stakeholders' social media data, which are used as supportive soft information. Our approach exhibits better predictive ability than approaches that use hard information only, which demonstrates the value of dynamic soft information. Our approach serves as a proof of concept to complement traditional methods of financial risk prediction.
“…The Bi-LSTM structure, based on the deep learning framework, has a good fitting ability and is widely used in text classification (Nguyen & Le Nguyen, 2018). Bi-LSTM is used for text analysis to detect whether e-commerce reviews are deceptive (Cao et al., 2020). This study uses a neural-network-based sequence model for classification and feeds the results into a one-dimensional convolutional neural network for sentiment classification.…”
The time series data of financial markets are nonlinear, owing to rapid data accumulation. Thus, research on stock price prediction has always been a challenge. This study proposes a quantitative trading strategy that combines basic quantitative trading rules and deep learning methods to help investors realize arbitrage. We combine basic quantitative trading arbitrage with deep learning frameworks to fully extract market characteristics and develop trading strategies for investors. The hybrid forecasting model is a new signal‐trading system that uses a genetic algorithm to obtain optimal parameters for the technical indicator timing method of the moving average price. The deep learning structure of the CNN‐Bi‐LSTM, with the attention mechanism and modified loss function, optimizes the trading signal to achieve local optimization. Its core concept is to determine the trading signal through the local trend of price fluctuations and then correct the trading signal through the prediction results. Trading data for A‐shares in the Chinese market are used in the statistical arbitrage analysis to output actual trading signals and verify the effectiveness of the system. The results demonstrate that an arbitrage strategy based only on moving average trading rules is ineffective. With the optimization of the deep learning CNN‐Bi‐LSTM framework, the arbitrage ability improves significantly. The optimized strategy can increase the final profit by 1.6042 to the greatest extent. The annualized revenue increased by 35.16%, and the winning rate increased by 15.22%. In addition, we consider the transaction costs during the simulated transaction process. An optimized trading strategy can effectively seize arbitrage opportunities; hence, its profitability and stability are significantly improved.