Abstract: The manuscript presents a tool to estimate and predict data accuracy in the hospitality sector by means of automated machine learning (AutoML). It uses the tree-based pipeline optimization tool (TPOT) as its methodological framework. TPOT is an AutoML framework based on genetic programming, and it is particularly useful for generating classification models, for regression analysis, and for determining the most accurate algorithms and hyperparameters for hospitality data. To demonstrate the presented tool's real usefulness, we show t…
“…The range of tasks to which AutoML can be applied is very broad, ranging from processing simple baseline ML models up to complex artificial neural networks (ANNs) in deep learning (DL), with the number of upcoming AutoML methods and concepts continuing to rise [2]. Among the latest trends to be observed is the application of genetic programming (GP) as an optimization method in AutoML [3,4].…”
The objective of this article is to provide a comparative analysis of two novel genetic programming (GP) techniques, differentiable Cartesian genetic programming for artificial neural networks (DCGPANN) and geometric semantic genetic programming (GSGP), against state-of-the-art automated machine learning (AutoML) tools, namely Auto-Keras, Auto-PyTorch, and Auto-Sklearn. While each of these techniques was compared to several baseline algorithms upon its introduction, the literature still lacks direct comparisons between them, especially between the GP approaches and state-of-the-art AutoML. This study intends to fill that gap and analyze the true potential of GP for AutoML. The performances of the different tools are assessed by applying them to 20 benchmark datasets for imbalanced binary classification, a frequent and challenging problem area. The tools are compared across four categories: average performance, maximum performance, standard deviation of performance, and generalization ability, using the F1-score, G-mean, and AUC as evaluation metrics. The analysis finds that the GP techniques, while unable to completely outperform state-of-the-art AutoML, are already a very competitive alternative. These advanced GP tools thus provide a new and promising approach for practitioners developing machine learning (ML) models. DOI: 10.28991/ESJ-2023-07-04-021
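The three evaluation metrics named above (F1-score, G-mean, and AUC) can be sketched in plain Python. This is a minimal illustration of the metric definitions only; the labels and scores below are invented for the example and are not taken from the study's benchmarks.

```python
def f1_score(y_true, y_pred):
    # Harmonic mean of precision and recall, written directly
    # in terms of true positives, false positives, false negatives.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn)

def g_mean(y_true, y_pred):
    # Geometric mean of sensitivity (recall on the positive class)
    # and specificity (recall on the negative class) — a standard
    # metric for imbalanced classification.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity * specificity) ** 0.5

def auc(y_true, y_score):
    # Probability that a randomly chosen positive is scored above a
    # randomly chosen negative (ties count half) — the Mann-Whitney
    # formulation of ROC AUC.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative imbalanced example: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
print(f1_score(y_true, y_pred), g_mean(y_true, y_pred), auc(y_true, y_score))
```

On imbalanced data, accuracy alone is misleading (predicting the majority class everywhere already scores high), which is why the study reports these three metrics instead.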
“…(1) Hospitality and lodging have long been intertwined, and the large number of hotels currently available gives visitors more access to the value and choice within an area. (2) The mere location or existence of a hotel is not enough to uplift the tourism of an area, but it can also indicate health tourism. The hospitality industry comprises four sectors: hotels, food and drink, travel and tourism, and leisure.…”
Section: Introduction
“…14.3 million jobs in the US are in the hospitality and tourism sector. (2) There are 561,000 Latino workers in the hotel sector, and half of them are Latina women. Of these Latin American accommodation workers, 41% perform cleaning duties in the housekeeping departments of hotel facilities, work that is often considered "dirty".…”
Section: Introduction
“…Of these Latin American accommodation workers, 41% perform cleaning duties in the housekeeping departments of hotel facilities, work that is often considered "dirty". (2) While the job of cleaning hotel rooms is physically demanding, the pay is low, with an average annual income of US$19,570. In addition, employees are exposed to a variety of physical risks and psychosocial stressors in their work environment, as well as to chemical and biological agents and toxic substances that can cause respiratory, skin, and infectious diseases.…”
This article examines the trend toward the adoption of machine learning in the hotel business in light of the significance of new technologies. According to previous research, the hospitality industry uses a variety of chemicals for cleaning. Cleaning supplies are the housekeeping department's primary tool in its daily routine to keep rooms and common areas clean and tidy. Guests and staff are often unaware of the harmfulness of these chemicals. Providing hospitality that meets the needs of guests requires not only a positive attitude but also high-quality, excellent services that keep guests warm, relaxed, and comfortable. In some incidents, however, guest and staff health is affected by these chemicals, and no prior work has used machine learning models to predict the chemicals' effects on staff and guest health in the hospitality sector. For this purpose, data were collected from different hotels in Delhi NCR, India, and divided into two distinct sets for training and evaluation. Five machine learning methods were employed in the research project. The newly developed MHC-CNN algorithm achieved the highest accuracy (93.75%) in comparison with other cutting-edge machine learning techniques. The created technique can be extended and applied in hotels around the world.
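The evaluation step described above — comparing several trained models by held-out accuracy — can be sketched in a few lines. The model names, labels, and predictions below are hypothetical placeholders (the paper's MHC-CNN architecture and dataset are not public), so this shows only the comparison pattern, not the study's actual models.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    assert len(y_true) == len(y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical held-out labels and per-model predictions on the test split.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 1, 0, 0, 0, 1, 1],
    "random_forest":       [1, 0, 1, 1, 0, 0, 0, 0],
    "mhc_cnn":             [1, 0, 1, 1, 0, 0, 1, 0],
}

# Rank the candidate models from best to worst test accuracy.
for name, y_pred in sorted(predictions.items(),
                           key=lambda kv: -accuracy(y_test, kv[1])):
    print(f"{name}: {accuracy(y_test, y_pred):.3f}")
```

In practice the same loop works unchanged with predictions produced by any trained classifier, which is how a newly proposed method is typically benchmarked against established ones.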
“…Nevertheless, it may be used as a solid reference by authors or users who wish to use the same model to extract information from similar sources (invoices). Furthermore, it is also a valid and useful contribution to making empiricism on data mainstream, as advocated by other authors [23,24], and to promoting the usage of learning curves as part of a standard learning-system evaluation [21]. One example of how such empiricism has gained importance in the practical usage of deep learning models is the Model Cards of the well-known Hugging Face repository (https://huggingface.co/docs/hub/model-cards).…”
One of the main challenges when training or fine-tuning a machine learning model concerns the number of observations necessary to achieve satisfactory performance. While, in general, more training observations result in a better-performing model, collecting more data can be time-consuming, expensive, or even impossible. For this reason, investigating the relationship between a dataset's size and the performance of a machine learning model is fundamental to deciding, with a certain likelihood, the minimum number of observations necessary to ensure that a satisfactorily performing model results from the training process. The learning curve represents the relationship between the dataset's size and the performance of the model and is especially useful when choosing a model for a specific task or planning the annotation work for a dataset. The purpose of this paper is therefore to find the functions that best fit the learning curves of a Transformer-based model (LayoutLM) when fine-tuned to extract information from invoices. Two new datasets of invoices are made available for this task. Combined with a third dataset already available online, 22 sub-datasets are defined, and their learning curves are plotted based on cross-validation results. The functions are fit using a non-linear least squares technique. The results show that both a bi-asymptotic and a Morgan-Mercer-Flodin function fit the learning curves extremely well. In addition, an empirical relation is presented to predict the learning curve from a single parameter that can easily be obtained in the early stage of the annotation process. DOI: 10.28991/ESJ-2023-07-05-03
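The fitting procedure described above — non-linear least squares on a growth curve — can be sketched with `scipy.optimize.curve_fit` and the Morgan-Mercer-Flodin (MMF) form. The dataset sizes, scores, and parameter values below are synthetic illustrations, not the paper's measured learning curves.

```python
import numpy as np
from scipy.optimize import curve_fit

def mmf(x, a, b, c, d):
    # Morgan-Mercer-Flodin growth curve: rises from the initial
    # level a toward the asymptote c as the dataset size x grows.
    return (a * b + c * x ** d) / (b + x ** d)

# Synthetic learning-curve points: performance vs. training-set size.
sizes = np.array([50, 100, 250, 500, 1000, 2500, 5000], dtype=float)
scores = mmf(sizes, 0.30, 200.0, 0.95, 1.0)

# Non-linear least squares fit of the four MMF parameters.
params, _ = curve_fit(mmf, sizes, scores,
                      p0=[0.2, 100.0, 0.9, 1.0], maxfev=10000)

# The fitted asymptote c estimates the performance ceiling, and the
# curve extrapolates performance for annotation budgets not yet tried.
print("fitted asymptote:", params[2])
print("predicted score at 20,000 samples:", mmf(20000.0, *params))
```

This is exactly why learning-curve fitting helps annotation planning: once the curve is pinned down from early, cheap data points, the marginal benefit of labeling more invoices can be read off the extrapolated tail.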