Although by the end of 2020 most companies were expected to be running 1,000-node Hadoop clusters, Hadoop deployments are still accompanied by many challenges, such as security, fault tolerance, and flexibility. Hadoop is a software framework for handling big data, and it includes a distributed file system, the Hadoop Distributed File System (HDFS). HDFS achieves fault tolerance through data replication: each block is copied to multiple DataNodes, which provides reliability and availability. Although the replication technique works well, it wastes considerable time because it uses a single-pipeline paradigm. The proposed approach improves the performance of HDFS by transferring data blocks over multiple pipelines instead of a single pipeline. In addition, each DataNode updates its reliability value after each round and sends the updated value to the NameNode, which sorts the DataNodes by reliability. When a client requests to upload a data block, the NameNode replies with a list of highly reliable DataNodes, which yields high performance. The proposed approach is fully implemented, and the experimental results show that it improves the performance of HDFS write operations.
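The reliability-ranked DataNode selection described above can be sketched as follows. This is a minimal illustration, not HDFS code: the node names, reliability scores, and the exponentially weighted update rule are all assumptions standing in for whatever metric the paper actually uses.

```python
# Hedged sketch of NameNode-side DataNode selection by reliability.
# Node IDs, scores, and the update rule are illustrative assumptions.

def select_datanodes(reliability, replication_factor=3):
    """Return the IDs of the most reliable DataNodes for a new block."""
    ranked = sorted(reliability, key=reliability.get, reverse=True)
    return ranked[:replication_factor]

def update_reliability(reliability, node, success, alpha=0.2):
    """Update a node's score after a transfer round (assumed EWMA rule)."""
    reliability[node] = (1 - alpha) * reliability[node] + alpha * (1.0 if success else 0.0)
    return reliability[node]

nodes = {"dn1": 0.90, "dn2": 0.75, "dn3": 0.95, "dn4": 0.60}
print(select_datanodes(nodes))  # → ['dn3', 'dn1', 'dn2']
```

A client upload would then be pipelined to the returned list; failed transfers lower a node's score, pushing it out of future replies.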
Annotation is considered one of the main applications of the Semantic Web. The idea behind annotation is to add metadata to existing information, which allows machines to deal with data that carry meaning and can be read automatically. Semantic annotation is one of the techniques used to semantically enrich web content; it facilitates writing comments on and evaluating previously annotated resources, which can lead to better search results. Our framework aims to enrich an ontology by embedding data directly into it in order to obtain complete and accurate data.
Although planning techniques have achieved significant progress in recent years, many planning problems remain difficult even for modern planners. In this paper, we adapt the landmark concept to the hybrid planning setting, a method that combines reasoning about procedural knowledge and causalities. Landmarks are a well-known concept in the realm of classical planning, and they have recently been adapted to hierarchical approaches. Such landmarks can be extracted in a pre-processing step from a declarative hierarchical planning domain and problem description. It has been shown that this technique allows a considerable reduction of the search space by eliminating futile plan development options before the actual planning. We therefore present a new approach that integrates this landmark pre-processing technique for hierarchical planning with the landmark technique of classical planning. This integration makes it possible to use landmark tasks extracted from hierarchical domain knowledge in the form of HTNs together with landmark literals from classical planning. To this end, we construct a transformation technique that converts the hybrid planning domain into a classical domain model. The methodologies in this paper have been implemented successfully, and we present experimental results that give evidence for the considerable performance increase gained by the planning system.
While several approaches have been developed to enhance the efficiency of hierarchical Artificial Intelligence planning (AI planning), complex AI-planning problems remain challenging to overcome. To find a solution plan, a hierarchical planner produces a huge search space that may be infinite; a planner with a small search space is likely to be more efficient than one that produces a large search space. In this paper, we present a new approach that integrates hierarchical AI planning with the map-reduce paradigm. In the map part, we apply the proposed clustering technique to divide the hierarchical planning problem into smaller problems, so-called sub-problems. A pre-processing technique is applied to each sub-problem to reduce its declarative hierarchical planning domain model and then to find an individual solution, a so-called sub-plan, for each sub-problem. In the reduce part, conflicts between sub-plans are resolved to provide a general solution plan for the given hierarchical AI-planning problem. The pre-processing phase helps the planner prune the hierarchical planning search space of each sub-problem by removing the compulsory literal elements, which helps the hierarchical planner seek a solution. The proposed approach has been fully implemented, and experimental results are provided as evidence of our approach's substantial improvement in efficiency.
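The map-reduce decomposition described above can be sketched in outline: split the task set into sub-problems (map), solve each independently, and merge the sub-plans while resolving conflicts (reduce). This is a minimal illustration under strong assumptions: round-robin clustering, an identity "planner", and duplicate-dropping as the conflict-resolution rule stand in for the paper's actual techniques.

```python
# Hedged sketch of map-reduce planning: the clustering, planner, and
# conflict resolution below are illustrative stand-ins, not the paper's.

def map_phase(tasks, n_clusters):
    """Partition tasks into sub-problems (here: simple round-robin)."""
    clusters = [[] for _ in range(n_clusters)]
    for i, task in enumerate(tasks):
        clusters[i % n_clusters].append(task)
    return clusters

def solve(subproblem):
    """Stand-in planner: a sub-plan is just the ordered task list."""
    return list(subproblem)

def reduce_phase(subplans):
    """Naive conflict resolution: concatenate, dropping duplicate steps."""
    merged, seen = [], set()
    for plan in subplans:
        for step in plan:
            if step not in seen:
                seen.add(step)
                merged.append(step)
    return merged

tasks = ["load", "move", "unload", "move", "refuel"]
plan = reduce_phase(solve(c) for c in map_phase(tasks, 2))
print(plan)  # → ['load', 'unload', 'refuel', 'move']
```

In the actual approach, each `solve` call would invoke a hierarchical planner on a reduced domain model, which is where the pruning benefit arises.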
The electronic journalism industry has become one of the most important technological achievements of the past two decades. Through online media platforms, information and instant news are delivered more easily and cheaply than before. In addition, e-journalism reduces the time and space needed by the traditional journalism industry and hence improves the information lifecycle, from collecting the news to delivering it to users in convenient ways. On the other hand, Semantic Web technologies enrich the meaning of web content by converting unstructured data into a structured format. Our proposed work therefore aims to build a robust e-journalism system based on Semantic Web technologies to improve the quality of service for journalists and readers.
Flight data is a large source of big data: millions of flights are delayed or canceled each year due to several factors. Studying aviation systems is significant to the economy, as it improves customer satisfaction and saves time. Delay prediction in aviation systems is complicated by the large volume of data and the multiple causes of delays, which vary from region to region and from company to company. In this paper, we compare the performance of different machine learning approaches (Random Forest Classifier, Logistic Regression, Gaussian Naive Bayes, and Decision Tree Classifier) for predicting arrival delay from multiple characteristics, and we note the features used in each approach. The machine-learning toolkit supported on the Splunk platform is used to compare them, and the Airline On-Time Performance Data are used for evaluating the models. The results demonstrate that logistic regression outperforms the others and works well with discrete data.
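The four-model comparison above can be sketched with scikit-learn. This is a hedged illustration, not the paper's pipeline: it uses synthetic data in place of the Airline On-Time Performance Data and plain accuracy as the metric, and it does not reproduce the Splunk toolkit setup.

```python
# Hedged sketch of the classifier comparison; synthetic data stands in
# for the real flight-delay features, so scores are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Binary target stands in for "arrival delayed or not".
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gaussian_nb": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    accuracy = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {accuracy:.3f}")
```

On the real data, the features (carrier, origin, departure time, etc.) would need encoding for the linear and tree models before this comparison is meaningful.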