Data modelling is a complex process that depends on the knowledge and experience of the designers who carry it out. The quality of the created models has a significant impact on the quality of subsequent phases of information systems development. This paper briefly reviews the data modelling process, the entity-relationship method (ERM), and the actors involved in data modelling. It then presents, in more detail, systems, methods, and tools for the data modelling process and identifies problems that occur during the development phase of an information system. These problems also represent the authors' motivation for research that aims to develop a knowledge-based system (KBS) to support the data modelling process by applying formal language theory (particularly translation) during conceptual modelling. The paper describes the main identified characteristics of the authors' new KB system, derived from the analysis of existing systems, methods, and tools for data modelling. This represents the focus of the research.
The emergence of new data types in the big data era implies the need to analyse and exploit them to gain valuable business insight. Traditional platforms cannot fully meet a company's analytical needs when support for unstructured data types is required. This paper gives an overview and synthesis of areas related to big data technologies, with a series of guidelines for adopting the appropriate software, storage structure, and efficient deployment for big data management. A broad data management context is presented through a conceptual model of business performance management in the modern data management era.
The paper provides an overview of the field of natural language processing and its related areas and their mutual relationships, starting from the broader domain of artificial intelligence, through machine learning, computational linguistics, and machine translation methods, particularly those based on deep learning. The characteristics, applications, phases, and main problems of natural language processing are described from the lexical, syntactic, semantic, speech, and pragmatic perspectives. The phases of natural language recognition and analysis are described, as well as the phase of natural language generation. Pre-editing and post-editing procedures using controlled natural languages are given as examples of practices that increase the accuracy and quality of automatic translation and of text processing in general. Special focus is placed on machine translation and machine translation methods. Approaches to machine translation, such as the statistical, rule-based, hybrid, and deep-learning-based approaches, are described and presented with regard to their advantages, disadvantages, and suitable applications in practice. Finally, still unresolved challenges are given as directions for further research related to natural language processing, along with the significance of developing the deep-learning-based approach.
The paper describes our current research activities and results related to developing knowledge-based systems to support the creation of entity-relationship (ER) models. The authors based the derivation of an ER model in textual form on translation from one language into another, that is, from a controlled English natural language into the formalized language of the ER data model. Our translation method consists of translation rules that map parts of sentential forms into ER model constructs, based on textual and character patterns detected in the business descriptions. To enable the computer analyses necessary for creating the translation mechanisms, we created a linguistic corpus that contains the business descriptions and the texts of other business materials. From the corpus, we then created a specific dictionary and linguistic rules to automate the translation of business descriptions into the language of the ER data model. Before that, however, the corpus was enriched by adding annotations to the words related to ER data model constructs. In this paper, we also present the main issues uncovered during the translation process and offer a possible solution with a utility evaluation: applying information-extraction performance measures to a set of sentences from the corpus.
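To illustrate the general idea of mapping sentential patterns to ER constructs, the following is a minimal sketch, not the authors' actual rule set: a single hypothetical rule that matches a controlled-English sentence of the form "Each X has one or more Ys." and emits two entities and a 1:N relationship. The pattern, function names, and output structure are illustrative assumptions.

```python
import re

# Hypothetical translation rule: "Each X has one or more Ys." maps to
# entities X and Y plus a 1:N "has" relationship between them.
# This is an illustrative sketch, not the rule format used in the paper.
RULE = re.compile(r"Each (\w+) has one or more (\w+)s\.")

def translate(sentence: str):
    """Return ER constructs for a matching sentence, or None otherwise."""
    m = RULE.match(sentence)
    if not m:
        return None
    owner, owned = m.group(1).capitalize(), m.group(2).capitalize()
    return {
        "entities": [owner, owned],
        "relationship": (owner, "has", owned, "1:N"),
    }

print(translate("Each department has one or more employees."))
```

A real system of this kind would need many such rules, a dictionary derived from the corpus, and disambiguation for sentences matching several patterns.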
This paper presents an overview of terms, concepts, trends, and technologies that are relevant to today's business. It describes the basics of data and information integration and flow in a company through a central ERP system, together with the concepts of CRM and SCM. The emergence of big data, as a tributary of a huge amount of often unstructured data from different sources, can become either a central problem or an opportunity for advancement and for achieving a company's competitive advantages. Ignorance of key figures and/or the non-acceptance of new business conditions, new technologies, and possible deployment solutions are the main reasons for non-productivity and poor business performance. To demonstrate the dynamics of the appearance and popularity of terms, concepts, trends, and technologies, this paper offers a tabular overview of their frequencies based on data from three global databases. A meta-analysis shows the expected future development of analytical trends and technologies. This paper is intended for those who lead, manage, or participate in projects implementing large software systems, who deal with business quality management, or who want to understand the complexity of this area and the future directions of its development.
Automated creation of a conceptual data model based on user requirements expressed in the textual form of a natural language is a challenging research area. The complexity of natural language requires deep insight into the semantics buried in words, expressions, and string patterns. For the purpose of natural language processing, we created a corpus of business descriptions and an adherent lexicon containing all the words in the corpus. Thus, it was possible to define rules for the automatic translation of business descriptions into the entity-relationship (ER) data model. However, since the translation rules could not always lead to accurate translations, we created an additional classification process layer: a classifier which assigns to each input sentence some of the defined ER method classes. The classifier represents the formalized knowledge of four data modelling experts. This rule-based classification process is based on the extraction of ER information from a given sentence. After the detailed description, the classification process itself was evaluated and tested using the standard multiclass performance measures: recall, precision, and accuracy. The accuracy was 96.77% in the learning phase and 95.79% in the testing phase.
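The multiclass measures named in the abstract can be computed from gold and predicted class labels. The following is a minimal sketch (not the authors' evaluation code) of per-class precision and recall plus overall accuracy; the class labels are illustrative.

```python
from collections import Counter

def multiclass_metrics(gold, pred):
    """Per-class precision/recall and overall accuracy for label sequences."""
    classes = set(gold) | set(pred)
    tp = Counter()                      # true positives per class
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
    gold_counts = Counter(gold)         # support per class
    pred_counts = Counter(pred)         # predictions per class
    precision = {c: tp[c] / pred_counts[c] if pred_counts[c] else 0.0
                 for c in classes}
    recall = {c: tp[c] / gold_counts[c] if gold_counts[c] else 0.0
              for c in classes}
    accuracy = sum(tp.values()) / len(gold)
    return precision, recall, accuracy

# Illustrative labels standing in for ER method classes.
gold = ["ENTITY", "RELATIONSHIP", "ENTITY", "ATTRIBUTE"]
pred = ["ENTITY", "ENTITY", "ENTITY", "ATTRIBUTE"]
p, r, acc = multiclass_metrics(gold, pred)
print(acc)  # 0.75
```

With larger multiclass confusion data, the same counts also support macro- or micro-averaged summaries of precision and recall.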
A business system is an entity of mutually connected economic, technical, and social elements that produces goods or services for market needs. In the process, it uses its own resources and bears risk in order to earn profit and achieve other economic and social goals. Business processes influence, to a greater or lesser extent, the internal and external environment, since they provide answers to questions such as who accomplishes the goals, what is accomplished, how, for whom, and with what success. The internal environment includes all factors under the company's control. The external environment cannot be controlled by the company; it includes national laws and regulations, the supply-demand relationship on the market, inflation, demographic changes, education, and technological progress, which together represent a general or social environment that should be observed and reacted to.