This paper aims at constructing a music composition system that composes music through interaction between a human and a computer. Using this system, even users without special musical knowledge can compose 16-bar musical works consisting of one melody part and several backing parts. An interactive Genetic Algorithm (GA) is applied to music composition so that users' feelings toward music are reflected in the composed works. Each chromosome corresponds to the information of a 4-bar musical segment. Users participate in composition by evaluating the generated works, and GA operators such as crossover, mutation, and virus infection are applied to the chromosomes based on these evaluations. Experimental results show that users' evaluation values increase as the generations progress; that is, the system can compose 16-bar musical works that reflect users' feelings.
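As a rough illustration of the interactive loop described above, the following minimal sketch evolves a population of 4-bar chromosomes with the user's rating as the fitness signal. The pitch-list encoding, the rating scale, and the `user_evaluate` function are assumptions for illustration only (the virus-infection operator is omitted); this is not the paper's actual implementation.

```python
import random

BARS_PER_CHROMOSOME = 4   # assumption: one chromosome encodes a 4-bar segment
NOTES_PER_BAR = 8         # assumption: fixed rhythmic grid
POPULATION_SIZE = 8
MUTATION_RATE = 0.1

def random_chromosome():
    """A chromosome as a flat list of MIDI pitches (hypothetical encoding)."""
    return [random.randint(60, 84) for _ in range(BARS_PER_CHROMOSOME * NOTES_PER_BAR)]

def user_evaluate(chromosome):
    """Placeholder for the interactive step: in the real system the user
    listens to the rendered 4-bar segment and returns a rating."""
    return int(input("Rate this segment (1-5): "))

def crossover(parent_a, parent_b):
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome):
    return [random.randint(60, 84) if random.random() < MUTATION_RATE else n
            for n in chromosome]

def evolve(generations=5):
    population = [random_chromosome() for _ in range(POPULATION_SIZE)]
    for _ in range(generations):
        scores = [user_evaluate(c) for c in population]          # user in the loop
        ranked = [c for _, c in sorted(zip(scores, population),
                                       key=lambda p: p[0], reverse=True)]
        parents = ranked[:POPULATION_SIZE // 2]                  # selection by rating
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(POPULATION_SIZE - len(parents))]
        population = parents + children
    return population
```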
This paper proposes a rough set model for incomplete information systems that defines an extended tolerance relation based on the frequencies of attribute values in such a system. It first reviews existing rough set extensions for incomplete information systems. Next, a "probability of matching" is defined from the data in the information system and used to measure the degree of tolerance. A rough set model is then developed using a tolerance relation defined with a threshold. The paper discusses the mathematical properties of the new rough set model and also introduces a method to derive reducts and the core.
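A minimal sketch of the idea, not the paper's exact definitions: a frequency-based matching probability is computed per attribute, aggregated into a degree of tolerance, and thresholded to form tolerance classes. The aggregation by product, the handling of missing-vs-missing pairs, and the threshold value are assumptions.

```python
from collections import Counter

MISSING = None

def value_frequencies(table, attribute):
    """Empirical distribution of the known values of one attribute."""
    values = [row[attribute] for row in table if row[attribute] is not MISSING]
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def match_probability(x, y, attribute, freqs):
    """Probability that objects x and y agree on the attribute.
    Known vs known: 1 or 0; missing vs known: frequency of the known value;
    missing vs missing: probability two random draws coincide (assumed handling)."""
    a, b = x[attribute], y[attribute]
    if a is not MISSING and b is not MISSING:
        return 1.0 if a == b else 0.0
    if a is MISSING and b is MISSING:
        return sum(p * p for p in freqs.values())
    known = b if a is MISSING else a
    return freqs.get(known, 0.0)

def tolerance_degree(x, y, attributes, freq_by_attr):
    """Aggregate per-attribute matching probabilities (product used as an assumption)."""
    degree = 1.0
    for attr in attributes:
        degree *= match_probability(x, y, attr, freq_by_attr[attr])
    return degree

def tolerance_class(table, i, attributes, threshold=0.5):
    """Objects tolerant with object i under the chosen threshold."""
    freq_by_attr = {a: value_frequencies(table, a) for a in attributes}
    return [j for j, row in enumerate(table)
            if tolerance_degree(table[i], row, attributes, freq_by_attr) >= threshold]
```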
The original rough set theory deals with precise and complete data, whereas real applications frequently contain imperfect information. A typical kind of imperfect data studied in rough set research is missing values. Although many ideas have been proposed in the literature to address this issue, this paper adopts a probabilistic approach because it can accommodate other types of imperfect data, including imprecise and uncertain values, within a single framework. The paper first discusses probabilities of attribute values for the different types of attributes found in real applications, and proposes a generalized method for computing the probability of matching. It covers continuous data as well as discrete data. The proposed probability of matching can be used to define valued tolerance/similarity relations in rough set approaches.
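To make the discrete/continuous distinction concrete, the sketch below contrasts one possible matching probability for each attribute type. The kernel-based closeness for continuous values and the fallback averaging when a value is missing are assumptions for illustration, not the paper's formulation.

```python
import math

def discrete_match_probability(a, b, freqs):
    """Discrete attribute: frequency-based matching (same assumed scheme as above)."""
    if a is not None and b is not None:
        return 1.0 if a == b else 0.0
    if a is None and b is None:
        return sum(p * p for p in freqs.values())
    return freqs.get(a if b is None else b, 0.0)

def continuous_match_probability(a, b, values, bandwidth=1.0):
    """Continuous attribute: one assumed formulation, treating two values as
    'matching' in proportion to how close they are on the attribute's scale,
    and averaging that closeness over known observations when a value is missing.
    `values` is assumed to contain at least one known observation."""
    def closeness(u, v):
        return math.exp(-abs(u - v) / bandwidth)   # kernel-shaped closeness (assumption)
    known = [v for v in values if v is not None]
    if a is not None and b is not None:
        return closeness(a, b)
    if a is None and b is None:
        return sum(closeness(u, v) for u in known for v in known) / (len(known) ** 2)
    known_value = a if b is None else b
    return sum(closeness(known_value, v) for v in known) / len(known)
```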
Biomedical named entity recognition (BNER) is one of the most essential and initial tasks of biomedical information retrieval, underlying tasks such as discovering relations between biomedical entities and identifying molecular pathways. Although named entity recognition performs well on ordinary text, it remains challenging in the molecular biology domain because of the complex nature of biomedical nomenclature, the variety of spelling forms, and other factors. Even when biomedical entities are successfully located in biological text, classifying them into relevant biomedical classes such as genes, proteins, diseases, and drug names is still a further challenge and an open question. This paper presents a new method to classify biomedical named entities into protein and non-protein classes. Our approach employs Random Forest, a machine learning algorithm, with a new combination of features: orthographic, keyword, and morphological features, together with a probabilistic feature called Proteinhood and a Protein-Score feature based on Medline abstracts cited in PubMed; the latter two are the main contributions of the paper. A series of experiments is conducted to compare the proposed approach with other state-of-the-art approaches. Our protein named entity classifier performs strongly in experiments on the GENIA corpus, achieving the highest values of precision (93.8%), recall (83.8%), and F-measure (88.5%) for protein named entity identification. The study also demonstrates the effect of the new Proteinhood and Protein-Score features, as well as of tuning the parameters of the Random Forest algorithm.
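A minimal sketch of this kind of classification setup, using scikit-learn's RandomForestClassifier: each entity string is turned into a numeric feature vector and classified as protein or non-protein. The individual feature functions, the toy lookup tables standing in for Proteinhood and Protein-Score, and the training examples are simplified placeholders, not the paper's actual feature definitions or data.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical lookup tables; in the paper these scores are derived from
# Medline abstracts, here they are placeholders for illustration only.
PROTEINHOOD = {"p53": 0.92, "interleukin-2": 0.88, "aspirin": 0.05}
PROTEIN_SCORE = {"p53": 0.95, "interleukin-2": 0.90, "aspirin": 0.02}

def extract_features(entity):
    """Turn one entity string into a fixed-length numeric feature vector."""
    return [
        int(any(ch.isupper() for ch in entity)),        # orthographic: has uppercase
        int(any(ch.isdigit() for ch in entity)),        # orthographic: has digit
        int("-" in entity),                              # orthographic: has hyphen
        int(entity.lower().endswith(("in", "ase"))),     # morphological: protein-like suffix
        int("receptor" in entity.lower()),               # keyword feature
        PROTEINHOOD.get(entity.lower(), 0.5),            # Proteinhood (placeholder value)
        PROTEIN_SCORE.get(entity.lower(), 0.5),          # Protein-Score (placeholder value)
    ]

# Toy training data: (entity, is_protein)
train = [("p53", 1), ("interleukin-2", 1), ("aspirin", 0), ("diabetes", 0)]
X = [extract_features(e) for e, _ in train]
y = [label for _, label in train]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

print(clf.predict([extract_features("p53")]))   # expected: [1] (protein)
```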
The paper introduces a rough set model to analyze an information system in which some condition and decision data are missing. Many studies have focused on missing condition data, but very few have accounted for missing decision data. Common approaches remove objects with missing decision data because such objects appear worthless from the perspective of decision-making. However, we point out that this removal may lead to information loss, and our method retains such objects. We observe that a scenario involving missing decision data is somewhat similar to semi-supervised learning, because some objects are characterized by complete decision data whereas others are not. This leads us to the idea of estimating potential candidates for the missing data using the available data. These potential candidates are determined by two quantitative indicators: local decision probability and universal decision probability. The candidates then allow us to define set approximations and a notion of reduct. We also compare the reducts and rules induced from two information systems: one that removes objects with missing decision data and one that retains them. We show that the knowledge induced from the former can also be induced from the latter using our approach. Thus, our method offers a more general way to handle missing decision data and prevents information loss.
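The sketch below illustrates one plausible reading of the two indicators: local decision probability as the decision distribution among objects indiscernible from the target object, and universal decision probability as the decision distribution over the whole table. The indiscernibility test, the fallback rule, and the threshold for selecting candidates are assumptions, not the paper's definitions.

```python
from collections import Counter

MISSING = None

def similar(x, y, condition_attrs):
    """Assumed indiscernibility test: agree on every pair of known condition values."""
    return all(x[a] == y[a] or x[a] is MISSING or y[a] is MISSING
               for a in condition_attrs)

def local_decision_probability(table, i, condition_attrs, decision_attr):
    """Decision distribution among objects similar to object i (known decisions only)."""
    decisions = [row[decision_attr] for j, row in enumerate(table)
                 if j != i and similar(table[i], row, condition_attrs)
                 and row[decision_attr] is not MISSING]
    counts = Counter(decisions)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()} if total else {}

def universal_decision_probability(table, decision_attr):
    """Decision distribution over the whole table (known decisions only)."""
    decisions = [row[decision_attr] for row in table
                 if row[decision_attr] is not MISSING]
    counts = Counter(decisions)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}

def candidate_decisions(table, i, condition_attrs, decision_attr, threshold=0.3):
    """Potential decisions for an object whose decision is missing: prefer the
    local distribution, fall back to the universal one (assumed rule)."""
    local = local_decision_probability(table, i, condition_attrs, decision_attr)
    universal = universal_decision_probability(table, decision_attr)
    source = local if local else universal
    return [d for d, p in source.items() if p >= threshold]
```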