Abstract

The problem of transmitting texts in information systems is directly connected with the development of information compression algorithms, the management of information flows, the transportation problem, and the contextual analysis of information for its targeted delivery. One of the key questions is the unambiguous interpretation of the transmitted information. The paper proposes one approach to the problem of text management: the compression and transmission of information in a local information system intended for personnel management and for managing private texts. Problems of compressing texts for transmission over communication channels and for archiving, while preserving their semantic content, are considered. The method of ontological analysis is proposed for structuring texts. Text compression is to be performed using the theory of continued fractions. The proposed approach makes lossless text compression possible and can be used as a way of protecting texts during storage and transfer.

Keywords: continued fractions, lossless text transmission, lossless text compression algorithm, ontological analysis of text.

Citation: Muromskii, A.A. Using the ontological approach to protect data during transfer and archiving / A.A. Muromskii, N.P. Tuchkova // Онтология проектирования (Ontology of Designing). - . - Vol. 6, No. 2(20). - P. 136-148. - DOI: 10.18287/2223- -9537-2016

Introduction

The dissemination of information has at all times been of the utmost importance for every area of human activity and social development. With the advent of information technologies, new problems related to cybersecurity have arisen, and these are usually discussed at the global, state level.
Nevertheless, practically every manager is forced, at his or her own level, to solve in one way or another the problems of transmitting information without distortion in the course of management, while the ordinary user faces the problems of protecting private archives both from unauthorized use and from losses during transmission over communication channels or during copying.

The task of lossless information compression has remained relevant throughout the entire "digital" period of civilization's development, and especially so in the processing of "big data". For storing and transmitting textual information it is very important to reduce its volume without distorting its semantic content. In fact, the problem of text transmission is what gave rise to the subject area known as coding.

Definitions of this concept are known. Coding is the assignment of numerical codes to items in a sociological questionnaire [1, p. 141]. Coding (codification, coding, encoding) is the process of representing the state of one physical system through the state of some other system, carried out for the purpose of transmitting information [2, p. 125]. In the first definition, the questionnaire is the carrier of information as it moves. The most important invention in the area under discussion is Morse code. Morse code uses sound or light signals. In writing, the dash "-" and dot "." symbols are used for encoding.

One of the main problems of any coding is protec...
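The abstract above names continued fractions as the basis for lossless compression, but this excerpt does not describe the scheme itself. The sketch below is therefore not the authors' algorithm. It only illustrates, under that caveat, the property that makes continued fractions suitable for lossless coding: a rational number and its finite list of partial quotients determine each other exactly. The function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def to_continued_fraction(x: Fraction) -> list[int]:
    """Return the partial quotients [a0, a1, a2, ...] of a rational number."""
    terms = []
    while True:
        a = x.numerator // x.denominator  # integer part (floor)
        terms.append(a)
        x -= a                            # fractional part, 0 <= x < 1
        if x == 0:
            return terms
        x = 1 / x                         # invert and continue


def from_continued_fraction(terms: list[int]) -> Fraction:
    """Rebuild the exact rational number from its partial quotients."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value


original = Fraction(415, 93)
quotients = to_continued_fraction(original)      # [4, 2, 6, 7]
restored = from_continued_fraction(quotients)    # Fraction(415, 93)
```

Because the arithmetic uses exact rationals, not floating point, the round trip loses no information. How a text would be mapped to a number before expansion is left open here, since the excerpt does not specify it.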
The search for mathematical articles on the Internet and related issues are discussed. The main problems of search are the way queries are formed and the presence of information noise. Using formulas in a search query brings additional difficulties and additional information noise, although the goal is the opposite: to make the query more specific so that it returns pertinent information. The difficulties arise because not every user is ready to use symbolic notation, such as TeX notation, in a search query, and because publications are often presented in PDF format. Information noise arises because formulas are matched only up to analogy. The problems are aggravated if the formula has no generally accepted name or notation, unlike, for example, the d'Alembert formula or the Tricomi equation. One more circumstance is not sufficiently discussed: the rather complex mechanism for replenishing information resources with formulas that can and should be used in the search form. Using the thesaurus for the subject domain "mixed problems of mathematical physics" as an example, a way of using formulas to search for mathematical articles is proposed. We consider how a user knowledgeable in the subject domain works with the mathematical text of a publication prepared in TeX notation. When keywords are present in secondary documents, the usual document markup mechanism is applied. Otherwise, a local thesaurus or dictionary for the article is compiled by frequency analysis. A dictionary entry in this form contains all the fields needed to identify the corresponding formulas in the text.
This paper considers subject areas related to science and their features. An attempt is made to single out the concepts common to them. One feature of scientific fields of knowledge is that the data structures they describe are subject to frequent change. The paper discusses a generalized model of the scientific domain, its originality, and its implementation in search systems, and notes how these differ from classical approaches to information retrieval in scientific data sets. In this work, the set of introduced concepts defines the generalized ontology of the scientific subject domain. Together with the data, this ontology determines the space of scientific knowledge of the domain under consideration. When moving to the objective level of conceptualization, it becomes necessary to restrict the ontology to a specific field of science. A set of concepts is used to describe the subject area at this level. Most often these terms are organized in the form of a thesaurus. The corresponding terms of the subject area are associated with concepts of the domain. The LibMeta system is a software implementation of the constructed model. The main task of this system is to provide an information system for scientific libraries that can take into account the diversity of resource types stored in it while supporting the terminological description of any subject area. The domain of ordinary differential equations is chosen to demonstrate work with scientific information in this system.
The research is devoted to the use of symbolic expressions in information queries. The problem of information search is discussed with regard to the notation and special symbols used in various subject domains. It is proposed to use specialized thesauri in which symbolic expressions are present alongside natural-language definitions. This approach makes it possible to refine retrieval regardless of the language of the original source, and it can also be applied to identifying critical situations by analyzing the symbolic indicators underlying structured data. A technology for comparing texts by analogy and a mechanism for replenishing the thesaurus are proposed, together with the possibility of establishing additional semantic relationships that may speed up the search for information in a critical situation. It is proposed to use these data to find "special points" in subject domains and to work with them.
The study analyzes the relationships among concepts of a mathematical subject area, using the equations of mathematical physics as an example. A form of thesaurus entry for terms and their related equations and formulas is proposed. The distinctive feature of such a thesaurus is its use of the context of formulas for their additional identification within the subject area. In addition, it is proposed to take into account the indices of authors and articles in which thesaurus terms occur. The proposed approach helps to refine the search query and reduce information noise when the thesaurus is used in digital bibliographic collections.