Multiphase Systems

Gokarev V.N., Bezzubov A.F., Berezkin D.V. Analysis of the possibility of using intelligent machine learning systems to work with scientific documents. Multiphase Systems. 2025;20(2):105–111 (in Russian).

2025. Vol. 20. Issue 2, Pp. 105–111
URL: http://mfs.uimech.org/mfs2025.2.015
DOI: 10.21662/mfs2025.2.015

Analysis of the possibility of using intelligent machine learning systems to work with scientific documents

V.N. Gokarev¹ 🖂, A.F. Bezzubov¹, D.V. Berezkin²

¹The 27th Central Research Institute, Moscow, Russia
²Bauman Moscow State Technical University, Moscow, Russia

Abstract

This article analyzes modern methods for automating work with scientific documents using intelligent machine learning systems. It highlights the importance of these approaches for solving pressing problems in various scientific fields, examines the key problems that require an integrated solution, and describes promising directions for further research. A methodology for improving automation efficiency is proposed, with particular attention to hybrid systems that combine expert assessments, algorithmic solutions, and neural network models. The continuous growth of computing power contributes to the development of these technologies and the expansion of their practical application. The results can be used to optimize production automation processes and to develop scalable systems based on modern artificial intelligence technologies.

Keywords

document processing automation,
machine learning,
expert assessments,
scientific papers,
neural networks,
natural language processing (NLP),
linguistic text processing,
classification,
hybrid methods

Article outline

This article describes a hybrid approach to automating the processing of scientific and technical documents in a research institute. The approach combines a multi-input neural network architecture with linguistic methods. Such methods can significantly streamline document processing: they accelerate information extraction and reduce the proportion of errors at the first stages of data processing, which in turn benefits all subsequent stages of work with scientific documents. The paper notes the importance of these approaches for solving pressing problems in various scientific fields.

The article considers key problems that require an integrated solution and describes promising areas for further research. A methodology for improving automation efficiency is proposed, with particular attention to hybrid systems that combine expert assessments, algorithmic solutions, and neural network models.
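As a rough illustration of what combining the three kinds of signal could look like, the sketch below merges an expert rating, a rule-based keyword score, and a model probability into one weighted score. It is not the authors' method: the names `Signals`, `rule_based_score`, and `hybrid_score`, the weights, and the threshold are all hypothetical, and the model probability is assumed to come from some external neural classifier.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Three hypothetical per-document signals, each assumed to lie in [0, 1]."""
    expert_score: float  # rating assigned by a human expert (assumption)
    rule_score: float    # output of an algorithmic, rule-based check (assumption)
    model_prob: float    # probability produced by a neural classifier (assumption)


def rule_based_score(text: str, keywords: set) -> float:
    """Fraction of the given keywords that occur as whole tokens in the text."""
    if not keywords:
        return 0.0
    tokens = set(text.lower().split())
    return len(tokens & keywords) / len(keywords)


def hybrid_score(s: Signals, weights=(0.3, 0.2, 0.5)) -> float:
    """Weighted average of the three signals; the weights are illustrative only."""
    w_expert, w_rule, w_model = weights
    total = w_expert + w_rule + w_model
    weighted = (w_expert * s.expert_score
                + w_rule * s.rule_score
                + w_model * s.model_prob)
    return weighted / total


def is_relevant(s: Signals, threshold: float = 0.5) -> bool:
    """Classify a document as relevant when the hybrid score reaches the threshold."""
    return hybrid_score(s) >= threshold


if __name__ == "__main__":
    text = "Neural networks for document classification"
    rule = rule_based_score(text, {"neural", "classification"})
    signals = Signals(expert_score=0.8, rule_score=rule, model_prob=0.6)
    print(hybrid_score(signals), is_relevant(signals))
```

A weighted average is only one of many ways to fuse such signals; in practice the weights would need to be calibrated against labeled documents or set by the experts themselves.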

The continuous growth of computing power contributes to the development of these technologies and the expansion of their practical application. The results can be applied to optimizing production automation processes and to developing scalable systems based on modern artificial intelligence technologies.

The relevance of using AI in work with scientific documents stems from several factors. First, the continuous growth in the volume of scientific and technical information requires automated systems capable of processing data promptly, which is already reflected in a number of recent research and development efforts. Many research teams and technology organizations, such as Google Scholar, Elsevier, and Springer Nature, as well as leading universities, are actively developing and implementing machine learning algorithms for classification, citation analysis, and the identification of thematic trends. Second, automating routine processes reduces the time needed to prepare and process information, which is especially important for researchers, librarians, and analysts working with large databases.

The areas of application of such solutions are diverse. They cover research activities, educational processes, management of library collections, information support for scientific projects, and monitoring of innovative technologies. Intelligent systems enable fast and accurate search for relevant information, streamline peer review, and can even help identify new areas of research. Thus, developing and implementing methods for automating the processing of scientific documents is becoming a necessity for universities, research institutes, publishers, and organizations that analyze scientific trends.

Overall, the integration of interdisciplinary methods and technologies for automating work with scientific documents opens up broad prospects for further research. Deep neural network models, combined with ensemble analysis methods and expert assessments, support the creation of adaptive systems capable of self-correction and of learning from new data. Standardization of processes and cross-sector cooperation will become key factors in improving the efficiency of scientific document management and the quality of information analysis.

A hybrid approach to automating work with scientific documents, which includes a model for representing document clusters, should provide a significant increase in processing accuracy. The described approach opens up opportunities for automating the analysis of large volumes of data, minimizing errors associated with manual input, and increasing the transparency of scientific work.
