The paper presents an overview of the task of automatic text summarization. The problem of automatic text summarization is formulated, and summarization algorithms are classified by the type of the resulting summary and by the approach used to solve the problem. Some open problems in the field and the shortcomings of particular classes of algorithms are described. The concepts of summary quality and information completeness are defined. The most popular approaches to assessing the information completeness of a summary are considered and classified according to the methodology used. The ROUGE family of metrics is examined in relation to the task of automatic text summarization. Special attention is paid to evaluating the information completeness of a summary using such information-proximity measures as the Kullback-Leibler divergence, the Jensen-Shannon divergence, and the cosine distance (similarity). These measures can be applied to vector representations of the source text and the summary, which can be obtained with methods such as frequency vectorization, TF-IDF, static vectorizers, and so on.
Keywords: automatic summarization, summary, information completeness, ROUGE, vectorization, TF-IDF, static vectorizer, Kullback-Leibler divergence, Jensen-Shannon divergence, cosine distance
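A minimal sketch of how such information-proximity measures can be computed over frequency (bag-of-words) vectors of a text and its summary. The vectorizer choice, the toy strings, and the add-one smoothing are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: information proximity between a source text and its summary,
# computed over shared-vocabulary frequency vectors.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from scipy.spatial.distance import jensenshannon
from scipy.special import rel_entr

text = "the cat sat on the mat and the cat slept"
summary = "the cat slept on the mat"

# Build a shared vocabulary so both vectors are comparable.
vec = CountVectorizer().fit([text, summary])
p, q = vec.transform([text, summary]).toarray().astype(float)

# Normalize to probability distributions (add-one smoothing avoids
# zero denominators inside the KL divergence).
p = (p + 1) / (p + 1).sum()
q = (q + 1) / (q + 1).sum()

kl = rel_entr(p, q).sum()      # Kullback-Leibler divergence D(P || Q)
js = jensenshannon(p, q) ** 2  # scipy returns the JS *distance* (sqrt of the divergence)
cos = p @ q / (np.linalg.norm(p) * np.linalg.norm(q))  # cosine similarity

print(f"KL={kl:.4f}  JS={js:.4f}  cosine={cos:.4f}")
```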
The paper considers the use of the screen of an aircraft's collimator system as a means of informing the pilot about the vertical profile of the flight path in poor visibility conditions at low and extremely low piloting altitudes.
Keywords: low flight altitude, extremely low flight altitude, threat of collision, collimator, virtual elevation map, virtual reality, augmented reality, artificial intelligence, data fusion, pilot assistance system
The historical aspects of the emergence of the problem of noise-resistant image encoding are considered using the example of delivering photographs of the surface of Mars to Earth. Using the generalization of orthogonal matrices to quasi-orthogonal ones as an example, it is shown how the set of matrices suitable for image transformation can be expanded for transmission over noisy communication channels.
Keywords: Hadamard matrices, Hadamard coding, Reed-Solomon codes, orthogonal matrices, quasi-orthogonal matrices, noise-resistant image encoding
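A minimal sketch of the orthogonality property that makes Hadamard matrices usable as an invertible image transform. The toy 8x8 block is an illustrative assumption; the abstract's quasi-orthogonal generalizations are not shown here.

```python
# Minimal sketch: a Hadamard matrix satisfies H @ H.T == n * I, so a 2-D
# transform of an image block can be inverted exactly.
import numpy as np
from scipy.linalg import hadamard

n = 8
H = hadamard(n)                            # entries are +1 / -1
assert np.array_equal(H @ H.T, n * np.eye(n, dtype=int))

block = np.random.randint(0, 256, (n, n))  # toy "image" block
coeffs = H @ block @ H.T                   # forward 2-D transform
restored = (H.T @ coeffs @ H) // (n * n)   # exact inverse thanks to orthogonality
assert np.array_equal(restored, block)
```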
This article reviews and analyzes modern methods and technologies used in anti-plagiarism systems, with an emphasis on the Russian market. The goal of the review is to select a suitable anti-plagiarism system for integration. The article presents the most popular Russian services for detecting borrowings, their business models and algorithms of operation, as well as a general description of the principles and mechanisms underlying these algorithms. It was determined that the most universal and effective system for detecting borrowings is the Antiplagiat.ru service, since it offers integration via an API as well as 34 additional modules that make it possible to adapt the functionality of the system to individual needs.
Keywords: antiplagiarism, text analysis, text processing algorithms, semantic analysis, stylistic analysis
This article discusses the basic principles and design patterns of an application for collecting data from third-party sources. Various methods of obtaining data are studied, including web scraping, the use of APIs, and file parsing. Different approaches to extracting information from structured and unstructured sources are also described.
Keywords: internet sources, API, parsing, web, headless browser, scraping, etag, data collection
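Since the keywords mention ETags, here is a minimal sketch of polling a data source with ETag-based conditional GETs so that unchanged resources are not downloaded twice. The endpoint URL is hypothetical.

```python
# Minimal sketch: conditional GET with an ETag validator. A 304 response
# means the cached copy is still current and no body is transferred.
import requests

URL = "https://example.com/data.json"  # hypothetical endpoint
etag = None

def fetch():
    """Return fresh content, or None if the resource has not changed."""
    global etag
    headers = {"If-None-Match": etag} if etag else {}
    resp = requests.get(URL, headers=headers, timeout=10)
    if resp.status_code == 304:        # Not Modified
        return None
    resp.raise_for_status()
    etag = resp.headers.get("ETag")    # remember the validator for the next poll
    return resp.content

first = fetch()    # full download
second = fetch()   # likely None if the server honors If-None-Match
```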
Road surface quality assessment is one of the most pressing tasks in the world. Many systems exist to solve it, most of which operate on images of the roadway. They are based both on traditional methods (without machine learning) and on machine learning algorithms. Traditional approaches include, for example, methods for edge detection in images, which are the object of this study. Each of these algorithms has its own characteristics; for example, some of them produce a processed version of the original photo faster. The following methods were selected for analysis: the Canny algorithm, the Kirsch operator, the Laplace operator, the Marr-Hildreth algorithm, the Prewitt operator, and the Sobel operator. The main performance indicator in the study is the average time to obtain the processed photo. The source material of the experiment is 10 different images of the road surface in 5 sizes (1000x1000, 894x894, 775x775, 632x632, 447x447) in bmp, jpg, and png formats. The study found that the Kirsch, Laplace, Prewitt, and Sobel operators exhibit a linear dependence, O(n), while the Canny and Marr-Hildreth algorithms are quadratic, O(n²). The best results are demonstrated by the Prewitt and Sobel operators.
Keywords: comparison, effectiveness, method, edge detection, image, photo, road surface, dependence, size, format
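A minimal sketch of how such a timing comparison can be run with OpenCV. OpenCV has no built-in Prewitt operator, so it is emulated with filter2D kernels; the image path, thresholds, and number of repetitions are illustrative assumptions.

```python
# Minimal sketch: average processing time of several edge detectors on one image.
import time
import cv2
import numpy as np

img = cv2.imread("road.bmp", cv2.IMREAD_GRAYSCALE)  # hypothetical test image

prewitt_x = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float32)
prewitt_y = prewitt_x.T

detectors = {
    "Canny":   lambda im: cv2.Canny(im, 100, 200),
    "Laplace": lambda im: cv2.Laplacian(im, cv2.CV_64F),
    "Sobel":   lambda im: cv2.Sobel(im, cv2.CV_64F, 1, 0) + cv2.Sobel(im, cv2.CV_64F, 0, 1),
    "Prewitt": lambda im: cv2.filter2D(im, cv2.CV_64F, prewitt_x)
                        + cv2.filter2D(im, cv2.CV_64F, prewitt_y),
}

for name, fn in detectors.items():
    t0 = time.perf_counter()
    for _ in range(10):                 # average over several runs
        fn(img)
    dt = (time.perf_counter() - t0) / 10
    print(f"{name:8s} {dt * 1000:.2f} ms")
```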
Unintentional errors occur in all data transmission channels. The standard way to deal with them is to use noise-resistant codecs based on algebraic error-correcting codes. There are transmission channels in which a special type of error occurs: erasures, i.e., errors whose location is known but whose value is not. Coding theory states that error-control methods can be applied to protect data from erasures; however, these statements are usually not accompanied by details. This work fills that gap. Algorithms for correcting erasures using arbitrary decoders for error-correcting codes are constructed. Lemmas on the correctness of the constructed algorithms are formulated, and some estimates of the probability of successful decoding are obtained.
Keywords: channels with erasures, noise-resistant code, algebraic code, error correction code decoder, erasure correction algorithm
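A minimal sketch of the classic textbook trick for turning an error decoder into an erasure decoder: fill the erased positions with 0s, then with 1s, decode both fillings, and keep the candidate codeword that agrees with all non-erased symbols. A 5-fold repetition code stands in here for an arbitrary code; this illustrates the general idea only, not the paper's specific algorithms or probability estimates.

```python
# Minimal sketch: erasure correction via an ordinary error decoder.
ERASED = None          # marker for an erased symbol
N = 5                  # code length; minimum distance d = 5, so up to d - 1 = 4 erasures

def decode_errors(word):
    """Ordinary error decoder for the repetition code: majority vote."""
    return [round(sum(word) / N)] * N

def decode_erasures(word):
    candidates = []
    for fill in (0, 1):
        filled = [fill if s is ERASED else s for s in word]
        candidates.append(decode_errors(filled))
    for cand in candidates:
        # Keep a candidate only if it is consistent with every known symbol.
        if all(s is ERASED or s == c for s, c in zip(word, cand)):
            return cand
    return None        # too many erasures/errors

received = [1, ERASED, ERASED, ERASED, ERASED]   # 4 erasures, 1 known symbol
print(decode_erasures(received))                 # -> [1, 1, 1, 1, 1]
```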
The problem of vulnerabilities in the Robot Operating System (ROS) when implementing a multi-agent system based on the Turtlebot3 robot is considered. ROS provides powerful tools for communication and data exchange between the various components of a system. However, when data is exchanged between Turtlebot3 robots, vulnerabilities may arise that attackers can use for unauthorized access or attacks on the system. One possible vulnerability is the interception and substitution of data exchanged between robots: an attacker can intercept the data, modify it, and resend it, which can lead to unpredictable consequences. Another possible vulnerability is unauthorized access to the commands and control of the Turtlebot3 robots, which can lead to loss of control over the system. To address these vulnerabilities, methods of protection against the security threats arising during the operation of such systems have been developed and presented.
Keywords: Robot Operating System (ROS), multi-agent system, system packages, encryption, SSL, TLS, authentication and authorization system, communication channel, access restriction, threat analysis, Turtlebot3
The article discusses the author's methodology for designing and developing a test data generation tool called "QA Data Source", which can later be used in software testing. The paper describes the basic requirements, application functionality, data model, and usage examples. Methods of system analysis and modeling of information processes were used to describe the application. Applying the proposed model of information processes makes it possible to significantly reduce the time and resources spent on generating test data and on subsequent product testing.
Keywords: quality assurance, software testing, test data, information technology, data generation, databases, application development
The work is devoted to the problem of supplying electrical energy to remote production enterprises in the absence of a centralized power supply. The purpose of the work is to develop decision support tools for choosing autonomous power generation projects from a large number of possible alternatives. To this end, a hierarchy of criteria was constructed and a comparative analysis of existing technical and economic solutions in the field of small-scale autonomous energy was carried out. It is shown that when choosing a power generation project for a particular enterprise there is a fairly large number of alternatives, which makes commonly used decision support procedures based on the analytic hierarchy process / analytic network process (in their classical versions) ineffective. An iterative procedure with dynamic changes in the feedback between criteria and alternatives is proposed, which makes it possible to reduce the dimension of the supermatrix during the calculation and thereby reduce the time complexity of the algorithms. The effectiveness of the proposed modification of the analytic network process is confirmed by calculations. The constructed procedure for selecting an autonomous power generation project increases the scientific validity of technical and economic decisions when expanding the production activities of small enterprises in remote and sparsely populated areas.
Keywords: autonomous power system, decision support, analytic network process
Among the wide range of tasks faced by modern advanced video surveillance systems, the dominant position is occupied by tracking various objects in the video stream, one of the fundamental problems in the field of video analytics. Numerous studies have shown that, despite the dynamism of processes in information technology and the introduction of various tools and methods, the task of object tracking remains relevant: it requires further improvement of previously developed algorithms in order to eliminate some of their inherent disadvantages, systematization of techniques and methods, and the development of new systems and approaches. This article describes the step-by-step development of an algorithm for tracking human movements in a video stream based on the analysis of color groups. The key stages of the algorithm are: selecting certain frames when dividing the video stream; and selecting the object under study, which is then subjected to digital processing to obtain information about color groups, their average values, and the percentage of the object they occupy. This information is used to search for, detect, and recognize the selected object, with an additional function of predicting the direction of movement across video frames; the result is a complete picture of the movement of the person under study. The materials presented in this paper may be of interest to specialists whose research focuses on the automated extraction of data from images and video.
Keywords: surveillance cameras, U2-Net neural network, rembg library, pattern recognition, clothing recognition, delta E, tracing, direction prediction, object detection, tracking, mathematical statistics, predicted area, RGB pixels
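Since the keywords mention delta E, here is a minimal sketch of comparing the average colors of two image regions with the delta E color difference. The CIE76 variant, the toy frame, and the region coordinates are illustrative assumptions; the paper's exact color-group procedure may differ.

```python
# Minimal sketch: delta E (CIE76) between the mean colors of two regions,
# computed in the perceptually motivated CIELAB space.
import numpy as np
from skimage import color

def mean_lab(region_rgb):
    """Average color of an RGB region, expressed in CIELAB."""
    lab = color.rgb2lab(region_rgb / 255.0)
    return lab.reshape(-1, 3).mean(axis=0)

img = np.random.randint(0, 256, (100, 100, 3)).astype(float)  # toy frame
region_a = img[10:40, 10:40]    # e.g. the tracked person's clothing
region_b = img[50:80, 50:80]    # e.g. a candidate region in a later frame

delta_e = np.linalg.norm(mean_lab(region_a) - mean_lab(region_b))  # CIE76
print(f"delta E = {delta_e:.2f}")   # small values mean similar color groups
```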
Currently, patent documents contain graphic images of device drawings, graphs, and chemical and mathematical formulas, and the formulas often need to be recognized and brought to a unified standard. In this work, graphic images extracted from the descriptions of patents of the FIPS of Rospatent are analyzed. Thematic filtering of the mathematical and chemical formulas contained in patent documents and their recognition are provided. The theoretical value lies in the developed algorithms for parsing patents in the Yandex.Patents system; recognizing chemical and mathematical formulas among graphic patent images; translating graphic images of chemical formulas into the SMILES format; and converting graphic images of mathematical formulas into the LaTeX format. The practical significance of the work lies in the developed software module for analyzing graphic images from patent documents. The field of application of the developed system is the study of patents and the reduction of graphic images to a unified standard for solving patent search problems.
Keywords: patent, image, mathematical formula, chemical formula, LaTeX, SMILES
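One practical post-processing step after recognizing a chemical formula image into SMILES is validating and canonicalizing the string. Using RDKit for this is an assumption about the pipeline, not necessarily the paper's approach; the strings below are hypothetical recognizer output.

```python
# Minimal sketch: validating SMILES strings produced by a recognition step.
from rdkit import Chem

recognized = ["c1ccccc1", "CC(=O)O", "not-a-formula"]  # hypothetical OCR output

for s in recognized:
    mol = Chem.MolFromSmiles(s)          # returns None for invalid SMILES
    if mol is None:
        print(f"{s!r}: rejected")
    else:
        print(f"{s!r}: canonical form {Chem.MolToSmiles(mol)}")
```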
The article discusses the use of a recurrent neural network for forecasting air pollutants based on actual data in the form of a time series. A description of the network architecture, the training method used, and the method for generating training and testing data is provided. During training, a data set consisting of 126 measurements of various components was used. As a result, the quality of the resulting model's outputs was assessed and the averaged values of the MSE metric were calculated.
Keywords: air pollution, forecasting, neural networks, machine learning, recurrent network, time series analysis
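A minimal sketch of a recurrent network for one-step-ahead forecasting of a pollutant time series. The architecture, window length, synthetic series, and hyperparameters are illustrative assumptions; the abstract does not specify them.

```python
# Minimal sketch: LSTM forecasting of a 1-D time series, evaluated with MSE.
import numpy as np
import tensorflow as tf

def make_windows(series, window=8):
    """Slice a 1-D series into (window -> next value) training pairs."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., None], y          # shape (samples, window, 1)

series = np.sin(np.linspace(0, 20, 126)).astype("float32")  # stand-in for 126 measurements
X, y = make_windows(series)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(X.shape[1], 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")     # MSE, as in the evaluation
model.fit(X, y, epochs=20, verbose=0)

print("MSE:", model.evaluate(X, y, verbose=0))
```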
The paper analyzes various approaches to detecting and recognizing license plates in intelligent transport networks. A deep learning model is proposed for localizing and recognizing license plates in natural images that achieves satisfactory results in terms of recognition accuracy and speed compared to traditional methods. Evaluations of the effectiveness of the deep learning model are provided.
Keywords: VANET, intelligent transport networks, YOLO, city traffic management system, steganography, deep learning, information security, convolutional neural network, CNN
The article presents aspects of the development of a device for wirelessly picking up a vibration acceleration signal from the surface of a ball mill drum. The results of measuring vibration acceleration on a ball mill model at various levels of loading with crushed material are presented. According to these results, as the load of crushed material increases relative to the ball load, the vibration level decreases. The work also presents pie charts of the distribution of the vibration load across the mill drum, from which the mill's current operating mode can be judged.
Keywords: ball mill, wireless signal, vibration acceleration, mill loading control
The article presents a set-theoretic model that generalizes the concept of a monitoring system. The model is a tuple that includes the monitoring object, the infrastructure of the monitoring system, the initial data and monitoring results, and a set of relationships between the components of the model. Each component of the model is detailed to one or two levels. For some elements of the model, examples from existing monitoring systems are given. The model can be used to create new monitoring systems or to modify existing ones.
Keywords: monitoring system, monitoring object, set-theoretic model, tuple, data processing, infrastructure, sensor, software
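One possible way to write down the tuple enumerated in the abstract; the symbol names below are illustrative assumptions, not the author's notation.

```latex
% Hypothetical notation for the monitoring-system tuple:
M = \langle O,\ I,\ D,\ R,\ \Sigma \rangle
% O      -- the monitoring object
% I      -- the infrastructure of the monitoring system
% D      -- the initial data
% R      -- the monitoring results
% \Sigma -- the set of relationships between the components
```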
This paper examines the problems of optimizing the loading of client web applications and ways to solve them under various practical conditions. It provides ways to speed up the loading of web applications and to remove blocking elements from data processing chains in order to improve various aspects of the user experience. An approach is proposed for designing an optimal application loading chain that meets the highest quality standards in the front-end industry and provides the best user experience.
Keywords: front end, rendering, client web applications, load time, performance optimization, user experience
This paper analyzes the effectiveness of the tree-shaking mechanism, a key way to optimize the size of client web applications. Its implementation is compared across five popular build tools: Webpack, Rollup, Parcel, Vite, and Esbuild. Test results demonstrate differences in their behavior and overall effectiveness in removing redundant code, highlighting the relevance of tree-shaking in web development.
Keywords: tree-shaking, javascript, front end, web applications, optimization, loading speed
The paper discusses a stegoalgorithm with localization of the embedding area in the YCbCr color space for protecting images of a license plate, of a vehicle from different angles, and of a traffic event, as well as the development of a software system that implements the stegoalgorithm. Image protection makes it possible to effectively implement the concept of multimodal interaction of socio-cyberphysical systems in a vehicular self-organizing network. Evaluations of the effectiveness of the developed method are provided.
Keywords: VANET, intelligent transport networks, city traffic management system, steganography, information security, watermark
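The abstract does not disclose the embedding details, so here is only a generic sketch of hiding watermark bits in the Cb plane of the YCbCr representation; the paper's stegoalgorithm localizes the embedding area and is more elaborate. Note that OpenCV labels the space YCrCb, so the Cb plane is channel index 2, and the stego image must be stored losslessly or the LSBs will not survive.

```python
# Minimal sketch: LSB embedding in the Cb plane of a YCrCb image.
import cv2
import numpy as np

def embed(ycc, bits):
    out = ycc.copy()
    cb = out[:, :, 2].ravel()                        # copy of the Cb plane
    cb[:len(bits)] = (cb[:len(bits)] & 0xFE) | bits  # rewrite the LSBs
    out[:, :, 2] = cb.reshape(out.shape[:2])
    return out

def extract(ycc, n):
    return ycc[:, :, 2].ravel()[:n] & 1

bgr = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)  # toy "frame"
ycc = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
bits = np.random.randint(0, 2, 128).astype(np.uint8)          # watermark bits

stego = embed(ycc, bits)
assert np.array_equal(extract(stego, len(bits)), bits)
```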
The paper discusses methods for protecting the logic elements of combinational circuits from single failures. Until recently, the problem of creating microelectronic devices resistant to single failures in logic elements was relevant primarily in the military and space industries, where increased fault tolerance requirements are imposed on circuits due to external destabilizing factors. Such factors include heavy charged particles that affect the operation of logic elements and cause single failures. As semiconductor devices scale down and the technological standards for designing and manufacturing integrated circuits change, the problem of fault tolerance is becoming relevant for devices on the civilian market as well. The article proposes a technique for resynthesizing vulnerable sections of combinational logic circuits. To assess resilience, it is proposed to use logical constraints obtained by the resolution method.
Keywords: resynthesis, combinational circuits, reliability, logical correlations, resolution method
The article thoroughly explores cloud, fog, and edge computing, highlighting the distinctive features of each technology. Cloud computing provides flexibility and reliability with remote access capabilities, but encounters delays and high costs. Fog computing focuses on data processing at a low level of infrastructure, ensuring high speed and minimal delays. Edge computing shifts computations to the data source itself, eliminating delays and enhancing security. Applications of these technologies in various fields are analyzed, and their future development is predicted in the rapidly evolving world of information systems.
Keywords: cloud computing, fog computing, edge computing, cloud technologies, data processing infrastructure, scope of application, hybrid computing, Internet of Things, artificial intelligence, information systems development
This article deals with the problem of analyzing and recognizing human emotions by processing sound data. Given the expanding scope of applications, driven largely by the difficult epidemiological situation in the world, solving the described problem is an urgent issue. The main stages are described: the audio data stream is recorded and, following the "sound fingerprinting" approach, converted into an image that is a spectrogram of the sound data set. The stages of training a convolutional neural network on a pre-prepared set of sound data are described, as is the structure of the algorithm. To validate the neural network, a separate set of audio data, not used in training, was selected. As a result, graphs were constructed demonstrating the accuracy of the proposed method.
Keywords: neural network; human emotion recognition; convolutional neural network; sound fingerprinting; TensorFlow; Keras; Matlab; Deep Network Toolbox
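A minimal sketch of the "sound fingerprinting" step: an audio clip is turned into a log-magnitude spectrogram image that a convolutional network can consume. scipy stands in here for the Matlab/Deep Network Toolbox pipeline mentioned in the keywords, and the clip itself is synthetic.

```python
# Minimal sketch: audio clip -> normalized log-spectrogram "image" for a CNN.
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                                   # sample rate, Hz
t = np.arange(fs * 2) / fs                    # 2 seconds of toy "speech"
audio = np.sin(2 * np.pi * 440 * t) + 0.1 * np.random.randn(t.size)

f, times, Sxx = spectrogram(audio, fs=fs, nperseg=512, noverlap=256)
img = 10 * np.log10(Sxx + 1e-10)              # log scale, as in typical fingerprints
img = (img - img.min()) / (img.max() - img.min())  # normalize to [0, 1] for the CNN

print(img.shape)   # (frequency bins, time frames) -- the CNN's input "image"
```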
The paper presents a solution to the problem of accelerating the visualization of numerical simulation results. The volumes of such data can be very large, and developing tools to speed up the analysis of modeling results is an urgent task. This article proposes a solution based on a set of programs that automate the processing of large volumes of scientific data of the same type in order to create high-quality visualizations of numerical modeling results. The results are presented using the example of problems in astrophysics, but the proposed methodology can easily be applied to other subject areas in which models based on the dynamics of particle systems are used. The research is devoted to converting data obtained from numerical modeling into a format readable by the ParaView software, which implements many methods for obtaining very high-quality visualizations. The work also describes the automation of batch processing of large amounts of identically structured data, presents an analysis of the acceleration of the visualization process when using the NVIDIA IndeX plug-in, and considers the possibility of improving the quality of visualization results by applying Delaunay triangulation to the original data.
Keywords: data visualization, Delaunay triangulation, rendering acceleration, ParaView, NVIDIA IndeX, VTK
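A minimal sketch of the conversion step: dumping one snapshot of a particle system to a .vtp file that ParaView can open, via the VTK Python API. The field name, file name, and random data are illustrative assumptions; the paper's converter handles large batches of such files.

```python
# Minimal sketch: particle positions + a scalar field -> ParaView-readable .vtp.
import numpy as np
import vtk
from vtk.util.numpy_support import numpy_to_vtk

pos = np.random.rand(1000, 3)        # particle coordinates
density = np.random.rand(1000)       # a per-particle scalar field

points = vtk.vtkPoints()
points.SetData(numpy_to_vtk(pos, deep=True))

poly = vtk.vtkPolyData()
poly.SetPoints(points)

arr = numpy_to_vtk(density, deep=True)
arr.SetName("density")               # shows up as a selectable array in ParaView
poly.GetPointData().AddArray(arr)

writer = vtk.vtkXMLPolyDataWriter()
writer.SetFileName("snapshot_0000.vtp")
writer.SetInputData(poly)
writer.Write()
```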
Currently, the number of scientific papers on models, methods, software, and hardware for image processing and analysis is growing, owing to the widespread introduction of computer vision technologies into information processing and control systems. Approaches that provide fast, real-time image processing with limited computing resources are especially relevant; such approaches are usually based on low-level image filtering algorithms. One of the tasks solved in computer vision systems is the localization of round objects, which have the property of radial symmetry. The approach based on the Fast Radial Symmetry Transform, considered in this paper, is therefore effective for this problem. The paper describes the main steps of the basic transform, provides a procedure for determining the centers of radially symmetric areas for localizing round objects in images, and discusses examples of its application.
Keywords: computer vision, image processing, image analysis, localization of objects, methods of localization of round objects, fast radial symmetry transform, detection of the centers of radially symmetric areas
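A simplified, single-radius sketch of the Fast Radial Symmetry Transform in the spirit of Loy and Zelinsky: each strong-gradient pixel votes along its gradient direction at distance n, and maxima of the smoothed vote image mark centers of radially symmetric (round) regions. The radius, thresholds, clamp value, and input image are illustrative assumptions.

```python
# Simplified single-radius Fast Radial Symmetry Transform.
import cv2
import numpy as np

def frst(gray, n=10, alpha=2.0, grad_thresh=20.0):
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    mag = np.hypot(gx, gy)

    O = np.zeros_like(mag)                 # orientation projection image
    M = np.zeros_like(mag)                 # magnitude projection image
    ys, xs = np.nonzero(mag > grad_thresh)
    for y, x in zip(ys, xs):
        # Positively affected pixel: step n along the unit gradient direction.
        px = int(round(x + n * gx[y, x] / mag[y, x]))
        py = int(round(y + n * gy[y, x] / mag[y, x]))
        if 0 <= px < gray.shape[1] and 0 <= py < gray.shape[0]:
            O[py, px] += 1
            M[py, px] += mag[y, x]

    O = np.minimum(O, 8)                   # clamp the vote counts
    F = (O / 8.0) ** alpha * (M / M.max())
    return cv2.GaussianBlur(F, (0, 0), n / 2)  # smooth; peaks = circle centers

gray = np.float64(cv2.imread("coins.png", cv2.IMREAD_GRAYSCALE))  # hypothetical image
S = frst(gray)
cy, cx = np.unravel_index(np.argmax(S), S.shape)
print("strongest radially symmetric center:", (cx, cy))
```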
This study analyzes the capabilities of the Python programming language for creating information systems that detect dangerous objects in luggage. As a result, a recognition system architecture was developed that includes the following main components: an image processing module, a machine learning module, a database, and a user interface. Python was chosen as the software platform, together with the PySide6, SQLite, Numpy, and YOLO libraries. The information system was implemented and tested on real data, which confirmed the suitability of the selected Python capabilities and technologies for developing security information systems.
Keywords: information system, security, neural network, machine learning, pattern recognition, performance
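Since the abstract names YOLO among the libraries, here is a minimal sketch of what the machine-learning module's core call might look like with the ultralytics YOLO package. The weights file (a model fine-tuned on luggage scans), the image path, and the class names are hypothetical assumptions.

```python
# Minimal sketch: running a detector over one luggage scan.
from ultralytics import YOLO

model = YOLO("luggage_threats.pt")        # hypothetical fine-tuned weights
results = model("scan_0001.png")          # run detection on one scan

for box in results[0].boxes:
    cls_name = model.names[int(box.cls)]  # e.g. "knife", "aerosol" (hypothetical classes)
    conf = float(box.conf)
    print(f"{cls_name}: {conf:.2f} at {box.xyxy.tolist()}")
```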