The solution was found not in building new servers, but in the cloud. Between August 2022 and the second quarter of 2023, the company deployed a powerful data platform in Yandex Cloud, turning information from a chaotic set of numbers into a strategic asset.
Phased digitalization
The foundation was laid back in 2019. The company began large-scale digitalization by introducing SAP systems: S/4HANA for enterprise resource planning, HCM for personnel management, PP/DS for detailed production planning and PM for equipment maintenance. These solutions created a common information space and made business processes more transparent.
However, data from the different systems remained scattered. The company needed a single corporate data warehouse that could consolidate information and provide fast analytics.
Why the cloud? Why Yandex?
The choice of cloud infrastructure was strategic. Deploying key components and scaling capacity is much faster than building a traditional data center, and there is no need to invest significant funds in administration at the start.
When choosing a provider, the company had a number of mandatory criteria. It was critical that the provider host its servers in Russia and hold all the necessary data security certifications, including compliance with the requirements of Federal Law 152-ФЗ and the international standard ISO/IEC 27001. In addition, a key argument was the provider's complete stack of managed services, from Compute Cloud virtual machines to the PostgreSQL and ClickHouse DBMSs, as well as its own cloud BI system, DataLens.
From pilot to production
August 2022: launch of a pilot project for the corporate data warehouse. As a first step, a report on the delivery of finished products was built on test data. First quarter of 2023: the company's data was divided into domains, creating the basis for structured analytics. Second quarter of 2023: construction of BI reports began for three key domains: procurement, personnel and production. Initially only these three divisions took part in the project; now Rospolim is rolling the solution out to all services.
The enterprise's modern cloud platform is built on the principle of pipeline data processing. Apache Airflow, deployed on Compute Cloud virtual machines, serves as the central orchestrator of data flows. Data engineers use it to design the methods for collecting, transforming and transporting information. The system collects raw data from several sources (including the SAP systems) and sends it to Object Storage, a reliable digital warehouse for raw material. Deltas are also placed in the object storage, which preserves the history of the data. From Object Storage, the data is loaded into a Yandex Managed Service for PostgreSQL cluster, where the detailed data layer (DDS) is implemented as a snowflake schema. This architectural model provides an optimal structure for analytical queries.
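The role of deltas in preserving history can be illustrated with a minimal sketch. It assumes each delta records a load date, a business key and either the new row contents or a deletion marker; the `Delta` class, the `state_as_of` function and the sample orders are hypothetical and do not come from the company's platform.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Delta:
    """One change record, as it might sit in object storage (hypothetical layout)."""
    load_date: date        # day the delta was extracted
    key: str               # business key of the row
    row: Optional[dict]    # new row contents, or None if the row was deleted


def state_as_of(deltas: List[Delta], as_of: date) -> Dict[str, dict]:
    """Replay deltas in load order to rebuild the table as it looked on `as_of`."""
    state: Dict[str, dict] = {}
    for d in sorted(deltas, key=lambda d: d.load_date):
        if d.load_date > as_of:
            break  # later deltas do not affect the requested snapshot
        if d.row is None:
            state.pop(d.key, None)  # deletion
        else:
            state[d.key] = d.row    # insert or update
    return state


# Illustrative data only.
deltas = [
    Delta(date(2023, 1, 10), "order-1", {"status": "planned", "tons": 12}),
    Delta(date(2023, 1, 11), "order-2", {"status": "planned", "tons": 5}),
    Delta(date(2023, 1, 12), "order-1", {"status": "shipped", "tons": 12}),
    Delta(date(2023, 1, 13), "order-2", None),
]

print(state_as_of(deltas, date(2023, 1, 11)))
print(state_as_of(deltas, date(2023, 1, 13)))
```

Because the deltas are never overwritten, any past state of a table can be rebuilt by replaying them up to the date of interest.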
dbt (Data Build Tool) is used to manage queries and track data lineage. With it, the data engineers transform the data, split the detailed layer into entities and convert the information into a convenient format. Prepared data is moved to the Data Mart layer: specialized data marts stored in a Managed Service for ClickHouse cluster. This provides high speed for analytical queries. Analysts build dashboards in DataLens on top of the data marts. An important condition: reports are released to production only after their data has been described and registered in the Lottabyte data catalog.
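The release condition can be sketched as a simple gate. This is only an illustration under stated assumptions: the catalog is modeled as a plain dictionary from table name to description, and `undocumented_sources`, `can_release` and the table names are hypothetical. It does not use the Lottabyte API.

```python
from typing import Dict, Iterable, List


def undocumented_sources(report_tables: Iterable[str], catalog: Dict[str, str]) -> List[str]:
    """Return the tables a report uses that lack a non-empty catalog description."""
    return sorted(t for t in set(report_tables) if not catalog.get(t, "").strip())


def can_release(report_tables: Iterable[str], catalog: Dict[str, str]) -> bool:
    """A report may go to production only if every source table is described."""
    return not undocumented_sources(report_tables, catalog)


# Hypothetical catalog contents: table name -> description.
catalog = {
    "mart.procurement_orders": "Purchase orders by supplier and month",
    "mart.hr_headcount": "   ",  # registered, but the description is blank
}

report = ["mart.procurement_orders", "mart.hr_headcount", "mart.production_output"]
print(can_release(report, catalog))
print(undocumented_sources(report, catalog))
```

Returning the list of missing tables, rather than only a yes/no answer, tells the report's author exactly what still has to be described before release.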
Impressive numbers
The platform demonstrates impressive characteristics: 250 tables in total (of which 60 are in the production environment), up to 1.5 billion records, about 1 terabyte of information across the test and production environments, and a daily data load at the end of the working day.
In the Lottabyte catalog, analysts and business experts keep records of data assets, define indicators and metrics, describe data products and maintain a register of data quality checks. This solved a crucial problem: terminology is now the same for all services in the company. A specialized data management department has been created, which carried out the main tasks of the project: creating the platform, developing regulations for working with data, raising the quality of decisions and developing data competencies.
Security above all: hybrid model and tokenization
For a metallurgical company, data security is critical. The company still stores its most sensitive data (prices, personal data) on-premises. However, to work safely with confidential data in the cloud, the company completed a tokenization pilot project using the Damask solution, together with Yandex Cloud and the partner BSSG. Tokenization makes it possible to mask sensitive information without losing the ability to analyze it.
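The general idea of tokenization can be shown with a minimal sketch. It is not the Damask implementation: it assumes deterministic tokens derived with HMAC-SHA256, so that equal values receive equal tokens and can still be grouped or joined, while the mapping back to the original values stays in a vault outside the cloud. The `Tokenizer` class, the key and the sample rows are hypothetical.

```python
import hashlib
import hmac
from collections import Counter


class Tokenizer:
    """Deterministic keyed tokenization (illustrative, not a production design)."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._vault = {}  # token -> original value; would remain on-premises

    def tokenize(self, value: str) -> str:
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        token = "tok_" + digest[:16]  # truncated for readability in this sketch
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._vault[token]


tokenizer = Tokenizer(secret_key=b"demo-key-not-for-real-use")

# Hypothetical rows: (employee name, shift).
rows = [("Ivanov", "day"), ("Petrov", "night"), ("Ivanov", "night")]
masked = [(tokenizer.tokenize(name), shift) for name, shift in rows]

# Analysis still works on tokens: count shifts per (masked) employee.
shifts_per_employee = Counter(token for token, _ in masked)
print(shifts_per_employee)
```

Because the same input always yields the same token, aggregations in the cloud give the same counts as they would on the original values, and only a holder of the vault can translate a token back.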
Results: from days to seconds
Digital transformation has already delivered results: a radical reduction in the time needed to prepare analytical reports, from days to seconds; access to data for line managers and top managers; and a single information space for all departments. In addition, there is a growing knowledge base aimed at helping employees build reports on their own.
Dimitri Volkov, Director of Digital Transformation of Rupolim JSC, explains: "The platform's entire infrastructure is located in Yandex Cloud, which simplifies management, increases the speed of work and guarantees reliability. In the future, we plan to load streaming data, such as equipment readings, into the warehouse. The company is also going to build ML models, for example to plan equipment repairs and control smelting quality."
It is also planned to bring the Damask solution into the production environment to protect confidential data during storage and further analysis. The company also continues to develop the platform and optimize the data loading process.