Developing a High-Load Event Processing Service for a Major Client
The Evrone team developed a big data processing service for the advertising agency SberMarketing. The system, implemented in Python, is designed to process at least 100,000 requests per second and serves as a Data Lake for enriched data.
SberMarketing is a leader in the advertising market, comprising six business units. It offers a full range of services, from creating and promoting advertising content across various channels and building marketing, digital, and AI products to organizing events, creating digital avatars, and producing branded merchandise.
One of SberMarketing's digital products is “Platina,” a web analytics platform. This platform collects data, securely stores it, and allows users to generate reports on user activity, track traffic sources, and evaluate campaign effectiveness.
The Task
The client approached us with a detailed technical specification. Our task was to select the technology and build a solution for receiving and storing incoming events (anonymized user actions), processing incoming requests, and performing initial analytics. The system had to accept, enrich, and temporarily store data before sending it to a central system for further storage and use. Events arrived with varying sets of fields, and our job was to standardize them into a complete form and pass them to long-term archive storage, where they would remain available for future access.
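The standardization step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names, the defaults, and the normalize_event helper are all hypothetical, since the article does not describe the real event schema.

```python
from typing import Any

# Hypothetical schema: the article does not list the real event fields.
REQUIRED_FIELDS = ("event_id", "event_type", "timestamp")
OPTIONAL_DEFAULTS: dict[str, Any] = {
    "session_id": None,
    "url": None,
    "referrer": None,
}


def normalize_event(raw: dict[str, Any]) -> dict[str, Any]:
    """Bring an event with a varying set of fields to one complete shape."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"event is missing required fields: {missing}")

    # Required fields are copied as-is.
    event = {name: raw[name] for name in REQUIRED_FIELDS}

    # Optional fields are always present in the output, with a default
    # when the incoming event omitted them.
    for name, default in OPTIONAL_DEFAULTS.items():
        event[name] = raw.get(name, default)

    # Unknown fields are kept rather than dropped, so no part of the
    # event is lost during standardization.
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_DEFAULTS)
    event["extra"] = {key: value for key, value in raw.items() if key not in known}
    return event
```

Every normalized event then has the same set of keys, which keeps downstream enrichment and archiving code free of per-event field checks.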
The biggest challenge was the load requirements. The service had to process 100,000 messages per second without losing a single event.
Choosing the Tech Stack
The client did not have strict requirements for the stack, except for a preference for Go and the use of tools that would allow the solution to be sold to a variety of clients in the future.
Go seemed appealing to the client due to its speed and ease of use in microservices. While this was true, we noted that the client’s primary development stack was Python, so we decided to explore options that could handle such high loads while staying within Python’s ecosystem. At the time, finding Go developers was significantly harder than finding Python developers, and we wanted to make sure that the client could comfortably work with our solution over the long term.
Our challenge was to find a framework that would help us adapt Python for big data processing and analytics. We tested AIOHTTP, Litestar, and Robyn, but none met our requirements. We chose Granian, an HTTP server for Python applications implemented in Rust. This created a successful combination: Granian boosted speed, while Python allowed us to write business logic quickly.
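To illustrate the combination, here is a minimal sketch of a plain ASGI application that Granian could serve. It is not the project's code: the module name main.py, the 202 response, and the _respond helper are assumptions made for the example.

```python
import json


async def _respond(send, status: int, payload: dict) -> None:
    """Send a small JSON response (hypothetical helper for this sketch)."""
    body = json.dumps(payload).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def app(scope, receive, send) -> None:
    """Accept a JSON event via POST and confirm that it was received."""
    if scope["type"] != "http":
        return  # this sketch only handles HTTP requests
    if scope["method"] != "POST":
        await _respond(send, 405, {"error": "method not allowed"})
        return

    # Read the request body, which may arrive in several chunks.
    body = b""
    while True:
        message = await receive()
        body += message.get("body", b"")
        if not message.get("more_body", False):
            break

    try:
        json.loads(body)
    except json.JSONDecodeError:
        await _respond(send, 400, {"error": "invalid JSON"})
        return

    # A real service would hand the event to enrichment and storage here.
    await _respond(send, 202, {"accepted": True})
```

Assuming the code is saved as main.py, it can be started with: granian --interface asgi main:app. The business logic stays in ordinary Python, while the HTTP handling runs in the Rust server.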
Testing
Evrone engineers wrote four test services: in Go, Python, Python+Granian, and Rust. Our goal was to achieve the required performance with a reasonable setup.
For load testing, we used Locust. We deployed the test environment in Kubernetes and automated it with Terraform. Docker images were created for each service, and the test scenarios involved gradually increasing the load while recording intermediate results.
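A gradually increasing load can be described by a simple step schedule. The sketch below is standard-library Python with hypothetical numbers (100 users added every 60 seconds, up to 1,000); the article does not give the real test parameters. In Locust, a function like this could back the tick() method of a LoadTestShape subclass, which returns a (user_count, spawn_rate) tuple, or None to stop the test.

```python
from typing import Optional


def users_at(
    run_time_s: float,
    step_users: int = 100,        # hypothetical: users added per step
    step_duration_s: int = 60,    # hypothetical: length of each step
    max_users: int = 1000,        # hypothetical: stop after this level
) -> Optional[int]:
    """Return the target number of simulated users, or None to stop."""
    step = int(run_time_s // step_duration_s) + 1
    users = step * step_users
    if users > max_users:
        return None
    return users
```

Holding each level for a fixed time makes it possible to record intermediate results per step before the load increases again.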
The results were as expected: on a single core, Go achieved 5.7 thousand RPS (requests per second), while Python+Granian reached 5.5 thousand RPS. This confirmed our theory that we could achieve the required performance without drastically changing the stack. Native Python, however, delivered significantly lower results.
It’s worth mentioning that Rust showed the best results, handling 14 thousand RPS. However, commercial development in Rust is quite rare and mostly found in crypto projects and cloud services at global tech giants. Finding specialists is extremely difficult, development costs are high, and scaling the project would be impractical.
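As a rough back-of-the-envelope check, the single-core figures can be turned into a core count for the 100,000 requests per second target. The calculation assumes throughput scales linearly with the number of cores, which is an assumption for illustration only; the article does not report multi-core measurements, and the cores_needed helper is hypothetical.

```python
import math

TARGET_RPS = 100_000

# Single-core throughput figures quoted in the article.
PER_CORE_RPS = {
    "Go": 5_700,
    "Python+Granian": 5_500,
    "Rust": 14_000,
}


def cores_needed(target_rps: int, per_core_rps: int) -> int:
    """Cores required under the (optimistic) linear-scaling assumption."""
    return math.ceil(target_rps / per_core_rps)


estimates = {name: cores_needed(TARGET_RPS, rps) for name, rps in PER_CORE_RPS.items()}
# estimates == {"Go": 18, "Python+Granian": 19, "Rust": 8}
```

Under this assumption, Go and Python+Granian differ by a single core at the target load, which is consistent with the decision to stay within the Python ecosystem.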
Data Lake
In addition to high input loads, the service we developed needed to store data temporarily for enrichment. We created a Data Lake to accumulate large volumes of streaming data.
One of the client’s key requirements was complete data preservation: no event, or even part of an event, could be lost. We therefore relied on acknowledgments, confirming delivery each time data moved between parts of the system. For the main storage, we chose tools that support write-ahead logging, a technique that records each change in a log before it is applied to the database. After a sudden system reboot, the log makes it possible to either complete interrupted operations or roll back partial changes.
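The idea behind write-ahead logging combined with acknowledgments can be shown with a toy example. This is a simplified sketch, not how Tarantool or the project implements it: the WalStore class, the JSON-lines log format, and the file layout are all hypothetical.

```python
import json
import os


class WalStore:
    """Toy in-memory store that logs every write to disk before applying it."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.data: dict[str, dict] = {}
        self._replay()
        self._log = open(path, "ab")

    def _replay(self) -> None:
        """Rebuild state from the log and drop a torn final record, if any."""
        if not os.path.exists(self.path):
            return
        valid_bytes = 0
        with open(self.path, "rb") as log:
            for line in log:
                if not line.endswith(b"\n"):
                    break  # incomplete record: the write was never acknowledged
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # corrupted record: stop replaying here
                self.data[record["id"]] = record["event"]
                valid_bytes += len(line)
        # Cut off anything after the last complete record.
        os.truncate(self.path, valid_bytes)

    def put(self, event_id: str, event: dict) -> bool:
        """Log the change, force it to disk, apply it, then acknowledge."""
        record = json.dumps({"id": event_id, "event": event}).encode("utf-8")
        self._log.write(record + b"\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.data[event_id] = event
        return True  # the acknowledgment: safe for the sender to move on

    def close(self) -> None:
        self._log.close()
```

The sender treats an event as delivered only after put() returns. If the process dies before that, the record is either fully in the log and restored on restart, or incomplete and discarded, in which case the sender never received an acknowledgment and is expected to send the event again.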
Given the client’s requirements, we decided to use Tarantool, an in-memory platform with a flexible data schema. Tarantool was originally designed not as an external system or database, but as a full-fledged application server that runs other applications written in Lua. Essentially, it is a runtime for other applications with temporary or persistent data storage capabilities. The product has undergone several architectural changes, particularly in its core and deployment support tools. However, most of the big data tools we needed were available only in the enterprise version, which we couldn’t test. The community edition was no longer being updated, so we had to improve it ourselves.
One of the main challenges was the lack of an operator for Kubernetes. In our attempts to configure it for cluster deployment, our team even contacted the creator of Tarantool. Eventually, we were able to assemble a cluster with the required functions.
Result
We presented the client with a choice: we tested all four trial services live. The SberMarketing team agreed with our arguments and supported the development of the project in Python and Granian.
Evrone completed the project and handed it over to the client. We thoroughly tested the system, conducted unit, load, security, and automated component tests, and provided documentation. Python developers at the company, already familiar with the stack, will be able to maintain the system, and the client won’t need to complicate the architecture with strictly typed languages for dynamic rules.