Competing in the gaming industry is becoming increasingly costly. To attract new players, millions of dollars are spent in user acquisition (UA), development, and cloud infrastructure. As the space becomes more expensive, revenue and cost optimizations can become a matter of survival.
Server outages, game response lag, or generally poor performance are costly to the business and can translate into billions of dollars in lost market capitalization for gaming companies. The value of observability in modern games lies in ensuring that revenue opportunities are not missed in a costly operating landscape. Observability protects the investment in user acquisition and helps maximize the gaming user experience (UX). User experience depends not only on how the UI performs, but also on the performance and reliability of the entire gaming environment, as described below.
A modern gaming environment is usually highly distributed – a lot of pieces must come together. The engineering team develops code for the gaming application, which can consist of many microservices, containers, managed cloud services, third-party APIs, and serverless functions. Game designers create game levels, the art team produces visual effects and animations, and the audio department creates game audio. Once these moving parts are packaged together, the game service has to be deployed on cloud infrastructure, and the game itself is shipped as executables for consoles, mobile devices, and laptops.
All these elements generate massive amounts of observability data, particularly if a game is popular and played by millions of users. To make sure that every element is working as expected, gaming companies have to implement efficient ways of collecting and analyzing that data, which comes from gaming services, consoles, and other real-time sources. Depending on how successful the game is, DevOps teams may need to ingest, store, and analyze dozens of terabytes of log data per day. As a result, improving the gaming experience can be quite challenging for companies that operate at a large scale. A company’s success depends on how efficiently it can harness all the data generated by its gaming environments.
This is where observability comes in. Observability provides critical insights into what is, or what may be, wrong with the game, all its moving parts, and user experience. Powerful observability capabilities are essential, including processing vast amounts of data (logs, metrics, etc.) in real time, managing fluctuating observability data from variable user loads, and running concurrent real-time queries on a massive scale.
In the paragraphs below, I’ll cover what needs to be measured for a great gaming user experience.
Measuring the gaming user experience
Gaming companies are under continuous financial pressure to deliver new features that attract and retain new users in order to offset mounting operating expenses. Attracting new users is a balancing act from the gaming companies’ perspective. If they move too slowly, they will lose gamers. If they move too quickly to deliver new features, the gaming user experience can deteriorate due to newly introduced bugs. Gamers can be very vocal on social networks about their dissatisfaction with compromised gaming quality. As a result, companies are forced to balance the rate of feature introduction with user satisfaction. That’s why they need to measure user experience continuously.
To improve the gaming user experience, gaming organizations must establish measurable key performance indicators (KPIs) that drive improvements in overall quality. KPIs include, but are not limited to:
User video quality (frames per second)
Crash reports count
Database performance measurements
User request count
Game-specific errors and warnings count
Code quality measurements, such as automated test coverage success criteria (a part of the CI/CD pipeline)
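As a minimal sketch of how a few of these KPIs might be derived from raw telemetry, the Python snippet below aggregates a batch of events into counts and an average frame rate. The event shape (a "type" field, plus "fps" on frame events) and the function name summarize_kpis are assumptions made for illustration; they do not come from any particular game engine or observability product.

```python
from collections import Counter
from statistics import mean


def summarize_kpis(events):
    """Aggregate a batch of telemetry events into a few KPIs.

    Each event is a dict with a hypothetical "type" field; events of
    type "frame" also carry an "fps" value.
    """
    counts = Counter(event["type"] for event in events)
    fps_samples = [event["fps"] for event in events if event["type"] == "frame"]
    return {
        "avg_fps": mean(fps_samples) if fps_samples else None,
        "crash_count": counts["crash"],
        "request_count": counts["request"],
        "error_count": counts["error"],
        "warning_count": counts["warning"],
    }


# Made-up sample data, for illustration only.
sample_events = [
    {"type": "frame", "fps": 60},
    {"type": "frame", "fps": 30},
    {"type": "request"},
    {"type": "request"},
    {"type": "crash"},
    {"type": "warning"},
]
kpis = summarize_kpis(sample_events)
print(kpis)
```

In practice these aggregates would be computed continuously by the observability pipeline rather than over an in-memory list, but the KPI definitions stay the same.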
Monitoring gaming service latency is essential for a high-quality gaming experience
Gaming applications are often bursty when it comes to user load, especially during the introduction of new games or new features. Service latency is one of the critical indicators to monitor when user demand fluctuates like this.
DevOps teams and SREs need to collect logs at a massive scale to analyze the performance of services and infrastructure, along with individual host metrics such as CPU utilization. This information gives engineers clues about the sources of bottlenecks and about conditions that can degrade service latency. Logging tools, in turn, need to be able to keep up with the fluctuating demand. This level of insight into the state of the gaming application helps engineers troubleshoot issues before gamers face increased service latency that can negatively impact their gaming experience.
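One common way to watch service latency under bursty load is to track a high percentile over a sliding window of recent requests, because averages tend to hide the slow tail. The sketch below is one possible implementation, not a prescribed one: the class name LatencyMonitor, the window size, and the 200 ms threshold are all assumptions chosen for the example.

```python
import math
from collections import deque


class LatencyMonitor:
    """Keeps the most recent latency samples and checks a p95 threshold."""

    def __init__(self, window_size=1000, p95_threshold_ms=200.0):
        # A bounded deque drops the oldest sample once the window is full.
        self.samples = deque(maxlen=window_size)
        self.p95_threshold_ms = p95_threshold_ms

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def percentile(self, pct):
        """Nearest-rank percentile of the current window, or None if empty."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]

    def is_degraded(self):
        p95 = self.percentile(95)
        return p95 is not None and p95 > self.p95_threshold_ms


# Made-up samples: most requests are fast, but 10% are very slow.
monitor = LatencyMonitor(window_size=100, p95_threshold_ms=200.0)
for latency in [50] * 90 + [400] * 10:
    monitor.record(latency)

print(monitor.percentile(50), monitor.percentile(95), monitor.is_degraded())
```

Here the median still looks healthy while the 95th percentile exceeds the threshold, which is the kind of early signal that lets engineers act before most players notice.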
Analyzing game-specific observability data
Game developers can create custom logs and metrics to combine with observability data from the gaming engines. Both kinds of data must be analyzed to ensure a high-quality user experience. Some common gaming data parameters available for analysis are:
Log data that indicates when a player joins and leaves the game (associated with the player’s name)
The location of a player reported at specified times
Logs related to significant gaming events associated with each player (such as goals achieved or actions performed)
Logs related to chat messages between players, player location, the time when the message was generated, and message context
The more data is collected, the better the chances of understanding the issues that can turn players away from the game. Observability data management should evolve in parallel with game development. As new features are added, the code should be instrumented with KPIs to ensure the long-term quality of the newly introduced code.
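Custom game events like the ones listed above are easiest to analyze when they are emitted as structured records. The following sketch builds JSON log lines for player events; the field names ("ts", "event", "player", "position") and the helper make_player_event are hypothetical and only illustrate the idea.

```python
import json
import time


def make_player_event(event, player, position=None, timestamp=None, **details):
    """Build one structured log line (JSON) for a player-related event."""
    record = {
        "ts": time.time() if timestamp is None else timestamp,
        "event": event,
        "player": player,
    }
    if position is not None:
        record["position"] = {"x": position[0], "y": position[1]}
    record.update(details)  # event-specific fields, e.g. goal or message text
    return json.dumps(record, sort_keys=True)


# Made-up events; timestamps are fixed so the output is reproducible.
join_line = make_player_event("player_joined", "alice", timestamp=1700000000.0)
goal_line = make_player_event(
    "goal_achieved",
    "alice",
    position=(12.5, 40.0),
    timestamp=1700000042.0,
    goal="first_boss_defeated",
)
print(join_line)
print(goal_line)
```

Keeping the player name, time, and location in consistent fields makes it straightforward to correlate joins, departures, chat messages, and in-game events for the same player later on.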
Database performance monitoring is critical for gaming user experience
To store time-sensitive gaming data, DevOps teams must use fast databases. How quickly the game’s properties can be retrieved from the database directly impacts user satisfaction, and slow database performance degrades the gaming user experience. To avoid service issues, DevOps teams must measure key database performance indicators such as response-time metrics and out-of-memory errors in the logs.
Fast databases can be expensive, and gaming companies in an extremely competitive market are always looking for cost optimizations to survive. It is not trivial for DevOps teams to monitor the user experience as it relates to database performance and, at the same time, to look for ways to reduce the cost of fast databases.
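Response time is the simplest database indicator to start with. The sketch below wraps a query in a timing context manager and flags slow calls. It uses an in-memory SQLite database purely as a stand-in for the game's real data store; the table name, the helper timed_query, and the 50 ms threshold are assumptions for illustration.

```python
import sqlite3
import time
from contextlib import contextmanager

# In a real system these timings would be shipped to a metrics backend.
query_timings_ms = []


@contextmanager
def timed_query(name):
    """Record how long the wrapped database call takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        query_timings_ms.append((name, elapsed_ms))


conn = sqlite3.connect(":memory:")  # stand-in for the game's real database
conn.execute("CREATE TABLE player_props (player TEXT PRIMARY KEY, level INTEGER)")
conn.execute("INSERT INTO player_props VALUES ('alice', 7)")

with timed_query("load_player_props"):
    row = conn.execute(
        "SELECT level FROM player_props WHERE player = ?", ("alice",)
    ).fetchone()

SLOW_QUERY_MS = 50.0  # hypothetical threshold
slow_queries = [(name, ms) for name, ms in query_timings_ms if ms > SLOW_QUERY_MS]
print(row, query_timings_ms, slow_queries)
```

Because the timing is recorded in a finally block, queries that raise an exception are still measured, which matters when slow calls and failures tend to occur together.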
Continuous build and delivery monitoring
For a large-scale video game, new code is often pushed many times per day. Builds can be large and are deployed across many gaming consoles and end devices. Continuous build and delivery monitoring is therefore crucial for these frequent and often large software deployments.
With large builds deployed daily over the network to thousands of gaming platforms where each platform receives a massive amount of data, monitoring of build deployments should be automated. Errors from log data should trigger alerts and drive redeployments. QA automation is necessary to reduce the burden of QA efforts and to make build deliveries less error-prone and less tedious.
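The idea that errors in deployment logs should trigger alerts and drive redeployments can be sketched as a small check over log lines. The log format, the ERROR and FATAL markers, and the function evaluate_deployment are assumptions for this example; a real pipeline would hook the result into its own alerting and deployment tooling.

```python
import re

# Treat lines containing these severity markers as deployment errors (assumed format).
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")


def evaluate_deployment(log_lines, max_errors=0):
    """Return the error lines found and whether a redeploy should be triggered."""
    errors = [line for line in log_lines if ERROR_PATTERN.search(line)]
    return {"errors": errors, "redeploy": len(errors) > max_errors}


# Made-up deployment log, for illustration only.
deploy_log = [
    "2024-01-01T10:00:00Z INFO  uploading build to target device",
    "2024-01-01T10:00:05Z ERROR checksum mismatch on asset bundle",
    "2024-01-01T10:00:06Z INFO  rollback started",
]
result = evaluate_deployment(deploy_log)
if result["redeploy"]:
    print(f"ALERT: {len(result['errors'])} error(s) found, scheduling redeploy")
```

The max_errors parameter leaves room for a tolerance if some errors are known to be harmless, though a threshold of zero is the safer default for build deliveries.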
Better observability improves in-game purchases and advertisements
Insights into user behavior are as important as monitoring possible issues affecting user experience. Observability tools can provide key insights into the overall engagement of all game users. High engagement then drives more excitement. Higher user excitement may increase the success of ad campaigns and accelerate in-game purchases.
Properties of observability tools important for managing gaming user experience
When thinking about observability tools that can be instrumental in driving continuous improvement of user experience in the gaming industry, the following functionalities are important:
Ability to ingest and analyze log data at petabyte scale and beyond in real time to avoid gaming experience deterioration.
Zero-schema ingestion and the ability to accept observability data in various formats from various gaming platforms, cloud servers, gaming consoles, and mobile devices. Because gaming environments use different third-party software packages, logs may come in many formats.
Ability to create alerts on collected observability data. Alerting can improve user experience by monitoring properties such as the number of active users experiencing client errors.
Capability to handle backpressure when critical issues occur. An alert storm can clog the data pipeline and leave the observability tool in the dark for hours unless it is designed to cope with backpressure.
Ability to collect data from build pipelines to monitor frequent and commonly large software builds being deployed daily over the network to numerous gaming platforms.
Capability to collect data from cloud environments, for example for cloud infrastructure monitoring.
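Two of the properties above, accepting logs in mixed formats and alerting on the number of active users experiencing client errors, can be illustrated together. The sketch below is a simplified assumption of how such logic might look: it handles only JSON lines and space-separated key=value lines, and the field names ("level", "source", "user") and the alert threshold are hypothetical.

```python
import json


def parse_log_line(line):
    """Parse a JSON line or a space-separated key=value line into a dict."""
    line = line.strip()
    if line.startswith("{"):
        try:
            return json.loads(line)
        except json.JSONDecodeError:
            return {"raw": line}
    fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
    # Keep unparseable lines instead of dropping them.
    return fields or {"raw": line}


def users_with_client_errors(lines):
    """Return the set of users that reported at least one client error."""
    affected = set()
    for line in lines:
        record = parse_log_line(line)
        if record.get("level") == "error" and record.get("source") == "client":
            affected.add(record.get("user"))
    affected.discard(None)  # ignore error records without a user field
    return affected


ALERT_THRESHOLD = 2  # hypothetical number of affected users that triggers an alert

# Made-up log lines in three different shapes.
lines = [
    '{"level": "error", "source": "client", "user": "alice"}',
    "level=error source=client user=bob",
    "level=info source=server user=carol",
    "unstructured line from a third-party package",
]
affected = users_with_client_errors(lines)
should_alert = len(affected) >= ALERT_THRESHOLD
print(sorted(affected), should_alert)
```

Counting distinct users rather than raw error lines keeps a single misbehaving client from triggering the alert on its own.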
In this article, I described how observability can protect gaming revenue opportunities and ensure the success of video games. I also outlined what to look for in observability tools used to gather data from gaming environments.
Join Era Software at the Game Developers Conference in San Francisco
We are excited to sponsor the Game Developers Conference in San Francisco. Stop by Era Software’s booth, chat with us, and see a demo.
If you prefer to explore Era Software products on your own, sign up at cloud.era.co/signup.