Churn Prediction With BigQueryML to Increase Mobile Game Revenue

RudderStack
4 min readAug 9, 2021

Powering Google BigQuery ML with Rudderstack

One of my favorite parts about working at Rudderstack is having the opportunity to speak with our customers and hear how they’re using the data collected via the Rudderstack application to improve their bottom line. Yes, improving customer engagement through better personalization is the goal of all of our customers, but at the end of the day those efforts need to impact the bottom line.

I recently had the opportunity to speak with the team at Torpedo Labs about how they increased revenue for Wynn Casino’s Wynn Slots app to the tune of $10,000 a day by reducing customer churn. Here’s how they did it.

Phase 1 — Powerful Dashboards To Define Engagement

You may recall a blog post in 2020 where we discussed how Torpedo was leveraging the RudderStack Unity SDK to send over 1 billion events per month to Redshift to power a robust analytics dashboard built with SuperSet. This cost-effective solution shed light on what player activities were important for driving engagement and helped refine the definition of high-value player cohorts as well as identify player churn. After crossing the 100 million download milestone, the teams at Wynn Resorts and Torpedo were looking for innovative ways to continue improving the gaming experience, user engagement and most importantly, reduce the churn on those high-value customers making in-app purchases.

Phase 2 — Churn Prediction Models Using BigQuery ML

In early 2021, the Data teams from Torpedo and Rudderstack got together to discuss options for leveraging the data on their high-value players — the 5% of their customers who are responsible for 95% of their revenue. The idea proposed was to concurrently send the data being collected on certain cohorts of players to Redshift and BigQuery and leverage the power of Google BigQuery ML to create a churn prediction score for each user.

But before they could unleash the power of machine learning, the data needed to be loaded into Google BigQuery and Redshift, a simple task that was handled by connecting a second destination on the existing Unity SDK source within Rudderstack.

Once the data was loaded into BigQuery, the process of training and testing the model could begin. The player data set was split by active vs. inactive and high value vs. non high value players. In addition to the churn flag, a player feature set was defined which included various metrics on different player activities as well as a player_pool value. The player_pool value is used to differentiate the players into two buckets, one for Testing and the other for Training. To ensure this process is repeatable and not overly weighted, the player_pool flag is reset based on the total number of player events.

A standard multiclass logistic regression model with automatically calculated weights was created within BigQuery ML using the player set tagged as Training.

CREATE OR REPLACE MODEL `rudder_project..tbl_create_model_1`
OPTIONS (
model_type='LOGISTIC_REG'
,auto_class_weights=TRUE
,data_split_method='NO_SPLIT'
,input_label_cols=['churn']
,max_iterations = 12)
AS
SELECT *
FROM `rudder_project..tbl_player_reg`
where player_pool = "training" and high_value = 'false'

The model creation process is iterative and the above steps repeated until the optimum performance was achieved.

SELECT * 
FROM ML.EVALUATE
(
MODEL `rudder_project..tbl_create_model_1`,
(
SELECT *
FROM `rudder_project..tbl_player_reg`
WHERE player_pool = "test" and high_value = 'false'
)
)

To predict the churn of our high value players, the same player feature set table will be filtered for high_value equals true.

SELECT * FROM  (
SELECT u as user_id, predicted_churn
FROM ML.PREDICT
(
MODEL `rudder_project..tbl_create_model_1`,
(
SELECT *
FROM `rudder_project..tbl_player_reg`
WHERE high_value = 'true'
)
, STRUCT(0.5 AS threshold)
) order by predicted_churn
)

The output of the prediction model is a player rating from 0 to 1 with values of .5 or higher indicating a likelihood of a player churning. A JSON results file is then imported back into the application itself to generate push notifications and emails with incentives and offers to persuade players to re-engage with the app.

The player ratings make it easy for the team at Torpedo labs to identify which of Wynn Casino’s highest value customers are dangerously close to churning, and their re-engagement efforts are generating revenue gains of $10,000 per day. Innovative use cases like this are what get us excited here at RudderStack, and this RudderStack — BigQuery combination is just one example of how you can leverage Rudderstack to fuel your machine learning applications.

Sign up for free and start sending data

Test out our event stream, ELT and reverse-ETL pipelines. Use our HTTP source to send data in less than 5 minutes, or install one of our 12 SDKs in your website or app. Get started.

This blog was originally published at:
https://rudderstack.com/blog/churn-prediction-with-bigqueryml

--

--