Digital Turbine's Multi-Testing mechanism allows publishers to try out different variants within the same placement, to achieve the best returns from their configurations. Once you have analysed the results of your test, you can increase or decrease the percentage of traffic a variant receives, or deprecate the variant altogether.
The main benefit of Multi-Testing is that it lets you test different placement configurations on just a subset of your users, so you can understand the effect of a change before rolling it out to your entire traffic.
What Can be Tested?
Set out below are options that can be tested using Multi-Testing:
- Introducing a new network instance to your waterfall and measuring its impact
- Comparing a waterfall with instances with Auto CPM vs the same waterfall with Fixed CPM instances
- Adding bidding mediated instances vs. traditional mediation instances
- Sending different price floors to DT Exchange
- How the number of ads each user is receiving impacts performance
- How the frequency of ads each user is receiving impacts performance
- Comparing different country targeting options
- Banner refresh rate: test different rates, starting from 10 seconds
Multi-Testing - Key Components
The key components of a Multi-Testing experiment include:
|Experiment||The test run by showing different configurations to different groups of users|
|Variant||Defines the number of groups and the distribution of users to those groups. One variant is always the control group|
|Frequency||The percentage of traffic to be allocated to this variant at user level|
|Control Variant||A control variant or control group receives the current or default configurations of your placement. In other words, the control variant does not receive any changes and is used as a benchmark against which other test results are measured|
|Goal||The metric that you want to improve with the experiment. An experiment compares the goal metric across the variant groups so that you can see which change has the most desirable effect|
For an accurate split of traffic, make sure you are using DT FairBid SDK 3.1.0 or above.
When configuring Multi-Testing, follow these steps.
Step 1: Pre-Planning
Ask the following questions regarding the purpose of the test:
- What do you want to test?
Example: Introducing another network to my waterfall for a rewarded placement in my app, called "Rewarded iOS on launch"
- What is the experiment goal?
In other words, which metric are you optimising? Example: I'm optimising ARPDEU, which means average revenue per daily engaged user (an engaged user is a user who had at least 1 impression on that day). I want to make sure I make the most revenue from each unique impression
- How long should the experiment run?
Example: My placement has 20K impressions and 10K unique impressions per day. This means it would take about a week to get sufficient data for effective results.
- What is the “actionable” value below which you keep the control and above which you move to one of the test variants?
Example: If the test variant performs better than the control variant by at least 5%, I will use the test variant settings for my entire placement
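The "actionable" value from the last question can be written down as a simple decision rule before the experiment starts. The following is a minimal sketch; the function names and the ARPDEU figures are hypothetical and only illustrate the 5% example above.

```python
def relative_lift(control_arpdeu: float, test_arpdeu: float) -> float:
    """Relative improvement of the test variant over the control (0.05 means 5%)."""
    if control_arpdeu <= 0:
        raise ValueError("control ARPDEU must be positive")
    return (test_arpdeu - control_arpdeu) / control_arpdeu


def pick_variant(control_arpdeu: float, test_arpdeu: float, threshold: float = 0.05) -> str:
    """Return "test" when the lift reaches the actionable threshold, else "control"."""
    lift = relative_lift(control_arpdeu, test_arpdeu)
    return "test" if lift >= threshold else "control"


# Hypothetical ARPDEU values, for illustration only.
print(pick_variant(0.040, 0.043))  # lift of about 7.5%, above the 5% threshold
print(pick_variant(0.040, 0.041))  # lift of about 2.5%, below the 5% threshold
```

Fixing the threshold up front avoids choosing it after seeing the results.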
Step 2: Configuring a Multi-Testing Experiment
Follow these steps to configure a Multi-Testing experiment.
- Go to the Placement Setup page
- Move the toggle to start the test
A variant list opens with one variant. This variant uses your existing placement settings and should be used as the control group.
You can either:
- Create a new variant from scratch; or
- Duplicate your existing variant and edit it (recommended)
The only item on the list of variants is the current configuration of your placement. Its name is derived from the name of your placement.
Edit the name to make it more descriptive so that it includes (you can always rename later):
- The word: Control
- Experiment name
- Traffic Allocation percentage
- Experiment Start Date
For example: "Control Adding Verizon 80% 06 30 20"
- If you want to keep these configurations and add a single change, click the Duplicate button.
Another option appears in the list. This is your test group, also called a variant.
- Give the duplicate variant a descriptive name that includes the following (you can always rename later):
- The word: Test or Treatment
- Experiment name
- Traffic allocation percentage (the allocations of all variants must total 100%)
- Experiment Start Date
For example: "Test Adding Verizon 20% 06 30 20"
- An estimate is provided for how long it is recommended to run the test. Following this estimate helps ensure that the test results are statistically significant.
The number of recommended days depends on the placement ARPDEU and the traffic allocation set. The recommended duration is meant to ensure, at a 90% confidence level, that the results of the multi-test are reliable. Watch for any unusual days in terms of your users' behavior, such as a holiday or special sale day, that may occur during the multi-test period. We recommend adding one extra day to the recommended number of days above.
- Click the three vertical dots next to the test variant to change its configurations, such as Targeting, Floor Price, Capping and adding or changing ad network instances.
To test whether adding a network instance helps to increase your Average Revenue Per Engaged User, add a mediated network instance to your waterfall.
The experiment is set, and it will start generating data.
To create an A/B/C test, follow the steps above, add an additional variant, and make sure that the traffic allocations total 100%.
After an experiment has started, we advise that you no longer make changes to variant configurations to avoid invalidation of the test results.
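The naming convention and the 100% allocation rule from this step can be checked with a small helper before you enter the values in the console. This is only a sketch: the function names are hypothetical, the `MM DD YY` date format is inferred from the example names in this article, and nothing here calls a DT API.

```python
from datetime import date


def variant_name(role: str, experiment: str, allocation_pct: int, start: date) -> str:
    """Build a name such as 'Control Adding Verizon 80% 06 30 20'."""
    return f"{role} {experiment} {allocation_pct}% {start.strftime('%m %d %y')}"


def check_allocations(allocations: dict) -> None:
    """Raise if the variants' traffic allocations do not total 100%."""
    total = sum(allocations.values())
    if total != 100:
        raise ValueError(f"traffic allocations total {total}%, expected 100%")


start = date(2020, 6, 30)
allocations = {
    variant_name("Control", "Adding Verizon", 80, start): 80,
    variant_name("Test", "Adding Verizon", 20, start): 20,
}
check_allocations(allocations)  # passes silently: 80 + 20 == 100
for name in allocations:
    print(name)
```

For an A/B/C test, add a third entry and adjust the percentages so the check still passes.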
Stopping a Multi-Testing Experiment
To end a Multi-Testing experiment, do the following:
- Choose either the test or control to be the leading placement configuration
- Click the Multi-Testing toggle button to turn the status to off for the variant that you want to stop
Analysis of Multi-Testing Results
Results for the Multi-Testing are found using DT's Dynamic Reports.
- Go to Dynamic Reports
- Select App Performance
- Filter by: Publisher > App > Placement
- Split the data using the Variant Name dimension and then add the dimensions for your testing, such as Publisher Name, App Name and Placement Name
- Add metrics: Avg. Rev. per Engaged User, Fill Rate
- Compare your variants to see which one performs better.
The results of the Multi-Testing variants can be viewed in the report body:
|Avg. Rev. per Engaged User||Average Revenue (publisher payout) Per Daily Engaged User; Engaged Users are users that saw at least 1 ad from the particular ad placement being analyzed|
|Publisher Revenue||This metric depends on the allocation of the impressions. To perform a comparison, it should be normalized|
|Fill Rate||The number of ad requests filled by an ad network divided by the total number of ad requests, expressed as a percentage|
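If you export raw counts and want to recompute the two ratio metrics yourself, the arithmetic is straightforward. The sketch below assumes single-day figures and uses hypothetical parameter names; it also assumes that the fill rate denominator is the total number of ad requests. Dynamic Reports already provides both metrics, so this is for cross-checking only.

```python
def arpdeu(publisher_payout: float, daily_engaged_users: int) -> float:
    """Average revenue (publisher payout) per daily engaged user for one day."""
    if daily_engaged_users <= 0:
        raise ValueError("daily_engaged_users must be positive")
    return publisher_payout / daily_engaged_users


def fill_rate_pct(filled_requests: int, total_requests: int) -> float:
    """Share of ad requests that were filled, as a percentage."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * filled_requests / total_requests


# Hypothetical figures, for illustration only.
print(arpdeu(1000.0, 25_000))         # 0.04
print(fill_rate_pct(18_000, 20_000))  # 90.0
```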
To normalize the Publisher Payout:
|Variant||Avg eCPM||Publisher Payout|
|Test Adding Verizon 20% 06 30 20||1.5||1000|
|Control Adding Verizon 80% 06 30 20||1.2||1500|
Add a column to calculate what the revenue would have been had the variants received the same share of the traffic (for example, 50% each):
|Variant||Avg eCPM||Publisher Payout||Frequency||Normalized Payout (Assuming 50% Allocation)|
|Control Adding Verizon 80% 06 30 20||1.2||1500||80%||937.5|
|Test Adding Verizon 20% 06 30 20||1.5||1000||20%||2500 Winner|
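The normalization scales each variant's payout by the ratio of the target allocation to its actual allocation. The sketch below applies that to the payout and frequency values from the table; the function name is hypothetical, and the calculation assumes that revenue scales linearly with the share of traffic.

```python
def normalized_payout(payout: float, allocation_pct: float, target_pct: float = 50.0) -> float:
    """Scale a variant's payout to what it would be at target_pct of the traffic."""
    if not 0 < allocation_pct <= 100:
        raise ValueError("allocation_pct must be in the range (0, 100]")
    return payout * target_pct / allocation_pct


# (publisher payout, frequency in %) taken from the table above.
variants = {
    "Control Adding Verizon 80% 06 30 20": (1500.0, 80.0),
    "Test Adding Verizon 20% 06 30 20": (1000.0, 20.0),
}

results = {name: normalized_payout(payout, pct) for name, (payout, pct) in variants.items()}
for name, value in results.items():
    print(f"{name}: {value}")
print("Winner:", max(results, key=results.get))
```

Any common target works, as long as the same target is used for every variant.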
Now you can split the impressions between the 2 variants for several days and explore the results, or use the test version as the main configuration of the placement.
Multi-Testing Best Practices
The benefit of DT FairBid's Multi-Testing is that it allows you to run separate tests on each placement simultaneously. With sufficient traffic volume, you can safely run several experiments for each app.
Although you can technically run several experiments on one placement at the same time, DT recommends that you stick to one experiment per placement at a time. This is because different experiments may interact, since some users will see the configurations for both experiments.
For example, running one test to see the impact of introducing a new network to your interstitial placement waterfall and a different test to measure ad pacing on the same placement may make it difficult to understand which experiment caused which changes in the results.
If you do not perform your test on enough Daily Engaged Users (DEUs), the results are likely to be unreliable.
If your test sample is too small, the results of the test may not be accurate. For example, if one variant shows 150% higher ARPDEU but had only 5 DEUs, the result is not statistically significant.
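To see why a large lift on a tiny sample is not convincing, you can run a rough significance check on per-user revenue. The sketch below uses a two-sample z-test with unpooled variances. That choice is an assumption, since this article does not state which statistical test DT applies, and the revenue figures are made up to reproduce the "150% higher ARPDEU with 5 DEUs" example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sided_p_value(control: list, test: list) -> float:
    """Approximate p-value for a difference in mean revenue per engaged user.

    Uses a z-test with unpooled sample variances. The normal approximation
    is itself crude for very small samples, so treat the result as a rough guide.
    """
    std_err = sqrt(variance(control) / len(control) + variance(test) / len(test))
    z = (mean(test) - mean(control)) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical revenue per engaged user, 5 DEUs in each variant.
control = [0.10, 0.02, 0.05, 0.00, 0.03]  # mean 0.04
test = [0.30, 0.01, 0.02, 0.15, 0.02]     # mean 0.10, i.e. 150% higher

lift = (mean(test) - mean(control)) / mean(control)
p_value = two_sided_p_value(control, test)
significant = p_value < 0.10  # corresponds to a 90% confidence level

print(f"lift: {lift:.0%}, p-value: {p_value:.2f}, significant: {significant}")
```

With these numbers the p-value is well above 0.10, so the 150% lift cannot be distinguished from noise at the 90% confidence level.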