A/B Testing Myths

Hadi · 7 min read, 1351 words

Categories: Android

[Header image: the scientific method]

Do you use more than two implementation versions when you’re A/B testing? Or do you implement them with some if conditions in your Presenters, ViewModels, etc.? If yes, you’re probably A/B testing wrong!

We won’t cover every aspect of A/B testing here; assuming you already know what it is, we just want to address some important myths that can be found in almost any project. You may also know that A/B testing is the scientific method applied on a daily basis, so the image above is not a myth itself ;)

Before reading this post, you may want to make yourself more familiar with the concept by reading about Firebase A/B Testing, for instance.

In Theory

You can read a good article like A Refresher on A/B Testing to find the definition:

A/B testing, at its most basic, is a way to compare two versions of something to figure out which performs better.

You can search further to learn about “not statistically significant” results or “inconclusive A/B test results”. But I can give you a simple insight here. In the article What Do You Do With Inconclusive A/B Test Results? you can read:

According to Experiment Engine’s data, anywhere from 50% to 80% of test results are inconclusive, depending on the vertical and stage of the testing program.

Above that 80%, you have a statistically significant result and can decide which version is the winner of the A/B test. In other words, users must reach the target through one of the implementations in more than 80% of all their attempts before we can conclude that that implementation is the winner. Not 50%! This is the myth we wanted to talk about here. There are people out there who declare a winner as soon as it passes 50%. A result between 50% and 80% for version A simply means a pretty big part of your users still prefer version B. To put it another way, you’re Garbage Collecting your users when you decide like that! Shame on you :D

Anyway, why only two versions? Because with more than two versions the standard deviation of the distribution increases, so you need an even higher result. For instance, with three versions the 80% above might become 85% or something similar (I don’t know the exact number)! So with more than two versions, your test is more likely to end up inconclusive than when you have just two implementations. What happens if you reach an inconclusive result? Then you need to read What Do You Do With Inconclusive A/B Test Results? and run more tests, or accept no result and no change, which is probably more costly than having just two versions.

So if you’re given a task to implement more than two versions of something, ask your product owner what the f.. they are doing. If they ask why, show them this article; I hope I’ve given you enough clues to convince them. If they’re still not convinced and have a plan to run those implementations two by two, or some other tactic, then they probably know what they’re doing, so implement what they asked for! :D But always make sure you know what’s going on too.

In Practice

Here we ask ourselves, as engineers, what advice matters when implementing A/B tests. There are at least two myths that I believe we need to put to rest here.

Implement them like it’s the final implementation

If you’ve heard the sentence above, the “final implementation” itself is the real myth! Every code base has a legacy part, and it should have one. If your code doesn’t have a legacy part, your skill set probably hasn’t improved in a while! Or … just rethink it.

So if there is no such thing as a valid final implementation, there is no need to bother making your code fit one. It’s even more bother to make it fit after the A/B test is finished and you want to remove one of the implementations! Don’t do it! It wastes your time and resources. Instead, refactor your code continually, because there is always a better way to implement it.

Just write some if conditions to switch between A/B test versions

This is a bad approach because you may put that if condition in a ViewModel, Presenter, Repository, use case, etc. In an app that is actively running A/B tests, you may end up with multiple if conditions for different A/B tests in those classes, and then you need to cover every branch of your code in your unit tests. So if you want to write a set of tests for a method, you almost have to multiply the number of actually needed tests by 2 to the power of the number of A/B-test if conditions in that method! And this happens even though the A/B test condition is not business logic, UI logic, or any other kind of logic that you need to unit test. Don’t waste your time and resources; you’re literally heating up the Earth with this approach. The sketch below shows how quickly this blows up.
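To make the explosion concrete, here is a minimal sketch (the class, flags, and strings are hypothetical, not taken from any real code base) of a ViewModel that branches on two independent A/B-test flags; covering its single method already takes 2 × 2 = 4 test cases, and every additional flag doubles that again.

// Hypothetical anti-pattern: A/B-test flags leaking into a ViewModel.
class CheckoutViewModel(
    private val isRedButtonTest: Boolean,    // flag of A/B test #1
    private val isOneClickFlowTest: Boolean  // flag of A/B test #2
) {
    fun checkoutLabel(): String =
        if (isRedButtonTest) {
            if (isOneClickFlowTest) "Buy now" else "Buy"
        } else {
            if (isOneClickFlowTest) "Checkout now" else "Checkout"
        }
    // Unit tests must now cover 2 x 2 = 4 flag combinations,
    // even though none of this branching is real logic worth testing.
}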

So no! There is a better way! Do you remember the “dependency inversion principle”? Do you remember why it’s useful? It’s useful for replacing implementations, for instance replacing them with fake or mock instances when unit testing. So it’s perfect for swapping implementations, and swapping implementations is exactly what we do when we run A/B tests. Also, whenever the A/B test is finished and we want to remove one of the implementations, it makes your life so much easier. So let’s use this principle. The tool for applying it is called “Dependency Injection”, as you know, so we need to inject the different implementations of the different A/B test versions at runtime.

Imagine you implement two classes of one interface, which are the implementations of the different versions in the A/B test. Let’s call this interface Button. Then you just need a factory to decide which implementation of Button should be instantiated. Let’s look at an example using the Dagger 2 API.

@Module
object Module {

    // @SpecificABtest qualifies the Boolean flag telling which variant this user is in.
    @Provides
    fun provideButton(@SpecificABtest redOne: Boolean): Button =
        if (redOne)
            RedImplementationOfButton()
        else
            BlueImplementationOfButton()
}

With this explicit condition, Dagger generates the factory for us. This way, the condition doesn’t even feel like part of your business logic or UI logic that you’d have to unit test. That feeling is right, and it feels great :) As for where the @SpecificABtest Boolean comes from, a possible sketch follows.
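The post doesn’t show how the flag itself is provided. As a minimal sketch, assuming the flag is served by Firebase Remote Config (the qualifier declaration, module name, and parameter key below are my own, hypothetical choices), it could look like this:

import com.google.firebase.remoteconfig.FirebaseRemoteConfig
import dagger.Module
import dagger.Provides
import javax.inject.Qualifier

// Qualifier marking the Boolean flag of this specific A/B test.
@Qualifier
@Retention(AnnotationRetention.RUNTIME)
annotation class SpecificABtest

@Module
object ABTestFlagModule {

    // Hypothetical key name; any remote flag source would work the same way.
    @Provides
    @SpecificABtest
    fun provideRedOneFlag(remoteConfig: FirebaseRemoteConfig): Boolean =
        remoteConfig.getBoolean("specific_ab_test_red_one")
}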

You may need to be a little creative so that your Button interface can instantiate the View, Presenter, Reducer, Repository, etc. of the A/B test and attach them in the right place, for instance along the lines of the sketch below.
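As one possible shape (the attachTo method and the ViewGroup parameter are my own assumptions, not something the original code defines), each implementation wires up its own variant-specific pieces behind the shared interface:

import android.view.ViewGroup

// Shared interface of the A/B test; callers only ever see Button.
interface Button {
    fun attachTo(container: ViewGroup)
}

// Variant A: builds its own "red" View, Presenter, etc. and attaches them.
class RedImplementationOfButton : Button {
    override fun attachTo(container: ViewGroup) {
        // instantiate the red pieces here and add them to container
    }
}

// Variant B does the same for the "blue" pieces.
class BlueImplementationOfButton : Button {
    override fun attachTo(container: ViewGroup) {
        // instantiate the blue pieces here and add them to container
    }
}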

This approach can be generalized to the case where you have two totally separate implementations. For instance, you have RedButton and BlueButton for one A/B test and they don’t share an interface. You can create factories like this:

@Module
object Module {

    // Provides RedButton only when this user is in the "red" variant.
    @Provides
    fun provideRedButton(
        @SpecificABtest redOne: Boolean
    ): RedButton? =
        if (redOne) RedButton()
        else null

    // Provides BlueButton only for the other variant, so exactly one of the two is non-null.
    @Provides
    fun provideBlueButton(
        @SpecificABtest redOne: Boolean
    ): BlueButton? =
        if (!redOne) BlueButton()
        else null
}

Then you can inject the nullable (?) type wherever you need it.
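For example, a consumer could take both nullable bindings and use whichever one is non-null; this is only a sketch, and the ButtonHost class is hypothetical (Kotlin’s nullable parameter types carry a Nullable annotation, which Dagger accepts for nullable bindings):

import javax.inject.Inject

// Hypothetical consumer: exactly one of the two variants is non-null at runtime.
class ButtonHost @Inject constructor(
    private val redButton: RedButton?,
    private val blueButton: BlueButton?
) {
    // Pick whichever variant this user was assigned to.
    fun activeButton(): Any? = redButton ?: blueButton
}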

Finally, with this approach you get “Separation of Concerns” and the “Single Responsibility Principle” for free, which are goals of any good software engineer.

I can’t think of any other myth, but if you’ve heard of one, please let me know so I can add it to this list.

Your feedback is the way you can pay me back for what I just shared with you. Or you owe me forever and ever and after ;) Thanks.

References

Original Post

This article was originally posted on ProAndroidDev.