The NEW 20 lines of code that will beat A/B testing every time

In 2012, Steve Hanov wrote the popular and controversial blog post “20 lines of code that will beat A/B testing every time”, which brought the previously academic idea of multi-armed bandit algorithms to the attention of the wider developer community.

Here were his original 20 (actually 16) lines of “code”:

def choose():
    if math.random() < 0.1:
        # exploration!
        # choose a random lever 10% of the time.
    else:
        # exploitation!
        # for each lever, 
            # calculate the expectation of reward. 
            # This is the number of trials of the lever divided by the total reward 
            # given by that lever.
        # choose the lever with the greatest expectation of reward.
    # increment the number of times the chosen lever has been played.
    # store test data in redis, choice in session key, etc..

def reward(choice, amount):
    # add the reward to the total for the given lever.
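
For reference, here is a minimal runnable sketch of the epsilon-greedy strategy the pseudocode describes, with hypothetical in-memory dicts standing in for Redis (note that the expected reward is the total reward divided by the number of trials, not the reverse):

import random

trials = {}  # lever -> number of times played
totals = {}  # lever -> total reward received

def choose(levers):
    if random.random() < 0.1:
        # exploration: choose a random lever 10% of the time
        lever = random.choice(levers)
    else:
        # exploitation: choose the lever with the greatest expected reward,
        # i.e. total reward divided by number of trials
        lever = max(levers, key=lambda l: totals.get(l, 0.0) / trials.get(l, 1))
    # increment the number of times the chosen lever has been played
    trials[lever] = trials.get(lever, 0) + 1
    return lever

def reward(choice, amount):
    # add the reward to the total for the given lever
    totals[choice] = totals.get(choice, 0.0) + amount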

Much of the controversy stemmed from the post overselling the simplicity of implementing multi-armed bandits in production.

A few months after that post, I implemented an extremely simple multi-armed bandit algorithm called Thompson Sampling. Even that first brittle implementation ran to hundreds of lines of code, but the performance of the approach was nonetheless promising.

Now, after years of work on cutting-edge contextual multi-armed bandit algorithms and many thousands of lines of code, I’m pleased to present a new “10 lines of code that will beat A/B testing every time” that is robust, scalable, and proven in large-scale production:

from improveai import DecisionModel

model = DecisionModel(track_url=track_url)
model.load(model_url)

def choose(variants):
    return model.choose_from(variants)

def reward(choice, amount):
    choice.add_reward(amount)
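
A hypothetical usage example (the variants and reward amount here are illustrative):

# choose a button label to show this user
decision = choose(["Buy Now", "Purchase", "Add to Cart"])

# later, if the user converts, credit the decision with a reward
reward(decision, 1.0)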

Unlike Steve’s code, which envisioned querying something like Redis, Improve AI loads the decision model locally into the process, so decisions are made immediately with zero network latency. This is just one of the many refinements made over the years to deliver a production-ready multi-armed bandit system.

Improve AI is available now for iOS, Android, and Python. Most developers will find the free trainer fully suitable for their needs.
