Terminology

Here are the terms and concepts used in experimentation.

When planning and executing A/B tests, you should define the required levels of confidence and power.

Confidence is the complement of the probability of making a false positive error (1-α).
Power is the complement of the probability of making a false negative error (1-β).

α - statistical significance level (the probability of making a false positive error)

1-α - confidence level

β - probability of making a false negative error

1-β - power of the experiment

False positive errors are also called Type I errors.

False negative errors are also called Type II errors.

Note: for a given sample size and effect size, there is generally a trade-off between confidence and power: decreasing the odds of one type of error increases the odds of the other.
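
The trade-off can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size, and the function name power_two_sided_z and the example values (effect size 0.2, 400 users per group) are hypothetical. Holding the sample size and effect size fixed, it prints the power obtained at several significance levels.

```python
from statistics import NormalDist


def power_two_sided_z(effect_size: float, n_per_group: int, alpha: float) -> float:
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    effect_size is the standardized difference in means (assumption:
    equal group sizes and known, equal variances).
    """
    z = NormalDist()
    # Critical value for the chosen significance level alpha.
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Under the alternative, the test statistic is shifted by d * sqrt(n / 2).
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Same sample size and effect size, different significance levels.
for alpha in (0.10, 0.05, 0.01):
    power = power_two_sided_z(effect_size=0.2, n_per_group=400, alpha=alpha)
    print(f"alpha={alpha:.2f}  confidence={1 - alpha:.2f}  "
          f"power={power:.3f}  beta={1 - power:.3f}")
```

As α shrinks (higher confidence), the printed power falls and β rises, which is the trade-off described above. Increasing the sample size is the usual way to improve both at once.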

Feedback loop length

The feedback loop length is the time from when a prediction is served (or a treatment is applied) until feedback on it is received.

Tasks with shorter feedback loops can be evaluated quickly.

Recommending a product to a user online has a short feedback loop.

Recommending a product to a user by mail or flyer has a longer feedback loop.

Fraud detection also has a longer feedback loop, because the user may only see the transaction days or weeks later and dispute it after that.

Some experiments have shown that running an ad campaign produces some clicks immediately, but many clicks arrive 8-10 hours later.
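
As a rough illustration of how feedback loop length could be measured, the sketch below computes the delay between serving and feedback for a handful of events. The timestamps are invented for the example and the helper name feedback_delays_hours is hypothetical; a real log would supply its own fields.

```python
from datetime import datetime
from statistics import median

# Hypothetical log: (time the ad or recommendation was served,
#                    time the feedback, e.g. a click, arrived).
events = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 2)),
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 17, 30)),
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 19, 0)),
]


def feedback_delays_hours(events):
    """Return the feedback loop length of each event, in hours."""
    return [(fb - served).total_seconds() / 3600 for served, fb in events]


delays = feedback_delays_hours(events)
print(f"median delay: {median(delays):.1f} h, longest delay: {max(delays):.1f} h")
```

Looking at the median and the longest delay gives a sense of how long an experiment must keep collecting feedback before its results are evaluated.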

Feedback categories

In online experiments in e-commerce or online advertising, one can collect different categories of feedback.

E-commerce example

User clicked on the recommended product
User added the product to the wish list
User added the product to the cart
User bought the product
User returned the product
User left a good review

Online ad example

User clicked on the ad
User spent time on the page after clicking the ad
User made a transaction after going to the ad page
User pressed the back button

Depending on the business goal of the experiment, one can focus on a specific category of feedback and use it as the main metric.

Example: focusing on whether the user bought the product after a recommendation, or made a transaction after clicking on an ad, means using a conversion metric. These experiments will yield fewer feedback examples than click-based ones. However, they may give insight into which products users want to buy at a given price.

On the other hand, focusing on click-through rate (e-commerce product clicks, ad clicks) will give you an idea of whether users are interested in such a capability or product.
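
The two kinds of metric can be computed from the same feedback log. The following is a minimal sketch, assuming a log of (user, feedback category) pairs; the data, the category names, and the "impression" category (needed as the denominator for click-through rate, though not listed above) are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical feedback log: one (user_id, feedback_category) pair per event.
feedback = [
    ("u1", "impression"), ("u1", "click"), ("u1", "purchase"),
    ("u2", "impression"), ("u2", "click"),
    ("u3", "impression"),
    ("u4", "impression"), ("u4", "click"),
]


def rate(counts, numerator, denominator):
    """Ratio of two feedback categories; 0.0 if the denominator is empty."""
    return counts[numerator] / counts[denominator] if counts[denominator] else 0.0


counts = Counter(category for _, category in feedback)
ctr = rate(counts, "click", "impression")        # clicks per impression
conversion = rate(counts, "purchase", "click")   # purchases per click

print(f"click-through rate: {ctr:.2f}")
print(f"conversion rate:    {conversion:.2f}")
```

In this toy log there are three clicks but only one purchase, which shows why conversion-based experiments collect fewer positive examples than click-based ones.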
