📖 Problem Statement
An early-stage startup in Germany has been working on a redesign of its landing page. The team believes the new design will increase the number of people who click through and join the site. They have been testing the changes for a few weeks and now want to measure the impact: is the observed increase statistically significant, or could it be due to random chance?
Aims and Objectives
- Analyze the conversion rates for each of the four groups formed by combining the new/old landing page design with the new/old images.
- Can the increases observed be explained by randomness?
- Which version of the website should they use?
💾 Data Summary
    # load libraries
    library(tidyverse)

    # load data
    df <- read_csv("C:/Users/Adejumo/Downloads/redesign.csv")
    head(df)

    ## # A tibble: 6 x 3
    ##   treatment new_images converted
    ##   <chr>     <chr>          <dbl>
    ## 1 yes       yes                0
    ## 2 yes       yes                0
    ## 3 yes       yes                0
    ## 4 yes       no                 0
    ## 5 no        yes                0
    ## 6 yes       no                 0
|Summary|Value|
|---|---|
|Number of rows|40484|
|Number of columns|3|
|Column type frequency: character|2|
|Column type frequency: numeric|1|
treatment: “yes” if the user saw the new version of the landing page, “no” otherwise.
new_images: “yes” if the page used a new set of images, “no” otherwise.
converted: 1 if the user joined the site, 0 otherwise.
- There are 40,484 users who visited the site, and the dataset has no missing values.
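The missing-value claim can be checked with a per-column count of `NA`s. The following is a minimal base R sketch; the `toy` data frame and the helper name `count_missing` are illustrative assumptions rather than part of the original analysis, and the `NA` is planted deliberately so that the check has something to find.

```r
# Toy stand-in for the real data (hypothetical values); one NA is planted on purpose.
toy <- data.frame(
  treatment  = c("yes", "no", "yes"),
  new_images = c("yes", NA, "no"),
  converted  = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of missing values in each column.
count_missing <- function(data) colSums(is.na(data))

missing_per_column <- count_missing(toy)
print(missing_per_column)
```

Calling `count_missing(df)` on the real dataset should return zero for every column if the statement above holds.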
Group A: users with “yes” in both columns (the new version with the new set of images).
Group B: users with “yes” in column one and “no” in column two (the new version of the website with the old set of images).
Group C: users with “no” in column one and “yes” in column two (the old version of the website with the new set of images).
Group D: the control group, users with “no” in both columns (the old version with the old set of images).
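The four group definitions above amount to a lookup from the two flag columns to a letter. Here is a small base R sketch of that mapping; the function name `label_group` is an assumption introduced for illustration, and the original analysis keeps the "yes-no" style labels instead of letters.

```r
# Hypothetical helper: map the two yes/no flags to the group letters A-D.
label_group <- function(treatment, new_images) {
  key <- paste(treatment, new_images, sep = "-")
  lookup <- c("yes-yes" = "A", "yes-no" = "B", "no-yes" = "C", "no-no" = "D")
  unname(lookup[key])
}

# Example: one user from each combination.
labels <- label_group(
  treatment  = c("yes", "yes", "no", "no"),
  new_images = c("yes", "no", "yes", "no")
)
print(labels)
```

Any combination other than the four listed would come back as `NA`, which makes unexpected values in either column easy to spot.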
Null hypothesis: the increase in users is due to chance, and there is no statistically significant difference in conversion rate between the four groups.
Alternative hypothesis: there is a statistically significant difference between the four groups, and the increase in users is not the result of chance.
A/B testing, also called bucket testing, is a statistical methodology for comparing two or more versions of a web page or mobile app to see which one converts more users. The version with the highest conversion rate wins, provided the difference is statistically significant. We use it here to answer our questions and see which combination of landing page design and images performs best.
    # unite the treatment and images columns to form a new column containing our groups
    df_new <- df %>%
      unite("group", treatment:new_images, sep = "-", remove = TRUE)

    # create the contingency table of absolute counts, with margins
    prop <- table(df_new)
    prop_abs <- addmargins(prop)
    prop_abs

    ##          converted
    ## group         0     1   Sum
    ##   no-no    9037  1084 10121
    ##   no-yes   8982  1139 10121
    ##   yes-no   8906  1215 10121
    ##   yes-yes  8970  1151 10121
    ##   Sum     35895  4589 40484
Conversion Rate (relative proportion)
    # create the table with the relative proportion
    prop_rel <- prop.table(prop, 1)
    prop_rel <- round(addmargins(prop_rel, 2), 3)
    prop_rel

    ##          converted
    ## group         0     1   Sum
    ##   no-no   0.893 0.107 1.000
    ##   no-yes  0.887 0.113 1.000
    ##   yes-no  0.880 0.120 1.000
    ##   yes-yes 0.886 0.114 1.000
Group A (new landing page design and new images) has a conversion rate of 11.4%.
Group B (new landing page design and old images) has a conversion rate of 12.0%.
Group C (old landing page design and new images) has a conversion rate of 11.3%.
Group D (old landing page design and old images) has a conversion rate of 10.7%.
The highest observed conversion rate is in Group B, i.e., the new landing page design with the old images, and the lowest is in Group D, i.e., the old landing page design with the old images.
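The rates above can be reproduced directly from the counts in the contingency table, along with each variant's relative lift over the control. This is a minimal base R sketch; the variable names are illustrative assumptions, and the counts are copied by hand from the table rather than read from the data file.

```r
# Converted users and group sizes, copied from the contingency table above.
converted <- c(A = 1151, B = 1215, C = 1139, D = 1084)
visitors  <- c(A = 10121, B = 10121, C = 10121, D = 10121)

# Conversion rate per group.
rates <- converted / visitors

# Relative lift of each group over the control group D (D itself is 0).
lift_vs_control <- rates / rates[["D"]] - 1

print(round(rates, 3))
print(round(lift_vs_control[c("A", "B", "C")], 3))

# Group with the highest observed conversion rate.
best_group <- names(which.max(rates))
print(best_group)
```

Expressing the result as a lift over Group D makes the size of the effect easier to judge than the raw percentages alone.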
Pearson’s chi-squared test of proportions
    # Pearson chi-squared test for equality of proportions
    # (the first column of prop is "0", so the sample estimates are shares of non-converted users)
    prop.test(prop)

    ##
    ##  4-sample test for equality of proportions without continuity
    ##  correction
    ##
    ## data:  prop
    ## X-squared = 8.5261, df = 3, p-value = 0.0363
    ## alternative hypothesis: two.sided
    ## sample estimates:
    ##    prop 1    prop 2    prop 3    prop 4
    ## 0.8928960 0.8874617 0.8799526 0.8862761
Pearson’s chi-squared test gives a p-value of 0.0363, which is below 0.05. We therefore reject the null hypothesis that all four groups share the same conversion rate: the differences in conversion are unlikely to be due to chance alone. Note that this is an overall test. It tells us that at least one group differs from the others, not that every group differs from every other group.
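To make the test less of a black box, the statistic reported by `prop.test` above can be recomputed by hand from the observed counts. The sketch below uses base R only; the matrix is typed in from the contingency table and the variable names are illustrative assumptions.

```r
# Observed counts (not converted, converted) per group, from the table above.
observed <- matrix(
  c(9037, 1084,
    8982, 1139,
    8906, 1215,
    8970, 1151),
  ncol = 2, byrow = TRUE,
  dimnames = list(group = c("no-no", "no-yes", "yes-no", "yes-yes"),
                  converted = c("0", "1"))
)

# Expected counts under the null hypothesis: row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic, degrees of freedom, and upper-tail p-value.
chi_sq  <- sum((observed - expected)^2 / expected)
dof     <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = dof, lower.tail = FALSE)

cat("X-squared =", round(chi_sq, 4), " df =", dof, " p-value =", round(p_value, 4), "\n")
```

The statistic and p-value should agree with the `prop.test` output, since both compare the observed counts against the counts expected if every group converted at the pooled rate.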
The redesign was successful: the two highest conversion rates came from the new landing page design. However, the company should retain the old images in the new design. That combination (Group B) converted best, at 12.0% compared with 10.7% for the control, and should attract more people to join the site.
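Because the overall test does not say which groups differ, one possible follow-up is a set of pairwise comparisons with a multiple-testing correction. The sketch below uses `pairwise.prop.test` from base R's `stats` package with the Holm adjustment. This step is not part of the original analysis, and the choice of Holm is an assumption; the counts are copied from the contingency table.

```r
# Converted users per group (D is the control), copied from the contingency table.
converted <- c(D = 1084, C = 1139, B = 1215, A = 1151)
visitors  <- rep(10121, 4)

# Pairwise two-sample proportion tests, with Holm-adjusted p-values.
pairwise <- pairwise.prop.test(converted, visitors, p.adjust.method = "holm")
print(pairwise)

# Adjusted p-values as a lower-triangular matrix (rows C, B, A; columns D, C, B).
adjusted_p <- pairwise$p.value
```

The entry in row B, column D is the one most relevant to the recommendation, since it compares the suggested variant against the control.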