R Shiny Project: A/B Test Dashboard

Posted on Jun 30, 2023

Git | Presentation YouTube Link | App

Project Goal 

The purpose of this project is to build an intuitive data dashboard that allows non-data peers and leaders to understand data results.

I selected an A/B test dataset from Kaggle. My app can be found here.

Introduction

Taken directly from the Kaggle page:

“A company recently introduced a new bidding type, “average bidding”, as an alternative to its existing bidding type, called “maximum bidding”. One of our clients, ….com, has decided to test this new feature and wants to conduct an A/B test to understand if average bidding brings more conversions than maximum bidding.

The A/B test has run for 1 month and ….com now expects you to analyze and present the results of this A/B test.”

Kaggle did not specify how these bidding types differ or which platform they run on, but my research suggests they are forms of Google Ads bidding. The key difference is that Maximum Bidding works best when you are advertising a specific product, say a particular watch from your watch store, while Average Bidding is better when you are advertising your store or a category of products from it, say watches.

Missing Values

Maximum Bidding was missing one day, which amounted to one row of data. Of the several options available, I mainly considered two: a) impute the missing values based on the average for both groups, or b) delete the same day's row from Average Bidding. Considering that we only had 30 rows of data, option 'b', dropping that day from both groups, appeared to be the more balanced solution.
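
As a rough illustration of option 'b', here is a minimal Python sketch (not the app's R code). The day labels, values, and the helper name `drop_unmatched_days` are invented for the example; the idea is simply to keep only the days on which both groups have data.

```python
# Hypothetical daily values keyed by day label; None marks the missing day.
max_bidding = {"day_1": 2000, "day_2": None, "day_3": 2500}
avg_bidding = {"day_1": 2100, "day_2": 2300, "day_3": 2400}


def drop_unmatched_days(group_a, group_b):
    """Keep only the days on which both groups have a recorded value."""
    shared = [
        day for day in group_a
        if group_a[day] is not None and group_b.get(day) is not None
    ]
    return (
        {day: group_a[day] for day in shared},
        {day: group_b[day] for day in shared},
    )


max_clean, avg_clean = drop_unmatched_days(max_bidding, avg_bidding)
print(sorted(max_clean), sorted(avg_clean))
```

Dropping the day from both groups keeps the two series aligned day for day, at the cost of one observation per group.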

Power Test

I used a power test to check whether twenty-nine days provided enough data to answer, statistically, whether one variation was better than the other. The power test indicated that 30 days was enough.
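
The post does not state the effect size, significance level, or target power used in the power test, so the following Python sketch (not the app's R code) only illustrates the general idea. It uses the standard normal approximation for comparing two group means; the function name and the default values are assumptions.

```python
import math
from statistics import NormalDist


def required_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate observations needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size is Cohen's d: the difference in means divided by
    the pooled standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# Assumed large effect (d = 0.8), 5% significance, 80% power.
print(required_n_per_group(0.8))
```

Under these assumed inputs, each day of the experiment counts as one observation per group. Smaller assumed effects require more days, so the conclusion depends heavily on the effect size chosen.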

Findings

Conversion Metrics

I engineered Clickthrough Rate, Purchase Rate, and Cart Completion Rate as the Conversion Metrics to answer the client's question: does Average Bidding have a higher Conversion Rate?
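
The post does not give the exact formulas behind these metrics, so the Python sketch below (not the app's R code) rests on assumed definitions: Clickthrough Rate as clicks per impression, Purchase Rate as purchases per click, and Cart Completion Rate as purchases per add-to-cart. The input numbers are invented.

```python
def conversion_metrics(impressions, clicks, add_to_cart, purchases):
    """Compute funnel conversion rates from raw counts.

    The definitions here are assumptions for illustration; the
    original app may define these rates differently.
    """
    return {
        "clickthrough_rate": clicks / impressions,
        "purchase_rate": purchases / clicks,
        "cart_completion_rate": purchases / add_to_cart,
    }


# Invented totals for one group.
rates = conversion_metrics(
    impressions=100_000, clicks=5_000, add_to_cart=1_000, purchases=500
)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```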

We see Maximum Bidding has a higher Conversion Rate for each metric. 

But does that mean Average Bidding results in fewer conversions (i.e. purchases)? This is where Key Metrics come in.

Key Metrics

Key Metrics are top-level, decision-driving metrics. Relating to the client's question, I determined the Key Metrics to be Impressions (number of ad views), Website Clicks (number of ad clicks), and Purchases (number of purchases).

From this standpoint, Average Bidding performed better.

Average Bidding generated a million more Impressions. This means Average Bidding ads were viewed one million more times, a valuable consideration. Average Bidding also produced slightly more Purchases from fewer clicks.

But how about the costs? 

Cost Metrics

I engineered Cost Metrics to answer this question: Cost per Impression, Cost per Click, and Cost per Purchase. Each is calculated by dividing Spend by the count of the corresponding metric (e.g. Spend / Impressions = Cost per Impression).
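
A cost-per-unit metric is spend divided by the number of units obtained. The Python sketch below (not the app's R code) shows that arithmetic; the function name and input numbers are invented for illustration.

```python
def cost_metrics(spend, impressions, clicks, purchases):
    """Cost per unit at each funnel step: spend divided by the count."""
    return {
        "cost_per_impression": spend / impressions,
        "cost_per_click": spend / clicks,
        "cost_per_purchase": spend / purchases,
    }


# Invented totals for one group.
costs = cost_metrics(spend=2_000, impressions=100_000, clicks=5_000, purchases=500)
for name, value in costs.items():
    print(f"{name}: ${value:.2f}")
```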

Average Bidding costs less at each step of the funnel. Notably, Cost per Impression was 50% less and Cost per Purchase 14% less, as indicated by the Lift (the percentage by which Average Bidding was lower or higher than Maximum Bidding).
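
One common way to express Lift is as a percentage change relative to the baseline group. The Python sketch below (not the app's R code) assumes that definition, with Maximum Bidding as the baseline; the function name and input values are invented.

```python
def lift_pct(variant, baseline):
    """Percentage by which the variant is above (+) or below (-) the baseline."""
    return (variant - baseline) / baseline * 100


# Invented cost-per-unit values: the variant costs half as much as the baseline.
example_lift = lift_pct(variant=2.0, baseline=4.0)
print(f"{example_lift:+.1f}%")
```

For cost metrics a negative Lift is favorable, since it means the variant pays less per unit than the baseline.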

Average Bidding Spend was 10.43% less, yet it achieved 1,000,000 more Impressions and a few hundred more Purchases.

Conclusion

While Maximum Bidding had a better Conversion Rate, it had a significantly lower number of Impressions and a slightly lower number of Purchases.

I anticipate Maximum Bidding users may be more loyal and spend more over time. Unfortunately, this can’t be confirmed with the available data. Had this been a real-world example, I would continue to collect data on each group's customers.

Recommendation

I recommend Average Bidding. Considering the limited data, Average Bidding appears to be the more effective strategy overall because it generates significantly more awareness (Impressions) and slightly more Purchases at a lower cost. 

Future Work

  1. Simplify App Presentation
  2. Easy import from data collection platforms
  3. Prompt Queries: LLM integration to engage in conversational Q&A
  4. Collect data on future purchases of each group

About Author

Sam Miner

https://www.linkedin.com/in/samminer/