Archive | DAW Recaps

March 2025 Recap – A/B Testing with Melanie Bowles

headstone from Google Graveyard

When we last had Melanie Bowles as a speaker in 2019, she led an informative session on building a sustainable experimentation strategy. Since nothing at all has changed since 2019, we just replayed that talk and then everyone went home. While I’m obviously being facetious, many of the strategies Melanie laid out in that talk are still very relevant! The landscape has changed a lot since then, from big changes in browser privacy and client-side technology to the shutdown of the most widely used tool in the industry, Google Optimize.

While there’s no clear successor to Optimize, there are many good testing tools out there, including popular options like AB Tasty, Convert.com, Visual Website Optimizer, and Optimizely. Most of these tools also offer integration with GA4.

Deployment count of tools via BuiltWith data

As you can see from this chart based on deployments in the top 1M sites, none of these tools are exactly catching fire with popularity. A big reason for that may be that none of them are free for unlimited usage like Optimize was. Melanie also pointed out that A/B testing and related functionality like feature flagging have in some cases moved into all-in-one suites like Amplitude, Salesforce, etc. The sunset of Optimize can be looked at as a chance to mature our A/B testing practices and focus them where they can have the most impact.

Melanie also suggested that we embrace AI tools, especially on the ideation side of testing. There’s no substitute for human expertise when building out tests, but it’s certainly not cheating to let ChatGPT come up with some potential variations for your test! Just remember to give the AI as much context as you can. Melanie ran through a quick example which included providing the AI with a customer persona, which you can find in her slides below!

 

As a new twist for the meetup in 2025, we’re making a donation to a speaker-selected non-profit at each event. Melanie chose to designate Columbus Cultural Orchestra — a program for young people 13-25 to develop their musical skills and enhance diversity in orchestral music — for a $250 donation!

February 2025 Recap – Analytics the Right Way with Tim Wilson

Our first event of 2025 was a book release party for CBUSDAW’s very own Tim Wilson!

If you’ve ever been to a CBUSDAW event before (or listened to his podcast the Analytics Power Hour) you’ll know that Tim has a lot of things to say about analytics. Smart things, funny things, cranky things, etc. To our benefit, he’s organized many of these thoughts together into a book (with co-author Dr. Joe Sutherland) called “Analytics the Right Way: A Business Leader’s Guide to Putting Data to Productive Use“.

This is an excellent book that may be targeted towards “business leaders” with its title, but it can also be incredibly useful for analysts themselves in terms of how to think about doing analytics in a productive way. There are a lot of books out there covering tools, methods, and technology, but Tim and Joe’s book stands out in being about actually using analytics within an organization to further business goals. (NB: this is Jason writing this recap, and not Tim awkwardly hyping his own book in the third person. Also Tim, I promise I will get around to writing that Amazon review at some point.)

But this wasn’t just a book signing with free beer: Tim also gave a talk about some of the topics he covers in the book! We had a great crowd, with friends and colleagues of Tim’s coming in from as far as Chicago, Nashville, and Boston.

 

We also had Jim Gianoglio jump in behind the camera (Tim’s normal job) and get some great action shots:

 

November 2024 Recap – Piwik PRO and Clarity with Josh Silverbauer

For our November 2024 event, we brought Josh Silverbauer in from Philly to talk about behavior analytics (in the form of MS Clarity) and marketing analytics (in the form of Piwik PRO) and when you might want to use each one.

Since Josh is well-known for writing parody songs to introduce speakers, here at CBUSDAW we flipped the script on him and opened the event with a surprise parody song about Josh.

We present “It’s Josh Silverbauer” to the tune of “In the Midnight Hour” by Wilson Pickett, sung by Jim Gianoglio and featuring Jason Packer on kazoo.

 

While Josh is a fan of both Clarity and Piwik PRO, he pointed out that he’s not paid anything by either organization… so he’s free to tell it like it is. And “how it is” is that both tools are great additions to any analyst’s arsenal, and with a generous free tier for Piwik PRO and a totally free product with Clarity, there’s not much barrier to entry.

Josh pointed out how the two tools can easily be used to supplement each other. For example, one could use Piwik PRO to find a particular aggregate group of users that aren’t converting well, and then review those users’ entire sessions with the session recording feature in Clarity. Or check the heatmap of the landing page for that same group.

If you’ve used session recording tools in the past you know that it can be pretty tedious to watch the recordings one-by-one. It’s like, “geez, just click the button already user #23341, it’s RIGHT THERE”.

Microsoft has recently integrated Copilot into Clarity, so it can now save you from watching a ton of videos by summarizing sessions and doing some basic analysis for you.

Josh described Piwik PRO as “what Google’s Universal Analytics 2.0 could have been if GA4 didn’t exist”. If you were a serious user of our dearly departed UA you’ll feel right at home in Piwik PRO, and you’ll be pleased to see how well thought-out the platform is.

Josh’s slides:

Ok, so Clarity and Piwik PRO are both pretty cool tools, but what about rock operas?

Josh (and Jason) are releasing Volume 2 of Josh’s epic analytics rock opera entitled “User Journey Volume 2: The Custom Dimension” on November 18th.

You can listen to the first volume, “Universal Sunset” now on Spotify and most other streaming platforms.

Finally, don’t forget to join us next month for our yearly holiday event at the Grandview Theater. No speakers, but there will be a movie this year and you can vote on what you’d like it to be!

Disclaimer: Piwik PRO is a sponsor of CBUSDAW — but they only pay for the monthly pizza, not our (or Josh’s) endorsements.

October 2024 Recap – Geo Testing with Sanjay Tamrakar

As analysts, we love to optimize everything that we possibly can — so when we have a speaker that gives us a new way to think about testing we are here for it!

For our October event we had Sanjay Tamrakar talk to us about geo-testing. Sanjay covered everything from basic methods like traditional pilot testing, through difference-in-differences, all the way to the current state of the art with GeoLift.

Back in the pre-digital era, Columbus was considered a top location in the US for pilot testing new products, since its demographics closely matched the country as a whole. This allowed companies to try out new products, but it only gave marketers an idea of how well a new product might do nationwide, not how much incremental lift different product variations might give or how differently Columbus might perform vs. Chicago or Charleston. This kind of granularity requires more powerful tools.

These days there are much more expedient and statistically rigorous methods to test things, like GeoLift. GeoLift is an open-source package from Meta that allows the creation of artificial control groups, which we can test against treatment groups without having to worry as much about building control groups from user-specific information and the privacy issues that can bring. GeoLift’s synthetic control methods create control groups by amalgamating different untreated areas whose combined performance is expected to match the treated areas.
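GeoLift itself is an R package, but the core synthetic-control idea is easy to sketch. Below is a minimal Python illustration with entirely made-up regional data; the plain least-squares fit is our simplification for illustration (GeoLift’s actual estimator is more sophisticated, with constrained weights and proper inference):

```python
import numpy as np

# Toy weekly sales for 4 untreated "donor" regions (rows = weeks).
# All numbers here are fabricated purely for illustration.
rng = np.random.default_rng(42)
donors_pre = rng.normal(100, 5, size=(20, 4))  # 20 pre-period weeks
treated_pre = donors_pre @ np.array([0.5, 0.3, 0.2, 0.0]) + rng.normal(0, 1, 20)

# Fit weights so a combination of donor regions tracks the treated
# region during the pre-period (real methods constrain the weights).
weights, *_ = np.linalg.lstsq(donors_pre, treated_pre, rcond=None)

# In the post-period, the weighted donors serve as the counterfactual;
# the estimated lift is treated_post minus synthetic_post.
donors_post = rng.normal(100, 5, size=(8, 4))
synthetic_post = donors_post @ weights
print(weights.round(2))
```

No user-level data appears anywhere in this process, which is exactly the privacy appeal: everything is done with aggregate, region-level time series.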

There was even some R code showing up on the big screen, which sadly Tim Wilson missed!

Sanjay was also kind enough to provide us with his slides:

 

September 2024 Recap – Data and Analytics Interns? At my company? I’d never really thought about it!

We mixed things up a bit for our September event, both with location and format. Given the topic, we hosted the event at Denison Edge, which is a really cool venue!

The topic was inspired by the experiences of a rising senior at Kenyon College who, despite excellent qualifications and impeccable due diligence, barely managed to land an analytics internship in the summer of 2024. Some relevant details of that internship:

  • The company that hired him was a small agency that had not really thought about having an intern
  • Through a string of improbable but fortunate events, they hired him for the summer
  • The student had a great experience, and the agency found that he added real value to their work
  • Things went so well that the company kept him on for ~10 hours/week once he returned to school in the fall

That’s the happiest of endings, sure, but the CBUSDAW organizers were struck that this specific tale almost certainly represented countless similar stories that never came to pass. And that’s a miss.

Consider:

  • Companies of all sizes (including small ones!) have data at their disposal that is underutilized due to a lack of resources
  • College students today—across all types of programs—are developing valuable skills in programming, statistics, and analytics in the classroom
  • Academic programs recognize the importance of their students getting hands-on, real-world experience, and there are any number of resources in place to support getting them that experience

We brought together four panelists from central Ohio-based higher education to have a discussion about getting all of those factors to work together to create more win-win situations. The panelists:


  • Matt Miller, Denison University
  • Nimet Alpay, Franklin University
  • Tom Metzger, The Ohio State University
  • Kristen Astorian, Ohio Wesleyan University

While the initial idea for the panel was “internships,” the panelists made it clear that internships are simply one way for students to get real-world experience while delivering value to organizations. Many data and analytics programs—both undergraduate and graduate level—require a capstone project that works with an organization and their data to deliver value (and capstone projects have the benefit of having instructor oversight and coaching).

Some keys to making an internship successful:

  • The project should be meaningful—using interns to work on projects that are menial doesn’t benefit the intern or the organization that hired them
  • The project should be manageable—dropping an intern into a monstrously complex data environment with overly ambitious ideas for what they will be able to deliver in a finite period of time is setting them up for failure
  • The intern should have a primary point of contact for oversight—this should be someone who actually wants to take on the work. They’re playing the role of a guide, mentor, and manager all at once.
  • Consider pairing the intern with someone deeply knowledgeable about the data itself—it can take months to ramp up on the intricacies of many organizations’ data environments. While students do need to get exposure to the messiness of real-world data and the often-daunting level of effort to “clean it up” as part of a project, it can be useful to have someone who knows the ins and outs of the various tables assist them in getting queries written.

There are also a surprising number of programs (if only the moderator of the panel was not also the author of this post—something of a hindrance to note-taking!) that provide support to companies who are open to taking on interns (or to working with students on capstone or other projects):

  • The career centers at most universities have staff who are deeply familiar both with their students and with what it takes to scope work and provide support in order to make student work productive and impactful
  • Through various programs (a range of funding sources), companies can actually have interns’ pay subsidized (partly or fully)! The career centers at any school can point interested companies to resources for that.

It was very clear that organizations that try tapping into student talent consistently extend and expand their programs over time. Have you given that a thought? Reach out to one or more of the people above to find out more!

August 2024 Recap – Being Intentional, Privacy by Design, and More with Matt Gershoff

For this month’s event, Matt Gershoff, CEO of Conductrics, traveled from the land of triple-digit heat on the regular (Austin) to the land of pleasant-temps-even-in-August (Columbus) to share some thoughts and examples about being intentional when it comes to data collection. If you attended the meetup, then you’re here because we promised that we’d post some useful supplemental information, and we’re going to flip the script of a normal recap post by putting those up front:

  • [20-Minute Video] Matt’s talk at PEPR ’24—a Venn diagram of that talk and the one he gave at DAW would have something like a 63% overlap, although his DAW talk is the larger circle, as there was additional material! But, since we don’t record DAW talks, the PEPR talk is a good one to share with a colleague who is kicking themselves for not attending DAW.
  • [<2-Minute Video] Matt talking about intentionality—not even remotely by design, this was a piece of an interview that another one of the DAW sponsors, Piwik PRO, did with Matt. Useful and thoughtful stuff.
  • [5-Page PDF] Privacy by Design: The 7 Foundational Principles—a very worthwhile read; Matt focused primarily on Principle #2, but you’ll never believe what Principle #4 and #7 are! (Seriously, if you give it a real read, it will blow your mind a little bit; it’s almost three decades old and was an underpinning of GDPR!)
  • Matt will also be on an upcoming (as of this writing) episode of the Analytics Power Hour podcast, so, if “audio only” is your or a colleague’s jam, smash the subscribe button there.

Matt’s presentation—with annotations added to make it an upgrade from “just the slides”—is included at the end of this post, but a few of the highlights from his presentation were:

  • “Just Enough” vs. “Just in Case” Data Collection—Matt made a stronnnnnng case that the industry bias is for the latter, while “privacy by default” demands the former. “Just Enough” data means aligning to a specific and explicit task or objective and then collecting as little data as needed to complete the task. “Just in Case” is a “maximize optionality” play—hoovering up as much data as possible at as granular a level as possible so that there are as many possible “options” for doing “stuff” with it in the future. We are so wired to the latter that it’s uncomfortable to recognize why that Is. Not. Good.
  • This doesn’t mean there are no cases where high granularity / high cardinality data is warranted—throughout the talk, Matt was clear that he was not speaking in any absolutes (unless we count as an absolute that, “all data collection should be performed with intentionality”).
  • Many types of A/B tests, both univariate and multivariate, can be statistically evaluated without recording data at the individual user level—if you’re like the author of this recap, you’ve always operated under the assumption that A/B tests require capturing each user’s session, including which variant they were assigned to, maybe some other metadata about them (what level of a loyalty program they belong to, for instance, for “deeper dive analysis”), whether or not they converted, and maybe the amount of their purchase. 10,000 visitors in the test? That’s 10,000 rows of data! What Matt demonstrated was, um, no. That’s incorrect thinking. Using equivalence classes and some understanding of the math underlying the statistical tests of interest (t-test, OLS regression, and more), it’s possible to simply capture/increment aggregated counts (visitor count, sum of sales, sum of the squares of sales) and perform the exact same statistical tests in a way that is less computationally intensive, much less storage intensive, and aligned with privacy by design principle #2: privacy by default (and privacy by design principles #3, #4, and #7). Matt outlined a lot of this in this blog post (although he has since extended his research and thinking on the subject… and it continues to hold up!)
  • There are different techniques and concepts that are good to be familiar with when embracing privacy by design—K-anonymity, differential privacy, global vs. local privacy, and more! The key with all of them is that they’re best employed when approaching them as privacy by design rather than privacy-tacked-on-later-to-maintain-regulatory-compliance.
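The aggregated-counts point can be made concrete with a small sketch (our own toy numbers, not from Matt’s talk). Welch’s t-statistic needs only three stored numbers per variant: visitor count, sum of sales, and sum of squared sales; no user-level rows required:

```python
import math

# Per-user purchase amounts for two variants (illustrative only; zeros
# are non-converters). In practice these raw lists are never stored.
a = [0.0, 0.0, 12.5, 0.0, 30.0, 0.0, 12.5, 0.0, 0.0, 25.0]
b = [0.0, 15.0, 0.0, 40.0, 0.0, 15.0, 0.0, 0.0, 22.0, 0.0]

def aggregates(xs):
    # The only three numbers kept per variant: count, sum, sum of squares.
    return len(xs), sum(xs), sum(x * x for x in xs)

def welch_t(agg_a, agg_b):
    # Welch's t-statistic computed from the aggregates alone.
    (na, sa, ssa), (nb, sb, ssb) = agg_a, agg_b
    ma, mb = sa / na, sb / nb
    va = (ssa - na * ma**2) / (na - 1)  # sample variance from sum of squares
    vb = (ssb - nb * mb**2) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

t = welch_t(aggregates(a), aggregates(b))
```

Storing six running totals for a 10,000-visitor test replaces 10,000 rows of user-level data, and the resulting t-statistic is mathematically identical to the one computed from the raw values.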

A lot of ground was covered with pretty lively audience engagement and more than a few laughs!

The annotated slides:

And, as always, a few pictures to capture the atmosphere:

 

July 2024 Recap – Solo Data Science with Lauren Burke-McCarthy

Fresh from another successful DataConnect Conference, Lauren Burke-McCarthy led our July session of Data & Analytics Wednesday talking about how to survive and succeed as a solo practitioner of data science.

Being a “solo practitioner” could mean being the only data scientist on your team, being siloed in some way, or even being a freelance contractor. The strategies that Lauren presented were focused on how to best communicate and set expectations with stakeholders. We’ve all been there when a project has gone off the rails because what a practitioner implemented didn’t match at all what a stakeholder had envisioned. Let’s nip these misalignments in the bud as best we can before they can blossom into fully grown issues.

In fact, it turns out many (perhaps most!) of these techniques could work for us in any data-related role we’re in. What, after all, even is a data scientist? Lauren also took a crack at answering that age-old question off the top of her head. To paraphrase her answer, a Data Scientist focuses on models and experiments to make future-looking predictions, whereas a Data Analyst works on analysis of current and historical data to identify trends and develop insights. If those two things seem to blur into each other at times, that just shows how Lauren’s advice on processes and communication works for both! Perhaps even for those of us who have now added “AI” to our job titles? Could well be…

Looking to learn more about these techniques? Lauren was kind enough to provide us with her slides so you can take a look for yourself:

And, of course, pictures!

Please join us next month when the ever-delightful Matt Gershoff will be in town to discuss how to think purposely about data as we move towards privacy by design.

 

June 2024 Recap – Under the Hood of A/B Testing

Our June 2024 meetup featured Dr. Maria Copot from OSU delving into some of the underlying theories behind our favorite A/B testing platforms. Though before we get into the fun math part (yes, it’s fun, don’t look at me like that), we all need to remember that there needs to be a question behind your experiment. If you don’t have a hypothesis you’re trying to validate, then what’s the point of testing something? Once you’ve got something you want to test, then you can test it, but testing just for the sake of saying how many A/B tests your department ran last year isn’t going to get you where you want to be.

A lot of us have been asked, “is this result statistically significant?” And maybe we’ve even said, “well, the P-value is <0.05 so it’s significant”… But what exactly is a P-value, and why is 0.05 such a big deal? Dr. Copot explained the basics of P-values, including that 0.05 is an arbitrary benchmark, and that a P-value can’t tell you anything about the size of an effect, its validity, or the reason behind it. If that still sounds a bit confusing, it’s time to cue the memes about scientists being unable to explain P-values in an intuitive way. We think Dr. Copot’s explanation would be in the top quantile of that distribution at any rate. Even if math is fun, it isn’t always intuitive.
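For the curious, here is what’s actually behind that number in the simplest conversion-rate case: the classic two-proportion z-test, sketched in plain Python with entirely hypothetical counts. The P-value is the probability of seeing a difference at least this large if the two variants truly converted at the same rate:

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both variants convert at the same rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical test: A converts 200/4000 (5.0%), B converts 240/4000 (6.0%)
p = two_proportion_p_value(200, 4000, 240, 4000)
print(round(p, 4))
```

With these made-up counts the P-value lands right around the 0.05 line, which nicely illustrates Dr. Copot’s point: “significant” can be a hair’s breadth from “not significant,” and the number says nothing about whether a one-point lift matters to the business.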

Dr. Copot also talked about sample sizes and power analysis (one such online calculator I’ve used many times: https://www.evanmiller.org/ab-testing/sample-size.html), but then moved on to Bayesian methods. Traditional A/B tools (like Google Optimize, RIP) have typically used frequentist methods like the P-value approach we’ve been talking about. Newer tools have folded in some Bayesian methods, which thankfully are a little more intuitive, if perhaps more mathematically and computationally expensive.
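To show why the Bayesian framing can feel more intuitive, here’s a minimal Beta-Binomial sketch (same hypothetical counts as you might see in any A/B tool, with uniform priors — both choices are ours for illustration). It answers the question stakeholders actually ask: “what’s the probability B is better than A?”

```python
import random

random.seed(0)

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000):
    """Monte Carlo estimate of P(rate_B > rate_A).

    With a Beta(1, 1) (uniform) prior, the posterior for each variant's
    conversion rate is Beta(1 + conversions, 1 + non-conversions).
    """
    wins = 0
    for _ in range(draws):
        rate_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# Hypothetical test: A converts 200/4000, B converts 240/4000
p_b_better = prob_b_beats_a(200, 4000, 240, 4000)
```

Instead of “p < 0.05,” you get a direct probability statement about B beating A — arguably easier to explain to a stakeholder, at the cost of choosing a prior and running some simulation.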

Finally, we talked about how privacy regulations, sampling, and cookie limitations can make doing these kinds of experiments more difficult. One way around these limitations is to use paid platforms like Prolific where you can make your own sample group and run a group of fully consented users through an experiment of your choosing.


Please join us next month when Lauren Burke-McCarthy will talk about how to succeed as a solo data scientist.

 

May 2024 Recap – Getting Real with AI

At our May 2024 event, Nick Woo from AlignAI shared a thoughtful and pragmatic perspective about how to approach figuring out what use cases are (and are not!) appropriate for AI. The turnout for the meetup was strong, and the discussion was lively!

Nick started off with a handy definition of machine learning:

“Machine Learning is an approach to learn complex patterns from existing data to make predictions on new data.”

Oh. Sure. Seems simple enough, right? But that doesn’t include generative AI, does it? As a matter of fact, it does:

  • The existing data is what was used to train the model
  • The new data is the prompt that is provided to the model (!)
  • The response to the prompt is really a prediction when the model processes that new data (!!!)

Nick also outlined the anatomy of an AI use case:

  1. Business Problem
  2. Data
  3. Training
  4. Model
  5. Accuracy Metrics
  6. UX/UI

Which step is the most common stumbling block for organizations’ proposed use cases? The “Data” one—there needs to be sufficiently scaled, cleansed, and complete data to actually develop a model that is useful. Oh, and then that model will likely need to be refreshed and refined with new data over time.

The most neglected step in the planning of an AI project? The last step: actually thinking through what the user experience should ultimately be when the model is put into production!

Nick was quick to point out that it is easy to treat AI as a hammer and then see all the world as a nail. If there is a simpler, cheaper, equally effective way to address a particular business problem, then addressing it with AI probably doesn’t make sense! He also acknowledged (as did several audience members) that we’re currently at a point where there are executives who truly do just want to be able to say, “We use AI,” which means some projects can be a bit misguided. This phase shall pass, we assume!

Another discussion that cropped up was measuring the ROI of an AI use case. Nick noted that this can be shaky ground:

  • AI technology platforms pushing to measure impact simply based on the adoption of the technology (rather than quantifying actual business impact)
  • Minimal use of techniques like controlled experimentation to quantify the impact (there is simply too much excitement currently to create interest in withholding the magic from a control group in a disciplined way)
  • The ROI of an AI project can be thought of as “the ROI of an OPEX project”—organizations that are disciplined about measuring the impact of non-AI OPEX projects should be pretty good about quantifying the impact of their investments; it’s just another tool in their toolkit, so the measurement mindset can be the same

And… there was more, including an example scoring matrix for prioritizing use cases across multiple criteria!

A recap post and the slides really can’t do the evening justice, but it’s better than nothing. The recap was above. The slides are right here:

And some pics from the evening:

April 2024 Recap – Data Science & AI Trends: an Audience-Guided Discussion

We tried something a little different in this month’s DAW. We actually tried two things that were a little different in this event.

What we intended to be different was that we were going to have a panel of experts who would field a bunch of questions from the audience, capture them on a whiteboard, and then talk through them. Ultimately, we did that—not exactly as it had been drawn up (so to speak), but it worked out.

The unintended difference in the event was to see how many things could go wrong and still have us pull off a successful and engaging meetup. Speculation was that the questions and answers were going to be so good that our robot overlords became concerned and flexed their AI capabilities to undermine the meetup. To wit:

  • On Monday, one of the three intended panelists pulled out of the event. No problem, Brian Sampsel was hastily recruited and graciously accepted the last-minute invitation.
  • On Wednesday morning at 4:00 AM, one of the other panelists went into labor. Did she take the time to email us that she had become unavailable? Yes. Yes she did. Katie Schafer is a machine in her own right (as our other panelist, Pete Gordon, had already noted several days earlier). But, no problem. We could do this with two panelists. What else ya’ got to throw at us, HAL? Well…
  • Weather anyone? The venue and the surrounding area had a tornado watch issued late afternoon, and the venue—Rev1—was squarely inside the tornado watch area. The tornado watch lasted until 7:00 PM (the event started at 6:30 PM). There was rain. There was wind. There was hail for Pete’s sake!

Apparently, though, analytics types take their cues from the USPS. Or have poor judgment. Or some combination? We wound up with a great turnout, with lots of good pre-talk discussion over pizza and beer:

Conveniently, the event is in the interior of the building! #tornadosafety

The discussion itself covered a wide range of topics—skewing heavily towards AI and less to data science (data science is involved in AI, of course, so it was still there):

A Range of Topics to Discuss

There is no deck to share, no recording, and this attendee didn’t take scrupulous notes, so we’ll go with a smattering of the discussion that could be retrieved from his brain the following day:

  • When will AGI (artificial general intelligence) be achieved? Pete’s estimate (which seemed serious) was: 2033. But, he also noted that AlphaGo’s infamous Move 37 (in 2016) was a glimpse into that future.
  • To RAG or not to RAG? Well… that’s a hot topic. It depends.
  • Poisoning of training data? Why, and what are the ramifications? It sounds bad, but it’s got its uses—see Nightshade.
  • Should newly minted software engineers be worried about AI making their jobs obsolete? No. Full stop. They’ll have some powerful new tools to employ, but their jobs aren’t going anywhere.
  • What about marketing analysts? Will AI take their jobs? This prompted quite a bit of discussion. Brian made the point that AI can do some pretty impressive exploratory data analysis (EDA), which is definitely useful! One attendee asked if he could see getting to a point where you could tell an AI-based tool what your KPIs were, and it could then just analyze the campaign. The answer was, “Yeah… but a human still needs to set appropriate KPIs!” Even MMM came up—is that AI, or is that just… sophisticated linear regression (statistics). Kinda’ more the latter, but “AI” gets slapped on it for branding purposes and we get excited!

And, of course, lots and lots more! Some pics from the event: