Archive by Author

October 2025 Recap – Custom GPTs with Bryan Huber

For our October event, Bryan Huber walked us through how he’s developed and deployed a custom GPT within his organization. As Global VP of Digital Marketing and Analytics at Comfort Keepers, Bryan and his team field a wide variety of marketing questions from the organization’s franchisees, who span many levels of technical sophistication. To empower those questioners, as well as lighten the load on his own team, Bryan developed a custom GPT that leverages the question-answering power of ChatGPT, grounds it in his own organization’s best practices, and adds some guardrails.

So what is a custom GPT, anyway? It’s regular ChatGPT, but with a series of available customizations, including:

  • Custom instructions, as in regular ChatGPT, but shared across all users and all chats with the custom GPT.
    • This helps your users engineer better prompts and puts them on the right path from the start of each conversation.
    • Instructions can also help control what the custom GPT can do, steering users away from problematic areas.
  • Uploading your own documents to a knowledge base.
    • For example, you could make your own internal best practices documentation or research interactive by uploading them to a custom GPT.
    • These uploaded documents serve as a way to ground conversations in your own vetted information and also make those documents searchable.
  • Restricting the features users have access to.
    • Bryan shared some examples of egregiously poor marketing images created in ChatGPT. By turning off the image generation feature in the custom GPT, he prevents users from making those images and instead guides them toward using the custom GPT to create marketing text and come up with ideas, rather than making slop images that might not follow organizational guidelines.
    • Similarly, removing the web search capability can help focus the output on the vetted knowledge base, not just whatever web search can dig up.
  • Creating “actions”, in the form of external API calls.
    • For example, if you wanted up-to-date currency conversion numbers in your custom GPT, you could connect to an external API using your own API key and get accurate numbers there, rather than relying upon outdated training data or slow web search (which might be disabled in your GPT!).
    • Part of Bryan’s roadmap is to connect the custom GPT to the Google Ads API, which would allow users to get detailed real-time information about things like the CPC costs of keywords.

All of this for zero additional dollars, as custom GPTs are included in all paid ChatGPT plans! Please note that on lower-level plans the custom GPTs you create will be public by default, and their conversation data will be included in future OpenAI training data (the latter can be turned off under “Additional Settings” once the GPT is created).

This functionality is not exclusive to OpenAI: Claude offers something similar in “Projects”, as does Google Gemini in “Gems”.

He also walked us through his journey of rolling out this tool, from early adopters to a happy user base of over 300 users.

Bryan also provided us with his slides! Since he’s also an organizer of this event, he would’ve had a stern conversation with himself if he had not.

As always, the crowd had lots of practical questions!

September 2025 Recap – Piwik PRO

For our September events we welcomed sponsor Piwik PRO to Columbus for not one but two events!

On Wednesday evening we had Jason Packer of Quantable Analytics talking about tracking methods, and Piotr Słonina of Piwik PRO talking about how their product fuses different kinds of tracking methods together in reporting.  Piotr and Marcin Pluskota flew all the way to Columbus from Wrocław, Poland for the event! We’d like to apologize for their flight delays and inform them that a multi-hour delay in Atlanta is, in fact, a rite of passage.

Due to an unforeseen scheduling snafu we had our first “al atrio” presentation in the atrium of Rev1 rather than in our typical room location, but we made it work. Jason and Piotr pitched a double-header of a presentation that covered things like:

    • How can we track anonymized users in a privacy-respectful way, even if those users decline cookies?
      Our answer – by not “tracking” those users in a way that lasts beyond a short period of time, or in ways that could clearly identify a particular user. Jason felt this should probably all be done with cookies, but the regulations around cookies have caused many vendors to look for work-arounds. Some of those work-arounds are more privacy-respectful than others.
    • Where is the line between a session hash (like IP + User-Agent) and browser fingerprinting?
      Our answer – it’s not a distinct line, but tools that use invasive methods and provide durable fingerprints are on the wrong side of it, at least when used for tracking.
    • How do tools like GA4 and Piwik PRO handle these different types of users: logged-in, cookied, and non-cookied?
      Our answer – GA4’s default blended user identity has a tiered hierarchy of user id, cookies, and modeling based upon “cookieless pings”. Piwik PRO has a more flexible solution that uses session hashes and allows individual sites to choose their own adventure when it comes to dealing with the so-called “consent gap”.
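To make the session-hash idea concrete, here’s a minimal sketch of how a short-lived identifier can be derived without storing anything on the visitor’s device. This is not any vendor’s actual implementation; the salt-rotation scheme and field choices are purely illustrative:

```python
import hashlib
from datetime import date

def session_hash(ip: str, user_agent: str, daily_salt: str) -> str:
    """Derive a short-lived, non-reversible session identifier.

    Rotating the salt daily means the identifier cannot link a
    visitor's activity across days, unlike a durable fingerprint.
    """
    raw = f"{daily_salt}|{ip}|{user_agent}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]

# Same inputs + same salt -> same session id (within one day).
salt = f"secret-{date.today().isoformat()}"  # regenerate each day
a = session_hash("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", salt)
b = session_hash("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", salt)
assert a == b

# A different salt (i.e., the next day) yields an unlinkable id.
c = session_hash("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "secret-other-day")
assert c != a
```

The key design choice is that the identifier is derived, never stored: once the salt is discarded, there’s no durable record tying those sessions to a person.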

Jason’s Slides:

Piotr’s Slides:

For those looking to learn even more, we also hosted a follow-up seminar on Thursday. This was the first official Piwik PRO meetup in the US, and we were proud to host it in Columbus! Some of the highlights from Thursday included:

  • A deeper discussion of the Piwik PRO suite including Tag Manager, CMP, and CDP.
  • Discussion of Piwik PRO’s migration away from their freemium model. Pricing now starts at $40/mo, more pricing info here.
  • A sneak preview of how Piwik PRO will be integrating the Fraud0 anti-bot system into their platform.
  • Delicious food from Brassica, and a very restrained amount of griping about GA4 from the audience.

Still hungry for even more Piwik PRO? Check out the upcoming Piwik PRO day on October 21, a virtual event featuring speakers such as Simo Ahava, Brian Clifton, Steen Rasmussen, as well as CBUSDAW veterans Matt Gershoff and Josh Silverbauer!

Some pics from both events:

June 2025 Recap – Reducing LLM Hallucinations

Our June event featured Ash Lewis from Ohio State Linguistics talking about why LLMs give us incorrect information so often, and strategies we can use to reduce this behavior.

While the standard term for this is of course “hallucination”, Ash pointed out that the term “confabulation” more accurately describes what is happening. Hallucination implies that the LLM is incorrectly perceiving something; however, what we’re describing is not misperception. It’s AI creating statistically probable, yet incorrect, information.

Wikipedia agrees with Ash, describing hallucination thusly:

This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with erroneously constructed responses (confabulation), rather than perceptual experiences.

Of course as any linguist would point out, we don’t get to prescriptively say how language should be used… so we’re pretty much stuck with “hallucination”.

Whatever the term, this happens because everything that an LLM creates is simply what is statistically probable. Output which is also true is coincidental to the process. In other words: it’s guessing about everything, it just happens to be right enough to be very useful.
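A toy sketch of that process (the prompt and the token probabilities here are entirely invented for illustration): the model samples whatever continuation is statistically probable, with no notion of whether it’s true.

```python
import random

random.seed(42)

# Invented next-token distribution for the prompt
# "The Columbus meetup was founded in ___".
# The weights encode plausibility, not truth.
next_token_probs = {"2019": 0.4, "2018": 0.3, "2021": 0.2, "2015": 0.1}

token = random.choices(
    list(next_token_probs), weights=list(next_token_probs.values())
)[0]

# Any of these years can come out; none is checked against reality.
print(f"The Columbus meetup was founded in {token}")
```

Every one of those completions is fluent and plausible, which is exactly why confabulated output is so easy to mistake for fact.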

If you think this is an issue of the past, limited to older models, here’s a current example from o4-mini:

During the talk we verified our group’s nerd cred by knowing how this guy found out who his dad was.

So how do we reduce this problem as much as possible? Here’s Ash’s helpful field guide notes:

Our prompt about our group’s history set us up for failure by breaking most of Ash’s rules.
We made the following mistakes:

  • Not breaking down (decomposing) our ask into small components. We didn’t ask a more granular question, like the year the group started or a list of previous topics; we asked for a whole history.
  • Not encouraging the LLM to check its own work, step through its reasoning, provide sources, or indicate uncertainty. As soon as we follow up and ask things like, “what is your source for attendance doubling?”, it will say it has none.
  • Not letting the LLM search the web (this is a form of RAG). The o-series models from OpenAI are pretty good at knowing when they should do this, and likely would have done so in this scenario.
  • Not getting more than one response. When asked a second time with the same prompt, it said it didn’t have enough information to give a response.

Ash then dug into details about the work she is doing with COSI (the Columbus science museum) creating an AI agent that can help visitors with questions about the museum. This work attempts to limit hallucinations as well as provide the museum a more affordable and privacy-friendly solution than just sending things to ChatGPT.

She also helpfully has provided us with her slides!

As usual — especially when it’s a talk on AI — the crowd had a lot of great questions!

And a few pictures from the event. All totally real. Really. Maybe?

April 2025 – Using Predictive Modeling to Prevent Homelessness with Ty Henkaline

Our April 2025 event featured Ty Henkaline talking about work that he has done with non-profits in Franklin County to help better understand homelessness. Ty has been working with Smart Columbus’ Columbus Community Information Exchange Initiative (CIE) to produce research that utilizes data from the Mid-Ohio Food Collective (MOFC) and the Community Shelter Board (CSB) to help us better understand this growing problem.

As Ben Franklin — for whom our county is named — famously said, “An ounce of prevention is worth a pound of cure.” No question this is doubly true for homelessness, and providing early warning to agencies that help prevent these crises is a great use of data.

But as Ty pointed out, this data is not always easy to come by. Our existing systems were all built separately, and data integration was never a priority. Sensitive data about at-risk individuals is a challenging arena to work in, and Ty emphasized both the value of having partners that were truly invested in making this system work as well as the potential value of additional data sources.

This “spike chart” was a huge hit with the audience, and shows the following things:

  1. Growing usage of services (in particular food banks) was a strong leading indicator of homelessness.
  2. With far fewer data sources compared to LA, Franklin County was able to see a very similar effect. How often do you see that in data modeling?
  3. Individuals experiencing first-time homelessness continue to need an elevated level of services after the initial crisis. This reinforces the notion that prevention can do a lot to improve the overall load on the system.

As promised, Ty provided us with his slides, which contain lots of links and some calls to action! Try scrolling to navigate the slides, or check out the direct link here.

If you’re interested in helping or learning more, please feel free to message Ty on LinkedIn.

Check out the engaged audience!

March 2025 Recap – A/B Testing with Melanie Bowles

headstone from Google Graveyard

When we last had Melanie Bowles as a speaker in 2019, she led an informative session on building a sustainable experimentation strategy. Since nothing at all has changed since 2019, we just replayed that talk and then everyone went home. While I’m obviously being facetious (many of the strategies that Melanie laid out in that talk are still very relevant!), the landscape has changed a lot since then, from big changes in browser privacy and client-side technology to the shutdown of the most widely used tool in the industry, Google Optimize.

While there’s no clear successor to Optimize, there are many good testing tools out there, including popular options like AB Tasty, Convert.com, Visual Website Optimizer, and Optimizely. Most of these tools also offer integration with GA4.

Deployment count of tools via BuiltWith data

As you can see from this chart, based upon deployments in the top 1M sites, none of these tools are exactly catching fire with popularity. A big reason may be that none of them are free for unlimited usage like Optimize was. Melanie also pointed out that A/B testing and similar functionality like feature flagging have in some cases moved into all-in-one suites like Amplitude, Salesforce, etc. The sunset of Optimize can be looked at as a chance to mature our A/B testing practices and focus them where they can have the most impact.

Melanie also suggested that we embrace AI tools, especially on the ideation side of testing. There’s no substitute for human expertise when building out tests, but it’s certainly not cheating to let ChatGPT come up with some potential variations for your test! Just remember to give the AI as much context as you can. Melanie ran through a quick example which included providing the AI with a customer persona, which you can find in her slides below!

 

As a new twist for the meetup in 2025, we’re making a donation to a speaker-selected non-profit at each event. Melanie chose to designate Columbus Cultural Orchestra — a program for young people 13-25 to develop their musical skills and enhance diversity in orchestral music — for a $250 donation!

February 2025 Recap – Analytics the Right Way with Tim Wilson

Our first event of 2025 was a book release party for CBUSDAW’s very own Tim Wilson!

If you’ve ever been to a CBUSDAW event before (or listened to his podcast the Analytics Power Hour) you’ll know that Tim has a lot of things to say about analytics. Smart things, funny things, cranky things, etc. To our benefit, he’s organized many of these thoughts together into a book (with co-author Dr. Joe Sutherland) called “Analytics the Right Way: A Business Leader’s Guide to Putting Data to Productive Use“.

This is an excellent book that may be targeted towards “business leaders” with its title, but can also be incredibly useful for analysts themselves in terms of how to think about doing analytics in a productive way. There’s a lot of books out there covering tools, methods, and technology — but Tim and Joe’s book stands out in being about actually using analytics within an organization to further business goals. (NB: this is Jason writing this recap, and not Tim awkwardly hyping his own book in the third person. Also Tim I promise I will get around to writing that Amazon review at some point.)

But this wasn’t just a book signing with free beer; Tim also gave a talk about some of the topics he covered in the book! We had a great crowd, with friends and colleagues of Tim’s coming in from as far as Chicago, Nashville, and Boston.

 

We also had Jim Gianoglio jump in behind the camera (Tim’s normal job) and get some great action shots:

 

November 2024 Recap – Piwik PRO and Clarity with Josh Silverbauer

For our November 2024 event, we brought Josh Silverbauer in from Philly to talk about behavior analytics (in the form of MS Clarity) and marketing analytics (in the form of Piwik PRO) and when you might want to use each one.

Since Josh is well-known for writing parody songs to introduce speakers, here at CBUSDAW we flipped the script on him and opened the event with a surprise parody song about Josh.

We present “It’s Josh Silverbauer” to the tune of “In the Midnight Hour” by Wilson Pickett, sung by Jim Gianoglio and featuring Jason Packer on kazoo.

 

While Josh is a fan of both Clarity and Piwik PRO, he’s pointed out that he’s not paid anything by either organization… so he’s free to tell it like it is. And “how it is” is that both tools are great additions to any analyst’s arsenal — and with a generous free tier for Piwik PRO and a totally free product with Clarity there’s not much barrier to entry.

Josh pointed out how the two tools can easily be used to supplement each other. For example, one could use Piwik PRO to find an aggregate group of users that aren’t converting well, and then review those users’ entire sessions with the session recording feature in Clarity. Or check the heatmap of the landing page for that same group.

If you’ve used session recording tools in the past you know that it can be pretty tedious to watch the recordings one-by-one. It’s like, “geez, just click the button already user #23341, it’s RIGHT THERE”.

Microsoft has recently integrated Copilot into Clarity, so it can now save you from watching a ton of videos by summarizing them and doing some basic analysis for you.

Josh described Piwik PRO as “what Google’s Universal Analytics 2.0 could have been if GA4 didn’t exist”. If you were a serious user of our dearly departed UA you’ll feel right at home in Piwik PRO, and you’ll be pleased to see how well thought-out the platform is.

Josh’s slides:

Ok, so Clarity and Piwik PRO are both pretty cool tools, but what about rock operas?

Josh (and Jason) are releasing Volume 2 of Josh’s epic analytics rock opera entitled “User Journey Volume 2: The Custom Dimension” on November 18th.

You can listen to the first volume, “Universal Sunset” now on Spotify and most other streaming platforms.

Finally, don’t forget to join us next month for our yearly holiday event at the Grandview Theater. No speakers, but there will be a movie this year and you can vote on what you’d like it to be!

Disclaimer: Piwik PRO is a sponsor of CBUSDAW — but they only pay for the monthly pizza, not our (or Josh’s) endorsements.

October 2024 Recap – Geo Testing with Sanjay Tamrakar

As analysts, we love to optimize everything that we possibly can — so when we have a speaker that gives us a new way to think about testing we are here for it!

For our October event we had Sanjay Tamrakar talk to us about geo-testing. Sanjay covered everything from basic methods like traditional pilot testing, through difference-in-differences, all the way to the current state of the art with GeoLift.

Back in the pre-digital era, Columbus was considered to be a top location in the US to pilot test new products, since its demographics closely matched the country as a whole. This allowed companies to try out new products, but only gave marketers an idea of how well a new product might do nationwide, not how much incremental lift different product variations might give or how different Columbus might perform vs. Chicago or Charleston. This kind of granularity requires more powerful tools.

These days there are much more expedient and statistically rigorous methods to test things, like GeoLift. GeoLift is an open-source package from Meta that allows the creation of artificial control groups, which we can use to test against treatment groups without having to worry as much about building control groups from user-specific information and the privacy issues that can bring. GeoLift’s synthetic control methods create control groups by amalgamating different untreated areas whose combined performance is expected to match the treated areas.
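GeoLift itself is an R package, so the following is not its actual API — just a hedged Python sketch of the synthetic-control idea on made-up data (the donor markets, weights, and lift are all invented): fit weights over untreated “donor” markets so their mix reproduces the treated market’s pre-campaign behavior, then use that mix as the counterfactual after the campaign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-period weekly KPI for 5 untreated "donor" markets (rows = weeks).
donors = rng.normal(100, 5, size=(20, 5))

# The treated market roughly tracks a fixed mix of donors pre-campaign.
true_w = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
treated_pre = donors @ true_w + rng.normal(0, 1, size=20)

# Fit weights so the donor mix reproduces the treated pre-period.
# (Real synthetic-control methods also constrain weights to be
# non-negative and sum to 1; plain least squares keeps this short.)
w, *_ = np.linalg.lstsq(donors, treated_pre, rcond=None)

# Post-period: the donor mix projects the counterfactual forward;
# the gap vs. the observed treated outcome estimates incremental lift.
donors_post = rng.normal(100, 5, size=(8, 5))
synthetic_post = donors_post @ w
observed_post = donors_post @ true_w + 4.0  # pretend the campaign added ~4
lift = (observed_post - synthetic_post).mean()
print(f"estimated weekly lift: {lift:.1f}")
```

The appeal is that everything here is aggregate, market-level data — no individual user information is needed to build the control group.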

There was even some R code showing up on the big screen, which sadly Tim Wilson missed!

Sanjay was also kind enough to provide us with his slides:

 

July 2024 Recap – Solo Data Science with Lauren Burke-McCarthy

Fresh from another successful DataConnect Conference, Lauren Burke-McCarthy led our July session of Data & Analytics Wednesday talking about how to survive and succeed as a solo practitioner of data science.

Being a “solo practitioner” could mean being the only data scientist on your team, being siloed in some way, or even being a freelance contractor. The strategies that Lauren presented were focused on how to best communicate and set expectations with stakeholders. We’ve all been there when a project has gone off the rails because what a practitioner implemented didn’t match at all what a stakeholder had envisioned. Let’s nip these misalignments in the bud as best we can before they can blossom into fully grown issues.

In fact, it turns out many (perhaps most!) of these techniques could work for us in any data-related role. What, after all, even is a data scientist? Lauren also took a crack at answering that age-old question off the top of her head. To paraphrase her answer: a Data Scientist focuses on models and experiments to make future-looking predictions, whereas a Data Analyst works on analysis of current and historical data to identify trends and develop insights. If those two things seem to blur into each other at times, that just shows how Lauren’s advice on processes and communication works for both! Perhaps even for those of us who have now added “AI” to our job titles? Could well be…

Looking to learn more about these techniques? Lauren was kind enough to provide us with her slides so you can take a look for yourself:

And, of course, pictures!

Please join us next month when the ever-delightful Matt Gershoff will be in town to discuss how to think purposely about data as we move towards privacy by design.

 

June 2024 Recap – Under the Hood of A/B Testing

Our June 2024 meetup featured Dr. Maria Copot from OSU delving into some of the underlying theories behind our favorite A/B testing platforms. Though before we get into the fun math part (yes, it’s fun, don’t look at me like that) — we need to all remember that there needs to be a question behind your experiment. If you don’t have a hypothesis you’re trying to validate, then what’s the point of testing something? Once you’ve got something you want to test, then you can test it, but testing just for the sake of saying how many A/B tests your department ran last year isn’t going to get you where you want to be.

A lot of us have been asked, “is this result statistically significant?” And maybe we’ve even said, “well, the P-value is <0.05, so it’s significant”… But what exactly is a P-value, and why is 0.05 such a big deal? Dr. Copot explained the basics of P-values, including that 0.05 is an arbitrary benchmark, and that a P-value can’t tell you anything about the size of an effect, its validity, or the reason behind it. If that still sounds a bit confusing, it’s time to cue the memes about scientists being unable to explain P-values in an intuitive way. We think Dr. Copot’s explanation would be in the top quantile of that distribution, at any rate. Even if math is fun, it isn’t always intuitive.
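To ground that definition, here’s a self-contained sketch (with hypothetical conversion numbers) of the two-proportion z-test that sits behind that familiar “significant?” question:

```python
from math import sqrt, erfc

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a two-proportion z-test.

    The p-value is the probability of seeing a difference at least
    this large if A and B truly convert at the same rate; it says
    nothing about how big or important the difference is.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return erfc(abs(z) / sqrt(2))  # two-sided normal tail probability

# 500 vs. 560 conversions out of 10,000 visitors each:
p = two_proportion_p_value(500, 10_000, 560, 10_000)
print(f"p = {p:.3f}")  # ~0.058: misses the arbitrary 0.05 line
```

Note the punchline: a 12% relative lift can still land just above 0.05 — which is exactly why treating that threshold as a bright line is dangerous.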

Dr. Copot also talked about sample sizes and power analysis (one such online calculator I’ve used many times: https://www.evanmiller.org/ab-testing/sample-size.html), but then moved on to Bayesian methods. Traditional A/B tools (like Google Optimize, RIP) have typically used frequentist methods like the P-values we’ve been talking about. Newer tools have folded in some Bayesian methods, which thankfully are a little more intuitive, if perhaps more mathematically and computationally expensive.
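If you’d rather see what calculators like that one are doing under the hood, here’s a sketch of the standard normal-approximation sample-size formula (the baseline and lift numbers are made up; exact tools may differ slightly in their corrections):

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.8):
    """Visitors needed per variant to detect an absolute lift `mde`
    over conversion rate `baseline`, using a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p1, p2 = baseline, baseline + mde
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / mde ** 2
    return ceil(n)

# Detecting a 5% -> 6% improvement takes thousands of visitors per arm:
n = sample_size_per_variant(0.05, 0.01)
print(n)
```

The takeaway is the painful inverse-square relationship: halve the lift you want to detect and you need roughly four times the traffic, which is a big part of why low-traffic sites struggle with A/B testing.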

Finally, we talked about how privacy regulations, sampling, and cookie limitations can make doing these kinds of experiments more difficult. One way around these limitations is to use paid platforms like Prolific where you can make your own sample group and run a group of fully consented users through an experiment of your choosing.


Please join us next month when Lauren Burke-McCarthy will talk about how to succeed as a solo data scientist.