February 2024 – Are We Dangerously Obsessed with Data Collection?

This month’s meetup was our first since we were rechristened as “Columbus Data & Analytics Wednesday.” In an unintentional twist, the topic for the event was centered around the speaker’s contention that we (the broad, collective “we”) devote too much of our time and energy to the collection and management of data, and not enough effort to actually putting that data to productive and impactful use within our organizations.

Tim started by pointing out that, if we consider any task that has any relationship to data as “data work,” then we can further categorize each of those tasks into one of two buckets:

Data Collection and Management
Data Usage

He noted that there is no inherent business value in the collection and management of data. There is only the potential for value. To realize that value requires putting the data to meaningful, applicable business use.

All too often, data workers get so caught up in data collection and management tasks, though, that they start to believe that there is inherent business value in those tasks alone. Tim pointed to three reasons for this happening:

Technology vendors tend to have business models that are high fixed cost and low variable cost, which means they’re incentivized to drive aggressive customer growth. This results in heavy investments in marketing and sales organizations that wind up distilling down their messaging to, “Buy our technology and you will realize business value.” And they spend a lot of time and money promoting that message.
Consultants have the opposite business model (low fixed costs and high variable costs), which means they grow profitably by selling engagements that use repeatable processes that can tap into a scalable workforce. That pulls them to “technology implementation” work over “deeply engage with the businesses of their clients and all of the complexity therein.” So, they wind up promoting a similar message: “Buy (our partners’) technology, let us help you implement it, and you will then realize business value.”
Human nature within organizations drives us to do tangible “things”—adding new data sources, cleaning up data quality issues, building or augmenting dashboards, etc. This leads us to telling ourselves that these tactical activities, which skew heavily towards data collection and management, bring value to the business in and of themselves.

According to Tim, recognizing and pushing back against this mindset means embracing the messiness and hard work required to actually use data productively. He proposed that organizations need to put the same level of rigor around their data usage processes as they put around their processes for collecting and maintaining data. As an example, he outlined a framework he uses (but was clear that this wasn’t “the only” framework that’s valid for data usage) that pointed to three distinct ways data can be used to provide value:

Performance measurement—objectively and quantitatively answering the question: “Where are we today relative to where we expected to be today at some point in the past?” He described using “two magic questions” for this: 1) What are we trying to achieve, and 2) How will we know if we’ve done that?
Hypothesis validation—this is all about improving decision making by reducing uncertainty when determining what to do going forward. For this, he described a 3-part fill-in-the-blank technique: “We believe [some idea] because [some evidence or observation]. If we are right, we will [take some action].”
Operational enablement—data when it is actually part of an automated or mostly automated process (for instance, ordering shoes online generates data that is used by the order fulfillment process). He went on to say that every generative AI use case he’s seen put forth falls into operational enablement.

He ended by imploring the crowd to look at the work they and their colleagues do day in and day out through a “data collection & management” vs. “data usage” lens and consider working to shift the balance of their efforts towards the latter!

The slides are available below, as well as at :

And, of course, some pics from the event, which had a large and lively showing!

November 2023 – Google Analytics 4 and BigQuery with Scott Zakrajsek

Over fifteen years ago, Scott Zakrajsek was one of the founding organizers of Columbus Web Analytics Wednesday (he’s the guy in the green shirt right in the middle of this picture taken in May, 2008; three people in this picture, as well as the photographer, were present at our November 2023 event, and none of us could remember the name of the restaurant; the passage of time does wonders to one’s memories):

Several role changes, co-founding a company, spending several years in Boston before returning to central Ohio, getting married and having a couple of kids, and we finally convinced him it was time to re-take the stage at one of our meetups!

The topic: Google Analytics 4 and BigQuery. That’s a Big(Query) topic to cover briefly, but these two platforms are increasingly intertwined, as it has been evident for a while that Google has decided that the road to flexibility and robustness in accessing and analyzing GA4 data is a path that passes directly through BigQuery.

Scott provided a brief recap of how the fundamental data model in GA4 differs from Universal Analytics. He then made the case for why the ease with which that data can be piped into Google BigQuery (he outlined the steps for turning on that integration, including highlighting the key choices to be made when doing that) enables both deeper analysis as well as easier integration of website and mobile app behavioral data with data from other sources.

Once the data is in BigQuery, though, it has to be made accessible, both to analysts and to business users. For the former, that means SQL, and it means going beyond simply SELECT, FROM, WHERE, ORDER BY, and GROUP BY to also be comfortable with UNNEST, subqueries, and CTEs. He demonstrated how generative AI—Bard, as one option (which led to a brief discussion of Duet AI and Copilot as other options)—could be used to get an initial pass at functional SQL, although some tweaking is generally required. That led to a discussion of the difference between SQL-for-exploration-and-one-time-analysis vs. SQL-to-be-productionalized.
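To make the UNNEST idea a little more concrete, here is a minimal Python sketch (not Scott's code, and Python rather than SQL) that mimics what an UNNEST does to a GA4-style event row. It assumes the nested `event_params` layout of the GA4 BigQuery export (a repeated record of a `key` plus a typed `value`); the sample rows and the `param_value`, `flatten_event`, and `unnest` helpers are made up for illustration.

```python
# Hypothetical sample rows shaped like the GA4 BigQuery export:
# each event carries a repeated `event_params` record of key/value pairs.
SAMPLE_EVENTS = [
    {
        "event_name": "page_view",
        "event_params": [
            {"key": "page_location", "value": {"string_value": "https://example.com/"}},
            {"key": "engagement_time_msec", "value": {"int_value": 1200}},
        ],
    },
    {
        "event_name": "purchase",
        "event_params": [
            {"key": "page_location", "value": {"string_value": "https://example.com/checkout"}},
            {"key": "value", "value": {"double_value": 59.99}},
        ],
    },
]


def param_value(value):
    """Return whichever typed field is populated (string, int, float, or double)."""
    for field in ("string_value", "int_value", "float_value", "double_value"):
        if value.get(field) is not None:
            return value[field]
    return None


def flatten_event(event):
    """Turn one nested event into one flat row per parameter, like UNNEST."""
    return [
        {"event_name": event["event_name"], "key": p["key"], "value": param_value(p["value"])}
        for p in event["event_params"]
    ]


def unnest(events):
    """Flatten every event; the result has one row per (event, parameter) pair."""
    rows = []
    for event in events:
        rows.extend(flatten_event(event))
    return rows


FLAT_ROWS = unnest(SAMPLE_EVENTS)

# Roughly "SELECT value ... WHERE key = 'page_location'" after the unnest.
PAGE_LOCATIONS = [r["value"] for r in FLAT_ROWS if r["key"] == "page_location"]
```

The point of the sketch is the shape change: two nested events become four flat rows, which is why a simple SELECT/WHERE is not enough until the parameters have been unnested.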

To wrap the session, Scott conducted a live demo, including pushing the results of a query into Looker Studio.

The presentation was followed by a great discussion that demonstrated the value of in-person meetups: attendees included both several individuals who are elbows-deep in GA4 with BigQuery as well as a number of BigCurious individuals who were able to tap into the experience of Scott and the attendees to get a much better sense of what is involved in bringing the two platforms together.

More details? Check out the slides:

And, hey, the same guy who took that picture at the top of this post with 2008 digital photography tech has upgraded his gear a few times since then, so there are pictures from the event, too:

Bonus: in the intro, Bryan brought up the User Journey – Vol. 1 rock opera that long-time WAW co-organizer Jason Packer was instrumental in producing!

August 2023 — No Business Is Ever Free from Pain, Uncertainty, and Constant Work

For this month’s event, Brett Buchanan from Pathfinder Product delivered a talk inspired by Jonah Hill and the documentary he directed centered on his friend and therapist, Phil Stutz. The inspiration for the talk was Stutz’s thesis that, in life, we are never free from pain, uncertainty, and constant work. If you have some pain (physical, psychological, emotional, or some other form that it pains me to say I cannot think of at the moment) and you address it, well, it just gets backfilled by some other pain. The same goes for uncertainty. And for work! Brett realized that this applies to businesses, as well as to our professional lives, as much as it applies to our outside-work world.

Step 1 is to recognize that. But, step 2—the gist of the evening’s talk—was that the way to keep this reality from forcing us to operate at indefinitely sustained levels of high stress is to bring a relentless focus to our work. At a high level, that “just” means figuring out what really matters (to you, to your organization) and then aggressively prioritizing the things that impact that (and let things that don’t impact that fall by the wayside). Going a level deeper, Brett walked through examples of organizations—Uber, CarNext, and Gap—and how they have applied this idea. He provided a useful way of framing a lot of buzzy/popular business management tools—north star metrics, OKRs, business process mapping—through this lens: applying them as a tool to gain that focus.

Do we have the slides? You can be certain that we do!

And does it pain you to ask if there are photos of the evening? No need! Those flowed through our constant work pipeline and are available for your perusal should you wish.

June 2023 — Harnessing the Power of ChatGPT with Embeddings and Chat

Here at Columbus WAW, we’ve never claimed to be trendsetters, but we can hop on a bandwagon like a crowd of teenagers chasing a TikTok challenge.¹ The challenge? The landscape is evolving quickly, so we wanted to cover a topic that would provide content with staying power, and we needed a speaker who could do that! Luckily, Pete Gordon fit the bill, and he delivered (and he delivered while sporting a Columbus WAW shirt)!

His slides are available here. And, Pete himself can be found around town in all sorts of forums that he runs or supports, including GDG Columbus, Ohio DevFest (the next one will likely be in Toledo), and Columbus Code & Coffee.

While the talk danced into more technical territory than we generally get to at one of our meetups, it did so in the service of helping attendees think through the actual applications for this brave new world of large language models (LLMs). At this point, even the most non-technical of us have created an OpenAI account and lobbed some questions at ChatGPT. Maybe we’ve even tried out Bard. We’ve read more posts than we care to admit with Thought Leaders explaining 25 ways that YOU can put these tools to AMAZING USE! In short, we’ve lived “in the web interface” or “in the app” when it comes to exploring these platforms.

Pete’s talk, while relevant to this approach, came at the topic from more of a developer perspective—thinking about interacting with these platforms through their APIs. This included a glimpse into what this looks like, but, more importantly, provided a perspective on the give-and-take of an application interacting with a large language model.

And, he framed the presentation around the greatest cartoon series ever created.

Midjourney (Perhaps) Successfully Avoids Copyright Infringement with Its Rendering of “Pinky and the Brain hunched over a computer and writing code. Both creatures have rounded ears, pink tails, and red noses.”

Before diving into Pinky-the-Prompt-Engineer and The-Brain-Doing-Embeddings-and-Vector-Similarity, Pete provided some background and history of natural language processing (NLP), noting that 2012-2013 was one big jump forward with the emergence of recurrent neural networks (RNNs), and 2018 was the next leap with the emergence of Bidirectional Encoder Representations from Transformers (BERT). He recommended that attendees watch The State of GPT, a recent talk by Andrej Karpathy (co-founder of OpenAI) at the Microsoft Build developer conference.

For prompt engineering (the Pinky role), Pete emphasized that there is an important difference between the “base model” in one of these platforms and that base model actually being employed as an “assistant.” A base model is not an assistant, but, with effective prompt engineering, it can be made to behave as one! That prompt engineering certainly can be a human being (or a dopey mouse) who has read the right articles on the subject and then practiced to hone their techniques, or it can be an application that is designed to iteratively prompt a base model via API calls. The exact same concepts apply either way—a developer just needs to have codified the techniques!
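As a rough illustration of what “codifying the techniques” might look like, the sketch below wraps a user question in an assistant-style prompt before it would be sent to a base model. The template wording, the `build_assistant_prompt` function name, and the few-shot example are all assumptions made for illustration, not Pete's code, and the actual API call is deliberately left out.

```python
# Hypothetical few-shot examples that show the base model the pattern to continue.
FEW_SHOT = (
    ("What is the capital of France?", "The capital of France is Paris."),
)


def build_assistant_prompt(question, examples=FEW_SHOT):
    """Frame a question so a plain text-completion model continues as an assistant."""
    lines = ["The following is a conversation with a helpful, concise assistant.", ""]
    for user_text, assistant_text in examples:
        lines.append(f"User: {user_text}")
        lines.append(f"Assistant: {assistant_text}")
        lines.append("")
    lines.append(f"User: {question.strip()}")
    # Ending on "Assistant:" nudges the model to complete the assistant's turn.
    lines.append("Assistant:")
    return "\n".join(lines)


PROMPT = build_assistant_prompt("  How do I export GA4 data to BigQuery? ")
```

Whether a person types that framing by hand or an application assembles it on every call, the text that reaches the model is the same.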

Pete then shifted to explaining embeddings and vector similarity (the Brain side of things), where at least a few attendees’ minds (including that of the author of this summary) were blown. Unfortunately, much more of this was demo’d live with code than is available in his deck, which is why it’s always best to attend in person rather than rely on a mostly-human-written recap after the fact!

In a nutshell², when you have one of these large language models, you have a “model of unstructured data.” When you have other unstructured data (which could be a prompt, but it could also be just a statement or a document—some coherent string of words), you can use that as a query against the model to find out “where” in the model the data you’re passing in fits. That “where” can be represented with a vector of floating point values (think of those as being coordinates in an n-dimensional system that will melt your brain if you try to create a mental picture of it). “Yeah? So what?” you’re thinking. Well, that’s where things start to get pretty cool. If you’re working with a vector of numbers, then you can start doing “distance” comparisons of your unstructured data, be it other unstructured data you’ve passed into the model or unstructured data that exists within the model. The image above actually shows the resulting vector of floating point values when “Hello World how are you today?” was passed to the model. Then, the bottom part of the screen shows the sets of unstructured data within the model that are “closest” to that phrase (which Pete indicated were things like the “Hello, World” Wikipedia entry, since Wikipedia is one of the sets of unstructured data used to create the ChatGPT LLM). This part of the session prompted quite a bit of discussion as to potential use cases.
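The “distance” comparison can be sketched in a few lines of Python. This is a toy example: it assumes cosine similarity as the measure (the recap does not record which measure was used in the demo) and uses made-up 3-dimensional vectors in place of real embeddings, which typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest(query, corpus):
    """Return corpus labels ordered from most to least similar to the query."""
    return sorted(corpus, key=lambda label: cosine_similarity(query, corpus[label]), reverse=True)


# Made-up vectors standing in for real embeddings (illustration only).
CORPUS = {
    "hello world article": [0.9, 0.1, 0.0],
    "greeting phrases": [0.7, 0.6, 0.1],
    "tax law summary": [0.0, 0.2, 0.9],
}
QUERY = [0.8, 0.3, 0.0]  # pretend embedding of "Hello World how are you today?"

RANKED = nearest(QUERY, CORPUS)
```

With these invented numbers, the two greeting-flavored entries rank ahead of the unrelated one, which is the same “closest to that phrase” idea at a much smaller scale.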

It was a broad, deep, and complex topic, but Pete kept it moving, and, as is the norm at these meetups, the audience was engaged!

Next month’s event will be tackling the same world of LLMs, but coming at them from an entirely different angle!

And pictures? Of course!

¹ The “like a crowd of teenagers chasing a TikTok challenge” line was provided by ChatGPT. Some of the other suggestions from our future overlord were: “like a kangaroo on caffeine”, “like a clumsy penguin sliding on ice,” “like a squirrel on a sugar rush,” “like a herd of cats chasing a laser pointer,” and “like a herd of wildebeests following the migration.”

² The post author does not make any guarantees regarding the accuracy of the contents of this nutshell.

May 2023 — Heatmapping and Session Recording

With all of the focus on digital analytics platforms—Google’s impending end-of-life for Universal Analytics, privacy regulations, forced cookie expiration, and the like—it’s easy to forget that there is a world of data beyond platforms that do (or don’t…but then do again) report bounce rates and conversions. For our May event, Lindsay Peck, Conversion Optimization Director at Adept, explored two such types of data: heatmaps and session recordings.

Lindsay started by making the point that any self-respecting analyst, CRO, or marketer takes a fairly broad view of their data ecosystem and recognizes that different tools are useful for different types of data work:

She then elaborated on where these two specific types of tools fit in such an ecosystem: what they are, what they’re good for, and what their limitations are.

For heatmaps, well…we know what heatmaps are, right? In a digital context, they’re most often used to visualize where users are clicking (or tapping), how far down pages/screens they are scrolling, and, when working in a mouse world (desktops/laptops), where they’re actually moving the cursor before they click.
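Under the hood, a click heatmap is essentially a count of events per region of the page. The sketch below is a simplified illustration, not how any particular vendor does it: it assumes clicks arrive as (x, y) pixel coordinates and bins them into a coarse grid. The cell size, the function names, and the sample clicks are all made up.

```python
from collections import Counter


def bin_clicks(clicks, cell_size=100):
    """Count clicks per grid cell; each cell is cell_size x cell_size pixels."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    counts = Counter()
    for x, y in clicks:
        # Integer division maps a pixel coordinate to its (column, row) cell.
        counts[(x // cell_size, y // cell_size)] += 1
    return counts


def hottest_cells(counts, top_n=3):
    """Return the top_n (cell, click_count) pairs, busiest first."""
    return counts.most_common(top_n)


# Made-up click coordinates in pixels (illustration only).
CLICKS = [(120, 40), (130, 55), (190, 90), (640, 300), (125, 60), (650, 310)]

GRID = bin_clicks(CLICKS)
HOT = hottest_cells(GRID, top_n=2)
```

A real tool would then map each cell's count to a color; the “hot” cells are simply the ones with the highest counts.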

While heatmaps visually aggregate all of the users’ activities on a web page, session recordings capture individual users’ experiences on the site in a video format. Typically, these recordings are only captured for a (~random) sample of visitors to the site, and the analysis can be more time consuming. Lindsay recommended setting aside 2-4 hours for an initial review of session recordings—it’s got a lot of brain work involved to actually observe and process what is happening and “see” patterns across multiple recordings, so she recommended not trying to just fit the work into a series of 15-30 minute slots. It’s too deep work-y type work to do that!

There are lots of vendors that provide these engagement analysis features, with Microsoft Clarity being the option that is the most free and unlimited (although its functionality has some limitations) and Hotjar being the most popular free/low cost (but you may need to be selective about which pages you capture data on). They are all typically pretty easy to implement (they can be deployed through a tag manager) but need to be considered through a privacy lens just like a digital analytics tool would be.

Some common uses for these engagement analysis tools are:

Visualizing the most popular (hot) and unpopular (cold) elements of a web page
Understanding what content users are seeing and engaging with (and sometimes, more importantly, what they’re not seeing or engaging with)
Seeing where users are experiencing friction, hesitation, or possible frustration
Understanding how users are scrolling and moving their mouse, and if they are interacting with your page’s main links, buttons, CTAs, etc.
Supplying data and insights to inform a hypothesis around critical business and marketing questions like, why aren’t users converting, or why isn’t the CTA getting clicks?
Uncovering and prioritizing bugs or display issues across devices.
Putting yourself in your visitor’s shoes.

There was a lot of discussion throughout her presentation, which was fun! But, she also provided the slides if you weren’t able to attend and are content with reviewing those as the loosest approximation of having actually been there. If you’d been there, the gallery below shows some pictures of what you would’ve seen!

The event photographer arrived early and got all artsy

This was NOT a staged photo, despite how it looks

See those logos on the screen? Those are the best sponsors a meetup could ask for!

Emcee Bryan checks his notes to make sure he’s covered all the announcements

Door prize drawing time! Lindsay did the drawing honors.

Lindsay kicks off her presentation

The audience had questions (and they were good ones)!

And the audience had equally good observations!

This event was the opposite of bullsh**!

March 2023 — Hollywood Storytelling Secrets You Aren’t Using in Your Data Presentations with Lea Pica

One of the benefits of having been running a meetup for 15 years and having awesome sponsors is that we are able to occasionally bring out-of-town speakers back by popular demand. It’s almost like we sometimes produce sequels, which, thematically, lined right up with this month’s event!

Lea Pica last presented at WAW in 2017, and we were beyond excited to bring her back to be ~~on the~~ in front of a big screen to once again provide attendees with tips and tales of storytelling done well.

This time, Lea wrapped her content in something of a thought experiment: what would happen if blockbuster movies and TV shows were delivered the way that most data presentations get delivered? For instance, imagine Game of Thrones as a corporate presentation:

“Game of Thrones” Reimagined as a Corporate Presentation

With that basic premise, Lea then shared five “Hollywood tips” for creating and delivering data presentations that are high impact:

Begin with an irresistible hook—think about movie posters that tease an upcoming release…and then revisit the title slide for your presentation
Learn to create anticipation—did you know that every TED Talk presentation is required to have a “throughline” identified—a single sentence that summarizes the entire talk? Put the effort in to determine the throughline of your data presentation, and you will be setting it up to be high impact and effective.
Take your audience on a journey of transformation—the “narrative arc” (exposition, rising action, climax, falling action, resolution) is part of cinematic and prose Storytelling 101 for a reason. It’s effective! Approach your data presentation as an Insight Journey existing within a narrative arc!
Present a clearly defined plan of action—S.M.A.R.T. recommendations are Lea’s twist on S.M.A.R.T. goals. In the context of recommendations, the acronym stands for: Specific, Measurable, Assigned, Relevant, and Time-Bound.
Conclude with a definitive ending—don’t let the presentation metaphorically just fade to black and drift off.

Whether you attended or not, you can request a really handy 15-page guide to these tips at leapica.com/wawhandout!

To demonstrate the impact of the approach, Lea shared at the end of the presentation that what she had delivered…had actually illustrated and applied all five tips. How meta is that?

The Narrative Arc Was Present in the Presentation We’d Just Witnessed

And, now, while it lacks a narrative arc, below are some pictures from the event!

January 2023 – How to Optimize Your Google Business Profile

Google Business Profile (GBP)? What happened to Google My Business (GMB)? The former replaced the latter! At this month’s event, Katie Bradley split the difference between a primer and a deep dive on the subject and armed attendees with what they needed to know to claim, manage, and drive results using the free* GBP platform. In a nutshell, the session went through:

The key differences between GBP and GMB (with a conclusion that the change is generally for the better)
How to claim and verify a (Google) business profile
The various ways to optimize a profile: proper categorization, a solid description, good (geo-tagged) photos, and adding products (if relevant)
How to add Google posts (and what makes for good ones)
The importance of soliciting customer reviews, and how to get as much mileage and impact as possible from good ones
How to track and measure the impact of the profile over time
And more…!

The slides from the event are here, and some pictures from the evening are below!

* “Free” in that there is no monetary cost for claiming a business and then working with the listing. But, TANSTAAFL, amiright? Putting Katie’s many tips to use does take some time, but it can often be an investment with a seriously positive ROI!

October 2022 — Customer Research As A Differentiator

Is your business competing as a commodity or competing on experience? Or…trying to do both? At this month’s event, we learned that a “both” approach is an awfully tough row to hoe!

Sarah Ahern from PATH presented on how to smoke your competition by listening to what your customers want and then fostering experiences to give that to them! Ultimately, this can be boiled down to a 3-step process:

Listening…to customers, employees, and the market.
Building a process that is cross-functional, primed for action, and prepared to take action with what is heard.
Monetizing the results by evolving and improving the experiences for the right customers.

You can see Sarah’s slides here.

And a few pictures from the evening:

Emcee Bryan Kicked Off the Presentation

The Otterbein Crew Getting Their Extra Credit Selfie with Tim

Sarah Setting Up the Topic

The Spectrum from Commodity to Experience

Oh. And There Was Some Tasty Texas Beer

Are You Listening?

Andy Chimes Into the Discussion

Wrapping Up!

June 2022 – Google Analytics 4 (Meetup AND Bonus Seminar)

With just over a year to go before July 1, 2023, the date Google has announced for the end-of-life of Universal Analytics, this month’s meetup included a bonus follow-on seminar on Thursday morning.

Ken Williams (creator and maintainer of in-depth Google Analytics 4 resources on ken-williams.com) and Cory Watson from Search Discovery presented at both events. Some of the highlights and takeaways from the event:

The entire philosophy of how the collected data is structured is different from Universal Analytics—it relies solely on events, and each event can have multiple parameters (as Cory put it, “the event is what happened, and the parameters are the context around that”)
This model does have more flexibility, but it’s still got some gaps in its functionality; mostly, these are things that Google is working on
Bounce rate was going away…to be replaced by a vastly superior concept of “engagement.” This was awesome…but there was a backlash from a sufficiently large number of users (who should really be ashamed of themselves) that Google is going to add bounce rate back in as an available metric
Planning is key: actually accessing the data once it’s captured will be reasonably straightforward or an absolute nightmare depending on how well the implementation is planned. Ken and Cory suggested that organizations should plan for 11 to 17 weeks to implement Google Analytics 4
Something to think (freak out) about is that lots of organizations expect to have access to year-over-year comparisons. That means, starting July 1, 2023, they’ll be wanting apples-to-apples comparison data that goes back to…July 1, 2022. <gulp>
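The event-plus-parameters model and the engagement/bounce relationship described above can be sketched as follows. This is illustrative only: the `Event` class and the sample sessions are made up, session duration is crudely approximated by first-to-last event timestamps, and the engaged-session rule used here (longer than 10 seconds, or at least two page/screen views, or a conversion event) is an assumption based on Google's published definition rather than anything taken from Ken and Cory's slides.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One GA4-style hit: the name is what happened, params are the context."""
    name: str
    timestamp_sec: float
    params: dict = field(default_factory=dict)


def is_engaged(session, conversion_names=("purchase",), min_seconds=10, min_views=2):
    """Apply the assumed engaged-session rule to a list of Event objects."""
    if not session:
        return False
    times = [e.timestamp_sec for e in session]
    # Simplification: treat first-to-last event time as the session duration.
    duration = max(times) - min(times)
    views = sum(1 for e in session if e.name in ("page_view", "screen_view"))
    converted = any(e.name in conversion_names for e in session)
    return duration > min_seconds or views >= min_views or converted


def bounce_rate(sessions):
    """Bounce rate as the share of sessions that were NOT engaged."""
    if not sessions:
        raise ValueError("need at least one session")
    engaged = sum(1 for s in sessions if is_engaged(s))
    return 1 - engaged / len(sessions)


# Made-up sessions (illustration only).
SESSIONS = [
    [Event("page_view", 0.0, {"page_location": "/"})],  # single quick view: a bounce
    [Event("page_view", 0.0), Event("page_view", 4.0, {"page_location": "/pricing"})],
    [Event("page_view", 0.0), Event("scroll", 25.0, {"percent_scrolled": 90})],
    [Event("page_view", 0.0)],  # another bounce
]

RATE = bounce_rate(SESSIONS)
```

Framed this way, bounce rate is just the inverse of engagement rate, which is part of why bringing it back as a metric was not a big lift for Google.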

This post can’t possibly do a complete recap of the material. You really had to be there.

But, if you weren’t (or, if you were, but you’d like to review the content), the next best thing is the slides, which Ken and Cory graciously shared!

The slides from Wednesday’s meetup:

The slides from Thursday’s seminar:

And, hey, just in case you wanted to see a few more pics of the event:

April 2022 – Modern Culture of Data

For our April event, Thomas Kilbane, Jeewan Singh, and Eric Hayslett from Slalom Consulting presented on the people and process side of data: the pillars that are critical for organizations to establish if they want to meaningfully put data to work as an ongoing and ingrained part of the organization.

The five pillars:

Bold Vision—and it needs to be a vision that is clear and business-aligned (“We’re going to do AI” doesn’t count)
Access & Transparency—users need to be able to get to the data and understand how to interpret it
Data Literacy—users need to have both the hard skills (using tools) and the softer skills (hypothesis development) and the incentives to be putting the organization’s data to use
Guardianship—governance of the data so that it can be trusted by the users as to its accuracy, as well as ensuring it is compliant with regulatory requirements
Embedding Insights—a pillar that is dependent on all of the other pillars: actually incorporating the use of data into all relevant aspects of the organization’s day-to-day operations

Slalom’s support for the event also enabled us to up our game on the food, with a delicious spread from Chef Jeff!

The event photographer has finally learned how to set a custom white balance on his camera. He even occasionally remembers to reset it when he moves from the meeting space out into the atrium. Occasionally.