April 2023 — An Introduction to DataOps

April’s meetup saw Jeewan Singh and Tomy Rhymond of Slalom Columbus give us an introduction to the concept of DataOps.

We all deal with data on a regular basis, though we are usually focused on only a small part of that data’s lifecycle. Thinking more broadly about the whole journey of data, from ingest to insight, means broadening our horizons past the details of analytical tactics towards a more holistic approach.

How do we get better at managing that whole lifecycle? That’s where we need to change our mindsets and move towards the kind of agile processes that define DataOps.

The traditional way of dealing with data starts from a fundamental assumption that the process and business requirements are static: the naive idea that once we figure out how to ETL the data and get the business the reporting it needs, we’re done. Of course, we all know that’s not how things work in the real world, so we need methods for dealing with this constant change and evolution. We’re still allowed to complain when the business stakeholders change the requirements just when we’re almost done with the project (that’s a basic human right), but perhaps if we’ve built more flexibility into our process from the start, that won’t mess us up quite as much.

Automation, continuous integration and delivery, test cases right from the start, infrastructure as code rather than clickops — these are the kinds of things that are at the heart of DataOps. If you’ve ever worked within DevOps you’ll find the approach to be similar, but tweaked and extended for the specific needs of the data world.
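
To make "test cases right from the start" a bit more concrete, here is a minimal sketch of what that might look like for a single transform step. It is not taken from the talk: the `Order` record, the `clean_orders` function, and the cleaning rules (drop duplicate IDs, drop negative amounts, normalize country codes) are all hypothetical, chosen only to illustrate the idea of shipping a test alongside the pipeline code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical cleaned record produced by the transform step."""
    order_id: int
    amount_cents: int
    country: str


def clean_orders(rows):
    """Hypothetical transform: drop bad rows and normalize country codes."""
    cleaned = []
    seen_ids = set()
    for row in rows:
        if row["order_id"] in seen_ids:
            continue  # skip duplicate order IDs, keeping the first one seen
        if row["amount_cents"] < 0:
            continue  # skip rows with a negative amount
        seen_ids.add(row["order_id"])
        cleaned.append(
            Order(
                order_id=row["order_id"],
                amount_cents=row["amount_cents"],
                country=row["country"].strip().upper(),
            )
        )
    return cleaned


def test_clean_orders():
    """A test written alongside the transform, so CI can run it on every change."""
    raw = [
        {"order_id": 1, "amount_cents": 1250, "country": " us "},
        {"order_id": 1, "amount_cents": 1250, "country": "US"},  # duplicate ID
        {"order_id": 2, "amount_cents": -5, "country": "ca"},    # negative amount
        {"order_id": 3, "amount_cents": 900, "country": "ca"},
    ]
    result = clean_orders(raw)
    assert result == [Order(1, 1250, "US"), Order(3, 900, "CA")]


if __name__ == "__main__":
    test_clean_orders()
    print("all checks passed")
```

A test runner such as pytest would pick up `test_clean_orders` by its name, so a continuous integration job could run it on every commit. When the requirements change (and they will), the test is the first thing to tell you what the change broke.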

In case you missed it, Tomy and Jeewan have also generously provided their slide deck:

https://www.slideshare.net/JasonPacker/dataops-cbusdaw-april-23