Many thanks to all those who participated in our Technology Stack poll for data leaders. In that single-question survey, we asked which elements in the technology stack data leaders need to understand. Now we have some interim results to share.

As explained more fully with that poll, our interest is in which technology data leaders need to understand sufficiently to make informed decisions. So, we are not encouraging micromanagement or hands-on involvement. What we are interested in is the technology that is so important that selection decisions should involve the data leader.

Since then, Tony Boobier has challenged us to be clear what we mean by the term data leader. For this poll, I’m assuming it is what Tony would call ‘data leaders’ not ‘data-driven leaders’. Francesco Corea has also joined the debate with his advice on how to make technology stack selections. I hope that advice helps you if you are nudged by the results below, e.g. to focus on an overlooked element.

Now it’s time to share with you the interim results. Which elements of the technology stack identified in “Practical DataOps” have received the most votes?

Interim results: which elements got the most votes

To inject more variety in how I present the results of our one-question surveys, I have this time used a Treemap. Using the free online data visualisation tool from Flourish Studio, I produced the following summary graphic…

We have a winner = Data Analytics

Maybe not surprising given the diversity of data leaders who read this blog. The term Data Analytics (and the technology elements covered by it) is meaningful to most. From BI to Data Science leaders, analysts to CDOs, I’m sure all could see the need for effective data analytics technology.

Given the overlapping nature of imprecise terms, it is unclear whether those voting meant anything from coding languages (like R or Python) to packaged solutions (like SAS). However, given the lower votes for Data Science specific elements, I suspect this is a vote for still needing the more traditional analytics software packages.

It makes sense that data leaders need to understand this technology. Indeed, my experience tells me they need to push for other leaders to understand how this differs from other elements. I’ve seen too many analytics teams expected to use solely data visualisation and/or BI tools.

Two equally important runners-up

Alongside the welcome priority given to the need to understand data analytics software, I’m pleased to see the next two. Hopefully, you can see (or confirm via the popups) that these are Data Management and Data Visualisation.

It has been so heartening over the last decade to see the rise in interest in data visualisation, as more & more leaders recognise the power of presenting data well and the risks of not doing so. Hopefully, votes for Data Visualisation are a confirmation that data leaders also now recognise the need for specialist tools. Too many analysts are left with Excel and unrealistic expectations.

Given the title ‘data leaders’, the focus on data management is also to be expected. Most CDOs and those with data in their job title have needed to ensure GDPR compliance is ‘front of mind’. Hopefully, considering data protection for all new developments (privacy by design) has also identified the need for an infrastructure that is up to the job. I am encouraged if this vote means recognition of the need for dedicated data management & data quality monitoring tools.

The next four take us more into the buzzwords

Warning: we are approaching ‘buzzword central’, but via some too often overlooked elements. The buzzword elements are Data Science Platforms and Cloud Computing. That said, the huge progress in using both together is testament to their power for many organisations. It has been encouraging to see major cloud computing providers like Amazon, Google & Microsoft also provide data science platforms.

These more complete data science environments have allowed newer data science teams to take a more ‘pay as you grow’ approach. On that journey, they have still had access to a growing diversity of tools. What is still missing is greater consistency on development methodologies, but I’ve shared that before (with some recommendations).

The two elements that join those buzzwords for ‘joint third place’ are Data Integration & Data Processing Pipelines, and BI Tools. These once again reflect the diversity of complexity needed in modern data functions. It would be wrong to overlook the need for efficient, accurate, timely & well presented BI. But increased demands for Big Data, plus greater integration of models & data products into live systems, mean that data integration and pipelines matter too. We have come a long way from simple batch ETL. This shows how much data leaders now need to understand what was previously the sole preserve of IT departments.

Spare a thought for the 3 least loved

It is just as informative to look at the other end of the scale. Interestingly, each of these is an element of technology that I often hear mature data science leaders reference. The 3 elements with less than 3% of the votes are:

  • Development tools, workspaces & software libraries
  • Reproducibility, deployment, orchestration & monitoring
  • Data Products

Seeing that list reminds me of two books that make a case for greater rigour and focus in these areas. In my review of “Guerrilla Analytics” by Enda Ridge, I mentioned the use made of development workspaces and reproducibility. In “Practical DataOps”, Harvinder Atwal makes the case for data teams to focus more on producing data products & the deployment tools needed.

I suspect these low votes show most data leaders are working in businesses that are not yet at this level of maturity. If that is you, then I would recommend looking at both those books to help you plan ahead. You don’t want to invest in technically skilled staff only to discover too late that your technology stack is not up to supporting their delivery.

What is a step too far? Well, no-one gave any votes to APIs. So, perhaps the coding work for integration with other systems is still to be left in the hands of DevOps. There is still a vital role to be played by IT departments, but if you consider the votes above, data leaders should have a lot of common ground with CIOs. How could you collaborate rather than have turf wars?

What have you noticed?

I hope those interim results are of interest to you. Maybe they even help inform what you should be considering when planning and investing in your technology stack.

If you have a view to share, either on what’s needed or the votes we have seen so far, please get in touch. I’d love to hear what matters most to you.