Questions for Blockchain

Ever since completing our popular series sharing resources from the #CityChain17 blockchain conference, I’ve been ruminating on some fundamental questions for blockchain.

These questions are relevant to Data & Analytics leaders, especially if blockchain technologies are to fulfil their potential and move beyond the pilot stage.

Each deals with a different challenge facing data & analytics leaders today and asks whether blockchain could help. My three questions cover: ways to amalgamate disparate data sources; performing meaningful analytics on new data structures; and meeting one of the key challenges presented by GDPR.

As I am no blockchain expert, and robust thinking on this topic is still at a relatively early stage (just as most ‘use cases’ are still pilots), I have looked to others.

In this post, I share 3 other articles reflecting on the 3 questions for blockchain (that I pose below). Each demonstrates some real expertise and sensible thinking, at least about the work still needed. I hope you find them useful.

Q1: Could blockchain help address the problem of amalgamating the disparate data sources needed for Analytics?

Here it is worth reflecting on both the similarities of blockchain & database technologies (as Gideon shared in our previous post) and also what database theory has learned over recent decades of practice.

Despite not having the answer (yet), in his article, Dr Barry Devlin explains the potential of blockchain in the context of how the management of historical data has developed. Understanding the pros & cons of different approaches helps highlight both the potential & some of the remaining challenges for using blockchain technology here. Dr Devlin’s experience since the early days of data warehouses means he is well worth hearing (to balance the enthusiasm of your blockchain developers & sales people):

Historical Data: From Data Warehouse to Immutable Blockchain – TDWI Upside

Managers and other decision makers have long trusted the data warehouse as the best available source of a consistent, historical record of business performance. I discussed the topic of consistency in my previous article.

So, we might conclude that there is real potential here, but it is still unproven. I hope some businesses take up that challenge, as the metadata benefits of blockchain timestamps, transparent provenance & immutability could be significant.

But if blockchain solutions could replace some data warehouses or federated solutions, or provide indexing, will analysts be able to use them?

Q2: How could analysts or data scientists interrogate data held in blockchains, to gain insights?

As I raised during a Q&A session at #CityChain17, there does appear to be much opportunity for collaboration between Data Science teams & Blockchain developers. However, despite the understanding of the similarities between blockchains & databases, most use cases are focussed on operational systems, rather than solutions that analysts could interrogate to provide analysis or insights.

This must be a relevant topic for IT leaders at the moment, as I note Gartner has recently published new research on this question.

So, I was pleased to see this post, from Benedikt Koehler, on how to query the blockchain using R. It is interesting to see that he also references a paper using Graph Analysis, a technique I’m hearing mentioned more & more by analysts – so we will return to that in a future post.

Anyway, given the increasing use of R by analysts (and especially Data Scientists), I hope you find the coding examples & outputs shared by Benedikt to be helpful. Do let me know if you have an alternative proven approach to query blockchain data.

Querying the Bitcoin blockchain with R

The crypto-currency Bitcoin and the way it generates “trustless trust” is one of the hottest topics when it comes to technological innovations right now. The way Bitcoin transactions always backtrace the whole transaction list since the first discovered block (the Genesis block) does not only work for finance.
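Benedikt’s post works in R. For readers who prefer Python, the general idea of turning block data into something an analyst can interrogate might be sketched as below. This is not his code and it does not call any real blockchain API: the block structure, field names and values are hypothetical toy data, and the helper names `flatten` and `total_sent` are my own. It simply shows nested blocks being flattened into one row per transaction and then summarised.

```python
from collections import defaultdict

# Hypothetical toy blocks. Real block data is far richer than this;
# the field names here are assumptions made for illustration only.
blocks = [
    {"height": 1, "transactions": [
        {"sender": "alice", "receiver": "bob", "amount": 5.0},
        {"sender": "bob", "receiver": "carol", "amount": 2.0},
    ]},
    {"height": 2, "transactions": [
        {"sender": "alice", "receiver": "carol", "amount": 1.5},
    ]},
]


def flatten(blocks):
    """Turn nested blocks into one flat row per transaction."""
    rows = []
    for block in blocks:
        for tx in block["transactions"]:
            # Carry the block height onto every transaction row.
            rows.append({"height": block["height"], **tx})
    return rows


def total_sent(rows):
    """Sum the amount sent by each sender across all rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["sender"]] += row["amount"]
    return dict(totals)


rows = flatten(blocks)
totals = total_sent(rows)
print(rows)
print(totals)
```

Once the data is in flat rows like this, the usual analyst toolkit applies. The sender and receiver columns also form an edge list, which is the starting point for the kind of graph analysis mentioned above.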


Finally, we should return to the viability of blockchains holding your customer data in a post-GDPR world.

Q3: If Blockchain is immutable (insert only), then how will it support GDPR requirements for ‘right to erasure’?

As we shared in an earlier post on the challenges presented by GDPR, one of them is the ‘right to be forgotten’ or ‘right to erasure’. It was notable that despite all the useful information shared at #CityChain17, this challenge was never raised or answered. Indeed, the intrinsic immutability of blockchains was lauded as one of the key benefits. But if the data can’t be changed, how can a business comply with a customer’s request for erasure?
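To make that tension concrete, here is a minimal sketch of a hash-linked chain in Python. It is a toy model, not any particular blockchain platform: the record contents and the function names `block_hash`, `build_chain` and `is_valid` are assumptions of mine for illustration. It shows why erasure is awkward: each block’s hash covers its data and the previous block’s hash, so overwriting one customer record means the chain no longer verifies.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """SHA-256 over the block's index, data and previous hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to the one before it via its hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical customer records held directly on the chain.
chain = build_chain(["customer A details", "customer B details", "customer C details"])
valid_before = is_valid(chain)

# Customer B asks to be erased, so we overwrite their record in place.
chain[1]["data"] = "[erased]"
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

In this toy model the chain verifies before the erasure and fails afterwards, because the stored hash for the edited block no longer matches its contents. Real platforms add consensus and replication on top, which makes quietly rewriting history harder still.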

In this useful short article, Steven Farmer (legal counsel at Pillsbury Winthrop Shaw Pittman) outlines the problem and suggests a couple of possible mitigations. It will be interesting to see if such workarounds do reduce the risk of court action. They seem sensible within the constraints of current blockchain technology, but I’d prefer to see them tested and accepted in case law first.

It is well worth reading Steven’s clear & simple explanations:

Blockchain technologies and the EU ‘right to be forgotten’ – an insurmountable tension?

As blockchains are being used across an ever-growing list of applications, equally European privacy laws are becoming more sophisticated, evolving, as they have been for some time, in an attempt to play catchup with technological developments. Key attractions of blockchains are, of course, their permanency and transparency, ie the data store is added to and is very difficult to take away.

Any more questions for the blockchain?

I hope the above resources were worth sharing. Given the number of visitors to our recent blockchain posts, it seems this is a topic of real interest to our readers.

If you are a Data or Analytics leader, considering blockchain, what other questions for blockchain do you have?

Feel free to raise your questions in our Comments section below & I will curate the best answers or even interview an expert for a future post.