Not just ‘big’ – bringing big data into line

What’s in a name? When it comes to taming big data, it’s easy to see the challenge as purely one of scale – just as the name implies. But as a non-specialist working in this area, I quickly discovered that it’s more complicated than that, and I’d like to use this post to explain why.

It’s not that big data isn’t BIG, of course. The quantity of data out there is now measured in zettabytes (one zettabyte being equivalent to one billion terabytes). How big is this? Well, one zettabyte of data is enough to fill 15.625 billion 64GB iPads. Taking that a step further: according to their Q1 2016 stats, Apple had sold 308.16 million iPads to date. Assuming those were all 64GB models, a single zettabyte would equate to around 50 times every iPad ever sold.
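For anyone who wants to check the sums, here is a minimal Python sketch of the arithmetic above (using decimal units, so one zettabyte is 10²¹ bytes; the sales figure is Apple’s cumulative total as of Q1 2016):

```python
# Back-of-the-envelope check of the figures above (decimal units).
ZETTABYTE = 10**21            # bytes in one zettabyte
IPAD_CAPACITY = 64 * 10**9    # bytes in a 64GB iPad

ipads_per_zb = ZETTABYTE / IPAD_CAPACITY
print(f"{ipads_per_zb / 1e9:.3f} billion iPads per zettabyte")  # 15.625

IPADS_SOLD = 308.16e6         # cumulative iPad sales, Apple Q1 2016
print(f"{ipads_per_zb / IPADS_SOLD:.0f}x every iPad sold to date")  # ~51x
```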

[Infographic: ‘Not just big’]

What’s more, the sheer speed at which this data tsunami is growing is reflected in the commonly quoted statistic that 90% of the data in the world today was generated in the last two years. But all kinds of valuable information lie hidden inside this mega-mountain – information that industry and businesses could potentially turn into revealing intelligence and valuable insights.

“It’s estimated, for instance, that something like 80% of all data is unstructured,”

A key obstacle that’s often overlooked, though, is the incredible variety of data types, and that’s compounded by the huge differences that exist in the quality of that data. Roger Downing, our Big Data Technical Lead (and my local expert) here at Hartree, explains: “It’s estimated, for instance, that something like 80% of all data is unstructured – in other words, data not held in an actual database or some other structured form. A classic example is the data generated by mobile phones and things like Twitter. But it’s also important not to overlook ‘traditional’ data sources such as old files that often carry potentially useful historic information about organisations, behaviours, decisions, processes, products, performance, maintenance regimes, infrastructure assets and so on.”
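To make ‘unstructured’ a little more concrete, here is a minimal, hypothetical Python sketch in the spirit of Roger’s examples: pulling a few structured fields out of free-text maintenance notes. The note format, asset IDs and field names are all invented for illustration.

```python
import re

# Hypothetical free-text maintenance notes: dates, asset IDs and readings
# are buried in prose rather than sitting in database columns.
notes = [
    "2016-02-03: pump P-114 inspected, bearing temp 71C, no action",
    "2016-02-17: pump P-114 vibration high, temp 84C, scheduled service",
]

PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}).*?(?P<asset>P-\d+).*?temp (?P<temp_c>\d+)C"
)

# Turn each note into a structured (date, asset, temperature) record.
records = [m.groupdict() for m in (PATTERN.search(n) for n in notes) if m]
for r in records:
    print(r["date"], r["asset"], r["temp_c"])
```

Real pipelines use far more robust techniques than a single regular expression, but the principle is the same: impose structure so the data can be queried and analysed.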

One of the keys to exploiting big data successfully, then, is to extract actionable information from – and detect links and patterns in – all these data types. And that really does require specialist skills: identifying the right data sources in the first place, capturing the data effectively, and then preparing it for detailed analysis on high-performance computing clusters.
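As a flavour of what that preparation can involve, here is a minimal, hypothetical sketch of one such step: harmonising units and discarding unusable records before the data ever reaches an analytics cluster. The column names and rules are invented for illustration.

```python
# Hypothetical preparation step: harmonise mixed-quality sensor readings
# (different units, missing values) into one consistent table.
raw = [
    {"sensor": "A", "reading": "71", "unit": "C"},
    {"sensor": "B", "reading": "167", "unit": "F"},
    {"sensor": "C", "reading": None, "unit": "C"},   # unusable: no value
]

def to_celsius(row):
    """Return the reading in Celsius, or None if the row is unusable."""
    if row["reading"] is None:
        return None
    value = float(row["reading"])
    return (value - 32) * 5 / 9 if row["unit"] == "F" else value

clean = [
    {"sensor": r["sensor"], "celsius": round(to_celsius(r), 1)}
    for r in raw
    if to_celsius(r) is not None
]
print(clean)  # [{'sensor': 'A', 'celsius': 71.0}, {'sensor': 'B', 'celsius': 75.0}]
```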

“Our goal is to transfer this expertise and understanding into UK industry,”

“Getting to grips with big data and mining it effectively doesn’t just demand big computers and clever software – it also requires an expert understanding of what needs to be done to get hold of helpful data and make it usable from a data analytics perspective,” Roger says.

And OK, here’s the mild sales pitch: at Hartree, we’ve got all of those capabilities. Our goal is to transfer this expertise and understanding into UK industry by working alongside businesses on their projects, together with leading experts from our partner organisations. We want to demonstrate big data’s potential to inform decision-making and to strengthen products, services, processes and strategies. We see it as part of our mission to ‘democratise’ big data and cognitive technologies and make them accessible to non-experts, so that they can use these tools for economic or societal benefit.

Big data may have been around for a while now, but one thing is for sure – there’s plenty more that can be done to fully exploit the opportunities.

If you want to see and hear a bit more about the Hartree Centre and big data, then please view our new animation ‘Big data – delivering better decisions to drive productivity’.

Give us your views in the comments below.