Can engineers take the weight of big data?

<b>With such hype behind the Internet of Things, is all this information going to be useful, or will it become a hindrance to the engineer and a distraction from finding real innovation? Is Big Data in danger of being too big? Justin Cunningham investigates.</b>

It's the New Year and, although neither is totally new, there are a couple of phrases you are likely to hear a whole lot more of this year: the Internet of Things and Big Data. Both are essentially phrases that have been coined to encompass the concept of embedding sensors into 'things' – like consumer products and industrial devices – and then continually capturing data from them.

This idea of capturing data from 'things' everywhere offers the tantalising potential to uncover a fair bit of hidden innovation. It's hoped that this mass of data will shed new light on all kinds of societal issues by exposing trends and offering fresh insight into highly complex, long term problems, from personalised healthcare to city planning and everything in between.

A recent report by the Lloyd's Register Foundation entitled 'The Foresight review of Big Data – towards data-centric engineering' said: "Data will be used to predict and anticipate, plan and decide every aspect of the 21st century.

"The new scale of data availability will change all the strategic sectors... from design to manufacturing, maintenance to decommissioning."

The Internet of Things has been described as the overnight success that has been 30 years in the making. Indeed, there is no doubt that it offers huge opportunities, today, for design engineers to add value by embedding all kinds of sensors in products as standard. While this level of monitoring might once have been carried out only at the test and validation stage, continuous monitoring and connectivity will now be demanded in every product, from personal health monitoring devices to cars and combine harvesters to factory equipment, all the way up to colossal power plants. And it will happen on a global scale.

The difference between data and knowledge

However, this all represents one fairly major problem: too much information. Big Data, in many cases, becomes too big to handle, and getting anything useful from it can be near impossible.
This vast sea of raw data will take such a massive amount of processing power to analyse that it raises the question of whether Big Data is in danger of weighing down engineers, and innovation itself.

Indeed, the Lloyd's Register report went on to add: "Big data can be complex to analyse because it comes in many varieties, shapes and sizes and may have been collected over different timescales. It can be uncertain, noisy, and incomplete.

"It imposes demands on infrastructure and on humans. Big data needs analytics; not only the techniques of statistics and machine learning, but also the human skills of insight and pattern recognition to find genuine meaning.

"Collective responsibility and action by citizens, governments and businesses will be needed to realise the potential."

Sensing a problem?

The use of some basic sensors within the average smartphone has led to all kinds of unforeseen apps, and soon almost every product, from kitchen appliances to CNC machines, will demand the same element of connectivity and a varied suite of embedded sensors. But data rates vary greatly from sensor to sensor, with some taking readings on a near continuous basis. So how, and where, can all this data be processed?

Andy Chang, senior program manager for academic research at National Instruments, said: "There is no way to process all of this data in a centralised location – it is just not going to work."

Centralised cloud storage has become a big hit for consumers and industry alike in the last few years, with the migration to centralised storage almost reminiscent of the centralised mainframes of old. But this volume of data, from billions of devices all streaming continuous measurements, would fill all the world's data centres before long. It's likely we would simply run out of room to store all the information.
Dr Joe Salvo, a director of global research at General Electric, said: "By 2020 there will be 8 billion people in the world, 50 billion connected devices, and each year 50 trillion GB of data will be captured. If we were to store all that on old floppy disks, you'd need a stack of them that would go all the way to the sun and back... 300 times!"

There has always been a give and take between the technologies used to create data and the technologies used to process it, and there is no doubt we are on the cusp of a significant leap in terms of capturing data. For many, though, the process of turning it all into something useful is the elephant in the room, and something that is grossly underestimated.

It is not that solutions are not being developed. It is just that those thinking about embedding sensors, in the hope of gaining a critical advantage from the information gathered, may not be anticipating the rod they are making for their own back.

"Data almost becomes a problem as there is so much of it," said Peter Haigh, a power systems engineer at National Grid, which is developing systems to continuously monitor the electricity distribution network throughout the UK. "We produce so much data. We'd actually rather there was less of it, or that it was easier to distil it down to something useful."

Driving value

Enabling value to be driven from Big Data is key to its success. But how can anyone, or anything, cope with the near infinite information that is likely to be generated?

"We have to learn how to forget things," said Dr Salvo. "That is going to be the differentiator in this new age. How many people save every newspaper they ever got? You process it and throw it away. We need to get comfortable with processing it and then forgetting it. That is going to be the differentiator going forward."

Processing near infinite quantities of data on a continuous basis is essentially what the brain does: using the useful bits while forgetting the rest.
And this is what intelligent sensors and analytic software need to be able to do going forward: filter the data and boil it down to patterns, charts, 'real-statistics', and knowledge that allows people to make confident decisions, whether they are company directors assessing the macro trends driving a business or design engineers seeing the real world performance of products over months, years, and decades. This is where the true value in Big Data lies for engineers: the insight into how to improve design and, ultimately, where to innovate.

However, getting the relevant information out is not straightforward. It varies from industry to industry and person to person, depending on what is relevant and useful to them.

Jim Robinson is general manager of the Internet of Things solutions group, segments and broad market division at Intel. He said: "All the processing that is happening from new and existing devices is useless if you can't do something with the data that you are pulling from them. And this is where Big Data analytics has really come into its own in the last few years.

"I'm talking to companies and industries that are trying to drive value from Big Data and analytics, and they are trying to convert all this data into useful information. But it is very complex, so we believe it's critical that these Internet of Things platforms are built on open and flexible, and scalable, platforms that will ease the deployment."

Machine intelligence

This is also a key demand for Big Data going forward: machine intelligence. Machine to machine connections need to know what data to keep and what to dump.

Airbus, for example, has alluded to its Factory Of The Future being able to pull up the video accounts of technicians or fitters in order to see exactly what bolts, tools and so on were used, years after a build. That raises the question of where it will store all the data.
And in a highly regulated industry like aerospace, how will manufacturers know what data can be omitted or dumped?

Shelley Gretlein, a senior group manager for the robotics, real-time, and embedded software team at National Instruments, said: "I don't want to throw something away that in 10 years I'm going to regret, especially in heavily regulated sectors. How do I know I won't want to go back and have a look at it to see what happened later on?

"So, I think you have to have increased intelligence at the node, so these devices are not one way devices. We need machine intelligence so if it hits a limit, a fault, it will then send the data to a more centralised type location. Intelligence at the node is available now, but it is for users to define exactly how and when – so setting limits on when to take data – and that is the challenge."

There is no doubt that Big Data, and the Internet of Things, offer amazing potential. However, getting there is likely to be less straightforward than many would believe. It's likely that many engineers will have a problem with too much information, or will not be completely sure where to set the limits on when to keep information and when to throw it away.

But isn't this what engineering is all about: overcoming the challenge? While Big Data might well weigh down many, it is surely only a matter of time before all this information is exploited in ways that will revolutionise life and industry as we know them. It is a question of when, and not if.

--------------------------------------------------------------------

THE DATA DRIVEN SOCIETY

It is data on a global scale and will include everything from washing machines to personal health monitoring devices to smart energy meters in the home to the bearings in the gas turbines producing the power... and everything in between. The data is not just big, it is near infinite.
And it's quite likely that any future society is not just going to be data reliant, but data driven – using the insight from all this to meet, and predict, the expectations of the population. It is a daunting but exciting prospect.

It could revolutionise healthcare. Imagine your doctor giving you a call to inform you of a possible medical condition, months or even years ahead of current diagnosis, turning reactive medicine into proactive lifestyle changes.

It is being called many different things, from data-centric engineering to Industry 4.0 to cyber physical systems. But they all essentially boil down to the same thing: a data driven world, based on real world measurements, which will shape and steer society in virtually every area imaginable.
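
For readers who want to see what 'intelligence at the node' might look like in practice, here is a minimal Python sketch of the idea Shelley Gretlein and Dr Salvo describe in the main article: a device keeps only a short window of recent readings, forgets the rest, and forwards data to a central store only when a user-defined limit is crossed. Everything in it is an assumption made for illustration, not any vendor's method: the `EdgeNode` class, the `limit` and `window` parameters, the sample temperature values, and the `uploads` list that stands in for a real network link.

```python
from collections import deque
from statistics import mean


class EdgeNode:
    """Hypothetical sensor node: forgets routine data, forwards faults."""

    def __init__(self, limit, window=5):
        self.limit = limit                  # user-defined threshold (assumed)
        self.recent = deque(maxlen=window)  # oldest readings drop off automatically
        self.uploads = []                   # stand-in for a link to a central store

    def ingest(self, reading):
        """Process one reading; return True if a fault window was forwarded."""
        self.recent.append(reading)
        if reading > self.limit:
            # Limit hit: forward the recent context, not just the single spike.
            self.uploads.append(list(self.recent))
            return True
        return False

    def summary(self):
        """Compact statistics that could be reported instead of raw data."""
        return {
            "n": len(self.recent),
            "mean": mean(self.recent),
            "max": max(self.recent),
        }


# Illustrative readings only; the fourth value crosses the limit of 80.0.
node = EdgeNode(limit=80.0, window=3)
for value in [70.1, 71.3, 69.8, 95.2, 70.4]:
    node.ingest(value)

print(node.uploads)   # [[71.3, 69.8, 95.2]]
print(node.summary())
```

Of the five readings, only the three surrounding the spike ever leave the device. The hard part, as the article notes, is not the code but deciding where to set the limit and how much context is safe to forget, particularly in regulated sectors.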