Are you trusting to luck?

It is in the nature of things to go wrong, but there are many ways to minimise the risk of a mishap turning into a major disaster.

When things go wrong, they generally do so rapidly. This makes it essential that maximum use is made of simulation and formal risk assessment and that systems are designed to help human operators do the right thing. Training, knowledge and the right information supplied in the right place at the right time are often all it takes to ensure that the manufacturing system functions without mishap, the plane lands safely, and the malfunction on the oil rig comes to nothing more than a note in a report and a story told over drinks. Get it wrong and the very least the designer can expect is having to defend his or her shortcomings to management, if not to a law court. Good tools to avoid disaster are widely available, some of them free, so it would be wise to make use of them.
The disaster in the Gulf of Mexico that started to unfold on April 20th 2010 when the blowout sank Deepwater Horizon is a case in point. While the enquiries and court cases will probably go on for years, it is apparent that an operation that should have been routine went wrong when a number of engineering failures conspired to kill 11 people and cause possibly the worst oil pollution incident in history. In the words of BP chief executive Tony Hayward: "The honest truth is that this is a complex accident, caused by an unprecedented combination of failures". The same can be said of most aircraft accidents, most shipping losses, most accidents with machinery in the industrial workplace and a large number of car accidents.
BP says it has been investigating seven separate aspects of the mishap, most of them connected with the 'Blow Out Preventer', a massive assembly of valves and rams that sits on top of the well casing on the sea bed. If things go wrong, it is designed to shut off the flow of oil and gas from the well head automatically and, if that fails, to cut through and block the riser pipe. Because it is so crucial, it should be tested regularly; some say every day.
Tim Southam, proprietor of human factors and ergonomics consultancy Progress Through People, asked of the events that led to this disaster: "Was it tested? What was the emergency response? Blowouts have happened before and have been dealt with successfully, but there have been problems with sea bed shuttle valves for a long time." He spoke of the need for simulation to model disasters on a computer or, better, to learn how to deal with them in a realistic virtual environment. He also spoke of the need for practice and the development of good habits so they become what he called "a sub-conscious routine".
A retired RAF Squadron Leader, Southam gained much of his experience from hands-on work with aircraft and from human factors research at the Royal Aircraft Establishment at Farnborough. Much of his present work is with the oil and gas industry. Preventing serious accidents, he said, starts with engineering design and some apparently simple questions. He says: "How are alarms handled? How is the design of the interface? Is instrumentation confusing? Is a crucial valve 15 feet up in the air?"
As an expert on human factors, he is concerned with how people respond to an emergency situation, noting that, to complicate matters, different people are liable to respond in different ways according to gender, nationality, education, upbringing and organisational culture. Tired operators are also likely to perform more poorly than rested ones. He noted that lack of sleep reduces communication by 30%, the making of valid judgements by 50%, the remembering of facts and figures by 30%, and attention to alarms by 75%.
Clearly, one of the crucial factors in preventing disaster in any critical situation is presenting the right information to operators so they have time to make use of it. NASA has undertaken much research into both routine and emergency procedure checklists and how these should best be presented to aircrew. Most of these are in the form of paper cards, held in the pilot's hand. There have been numerous attempts to develop products that could automate checklist generation and checking off, culminating in Boeing's Engine Indicating and Crew Alerting System (EICAS) and the even more sophisticated Airbus Electronic Centralised Aircraft Monitor (ECAM).
Similar solutions can be applied in other situations. For instance, a working illustration of electronic management technology, including the handling of fire alarms, can be found in Rockwell Automation's Demonstration Suite in Milton Keynes. Rockwell is probably best known to our readers for its industrial automation systems, but these same systems are equally applicable to control and management of almost anything. Rockwell's Andrew Smith told us that, as well as managing CCTV, public address systems, escalators, doors and fire alarms, a system managing a railway station is designed to come up with a 'to do' list in the event of a fire.
Such a system, based on ControlLogix, has been installed and used by Heathrow Express for 12 years. Smith said that it has been substantially modified and expanded since its original installation, something which is made easier by basing it on standard software and a standard PLC system. The construction of Terminal 5 saw the system expanded even further. The entire network now covers a geographical area of some 25 sq km and comprises 53 PLCs, and its tasks include the management of the tunnel ventilation system. ICS Triplex, now part of Rockwell Automation, applies such technologies to industries that include oil and gas.
Allan Rentcome, director of SSB Technology, explains that ICS Triplex produces two main technologies for process safety. The first is 'Trusted', which uses a triple modular redundant architecture to meet IEC 61508 SIL (Safety Integrity Level) 3 and which, he says, is "very successful in the oil and gas industries". The second is 'AADvance', which is scalable from small systems to thousands of I/O points and is suitable for IEC 61508 SIL 1 to 3, including the ability to be configured as a triple modular redundant architecture. Rentcome says that ICS Triplex's solutions include very sophisticated self-testing and diagnostic capabilities, but can only form "part of the chain", the other parts being risk assessment, proper design methodology, maintenance and routine testing. Strategies can be built into the software to organise maintenance and testing and to document that they have been done.
In an industrial automation environment, application of the Machinery Directive to machine and automation system design has ensured that potentially dangerous machines have sophisticated control and safety systems that can be relied upon to shut them down if anything goes wrong or they are accessed or used improperly. Shutdown is not always the safest response to an emergency, however, and applying the disciplines imposed by formal risk assessment, such as those required by the Machinery Directive and standards EN ISO 14121 and EN ISO 12100, can go a long way towards ensuring safety. According to Paul Considine, electronic sales specialist with Wieland Electric, this process can be greatly assisted by the tool SISTEMA (Safety Integrity Software Tool for the Evaluation of Machine Applications), available as a free download from the website of the IFA, the Institut für Arbeitsschutz (Institute of Safety at Work) in Germany.
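As a rough illustration of the kind of table lookup such a tool takes off the designer's hands, the sketch below encodes the risk graph used to arrive at a required performance level. It assumes the graph defined in EN ISO 13849-1, the standard in which performance levels are specified (the article itself does not name it), and the function and parameter names are purely illustrative; they are not part of SISTEMA.

```python
# Illustrative sketch only: a lookup of the required performance level (PLr)
# following the risk graph in EN ISO 13849-1. All names here are hypothetical
# and are not taken from SISTEMA.

# Keys are (severity, frequency, avoidance):
#   severity  1 = slight injury,          2 = serious injury or death
#   frequency 1 = seldom / short exposure, 2 = frequent / long exposure
#   avoidance 1 = avoidance possible,      2 = avoidance scarcely possible
RISK_GRAPH = {
    (1, 1, 1): "a",
    (1, 1, 2): "b",
    (1, 2, 1): "b",
    (1, 2, 2): "c",
    (2, 1, 1): "c",
    (2, 1, 2): "d",
    (2, 2, 1): "d",
    (2, 2, 2): "e",
}


def required_performance_level(severity, frequency, avoidance):
    """Return the required performance level, 'a' (lowest) to 'e' (highest)."""
    key = (severity, frequency, avoidance)
    if key not in RISK_GRAPH:
        raise ValueError("each risk parameter must be 1 or 2, got %r" % (key,))
    return RISK_GRAPH[key]


if __name__ == "__main__":
    # Serious injury, frequent exposure, hazard scarcely avoidable.
    print(required_performance_level(2, 2, 2))
```

A hazard that could cause serious injury, to which people are frequently exposed and which is scarcely avoidable, comes out at level 'e', the most demanding. SISTEMA itself goes well beyond this single lookup.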
This software determines risk parameters such as those for determining the required performance level (PL), measures against common-cause failures on multi-channel systems, mean time to dangerous failure (MTTFd), and the average test quality of components and blocks. These factors are entered step-by-step in dialogues and each parameter change is reflected immediately on the user interface with its impact upon the entire system. Users are spared time-consuming consultation of tables and calculation of formulae, since these tasks are performed by the software. The final results can then be printed out.
But, according to Rentcome, it is not enough to look at single machines, such as the now notorious blow out preventer that was below Deepwater Horizon, or even at control computers and networks. Instead, he says: "Success is achieved by providing a complete solution and we like to be involved early in the process." ICS Triplex can then provide a "complete lifecycle solution", covering not only the design of its part of the hardware but also maintenance, testing, management and support for the complete life cycle of the project through to decommissioning. In fact, Rockwell has a complete 'mission control' room at its Milton Keynes headquarters, which allows the demonstration of how such systems, processes and equipment might be deployed in an industrial context.
As mentioned earlier, simulation is often key to discovering whether emergencies can be adequately coped with under realistic conditions, and thus should often be seen as an essential part of the design development and risk assessment process. Virtalis, originally spun out of the National Advanced Robotics Research Centre in the late 1980s, has particular skills in this field.
At a 'Showcase' event at Northampton University, Virtalis demonstrated a large projected 'ActiveWall' telewall, viewed in 3D using shutter glasses; a 3D cave called 'ActiveCube'; tracking of body movements using a Vicon camera system and reflective markers; and the training of helicopter winch men in a head-mounted immersive 3D environment. The company's particular skill is in being able to work with a wide range of 3D software, including that produced by Dassault Systèmes and Autodesk, and hardware interfaces that include iPads, iPods and iPhones. Virtalis technical director Andrew Connell used an iPad to control all three systems installed at Northampton during the demonstration day. While training and hazard assessment in a virtual environment is never the same as the real thing, it does reveal which tasks are likely to be difficult to execute.