
Examples of Failures - Embedded Systems
While most embedded systems suppliers are working to produce compliant
systems, the sheer number of different systems any one manufacturing concern must deal
with, in a short time, is daunting. The task is to find them all and fix them in time.
Keywords: Embedded Systems, Y2K, embedded chips, microcontrollers, microchips,
chips, data acquisition, SCADA, programmable logic controller (PLC), process control,
manufacturing automation
Chemical Safety and Hazard Investigation Board - PIPELINE SAFETY ADVISORY BULLETIN, July 7, 1999
The Office of Pipeline Safety, U.S. Department of Transportation, has issued a Pipeline Safety
Advisory Bulletin following a June incident in Washington State which claimed three lives.
- Background: During an Office of Pipeline Safety (OPS) investigation of a recent
pipeline incident, OPS inspectors identified inadequate SCADA performance as an
operational safety concern. Immediately prior to and during the incident, the SCADA system
exhibited poor performance that inhibited the pipeline controllers from seeing and
reacting to the development of an abnormal pipeline operation.
- Preliminary review of the SCADA system indicates that the processor load (a measure of
computer performance utilization) was at 65 to 70 percent during normal operations.
Immediately prior to an upset condition occurring on the pipeline, the SCADA encountered
an internal database error. The system attempted to reconcile the problem at the expense
of other processing tasks. The database error, coupled with the increased data processing
burden of the upset condition, hampered controller operations. In fact, key operator
command functions were unable to be processed immediately prior to and during the abnormal
operation. It is possible that post installation modifications may have hampered the
system's ability to function appropriately.
- The combination of the database error, the inadequate reserve capacity of the SCADA
processor, and the unusually dynamic changes that occurred during the upset condition
appears to have temporarily overburdened the SCADA computer system. This may
have prevented the pipeline controllers from reacting and controlling the upset condition
on their pipeline as promptly as would have been expected. For further information,
contact Chris Hoidal, Director, OPS Western Region at 303-231-5701.
The Institution of Electrical Engineers (IEE.org.uk) The Millennium Problem in
Embedded Systems - Casebook
- Much has been written about Year 2000 problems in embedded systems, but the emphasis has
been principally on the process of investigation, with little information about real cases
of failure. While the incidence of Year 2000 problems in embedded systems has been found
to be relatively low, the impact of the problems has in some cases been business
threatening. Action 2000 in conjunction with The Institution of Electrical Engineers has
undertaken a data collection initiative to collate facts about actual Year 2000 failures
in a wide range of embedded systems.
- Action 2000 through the IEE requested leading consultant engineering companies in the UK
to list the occurrence of actual faults found in equipment. Because of the range of
specialisms and industries worked in by these companies, a good representative sample is
thought to have been found.
- Contributing companies: AEA Technology, BSC Consulting, ERA, IBM, ICS, Real Time
Engineering, The Houndscroft Partnership
- The equipment categories used in the collection of data for the non-computer entries
(60% of the total) were:
- Logging / monitoring
- Other PLC
- SCADA
- Smart Instruments
- Stand alone instrument
- The areas reported as having most problems with non-computer based systems
are (in decreasing order):
- Calibration, monitoring, data logging, detectors, analysers
- Building management, including HVAC, fire and security systems
- Manufacturing and process systems (SCADA, PLC, DCS)
- Telecommunications and networking
- Other
- The dates which caused the problems were:
- millennium rollover 71%
- leap year problem 9%
- multiple date problems 6%
- other dates or unknown 14%
Tava Technologies : A White Paper that Discusses the
Significance of the Effect of the Millennium Bug (Y2K) on Process Control, Factory
Automation & Embedded Systems in Manufacturing Companies. Feb 98. (pdf)
- "To date, with plant floor Y2K experience at over 400 sites, the company has yet to
find a single site that did not require some degree of remediation; and, to date,
having researched tens of thousands of manufacturing automation systems and components for
Y2K readiness, the company has found more than 20% to be either non-compliant or
"suspect", that is non-compliant under certain circumstances."
- "Problems range from major operational nuisances to erratic production shortages to
complete plant shutdowns. But, perhaps the worst case of all will be systems that
continue to work but make bad decisions affecting product yields. It may be on
January 1, 2000, or it may be days or even months later."
Industry Wakes Up to the Year 2000 Menace - Fortune article
- Ralph J. Szygenda is chief information officer at General Motors, where his staff is now
feverishly correcting what he calls "catastrophic problems" in every GM plant.
In March the automaker disclosed that it expects to spend $400 million to $550 million to
fix year 2000 problems in factories as well as engineering labs and offices.
- Rob Baxter is Honeywell's vice president in charge of making his company's line of
industrial control products "year 2000 compliant." From what he has seen among
Honeywell customers, Baxter fears that "some plants will have trouble operating and
will have to shut down. Some will run at a reduced scope. I expect considerable system
outages during December 1999 through February 2000."
- Manufacturing's task is compounded by the multiplicity of its computer programs. Below
the layers of more or less standard software is a vast range of equipment run directly by
built-in chips and programs, which outnumber those in the rest of business by a factor of
ten.
- General Motors - "At each one of our factories there are catastrophic
problems," says the blunt-talking executive. "Amazingly enough, machines on the
factory floor are far more sensitive to incorrect dates than we ever anticipated. When we
tested robotic devices for transition into the year 2000, for example, they just froze and
stopped operating."
- Only a few companies offer software that can deal with factory problems. Among them are
Raytheon Engineers & Constructors, Fluor-Daniel, and Peritus Software Services of
Billerica, Mass., as well as the service operations of companies that sell industrial
controls, such as Foxboro and Honeywell.
- Tava Technologies. Its Plant Y2kOne software includes a database on 10,000
microprocessors, related control devices, and software from more than 1,000 vendors that
is used on the factory floor. Among other things, Plant Y2kOne can check out software in
robots, PCs, and PLCs; operating systems such as Unix, DOS, and Windows NT; and embedded
software such as a program used to guide automated vehicles.
- Leap-year snafus damaged production lines when programmers failed to account for the
extra day in February 1996. At a small U.S. manufacturer of industrial solutions that
prefers to remain unnamed, production ground to a halt on Jan. 1, 1997. Before workers
could remedy the situation, the liquids hardened in the pipelines, which had to be
replaced at a cost of $1 million. That caused late deliveries and the loss of three
customers. A similar leap-year oversight caused $1 million of damage at Comalco's aluminum
refinery in Tasmania, when controls at all smelting-pot lines shut down, damaging five pot
cells beyond repair.
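The leap-year failures described above are consistent with control software that assumes every year has 365 days, so that day 366 of 1996 is never handled. The following is a minimal Python sketch of that class of bug, not the actual code from either plant. The function names and the fixed 365-day lookup table are assumptions made for illustration.

```python
# Hypothetical illustration: cumulative days before each month in a non-leap year.
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_of_year_naive(year: int, month: int, day: int) -> int:
    """Buggy version: ignores leap years, so dates after February come out one day short."""
    return DAYS_BEFORE_MONTH[month - 1] + day


def day_of_year(year: int, month: int, day: int) -> int:
    """Corrected version: adds the extra day after February in leap years."""
    extra = 1 if is_leap_year(year) and month > 2 else 0
    return DAYS_BEFORE_MONTH[month - 1] + day + extra


if __name__ == "__main__":
    # 31 December 1996 is day 366, a value the 365-day table never produces.
    print(day_of_year_naive(1996, 12, 31))  # 365
    print(day_of_year(1996, 12, 31))        # 366
```

A controller that rejects or mishandles day 366 would misbehave on the last day of a leap year, which matches the timing of the incidents reported above.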
Year 2000 Problem Sightings ( http://info.cv.nrao.edu/y2k/sighting.htm ) Excellent source for general Y2K failures
- report Anesthesia machines non-compliant - supplier tries to sell new systems
- report Congressional Subcommittee survey
- Phillips Petroleum Y2K test - an oil rig hydrogen sulfide detector system stopped
working.
- Chrysler plant lock out
- NORAD Y2K - total system blackout
- Cara Corporation Embedded Systems Specialist David C. Hall stated that there are over 40
billion microprocessors worldwide, and anywhere from one to ten percent may
be impacted by the date change. Hall described an oil company that has determined the need
to replace thousands of chips controlling an oil dispensation system. The chips, he said,
do not fit on the existing motherboards and new motherboards do not fit into existing
valves. As a result, the valves themselves will have to be replaced, Hall said.
- report Users Demand Y2K Lemon Aid, Control Magazine
- Y2K failure rate in semiconductor plants - 3.3 billion micro-controllers embedded in the
automation infrastructure, 50 million will have Y2K anomalies.
- As a reference point, Woll reviewed the Dept. of Defense Year 2000 project inventory
report. He said of 3,962 applicable systems, 582 were OK, 623 were being renovated, 628
were retired, and the balance of 1,900 was being assessed. The numbers suggested that
about 25% of all the systems would require some level of fixing.
- Patrick Meehan, Y2K program manager, DuPont Operations, presented the large-user
perspective. "Let's face it, there's not much upside and a lot of downside," he
offered. He sees that 50% of DuPont's work will be with process control devices and
systems and his current estimate is that, while 100% will be examined, 10-15%
will need remediation.
- "Towards the end of 1998, those who haven't yet worried about Y2K will find
themselves forced to. If they don't, Y2K becomes the best thing that happened to lawyers
since divorce."
http://www.xs4all.nl/~zooko/Y2k-real-life.html
- full story General Motors tested robotic devices - they "just froze and stopped working"
- full story control valve for generator cooling integrated over time for smoothing
- full story Chrysler plant test locks the doors on testers
- "We're pretty sure our first tier will work," Chrysler President Thomas Stallkamp
said of his company's largest suppliers. "It's the second and third and fourth tier
who supply not just our industry but others. As you get further down the food chain,
you've got a guy making widgets for us as well as for Boeing and Maytag, and
those guys are the ones we're worried about."
- "We got lots of surprises," said Chrysler Chairman Robert Eaton. "Nobody
could get out of the plant. The security system absolutely shut down and wouldn't let
anybody in or out. And you obviously couldn't have paid people, because the time-clock
systems didn't work."
http://www.euy2k.com/reallife.htm
- a power plant in the United Kingdom - control valve for generator cooling is integrated
over time for smoothing
- ITRON meter reader decks and associated upload/download equipment fail on 2000
- NRC-NEI Meeting (If a plant can be shut
down because flooding prevents proper emergency response, then Y2K failures of emergency
procedures could require shutdowns)
- details Hawaiian Electric Company
- Western Power - Many of the control systems represented in power systems have dates
associated with them. These could be reclosers, voltage regulators, governors, PLCs etc.
The list is endless. You then have a swathe of actual 'applications' involved in the
delivery of electricity such as your Distributed Control Systems and your SCADA (Supervisory
Control and DATA (e.g. dates) Acquisition) systems, all of which have dates associated with
them. Much of what happens throughout the process of generating and delivering electricity
is 'DATE AND TIME STAMPED'.
http://www.sysmod.com/embexamp.htm
- North Sea Expro (Shell-Exxon JV) Platform, Pipeline and Gas Plants - 12% failure rate
- Alcoa Steel Plants : 50% of control systems will fail
- BP Refinery - vendor not found for 20, 3 will fail, 2 will cause shutdown
- Capelrig Millennium Test Centre for Shell
- demonstrates how a failing system controlling an oil rig pump would float the platform
- oil rig typically has 8000-10000 embedded systems
- details Hawaiian Electric Company energy
management system (EMS) failure would have resulted in HECo's transmission network
crashing, and by default, a major power outage and loss of all generating capacity
- Programmable thermostats fail, one cannot be restarted.
- Chip failure would cut off cooling system and cause explosion in chemical plant
- Fossil power plant control and downstream PLC clock mismatch would trip plant
- Gas pipeline metering failure
- PLCs locking up due to Year field overflow
- Sewage controls fail to track tide tables properly
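Several of the items above (PLC lock-ups on year field overflow, clock mismatches between a plant control system and downstream PLCs) involve a year stored in only two digits. The following is a minimal Python sketch, assuming a hypothetical controller that stores dates as (yy, month, day) and computes elapsed days by subtraction. The function names and the windowing pivot of 70 are illustrative assumptions and are not taken from any of the cited reports.

```python
from datetime import date

PIVOT = 70  # Assumed window: 70-99 -> 1970-1999, 00-69 -> 2000-2069.


def expand_year(yy: int, pivot: int = PIVOT) -> int:
    """Map a two-digit year to four digits using a fixed window."""
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year must be in 0..99")
    return 1900 + yy if yy >= pivot else 2000 + yy


def elapsed_days_naive(start, end) -> int:
    """Buggy version: treats every two-digit year as 19yy."""
    (y1, m1, d1), (y2, m2, d2) = start, end
    return (date(1900 + y2, m2, d2) - date(1900 + y1, m1, d1)).days


def elapsed_days_windowed(start, end) -> int:
    """Windowed version: expands both years before subtracting."""
    (y1, m1, d1), (y2, m2, d2) = start, end
    return (date(expand_year(y2), m2, d2) - date(expand_year(y1), m1, d1)).days


if __name__ == "__main__":
    before = (99, 12, 31)  # 31 December 1999 as (yy, month, day)
    after = (0, 1, 1)      # 1 January 2000
    print(elapsed_days_naive(before, after))     # negative: time appears to run backwards
    print(elapsed_days_windowed(before, after))  # 1
```

Any downstream logic that divides by, integrates over, or schedules on the naive elapsed time would receive a large negative interval at the rollover. Windowing only defers the ambiguity to the edge of the chosen window; storing four-digit years removes it.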
http://www.year2000.com/archive/similar.html
(Computer problems similar to Y2K)
- telephone outage that occurred in New York on September 17, 1991
- Gulf War Patriot missile system had an unrecognized clock drift over a 100-hour period -
tracking error of 678 meters
- the software for the F-16 fighter would cause the plane to fly upside down whenever it
crossed the equator
- Berlin 1993, two trains collided - the track was set on the holiday two-way traffic
setting
- Cement factory chip failure drops rocks on cars
- 99 year old man's blood count judged by infant norms
- In Colorado Springs one child was killed, another injured - the traffic light systems
continued in weekend mode and ignored the school schedule - failure getting the time
transmitted to them from the atomic clock in Boulder
- Several leap year problems noted including aluminum smelter
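The Patriot clock drift listed above is commonly explained in published analyses as a fixed-point truncation error: the system counted time in tenths of a second and multiplied by a binary approximation of 1/10 that had been chopped to a limited number of bits. The following Python sketch reproduces that arithmetic under stated assumptions (23 fractional bits, 10 ticks per second). These parameters come from widely cited accounts, not from the page referenced here, and the function names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23      # Assumed number of binary fraction bits in the stored constant.
TICKS_PER_SECOND = 10   # Assumed: the clock counted tenths of a second.


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 chopped (not rounded) to the given number of binary fraction bits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def drift_seconds(hours: float, bits: int = FRACTION_BITS) -> float:
    """Accumulated clock error after running continuously for the given number of hours."""
    ticks = int(hours * 3600 * TICKS_PER_SECOND)
    error_per_tick = Fraction(1, 10) - truncated_tenth(bits)
    return float(ticks * error_per_tick)


if __name__ == "__main__":
    # Roughly a third of a second of drift after 100 hours of continuous operation.
    print(round(drift_seconds(100), 4))  # 0.3433
```

The resulting tracking error in meters depends on the speed of the target, which is not modeled in this sketch.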
http://www.granite.ab.ca/year2000/incidents.htm
Embedded SystemS Problem (ESSP) Ltd
- Embedded systems are used extensively to control and monitor engineering and
manufacturing processes. They underpin the whole of the world's manufacturing and
engineering base: energy (oil, coal, gas, nuclear), planes, ships, pharmaceutical
industries, food, drink and clean water, car manufacturing, national and international
defence, railway networks, telecommunications, medical equipment and broadcast media.
They are found in washing machines, microwave ovens, video recorders, alarms/intruder
detection systems and central heating controllers. They control temperature, lighting,
air conditioning and security access in many offices. And they also support point of sale
equipment, cash dispensers and traffic management in a typical High Street.
- During 1995, more than 200 million PCs were shipped worldwide. In the same period, the
number of embedded systems shipped exceeded 3 billion. According to research conducted
over the past year, around 5% of simple embedded systems were found to fail
Millennium Bug tests. For more sophisticated embedded systems, failure rates of
between 50% and 80% have been reported (Action 2000 UK Government Taskforce). In our own
experience however we have found it closer to 15%-20% in processor intensive
industries.
http://www.ccta.gov.uk/mill/embed.htm
Golden Y2K Immunization Rules by ICONOCLAST, CORUM Research Group, Geneva - 16 February 1999
[http://www.TheInternetFoundation.Org/foot-tif.htm]