170: Errors Could Not Be Corrected/Mitigated

Definition/Typical Issues
Was the system designed such that personnel were unable to recover from errors before a failure occurred?
Was there insufficient time to respond (i.e., take action) to address or correct the situation?
Examples
Example 1
- A computer operator started an automatic operating sequence controlled by a distributed control system before the valving lineups in the process area had been completed. Even though operators in the field called in to tell the operator to stop the operation, the computer was not programmed to allow interruption of the sequence once it started. As a result, process flow was routed to waste.
Example 2
- During startup, the operator failed to check the tank level prior to starting the pump. Shortly after the pump was started, a low tank level alarm occurred, indicating insufficient level for the pump drawing suction from the tank. By the time the operator was able to stop the pump, the pump was already damaged.
Example 3
- Samples were drawn from each batch prior to shipment. However, the batches were often sent out before the analysis of the samples was complete. As a result, when a sample indicated an unacceptable batch, the delivery could not be stopped before it reached the customer. The customer had to be called and asked to ship the batch back.
Example 4
- A high level alarm sounded in the control room. To prevent an overflow, the operator had to locally shut down a pump. By the time the operator was able to get to the location of the pump, the tank had already started to overflow.
Example 5
- A low temperature alarm on a heater activated. The operator immediately began to increase fuel gas flow to the heater to bring temperature back up. However, before the temperature could be stabilized, the heater tripped out on low temperature.
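The failure in Example 1 came from a sequence that could not be interrupted once started. A minimal sketch of one way to make a sequence interruptible follows, assuming the sequence can be broken into discrete steps with a stop request checked before each one. The names `run_sequence`, `stop_event`, and the step names are hypothetical and do not represent any particular distributed control system.

```python
import threading


def run_sequence(steps, stop_event):
    """Run steps in order, checking for a stop request before each one.

    Returns the names of the steps that actually ran.
    """
    completed = []
    for name, action in steps:
        if stop_event.is_set():
            break  # a stop was requested; do not start the next step
        action()
        completed.append(name)
    return completed


stop_event = threading.Event()
log = []

# Hypothetical steps for illustration only.
steps = [
    ("open_feed_valve", lambda: log.append("feed valve open")),
    # Stands in for a field operator calling in to stop the sequence.
    ("field_call", stop_event.set),
    ("start_transfer_pump", lambda: log.append("pump running")),
]

completed = run_sequence(steps, stop_event)
```

In this sketch the stop request arrives after the first step, so the pump step never starts. A real sequence would also need to define what safe state the equipment is left in when it is interrupted.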
Typical Recommendations
- Design safety- and quality-related equipment so that detected errors can be corrected before system failure occurs.
- Design tasks and related procedures to allow employees time to detect and correct errors for safety- and quality-critical tasks.
- Modify equipment to go to a safe state or mode when problems are detected.
- Design equipment to recover from abnormal conditions with no or limited human intervention.
- Develop means to alert personnel to situations requiring attention sooner in order to allow additional response time.
- Modify the system to reduce the time required to travel to the location where the task needs to be performed. For example, use a valve handle extender to allow a valve on the third level to be operated from the first level.
- Provide a remote means of actuating the component or performing the task. For example, provide a control for a heater in the control room in addition to the local control.
- Modify the system to allow additional response time. For example, reduce fill rates to allow additional response time following a high level alarm.
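The effect of reducing fill rates, as in the last recommendation above, can be checked with simple arithmetic: the time available after a high level alarm is the remaining volume divided by the fill rate, and it must exceed the time needed to reach the equipment and act (compare Example 4). The sketch below assumes a constant fill rate; the function names and all numbers are hypothetical and chosen only for illustration.

```python
def available_response_time_min(alarm_volume, overflow_volume, fill_rate):
    """Minutes between the high level alarm and overflow at a constant fill rate.

    Volumes are in liters and fill_rate is in liters per minute (assumed units).
    """
    if fill_rate <= 0:
        raise ValueError("fill_rate must be positive")
    return (overflow_volume - alarm_volume) / fill_rate


def can_respond_in_time(available_min, travel_min, action_min):
    """True if travel time plus action time fits within the available time."""
    return travel_min + action_min <= available_min


# Hypothetical tank: alarm at 9,000 L, overflow at 10,000 L.
t_fast = available_response_time_min(9000, 10000, fill_rate=200)
t_slow = available_response_time_min(9000, 10000, fill_rate=100)

# Hypothetical operator: 6 minutes to reach the pump, 1 minute to shut it down.
ok_fast = can_respond_in_time(t_fast, travel_min=6, action_min=1)
ok_slow = can_respond_in_time(t_slow, travel_min=6, action_min=1)
```

With these assumed numbers, halving the fill rate doubles the available time from 5 to 10 minutes, which is enough to cover the 7 minutes the operator needs. Shortening the travel time or adding a remote shutdown would reduce the required time instead.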
Cross-References
Version 10 Element(s)

| Node ID | Node Name |
|---|---|
| 162 | Errors Not Correctable |

Maritime Element(s)

| Node ID | Node Name |
|---|---|
| 148 | Insufficient Time to Respond |
| 177 | Errors Cannot Be Corrected/Mitigated |