HAZOP stands for Hazard and Operability Study, a systematic method for identifying hazards and evaluating safeguards. The methodology is in line with internationally recognised risk assessment standards such as IEC EN 61511-2 and EN ISO 17776.
HAZOP is a structured “brainstorming”, based on assumed deviations from normal operation that could represent hazards. The system is divided into manageable sections (nodes), deviations are determined using parameters and key words, then possible consequences as well as existing safeguards are identified. The HAZOP-Team decides whether the safeguards are sufficient based on risk evaluation (usually based on Client’s Risk Acceptance Criteria). If residual risk is not acceptable or ALARP, additional measures or further investigation are recommended.
IEC 61511 defines the Safety Life Cycle (SLC) for Functional Safety Management. The initial Hazard and Risk Assessment (HRA) shall take place during the Analysis phase, based on an early version of the P&IDs and the Cause & Effect (C&E) chart. After completion of the initial HAZOP, Management of Change (MOC) procedures should be applied to monitor and evaluate any changes during the further design and implementation phases that might affect HAZOP assumptions or safeguards. The HAZOP should be reviewed and updated prior to commissioning as part of a pre-startup safety review (PSSR) procedure, and then throughout the plant operating life at intervals of no more than five years.
HAZOP is attended by key members of the Client’s (Owner / Operator), Engineer’s and Contractor’s project teams, covering the following disciplines:
- Process / Design Engineer
- Project Engineer
- Instrument/Electrical Engineer
- O&M Representative
- Safety Engineer
The HAZOP is led by an independent Chairman and findings are recorded by the HAZOP Scribe.
- HAZOP is not an engineering review meeting. If problems are identified, they shall be referred for separate action
- HAZOP is generally qualitative, based on the experience of the participants. Risk ranking is carried out against the main criteria ‘safety’ and ‘environment’ (financial impact by agreement). Where the residual risk is ALARP, the team shall consider whether additional action is required
- ‘Frozen’ documents are required; changes shall be evaluated by a separate MOC procedure
- Causes are generally to be taken within the node only; however, influences from upstream / downstream project interfaces shall be considered
- Consequences and safeguards may occur anywhere inside and outside the node (however, only within the HAZOP scope as defined in the P&IDs)
For a remote HAZOP, an additional etiquette for video-conferencing should be applied.
Guidewords are the tools that are used to systematically direct the HAZOP Study. They are words or phrases that, when considered together with a parameter, form a hypothetical deviation for the Team to consider. The basic guide words are: NO, MORE, LESS, AS WELL AS, PART OF, REVERSE and OTHER THAN.
Guidewords are combined with process parameters such as FLOW, PRESSURE, TEMPERATURE and LEVEL, or with general parameters such as MAINTENANCE, START-UP / SHUTDOWN and others, which describe operations rather than physical aspects of the process.
The combination of Guideword and Parameter gives a Deviation Matrix which identifies departures from the design intention. This is the starting point for the HAZOP-Team to brainstorm causes, consequences, safeguards and residual risk of the identified hazards.
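The way guidewords and parameters combine into a deviation matrix can be sketched in a few lines of Python. This is an illustration only: the guidewords and process parameters are those listed above, while the function name `deviation_matrix` and the set of excluded combinations are assumptions made for the example, not part of any standard.

```python
from itertools import product

# Guidewords and process parameters as listed in the text above.
GUIDEWORDS = ["NO", "MORE", "LESS", "AS WELL AS", "PART OF", "REVERSE", "OTHER THAN"]
PARAMETERS = ["FLOW", "PRESSURE", "TEMPERATURE", "LEVEL"]

# Assumption: combinations a team might agree are not meaningful.
# Illustrative only; a real study defines its own exclusions.
NOT_MEANINGFUL = {("REVERSE", "TEMPERATURE"), ("REVERSE", "LEVEL")}


def deviation_matrix(guidewords, parameters, excluded=frozenset()):
    """Return each guideword/parameter pair as a deviation string."""
    return [
        f"{guideword} {parameter}"
        for parameter, guideword in product(parameters, guidewords)
        if (guideword, parameter) not in excluded
    ]


deviations = deviation_matrix(GUIDEWORDS, PARAMETERS, NOT_MEANINGFUL)
print(len(deviations))  # 26 (7 x 4 combinations minus 2 exclusions)
print(deviations[:3])   # ['NO FLOW', 'MORE FLOW', 'LESS FLOW']
```

Each string in the resulting list is one prompt for the team to brainstorm causes, consequences and safeguards.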
Each HAZOP deviation is analysed identifying the potential Causes, including:
- Malfunction of process control systems;
- Operational Error (e.g. opening wrong valve, failure to respond to signals, etc.);
- Faulty maintenance activities;
- Failure of utilities (power supply, control system, or other utilities).
Further guidelines in relation to causes are:
- Equipment is assumed to be designed to the appropriate earthquake, wind, storm and other environmental requirements; these are therefore not treated as Causes
- Manual valves marked as LO/LC/CSO/CSC (locked open / locked closed / car-sealed open / car-sealed closed) are assumed not to be moved by operator error, so a change of their position is not considered as a Cause
For each realistic deviation identified, the associated Consequences are analysed without accounting for the existing safeguards. Related hazards are listed such as: fire, explosion, release of flammable or toxic material, environmental damage, off-spec. products, loss of production, etc.
Consequences may cause hazard/risk to the following protection categories:
- Personal safety (site personnel, off-site third parties)
- Asset, Financial impact
- Owner/Operator Reputation
In evaluating the consequences, it is normal to account for design margins when assessing whether a loss of containment (LOC) can occur. For example, LOC due to overpressure is typically considered credible only when the pressure can exceed the hydraulic test pressure, rather than merely the design pressure (DP) or Maximum Incidental Pressure (MIP) of the pipe.
For credible deviation/consequence combinations, the HAZOP-Team considers what safeguards/mitigating features exist and whether these are sufficient, depending upon the severity of the expected outcomes. Valid safeguards have to be independent of the systems/devices that cause the deviation under analysis.
Typical safeguards are:
- Independent systems (e.g. SSV, ESD)
- PSV’s / TSV’s
- Alarms in a manned Control Room (only when the operator / dispatcher has sufficient intervention time and there is a clear procedure for action)
Devices generally NOT considered as valid Safeguards:
- Local Alarms
- Alarms in Control Room when the time gap between Alarm activation and Consequence is not sufficient for operator / dispatcher effective intervention
- Indication without Alarm (even if in Control Room)
- Check valves are not considered as a means for eliminating backflow, but credit may be given for reduction of backflow
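The screening rules above can be expressed as a simple check. The sketch below is a hypothetical illustration: the `Safeguard` record, its field names and the `kind` labels are assumptions made for the example, and a real study relies on team judgement rather than an automated rule.

```python
from dataclasses import dataclass


@dataclass
class Safeguard:
    name: str                       # hypothetical tag or description
    kind: str                       # "trip", "relief", "control_room_alarm", "local_alarm" or "indication"
    shares_cause_system: bool       # True if it depends on the system causing the deviation
    response_time_min: float = 0.0  # time the operator needs to act (alarms only)
    has_procedure: bool = False     # a clear procedure for action exists (alarms only)


def is_valid_safeguard(sg: Safeguard, time_to_consequence_min: float) -> bool:
    """Apply the screening rules listed above to one candidate safeguard."""
    if sg.shares_cause_system:
        return False  # not independent of the cause
    if sg.kind in ("local_alarm", "indication"):
        return False  # generally not credited
    if sg.kind == "control_room_alarm":
        # Credited only with a procedure and enough time to intervene.
        return sg.has_procedure and sg.response_time_min < time_to_consequence_min
    return True  # independent trips and relief devices


alarm = Safeguard("High level alarm", "control_room_alarm", False, 10.0, True)
print(is_valid_safeguard(alarm, time_to_consequence_min=30.0))  # True
print(is_valid_safeguard(alarm, time_to_consequence_min=5.0))   # False
```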
Typical examples of Safety Instrumented Safeguards are:
- High High Level Trips
- High High Pressure automatic shutdowns
- Low Low Level shutdowns
Non-Instrumented Safeguards are functions which do NOT initiate an automatic trip or shutdown action by themselves, but that can either prevent or help control or mitigate the issue and may be considered as an Independent Layer of Protection (ILP). Typical examples of Non-Instrumented Safeguards are:
- High Level Alarms
- Mechanical overfill protection
- High Pressure vents (PSVs)
- Procedures (clearly defined, tested and auditable)
- Check valves are not considered as a means for eliminating backflow, but credit may be given for reduction of backflow
Residual risk is the combination of Consequence × Likelihood after consideration of safeguards (note: safeguards generally reduce the likelihood of a top event). Consequence is evaluated qualitatively in the range from 1: Minor … 5: Catastrophic. Likelihood (frequency) is defined in the range A: Not expected … E: Highly probable. The residual risk is normally allocated a colour code:
GREEN: Acceptable – no action required
YELLOW: ALARP – monitoring required, consider reducing risk further according to BEP / RAGAGEP, if feasible
ORANGE: ALARP – take action, reduce risk further according to BEP / RAGAGEP
RED: Unacceptable – take immediate action to mitigate risk
ALARP: As low as reasonably practicable
BEP: Best Engineering Practice
RAGAGEP: Recognized And Generally Accepted Good Engineering Practice
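The colour coding can be illustrated with a small lookup. The sketch below is hypothetical: the cell-by-cell matrix is not given here, so the numeric scoring (consequence multiplied by likelihood rank) and the band thresholds are assumptions standing in for the Client's Risk Acceptance Criteria, which take precedence in any real study.

```python
# Likelihood classes A (Not expected) to E (Highly probable), ranked 1 to 5.
LIKELIHOOD_RANK = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def risk_colour(consequence: int, likelihood: str) -> str:
    """Map a consequence (1-5) and likelihood (A-E) to a colour band.

    Assumption: score = consequence x likelihood rank, with illustrative
    thresholds. A real matrix is defined cell by cell by the Client.
    """
    if consequence not in (1, 2, 3, 4, 5):
        raise ValueError("consequence must be 1..5")
    if likelihood not in LIKELIHOOD_RANK:
        raise ValueError("likelihood must be one of A..E")
    score = consequence * LIKELIHOOD_RANK[likelihood]
    if score <= 4:
        return "GREEN"
    if score <= 9:
        return "YELLOW"
    if score <= 15:
        return "ORANGE"
    return "RED"


print(risk_colour(1, "A"))  # GREEN
print(risk_colour(3, "C"))  # YELLOW
print(risk_colour(5, "C"))  # ORANGE
print(risk_colour(5, "E"))  # RED
```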
Double jeopardy is the occurrence of two independent deviations or two initiating events at the same time. In determining if the causes are independent, careful consideration should be given to common mode failures and process dependencies. Failure of a safety device in response to a control system failure or human error would not be considered double jeopardy. For example, a PSV failing to open on demand in response to high pressure is not double jeopardy.
Care needs to be taken when excluding events because of ‘double jeopardy’. If the HAZOP-Team decides that a particular cause is credible, this shall be evaluated regardless of whether this is ‘double jeopardy’ or not.
CHAZOP (Control Hazard and Operability Study) identifies hazard and operability issues associated with control and computer systems. It follows the same methodology as HAZOP, but uses appropriate documents and Guidewords.
CHAZOP documentation requirements include: Control System Network Architecture Schematic, Cause & Effect Charts, Control / SIS Block Diagrams, ESD / F&G interface & electrical protection schematics, Alarm / Signal List, Control Philosophy / Narratives.
Delta-HAZOP (‘HAZOP Revalidation’ or ‘HAZOP-by-Difference’) is a method of updating the HAZOP for an existing (or similar) plant, while ensuring that the conclusions of previous HAZOP studies remain valid and are not lost. It requires a clear understanding of the changes compared to the previous HAZOP, which are generally documented via MOC procedures.
The following information should be available:
- Overview of changes made since the last HAZOP (preferably as mark-up of current P&IDs)
- Is MOC up to date?
- Were there any changes to an alarm or safety system?
- Did any of the changes require modification to:
  - Operating conditions outside the operating range?
  - Chemistry of the process?
  - Timing or sequencing of the operations?
  - Maintenance procedures or staffing level?
- Does the change affect safety or the environment?
- Operator experience, Lessons learned, Root-cause analysis?
If significant changes have taken place, a new HAZOP (redo) should be done.
EHAZOP (Electrical Hazard and Operability Study) identifies hazard and operability issues associated with electrical systems. It follows the same methodology as HAZOP, but uses appropriate documents and guidewords.
EHAZOP documentation requirements include: Single line diagrams, Layouts, Consumer Load List (normal/essential), ESD / F&G interface & electrical protection schematics.
EHAZOP may include:
- SAFAN – Safety Analysis
- SYSOP – System security and operability review
- OPTAN – Operator task analysis
Cyber-HAZOP (or Cyber-PHA) identifies cybersecurity risks associated with an Industrial Control System (ICS) or Safety Instrumented System (SIS). It follows the same methodology as HAZOP, but uses appropriate documents and guidewords in accordance with IEC 62443.
Cyber-HAZOP documentation requirements include: System Architecture Diagram, Zone & Conduit Drawings, Cyber Network Schematics.
HAZID (Hazard Identification) is a method for general hazard identification, normally carried out prior to HAZOP (which focuses more on process hazards). HAZID allows a high-level evaluation of risk at an early stage of a project, covering issues such as facility siting, human factors, major accident hazard (MAH) scenarios, environmental impact and best engineering practices. It follows the same methodology as HAZOP, but uses appropriate documents and Guidewords.
HAZID documentation requirements include: Location and Layout Plans, Process Flow Diagrams, MSDS for handled fluids, Table of Emissions (Air, Water, Ground, Noise, etc.).
SIL selection is the process of assigning a SIL (Safety Integrity Level) to a safety function in order to close the gap between residual risk (evaluated in HAZOP) and tolerable risk (as per Client Risk Matrix).
SIL is defined in IEC 61508/61511 for low demand operation by the average probability of failure on demand (PFD avg) or by its inverse, the Risk Reduction Factor (RRF), as follows:
| SIL | PFD (avg) | PFD (power) | RRF |
|---|---|---|---|
| 1 | 0.1–0.01 | 10⁻¹–10⁻² | 10–100 |
| 2 | 0.01–0.001 | 10⁻²–10⁻³ | 100–1000 |
| 3 | 0.001–0.0001 | 10⁻³–10⁻⁴ | 1000–10,000 |
| 4 | 0.0001–0.00001 | 10⁻⁴–10⁻⁵ | 10,000–100,000 |
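As a worked illustration of the table, the sketch below maps an average PFD to its SIL band and to the corresponding RRF (RRF = 1 / PFD). The function names are hypothetical, and the treatment of the band edges (lower bound inclusive, upper bound exclusive) is an assumption made for the example.

```python
from typing import Optional

# (SIL, lower PFD bound inclusive, upper PFD bound exclusive), per the table above.
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_from_pfd(pfd_avg: float) -> Optional[int]:
    """Return the SIL band for a low-demand PFD (avg), or None if outside SIL 1-4."""
    if not 0 < pfd_avg <= 1:
        raise ValueError("PFD must be in the range (0, 1]")
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


def rrf_from_pfd(pfd_avg: float) -> float:
    """Risk Reduction Factor is the inverse of the average PFD."""
    return 1.0 / pfd_avg


print(sil_from_pfd(0.005))         # 2
print(round(rrf_from_pfd(0.005)))  # 200
print(sil_from_pfd(0.5))           # None (less risk reduction than SIL 1)
```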
Layers of Protection Analysis (LOPA) is a semi-quantitative method for analysing the likelihood of a hazardous event, considering initiating event (IE) frequency and the mitigating effect of various independent protection layers (IPL).
Initiating events are defined in the HAZOP, such as:
- Piping leak or rupture
- BPCS failure
- Equipment failure (loss of containment)
- Human error (depending on frequency and criticality of task performed)
- Loss of utility (e.g. power supply, instrument air)
Independent protection layers (IPLs) credited against these initiating events include:
- Mechanical overpressure protection (e.g. PRV, burst disk)
- BPCS interlock (where control loop is not initiating event)
- Independent safety systems (e.g. BMS for a fired heater)
- Operator response to alarm (sufficient time to adequately respond must be available)
Additional mitigating factors (e.g. F&G systems, spill containment) and conditional modifiers (e.g. ignition probability, presence of personnel) are considered.
After consideration of all IPLs, the gap between residual risk and tolerable risk gives the required SIL of instrumented safety functions (SIF).
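The LOPA arithmetic described above can be sketched as follows. All numbers are hypothetical values chosen for illustration (they are not taken from any standard or database), and the function names are assumptions. The mitigated frequency is the IE frequency multiplied by the PFD of each IPL and by any conditional modifiers; the ratio to the tolerable frequency gives the required risk reduction factor.

```python
import math


def mitigated_frequency(ie_per_year, ipl_pfds, modifiers=()):
    """Hazardous event frequency after crediting IPLs and conditional modifiers."""
    return ie_per_year * math.prod(ipl_pfds) * math.prod(modifiers)


def required_sil(mitigated_per_year, tolerable_per_year):
    """Return (RRF, SIL) needed to close the gap; SIL 0 means below SIL 1."""
    rrf = mitigated_per_year / tolerable_per_year
    if rrf < 10:
        return rrf, 0
    sil = math.floor(math.log10(rrf))  # RRF 10-100 gives 1, 100-1000 gives 2, ...
    if sil > 4:
        raise ValueError("required risk reduction exceeds SIL 4")
    return rrf, sil


# Hypothetical scenario: BPCS failure as initiating event at 0.1 per year,
# credited IPLs are a PRV (PFD 0.01) and operator response to alarm (PFD 0.1),
# with an assumed ignition probability of 0.5 as conditional modifier.
freq = mitigated_frequency(0.1, [0.01, 0.1], [0.5])
rrf, sil = required_sil(freq, tolerable_per_year=1e-6)
print(f"mitigated frequency: {freq:.1e} per year")  # 5.0e-05 per year
print(f"required RRF: {rrf:.0f}, SIL: {sil}")       # required RRF: 50, SIL: 1
```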
The calibrated risk graph method is a semi-quantitative method for determining the required SIL for safety instrumented functions (IEC 61511-3, Annex D). Its basis is a graphical decision chart where, in addition to Demand Rate W and Consequence parameter C, additional factors F (exposure time) and P (probability of avoiding the hazardous event) enable ‘order-of-magnitude’ steps to define the final SIL.
The standard does not quantify demand rate and consequence, therefore parameters W1, W2, W3 and CA, CB, CC, CD must be ‘calibrated’ in line with Client risk acceptance criteria. Consequence criteria are defined for personal safety, environmental damage and asset loss. The calibrated risk graph method generally results in a more conservative SIL selection than the LOPA method.
The calibrated risk graph method as per IEC 61511 defines SIL-a for safety functions with a required risk reduction factor lower than 10 (i.e. below SIL-1). SIL-a functions may be implemented in the BPCS as ‘low integrity safety functions’, but shall be subject to Functional Safety Management, including:
- Safety and quality management
- Independence of safety and control functions
- Documentation control
- Management of change
- Periodic testing
IEC 61511-1:2016, § 3.2.24 defines Functional Safety Assessment (FSA) as: investigation, based on evidence, to judge the functional safety achieved by one or more SIS and/or other protection layers.
FSA should be led by a qualified independent assessor and shall result in a statement that functional safety is achieved for the corresponding SLC-stage. FSA focuses on technical aspects and achieved results in addition to procedures and paperwork (see FS-Audit). Five FSA-stages are defined in IEC 61511-1 Figure 7, of which FSA-3 and FSA-4 are compulsory.
IEC 61511-1:2016, § 3.2.25 defines functional safety audit as: systematic and independent examination to determine whether the procedures specific to the functional safety requirements comply with the planned arrangements, are implemented effectively and are suitable to achieve the specified objectives. Note: A functional safety audit may be carried out as part of an FSA.
The purpose of FS-Audit is to determine if functional safety procedures are being properly applied during project execution (think of it as an ISO-9001 audit for functional safety). The audit aims to identify potential systematic failures in the implementation of FS tasks and activities. In contrast to FSA, audit does not give a statement as to whether the required level of functional safety has been achieved; it only confirms that the correct procedures are being implemented as a prerequisite.
IEC 61511-1:2016, § 3.2.87 defines functional safety verification as: confirmation by examination and provision of objective evidence that the requirements have been fulfilled.
Verification activities take place throughout the SLC to ensure systematic integrity. They consist of checking, review, inspection or testing activities as part of the normal engineering/construction/O&M quality process, however, with focus on functional safety. Verification activities are defined in the Project Functional Safety Plan (PFSP), including inputs, outputs, verification responsibility and timing.
IEC 61511-1:2016, § 3.2.86 defines functional safety validation as: confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled. …this means demonstrating that the SIF(s) and SIS after installation meet the SRS in all respects.
Validation takes place after commissioning and is often treated as interchangeable with the Site Acceptance Test (SAT), i.e. it confirms that the installed SIF/SIS are working correctly in line with the Safety Requirements Specification (SRS). Note: FSA-3 goes a step further and confirms that the requirements themselves are complete, consistent and sufficient to achieve the intended level of safety.
Verification activities take place throughout the SLC and consist of specific checks of individual components or documents.
Audit activities take place throughout the SLC and confirm that functional safety procedures are being applied.
Validation is a specific SLC step, taking place after commissioning, but before introduction of hazards. Validation confirms that the installed SIF/SIS are working correctly in line with the SRS.
FSA confirms correct design/operation in line with functional safety requirements, but also makes a statement as to whether the requirements themselves are correct.
Increasing level of confirmation of functional safety compliance is as follows: Verification -> Audit -> Validation -> FSA
IEC 61511 defines Random Failure as “occurring at a random time, which results from one or more … degradation mechanisms in the hardware.” Such failures occur at predictable rates (see bathtub curve) but at unpredictable (i.e. random) times. Systematic Failures are those “related to a pre-existing fault … which can only be eliminated by … a modification of the design, manufacturing process, operating procedures, documentation or other relevant factors”. Random failures are often defined as ‘hardware-related’, whereas systematic failures are ‘due to human error’. A systematic failure can be eliminated after being detected, while random hardware failures cannot. Implementation of a Functional Safety Management System shall minimise systematic failures.
Examples of random failure are:
- Aging or stress failure of electronic components including:
- Contact failure, soldered joint failure
- PCB/semi-conductor failure
- Relay stiction
- Resistor/capacitor degradation
Examples of systematic failure are:
- Errors during the Analysis phase of the SLC (HAZOP, SIL-Analysis, SRS) – see UK HSE “Out of Control: Why control systems go wrong and how to prevent failure”
- Software “bugs”
- Overlooking common cause failure
- O&M mistakes (SIF bypass, incorrect calibration, etc.)