Measuring business continuity risk is crucial for protecting your financial institution from operational, financial, and reputational damage. When you understand your level of business continuity risk, you can determine if you're comfortable with that risk or need to enhance your preparedness. This applies whether you're quantifying risk for the first time or monitoring changes in your risk exposure.
Quantifying business continuity risk is made easier with the help of Key Business Continuity Indicators. They come in two forms: key risk indicators (KRIs) and key performance indicators (KPIs).
What are KRIs for Business Continuity?
Key risk indicators measure potential risks that could impact the effectiveness of the business continuity plan. They serve as an early indicator of a change in the risk environment – whether that means an increase in risk, a decrease in risk, or a new risk. They give your institution an opportunity to adjust to an evolving risk environment by adding or enhancing risk mitigation controls.
Here are some business continuity KRIs your institution should monitor:
- Cybersecurity threats: No cyber resiliency program in place, or no monitoring and assessment of attempted cyberattacks and breaches.
- Manual workarounds for critical business functions: No manual workarounds developed and documented for each critical business function (required recovery in less than 24 hours).
- Testing manual workarounds for each vital business function: Manual workarounds are developed, but they have not been tested, or staff have not been trained to use them to recover vital business functions (required recovery in less than 8 hours).
- System downtime frequency and duration: Track the number of times critical systems are down and the length of each downtime.
- Unresolved vulnerabilities: Number of identified but unresolved vulnerabilities in critical systems.
- Backup failure rate: Percentage of failed backups over a defined period.
- Recovery point objectives (RPO) and recovery time objectives (RTO): An RPO is the maximum acceptable amount of data loss measured in time. It answers the question: "Up to what point in time can we recover data after a disruption?" An RTO is the target recovery time for IT and business activities after a disaster has occurred. It answers the question: "How quickly must systems and services be restored to avoid significant business impact?" Watch for RPOs and RTOs that go unmeasured, are measured inaccurately, or are not adhered to (i.e., targets that are not met or exceeded).
- Third-party resilience risk exposure: Inability to assess the reliability and resilience of key vendors and suppliers, including their own BCP and recovery capabilities. Track the number of critical third-party vendors that lack verified business continuity plans or that refuse to provide verified recovery point objectives (RPO) and recovery time objectives (RTO) for critical systems.
- Employee turnover rate: Rate of turnover among employees with key roles in the business continuity plan.
- Staff availability and redundancy: Lack of trained personnel available to execute the BCP and recover vital and critical business processes. Ensure key roles have backups.
- Training gaps: Number of key employees who are unclear about their roles and responsibilities during a crisis event.
- Resource shortages: Instances of critical resource shortages during tests or incidents.
- Incident frequency: Number of business continuity incidents occurring within a defined period.
- Incident response times: Slow or ineffective responses to incidents that trigger BCP activation (both tests and real-world events).
- Customer impact and feedback: Inability to monitor any disruptions in customer service continuity.
- Financial impact analysis: Failure to analyze the financial impact of potential business disruptions or to ensure adequate resources are allocated for recovery.
- Findings: Review findings from internal and external audits related to business continuity planning and implementation. If there is an increasing number, it might indicate a systemic problem. If there are too few, it might be a sign you aren’t taking a close enough look at your program.
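Several of the KRIs above are simple counts or percentages that can be compared against a tolerance. The sketch below is one hypothetical way to do that in Python. The `KriReading` class, the sample values, and every threshold are illustrative assumptions, not recommended limits; your institution would set its own.

```python
from dataclasses import dataclass


@dataclass
class KriReading:
    """One measured KRI and the tolerance it is compared against (hypothetical structure)."""
    name: str
    value: float
    threshold: float  # assumed tolerance; set by your institution's risk appetite

    @property
    def breached(self) -> bool:
        # A KRI is flagged when the measured value exceeds its tolerance.
        return self.value > self.threshold


def backup_failure_rate(failed: int, attempted: int) -> float:
    """Percentage of failed backups over a defined period."""
    if attempted == 0:
        raise ValueError("no backups were attempted in the period")
    return 100.0 * failed / attempted


# Illustrative sample data only.
readings = [
    KriReading("Backup failure rate (%)", backup_failure_rate(3, 120), threshold=2.0),
    KriReading("Unresolved critical vulnerabilities", 4, threshold=5),
    KriReading("Critical vendors without verified BCP", 2, threshold=0),
]

flagged = [r.name for r in readings if r.breached]
print(flagged)
```

With these sample numbers, the backup failure rate and the vendor count exceed their assumed tolerances, so those two KRIs would be raised for review.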
What Are KPIs for Business Continuity?
Key performance indicators are measurable benchmarks that demonstrate how effectively an institution is achieving its key business objectives. KPIs help businesses understand if they are on track to meet their goals and where they might need to make improvements. Examples below:
- Cybersecurity: Cyber resiliency program adheres to widely adopted standards such as the NIST Cybersecurity Framework (CSF) 2.0. Note: Framework components may have their own separate KPIs.
- Manual workarounds for critical business functions: Manual workarounds identified, developed, and documented for each critical (required recovery less than 24 hours) business function.
- Manual workaround tests: Manual workarounds are tested for vital (required recovery less than 8 hours) business functions to confirm viability.
- RTO compliance rate (for on-prem systems): Percentage of systems recovered within the target Recovery Time Objective.
- RPO compliance rate (for on-prem systems): Percentage of data restored within the target Recovery Point Objective.
- RTO compliance rate (third-party hosted systems): Percentage of third-party hosted systems whose RTO meets or exceeds expectations, as confirmed by documented vendor test results.
- RPO compliance rate (third-party hosted systems): Percentage of third-party hosted systems whose RPO meets required expectations, as confirmed by documented vendor test results.
- Crisis management / Executive leadership exercise: Executive leadership and members of the crisis management team participate in mock crisis events to test communication channels and discuss strategic crisis decision making in advance of an actual event (minimum 1 per year).
- Tabletop exercise rate: Number of tabletop exercises conducted with all department leaders involved (minimum 1 per year).
- Plan activation/incident response time: Average time taken to respond to a business continuity incident.
- Recovery testing success rate: Percentage of successful recovery tests conducted annually.
- Employee training completion rate: Percentage of employees who have completed business continuity training.
- Plan update frequency: Number of times the business continuity plan is reviewed and updated annually.
- System downtime: Total downtime of critical systems during the reporting period.
- Backup success rate: Percentage of successful backups completed within a defined period.
- Air gap backup effectiveness: Confirmation that air-gapped backups (protected from cyber compromise) are successfully archived.
- Communication failures: Instances where communication failures were identified during an incident or test. A related measure is the percentage of stakeholders who received and understood BCP communications during an incident.
- Resource availability: Percentage of required resources (staff, equipment, etc.) available during a business continuity event.
- Cost of recovery: Average cost incurred to recover from business continuity incidents.
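As a rough illustration of the RTO compliance rate KPI above, the following Python sketch computes the percentage of systems recovered within their target RTO. The system names, targets, and recovery times are made-up sample data, and `rto_compliance_rate` is a hypothetical helper rather than part of any particular tool.

```python
from datetime import timedelta

# Hypothetical recovery test results: (system, target RTO, actual recovery time).
test_results = [
    ("core banking", timedelta(hours=4), timedelta(hours=3, minutes=30)),
    ("online banking", timedelta(hours=8), timedelta(hours=9)),
    ("wire transfers", timedelta(hours=8), timedelta(hours=6)),
    ("email", timedelta(hours=24), timedelta(hours=20)),
]


def rto_compliance_rate(results) -> float:
    """Percentage of systems recovered within their target RTO."""
    if not results:
        raise ValueError("no recovery test results supplied")
    met = sum(1 for _, target, actual in results if actual <= target)
    return 100.0 * met / len(results)


rate = rto_compliance_rate(test_results)
# Systems that missed their target are the ones to follow up on.
missed = [name for name, target, actual in test_results if actual > target]
print(f"RTO compliance: {rate:.1f}%  missed: {missed}")
```

The same calculation could be applied to RPO by swapping recovery times for measured data loss windows.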
The Importance of Measuring Key BCP Indicators
Key business continuity indicators help financial institutions guard against operational threats. Early identification of business continuity risk lets financial institutions:
- Manage strategic goals: Understanding the operational risks of new product launches or expansions ensures that business decisions align with continuity requirements.
- Protect against financial loss: Financial losses may arise from extended downtime, data loss, or a loss of customer confidence.
- Enhance operational efficiencies: Recognizing business continuity weaknesses empowers institutions to streamline processes, improve controls, and reduce the likelihood of operational failures.
When used correctly, these indicators help you address business continuity risk proactively, saving your institution from costly disruptions and recovery efforts.
Key Business Continuity Risk Indicators by Category
Rather than break down indicators by KPIs or KRIs, it can be convenient to look at key business continuity indicators by category. Let's examine some categories your institution should measure.
Recovery time objective (RTO) – RTO is the maximum tolerable length of time a business process can be down after a disaster. Financial institutions should track:
- Actual recovery times versus expected RTOs for critical systems and business processes
- Percentage of systems and processes meeting their RTOs during tests
- Frequency of RTO reviews and updates
Recovery point objective (RPO) – RPO represents the maximum amount of data loss an organization can tolerate. Key indicators for RPO include:
- Actual data loss versus expected RPOs during tests or real incidents
- Frequency of data backups
- Success rate of data restoration tests
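Because RPO is data loss measured in time, "actual data loss versus expected RPO" can be approximated as the gap between the last restorable backup and the moment of disruption. The sketch below assumes that interpretation; the timestamps, the 4-hour RPO, and the `data_loss_window` helper are illustrative only.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, disruption: datetime) -> timedelta:
    """Time between the last restorable backup and the disruption."""
    if last_good_backup > disruption:
        raise ValueError("backup timestamp is after the disruption")
    return disruption - last_good_backup


rpo = timedelta(hours=4)  # assumed target for this example
loss = data_loss_window(
    last_good_backup=datetime(2024, 5, 1, 22, 0),
    disruption=datetime(2024, 5, 2, 3, 30),
)
met_rpo = loss <= rpo
# Positive overshoot means the test or incident lost more data than the RPO allows.
overshoot = max(loss - rpo, timedelta(0))
print(f"data loss: {loss}, RPO met: {met_rpo}, overshoot: {overshoot}")
```

In this sample, five and a half hours of data would be lost against a four-hour target, which suggests the backup frequency is too low for that system.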
Business continuity plan testing – Regular testing is crucial for maintaining an effective BCP. Key indicators include:
- Frequency of BCP exercises (tabletop exercises, functional exercises, crisis management exercises, etc.)
- Percentage of successful recovery tests versus total tests conducted
- Time taken to complete each phase of the BCP during tests
- Number of issues identified during tests and their resolution time
Employee training and awareness – Ensuring staff are prepared for potential disruptions is vital. Track:
- Percentage of employees who have completed BCP training
- Frequency of BCP awareness programs
- Employee feedback on the clarity and effectiveness of BCP training
Incident response time – How quickly your institution can respond to a disruptive event is critical. Monitor:
- Average time to activate the BCP after an incident is detected
- Time taken to assemble the crisis management team
- Duration of the initial impact assessment process
System availability and uptime – Maintaining high availability of critical systems is essential. Track:
- Percentage uptime of critical systems
- Frequency, duration, and impact of unplanned outages
- Mean Time Between Failures (MTBF) for key systems
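Uptime percentage and MTBF can both be derived from a simple outage log. The sketch below uses the common definitions (uptime as operating time divided by total time, and MTBF as operating time divided by the number of failures) and assumes a 30-day reporting period with made-up outage durations. The function names are hypothetical.

```python
from typing import Sequence


def uptime_percent(period_hours: float, outage_hours: Sequence[float]) -> float:
    """Share of the reporting period during which the system was available."""
    if period_hours <= 0:
        raise ValueError("reporting period must be positive")
    return 100.0 * (period_hours - sum(outage_hours)) / period_hours


def mtbf_hours(period_hours: float, outage_hours: Sequence[float]) -> float:
    """Mean Time Between Failures: operating time divided by number of failures."""
    if not outage_hours:
        raise ValueError("MTBF is undefined when no failures occurred")
    return (period_hours - sum(outage_hours)) / len(outage_hours)


period = 30 * 24             # 30-day reporting period, in hours
outages = [1.5, 0.5, 4.0]    # illustrative unplanned outage durations, in hours

uptime = uptime_percent(period, outages)
mtbf = mtbf_hours(period, outages)
print(f"uptime: {uptime:.2f}%  MTBF: {mtbf:.1f} h")
```

Tracking both figures together is useful because a system can show high uptime while still failing often in short bursts.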
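Third-party vendor resilience – Your institution's resilience depends on your critical vendors. Monitor whether each critical vendor has a verified business continuity plan and provides documented test results for its recovery time and recovery point objectives.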
Cyber resiliency – Develop and implement an effective cyber resiliency program that adheres to a standard like NIST CSF 2.0. Measure your cyber resiliency program in each category:
- Governance
- Identification
- Protection
- Detection
- Response
- Recovery
Cybersecurity preparedness – A component of cyber resiliency, cybersecurity is integral to business continuity. Track:
- Number of identified cybersecurity vulnerabilities and their remediation time
- Frequency of cybersecurity incident response plan testing
- Success rate of phishing awareness tests among employees
Communication effectiveness – Clear communication during a crisis is crucial. Monitor:
- Time taken to send initial notifications to stakeholders during an incident
- Effectiveness of communication channels (e.g., percentage of messages successfully delivered)
- Stakeholder feedback on the clarity and timeliness of crisis communications
Financial impact of disruptions – Understanding the financial implications of disruptions helps justify BCP investments. Track:
- Estimated cost of downtime for critical business processes
- Actual costs incurred during disruptions or BCP activations
- Return on Investment (ROI) of BCP initiatives
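The financial indicators above can be estimated with basic arithmetic. The sketch below assumes a flat hourly cost of downtime and treats avoided downtime cost as the benefit of a BCP investment. Both are simplifying assumptions, and all figures and function names are hypothetical.

```python
def downtime_cost(hourly_cost: float, hours_down: float) -> float:
    """Estimated cost of downtime, assuming a flat hourly cost."""
    return hourly_cost * hours_down


def bcp_roi_percent(avoided_loss: float, investment: float) -> float:
    """ROI as (benefit - cost) / cost, expressed as a percentage."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return 100.0 * (avoided_loss - investment) / investment


# Illustrative figures: a critical process costing 25,000 per hour of downtime,
# with an estimated 12 hours of downtime avoided thanks to BCP measures.
avoided = downtime_cost(hourly_cost=25_000, hours_down=12)
roi = bcp_roi_percent(avoided_loss=avoided, investment=120_000)
print(f"avoided loss: {avoided:,.0f}  ROI: {roi:.0f}%")
```

A real estimate would likely vary the hourly cost by process and time of day, and would include reputational and regulatory costs that are harder to quantify.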
Solutions for Managing Business Continuity
Given the number of key BCP indicators financial institutions must measure, relying on manual planning and documentation simply doesn't make sense. The stakes for failing to maintain operational resilience are too high.
Business continuity software paired with enterprise-level risk management software gives banks, credit unions, mortgage companies, and other organizations a sophisticated toolkit for ensuring resilience – building tested business continuity plans and assessing continuity risk. Together, they provide the tools and knowledge to understand the risk environment and protect your institution.
Waiting until a real disaster strikes to identify weaknesses in your business continuity plan puts your institution at a disadvantage that can be hard to bounce back from. It can lead to scrambling and greater costs and consequences than proactively identifying and addressing issues at the earliest opportunity.
Embracing key business continuity indicators and adopting business continuity and risk management technology enables you to devote more time to growing your institution while ensuring its resilience in the face of potential disruptions.