Operational resilience | Risk Management & Reporting

March 18, 2024

Background

Operational resilience requirements are being rolled out across the UK and beyond. UK finance firms are required to improve their operational resilience in accordance with Financial Conduct Authority (FCA) Policy Statement PS21/3 (the Policy).

The new rules came into force in March 2022 [1], starting a “transitional period” which ends in March 2025. After that date, firms must improve the governance of their key operational systems. They must maintain records that reflect the state of operational resilience of the key resources that support the ongoing delivery of their “important business services”.

In this overview we:

  • Explore the ongoing scenario testing techniques recommended in the Policy to maintain the operational resilience of the resources that underpin important business services.
  • Consider some of the reasons organisations who are pursuing improvements in their operational governance capabilities should investigate:
    • automating processes to reduce pressure on staff time and resources;
    • utilising data-driven solutions to improve the accuracy and relevance of information about operational controls;
    • increasing their line of sight to improve the quality of operational resilience information, and
    • complementing a top-down governance process with bottom-up evidence-based information to support reliable decision-making.

With the Bank for International Settlements (BIS) operational risk management principles [2] at its core, tempered by a 1-in-100-year pandemic and ongoing cyber-attacks, the Policy requires firms to improve the resilience of their business operations.

It highlights that a firm’s risk management and governance processes should focus more closely on a number of key operational dependencies, including processes, people, technology, facilities and information (resources). Scenario testing is then recommended to provide insights into how a disruption to one of these key resources, for example a cyber-attack on a supporting technology system, can be mitigated so that it does not seriously impact a firm’s ability to reliably deliver its services.

The Policy reflects a clear change in accountabilities. It requires that the Board now has ultimate responsibility for operational resilience obligations. While operational resources may require specialist management, the interdependency of many of these resources requires that senior leadership is ultimately accountable for the maintenance and oversight of operational resilience efforts.

In sectors like finance, some aspects of improving the management of operational risk frustratingly reside in the hands of 3rd parties. Even more complexity is added when we consider that operational disruptions can be caused by not only known, but also unknown and even unknowable risks. This means that an effective operational resilience process needs to contemplate each type of risk to reliably ensure the delivery of important business services. It is important to remember that no matter the type of operational risk, the better the breadth and quality of the information about that risk available to managers, the greater the potential for its successful mitigation.

Disruptive events can strike without warning. So it’s not surprising that, after the transitional period, the FCA also expects firms to move beyond scenario-driven analyses to enhance their operational resilience processes. They must continue to identify and strengthen their resilience against the various vulnerabilities that might disrupt the delivery of services, while senior executives must focus on the resilience of the key operational resources that support their important business services.

FCA-supervised firms currently impacted by FCA PS21/3 [1]:

  • Banks
  • Building societies
  • Designated investment firms
  • Insurers
  • Recognised investment exchanges (RIEs)
  • Enhanced scope senior managers and certification regime (SM&CR) firms
  • Entities authorised or registered under the Payment Services Regulations 2017 or the Electronic Money Regulations 2011.

And the prevalence of 3rd party providers in the sector means that the Policy reach is broader still.

The Board must define important business services (IBSs). These are the key services of the firm that, if their delivery were to be interrupted, could potentially cause intolerable harm to consumers, and/or put financial market integrity at risk. IBSs should be reviewed at least annually.

A time-based measure of impact tolerance should be established for each IBS. That is, how long it takes before a disruption to an underlying resource impacts the delivery of the IBS and potentially causes harm to consumers or market integrity. These impact tolerances inform managers of the resilience of the systems and resources that support their IBSs. They should be reviewed at least annually, or when there is a material change to business operations.
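As a rough illustration of a time-based impact tolerance, the sketch below records a tolerance for one IBS and checks whether a disruption of a given length stays within it. The service name and durations are hypothetical; the Policy leaves it to each firm to set its own tolerances.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ImpactTolerance:
    """Maximum tolerable disruption for one important business service (IBS)."""
    ibs_name: str
    max_disruption: timedelta

    def is_breached(self, disruption: timedelta) -> bool:
        # A breach occurs only when the disruption lasts longer than the tolerance.
        return disruption > self.max_disruption


# Hypothetical example values, not taken from the Policy.
payments = ImpactTolerance("Retail payments", timedelta(hours=4))

within = payments.is_breached(timedelta(hours=2))
breach = payments.is_breached(timedelta(hours=6))
```

Keeping the tolerance as a duration alongside the IBS name makes the annual review a matter of updating one value per service.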

The objective of mapping is to identify, understand and map the key operational resources that support the delivery of each IBS. This, of course, includes any 3rd party inputs that are outside the direct control of the firm, yet vital to the delivery of the IBS. Mapping should be accurately documented, in sufficient detail to identify any operational vulnerabilities and the mitigation strategies required to manage them within impact tolerance. It too should be reviewed regularly and reflect any operational resilience improvements. It must be signed off by an accountable senior executive.

Mapping resource inputs to the delivery of IBSs [1]

  • People: key personnel, succession plans, and training.
  • Processes: the activities and work-flows.
  • Technology: the supporting IT assets and systems.
  • Facilities & Information: any other ancillary activities supporting IBS delivery.
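One way to picture such a mapping is as a simple data structure keyed by IBS and resource category. The sketch below is illustrative only: the services, resources and the `shared_resources` helper are hypothetical. It shows how a documented map can reveal resources that several IBSs depend on, which are natural candidates for closer attention.

```python
from collections import defaultdict

# Hypothetical mapping of each IBS to the resources it depends on,
# grouped by the resource categories listed above.
ibs_map = {
    "Retail payments": {
        "people": ["payments operations team"],
        "processes": ["payment authorisation"],
        "technology": ["core banking platform", "cloud hosting (3rd party)"],
        "facilities_information": ["primary data centre"],
    },
    "Online account access": {
        "people": ["service desk"],
        "processes": ["customer authentication"],
        "technology": ["core banking platform", "identity provider (3rd party)"],
        "facilities_information": ["primary data centre"],
    },
}


def shared_resources(mapping):
    """Return resources that support more than one IBS, with the IBSs they support."""
    used_by = defaultdict(set)
    for ibs, categories in mapping.items():
        for resources in categories.values():
            for resource in resources:
                used_by[resource].add(ibs)
    return {r: sorted(s) for r, s in used_by.items() if len(s) > 1}


shared = shared_resources(ibs_map)
```

With these sample values, `shared` contains the core banking platform and the primary data centre, each listed against both services.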

The test plan and any scenario testing details should be summarised in the self-assessment document. Risks should be tested under various operational scenarios. A “wide range” of IBS operational scenarios is encouraged, so that relevant vulnerabilities can be identified under various circumstances and effective mitigating controls established. Examples cited as potential disruptions requiring appropriate controls include cyber-attacks and telecommunications or power failures, both of which have recently been shown to cause serious impact to important business services.


During the transitional period that ends in March 2025, there is a lot to be done as firms embed new governance frameworks across their business service value chains and iteratively build their operational controls. Regular testing of the impact tolerance of each IBS under various people, processes, technology, and resource scenarios is required to enable firms to learn about the ongoing effectiveness of their operational controls. Broadening that testing to include “severe but plausible scenarios” is also encouraged, especially given the volatile environment and risk landscape in which firms now operate. Incorporating the lessons learned into the developing governance process will assist the ongoing closure of operational resilience gaps.
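The testing described above amounts to comparing the outcome of each scenario with the impact tolerance of the affected IBS. A minimal sketch follows, assuming hypothetical scenarios, tolerances and recovery estimates; the `find_gaps` helper is illustrative and not something the Policy prescribes.

```python
from datetime import timedelta

# Hypothetical impact tolerances per IBS.
tolerances = {
    "Retail payments": timedelta(hours=4),
    "Online account access": timedelta(hours=8),
}

# Hypothetical scenario outcomes: (scenario, IBS, estimated time to restore service).
scenario_results = [
    ("Cyber-attack on core banking platform", "Retail payments", timedelta(hours=10)),
    ("Power failure at primary data centre", "Retail payments", timedelta(hours=3)),
    ("Power failure at primary data centre", "Online account access", timedelta(hours=3)),
]


def find_gaps(results, limits):
    """List scenarios whose estimated recovery time exceeds the IBS impact tolerance."""
    gaps = []
    for scenario, ibs, recovery in results:
        if recovery > limits[ibs]:
            # Record by how much the tolerance would be exceeded.
            gaps.append((scenario, ibs, recovery - limits[ibs]))
    return gaps


gaps = find_gaps(scenario_results, tolerances)
```

Each entry in `gaps` is a candidate lesson learned: a scenario in which current controls would not restore the service within tolerance.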

Over time, however, these scenario-centric techniques will add to the operating complexity of any governance process. Perhaps this is why, post-transition, regulators are encouraging firms to establish more capable and reliable means of identifying vulnerabilities and maintaining effective sets of mitigating controls.

It is important to note that some of the input resources that support the delivery of IBSs are more volatile than others. Non-technical resources, such as the staff and even the processes that support IBS delivery, are less likely to be susceptible to disruption than those influenced by technology. As business operations become increasingly digitised, the key IT assets and systems, and the technical resources that support them, make those operations more vulnerable to technical disruption. The fact that a number of these resources can interdependently impact the outcome of a scenario testing process only makes it more complex and difficult to establish individual mitigating controls.

The Policy specifically distinguishes operational resilience from the outcome of operational risk management alone. Operational resilience includes the particular risk management of the specific operational resources that support reliable IBS delivery. Cyber resilience is seen similarly: it is not just cyber protection of infrastructure and high-risk data; it specifically focuses on the defence of the operational resilience program and the specific priority resources that support IBS delivery.

As part of operational resilience, then, cyber resilience must embrace more than the protection of high-risk data and infrastructure. It must also capture the systems and IT assets that support the IBSs. It’s not either/or. The CISO, or responsible manager, must be able to work with the accountable operational resilience executive to fulfil both functions. Poor cyber resilience can put both the firm’s technology and its business operations at risk almost simultaneously, which is why it is a major factor in ongoing operational resilience.

With purposeful implementation, the cyber security controls that protect high-risk data and infrastructure can be the same ones that protect the systems that support the IBS value chain. The number and sophistication of cyber-attacks seeking to move laterally across systems to disrupt operational resources and service delivery is certainly increasing. But it’s the speed, unpredictability and impact of these attacks on the technology, and on the processes it supports, that make cyber resilience so hard to establish and then maintain.

Security teams seek to establish and monitor security controls in an effort to maintain appropriate cyber resilience of the data, infrastructure and IBS-supporting systems. And while IT assets and systems can be prioritised for particular focus, few of the key operational resources and systems that support IBS delivery can be easily isolated for scenario testing. This is why scenario-based identification and testing of security control effectiveness isn’t feasible for already stretched teams to undertake continuously on their own.

Because of the apparent unpredictability and speed at which gaps can emerge, only automated data-driven processes are quick enough to identify and analyse these new vulnerabilities to inform and guide the effectiveness of the control system.

Anticipating increasingly complex scenarios that might disrupt business operations is one thing, but establishing an operational resilience process that is able to address as-yet unknown risks is quite another. Known risks can clearly be anticipated as part of a “wide range of severe but plausible scenarios”, but unknown risks, by definition, cannot. A problem in developing operational resilience governance, then, is that these unknown and even unknowable threats will always exist, and that more and better information is required before they can be identified and mitigating controls established.

The Policy makes directors and senior managers clearly responsible for the operational resilience of their firm and its IBS delivery. And with the requirement for more accurate and reliable operational resilience governance into the future, the transitional top-down operational scenario-based resilience model will have to give way to more sophisticated procedures.

Access to even the smallest amount of evidence can forewarn the operations team of a vulnerability. But in the absence of any information at all, an unknown risk can cause significant operational disruption. Examples include:

  • zero-day exploits, new and emerging threat vectors and other potential black swan events

A prudent operational resilience process must, wherever possible, address these blind spots that might otherwise threaten operational resources.

One way to supplement the scenario-based governance model is to progressively add more data-driven information from across the organisation’s operational systems. By automatically collecting and building better risk information about disruptions and resource vulnerabilities, from the bottom-up, emerging issues can be pin-pointed and mitigated as part of improving operational controls. And the timeliness and reliability of the information about otherwise unknown risks can enrich the development of additional mitigating controls.

It also creates a feedback loop to automatically inform management of the successful mitigation (or otherwise) of previously identified vulnerabilities.

For data-driven systems to quickly and accurately detect emerging vulnerabilities in such volatile and complex operating environments, bottom-up data needs to be automatically collected, analysed and reported regularly and at scale. And given the speed at which resources can be disrupted sufficiently to impact the delivery of IBSs, the speed of delivery of this information is key to mitigation efforts.
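To make the bottom-up approach and its feedback loop concrete, the sketch below takes automatically collected control-check records and reports which controls are currently failing and which have been remediated since an earlier failure. The record layout, resource names and helper functions are all hypothetical; a real deployment would draw these records from the firm’s own systems.

```python
from datetime import datetime

# Hypothetical control-check records collected automatically from operational systems:
# (timestamp, resource, control, passed).
checks = [
    (datetime(2024, 3, 1, 9), "core banking platform", "patching up to date", True),
    (datetime(2024, 3, 2, 9), "core banking platform", "patching up to date", False),
    (datetime(2024, 3, 1, 9), "identity provider", "MFA enforced", False),
    (datetime(2024, 3, 2, 9), "identity provider", "MFA enforced", True),
]


def latest_status(records):
    """Keep only the most recent result for each (resource, control) pair."""
    latest = {}
    for _, resource, control, passed in sorted(records, key=lambda r: r[0]):
        latest[(resource, control)] = passed
    return latest


def classify(records):
    """Split controls into currently failing and remediated since an earlier failure."""
    status = latest_status(records)
    ever_failed = {(res, ctl) for _, res, ctl, ok in records if not ok}
    failing = sorted(k for k, ok in status.items() if not ok)
    remediated = sorted(k for k in ever_failed if status[k])
    return failing, remediated


failing, remediated = classify(checks)
```

Here `failing` surfaces an emerging vulnerability, while `remediated` gives management evidence that a previously identified one has been addressed.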

Many of the vulnerabilities that the Policy aims to address can occur unexpectedly and at a rate that teams are ill-equipped to respond to. Organisations need the ability to detect and mitigate these as soon as they emerge in order to maintain uninterrupted service delivery. As we’ve already observed, managing the operational risks and overall resilience of an organisation is not an occasional activity: problems can arise at any time, and operational governance requires a rapid, if not automated, response to support the ongoing operational resilience process.

System failures, user provisioning, patching, functionality updates and technology roll-outs occur almost continuously, impacting business operations as they go. Operational resilience obligations cannot be met through ad hoc, intermittent assessments and responses; they require timely, evidence-based information.

As Boards come under increasing regulatory and legal pressure with regard to operational resilience, they will need to improve the nature and relevance of the information on which they base decisions about operational control settings and governance oversight. In dynamic environments like the finance sector, where operational factors are constantly changing, annual audits and even scenario-based operational resilience efforts are clearly no longer adequate.

Automated solutions that provide visibility of operational controls and the state of resilience and governance will become a priority; if not before March 2025, certainly after.

Other considerations in the application of the Policy.

Communications

As part of the operational resilience process, firms are expected to have a fast and effective communication plan in place to inform stakeholders, in the event of a harmful disruption to operations. Further information can be found in the FCA Handbook: SYSC 15A.8 Communications.

Governance

For the first time, Directors are responsible for setting business and risk strategies as well as for the overall oversight of operational resilience standards. And in today’s uncertain and complex operating environment, it is not surprising that a disruption to any key operational resource can quickly affect IBS delivery. It is for this reason that Boards and senior managers must now have demonstrable knowledge, experience and skills to discharge their obligations across interdependent operational resources, and be able to manage the effectiveness of the integrated governance processes.

Self-assessment documents

Finally, a living self-assessment document must be maintained as a record of the current state of the firm’s operational resilience program. The recommended elements of the self-assessment document are noted in the FCA Handbook: SYSC 15A.6 Self-Assessment and Lessons Learned. It must be regularly updated and provide a systematic guide to ongoing operational resilience improvement. It is to be available to the regulator for review and should be revised whenever changes in business operations or IBS market conditions may impact the overall operational resilience of the firm.


  1. https://www.fca.org.uk/publication/policy/ps21-3-operational-resilience.pdf
  2. https://www.bis.org/bcbs/publ/d515.pdf
