CyberLiver Business Continuity Plan

Introduction

This policy outlines the considerations regarding business continuity in scenarios where the CyberLiver platform and services cannot be used as normal.

All of CyberLiver’s deployments and services include end-user applications such as patient-facing applications (mobile and web), clinician-facing applications (mobile and web), admin or performance-based applications (web dashboard), and backend services. This policy will describe an approach to business continuity in that context. Please get in touch with us to discuss the optimum approach in scenarios that are different to this.

Potential Disruption Scenarios

Some scenarios can impact the normal use of the above-mentioned applications and services. These include:

  1. Cyber Attacks: Ransomware, DDoS attacks, data breaches.
  2. System Failures: Hardware or software failures, network outages.
  3. Natural Disasters: Floods, earthquakes, fires.
  4. Human Error: Mistakes leading to service disruption or data integrity issues.
  5. Supply Chain Issues: Third-party service outages, vendor disruptions.
  6. Pandemics and Other Health Crises: Impacting workforce availability.

Key Contacts

Patient-Facing Applications
Primary Contact:
  • Name: Ethan Gray
  • Title: DevOps Engineer
  • Email: [email protected]
  • Phone: +44 7398 709 007
Secondary Contact:
  • Name: Anu Balaji
  • Title: Product Manager
  • Email: [email protected]
  • Phone: +44 7585 644 007
IT Lead:
  • Name: Alex Prabu Palanisamy
  • Title: Engineering Lead and Data Scientist
  • Email: [email protected]
Operations Lead:

Clinician-Facing Applications

Web App: Enables clinicians to review patient data, communicate with patients, and manage treatments. The web app doesn't support offline mode.

Web App: Offers a broader view for managing multiple patients and accessing summarised analytics. Non-identifiable summary data is cached to provide some level of offline access.

Admin and Performance-Based Applications

Web Dashboard: Used by admin staff to monitor system performance, manage user accounts, and generate reports.

Backend Services

Data Processing: Handles data storage, processing, and analytics.

APIs: Provide the necessary interfaces for mobile and web apps to function.

Algorithms and Services: Algorithms and other essential software services that support the functioning of the platform.

Risk Assessment and Impact Analysis

Cyber Attacks:

Likelihood: High

Impact: Severe - Data loss, service unavailability, reputational damage.

Mitigation: Regular security audits, endpoint protection, staff training. Implementation of firewalls, intrusion detection systems, and continuous monitoring to detect and respond to threats promptly.

System Failures:

Likelihood: Medium

Impact: Moderate to Severe - Service downtime, potential data loss.

Mitigation: Redundant systems, regular maintenance, robust backup solutions. Our cloud service contracts with AWS and Google Cloud include automated backups of systems, configurations, and databases. In case of a blackout at one data centre, fallback data centres will keep the services running. Additionally, in the event of a complete blackout of a service provider (e.g., AWS), we have disaster recovery protocols to restore services on Google Cloud within 8-12 hours, and vice versa.

Natural Disasters:

Likelihood: Low

Impact: Severe - Infrastructure damage, prolonged downtime.

Mitigation: Disaster recovery plan, off-site backups, cloud services. Utilising geographically dispersed data centres ensures that a natural disaster affecting one location does not disrupt the entire service.

Human Error:

Likelihood: Medium

Impact: Moderate - Service disruption, data integrity issues.

Mitigation: Staff training, access controls, audit trails. Implementing strict change management processes and regular reviews to minimise human error.

Supply Chain Issues:

Likelihood: Medium

Impact: Moderate to Severe - Service disruption, dependency on third parties.

Mitigation: Diversified suppliers, service level agreements, regular supplier assessments. Maintaining strong relationships with multiple vendors and having contingency plans for vendor disruptions.

Pandemics and Other Health Crises:

Likelihood: Medium

Impact: Severe - Workforce availability, operational challenges.

Mitigation: Remote work policies, health and safety protocols, cross-training staff. Ensuring that all critical functions can be performed remotely and maintaining a flexible workforce.

Response and Recovery Strategies

Immediate Actions

Incident Identification: Report incidents to the primary contact immediately.

Initial Assessment: IT and Operations Leads to assess the impact and determine initial response steps.

Communication: Inform all stakeholders (employees, users, partners) about the incident and expected impact.

Roles and Responsibilities

Primary Contact: Oversee the overall response, coordinate with key stakeholders.

Secondary Contact: Assist primary contact, ensure continuity in absence of primary contact.

IT Lead: Address technical issues, restore IT services, ensure data integrity.

Operations Lead: Maintain operational continuity, coordinate with business units.

Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO)

Patient-Facing Applications: RTO - 4 hours, RPO - 1 hour

Clinician-Facing Applications: RTO - 4 hours, RPO - 1 hour

Admin Applications: RTO - 8 hours, RPO - 2 hours

Backend Services: RTO - 2 hours, RPO - 30 minutes
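As an illustration only, the objectives above can be recorded as data and checked against an incident. This is a minimal sketch: the short service keys and the function name `meets_objectives` are hypothetical and are not part of any CyberLiver system; only the RTO and RPO values come from this plan.

```python
from datetime import timedelta

# RTO/RPO targets as listed in this plan. The dictionary keys are
# illustrative labels, not real service identifiers.
OBJECTIVES = {
    "patient": {"rto": timedelta(hours=4), "rpo": timedelta(hours=1)},
    "clinician": {"rto": timedelta(hours=4), "rpo": timedelta(hours=1)},
    "admin": {"rto": timedelta(hours=8), "rpo": timedelta(hours=2)},
    "backend": {"rto": timedelta(hours=2), "rpo": timedelta(minutes=30)},
}


def meets_objectives(service, downtime, data_loss_window):
    """Compare an incident's downtime and data-loss window with the targets.

    downtime: how long the service was unavailable.
    data_loss_window: time between the last good backup and the failure.
    """
    target = OBJECTIVES[service]
    return {
        "rto_met": downtime <= target["rto"],
        "rpo_met": data_loss_window <= target["rpo"],
    }


# Example: backend restored after 90 minutes, last backup 45 minutes old.
result = meets_objectives(
    "backend", timedelta(hours=1, minutes=30), timedelta(minutes=45)
)
print(result)
```

In this example the restoration falls within the 2-hour RTO, but a 45-minute-old backup exceeds the 30-minute RPO, so the recovery point objective would be missed.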

Communication Plan

Internal Communication

Channels: Email, SMS, Internal messaging platform (e.g., Slack)

Responsible: Operations Lead

Frequency: Regular updates every hour during an incident

External Communication

Channels: Email, Website notifications, Social media

Responsible: Primary Contact

Frequency: Initial notification within 1 hour, subsequent updates every 2 hours
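The external frequency above can be expressed as a list of deadlines. The sketch below is illustrative only: the function name `external_update_deadlines` is hypothetical, and it assumes that the 2-hour interval is counted from the initial notification deadline, which this plan does not state explicitly.

```python
from datetime import datetime, timedelta


def external_update_deadlines(incident_start, count):
    """Return the latest times for the first `count` external messages.

    Assumption: the initial notification is due 1 hour after the incident
    starts, and each later update is due 2 hours after the previous deadline.
    """
    first = incident_start + timedelta(hours=1)
    return [first + timedelta(hours=2 * i) for i in range(count)]


# Example: an incident that starts at 09:00.
start = datetime(2024, 1, 1, 9, 0)
for deadline in external_update_deadlines(start, 3):
    print(deadline.strftime("%H:%M"))
```

For an incident starting at 09:00, this gives deadlines of 10:00, 12:00, and 14:00.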

Template Messages

Initial Notification:

"Dear [User/Partner], we are currently experiencing a service disruption affecting [specific services]. Our team is actively working to resolve the issue. We apologise for any inconvenience caused and will provide updates as soon as possible. Thank you for your patience."

Update Notification:

"Dear [User/Partner], we are continuing to address the service disruption affecting [specific services]. Our team is making progress, and we anticipate restoration within [estimated time]. We appreciate your understanding and will keep you informed of any further developments."

Resolution Notification:

"Dear [User/Partner], we are pleased to inform you that the service disruption affecting [specific services] has been resolved. All systems are now operating normally. Thank you for your patience and cooperation."
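The bracketed placeholders in the templates above could be filled programmatically before sending. The following is a minimal sketch using the update notification text; the function name `fill_template` and the example values are hypothetical. It raises an error if a placeholder has no value, so that a message with an unfilled "[...]" is never sent.

```python
import re

# Update notification text, copied from the template in this plan.
UPDATE_TEMPLATE = (
    "Dear [User/Partner], we are continuing to address the service disruption "
    "affecting [specific services]. Our team is making progress, and we "
    "anticipate restoration within [estimated time]. We appreciate your "
    "understanding and will keep you informed of any further developments."
)


def fill_template(template, values):
    """Replace each [placeholder] with its value; fail if one is missing."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"No value supplied for placeholder [{key}]")
        return values[key]

    # Match any text enclosed in square brackets.
    return re.sub(r"\[([^\]]+)\]", substitute, template)


message = fill_template(
    UPDATE_TEMPLATE,
    {
        "User/Partner": "Partner",
        "specific services": "the clinician web app",
        "estimated time": "2 hours",
    },
)
print(message)
```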

Testing and Maintenance

Testing

Frequency: Semi-annually

Types of Tests: Tabletop exercises, full-scale drills

Responsible: IT Lead and Operations Lead

Maintenance

Review: Quarterly or after significant changes

Updates: Incorporate feedback from tests and real incidents

Responsible: Primary Contact

How We Comply with NHS Digital Requirements (Sections 7.1.2 to 7.3.6)

7.1.2: Business Continuity Policy

  1. Objective: To ensure the organisation can continue operations during and after a disruption.
  2. Implementation: Develop, maintain, and review a BCP, ensuring it aligns with the overall business strategy and is communicated to all employees. Regular training and awareness programmes will be conducted to ensure all staff understand their roles within the BCP.

7.2.1: Business Impact Analysis (BIA)

  1. Objective: To identify critical business functions and the impact of their disruption.
  2. Implementation: Conduct a BIA to determine RTO and RPO for each critical function, considering financial, operational, and reputational impacts. Each department will participate in the BIA to ensure all critical functions are identified and adequately assessed.

7.2.2: Risk Assessment

  1. Objective: To identify risks to business continuity and assess their likelihood and impact.
  2. Implementation: Regularly review potential risks, including cyber attacks, system failures, and natural disasters, and update mitigation strategies accordingly. Risk assessments will be conducted quarterly, and after any significant changes to the organisation or its operations.

7.2.3: Developing Business Continuity Plans

  1. Objective: To develop actionable plans to maintain or restore critical functions during a disruption.
  2. Implementation: Create detailed response and recovery strategies, assign roles and responsibilities, and ensure all employees are aware of the plans. Departments will have specific continuity plans that align with the overall BCP, ensuring a coordinated response.

7.2.4: Communication Strategy

  1. Objective: To ensure effective communication during a disruption.
  2. Implementation: Establish clear communication channels and protocols for internal and external stakeholders, including template