ROS 2 Robotic Systems Threat Model

Authors: Thomas Moulard, Juan Hortala, Xabi Perez, Gorka Olalde, Borja Erice, Odei Olalde, David Mayoral

Date Written: 2019-03

Last Modified: 2021-01

This is a DRAFT DOCUMENT.

Disclaimer:

  • This document is not exhaustive. Mitigating all attacks in this document does not ensure any robotic product is secure.
  • This document is a living document. It will continue to evolve as we implement and mitigate attacks against reference platforms.

Document Scope

This document describes potential threats for ROS 2 robotic systems. The document is divided into three parts:

  1. Robotic Systems Threats Overview
  2. Threat Analysis for the TurtleBot 3 Robotic Platform
  3. Threat Analysis for the MARA Robotic Platform

The first section lists and describes threats from a theoretical point of view. Explanations in this section should hold for any robot built using a component-oriented architecture. The second section instantiates those threats on a widely-available reference platform, the TurtleBot 3. Mitigating threats on this platform enables us to demonstrate the viability of our recommendations.

Robotic Systems Threats Overview

This section is intentionally independent from ROS as robotic systems share common threats and potential vulnerabilities. For instance, this section describes “robotic components” while the next section will mention “ROS 2 nodes”.

Defining Robotic Systems Threats

We define a robotic system as one or more general-purpose computers connected to one or more actuators or sensors. An actuator is defined as any device producing physical motion. A sensor is defined as any device capturing or recording a physical property.

Robot Application Actors, Assets, and Entry Points

This section defines actors, assets, and entry points for this threat model.

Actors are humans or external systems interacting with the robot. Considering which actors interact with the robot is helpful to determine how the system can be compromised. For instance, actors may be able to give commands to the robot which may be abused to attack the system.

Assets represent any user, resource (e.g. disk space), or property (e.g. physical safety of users) of the system that should be defended against attackers. Properties of assets can be related to achieving the business goals of the robot. For example, sensor data is a resource/asset of the system and the privacy of that data is a system property and a business goal.

Entry points represent how the system is interacting with the world (communication channels, API, sensors, etc.).

Robot Application Actors

Actors are divided into categories based on whether they are physically present next to the robot (could the robot harm them?), whether they are human, and whether they are a “power user”. A power user is defined as someone who is knowledgeable and executes tasks which are normally not done by end-users (build and debug new software, deploy code, etc.).

Actor | Co-Located? | Human? | Power User? | Notes
Robot User | Y | Y | N | Human interacting physically with the robot.
Robot Developer / Power User | Y | Y | Y | User with robot administrative access or developer.
Third-Party Robotic System | Y | N | - | Another robot or system capable of physical interaction with the robot.
Teleoperator / Remote User | N | Y | N | A human tele-operating the robot or sending commands to it through a client application (e.g. smartphone app).
Cloud Developer | N | Y | Y | A developer building a cloud service connected to the robot or an analyst who has been granted access to robot data.
Cloud Service | N | N | - | A service sending commands to the robot automatically (e.g. cloud motion planning service).
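
The actor categories above can also be expressed as a small data structure, which makes questions such as "which actors could the robot physically harm?" easy to answer. This is only an illustrative sketch: the `Actor` class and the `ACTORS` list are hypothetical names that mirror the table and are not part of any ROS 2 API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Actor:
    name: str
    co_located: bool
    human: bool
    power_user: Optional[bool]  # None where the table shows "-"


# One entry per row of the actor table.
ACTORS = [
    Actor("Robot User", True, True, False),
    Actor("Robot Developer / Power User", True, True, True),
    Actor("Third-Party Robotic System", True, False, None),
    Actor("Teleoperator / Remote User", False, True, False),
    Actor("Cloud Developer", False, True, True),
    Actor("Cloud Service", False, False, None),
]

# Humans the robot could physically harm: co-located and human.
at_physical_risk = [a.name for a in ACTORS if a.co_located and a.human]
```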

Assets

Assets are categorized into privacy (robot private data should not be accessible by attackers), integrity (robot behavior should not be modified by attacks) and availability (robot should continue to operate even under attack).

Asset | Description
Privacy
Sensor Data Privacy | Sensor data must not be accessed by unauthorized actors.
Robot Data Stores Privacy | Robot persistent data (logs, software, etc.) must not be accessible by unauthorized actors.
Integrity
Physical Safety | The robotic system must not harm its users or environment.
Robot Integrity | The robotic system must not damage itself.
Robot Actuators Command Integrity | Unauthorized actors should not be able to control the robot actuators.
Robot Behavior Integrity | The robotic system must not allow attackers to disrupt its tasks.
Robot Data Stores Integrity | No attacker should be able to alter robot data.
Availability
Compute Capabilities | Robot embedded and distributed (e.g. cloud) compute resources. Starving a robot of its compute resources can prevent it from operating correctly.
Robot Availability | The robotic system must answer commands in a reasonable time.
Sensor Availability | Sensor data must be available to authorized actors shortly after being produced.

Entry Points

Entry points describe the system attack surface area (how do actors interact with the system?).

Name | Description
Robot Components Communication Channels | Robotic applications are generally composed of multiple components talking over a shared bus. This bus may be accessible over the robot WAN link.
Robot Administration Tools | Tools allowing local or remote users to connect to the robot computers directly (e.g. SSH, VNC).
Remote Application Interface | Remote applications (cloud, smartphone application, etc.) can be used to read robot data or send robot commands (e.g. cloud REST API, desktop GUI, smartphone application).
Robot Code Deployment Infrastructure | Deployment infrastructure for binaries or configuration files is granted read/write access to the robot computers' file systems.
Sensors | Sensors capture data which usually ends up being injected into the robot middleware communication channels.
Embedded Computer Physical Access | External (HDMI, USB...) and internal (PCI Express, SATA...) ports.

Robot Application Components and Trust Boundaries

The system is divided into hardware (embedded general-purpose computer, sensors, actuators), multiple components (usually processes) running on multiple computers (trusted or non-trusted components) and data stores (embedded or in the cloud).

While the computers may run well-controlled, trusted software (trusted components), other off-the-shelf robotics components (non-trusted nodes) may be included in the application. Third-party components may be malicious (extract private data, install a rootkit, etc.) or their QA validation process may not be as extensive as that of in-house software. The release process of third-party components creates additional security threats (a third-party component may be compromised during its distribution).

A trusted robotic component is defined as a node developed, built, tested and deployed by the robotic application owner or vetted partners. As the process is owned end-to-end by a single organization, we can assume that the node will respect its specifications and will not, for instance, try to extract and leak private information. While carefully controlled engineering processes can reduce the risk of malicious behavior (accidental or intentional), they cannot completely eliminate it. Trusted nodes can still leak private data, etc.

Trusted nodes should not trust non-trusted nodes. It is likely that more than one non-trusted component is embedded in any given robotic application. It is important for non-trusted components to not trust each other as one malicious non-trusted node may try to compromise another non-trusted node.

An example of a trusted component could be an in-house (or carefully vetted) IMU driver node. This component may communicate through unsafe channels with other driver nodes to reduce sensor data fusion latency. Trusting components is never ideal but it may be acceptable if the software is well-controlled.

Conversely, a non-trusted node can be a third-party object tracker. Deploying this node without adequate sandboxing could impact:

  • User privacy: the node is streaming back user video without their consent
  • User safety: the robot is following the object detected by the tracker and its speed is proportional to the object distance. The malicious tracker estimates the object position very far away on purpose to trick the robot into suddenly accelerating and hurting the user.
  • System availability: the node may try to consume all available computing resources (CPU, memory, disk) and prevent the robot from performing correctly.
  • System Integrity: the robot is following the object detected by the tracker. The attacker can tele-operate the robot by controlling the estimated position of the tracked object (detect an object on the left to make the robot move to the left, etc.).
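
One way to limit the user safety impact described above is for the trusted follower component to bound any command derived from untrusted tracker output. The sketch below is illustrative only: `MAX_SPEED_M_S`, `MAX_RANGE_M`, `GAIN` and `follow_speed` are hypothetical names and values, not part of any ROS 2 API, and real limits would have to come from the robot's safety analysis.

```python
import math

# Hypothetical limits chosen by the application owner (assumptions).
MAX_SPEED_M_S = 0.2   # upper bound on the commanded forward speed
MAX_RANGE_M = 5.0     # farthest distance the tracker can plausibly report
GAIN = 0.5            # commanded speed per metre of distance to the object


def follow_speed(reported_distance_m: float) -> float:
    """Return a bounded forward speed for an untrusted distance estimate."""
    # Reject values a well-behaved tracker could never produce.
    if not math.isfinite(reported_distance_m):
        return 0.0
    if reported_distance_m < 0.0 or reported_distance_m > MAX_RANGE_M:
        return 0.0
    # Speed stays proportional to distance but never exceeds the cap.
    return min(GAIN * reported_distance_m, MAX_SPEED_M_S)
```

With these bounds, a tracker that reports an object "very far away" makes the robot stop instead of accelerating. This does not address the integrity threat: an attacker can still steer the robot within the allowed envelope.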

Nodes may also communicate with the local filesystem, cloud services or data stores. Those services or data stores can be compromised and should not be automatically trusted. For instance, URDF robot models are usually stored in the robot file system. This model stores robot joint limits. If the robot file system is compromised, those limits could be removed which would enable an attacker to destroy the robot.
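
A possible way to detect the kind of URDF tampering described above is to compare the file digest with a reference value before loading the model. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the reference digest is kept somewhere the attacker cannot modify (for example outside the robot file system). The check only detects changes; it does not prevent them.

```python
import hashlib
import hmac
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def urdf_is_unmodified(path: Path, expected_sha256: str) -> bool:
    """Compare the file digest with a reference digest stored elsewhere."""
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(file_sha256(path), expected_sha256.lower())
```

A component could refuse to start, or fall back to conservative built-in joint limits, when `urdf_is_unmodified` returns `False`.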

Finally, users may try to rely on sensors to inject malicious data into the system (Akhtar, Naveed, and Ajmal Mian. “Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey.”).

The diagram below illustrates an example application with different trust zones (trust boundaries shown with dashed green lines). The number and scope of trust zones depend on the application.

Figure: Robot System Threat Model. Diagram source (edited with Threat Dragon).

Threat Analysis and Modeling

The table below lists all generic threats which may impact a robotic application.

Threat categorization is based on the STRIDE (Spoofing / Tampering / Repudiation / Information disclosure / Denial of service / Elevation of privileges) model. Risk assessment relies on DREAD (Damage / Reproducibility / Exploitability / Affected users / Discoverability).

In the following table, the “Threat Category (STRIDE)” columns indicate the categories to which a threat belongs. If the “Spoofing” column is marked with a check sign (✓), it means that this threat can be used to spoof a component of the system. If it cannot be used to spoof a component, a cross sign will be present instead (✘).

The “Threat Risk Assessment (DREAD)” columns contain a score indicating how easy or likely it is for a particular threat to be exploited. The allowed score values are 1 (not at risk), 2 (may be at risk) or 3 (at risk, needs to be mitigated). For instance, in the damage column, 1 means “exploitation of the threat would cause minimal damage”, 2 means “exploitation of the threat would cause significant damage” and 3 means “exploitation of the threat would cause massive damage”. The “total score” is computed by adding the scores of all columns. The higher the score, the more critical the threat.
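
The scoring rule above can be sketched in a few lines of Python. The `DreadScore` class is a hypothetical helper written for this illustration; the example values are those of the “An attacker spoofs a component identity” row of the table.

```python
from dataclasses import dataclass

ALLOWED_SCORES = (1, 2, 3)


@dataclass(frozen=True)
class DreadScore:
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def __post_init__(self) -> None:
        # Each column only accepts the values 1, 2 or 3.
        for name, value in vars(self).items():
            if value not in ALLOWED_SCORES:
                raise ValueError(f"{name} must be 1, 2 or 3, got {value!r}")

    @property
    def total(self) -> int:
        """Total score: the sum of the five columns (between 5 and 15)."""
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability)


spoofing = DreadScore(3, 1, 1, 2, 3)
print(spoofing.total)  # 10
```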

Impacted assets, entry points and business goals columns indicate whether an asset, entry point or business goal is impacted by a given threat. A check sign (✓) means impacted, a cross sign (✘) means not impacted. A triangle (▲) means “impacted indirectly or under certain conditions”. For instance, compromising the robot kernel may not be enough to steal user data but it makes stealing data much easier.

Table columns:
  • Threat Description
  • Threat Category (STRIDE): Spoofing, Tampering, Repudiation, Info. Disclosure, Denial of Service, Elev. of Privileges
  • Threat Risk Assessment (DREAD): Damage, Reproducibility, Exploitability, Affected Users, Discoverability, DREAD Score
  • Impacted Assets: Robot Compute Rsc., Physical Safety, Robot Avail., Robot Integrity, Data Integrity, Data Avail., Data Privacy
  • Impacted Entry Points: Embedded H/W, Robot Comm. Channels, Robot Admin. Tools, Remote App. Interface, Deployment Infra.
  • Mitigation Strategies
  • Similar Attacks in the Literature
Embedded / Software / Communication / Inter-Component Communication
An attacker spoofs a component identity. 3 1 1 2 3 10
  • Components should authenticate themselves.
  • Components should not be attributed similar identifiers.
  • Component identifiers should be chosen carefully.
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker intercepts and alters a message. 3 3 3 3 3 15
  • Messages should be signed and/or encrypted.
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker writes to a communication channel without authorization. 3 3 3 3 3 15
  • Components should only communicate on encrypted channels.
  • Sensitive inter-process communication should be done through shared memory whenever possible.
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker listens to a communication channel without authorization. 2 3 3 3 3 14
  • Components should only communicate on encrypted channels.
  • Sensitive inter-process communication should be done through shared memory whenever possible.
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker prevents a communication channel from being usable. 3 3 3 3 3 15
  • Components should only be allowed to access channels they require.
  • Internet-facing channels and robot-only channels should be isolated.
  • Component behavior should tolerate a loss of communication (e.g. accept “go to x,y” goals rather than “set velocity to vx, vy” commands).
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
Embedded / Software / Communication / Long-Range Communication (e.g. WiFi, Cellular Connection)
An attacker hijacks robot long-range communication 3 2 1 3 1 10
  • Long-range communication should always use a secure transport layer (WPA2 for WiFi for instance)
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker intercepts robot long-range communications (e.g. MitM) 1 2 1 3 1 8
  • Long-range communication should always use a secure transport layer (WPA2 for WiFi for instance)
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
An attacker disrupts (e.g. jams) robot long-range communication channels. 2 2 1 1 3 9
  • Multiple long-range communication transport layers should be used when possible (e.g. cellular and WiFi)
Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015.
Embedded / Software / Communication / Short-Range Communication (e.g. Bluetooth)
An attacker executes arbitrary code using a short-range communication protocol vulnerability. 3 2 1 1 3 10
  • Communications protocols should be disabled if unused (by using e.g. rfkill).
  • Binaries and libraries required to support short-range communications should be kept up-to-date.
Checkoway, Stephen, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, Stefan Savage, Karl Koscher, Alexei Czeskis, Franziska Roesner, and Tadayoshi Kohno. “Comprehensive Experimental Analyses of Automotive Attack Surfaces.” In Proceedings of the 20th USENIX Conference on Security, 6–6. SEC’11. Berkeley, CA, USA: USENIX Association, 2011.
Embedded / Software / Communication / Remote Application Interface
An attacker gains unauthenticated access to the remote application interface. 3 3 1 1 3 11
  • Implement authentication and authorization methods.
  • Enable RBAC to limit permissions for the users.
An attacker could eavesdrop on communications with the robot’s remote application interface. 1 1 1 1 3 7
  • Communications with the remote application interface should be done over a secure channel.
An attacker could alter data sent to the Robot’s remote application interface. 3 3 1 1 3 11
  • Communications with the remote application interface should be done over a secure channel.
Embedded / Software / OS & Kernel
An attacker compromises the real-time clock to disrupt the kernel RT scheduling guarantees. 3 2 1 3 2 11
Dessiatnikoff, Anthony, Yves Deswarte, Eric Alata, and Vincent Nicomette. “Potential Attacks on Onboard Aerospace Systems.” IEEE Security & Privacy 10, no. 4 (July 2012): 71–74.
An attacker compromises the OS or kernel to alter robot data. 3 2 1 3 2 11
  • OS user accounts should be properly secured (randomized password or e.g. SSH keys)
  • Hardened kernel (prevent dynamic loading of kernel modules)
  • Ensure only trustable kernels are used (e.g. Secure Boot)
  • /boot should not be accessible by robot processes
Clark, George W., Michael V. Doran, and Todd R. Andel. “Cybersecurity Issues in Robotics.” In 2017 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), 1–5. Savannah, GA, USA: IEEE, 2017.
An attacker compromises the OS or kernel to eavesdrop on robot data. 1 2 1 3 2 9
  • OS user accounts should be properly secured (randomized password or e.g. SSH keys)
  • Hardened kernel (prevent dynamic loading of kernel modules)
  • Ensure only trustable kernels are used (e.g. Secure Boot)
  • /boot should not be accessible by robot processes
Clark, George W., Michael V. Doran, and Todd R. Andel. “Cybersecurity Issues in Robotics.” In 2017 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), 1–5. Savannah, GA, USA: IEEE, 2017.
An attacker gains access to the robot OS through its administration interface. 3 3 2 3 3 14
  • Administrative interface should be properly secured (e.g. no default/static password).
  • Administrative interface should be accessible by a limited number of physical machines. For instance, one may require the user to be physically co-located with the robot (see e.g. ADB for Android)
Embedded / Software / Component-Oriented Architecture
A node accidentally writes incorrect data to a communication channel. 2 3 2 3 3 13
  • Components should always validate received messages.
  • Invalid message events should be logged and users should be notified.
Jacques-Louis Lions et al. "Ariane 5 Flight 501 Failure." ESA Press Release 33–96, Paris, 1996.
An attacker deploys a malicious component on the robot. 3 3 2 3 3 14
  • Components should not trust other components (received messages need to be validated, etc.).
  • Users should not be able to deploy components directly.
  • Component binaries should be digitally signed.
  • Component source code should be audited.
  • Components should run with minimal privileges (CPU and memory quota, minimal I/O and access to the filesystem)
Checkoway, Stephen, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, Stefan Savage, Karl Koscher, Alexei Czeskis, Franziska Roesner, and Tadayoshi Kohno. “Comprehensive Experimental Analyses of Automotive Attack Surfaces.” In Proceedings of the 20th USENIX Conference on Security, 6–6. SEC’11. Berkeley, CA, USA: USENIX Association, 2011.
An attacker can prevent a component running on the robot from executing normally. 2 3 2 3 3 13
  • Components should not be trusted and be properly isolated (e.g. run as different users)
  • When safe, components should attempt to restart automatically when a fatal error occurs.
Dessiatnikoff, Anthony, Yves Deswarte, Eric Alata, and Vincent Nicomette. “Potential Attacks on Onboard Aerospace Systems.” IEEE Security & Privacy 10, no. 4 (July 2012): 71–74.
Embedded / Software / Configuration Management
An attacker modifies configuration values without authorization. 3 3 3 3 3 15
  • Configuration data access control list should be implemented.
  • Configuration data modifications should be logged.
  • Configuration write-access should be limited to the minimum set of users and/or components.
Ahmad Yousef, Khalil, Anas AlMajali, Salah Ghalyon, Waleed Dweik, and Bassam Mohd. “Analyzing Cyber-Physical Threats on Robotic Platforms.” Sensors 18, no. 5 (May 21, 2018): 1643.
An attacker accesses configuration values without authorization. 1 3 3 3 3 13
  • Configuration data should be considered as private.
  • Configuration data should be accessible by the minimum set of users and/or components.
Ahmad Yousef, Khalil, Anas AlMajali, Salah Ghalyon, Waleed Dweik, and Bassam Mohd. “Analyzing Cyber-Physical Threats on Robotic Platforms.” Sensors 18, no. 5 (May 21, 2018): 1643.
A user accidentally misconfigures the robot. 3 3 3 3 3 15
  • Configuration data changes should be reversible.
  • Large changes should be applied atomically.
  • Fault monitoring should be able to automatically reset the configuration to a safe state if the robot becomes unavailable.
Embedded / Software / Data Storage (File System)
An attacker modifies the robot file system by physically accessing it. 3 3 3 3 3 15
  • Robot filesystem must be encrypted. The key should be stored in a secure enclave (TPM).
  • Robot filesystem should be wiped out if the robot is physically compromised.
An attacker eavesdrops on the robot file system by physically accessing it. 1 3 3 3 3 13
  • Robot filesystem must be encrypted. The key should be stored in a secure enclave (TPM).
  • Robot filesystem should be wiped out if the robot perimeter is breached.
An attacker saturates the robot disk with data. 3 3 1 3 3 13
  • Robot components disk quota should be bounded.
  • Disk usage should be properly monitored, logged and reported.
  • Optionally, components may have the option to run without any file system access. This should be preferred whenever possible.
Embedded / Software / Logs
An attacker exfiltrates log data to a remote server. 2 2 2 3 3 12
  • Logs should never contain private data. Log data should be anonymized when needed.
  • Logs should be rotated and deleted after a pre-determined retention period.
  • Logs should be encrypted in-transit and at-rest.
  • Logs access should be ACL protected.
  • Logs access should be monitored to enable later audits.
Embedded / Hardware / Sensors
An attacker spoofs a robot sensor (by e.g. replacing the sensor itself or manipulating the bus). 3 2 1 3 3 12
  • Sensors should embed an identifier to detect hardware tampering.
  • Components should try to explicitly refer to which sensor ID they expect data from.
  • Sensor data should be signed and ideally encrypted over the wire.
Embedded / Hardware / Actuators
An attacker spoofs a robot actuator. 1 2 1 3 3 10
  • Actuators should embed an identifier.
  • Command vector should be signed (ideally encrypted) to prevent manipulation.
An attacker modifies the command sent to the robot actuators. (intercept & retransmit) 3 2 1 3 3 12
  • Actuators should embed an identifier.
  • Command vector should be signed (ideally encrypted) to prevent manipulation.
An attacker intercepts the robot actuators command (can recompute localization). 1 2 1 3 3 10
  • Command vector should be encrypted.
An attacker sends malicious commands to actuators to trigger the E-Stop. 2 2 3 3 1 11
  • If a joint command exceeds the joint limits, a specific code path for handling out-of-bounds commands should be executed instead of triggering the E-Stop. For instance, whenever safe, the command could be discarded and the error reported to the user.
Embedded / Hardware / Auxiliary Functions
An attacker compromises the software or sends malicious commands to drain the robot battery. 2 3 3 3 3 14
  • Per-node CPU quota should be enforced.
  • Appropriate protection should be implemented to prevent actuators from over-heating.
  • If the battery level becomes critically low, the robot should be able to bring itself to a stop.
Dessiatnikoff, Anthony, Yves Deswarte, Eric Alata, and Vincent Nicomette. “Potential Attacks on Onboard Aerospace Systems.” IEEE Security & Privacy 10, no. 4 (July 2012): 71–74.
Embedded / Hardware / Communications
An attacker connects to an exposed debug port. 3 3 3 3 3 15
  • Close the communication port to external communications or disable the service on non-development devices.
An attacker connects to an internal communication bus. 3 3 3 3 3 15
  • Limit access to internal communications and buses.
Remote / Client Application
An attacker intercepts the user credentials on their desktop machine. 2 2 2 3 1 10
  • Remote users should be granted minimum privileges
  • Credentials on desktop machines should be stored securely (secure enclave, TPM, etc.)
  • User credentials should be revokable or expire automatically
  • User credentials should be tied to the user identity for audit purposes
Remote / Cloud Integration
An attacker intercepts cloud service credentials deployed on the robot. 2 2 1 3 1 9
  • Cloud services should be granted minimal privileges.
  • Cloud services credentials should be revokable.
  • Cloud services should be audited for abuse / unauthorized access.
An attacker gains read access to robot cloud data. 2 2 1 3 1 9
  • Cloud data stores should encrypt data at rest
An attacker alters or deletes robot cloud data. 2 2 1 3 1 9
  • Cloud data should have proper backup mechanisms.
  • Cloud data access should be audited. If an intrusion is detected, a process to restore the system back to a previous "uncompromised" state should be available.
Remote / Software Deployment
An attacker spoofs the deployment service. 3 3 2 3 3 14
  • Deployment service should be authenticated.
  • Communication with the deployment service should be done over a secure channel.
An attacker modifies the binaries sent by the deployment service. 3 3 2 3 3 14
  • Deployment service should be authenticated.
  • Communication with the deployment service should be done over a secure channel.
An attacker intercepts the binaries sent by the deployment service. 1 3 2 3 3 12
  • Deployment service should be authenticated.
  • Communication with the deployment service should be done over a secure channel.
An attacker prevents the robot and the deployment service from communicating. 1 3 2 3 3 12
Cross-Cutting Concerns / Credentials, PKI and Secrets
An attacker compromises a Certificate Authority trusted by the robot. 3 1 1 2 3 10
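
As an illustration of the “Messages should be signed and/or encrypted” mitigation listed in the table above, the sketch below authenticates a message with HMAC-SHA256 from the Python standard library. It assumes a secret key shared by the sender and the receiver, and the `sign` and `verify` helpers are hypothetical names. It is not the mechanism used by any particular middleware; it only shows the principle that an altered message is rejected.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def sign(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify(key: bytes, message: bytes) -> bytes:
    """Return the payload if the tag is valid, otherwise raise ValueError."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to contain a tag")
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison of the received and expected tags.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("invalid signature")
    return payload
```

Signing alone does not hide the message content and does not prevent an attacker from replaying an old, validly signed message. Encryption and a freshness check (e.g. a counter or timestamp inside the payload) would be needed for that.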

Including a new robot into the threat model

The following steps are recommended in order to extend this document with additional threat models:

  1. Determine the robot evaluation scenario. This will include:
    • System description and specifications
    • Data assets
  2. Define the robot environment:
    • External actors
    • Entry points
    • Use cases
  3. Design security boundary and architectural schemas for the robotic application.
  4. Evaluate and prioritize entry points
    • Make use of the RSF to find applicable weaknesses on the robot.
    • Take existing documentation as help for finding applicable entry points.
  5. Evaluate existing threats based on the general threat table and add new ones to the specific threat table.
  6. Design hypothetical attack trees for each of the entry points, detailing the affected resources in the process.
  7. Create a Pull Request and submit the changes to the ros2/design repository.

Threat Analysis for the TurtleBot 3 Robotic Platform

System description

The application considered in this section is tele-operation of a TurtleBot 3 robot using an Xbox controller.

The robot considered in this section is a TurtleBot 3 Burger. It is a small educational robot embedding an IMU and a Lidar, two motors / wheels, and a chassis that hosts a battery, a Raspberry Pi 3 Model B+ Single Board Computer, and an OpenCR 1.0 Arduino-compatible board that interacts with the sensors. The robot computer runs two nodes:

We also assume that the robot is running a ROS 2 port of the AWS CloudWatch sample application. A ROS 2 version of this component is not yet available, but it will help us demonstrate the threats related to connecting a robot to a cloud service.

For the purpose of demonstrating the threats associated with distributing a ROS graph among multiple hosts, an Xbox controller is connected to a secondary computer (“remote host”). The secondary computer runs two additional nodes:

  • joy_node is forwarding joystick input as ROS 2 messages,
  • teleop_twist_joy is converting ROS 2 joystick messages to control commands.
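
To make the dataflow concrete, the conversion from joystick input to a control command can be sketched as below. This is not the actual teleop_twist_joy implementation: the axis indices, the scale factors and the `VelocityCommand` type are assumptions made for the illustration, and the code runs without ROS 2. It shows why this path matters for the threat model: any actor able to publish joystick messages effectively controls the robot velocity.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class VelocityCommand:
    linear_x: float   # forward speed, m/s
    angular_z: float  # turn rate, rad/s


# Hypothetical mapping parameters (assumptions).
AXIS_LINEAR = 1
AXIS_ANGULAR = 0
SCALE_LINEAR = 0.2
SCALE_ANGULAR = 1.0


def clamp_axis(value: float) -> float:
    """Force an axis value into [-1, 1]; non-finite input becomes 0."""
    if not math.isfinite(value):
        return 0.0
    return max(-1.0, min(1.0, value))


def joy_to_command(axes: Sequence[float]) -> VelocityCommand:
    """Convert joystick axes into a bounded velocity command."""
    return VelocityCommand(
        linear_x=SCALE_LINEAR * clamp_axis(axes[AXIS_LINEAR]),
        angular_z=SCALE_ANGULAR * clamp_axis(axes[AXIS_ANGULAR]),
    )
```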

Finally, the robot data is accessed by a test engineer through a “field testing” computer.

Architecture Dataflow diagram

Figure: ROS 2 Application. Diagram source (draw.io).

Assets

Hardware
  • TurtleBot 3 Burger is a small, ROS-enabled robot for education purposes.
    • Compute resources
      • Raspberry PI 3 host: OS Raspbian Stretch with ROS 2 Crystal running natively (without Docker), running as root.
      • OpenCR board: using ROS 2 firmware as described in the TurtleBot3 ROS 2 setup instructions.
    • Hardware components include:
      • The Lidar is connected to the Raspberry PI through USB.
      • A Raspberry PI camera module is connected to the Raspberry PI 3 through its Camera Serial Interface (CSI).
  • Field testing host: conventional laptop running OS Ubuntu 18.04, no ROS installed. Running as a sudoer user.
  • Remote Host: any conventional server running OS Ubuntu 18.04 with ROS 2 Crystal.
  • CI host: any conventional server running OS Ubuntu 18.04.

  • WLAN: a WiFi local area network without security enabled, open for anyone to connect.
  • Corporate private network: a secure corporate wide-area network that spans multiple cities. Only authenticated users with suitable credentials can connect to the network, and good security practices like password rotation are in place.
Processes
  • Onboard TurtleBot3 Raspberry Pi
    • turtlebot3_node
    • CloudWatch nodes (hypothetical as they have not yet been ported to ROS 2)
      • cloudwatch_metrics_collector: subscribes to the /metrics topic where other nodes publish MetricList messages that specify CloudWatch metrics data, and sends the corresponding metric data to CloudWatch metrics using the PutMetricData API.
      • cloudwatch_logger: subscribes to a configured list of topics, and publishes all messages found in those topics to a configured log group and log stream, using the CloudWatch metrics API.
    • Monitoring Nodes
      • monitor_speed: subscribes to the /odom topic and, for each received odometry message, extracts the linear and angular speed, builds a MetricList message with those values, and publishes that message to the /metrics topic.
      • health_metric_collector: collects system metrics (free RAM, total RAM, total CPU usage, per core CPU usage, uptime, number of processes) and publishes them as a MetricList message to the /metrics topic.
    • raspicam2_node: a node publishing Raspberry Pi Camera Module data to ROS 2.
  • An XRCE Agent runs on the Raspberry Pi and is used by a DDS-XRCE client running on the OpenCR 1.0 board, which publishes IMU sensor data to ROS topics and controls the wheels of the TurtleBot based on the teleoperation messages published as ROS topics. This channel uses serial communication.
  • A Lidar driver process running on the Raspberry Pi interfaces with the Lidar, and uses a DDS-XRCE client to publish data over UDP to an XRCE agent also running on the Raspberry Pi. The agent sends the sensor data to several ROS topics.
  • An SSH client process runs on the field testing host, connecting to the Raspberry Pi for diagnostics and debugging.
  • A software update agent process runs on the Raspberry Pi, OpenCR board, and navigation hosts. The agent polls a code deployment service process running on the CI host for updates; the service responds with a list of packages and versions, and with a new version of each package when an update is required.
  • The CI pipeline process on the CI host is a Jenkins instance that is polling a code repository for new revisions of a number of ROS packages. Each time a new revision is detected it rebuilds all the affected packages, packs the binary artifacts into several deployment packages, and eventually sends the package updates to the update agent when polled.
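The monitor_speed behaviour described above can be illustrated with a small sketch. It deliberately avoids any ROS dependency: the `Metric` class is a hypothetical stand-in for one entry of a MetricList message, the metric names and units are assumptions, and "speed" is assumed to mean the Euclidean norm of the linear and angular velocity vectors carried by an odometry message.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Metric:
    # Hypothetical stand-in for one entry of a MetricList message.
    name: str
    value: float
    unit: str


def _norm(vector: Tuple[float, float, float]) -> float:
    # Euclidean norm of a 3D vector.
    return math.sqrt(sum(component * component for component in vector))


def speed_metrics(linear: Tuple[float, float, float],
                  angular: Tuple[float, float, float]) -> List[Metric]:
    """Build the two metrics derived from one odometry message."""
    return [
        Metric("linear_speed", _norm(linear), "m/s"),      # assumed name/unit
        Metric("angular_speed", _norm(angular), "rad/s"),  # assumed name/unit
    ]
```

In a real node, the two input vectors would come from the twist part of each message received on /odom, and the returned list would be wrapped in a message published to /metrics.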
Software Dependencies

See TurtleBot3 ROS 2 setup instructions for details about TurtleBot3 software dependencies.

External Actors
  • A test engineer is testing the robot.
  • A user operates the robot with the joystick.
  • A business analyst periodically checks dashboards with performance information for the robot in the AWS CloudWatch web console.
Robot Data Assets
  • Topic Message
    • Private Data
      • Camera image messages
      • Logging messages (might describe camera data)
      • CloudWatch metrics and logs messages (could contain Intellectual Property such as which algorithms are implemented in a particular node).
    • Restricted Data
      • Robot Speed and Orientation. Some understanding of the current robot task may be reconstructed from those messages.
  • AWS CloudWatch data
    • Metrics, logs, and aggregated dashboard are all private data, as they are different serializations of the corresponding topics.
  • System and ROS logs on the Raspberry Pi
    • Private data just like CloudWatch logs.
  • AWS credentials on all hosts
    • Secret data, provide access to APIs and other compute assets on the AWS cloud.
  • SSH credentials on all hosts
    • Secret data, provide access to other hosts.
  • Robot embedded algorithm (e.g. teleop_twist_joy)
    • Secret data. Intellectual Property (IP) theft is a critical issue for nodes which are either easy to decompile or written in an interpreted language.
Robot Compute Assets
  • AWS CloudWatch APIs and all other AWS resources accessible from the AWS credentials present in all the hosts.
  • Robot Topics
    • /cmd_vel could be abused to damage the robot or hurt users.

Entry points

  • Communication Channels
    • DDS / ROS Topics
      • Topics can be listened to or written to by any actor that:
        1. Is connected to a network where DDS packets are routed, and
        2. Has the necessary permissions (read / write) if SROS is enabled.
      • When SROS is enabled, attackers may try to compromise the CA or its private keys in order to generate or intercept private keys and to issue malicious certificates that allow spoofing.
      • USB connection is used for communication between Raspberry Pi and OpenCR board and LIDAR sensor.
    • SSH
      • SSH access is possible to anyone on the same LAN or WAN (if port-forwarding is enabled). Many images are set up with a default username and password with administrative capabilities (e.g. sudoer).
      • SSH access can be compromised by modifying the robot.
  • Deployed Software
    • ROS nodes are compiled either by a third party (OSRF build-farm) or by the application developer. They can be compiled directly on the robot or copied from a developer workstation using scp or rsync.
    • An attacker compromising the build-farm or the developer workstation could introduce a vulnerability in a binary which would then be deployed to the robot.
  • Data Store (local filesystem)
    • Robot Data
      • System and ROS logs are stored on the Raspberry Pi filesystem.
      • Robot system is subject to physical attack (physically removing the disk from the robot to read its data).
    • Remote Host Data
      • Machines running additional ROS nodes will also contain log files.
      • Remote hosts are subject to physical attack. Additionally, these hosts may not be as secured as the robot host.
    • Cloud Data
      • AWS CloudWatch data is accessible from public AWS API endpoints, if credentials are available for the corresponding AWS account.
      • AWS CloudWatch credentials can allow attackers to access other cloud resources depending on how the account has been configured.
    • Secret Management
      • DDS / ROS Topics
        • If SROS is enabled, private keys are stored on the local filesystem.
      • SSH
        • SSH credentials are stored on the robot filesystem.
        • Private SSH keys are stored on any computer allowed to log into the robot when public/private key authentication is used.
      • AWS Credentials
        • AWS credentials are stored on the robot file system.
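Since SROS private keys, SSH credentials and AWS credentials all live on the local filesystem, one basic check is that none of those files are accessible to other local users. The following is a minimal sketch assuming a POSIX system; the function name is hypothetical, and callers would pass whatever credential paths their deployment actually uses. It only inspects file mode bits, so it does not address the physical-access or root-compromise scenarios listed above.

```python
import os
import stat
from typing import Iterable, List


def overly_permissive(paths: Iterable[str]) -> List[str]:
    """Return the paths whose mode grants any access to group or others."""
    flagged = []
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        # Any read, write or execute bit for group/others is reported.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            flagged.append(path)
    return flagged
```

A file with mode 0600 passes this check, while a file with mode 0644 is reported.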

Use case scenarios

  • Development, Testing and Validation
    • An engineer develops code or runs tests on the robot. They may:
      • Restart the robot
      • Restart the ROS graph
      • Physically interact with the robot
      • Log into the robot using SSH
      • Check the AWS console for metrics and logs
  • End-User
    • A user tele-operates the robot. They may:
      • Start the robot.
      • Control the joystick.
    • A business analyst may access AWS CloudWatch data on the console to assess the robot performance.

Threat model

Each generic threat described in the previous section can be instantiated on the TurtleBot 3.

This table indicates which particular TurtleBot assets and entry points are impacted by each threat. A check sign (✓) means impacted while a cross sign (✘) means not impacted. The “SROS Enabled?” column explicitly states whether using SROS would mitigate the threat or not. A check sign (✓) means that the threat could be exploited while SROS is enabled, while a cross sign (✘) means that the threat requires SROS to be disabled to be applicable.

Threat | TurtleBot Assets | Entry Points | SROS Enabled? | Attack | Mitigation | Mitigation Result (redesign / transfer / avoid / accept) | Additional Notes / Open Questions
TurtleBot Assets: Human Assets, Robot App. | Entry Points: DDS Topic, OSRF Build-farm, SSH
Embedded / Software / Communication / Inter-Component Communication
An attacker spoofs a component identity. Without SROS any node may have any name so spoofing is trivial.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
Risk is reduced if SROS is used. Authentication codes need to be generated for every pair of participants rather than every participant sharing a single authentication code with all other participants. Refer to section 7.1.1.3 of the DDS Security standard.
An attacker deploys a malicious node which does not enable the DDS Security Extension and spoofs the `joy_node`, forcing the robot to stop.
  • DDS Security Governance document must set `allow_unauthenticated_participants` to False to prevent unauthenticated participants from communicating with authenticated nodes.
  • DDS Security Governance document must set `enable_join_access_control` to True to explicitly whitelist node-to-node communication. permissions.xml should be as restricted as possible.
Risk is mitigated.
  • Which actions would still be possible even with a restrictive permission set? For example, does a node have the ability to fetch parameters from the parameter server?
  • What about setting parameters, querying another node's lifecycle state, etc.?
An attacker steals node credentials and spoofs the joy_node, forcing the robot to stop.
  • Store node credentials in a secure location (secure enclave, RoT) to reduce the probability of having a private key leaked.
  • Run nodes in isolated sandboxes to ensure one node cannot access another node's data (including credentials).
  • Permissions CA should digitally sign node binaries to prevent running tampered binaries.
  • Permissions CA should be able to revoke certificates in case credentials get stolen.
Mitigating risk requires additional work.
  • AWS Robotics and Automation is currently evaluating the feasibility of storing DDS-Security credentials in a TPM.
  • Complete mitigation would require isolation using e.g. Snap or Docker.
  • Deploying an application with proper isolation would require us to revive discussions around the ROS 2 launch system.
  • Yocto / OpenEmbedded / Snap support should be considered.
An attacker intercepts and alters a message. Without SROS an attacker can modify /cmd_vel messages sent through a network connection to e.g. stop the robot.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications. Message tampering is mitigated by DDS security as message authenticity is verified by default (with preshared HMACs / digital signatures).
Risk is reduced if SROS is used. Additional hardening could be implemented by forcing some of the TurtleBot topics to only be communicated over shared memory.
An attacker writes to a communication channel without authorization. Without SROS, any node can publish to any topic.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
Risk is mitigated.
An attacker obtains credentials and publishes messages to /cmd_vel.
  • permissions.xml must be kept as closed as possible.
  • Permission to publish to sensitive topics must only be granted to a limited set of nodes the robot can trust.
  • ROS nodes must run in sandboxes to prevent interference.
Transfer risk to the user configuring permissions.xml. permissions.xml should ideally be generated, as writing it manually is cumbersome and error-prone.
An attacker listens to a communication channel without authorization. Without SROS: any node can listen to any topic.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
Risk is mitigated if SROS is used.
DDS participants are enumerated and fingerprinted to look for potential vulnerabilities.
  • DDS Security Governance document must set `metadata_protection_kind` to ENCRYPT to prevent malicious actors from observing communications.
  • DDS Security Governance document must set `enable_discovery_protection` to True to prevent malicious actors from enumerating and fingerprinting DDS participants.
  • DDS Security Governance document must set `enable_liveliness_protection` to True
Risk is mitigated if DDS-Security is configured appropriately.
TurtleBot camera images are saved to a remote location controlled by the attacker.
  • DDS Security Governance document must set `metadata_protection_kind` to ENCRYPT to prevent malicious actors from observing communications.
  • DDS Security Governance document must set `enable_discovery_protection` to True to prevent malicious actors from enumerating and fingerprinting DDS participants.
  • DDS Security Governance document must set `enable_liveliness_protection` to True
Risk is mitigated if DDS-Security is configured appropriately.
TurtleBot LIDAR measurements are saved to a remote location controlled by the attacker. Local communication using XRCE should be done over the loopback interface. This doesn't protect the serial communication from the LIDAR sensor to the Raspberry Pi.
An attacker prevents a communication channel from being usable. Without SROS: any node can "spam" any other component.
  • Enable SROS to use the DDS Security Extension. This does not prevent nodes from being flooded but it ensures that only communication from allowed participants is processed.
Risk may be reduced when using SROS.
A node can "spam" another node it is allowed to communicate with.
  • Implement rate limitation on topics
  • Define a method for topics to declare their required bandwidth / rate.
Mitigating risk requires additional work. How to enforce when nodes are malicious? Observe and kill?
Embedded / Software / Communication / Long-Range Communication (e.g. WiFi, Cellular Connection)
An attacker hijacks robot long-range communication ✘/✓ An attacker connects to the same unprotected WiFi network as a TurtleBot.
  • Prevent TurtleBot from connecting to non-protected WiFi networks.
  • SROS should always be used for long-range DDS communication.
Risk is reduced for DDS if SROS is used. Other protocols may still be vulnerable. Enforcing communication through a VPN could be an idea (see PR2 manual for a reference implementation) or only DDS communication could be allowed on long-range links (SSH could be replaced by e.g. adbd and be only available using a dedicated short-range link).
An attacker intercepts robot long-range communications (e.g. MitM) ✘/✓ An attacker connects to the same unprotected WiFi network as a TurtleBot.
  • Prevent TurtleBot from connecting to non-protected WiFi networks.
  • SROS should always be used for long-range DDS communication.
Risk is reduced for DDS if SROS is used. Other protocols may still be vulnerable.
An attacker disrupts (e.g. jams) robot long-range communication channels. ✘/✓ Jam the WiFi network a TurtleBot is connected to. If network connectivity is lost, switch to cellular network. Mitigating is impossible on TurtleBot (no secondary long-range communication system).
Embedded / Software / Communication / Short-Range Communication (e.g. Bluetooth)
An attacker executes arbitrary code using a short-range communication protocol vulnerability. ✘/✓ Attacker runs a BlueBorne attack to execute arbitrary code on the TurtleBot Raspberry Pi. A potential mitigation may be to build a minimal kernel with e.g. Yocto which does not enable features the robot does not use. In this particular case, the TurtleBot does not require Bluetooth but OS images enable it by default.
Embedded / Software / OS & Kernel
An attacker compromises the real-time clock to disrupt the kernel RT scheduling guarantees. ✘/✓ A malicious node attempts to write a compromised kernel to /boot. TurtleBot / Zymbit Key integration will mostly mitigate this threat. Some level of mitigation will be possible through TurtleBot / Zymkey integration. SecureBoot support would probably be needed to completely mitigate this threat.
✘/✓ A malicious actor sends incorrect NTP packets to enable other attacks (e.g. allowing the use of expired certificates) or to prevent the robot software from behaving properly (time between sensor readings could be miscomputed).
  • Implement NTP Best Practices
  • Use a hardware RTC clock such as the one provided by the Zymbit key to reduce the system reliance on NTP.
  • If your robot relies on GPS for localization purposes, consider using time from your GPS receiver (note that this opens the door to other vulnerabilities such as GPS jamming).
TurtleBot / Zymbit Key integration and following best practices for NTP configuration will mostly mitigate this threat.
An attacker compromises the OS or kernel to alter robot data. ✘/✓ A malicious node attempts to write a compromised kernel to /boot. TurtleBot / Zymbit Key integration will mostly mitigate this threat.
An attacker compromises the OS or kernel to eavesdrop on robot data. ✘/✓ A malicious node attempts to write a compromised kernel to /boot. TurtleBot / Zymbit Key integration will mostly mitigate this threat.
An attacker gains access to the robot OS through its administration interface. ✘/✓ An attacker connects to the TurtleBot using the Raspbian default username and password (pi / raspberry).
  • The default password should be changed and, whenever possible, password-based authentication should be replaced by SSH key-based authentication.
Demonstrating a mitigation would require significant work. Building out an image using Yocto w/o a default password could be a way forward. SSH is significantly increasing the attack surface. Adbd may be a better solution to handle administrative access.
✘/✓ An attacker intercepts SSH credentials.
  • Disable password-based auth for SSH. Rotate SSH keys periodically for all hosts.
  • Set up bastion hosts that are the only hosts that can connect to the navigation and Raspberry Pi hosts. Distribute temporary credentials for accessing the bastion hosts through some well-known secure federated identity system.
  • Whenever possible, administrative interfaces should only be accessible by users colocated with the robot. In this case, ssh can be replaced by e.g. an adbd implementation for Linux.
Demonstrating a mitigation based on replacing sshd by adbd would require significant work.
Embedded / Software / Component-Oriented Architecture
A node accidentally writes incorrect data to a communication channel. ✘/✓ TurtleBot joy_node could accidentally publish large or randomized joystick values.
  • Expand DDS IDL to allow users to embed validity criteria to automate input sanitization (i.e. validate ranges, etc.)
  • Expand RMW to define per-topic strategies for invalid messages (drop, throw, abort, etc.)
Mitigation would require significant amount of engineering. This task would require designing and implementing a new system from scratch in ROS 2 to document node interactions.
An attacker deploys a malicious node on the robot. ✘/✓ TurtleBot is running a malicious joy_node. This node steals robot credentials, tries to kill other processes, and fills the disk with random data.
  • Modify TurtleBot launch scripts to run each node as a different user (or using some kind of containerization?)
  • Each ROS user should have very limited permissions and quota should be configured.
It is possible to mitigate this attack on the TurtleBot by creating multiple user accounts. While the one-off solution of creating user accounts manually is interesting, it would be better to generalize this through some kind of mechanism (e.g. AppArmor / SecComp support for roslaunch).
✘/✓ TurtleBot joy_node repository is compromised by an attacker and malicious code is added to the repository. joy_node is released again and the target TurtleBot is compromised when the node gets updated.
  • For tampering a package on a third-party repository: set up a third-party package validation process so packages are not consumed directly from a third party, but instead each package version has to be approved and imported into a private trusted repository:
    • During import packages can be manually inspected and only downloaded from trusted official sources accessed through secure protocols.
    • Prefer importing from source instead of in binary formats and periodically run automatic analysis tools on the source code.
    • Track external version identifier for each imported package and flag an imported package version when a vulnerability is discovered on the same version of the original third party package.
    • Review the third-party package license with your legal team as part of the import process to ensure compliance with your company's legal requirements.
  • For tampering a package on a private code repository or a package artifact on an artifact repository: use standard mitigations for data stores, including encryption of data at rest (server-side encryption) and in transit (HTTPS/TLS), implement authorization with least-privilege permissions for access to the data store, and restrict network access to least privilege.
  • For tampering a package artifact in flight during deployment: besides in transit encryption (HTTPS/TLS) use code signing (e.g. with AWS IoT code signing API) so hosts verify the package origin and integrity before installing them. This typically requires deploying a public key for the signing authority to the update agents on all hosts that receive software updates.
Risk is mitigated by enforcing processes around source control but the end-to-end story w.r.t deployment is still unclear. Mitigation of this threat is also detailed in the diagram below this table.
An attacker can prevent a component running on the robot from executing normally. ✘/✓ A malicious TurtleBot joy_node is randomly killing other nodes to disrupt the robot behavior.
  • TurtleBot launch files should set the exit_handler to restart_exit_handler.
  • ros2 node kill is too permissive for production use cases. There should be mechanisms for safely asking a node to restart and/or forcing a process to be killed as a last resort. Those topics should be protected using DDS permissions to severely restrict the users able to run those operations (or fully disable those features for some particular scenarios).
Risk is mitigated for the TurtleBot platform. This approach is unlikely to scale with robots whose nodes cannot be restarted at any time safely though. Systemd should be investigated as another way to handle and/or run ROS nodes.
Embedded / Software / Configuration Management
An attacker modifies configuration values without authorization. Node parameters are freely modifiable by any DDS domain participant.
An attacker accesses configuration values without authorization. Node parameter values can be read by any DDS domain participant.
A user accidentally misconfigures the robot. Node parameter values can be modified by anyone to any value.
Embedded / Software / Data Storage (File System)
An attacker modifies the robot file system by physically accessing it. ✘/✓ TurtleBot SD card is physically accessed and a malicious daemon is installed. Use a TPM ZymKey to encrypt the filesystem (using dm-crypt / LUKS). TurtleBot / Zymbit Key integration will mitigate this threat.
An attacker eavesdrops on the robot file system by physically accessing it. ✘/✓ TurtleBot SD card is cloned to steal robot logs, credentials, etc. Use a TPM ZymKey to encrypt the filesystem (using dm-crypt / LUKS). TurtleBot / Zymbit Key integration will mitigate this threat.
An attacker saturates the robot disk with data. ✘/✓ TurtleBot is running a malicious joy_node. This node steals robot credentials, tries to kill other processes, and fills the disk with random data.
  • Modify TurtleBot launch scripts to run each node as a different user (or using some kind of containerization?)
  • Each ROS user should have very limited permissions and quota should be configured.
  • joy_node should run w/o any file system access at all.
A separate user account with disk quota enabled may be a solution but this needs to be demonstrated.
Embedded / Software / Logs
An attacker exfiltrates log data to a remote server. ✘/✓ TurtleBot logs are exfiltrated by a malicious joy_node.
  • TurtleBot nodes should be properly isolated and log into files only they can access.
  • TurtleBot filesystem should be encrypted (dm-crypt / LUKS).
Risk is mitigated by limiting access to logs. However, how to handle logs once stored requires more work. Ensuring no sensitive data is logged is also a challenge. Production node logs should be scanned to detect leakage of sensitive information.
Embedded / Hardware / Sensors
An attacker spoofs a robot sensor (by e.g. replacing the sensor itself or manipulating the bus). ✘/✓ The USB connection of the TurtleBot laser scanner is modified to be turned off under some conditions.
Embedded / Hardware / Actuators
An attacker spoofs a robot actuator. ✘/✓ TurtleBot connection to the OpenCR board is intercepted, motor control commands are altered. Enclose OpenCR board and Raspberry Pi with a case and use ZymKey perimeter breach protection. Robot can be rendered inoperable if perimeter is breached.
An attacker modifies the command sent to the robot actuators (intercept & retransmit). ✘/✓ TurtleBot connection to the OpenCR board is intercepted, motor control commands are altered. Enclose OpenCR board and Raspberry Pi with a case and use ZymKey perimeter breach protection. Robot can be rendered inoperable if perimeter is breached.
An attacker intercepts the robot actuators command (can recompute localization). ✘/✓ TurtleBot connection to the OpenCR board is intercepted, motor control commands are logged. Enclose OpenCR board and Raspberry Pi with a case and use ZymKey perimeter breach protection. Robot can be rendered inoperable if perimeter is breached.
An attacker sends malicious commands to actuators to trigger the E-Stop. ✘/✓
Embedded / Hardware / Ancillary Functions
An attacker compromises the software or sends malicious commands to drain the robot battery. ✘/✓
Remote / Client Application
An attacker intercepts the user credentials on their desktop machine. A roboticist connects to the robot with Rviz from their development machine. Another user with root privileges steals the credentials.
  • On Mac: SROS credentials should be stored in the OS X Secure Enclave.
Remote / Cloud Integration
An attacker intercepts cloud service credentials deployed on the robot. TurtleBot CloudWatch node credentials are extracted from the filesystem through physical access.
An attacker alters or deletes robot cloud data. TurtleBot CloudWatch cloud data is accessed by an unauthorized user.
An attacker alters or deletes robot cloud data. TurtleBot CloudWatch data is deleted by an attacker.
  • Cloud data backup should be realized on a regular basis.
Remote / Software Deployment
An attacker spoofs the deployment service.
An attacker modifies the binaries sent by the deployment service.
An attacker intercepts the binaries sent by the deployment service.
An attacker prevents the robot and the deployment service from communicating.
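Several mitigations in the table above depend on specific values in the DDS Security Governance document. As a minimal sketch, the recommended values could be checked automatically. This assumes the unsigned governance XML is available as text and that the setting names in the table appear as XML element names; the function name and the report format are hypothetical. The check ignores where in the document each element sits, so it is a coarse lint rather than a full validation.

```python
import xml.etree.ElementTree as ET
from typing import List

# Settings named in the table above, with the values it recommends.
RECOMMENDED = {
    "allow_unauthenticated_participants": "false",
    "enable_join_access_control": "true",
    "enable_discovery_protection": "true",
    "enable_liveliness_protection": "true",
    "metadata_protection_kind": "ENCRYPT",
}


def _normalize(value: str) -> str:
    # Treat booleans case-insensitively and accept 1/0 as true/false.
    value = value.strip().lower()
    return {"1": "true", "0": "false"}.get(value, value)


def governance_findings(xml_text: str) -> List[str]:
    """List every recommended setting that is missing or set differently."""
    root = ET.fromstring(xml_text)
    findings = []
    for tag, expected in RECOMMENDED.items():
        elements = list(root.iter(tag))
        if not elements:
            findings.append(f"{tag}: missing")
        for element in elements:
            actual = (element.text or "").strip()
            if _normalize(actual) != _normalize(expected):
                findings.append(f"{tag}: expected {expected}, found {actual}")
    return findings
```

An empty result means every listed setting is present with the recommended value in every rule where it appears. It does not say anything about permissions.xml, which the table treats separately.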

Threat Diagram: An attacker deploys a malicious node on the robot

This diagram details how code is built and deployed for the “An attacker deploys a malicious node on the robot” attack.

Signing Service Mitigation Diagram Source (draw.io)
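The table above asks hosts to verify package origin and integrity before installing an update. The sketch below covers only the integrity half, and only under a strong assumption: that the expected SHA-256 digest was obtained from a source the host already trusts, such as a signed manifest. It is not a substitute for code signing with an asymmetric key, and the function name is hypothetical.

```python
import hashlib
import hmac


def artifact_matches(artifact: bytes, expected_sha256_hex: str) -> bool:
    """Return True when the artifact's SHA-256 digest equals the expected one."""
    actual = hashlib.sha256(artifact).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

An update agent following this approach would refuse to install any artifact for which the function returns False.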

Attack Tree

Threats can be organized in the following attack tree.

Signing Service Mitigation Diagram Source (draw.io)

Threat Model Validation Strategy

Validating this threat model end-to-end is a long-term effort meant to be distributed among the whole ROS 2 community. The threat list is meant to evolve with this document, and additional reference platforms may be introduced in the future.

  1. Set up a TurtleBot with the exact software described in the TurtleBot3 section of this document.
  2. Penetration Testing
    • Attacks described in the spreadsheet should be implemented. For instance, a malicious joy_node could be implemented to try to disrupt the robot operations.
    • Once the vulnerability has been exploited, the exploit should be released to the community so that the results can be reproduced.
    • Whether the attack has been successful or not, this document should be updated accordingly.
    • If the attack was successful, a mitigation strategy should be implemented. It can either be done through improving ROS 2 core packages or it can be a platform-specific solution. In the second case, the mitigation will serve as an example to publish best practices for the development of secure robotic applications.

Over time, new reference platforms may be introduced in addition to the TurtleBot 3 to define new attacks and allow other types of mitigations to be evaluated.

Threat Analysis for the MARA Robotic Platform

System description

The application considered in this section is a MARA modular robot operating in an industrial environment while performing a pick & place activity. MARA is the first robot to run ROS 2 natively. It is an industrial-grade collaborative robotic arm which runs ROS 2 on each joint, end-effector, external sensor or even on its industrial controller. Through the H-ROS communication bus, MARA is empowered with new possibilities in the professional landscape of robotics. It offers millisecond-level distributed bounded latencies for usual payloads and submicrosecond-level synchronization capabilities across ROS 2 components.

Built out of individual modules that natively run on ROS 2, MARA can be physically extended in a seamless manner. However, this also moves towards more networked robots and production environments which brings new challenges, especially in terms of security and safety.

The robot considered is a MARA, a 6 Degrees of Freedom (6DoF) modular and collaborative robotic arm with 3 kg of payload and repeatability below 0.1 mm. The robot can reach angular speeds up to 90°/second and has a reach of 656 mm. The robot contains a variety of sensors on each joint and can be controlled from any industrial controller that supports ROS 2 and uses the HRIM information model. Each of the modules contains the H-ROS communication bus for robots enabled by the H-ROS SoM, which delivers real-time, security and safety capabilities for ROS 2 at the module level.

No information is provided about how ROS 2 nodes are distributed on each module. Each joint offers the following ROS 2 API capabilities as described in their documentation (MARA joint):

  • Topics
    • GoalRotaryServo.msg model allows controlling the position, velocity and/or effort (generated from models/actuator/servo/topics/goal.xml, see HRIM for more).
    • StateRotaryServo.msg publishes the status of the motor.
    • Power.msg publishes the power consumption.
    • Status.msg informs about the resources that are consumed by the H-ROS SoM.
    • StateCommunication.msg is created to inform about the state of the device network.
  • Services
    • ID.srv publishes the general identity of the component.
    • Simulation3D.srv and SimulationURDF.srv send the URDF and the 3D model of the modular component.
    • SpecsCommunication.srv is a HRIM service which reports the specs of the device network.
    • SpecsRotaryServo.srv is a HRIM service which reports the main features of the device.
    • EnableDisable.srv disables or enables the servo motor.
    • ClearFault.srv sends a request to clear any fault in the servo motor.
    • Stop.srv requests to stop any ongoing movement.
    • OpenCloseBrake.srv opens or closes the servo motor brake.
  • Actions:
    • GoalJointTrajectory allows moving the joint using a trajectory message.

This API is translated into the following abstractions:

Topic Name
goal /hrim_actuation_servomotor_XXXXXXXXXXXX/goal_axis1
goal /hrim_actuation_servomotor_XXXXXXXXXXXX/goal_axis2
state /hrim_actuation_servomotor_XXXXXXXXXXXX/state_axis1
state /hrim_actuation_servomotor_XXXXXXXXXXXX/state_axis2


Service Name
specs /hrim_actuation_servomotor_XXXXXXXXXXXX/specs
enable servo /hrim_actuation_servomotor_XXXXXXXXXXXX/enable
disable servo /hrim_actuation_servomotor_XXXXXXXXXXXX/disable
clear fault /hrim_actuation_servomotor_XXXXXXXXXXXX/clear_fault
stop /hrim_actuation_servomotor_XXXXXXXXXXXX/stop_axis1
stop /hrim_actuation_servomotor_XXXXXXXXXXXX/stop_axis2
close brake /hrim_actuation_servomotor_XXXXXXXXXXXX/close_brake_axis1
close brake /hrim_actuation_servomotor_XXXXXXXXXXXX/close_brake_axis2
open brake /hrim_actuation_servomotor_XXXXXXXXXXXX/open_brake_axis1
open brake /hrim_actuation_servomotor_XXXXXXXXXXXX/open_brake_axis2


Action Name
goal joint trajectory /hrim_actuation_servomotor_XXXXXXXXXXXX/trajectory_axis1
goal joint trajectory /hrim_actuation_servomotor_XXXXXXXXXXXX/trajectory_axis2
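The topic, service and action names above follow a regular pattern: a per-device prefix followed by an operation name and, where relevant, an axis suffix. The following sketch reproduces that pattern so that, for example, an access-control review can enumerate every name a joint exposes. The function name is hypothetical, and the device identifier is treated as an opaque string because the source only shows it as the placeholder XXXXXXXXXXXX.

```python
from typing import Dict, List, Sequence


def servo_names(device_id: str,
                axes: Sequence[int] = (1, 2)) -> Dict[str, List[str]]:
    """Build the topic, service and action names listed in the tables above."""
    prefix = f"/hrim_actuation_servomotor_{device_id}"

    def per_axis(stem: str) -> List[str]:
        # One name per axis, e.g. <prefix>/goal_axis1, <prefix>/goal_axis2.
        return [f"{prefix}/{stem}_axis{axis}" for axis in axes]

    device_wide = ("specs", "enable", "disable", "clear_fault")
    return {
        "topics": per_axis("goal") + per_axis("state"),
        "services": [f"{prefix}/{name}" for name in device_wide]
        + per_axis("stop") + per_axis("close_brake") + per_axis("open_brake"),
        "actions": per_axis("trajectory"),
    }
```

For a two-axis joint this yields four topics, ten services and two actions, matching the rows of the three tables.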


Parameters Name
ecat_interface Ethercat interface
reduction_ratio Factor to calculate the position of the motor
position_factor Factor to calculate the position of the motor
torque_factor Factor to calculate the torque of the motor
count_zeros_axis1  Axis 1 absolute value of the encoder for the zero position 
count_zeros_axis2  Axis 2 absolute value of the encoder for the zero position
enable_logging Enable/Disable logging 
axis1_min_position Axis 1 minimum position in radians 
axis1_max_position Axis 1 maximum position in radians 
axis1_max_velocity Axis 1 maximum velocity in radians/s
axis1_max_acceleration Axis 1 maximum acceleration in radians/s^2
axis2_min_position Axis 2 minimum position in radians 
axis2_max_position Axis 2 maximum position in radians 
axis2_max_velocity Axis 2 maximum velocity in radians/s
axis2_max_acceleration Axis 2 maximum acceleration in radians/s^2
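The per-axis position, velocity and acceleration parameters above read as limits on what a goal may request. The source does not describe how a joint applies them, so the following is only an illustrative sketch of such a check; the class name, function name and the numeric values used in any example are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AxisLimits:
    min_position: float      # axisN_min_position, radians
    max_position: float      # axisN_max_position, radians
    max_velocity: float      # axisN_max_velocity, radians/s
    max_acceleration: float  # axisN_max_acceleration, radians/s^2


def goal_violations(limits: AxisLimits, position: float,
                    velocity: float = 0.0,
                    acceleration: float = 0.0) -> List[str]:
    """Return the names of the limits that a requested goal would exceed."""
    violations = []
    if not limits.min_position <= position <= limits.max_position:
        violations.append("position")
    # Velocity and acceleration limits are treated as symmetric magnitudes.
    if abs(velocity) > limits.max_velocity:
        violations.append("velocity")
    if abs(acceleration) > limits.max_acceleration:
        violations.append("acceleration")
    return violations
```

A check like this is only as trustworthy as the parameter values it reads, which is why unauthorized modification of node parameters matters for a threat model.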


The controller used in the application is a ROS 2-enabled industrial PC, specifically, an ORC. This PC also features the H-ROS communication bus for robots which commands MARA in a deterministic manner. This controller makes direct use of the aforementioned communication abstractions (topics, services and actions).

Architecture Dataflow diagram

ROS 2 Application Diagram Source (draw.io)

Assets

This section describes the components and specifications within the MARA robot environment. The different aspects of the robot are listed and detailed below under Hardware, Software and Network. The external actors and data assets are described separately.

Hardware
  • 1 x MARA modular robot: a modular, ROS 2-enabled manipulator for industrial automation purposes.
    • Robot modules
      • 3 x Han’s Robot Modular Joints: 2 DoF electrical motors that include precise encoders and an electro-mechanical braking system for safety purposes. Available in a variety of torque and size combinations, ranging from 2.8 kg to 17 kg in weight and from 9.4 Nm to 156 Nm in rated torque.
        • Mechanical connector: H-ROS connector A
        • Power input: 48 Vdc
        • Communication: H-ROS robot communication bus
          • Link layer: 2 x Gigabit (1 Gbps) TSN Ethernet network interface
          • Middleware: Data Distribution Service (DDS)
        • On-board computation: Dual core ARM® Cortex-A9
        • Operating System: Real-Time Linux
        • ROS 2 version: Crystal Clemmys
        • Information model: HRIM Coliza
        • Security:
          • DDS crypto, authentication and access control plugins
      • 1 x Robotiq Modular Grippers: ROS 2 enabled industrial end-of-arm-tooling.
        • Mechanical connector: H-ROS connector A
        • Power input: 48 Vdc
        • Communication: H-ROS robot communication bus
          • Link layer: 2 x Gigabit (1 Gbps) TSN Ethernet network interface
          • Middleware: Data Distribution Service (DDS)
        • On-board computation: Dual core ARM® Cortex-A9
        • Operating System: Real-Time Linux
        • ROS 2 version: Crystal Clemmys
        • Information model: HRIM Coliza
        • Security:
          • DDS crypto, authentication and access control plugins
  • 1 x Industrial PC: ORC, including:
    • CPU: Intel i7 @ 3.70GHz (6 cores)
    • RAM: Physical 16 GB DDR4 2133 MHz
    • Storage: 256 GB M.2 SSD interface PCIExpress 3.0
    • Communication: H-ROS robot communication bus
      • Link layer: 2 x Gigabit (1 Gbps) TSN Ethernet network interface
      • Middleware: Data Distribution Service (DDS)
    • Operating System: Real-Time Linux
    • ROS 2 version: Crystal Clemmys
    • Information model: HRIM Coliza
    • Security:
      • DDS crypto, authentication and access control plugins
  • 1 x Update Server: OTA
    • Virtual Machine running on AWS EC2
      • Operating System: Ubuntu 18.04.2 LTS
      • Software: Mender OTA server
Network
  1. Ethernet time sensitive (TSN) internal network: Interconnection of modular joints in the MARA Robot is performed over a daisy chained Ethernet TSN channel. Each module acts as a switch, forming the internal network of the robot.
  2. Manufacturer (Acutronic Robotics) corporate private network: A secure corporate wide-area network that spans multiple cities. Only authenticated users with suitable credentials can connect to the network, and good security practices like password rotation are in place. This network is used to develop and deploy updates to the robotic systems. It is managed by the original manufacturer.
  3. End-user corporate private network: A secure corporate wide-area network that spans multiple cities. Only authenticated users with suitable credentials can connect to the network, and good security practices like password rotation are in place.
  4. Cloud network: A VPC network residing in a public cloud platform, containing multiple servers. The network follows good security practices, such as the use of security appliances, user password rotation and multi-factor authentication. Only allowed users can access the network. The OTA service is open to the internet. Uses of the cloud network:
    • Manufacturer (Acutronic Robotics) uploads OTA artifacts from their Manufacturer corporate private network to the Cloud Network.
    • Robotic Systems on the End-user corporate private network fetch those artifacts from the Cloud Network.
    • Robotic Systems on the End-user corporate private network send telemetry data for predictive maintenance to Cloud Network.
Software processes

In this section all the processes running on the robotic system in scope are detailed.

  • Onboard Modular Joints
    • hros_servomotor_hans_lifecyle_node: this node is in charge of controlling the motors inside the H-ROS SoM. It exposes the services, actions and topics described below. It publishes joint states, joint velocities and joint efforts, and it provides a topic for servoing the modular joint and an action that waits for a trajectory.
    • hros_servomotor_hans_generic_node: a node that contains several generic services and topics with information about the modular joint, such as power measurement readings (voltage and current), status of the H-ROS SoM (such as CPU load or network statistics), specifications about communication and CPU, the URDF or even the mesh file of the modular joint.
    • Mender (OTA client): runs inside the H-ROS SoM. When an update is launched, the client downloads the new version, which is installed and becomes available once the device is rebooted.
  • Onboard Modular Gripper: Undisclosed.
  • Onboard Industrial PC:
    • MoveIt! motion planning framework.
    • Manufacturing process control applications.
    • Robot teleoperation utilities.
    • ROS 1 / ROS 2 bridges: These bridges are needed to run MoveIt!, which is not yet ported to ROS 2. There is an ongoing community effort to port this tool to ROS 2.
Software dependencies

This section lists all the relevant third-party software dependencies used by the components within the scope of this threat model.

  • Linux OS / Kernel
  • ROS 2 core libraries
  • H-ROS core libraries and packages
  • ROS 2 system dependencies as defined by rosdep:
    • RMW implementation: In the H-ROS API there is the chance to choose the DDS implementation.
    • The threat model describes attacks with the security enabled and disabled. If the security is enabled, the security plugins are assumed to be configured and enabled.

See Mara ROS 2 Tutorials to find more details about software dependencies.

External Actors

All the actors interacting with the robotic system are listed here.

  • End users
    • Robotics user: Interacts with the robot to perform work tasks.
    • Robotics operator: Performs maintenance tasks on the robot and integrates the robot with the industrial network.
    • Robotics researcher: Develops new applications and algorithms for the robot.
  • Manufacturer
    • Robotics engineer: Develops new updates and features for the robot itself. Performs on-site robot setup and maintenance tasks.
Robot Data assets

This section lists all the assets that store information within the system.

  • ROS 2 network (topic, actions, services information)
    • Private Data
      • Logging messages
    • Restricted Data
      • Robot Speed and Orientation. Some understanding of the current robot task may be reconstructed from those messages.
  • H-ROS API
    • Configuration and status data of hardware.
  • Modules (joints and gripper)
    • ROS logs on each module. Private data, metrics and configuration information that could lead to GDPR issues or disclosure of current robot tasks.
    • Module embedded software (drivers, operating system, configuration, etc.)
      • Secret data. Intellectual Property (IP) theft is a critical issue here.
    • Credentials. CI/CD, enterprise assets, SROS certificates.
  • ORC Industrial PC
    • Public information. Motion planning algorithms for driving the robot (MoveIt! motion planning framework).
    • Configuration data. Private information with safety implications. Includes configuration for managing the robot.
    • Programmed routines. Private information with safety implications.
    • Credentials. CI/CD, enterprise assets, SROS certificates.
  • CI/CD and OTA subsystem
    • Configuration data.
    • Credentials
    • Firmware files and source code. Intellectual property, both end-user and manufacturer.
  • Robot Data
    • System and ROS logs are stored on each joint module’s filesystem.
    • Robot system is subject to physical attacks (physically removing the disk from the robot to read its data).
  • Cloud Data
    • Different versions of the software are stored on the OTA server.
  • Secret Management
    • DDS / ROS Topics
      • If SROS is enabled, private keys are stored on the local filesystem of each module.
  • Mender OTA
    • Mender OTA client certificates are stored on the robot file system.
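Because private keys and client certificates live on each module's local filesystem, a quick audit of their file permissions is a cheap first check. The sketch below is illustrative only: the `loosely_protected_keys` helper and the assumption that private-key files end in `key.pem` are not taken from this document, and the check does nothing against the physical disk-removal attack mentioned above.

```python
import os
import stat
import tempfile


def loosely_protected_keys(keystore_dir, suffix="key.pem"):
    """Return private-key files that group or other users can access."""
    flagged = []
    for root, _dirs, files in os.walk(keystore_dir):
        for name in files:
            if not name.endswith(suffix):
                continue  # not a private key under the assumed naming scheme
            path = os.path.join(root, name)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            # Any group/other permission bit on a private key is a finding.
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                flagged.append(path)
    return sorted(flagged)


# Demonstration on a throwaway directory instead of a real keystore.
with tempfile.TemporaryDirectory() as demo_dir:
    exposed = os.path.join(demo_dir, "joint_key.pem")
    private = os.path.join(demo_dir, "gripper_key.pem")
    for path, mode in ((exposed, 0o644), (private, 0o600)):
        with open(path, "w") as fh:
            fh.write("placeholder, not a real key\n")
        os.chmod(path, mode)
    # On POSIX systems only joint_key.pem is reported.
    print(loosely_protected_keys(demo_dir))
```

File permissions only protect against other local users; keeping keys in a TPM or secure enclave, as suggested in the threat table, addresses a broader set of attacks.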

Use case scenarios

As described above, the application considered is a MARA modular robot operating in an industrial environment while performing a pick & place activity. For this use case, all possible external actors have been included. The actions they can perform on the robotic system have been limited to the following:

  • End-Users: From the end-user perspective, the functions considered are the ones needed for a factory line to work. Among these functions are the ones that keep the robot working and productive. In an industrial environment, this group of external actors is the one making use of the robot in its facilities.
    • Robotics Researcher: Development, testing and validation
      • A research engineer develops a pick and place task with the robot. They may:
        • Restart the robot
        • Restart the ROS graph
        • Physically interact with the robot
        • Receive software updates from OTA
        • Check for updates
        • Configure ORC control functionalities
    • Robotic User: Collaborative tasks
      • Start the robot
      • Control the robot
      • Work alongside the robot
    • Robotics Operator: Automation of industrial tasks
      • An industrial operator uses the robot in a factory. They may:
        • Start the robot
        • Control the robot
        • Configure the robot
        • Include the robot into the industrial network
        • Check for updates
        • Configure and interact with the ORC
  • Manufacturer: From the manufacturer perspective, the application considered is the development of the MARA robotic platform, using an automated system for deployment of updates, following a CI/CD system. These are the actors who create and maintain the robotic system itself.
    • Robotics Engineer: Development, testing and validation
      • Development of new functionality and improvements for the MARA robot.
        • Develop new software for the H-ROS SoM
        • Update OS and system libraries
        • Update ROS 2 subsystem and control nodes
        • Deployment of new updates to the robots and management of the fleet
        • In-Place robot maintenance

Entry points

The following section outlines the possible entry points an attacker could use as attack vectors to render the MARA vulnerable.

  • Physical Channels
    • Exposed debug ports.
    • Internal field bus.
    • Hidden development test points.
  • Communication Channels
    • DDS / ROS 2 Topics
      • Topics can be listened to or written to by any actor that:
        1. Is connected to a network where DDS packets are routed to.
        2. Has the necessary permissions (read / write) if SROS is enabled.
      • When SROS is enabled, attackers may try to compromise the CA or the private keys in order to generate or intercept private keys, or to issue malicious certificates that allow spoofing.
    • H-ROS API
      • H-ROS API access is possible to anyone on the same LAN or WAN (if port forwarding is enabled).
      • When authentication (in the H-ROS API) is enabled, attackers may try to bypass or break it.
    • Mender OTA
      • Updates are pushed from the server.
      • Updates could be intercepted and modified before reaching the robot.
  • Deployed Software
    • ROS nodes running on hardware are compiled by the manufacturer and deployed directly. An attacker may tamper with the software running on the hardware by compromising the OTA services.
    • An attacker compromising the developer workstation could introduce a vulnerability in a binary which would then be deployed to the robot.
    • MoveIt! motion planning library may contain exploitable vulnerabilities.
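Since updates could be intercepted and modified before reaching the robot, a client can at least compare a downloaded artifact against a digest obtained through a trusted channel. The following is a minimal sketch under that assumption; it is not Mender's own verification mechanism, the function names are hypothetical, and a digest check only helps if the expected value itself is authenticated (for example, by a signed manifest).

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_matches(path, expected_hex):
    """Compare the file digest with the trusted one in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Demonstration with a placeholder payload instead of a real update artifact.
with tempfile.TemporaryDirectory() as demo_dir:
    artifact = os.path.join(demo_dir, "update.bin")
    with open(artifact, "wb") as fh:
        fh.write(b"firmware image placeholder")
    trusted = hashlib.sha256(b"firmware image placeholder").hexdigest()
    print(artifact_matches(artifact, trusted))   # True
    print(artifact_matches(artifact, "0" * 64))  # False
```

A digest alone detects modification in transit but not a compromised server; the digital signatures recommended in the threat table cover that second case.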

Trust Boundaries for MARA in pick & place application

The following content applies the threat model to an industrial robot in its environment. The objective of this threat analysis is to identify the attack vectors for the MARA robotic platform. Those attack vectors will be identified and classified depending on the risk and the services involved. In the next sections, MARA’s components and risks will be detailed thoroughly using the threat model above, based on STRIDE and DREAD.

The diagram below illustrates MARA’s application with different trust zones (trust boundaries shown with dashed green lines). The number and scope of trust zones depend on the underlying infrastructure.

Robot System Threat Model Diagram Source (edited with Draw.io)

The trust zones illustrated above are the following:

  • Firmware Updates: This zone is where the manufacturer develops the different firmware versions for each robot.
  • OTA System: This zone is where the firmware versions are stored for the robots to download.
  • MARA Robot: All the robot’s components and internal communications are gathered in this zone.
  • Industrial PC (ORC): The industrial PC is considered a zone of its own because it manages the software the end-user develops and sends it to the robot.
  • Software control: This zone is where the end user develops software for the robot where the tasks to be performed are defined.

Threat Model

Each generic threat described in the main threat table can be instantiated on the MARA.

This table indicates which of MARA’s particular assets and entry points are impacted by each threat. A check sign (✓) means impacted while a cross sign (✘) means not impacted. The “SROS Enabled?” column explicitly states whether using SROS would mitigate the threat or not. A check sign (✓) means that the threat could be exploited while SROS is enabled, while a cross sign (✘) means that the threat requires SROS to be disabled to be applicable.

Threat MARA Assets Entry Points SROS Enabled? Attack Mitigation Mitigation Result (redesign / transfer / avoid / accept) Additional Notes / Open Questions
Human Assets Robot App. ROS 2 API (DDS) Manufacturer CI/CD End-user CI/CD H-ROS API OTA Physical
Embedded / Software / Communication / Inter-Component Communication
An attacker spoofs a software component identity. Without SROS any node may have any name so spoofing is trivial.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
Mitigating risk requires implementation of SROS on MARA. No verification of components. An attacker could connect a fake joint directly. Direct access to the system is granted (No NAC).
✘/✓ An attacker deploys a malicious node which is not enabling DDS Security Extension and spoofs the `joy_node` forcing the robot to stop.
  • DDS Security Governance document must set `allow_unauthenticated_participants` to False to prevent unauthenticated participants from communicating with authenticated nodes.
  • DDS Security Governance document must set `enable_join_access_control` to True to explicitly whitelist node-to-node communication. permissions.xml should be as restrictive as possible.
Risk is mitigated
An attacker steals node credentials and spoofs the joint node forcing the robot to stop.
  • Store node credentials in a secure location (secure enclave, RoT) to reduce the probability of having a private key leaked.
  • Run nodes in isolated sandboxes to ensure one node cannot access another node data (including credentials)
  • Permissions CA should digitally sign node binaries to prevent running tampered binaries.
  • Permissions CA should be able to revoke certificates in case credentials get stolen.
Mitigating risk requires additional work.
  • AWS Robotics and Automation is currently evaluating the feasibility of storing DDS-Security credentials in a TPM.
  • Complete mitigation would require isolation using e.g. Snap or Docker.
  • Deploying an application with proper isolation would require us to revive discussions around ROS 2 launch system
  • Yocto / OpenEmbedded / Snap support should be considered
An attacker intercepts and alters a message. Without SROS an attacker can modify `/goal_axis` or `trajectory_axis` messages sent through a network connection to e.g. stop the robot.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications. Message tampering is mitigated by DDS security as message authenticity is verified by default (with preshared HMACs / digital signatures)
Risk is reduced if SROS is used.
An attacker writes to a communication channel without authorization. Without SROS, any node can publish to any topic.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
An attacker listens to a communication channel without authorization. Without SROS: any node can listen to any topic.
  • Enable SROS / DDS Security Extension to authenticate and encrypt DDS communications.
Risk is reduced if SROS is used.
✘/✓ DDS participants are enumerated and fingerprinted to look for potential vulnerabilities.
  • DDS Security Governance document must set `metadata_protection_kind` to ENCRYPT to prevent malicious actors from observing communications.
  • DDS Security Governance document must set `enable_discovery_protection` to True to prevent malicious actors from enumerating and fingerprinting DDS participants.
  • DDS Security Governance document must set `enable_liveliness_protection` to True
Risk is mitigated if DDS-Security is configured appropriately.
An attacker prevents a communication channel from being usable. Without SROS: any node can "spam" any other component.
  • Enable SROS to use the DDS Security Extension. This does not prevent nodes from being flooded but it ensures that only communication from allowed participants is processed.
Risk may be reduced when using SROS.
A node can "spam" another node it is allowed to communicate with.
  • Implement rate limitation on topics
  • Define a method for topics to declare their required bandwidth / rate.
Mitigating risk requires additional work. How to enforce when nodes are malicious? Observe and kill?
Embedded / Software / Communication / Remote Application Interface
An attacker gains unauthenticated access to the remote application interface. ✘/✓ An attacker connects to the H-ROS API in an unauthenticated way. Reads robot configuration and alters configuration values.
  • Add authentication mechanisms to the H-ROS API.
  • Enable RBAC to limit user interaction with the API.
Risk is mitigated.
An attacker could eavesdrop communications to the Robot’s remote application interface. ✘/✓ An attacker executes a MitM attack, eavesdropping all unencrypted communications and commands sent to the API. Encrypt the communications through the usage of HTTPS. Risk is mitigated.
An attacker could alter data sent to the Robot’s remote application interface. ✘/✓ An attacker could execute a MitM attack and alter commands being sent to the API. Encrypt the communications through the usage of HTTPS. Risk is mitigated.
Embedded / Software / OS & Kernel
An attacker compromises the real-time clock to disrupt the kernel RT scheduling guarantees. ✘/✓ A malicious actor attempts to write a compromised kernel to /boot
  • Enable verified boot on U-Boot to prevent booting altered kernels.
  • Use the built-in TPM to store firmware public keys and define an RoT.
Risk is mitigated.
An attacker compromises the OS or kernel to alter robot data. ✘/✓ A malicious actor attempts to write a compromised kernel to /boot
  • Enable verified boot on U-Boot to prevent booting altered kernels.
  • Use the built-in TPM to store firmware public keys and define an RoT.
Risk is mitigated.
An attacker compromises the OS or kernel to eavesdrop on robot data. ✘/✓ A malicious actor attempts to write a compromised kernel to /boot
  • Enable verified boot on U-Boot to prevent booting altered kernels.
  • Use the built-in TPM to store firmware public keys and define an RoT.
Risk is mitigated.
Embedded / Software / Component-Oriented Architecture
A node accidentally writes incorrect data to a communication channel. ✘/✓ A node writes random or invalid values to the /goal_axis topics.
  • Expand DDS IDL to allow users to embed validity criteria to automate input sanitization (i.e. validate ranges, etc.)
  • Expand RMW to define per-topic strategies for invalid messages (drop, throw, abort, etc.).
Need to expand DDS IDL or RMW for mitigating the risk.
✘/✓ A node could write values outside the physical bounds to the `/goal_axis` or `/trajectory_axis` topics, causing damage to the robot.
  • Define physical limitations of the different joints on the motor driver, limiting the possible movement to a safe range.
  • Enable signature verification of executables to reduce the risks of inserting a malicious node.
  • Limit control of actuators to only the required nodes. Enable AppArmor policies to isolate nodes.
Risk is mitigated when applying limits on actuator drivers.
An attacker deploys a malicious node on the robot. ✘/✓ An attacker deploys a malicious node to the robot. This node performs dangerous movements that compromise safety. The node attempts to perform physical or logical damage to the modules.
  • Run each node in an isolated environment with limited privileges (sandboxing).
  • Enable signing and verification of executables.
  • Running the component in a Ubuntu Core sandbox environment could limit the consequences of the attack.
  • Enabling signature verification of executables would reduce the risks of inserting a malicious node.
  • Limiting control of actuators to only the required nodes would reduce risk in case of a node compromise. Enable AppArmor policies to isolate nodes.
An attacker can prevent a component running on the robot from executing normally. ✘/✓ A malicious node running on the robot starts sending kill requests to other nodes in the system, disrupting the normal behaviour of the robot. The ability to shut down or kill nodes through an API request is a problem in the ROS implementation. Node restart policies should be applied. Deprecation of the shutdown API call needs to be considered.
Embedded / Software / Configuration Management
An attacker modifies configuration values without authorization. Node parameters are freely modifiable by any DDS domain participant.
An attacker accesses configuration values without authorization. Node parameters values can be read by any DDS domain participant.
A user accidentally misconfigures the robot. ✘/✓ Node parameters values can be modified by anyone to any value.
Embedded / Software / Data Storage (File System)
An attacker modifies the robot file system by physically accessing it. ✘/✓ An attacker modifies the filesystem data within the robot Enable filesystem encryption with LUKS or dm-crypt, storing keys on TPM device. Risk is mitigated.
An attacker eavesdrops on the robot file system by physically accessing it. ✘/✓ An attacker physically accesses the memory chip to eavesdrop credentials, logs or sensitive data. Enable filesystem encryption with LUKS or dm-crypt, storing keys on TPM device. Risk is mitigated.
An attacker saturates the robot disk with data. A malicious node writes random data that fills the robot disk. Enable disk quotas on the system. Enable sandboxing of the processes. Separate disk into multiple partitions, sending non trivial data to temporary directories. Risk is partially mitigated. Disk cleanup routines and log rotation should also be implemented.
Embedded / Software / Logs
An attacker exfiltrates log data to a remote server. ✘/✓ An attacker compromising the OTA server could request device log data and eavesdrop sensitive information. Enable RBAC on the OTA server, limit access to sensitive functions. Risk is mitigated.
Embedded / Hardware / Sensors
An attacker spoofs a robot sensor (by e.g. replacing the sensor itself or manipulating the bus). ✘/✓ An attacker could physically tamper with the sensor readings. Add a noise or out-of-bounds reading detection mechanism on the robot that discards the readings or raises an alert to the user. Add detection of sensor disconnections. Risk is mitigated.
Embedded / Hardware / Actuators
An attacker spoofs a robot actuator. ✘/✓ An attacker could insert a counterfeit modular joint in the robot, compromising the whole system (e.g. a modified gripper).
  • Implement network access control systems, performing a verification of the part before granting access to the system.
  • Implement certificate based, 802.1x authentication for the communication with the nodes, discarding any new modules that do not authenticate on the system.
Risk is mitigated. Additional evaluation should be performed. Authenticating nodes via certificates would require shipping the nodes with client certificates, and the validated manufacturers would require a subordinate CA to sign their modules (Kinda DRM-ish).
An attacker modifies the command sent to the robot actuators (intercept & retransmit). ✘/✓ An attacker intercepts the communication channel traffic. The command is altered and retransmitted to the target joint.
  • Implement network access control systems, performing a verification of the part before granting access to the system.
  • Implement certificate based, 802.1x authentication for the communication with the nodes, discarding any new modules that do not authenticate on the system.
Risk is mitigated.
Embedded / Hardware / Communications
An attacker connects to an exposed debug port. ✘/✓ An attacker could connect to an exposed debug port and gain control over the robot through the execution of arbitrary commands.
  • Limit access or remove exposed debug ports.
  • Disable local debug terminals and functionality from the ports.
  • Add authentication mechanisms to limit access to the ports only to authenticated devices and users.
Risk is mitigated.
An attacker connects to an internal communication bus. ✘/✓ An attacker could connect to an internal communication bus to send arbitrary data or eavesdrop communication between different components of the robot.
  • Limit access or remove unused communication ports.
  • Physically limit access to the internal robot components and communication buses.
  • Add physical tamper detection sensors to detect physical intrusions into the robot.
Risk is mitigated.
Remote / Software Deployment
An attacker spoofs the deployment service. ✘/✓ An attacker spoofs the update deployment server and serves malicious content to the devices.
  • Validate the deployment server through Public Key Infrastructure.
  • Prevent insecure connections to the server from the devices through HTTPS and HSTS policies.
  • Certificate pinning on devices.
Risk is mitigated.
An attacker modifies the binaries sent by the deployment service. ✘/✓ An attacker intercepts the communication to the deployment server and serves malicious content to the devices.
  • Validate the deployment server through Public Key Infrastructure.
  • Prevent insecure connections to the server from the devices through HTTPS and HSTS policies.
  • Digitally sign the binaries sent to the devices.
Risk is mitigated.
An attacker intercepts the binaries sent by the deployment service. ✘/✓ An attacker intercepts the update communication and stores the binary sent to the devices, gaining access to intellectual property.
  • Make use of secure, encrypted communication channels.
  • Verify client devices through client certificates.
  • Sign and Encrypt update files.
Risk is mitigated.
An attacker prevents the robot and the deployment service from communicating. ✘/✓ An attacker blocks the robot’s update process.
  • Deploy multiple update server endpoints.
  • Deploy a distributed update system.
Risk is partially mitigated.
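Several mitigations in the table above come down to specific values in the DDS Security Governance document, which makes them easy to audit automatically. The sketch below is a hypothetical checker, not part of SROS 2: it looks up the five settings named in the table anywhere in the XML and reports values that differ from the recommendation. The `SAMPLE` fragment is illustrative and omits elements a complete governance file would need, and boolean values written as `1` or `0` would be reported as mismatches.

```python
import xml.etree.ElementTree as ET

# Settings named in the threat table, with the values it recommends.
RECOMMENDED = {
    "allow_unauthenticated_participants": "false",
    "enable_join_access_control": "true",
    "enable_discovery_protection": "true",
    "enable_liveliness_protection": "true",
    "metadata_protection_kind": "encrypt",
}


def governance_findings(xml_text):
    """Return one message per missing or non-recommended setting."""
    root = ET.fromstring(xml_text)
    findings = []
    for tag, wanted in RECOMMENDED.items():
        elements = list(root.iter(tag))
        if not elements:
            findings.append(f"{tag}: missing")
            continue
        for element in elements:
            value = (element.text or "").strip().lower()
            if value != wanted:
                findings.append(f"{tag}: {value!r} (recommended {wanted!r})")
    return findings


# Illustrative fragment only; two settings deliberately deviate.
SAMPLE = """
<dds>
  <domain_access_rules>
    <domain_rule>
      <allow_unauthenticated_participants>TRUE</allow_unauthenticated_participants>
      <enable_join_access_control>TRUE</enable_join_access_control>
      <topic_access_rules>
        <topic_rule>
          <topic_expression>*</topic_expression>
          <enable_discovery_protection>TRUE</enable_discovery_protection>
          <enable_liveliness_protection>FALSE</enable_liveliness_protection>
          <metadata_protection_kind>ENCRYPT</metadata_protection_kind>
        </topic_rule>
      </topic_access_rules>
    </domain_rule>
  </domain_access_rules>
</dds>
"""

for finding in governance_findings(SAMPLE):
    print(finding)
```

A check like this could run in the manufacturer's CI/CD pipeline before the governance file is signed, so that a permissive setting never reaches a deployed module.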

Attack Trees

Attack trees are created in an attempt to analyze the different possible attacks before they happen. These diagrams speculate about the possible attacks against the system in order to be able to counter them. Threats can be organized in the following attack trees. Attacks are ordered starting from the physical attack vector, continuing with network-based attacks, and finishing with the support infrastructure for the robot.

Physical vector attack tree

The following attack tree describes the possible paths to be followed by an attacker for compromising the system.

Physical vector attack tree

Diagram Source (draw.io)

The next diagram shows the infrastructure affected on a possible attack based on a compromise of a physical communication port.

Physical vector attack architecture

ROS 2 API vector attack tree

The following attack tree describes the possible paths to be followed by an attacker for compromising the system.

ROS 2 API vector attack tree

Diagram Source (draw.io)

The following diagram shows the infrastructure affected on a possible attack based on exploitation of the ROS 2 API.

ROS 2 API vector attack architecture

H-ROS API vector attack tree

The following attack tree describes the possible paths to be followed by an attacker for compromising the system.

H-ROS API vector attack tree

Diagram Source (draw.io)

The diagram below shows the infrastructure affected on a possible attack against MARA’s H-ROS API.

H-ROS API vector attack architecture

Code repository compromise vector attack tree

The following attack tree describes the possible paths to be followed by an attacker for compromising the system.

Code repository compromise attack tree

Diagram Source (draw.io)

The following diagram shows the infrastructure affected on a possible attack against the manufacturer code repository, showing its potential implications and consequences.

Code repository attack architecture

Threat Model Validation Strategy

Validating this threat model end-to-end is a long-term effort meant to be distributed among the whole ROS 2 community. The threat list is meant to evolve with this document, and additional references may be introduced in the future.

  1. Set up a MARA with the exact software described in the Components MARA section of this document.
  2. Penetration Testing
    • Attacks described in the spreadsheet should be implemented. For instance, a malicious hros_actuator_servomotor_XXXXXXXXXXXX could be implemented to try to disrupt the robot operations.
    • Once the vulnerability has been exploited, the exploit should be released to the community so that the results can be reproduced.
    • Whether the attack has been successful or not, this document should be updated accordingly.
    • If the attack was successful, a mitigation strategy should be implemented. It can either be done through improving ROS 2 core packages or it can be a platform-specific solution. In the second case, the mitigation will serve as an example to publish best practices for the development of secure robotic applications.
    • For training purposes, an online playground (RCTF) exists to challenge roboticists to learn and discover robot vulnerabilities.
    • For an overall evaluation of the robots’ security measures, the Robot Security Framework (RSF) will be used. This validation has to be done after the assessment is completed in order to obtain realistic results.

Security Assessment preliminary results

Due to the iterative nature of this process, the results shown in this section may differ depending on the threat model’s version.

The results presented here are a summary of the discoveries made during the Acutronic Robotics’ MARA assessment. Prior to the assessment, in order to be time-effective, a threat analysis was done over the system and the most likely attack vectors were identified.

Introduction

Based on the threat model of MARA, an assessment was performed to discover vulnerabilities in the system. The content below relates directly to that threat model, using all the identified attack vectors as starting points for the exercise.

This Security Assessment Report for Acutronic Robotics’ MARA components has been performed by Alias Robotics S.L. The targeted system for the assessment is the H-ROS powered MARA. H-ROS, the Hardware Robot Operating System, is a collection of hardware and software specifications and implementations that allows creating modular and distributed robot parts. H-ROS heavily relies on the use of ROS 2 as a robotic development framework.

This document presents an overview of the results obtained in the assessment. In it, we introduce a report focused on identifying security issues or vulnerabilities detected during the exercise.

Results

This assessment has evaluated the security measures present in the H-ROS SoM components. The evaluation has been done through a vulnerability and risk assessment, and the following sections are a summary of the findings.

The vulnerabilities are rated according to the guidelines of the Robot Vulnerability Scoring System (RVSS), which yields a score between 0 and 10 reflecting the severity of the vulnerability. The RVSS follows the same principles as the Common Vulnerability Scoring System (CVSS), while adding aspects specific to robotic systems that are required to capture the complexity of robot vulnerabilities. Every evaluated aspect is encoded in a vector, and each one carries its own weight in the final score.

All assessment findings from the MARA are classified by technical severity. The following list explains how Alias Robotics rates vulnerabilities by their severity.

  • Critical: Scores between 9 and 10
  • High: Scores between 7 and 8.9
  • Medium: Scores between 4 and 6.9
  • Low: Scores between 0.1 and 3.9
  • Informational: Scores 0.0
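The severity bands listed above can be expressed as a small lookup. The following is a minimal sketch, not part of any RVSS tooling: the function name `rvss_severity` is hypothetical, and it assumes scores are reported with one decimal place, so that values falling between two listed bands (for example 8.95) are rounded before classification.

```python
def rvss_severity(score: float) -> str:
    """Map an RVSS score (0.0 to 10.0) to the severity label listed above."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"RVSS score out of range: {score}")
    # Assumption: scores are reported with one decimal place.
    score = round(score, 1)
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    if score >= 0.1:
        return "Low"
    return "Informational"


# Scores taken from the findings reported in this section.
for vuln_id, score in [("Vuln-07", 7.5), ("Vuln-02", 6.8), ("Vuln-13", 5.5)]:
    print(vuln_id, score, rvss_severity(score))
```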

Signing Service Mitigation

In the course of a first funded vulnerability assessment, 2 critical, 6 high, 13 medium and 6 low vulnerabilities have been found. The ones fixed by Acutronic Robotics are disclosed below using the following format:

Name: Name of the vulnerability
ID: Vulnerability identifier
RVSS Score: Rating assigned according to the Robot Vulnerability Scoring System (RVSS), based on the severity of the vulnerability
Scoring Vector: Vector showing the different aspects that define the final score
Description: Information about the nature of the vulnerability
Remediation Status: Status of the remediation, with evidence and date (Active, Fixed)
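A scoring vector such as `RVSS:1.0/AV:RN/AC:L/PR:N/UI:N/Y:Z/S:U/C:N/I:N/A:H/H:N` is a slash-separated list of `key:value` pairs preceded by a version tag. The sketch below only splits such a string into its parts; the function name `parse_rvss_vector` is hypothetical, and the code deliberately does not interpret the metric abbreviations or compute a score, since both are defined by the RVSS specification rather than by this document.

```python
def parse_rvss_vector(vector: str) -> dict:
    """Split an RVSS vector string into its version and metric pairs."""
    parts = vector.strip().split("/")
    prefix, _, version = parts[0].partition(":")
    if prefix != "RVSS" or not version:
        raise ValueError(f"not an RVSS vector: {vector!r}")

    metrics = {}
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if not (sep and key and value):
            raise ValueError(f"malformed metric: {part!r}")
        if key in metrics:
            raise ValueError(f"duplicate metric: {key!r}")
        metrics[key] = value
    return {"version": version, "metrics": metrics}


parsed = parse_rvss_vector("RVSS:1.0/AV:RN/AC:L/PR:N/UI:N/Y:Z/S:U/C:N/I:N/A:H/H:N")
print(parsed["version"], parsed["metrics"])
```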



Findings
Name: H-ROS API vulnerable to DoS attacks
ID: Vuln-07
RVSS Score: 7.5
Scoring Vector: RVSS:1.0/AV:RN/AC:L/PR:N/UI:N/Y:Z/S:U/C:N/I:N/A:H/H:N
Description: The H-ROS API does not use any mechanism to limit the number of requests a user can perform in a given period of time. This can lead to DoS attacks or premature wear of the device.
Remediation Status:
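Vuln-07 describes an API without any limit on request rate. One common way to bound the rate is a token bucket, sketched below. This is an illustration only: the source does not describe the fix applied to the H-ROS API, and the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow at most `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical values: 2 requests per second, bursts of up to 3.
bucket = TokenBucket(rate=2.0, capacity=3)
print([bucket.allow() for _ in range(5)])
```

In a real service, one bucket would typically be kept per client or per credential, so that a single abusive caller cannot exhaust the budget of the others.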



Name: Insufficient limitation of hardware boundaries.
ID: Vuln-02
RVSS Score: 6.8
Scoring Vector: RVSS:1.0/AV:IN/AC:L/PR:L/UI:N/Y:Z/S:U/C:N/I:L/A:H/H:E
Description: Actuator drivers do not correctly limit the movement boundaries, so commands that exceed the physical capabilities of the actuator can be executed.
Remediation Status:
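Vuln-02 concerns drivers that accept commands beyond the actuator's physical range. A driver can guard against this by saturating every incoming command against configured limits before it reaches the hardware. The sketch below is a generic illustration, not the fix applied to MARA: `JointLimits`, `clamp_command`, the units and the numeric limits are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class JointLimits:
    """Hypothetical per-joint limits; real values come from the actuator datasheet."""
    min_position: float  # assumed unit: rad
    max_position: float  # assumed unit: rad
    max_velocity: float  # assumed unit: rad/s, as a magnitude


def clamp_command(position: float, velocity: float, limits: JointLimits) -> tuple:
    """Return (position, velocity) saturated to the configured limits."""
    # NaN or infinite values cannot be saturated meaningfully, so reject them.
    if not (math.isfinite(position) and math.isfinite(velocity)):
        raise ValueError("command contains NaN or infinity")
    safe_position = min(max(position, limits.min_position), limits.max_position)
    safe_velocity = min(max(velocity, -limits.max_velocity), limits.max_velocity)
    return safe_position, safe_velocity


limits = JointLimits(min_position=-3.14, max_position=3.14, max_velocity=1.5)
print(clamp_command(5.0, 10.0, limits))  # both values are saturated to the limits
```

Whether an out-of-range command should be saturated, as here, or rejected outright is a design choice that depends on the application.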



Name: ROS 2 Goal topic vulnerable to DoS attacks.
ID: Vuln-13
RVSS Score: 5.5
Scoring Vector: RVSS:1.0/AV:IN/AC:L/PR:N/UI:N/Y:Z/S:U/C:N/I:N/A:H/H:U
Description: The ROS 2 nodes that control the motor fail when a large number of messages is sent in a short span of time. The application crashes and is not able to recover from the failure, causing a DoS. This is probably caused by memory leaks, bugs or log storage.
Remediation Status:



Name: ROS 2 Trajectory topic vulnerable to DoS
ID: Vuln-14
RVSS Score: 5.5
Scoring Vector: RVSS:1.0/AV:IN/AC:L/PR:N/UI:N/Y:Z/S:U/C:N/I:N/A:H/H:N
Description: Trajectory messages sent to the `/trajectory` topic of the H-ROS SoM cause an unrecoverable failure on the ROS node that can only be fixed by a system restart.
Remediation Status:



Name: OTA OpenSSH Linux distribution version disclosure
ID: Vuln-20
RVSS Score: 5.3
Scoring Vector: RVSS:1.0/AV:RN/AC:L/PR:N/UI:N/Y:Z/S:U/C:L/I:N/A:N/H:N
Description: The OpenSSH server discloses the name of the distribution used on the server (Ubuntu) in the connection headers, providing additional information to attackers.
Remediation Status:



Name: OTA OpenSSH version vulnerable to user enumeration attacks
ID: Vuln-21
RVSS Score: 5.3
Scoring Vector: RVSS:1.0/AV:RN/AC:L/PR:N/UI:N/Y:Z/S:U/C:L/I:N/A:N/H:N
Description: The OpenSSH server version 7.6p1 is vulnerable to timing-based user enumeration attacks.
Remediation Status:

References

  1. Abera, Tigist, N. Asokan, Lucas Davi, Jan-Erik Ekberg, Thomas Nyman, Andrew Paverd, Ahmad-Reza Sadeghi, and Gene Tsudik. “C-FLAT: Control-FLow ATtestation for Embedded Systems Software.” ArXiv:1605.07763 [Cs], May 25, 2016. http://arxiv.org/abs/1605.07763.
  2. Ahmad Yousef, Khalil, Anas AlMajali, Salah Ghalyon, Waleed Dweik, and Bassam Mohd. “Analyzing Cyber-Physical Threats on Robotic Platforms.” Sensors 18, no. 5 (May 21, 2018): 1643. https://doi.org/10.3390/s18051643.
  3. Bonaci, Tamara, Jeffrey Herron, Tariq Yusuf, Junjie Yan, Tadayoshi Kohno, and Howard Jay Chizeck. “To Make a Robot Secure: An Experimental Analysis of Cyber Security Threats Against Teleoperated Surgical Robots.” ArXiv:1504.04339 [Cs], April 16, 2015. http://arxiv.org/abs/1504.04339.
  4. Checkoway, Stephen, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, Stefan Savage, Karl Koscher, Alexei Czeskis, Franziska Roesner, and Tadayoshi Kohno. “Comprehensive Experimental Analyses of Automotive Attack Surfaces.” In Proceedings of the 20th USENIX Conference on Security, 6–6. SEC’11. Berkeley, CA, USA: USENIX Association, 2011. http://dl.acm.org/citation.cfm?id=2028067.2028073.
  5. Clark, George W., Michael V. Doran, and Todd R. Andel. “Cybersecurity Issues in Robotics.” In 2017 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), 1–5. Savannah, GA, USA: IEEE, 2017. https://doi.org/10.1109/COGSIMA.2017.7929597.
  6. Denning, Tamara, Cynthia Matuszek, Karl Koscher, Joshua R. Smith, and Tadayoshi Kohno. “A Spotlight on Security and Privacy Risks with Future Household Robots: Attacks and Lessons.” In Proceedings of the 11th International Conference on Ubiquitous Computing - Ubicomp ’09, 105. Orlando, Florida, USA: ACM Press, 2009. https://doi.org/10.1145/1620545.1620564.
  7. Dessiatnikoff, Anthony, Yves Deswarte, Eric Alata, and Vincent Nicomette. “Potential Attacks on Onboard Aerospace Systems.” IEEE Security & Privacy 10, no. 4 (July 2012): 71–74. https://doi.org/10.1109/MSP.2012.104.
  8. Dzung, D., M. Naedele, T.P. Von Hoff, and M. Crevatin. “Security for Industrial Communication Systems.” Proceedings of the IEEE 93, no. 6 (June 2005): 1152–77. https://doi.org/10.1109/JPROC.2005.849714.
  9. Elmiligi, Haytham, Fayez Gebali, and M. Watheq El-Kharashi. “Multi-Dimensional Analysis of Embedded Systems Security.” Microprocessors and Microsystems 41 (March 2016): 29–36. https://doi.org/10.1016/j.micpro.2015.12.005.
  10. Groza, Bogdan, and Toma-Leonida Dragomir. “Using a Cryptographic Authentication Protocol for the Secure Control of a Robot over TCP/IP.” In 2008 IEEE International Conference on Automation, Quality and Testing, Robotics, 184–89. Cluj-Napoca, Romania: IEEE, 2008. https://doi.org/10.1109/AQTR.2008.4588731.
  11. Javaid, Ahmad Y., Weiqing Sun, Vijay K. Devabhaktuni, and Mansoor Alam. “Cyber Security Threat Analysis and Modeling of an Unmanned Aerial Vehicle System.” In 2012 IEEE Conference on Technologies for Homeland Security (HST), 585–90. Waltham, MA, USA: IEEE, 2012. https://doi.org/10.1109/THS.2012.6459914.
  12. Kleidermacher, David, and Mike Kleidermacher. Embedded Systems Security: Practical Methods for Safe and Secure Software and Systems Development. Amsterdam: Elsevier/Newnes, 2012.
  13. Klein, Gerwin, June Andronick, Matthew Fernandez, Ihor Kuz, Toby Murray, and Gernot Heiser. “Formally Verified Software in the Real World.” Communications of the ACM 61, no. 10 (September 26, 2018): 68–77. https://doi.org/10.1145/3230627.
  14. Lee, Gregory S., and Bhavani Thuraisingham. “Cyberphysical Systems Security Applied to Telesurgical Robotics.” Computer Standards & Interfaces 34, no. 1 (January 2012): 225–29. https://doi.org/10.1016/j.csi.2011.09.001.
  15. Lera, Francisco J. Rodríguez, Camino Fernández Llamas, Ángel Manuel Guerrero, and Vicente Matellán Olivera. “Cybersecurity of Robotics and Autonomous Systems: Privacy and Safety.” In Robotics - Legal, Ethical and Socioeconomic Impacts, edited by George Dekoulis. InTech, 2017. https://doi.org/10.5772/intechopen.69796.
  16. McClean, Jarrod, Christopher Stull, Charles Farrar, and David Mascareñas. “A Preliminary Cyber-Physical Security Assessment of the Robot Operating System (ROS).” edited by Robert E. Karlsen, Douglas W. Gage, Charles M. Shoemaker, and Grant R. Gerhart, 874110. Baltimore, Maryland, USA, 2013. https://doi.org/10.1117/12.2016189.
  17. Morante, Santiago, Juan G. Victores, and Carlos Balaguer. “Cryptobotics: Why Robots Need Cyber Safety.” Frontiers in Robotics and AI 2 (September 29, 2015). https://doi.org/10.3389/frobt.2015.00023.
  18. Papp, Dorottya, Zhendong Ma, and Levente Buttyan. “Embedded Systems Security: Threats, Vulnerabilities, and Attack Taxonomy.” In 2015 13th Annual Conference on Privacy, Security and Trust (PST), 145–52. Izmir, Turkey: IEEE, 2015. https://doi.org/10.1109/PST.2015.7232966.
  19. Pike, Lee, Pat Hickey, Trevor Elliott, Eric Mertens, and Aaron Tomb. “TrackOS: A Security-Aware Real-Time Operating System.” In Runtime Verification, edited by Yliès Falcone and César Sánchez, 10012:302–17. Cham: Springer International Publishing, 2016. https://doi.org/10.1007/978-3-319-46982-9_19.
  20. Ravi, Srivaths, Paul Kocher, Ruby Lee, Gary McGraw, and Anand Raghunathan. “Security as a New Dimension in Embedded System Design.” In Proceedings of the 41st Annual Conference on Design Automation - DAC ’04, 753. San Diego, CA, USA: ACM Press, 2004. https://doi.org/10.1145/996566.996771.
  21. Serpanos, Dimitrios N., and Artemios G. Voyiatzis. “Security Challenges in Embedded Systems.” ACM Transactions on Embedded Computing Systems 12, no. 1s (March 29, 2013): 1–10. https://doi.org/10.1145/2435227.2435262.
  22. Vilches, Víctor Mayoral, Laura Alzola Kirschgens, Asier Bilbao Calvo, Alejandro Hernández Cordero, Rodrigo Izquierdo Pisón, David Mayoral Vilches, Aday Muñiz Rosas, et al. “Introducing the Robot Security Framework (RSF), a Standardized Methodology to Perform Security Assessments in Robotics.” ArXiv:1806.04042 [Cs], June 11, 2018. http://arxiv.org/abs/1806.04042.
  23. Zubairi, Junaid Ahmed, and Athar Mahboob, eds. Cyber Security Standards, Practices and Industrial Applications: Systems and Methodologies. IGI Global, 2012. https://doi.org/10.4018/978-1-60960-851-4.
  24. Akhtar, Naveed, and Ajmal Mian. “Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey.” ArXiv:1801.00553 [Cs], January 2, 2018. http://arxiv.org/abs/1801.00553.
  25. V. Mayoral Vilches, E. Gil-Uriarte, I. Zamalloa Ugarte, G. Olalde Mendia, R. Izquierdo Pisón, L. Alzola Kirschgens, A. Bilbao Calvo, A. Hernández Cordero, L. Apa, and C. Cerrudo, “Towards an open standard for assessing the severity of robot security vulnerabilities, the Robot Vulnerability Scoring System (RVSS),” ArXiv:1807.10357 [Cs], Jul. 2018. https://arxiv.org/abs/1807.10357
  26. G. Olalde Mendia, L. Usategui San Juan, X. Perez Bascaran, A. Bilbao Calvo, A. Hernández Cordero, I. Zamalloa Ugarte, A. Muñiz Rosas, D. Mayoral Vilches, U. Ayucar Carbajo, L. Alzola Kirschgens, V. Mayoral Vilches, and E. Gil-Uriarte, “Robotics CTF (RCTF), a playground for robot hacking,” ArXiv:1810.02690 [Cs], Oct. 2018. https://arxiv.org/abs/1810.02690
  27. L. Alzola Kirschgens, I. Zamalloa Ugarte, E. Gil Uriarte, A. Muñiz Rosas, and V. Mayoral Vilches, “Robot hazards: from safety to security,” ArXiv:1806.06681 [Cs], Jun. 2018. https://arxiv.org/abs/1806.06681
  28. Víctor Mayoral Vilches, Gorka Olalde Mendia, Xabier Perez Baskaran, Alejandro Hernández Cordero, Lander Usategui San Juan, Endika Gil-Uriarte, Odei Olalde Saez de Urabain, Laura Alzola Kirschgens, “Aztarna, a footprinting tool for robots,” ArXiv:1812.09490 [Cs], December 22, 2018. https://arxiv.org/abs/1812.09490