### CS 453/698: Software and Systems Security ### **Module: Hardware & Mobile Security** Lecture: Side-channel attacks Adam Caulfield *University of Waterloo*Spring 2025 # Reminders & Recap #### **Reminders:** - A4 is released - Due July 25th ### Recap – last time we covered: #### ARM TrustZone - TZASC/TZMA: partition system resources - NS-bit: internal to CPU, used by TZ-Aware MMU + Cache - Secure world boots first #### **Android** - OS that leverages TZ - Some features require SE ### What is a side channel? **Definition:** a mechanism by which an attacker can extract information about a system or its operations by observing characteristics or indirect effects that are not part of the system's intended input or output. ### What is a side channel? <u>Definition:</u> a mechanism by which an attacker can extract information about a system or its operations by observing characteristics or indirect effects that are not part of the system's intended input or output. #### Metaphor: Locard's exchange principle In forensic science, Locard's principle holds that: the perpetrator of a crime will bring something into the crime scene and leave with something from it, and that both can be used as forensic evidence $\rightarrow$ every contact leaves a trace ### What is a side channel? <u>Definition:</u> a mechanism by which an attacker can extract information about a system or its operations by observing characteristics or indirect effects that are not part of the system's intended input or output. #### Metaphor: Locard's exchange principle In forensic science, Locard's principle holds that: the perpetrator of a crime will bring something into the crime scene and leave with something from it, and that both can be used as forensic evidence $\rightarrow$ every contact leaves a trace #### For computer security: The execution of code will bring something to the hosting platform and leave with something from it, and both can be used as side channels. ### Side Channels #### **Examples of side channels** Bandwidth consumptions (e.g., network traffic) "James Bond" attacks - Thermal/audio footprints - Power consumption General timing side channels Cache-timing channels #### **Alice and Bob Communicate:** Alice accesses health forum via encrypted channel with Bob Adv. knows: bob hosts web forum & its content But, cannot directly decrypt the downloaded content Pages 3 KB 4 KB 2 KB 1 KB Adv. determines size of all pages on health forum Then, measures the size of Alice's downloaded pages Pages 3 KB 4 KB 2 KB 1 KB #### **Another example:** Re-identification of Netflix video streaming Burst sizes of a streamed scene of "Reservoir dogs" ### Side Channels: "James Bonds" attacks #### Any type of characteristic can be used as a side channel (a) STEP 1: Victim Enters Password (b) STEP 2: Victim Leaves (*Opportunistic*) (b) STEP 2: Victim Drawn Away (*Orchestrated*) (c) STEP 3: Thermal Residues Captured Figure 4: An Example of Thermanator Attack. ### Side Channels: "James Bonds" attacks #### Any type of characteristic can be used as a side channel Cronin et al.: Charger-Surfing: Exploiting a Power Line Side-Channel for Smartphone Information Leakage #### Take this example function: Finds the maximum value in a \_\_secret\_\_ buffer (int \* arr) ``` 1 int *find_max(__secret__ int *arr, int n) { 2 int max_val = INT_MINIMUM; 3 for (int i = 0; i < n; i++) { 4 if (arr[i] > max_val) { 5 max_val = arr[i]; 6 } 7 } 8 return max_val; 9 } ``` #### Assume an only sees Enc(max\_val)... Why could they learn max\_val through timing this function? #### Take this example function: Finds the maximum value in a \_\_secret\_\_ buffer (int \* arr) ``` int *find_max(__secret__ int *arr, int n) { int max_val = INT_MINIMUM; for (int i = 0; i < n; i++) { if (arr[i] > max_val) { max_val = arr[i]; } } return max_val; } } ``` #### Assume an only sees Enc(max\_val)... Why could they learn max\_val through timing this function? ### **Root cause** → **Secret dependent execution paths** ### Can be exploited with remote access - Adv only needs to know the inputs - Continue querying the function with different values ### **Root cause** → **Secret dependent execution paths** #### Can be exploited with remote access - Adv only needs to know the inputs - Continue querying the function with different values ### Root cause -> Secret dependent execution paths #### Can be exploited with remote access - Adv only needs to know the inputs - Continue querying the function with different values ### How to mitigate? → Constant-time programming ### Constant-time programming - Avoid secret-dependent if-statements - Avoid secret-dependent memory accesses - Avoid variable-time instructions - DIV, MULT (some archs.), Floating point operations ### Some examples: how to mitigate the previous example? ``` 1 int *find_max(__secret__ int *arr, int n) { 2 int max_val = INT_MINIMUM; 3 for (int i = 0; i < n; i++) { 4 if (arr[i] > max_val) { 5 max_val = arr[i]; 6 } 7 } 8 return max_val; 9 } ``` ### Some examples: how to mitigate the previous example? #### Perform the same computation for each iteration Record comparison Boolean into a *predicate* variable Use value of *predicate* as a mask to set max\_val for the current iteration ### Another examples: is this function constant-time? ``` int * get_element( int *arr, int size, __secret__ int index ) { int element = arr[index] return element } ``` ### Another examples: is this function constant-time? ``` int * get_element( int *arr, int size, __secret__ int index ) { int element = arr[index] return element } ``` ### No $\rightarrow$ secret dependent memory access How to patch? ### Another examples: is this function constant-time? ``` int * get_element( int *arr, int size, __secret__ int index ) { int element = 0 for (int i=0; i<size; i++) { int value = arr[i]; int match = (i == index); element = (match * value) + (~match * element) } return element }</pre> ``` #### Similar idea: perform memory access for each value Record comparison of correct access to expected one Use comparison (in match) as a mask to update element ### **Architectural-specific timing attacks:** ### For example, exploiting cache: Accessing values from cache vs. has specific timing ### Example: Intel CPUs - L1 cache → 4 cycles - L2 cache → 12 cycles - L3 cache $\rightarrow$ 26-31 cycles - DRAM memory → 120+ cycles ### **Architectural-specific timing attacks:** ### For example, exploiting cache: Accessing values from cache vs. has specific timing ### Example: Intel CPUs - L1 cache → 4 cycles - L2 cache → 12 cycles - L3 cache $\rightarrow$ 26-31 cycles - DRAM memory → 120+ cycles ### Some CPU instructions enable unprivileged cache maintenance - prefetch -> suggest CPU to load data into the catch - clflush -> throw out data from all caches #### **Concrete scenario:** - You run a secure program on a machine, and the program does one of two things: - Encrypt() - Decrypt() - You do not want anyone to know whether your program is encrypting a message or decrypting a message - Assuming trust in OS and hardware for now - The binary of your program is available - Attackers run their programs on the same machine - Their goal is to infer which operation your program is running ### Common access-driven cache attack strategies: Flush + Reload Prime + Probe # **Victim Address space Attacker Address Space** Cache Encrypt() Decrypt() **Init:** victim program loaded while cache is empty Step 1: attacker loads the Encrypt() code into its address space **Step 2:** attacker flushes the cache Step 3a: victim performs Encrypt() operation Step 3b: victim performs Decrypt() operation **Step 4:** attacker calls encrypt and times it $\rightarrow$ if occurred after 3a, will be fast Step 4: if after step 3b, slow because it is no longer in cache ### **Summary:** • Load Encrypt() to gain virtual address to the same physical page Flush the cache line corresponding to Encrypt() Reload by calling again, measuring the time to detect if the victim has loaded it • Why? -> Determine which code/data is in use, then attack further ### Prime + Probe # **Victim Address space Attacker Address Space** Cache Encrypt() Decrypt() **Init:** victim program loaded while the cache is empty Step 1: Attacker fills all available cache (prime) **Step 2a:** Victim evicts cache lines while performing Encrypt() Step 2b: Victim evicts cache lines while performing Decrypt() **Step 3:** attacker calls Encrypt after step $2a \rightarrow fast!$ **Step 3:** attacker calls Encrypt after step 2b → slow ### Flush + Reload ### **Summary:** • Prime: attacker fills targeted cache sets with their own data Victim executes and evicts some of the attacker's cache lines - Attacker re-accesses their cache lines, and timing reveals victim activity - Why? → Same as before - Determine which code/data is in use, then attack further So, how does this relate to HW security measures? #### So, how does this relate to HW security measures? Don't forget the "Uncle Ben" principle of TEEs TEE Manufacturer ### TEE Programmer ### So, how does this relate to HW security measures? Don't forget the "Uncle Ben" principle of TEEs TEE Manufacturer ### TEE Programmer Insecure code inside TEE boundary (including software-induced side channels) break the hardware-provided guarantees #### Recall from earlier... #### Recall from earlier... #### Recall from earlier... ### **Example: TPM timing side channels** Particularly as it relates to firmware-TPMs (fTPMs) - Software-based implementation of TPMs - Addressing some limitations of physical TPM: low-bandwidth - The idea run the entire TPM functionality in software inside a TEE - Software-virtualized TPM - Intel fTPM ### **Example: TPM timing side channels** Firmware TPMs depicted: ### **Example: TPM timing side channels** Firmware TPMs depicted: ### **Example: TPM timing side channels** Firmware TPMs depicted: ### **Example: TPM timing side channels** TPM-Fail paper ### Example: fTPM timing side channels in Intel and STMicroelctronic STMicroelectronics fTPM signature generation Intel fTPM signature generation fTPMs were found to have timing side channels for ECDSA signature generation • ECDSA scalar multiplication depends on nonce length ### **Example: Intel SGX** - Recall $\rightarrow$ - Enclave pages is placed in EPC - Metadata stored in EPCM - Both cannot be directly modified (only through EADD before EINIT) - Malicious OS cannot directly modify - However: entire memory hierarchy is shared - Enclave and non-enclave share cache - Enclave and non-enclave share other memory modules (DRAM module) - Additionally: - Outer world can invoke exits from enclave Asynchronous exits (AEX) - Pages have "accessed" and "dirty" bits observable by OS **Example: Intel SGX** $\rightarrow$ *sneaky page monitoring (SPM)* Goal: exploit page faults to learn control flow of enclave "Leaky Cauldron on the Dark Land: Understanding Memory Side-Channel Hazards in SGX" ### **Example: TrustZone** - Timing side channels - Typical cache-timing attacks don't work in the same way - NS-bit in the cache - Slightly modified version of Prime-Probe is still possible - Also interrupt-based attacks - If misconfigured interrupt controller, can invoke interrupts to return to Normal World #### **Example: TrustZone version of Prime+Probe** # Summary ### Many sources of side-channels must be considered Some require physical access, others are possible to observe remotely - Constant-time programming - Some tools to automate, but mostly done manually - Some ISA support → e.g., conditional instructions in ARM #### For fTPMs and TEEs: - For timing side channels: Uncle Ben's principle - For other side channels: understand architectural behavior # That's all for today! ### Coming up.... • Ethics, law, regulations, and compliance #### **Reminders:** A4 is due on July 25 # That's all for today! #### **Resources:** - Leaky Cauldron - Another SGX attack: <u>SGX Step</u> - Load-step attack in TrustZone - TruSpy cache attack in TrustZone - TPM-Fail