The Last Mile: An Empirical Study of Timing Channels on seL4

David Cock, Qian Ge, Toby Murray, Gernot Heiser

#### 4 November 2014



Department of Broadband, Communications and the Digital Economy Australian Research Council

Australian Government

NICTA Funding and Supporting Members and Partners






# Outline

## Background

- seL4
- Channels
- Experimental Approach

## Local Channels

- The Cache Channel
- Instruction-Based Scheduling
- Cache Colouring
- New Channels

## Remote Channels

- Scheduled Delivery

## Summary

- Outcomes
- Ongoing Work




# seL4

seL4 is a verified, high-performance microkernel. We have:

- Proof of functional correctness.
- Proof of authority confinement.
- Proof of explicit information-flow control.
- WCET analysis.

We don't have:

- A comprehensive hardware model.




## Covert and Side Channels



Unexpected channels invalidate information-flow control, and we can't prove their absence:

- They depend heavily on undocumented chip internals.
- The channels are probabilistic.





#### Channels we consider

- Channels that can subvert our proof: *timing channels*.
- Channels that could be fixed with OS techniques, e.g. *cache contention*.
- Channels that can be exploited in software.

#### Channels we don't consider

- Physical attacks, e.g. DPA.
- Channels already excluded by the proof, e.g. storage channels.



## Experimental Approach

- The L2 cache channel.
- The bus contention channel (not covered today).
- The data also suggests 3 more (2 as-yet-unrecognised).

We evaluate 2 cache-channel countermeasures:

- Instruction-based scheduling.
- Cache colouring.

We also consider countermeasures against **remote**, **algorithmic** channels:

- Lucky-13 against OpenSSL is our example victim.
- Scheduled delivery is our countermeasure.



| Platform   | Core                    | Date | L2 Cache |
|------------|-------------------------|------|----------|
| iMX.31     | ARM1136JF-S (*ARMv6*)   | 2005 | 128 KiB  |
| E6550      | Conroe (*x86-64*)       | 2007 | 4096 KiB |
| DM3730     | Cortex A8 (*ARMv7*)     | 2010 | 256 KiB  |
| AM3358     | Cortex A8 (*ARMv7*)     | 2011 | 256 KiB  |
| iMX.6      | Cortex A9 (*ARMv7*)     | 2011 | 1024 KiB |
| Exynos4412 | Cortex A9 (*ARMv7*)     | 2012 | 1024 KiB |

- 7 years and 3 (ARM) core generations.
- 32-fold range of cache sizes.

- Integrated with nightly regression test.
- Runs each channel with each countermeasure (54 combinations).
- 2,000 hours of data over 12 months, 4.3 GiB.
- Data and analysis tools are open source: http://ssrg.nicta.com.au/projects/TS/timingchannels.pml




# Local Channels








- If you share a core, you share a cache.
- Hit vs. miss makes a **big** time difference.
- Many published attacks steal keys through the cache.
- We can control cache allocation (more shortly).



# Exynos4412 Cache Channel

- 32,768 cache lines, 1000Hz sample rate (preemption).
- Bandwidth: 2400b/s.
- Baseline for comparison.
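
The slides report bandwidth in bits per second but do not show how it is derived. One standard approach, assumed here purely for illustration, is to model the channel as a discrete memoryless channel (a matrix of conditional probabilities from sender input to receiver observation), estimate its Shannon capacity per sample with the Blahut-Arimoto iteration, and multiply by the sample rate. The function names and the toy matrices are hypothetical.

```python
import math


def mutual_information_bits(p, matrix):
    """I(X;Y) in bits for input distribution p and matrix[x][y] = P(y | x)."""
    n_in, n_out = len(matrix), len(matrix[0])
    q = [sum(p[x] * matrix[x][y] for x in range(n_in)) for y in range(n_out)]
    return sum(
        p[x] * matrix[x][y] * math.log2(matrix[x][y] / q[y])
        for x in range(n_in)
        for y in range(n_out)
        if p[x] > 0 and matrix[x][y] > 0
    )


def channel_capacity_bits(matrix, iterations=500):
    """Estimate capacity (bits per sample) with the Blahut-Arimoto iteration."""
    n_in, n_out = len(matrix), len(matrix[0])
    p = [1.0 / n_in] * n_in  # start from a uniform input distribution
    for _ in range(iterations):
        q = [sum(p[x] * matrix[x][y] for x in range(n_in)) for y in range(n_out)]
        weights = []
        for x in range(n_in):
            if p[x] == 0.0:
                weights.append(0.0)
                continue
            # Divergence between row x and the current output distribution.
            divergence = sum(
                matrix[x][y] * math.log(matrix[x][y] / q[y])
                for y in range(n_out)
                if matrix[x][y] > 0
            )
            weights.append(p[x] * math.exp(divergence))
        total = sum(weights)
        p = [w / total for w in weights]
    return mutual_information_bits(p, matrix)


def bandwidth_bits_per_second(capacity_bits, sample_rate_hz):
    """Bandwidth if each sample (here: each preemption) carries capacity_bits."""
    return capacity_bits * sample_rate_hz
```

Under this model, and assuming one sample per preemption, the 2400b/s baseline at 1000Hz would correspond to about 2.4 bits per sample.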



# Instruction-Based Scheduling

#### Advantages

- Applies to any channel.
- Simple to implement (18 lines in seL4).

#### Disadvantages

- Restrictive: all clocks must be removed.
- Performance counter accuracy is critical.
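
The idea behind preempting on instruction count rather than on time can be illustrated with a toy model. This is only a sketch: the per-instruction cycle costs (1 cycle on a cache hit, 10 on a miss) and the function names are hypothetical, and it ignores the counter imprecision discussed on the following slides.

```python
def instructions_in_time_slice(cycle_costs, slice_cycles):
    """Time-based preemption: count instructions retired within a cycle budget."""
    elapsed = 0
    retired = 0
    for cost in cycle_costs:
        if elapsed + cost > slice_cycles:
            break
        elapsed += cost
        retired += 1
    return retired


def instructions_in_ibs_slice(cycle_costs, slice_instructions):
    """Instruction-based preemption: the slice ends after a fixed instruction count."""
    return min(len(cycle_costs), slice_instructions)


# Hypothetical workloads: every access hits, or every access misses.
all_hits = [1] * 1000
all_misses = [10] * 1000

# Progress per slice as seen by a receiver that counts its own instructions.
time_based = (instructions_in_time_slice(all_hits, 500),
              instructions_in_time_slice(all_misses, 500))
instruction_based = (instructions_in_ibs_slice(all_hits, 100),
                     instructions_in_ibs_slice(all_misses, 100))
```

With time-based preemption the receiver's progress per slice depends on its hit/miss behaviour, which the sender can influence. With instruction-based preemption the progress is the same in both cases, which is why every other clock must be removed for the scheme to work.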

Works great on older chips (iMX.31), but...



# Exynos4412 Cache Channel with IBS

[Figure: Exynos4412 cache channel with IBS; axis label: lines evicted (/10<sup>3</sup>).]

- Preempt after 10<sup>5</sup> instructions. Bandwidth: 400b/s.
- 6× reduction: a very poor result.
- Event delivery is imprecise thanks to speculation.
- Attacker can modulate it!

# Cache Colouring



- Caches are divided into sets by physical address.
- The set selector bits of the address choose the set.
- These bits (usually) overlap the frame number.
- If these **colour bits** differ, the frames cannot collide.
- The frame number (physical address) is **under OS control**.
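
The relationship between set-selector bits, frame numbers and colours can be sketched as follows. The 1024 KiB size and 32-byte lines (1024 KiB / 32,768 lines) follow from figures given earlier in the deck; the 16-way associativity and the 4 KiB frame size are assumptions made for this example, and all function names are hypothetical.

```python
CACHE_BYTES = 1024 * 1024  # L2 size from the platform table
LINE_BYTES = 32            # 1024 KiB / 32,768 lines
WAYS = 16                  # assumed associativity (not stated in the slides)
PAGE_BYTES = 4096          # assumed frame size


def num_sets():
    return CACHE_BYTES // (WAYS * LINE_BYTES)


def set_index(phys_addr):
    """Cache set selected by a physical address."""
    return (phys_addr // LINE_BYTES) % num_sets()


def num_colours():
    """Number of frame colours: bytes covered by one way, divided by the frame size."""
    return max(1, CACHE_BYTES // (WAYS * PAGE_BYTES))


def frame_colour(phys_addr):
    """Colour of the frame containing phys_addr (set-selector bits above the page offset)."""
    return (phys_addr // PAGE_BYTES) % num_colours()


def sets_touched(frame_base):
    """All cache sets that lines of one frame can map to."""
    return {set_index(a) for a in range(frame_base, frame_base + PAGE_BYTES, LINE_BYTES)}
```

With these parameters there are 16 colours. Frames at `0x0000` and `0x1000` have different colours and map to disjoint sets, while frames at `0x0000` and `0x10000` share a colour and compete for the same sets.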



In seL4, resource allocation is securely delegated. The initial task splits RAM into coloured **pools**.

#### Advantages

- Very few kernel changes (in seL4).
- Low overhead.

#### Disadvantages

- Only applies to the cache channel.
- Relies on internal details of cache operation.
- Doesn't work with large pages.
- Hashed caches break everything.
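
Splitting RAM into coloured pools can be pictured with a small user-level sketch. It does not model seL4's delegated allocation itself; the helper names, the domain labels and the rule "colour = frame number modulo number of colours" are assumptions for illustration.

```python
from collections import defaultdict


def split_into_pools(frame_numbers, colours):
    """Group physical frame numbers into one pool per colour."""
    pools = defaultdict(list)
    for frame in frame_numbers:
        pools[frame % colours].append(frame)
    return dict(pools)


def assign_colours(domains, colours):
    """Give each domain a disjoint, equal-sized range of colours."""
    per_domain = colours // len(domains)
    return {
        domain: list(range(i * per_domain, (i + 1) * per_domain))
        for i, domain in enumerate(domains)
    }


pools = split_into_pools(range(64), 16)
assignment = assign_colours(["high", "low"], 16)
```

Each domain then receives frames only from the pools of its own colours, so frames belonging to different domains never collide in the cache.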



# Exynos4412 Cache Channel, Partitioned



- Bandwidth: 15b/s.
- 160× reduction: much better, but not perfect.

Where does that 15b/s come from?

# Exynos4412 TLB Channel



- Average rate correlates with TLB misses.
- Flushing on switch removes the signal.


Copyright NICTA 2014

## Misprediction and the Cycle Counter



- The cycle counter is affected by invisible mispredicts.
- A new (and **unexpected**) channel.
- Event delivery is **precise**; the cycle counter is wrong.

# Remote Channels






- Lucky-13 exploits non-constant-time MAC calculation.
- This is an algorithmic side channel.
- Remotely exploitable.
- We reproduce the distinguishing attack.
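
Why the MAC calculation is not constant-time can be shown with a simplified cost model. It counts SHA-1 compression-function calls for HMAC-SHA1 over the 13-byte TLS MAC header plus the payload left after padding removal. This is a sketch: it ignores how a real implementation treats invalid padding, and the function names are hypothetical.

```python
SHA1_BLOCK = 64      # SHA-1 block size in bytes
TLS_MAC_HEADER = 13  # 8-byte sequence number + 5-byte record header
MAC_LEN = 20         # HMAC-SHA1 tag length


def hmac_sha1_compressions(message_len):
    """Compression-function calls needed for HMAC-SHA1 over message_len bytes."""
    # Inner hash: one key block, then the message plus at least 9 bytes of padding.
    inner = 1 + -(-(message_len + 9) // SHA1_BLOCK)
    # Outer hash: one key block, then the 20-byte inner digest (fits in one block).
    outer = 2
    return inner + outer


def mac_work_for_record(decrypted_len, padding_len):
    """MAC cost for a decrypted CBC record whose padding appears to be padding_len bytes."""
    payload_len = decrypted_len - MAC_LEN - padding_len - 1  # -1 for the length byte
    return hmac_sha1_compressions(TLS_MAC_HEADER + payload_len)
```

For a 64-byte decrypted record, an apparent padding length of 0 leaves 56 bytes to authenticate (5 compression calls), while an apparent padding length of 1 leaves 55 bytes (4 calls). That one-call difference is the kind of timing signal the attack distinguishes.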




## OpenSSL 1.0.1c Response Times



- Nearby attacker (crossover cable).
- Modified vs. unmodified packet.
- Distinguishable with ≈ 100% probability.



Fixed in OpenSSL version 1.0.1e


## The Upstream Fix: Constant-Time Code



A constant-time padding/MAC check.

- Now secure (on x86).
- We tested on ARM: still a small channel.
- We present an OS-level solution, with:
  - Better performance.
  - Lower latency.
  - Lower CPU overhead.
  - No modification of OpenSSL.
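
The general idea of constant-time code can be illustrated with a tag comparison. This is not OpenSSL's patch, which covers the whole padding and MAC check; it only contrasts data-dependent with data-independent control flow. Python itself gives no timing guarantees, so for real use the standard library's `hmac.compare_digest` is the appropriate call.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Returns at the first mismatch, so run time depends on where the inputs differ."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Always scans every byte; differences are accumulated instead of branched on."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def library_equal(a: bytes, b: bytes) -> bool:
    """Standard-library comparison designed to resist timing analysis."""
    return hmac.compare_digest(a, b)
```

As the measurements on ARM suggest, removing data-dependent branches in source code does not by itself guarantee that the hardware executes in constant time.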




## OpenSSL 1.0.1e Response Times



- Constant-time implementation.
- Better, but still distinguishable (62%).
- 60µs (8.1%) latency penalty.
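
The slides do not say how the distinguishing probability is computed. One simple estimate, assumed here for illustration, is the accuracy of the best single-threshold guess over two sets of measured response times, where 50% means the two cases are indistinguishable. The function name and sample data are hypothetical.

```python
def best_threshold_accuracy(times_a, times_b):
    """Fraction of samples labelled correctly by the best single time threshold."""
    labelled = sorted([(t, "a") for t in times_a] + [(t, "b") for t in times_b])
    n = len(labelled)
    best = 0.0
    for split in range(n + 1):
        # A threshold cannot separate two samples with identical times.
        if 0 < split < n and labelled[split - 1][0] == labelled[split][0]:
            continue
        # Guess "a" for everything before the split and "b" for everything after.
        correct = sum(1 for _, who in labelled[:split] if who == "a")
        correct += sum(1 for _, who in labelled[split:] if who == "b")
        # The mirrored rule (guess "b" first) scores 1 - correct / n.
        best = max(best, correct / n, 1 - correct / n)
    return best
```

Fully separated samples such as `[1, 2, 3]` and `[4, 5, 6]` score 1.0, while identical distributions score 0.5.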








## Scheduled Delivery

[Figure: response path, out ← SSL\_write ← server.]





- Separate OpenSSL and application.
- Announce packets over IPC.
- Record arrival time.
- Block response on timer.
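
The steps above can be sketched as a user-level wrapper. On seL4 the mechanism is IPC plus real-time scheduling; this Python analogy only shows the control flow. The class name, the fixed-delay policy and the injectable `clock` and `sleep` parameters are assumptions made for this example.

```python
import time


class ScheduledDelivery:
    """Release every response a fixed delay after the request arrived."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay  # seconds between arrival and release
        self.clock = clock
        self.sleep = sleep

    def respond(self, request, handler):
        arrival = self.clock()       # record arrival time
        response = handler(request)  # variable-time work (e.g. the TLS component)
        remaining = arrival + self.delay - self.clock()
        if remaining > 0:
            self.sleep(remaining)    # block the response on a timer
        return response
```

If the handler takes longer than the configured delay, the response leaves late and its timing leaks again, so the delay has to be chosen above the handler's worst-case response time.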



We use **real-time scheduling** to precisely delay messages. This provides an efficient **mechanism** to enforce a delay **policy**; see *Askarov et al., CCS 2010*.

#### Advantages

- Uses existing IPC controls: no modifications.
- Fast and effective (see next slide).

#### Disadvantages

- Specific to remote/network attacks.
- Need to wrap (but not modify) the vulnerable component.



## Security of Scheduled Delivery



- More secure: 57% distinguishable.
- Faster: 10µs (1.4%) latency penalty.




## Performance of Scheduled Delivery



- Lower overhead (2% vs. 11%), but earlier saturation.
- This is a worst-case benchmark: no server work.

# Summary




#### These channels are real

- We managed to exploit every channel we tried.
- The bandwidth is high, and growing.

#### They're getting worse

- Countermeasures get less effective on newer chips.
- New hardware channels, e.g. the branch predictor.

#### But there is hope

- Resource partitioning (e.g. colouring) is effective.
- Repurpose hardware QoS features.
- OS-level techniques can help.




# Ongoing Work

- Ongoing project at NICTA (Qian Ge).
- Developing a fully cache-coloured seL4.
- We use these tools to evaluate the result.






## **Questions?**

Data and tools: http://ssrg.nicta.com.au/projects/TS/timingchannels.pml

seL4 is open source: http://sel4.systems




- Noise injection doesn't scale.
- Increasing determinism is the way to go.
