Compositionality Results for Quantitative Information Flow


Yusuke Kawamoto (École Polytechnique & INRIA, France)

Joint work with Konstantinos Chatzikokolakis (École Polytechnique & CNRS, France) and Catuscia Palamidessi (École Polytechnique & INRIA, France)

In QEST 2014

Background: Information flow

• Goal: to avoid the leakage of secret information in systems.
• Problem: there is almost always some leakage through observable outputs.
• In reality, it is difficult or impossible to prevent all secret information from leaking.
  E.g. a credit-card receipt displays the last 4 digits: **** **** **** 1234 — the leakage exists, but it is harmless.
• How can we tell whether an information leakage is harmless or not?

10 September 2014

QEST 2014


Background: Quantitative information flow

• Quantitative information flow analysis
  – quantifies how much information about secret inputs is leaked by public outputs.
    [Clark et al.'01] [Boreale'06] [Malacaria'07] [Kopf&Basin'07] [Chatzikokolakis'08] [Smith'09] …
  – models a system as an information-theoretic channel C that receives a secret input from X and returns a public output on Y:

    X (secret input) → System C → Y (public output)

E.g. a program: X (high value) → Program C → Y (low value)

E.g. a cipher: X (secret key) → Cipher C → Y (decryption time)

E.g. sender anonymity in the Crowds protocol (on a mobile ad-hoc network):
X (sender of a message) → Anonymous communication protocol C → Y (observable messages)

[Figure: in Crowds, a participant forwards a message through other participants towards the server; the forwarding probabilities (e.g. 0.3, 0.7) depend on the topology. The server and one corrupted participant are controlled by the attacker.]

The channel matrix C describes Prob(recv | initiator):

X \ Y          | Receive from A | Receive from B | Receive from C
initiator = A  | 0.3            | 0.7            | 0
initiator = B  | 0.35           | 0.3            | 0.35
initiator = C  | 0.35           | 0.35           | 0.3
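This matrix can be written down directly. A minimal sketch (Python; the names are illustrative, not from the paper) that encodes it and checks the defining property of a channel matrix — every row, conditioning on one initiator, must be a probability distribution:

```python
# Crowds channel matrix from the slide: CROWDS[x][y] = Prob(observation y | initiator x).
CROWDS = {
    "A": {"recv_from_A": 0.30, "recv_from_B": 0.70, "recv_from_C": 0.00},
    "B": {"recv_from_A": 0.35, "recv_from_B": 0.30, "recv_from_C": 0.35},
    "C": {"recv_from_A": 0.35, "recv_from_B": 0.35, "recv_from_C": 0.30},
}

def is_channel(C, tol=1e-9):
    """A matrix is a channel iff every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row.values()) and abs(sum(row.values()) - 1.0) <= tol
        for row in C.values()
    )
```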

Background: Quantitative information flow

What the analyst should know in advance:
• Prior distribution π (on inputs).
  E.g. π[initiator = A] = 1/3, π[initiator = B] = 1/3, π[initiator = C] = 1/3.
• Channel matrix C (conditional probability distribution of outputs given secret inputs).
  E.g. C[initiator = A, Receive from A] = 0.3, …

What the analyst wants to know:
• How much information on the secret input value is leaked by the output value.

From the prior distribution π and the channel C we calculate the posterior distribution πC; from π we calculate the prior uncertainty, and from πC the posterior uncertainty:

  Leakage = (Prior uncertainty) − (Posterior uncertainty)

Many kinds of uncertainty measures can be plugged into this scheme.

Motivation: Info leakage of large systems

Composed channel C ≝ C1 || C2 || … || Cn: a prior dist π, with marginals π1, π2, …, πn, feeds the component channels C1, C2, …, Cn, yielding the posterior dist πC.

Kinds of systems         | Small    | Large known | Large unknown
Input distribution π     | known    | known       | known
Component channels Ci    | known    | known       | approx. statistically
Computing Leak(πi, Ci)   | feasible | feasible    | approx. statistically

For unknown channels, Leak(πi, Ci) is approximated statistically by running Ci with πi many times — e.g. with the tool leakiEst [Chothia et al.'13].

For the whole composed channel, however:

Kinds of systems                    | Small    | Large known | Large unknown
Computing C = C1 || C2 || … || Cn   | feasible | unfeasible  | unfeasible
Computing Leak(π, C)                | feasible | unfeasible  | unfeasible

Where computing C and Leak(π, C) directly is unfeasible, our compositionality theorems give lower & upper bounds on Leak(π, C) from the component leakages Leak(πi, Ci).

Contribution of our work
• Compositionality results for “g-leakage”, a general notion of information-leakage measure.
• Improvement of these results by an input approximation technique.
• Evaluation of our results:
  • on an example of the Crowds protocol,
  • on randomly generated channels.

Outline of the talk
• Background: Quantitative information flow
• Leakage measures
• Channel compositions
• Main results: Compositionality for g-leakage
• Improving the results by input approximation
• Evaluation
• Summary & future work

Leakage measure: Min-entropy leakage [Smith'09]

• MEL (min-entropy leakage) quantifies the information leakage under one-try guessing attacks: the attacker can guess the secret value (and learn whether the guess is correct) only once.
• Prior vulnerability V(π) = the maximum probability of guessing the secret a priori, i.e. V(π) = max_x π[x].
• Posterior vulnerability V(π, C) = the average probability of guessing the secret after observing the output of C, i.e. V(π, C) = Σ_y max_x π[x] C[x, y].
• MEL: I∞(π, C) = (− log V(π)) − (− log V(π, C)), i.e. prior uncertainty minus posterior uncertainty.
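These definitions compute directly. A minimal sketch (Python, not from the paper) applying them to the Crowds matrix from the earlier slide, with a uniform prior:

```python
import math

# Crowds channel from the earlier slide: rows = initiators A, B, C;
# columns = "receive from A / B / C".  C[x][y] = Prob(y | x).
C = [[0.30, 0.70, 0.00],
     [0.35, 0.30, 0.35],
     [0.35, 0.35, 0.30]]
prior = [1 / 3, 1 / 3, 1 / 3]

def v_prior(pi):
    # V(pi): probability of the best one-try guess before any observation.
    return max(pi)

def v_post(pi, C):
    # V(pi, C) = sum over outputs y of max over x of pi[x] * C[x][y].
    return sum(max(pi[x] * C[x][y] for x in range(len(pi)))
               for y in range(len(C[0])))

def mel(pi, C):
    # I_inf(pi, C) = (-log2 V(pi)) - (-log2 V(pi, C)) = log2(V(pi,C) / V(pi)).
    return math.log2(v_post(pi, C) / v_prior(pi))

print(mel(prior, C))  # log2(1.4) ≈ 0.485 bits
```

Here V(π) = 1/3 and V(π, C) = 1.4/3, so observing one Crowds execution raises the one-try guessing probability by a factor of 1.4.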

Leakage measure: g-leakage [Alvim et al.'12]

• g-leakage quantifies information leakage under an operational scenario specified by a gain function g: when the secret value is x and the attacker makes a guess w on x, g(w, x) ∊ [0, 1] represents the gain of the attacker.
• Prior g-vulnerability Vg(π) = the average gain of the best guess, i.e. Vg(π) = max_w Σ_x π[x] g(w, x).
• Posterior g-vulnerability Vg(π, C) = the average gain after observing the output of C, i.e. Vg(π, C) = Σ_y max_w Σ_x π[x] C[x, y] g(w, x).
• g-leakage: Ig(π, C) = (− log Vg(π)) − (− log Vg(π, C)), again prior uncertainty minus posterior uncertainty.

Examples of g-leakage [Alvim et al.'12]

Identity gain function gid:
• The attacker makes only one guess w about the secret x.
• gid(w, x) = 1 if w = x, and 0 otherwise.
• g-leakage with gid quantifies the leakage under one-try guessing attacks, i.e. it coincides with MEL (min-entropy leakage).

k-tries gain function gWk:
• The attacker makes k guesses w1, w2, …, wk about the secret x.
• gWk({w1, w2, …, wk}, x) = 1 if x ∊ {w1, w2, …, wk}, and 0 otherwise.
• k-tries g-leakage quantifies the leakage under k-tries guessing attacks.
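A sketch (Python, illustrative) of generic g-vulnerability and g-leakage, instantiated with gid and gW2 on the Crowds matrix from earlier; with gid the result coincides with MEL, as the slide states:

```python
import math
from itertools import combinations

C = [[0.30, 0.70, 0.00],      # Crowds channel from the earlier slide
     [0.35, 0.30, 0.35],
     [0.35, 0.35, 0.30]]
pi = [1 / 3, 1 / 3, 1 / 3]
X = range(len(pi))

def vg_prior(pi, guesses, g):
    # Vg(pi) = max over guesses w of sum_x pi[x] * g(w, x).
    return max(sum(pi[x] * g(w, x) for x in X) for w in guesses)

def vg_post(pi, C, guesses, g):
    # Vg(pi, C) = sum_y max_w sum_x pi[x] * C[x][y] * g(w, x).
    return sum(max(sum(pi[x] * C[x][y] * g(w, x) for x in X) for w in guesses)
               for y in range(len(C[0])))

def g_leakage(pi, C, guesses, g):
    return math.log2(vg_post(pi, C, guesses, g) / vg_prior(pi, guesses, g))

g_id = lambda w, x: 1.0 if w == x else 0.0    # one-try guessing
g_w2 = lambda ws, x: 1.0 if x in ws else 0.0  # 2-tries guessing (guess = 2-set)

one_try = g_leakage(pi, C, list(X), g_id)                   # = MEL ≈ 0.485
two_try = g_leakage(pi, C, list(combinations(X, 2)), g_w2)  # ≈ 0.263
```

Note that here the 2-tries leakage (≈ 0.263 bits) is smaller than the 1-try leakage: the prior 2-tries vulnerability is already 2/3, so the observation adds proportionally less.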


Parallel compositions of channels

Parallel composition with distinct inputs, C1 × C2: distinct inputs x1 and x2 go to the component channels C1 and C2, which produce the outputs y1 and y2.

Parallel composition with shared input, C1 || C2: the same input x goes to both component channels C1 and C2, which produce the outputs y1 and y2. This is a special case of the distinct-input composition (restricted to x1 = x2).
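Both compositions can be sketched as operations on matrices (Python, lists of rows; assuming, as in this model, that the components' outputs are independent given their inputs). The shared-input rows are exactly the diagonal rows of the distinct-input composition:

```python
def compose_distinct(C1, C2):
    # (C1 x C2): rows indexed by input pairs (x1, x2), columns by output
    # pairs (y1, y2); entries C1[x1][y1] * C2[x2][y2].
    return [[p1 * p2 for p1 in row1 for p2 in row2]
            for row1 in C1 for row2 in C2]

def compose_shared(C1, C2):
    # (C1 || C2): the same input x feeds both components; rows indexed by x,
    # columns by output pairs (y1, y2).
    return [[p1 * p2 for p1 in row1 for p2 in row2]
            for row1, row2 in zip(C1, C2)]

C1 = [[0.5, 0.5], [0.1, 0.9]]
C2 = [[1.0, 0.0], [0.2, 0.8]]
D = compose_distinct(C1, C2)  # 4 rows (one per input pair), 4 columns
S = compose_shared(C1, C2)    # 2 rows (one per shared input), 4 columns
```

Each composed row still sums to 1, since it is the product of two distributions.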

Compositions of gain functions

• The attacker's gain gi(wi, xi) depends on the secret xi and a guess wi about xi.
• For each input domain Xi, we prepare a gain function gi on Xi × Wi.
• For the joint input domain X1 × X2 of C1 × C2, we need to define some composed gain function g on (X1 × X2) × (W1 × W2).
• The only assumption: a joint guess (w1, w2) is worthless iff at least one of w1 or w2 is worthless, i.e. g((w1, w2), (x1, x2)) = 0 iff g1(w1, x1) = 0 or g2(w2, x2) = 0.
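One concrete composed gain function satisfying this assumption (an illustrative choice, not necessarily the one used in the paper) is the product of the component gains; since gains lie in [0, 1], the product is 0 exactly when at least one factor is 0:

```python
def product_gain(g1, g2):
    # g((w1, w2), (x1, x2)) = g1(w1, x1) * g2(w2, x2): zero iff at least one
    # component guess is worthless, as the assumption requires.
    def g(w, x):
        (w1, w2), (x1, x2) = w, x
        return g1(w1, x1) * g2(w2, x2)
    return g

g_id = lambda w, x: 1.0 if w == x else 0.0
g = product_gain(g_id, g_id)  # joint one-try guessing on both secrets
```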


Compositionality for g-leakage (distinct inputs)

Setting: C1 × C2 with distinct inputs x1, x2 and outputs y1, y2.

Theorem 1. For any jointly supported prior π, the g-leakage Ig(π, C1 × C2) is bounded from below and above in terms of the component leakages and a factor M(π, g, g1, g2) that depends on the correlation between the marginal distributions πi of the prior π. [The exact bounds appear as formulas in the paper.]

(π is jointly supported if π1[x1] π2[x2] = 0 implies π[x1, x2] = 0.)

Corollary. When the marginal priors π1 and π2 of π are independent, the bounds take a simpler form. [Exact statement in the paper.]

Remarks:
• If the marginal priors πi are far from independence, the leakage bounds tend to be worse.
• We re-obtain the compositionality result for min-capacity of [Barthe & Kopf'11].

Compositionality for g-leakage (shared input)

Setting: C1 || C2 with a shared input x and outputs y1, y2.

Theorem 3. For any prior π, the g-leakage Ig(π, C1 || C2) is bounded in terms of the component leakages and a factor K(π, g). [Exact bounds in the paper.]

Corollary (MEL). For any prior π, a corresponding bound holds for min-entropy leakage, involving the minimum of all non-zero probabilities of π. [Exact statement in the paper.]

Remark: we re-obtain a result of [Espinoza & Smith'13].


Input approximation technique

• Our leakage bounds are not good when the prior π contains very small probabilities: the bounds involve a term Hmin(π) determined by the minimum of all non-zero probabilities of π, so a very small π[x] makes Hmin(π) very large.
• On the other hand, very small probabilities affect MEL only a little: only the large (in particular the maximum) probabilities matter. This is reminiscent of “smooth entropy” [Cachin'97].

Input approximation technique:
1. Calculate a sub-distribution π′ by replacing small probabilities with 0 in π.
2. Calculate bounds using our theorems for π′.
3. Evaluate the error introduced by replacing π with π′: ε = the summation of all removed probabilities.
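Steps 1 and 3 are straightforward to sketch (Python; the cut-off threshold is a free parameter, which is exactly what the future-work slide asks how to tune):

```python
def truncate_prior(pi, threshold):
    # Step 1: zero out small probabilities.  No renormalisation: pi_sub is a
    # sub-distribution.  Step 3: eps = total removed probability mass.
    pi_sub = [p if p >= threshold else 0.0 for p in pi]
    eps = sum(pi) - sum(pi_sub)
    return pi_sub, eps

pi = [0.50, 0.30, 0.15, 0.04, 0.01]
pi_sub, eps = truncate_prior(pi, threshold=0.05)
# pi_sub = [0.50, 0.30, 0.15, 0.0, 0.0]; eps = 0.05 (up to floating point)
```

Step 2 then applies the compositionality theorems to π′ instead of π, and ε quantifies the error this introduces.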

Input approximation technique: example

• Prior distribution π and channel matrix C: [shown as figures in the original slides].
• MEL I∞(π, C10):
  • true value = 0.1319
  • our upper bound = 0.7444
  • min-capacity bound of [Barthe&Kopf'11] = 4.114 (using MEL ≤ min-capacity)
• Our bound is better than the previous work via min-capacity.

Compositionality of MEL (unknown channels)

• We cannot directly apply the previous results when the analyst does not know the component channels Ci:

Kinds of systems                   | Small    | Large known | Large unknown
Approximate input distribution π′  | known    | known       | known
Component channels Ci              | known    | known       | approx. statistically
Computing I∞(π′i, Ci)              | feasible | feasible    | approx. statistically — but we cannot run Ci with a sub-distribution π′i

• So we first estimate I∞(πi, Ci) statistically and then use the previous inequality to obtain bounds on I∞(π′i, Ci).
• Then, by the input approximation technique, we obtain leakage bounds.
• But the bounds are worse than when the analyst knows Ci. (See our paper for details.)

(The evaluation uses an extension of our tool leakiEst.)

E.g. Sender anonymity in the Crowds protocol

Crowds protocol (on a mobile ad-hoc network):
• For anonymous communication, participants forward messages through others; the network topology changes frequently.
[Figure: the initiator forwards through other participants (forwarding probabilities 0.3/0.7) towards the server; the server and a corrupted participant belong to the attacker.]

The attacker:
• knows the network topology,
• controls the server & the corrupted participants,
• can guess the initiator only twice (2-tries attack).

2-tries g-leakage (anonymity of the initiator):
• The attacker makes 2 guesses w1, w2 about the initiator x.
• gW2({w1, w2}, x) = 1 if x ∊ {w1, w2}, and 0 otherwise.

• We model the network topology as a channel Ci.
• We consider repeated executions of Crowds (with possibly changing topologies), and model them as a composed channel C1 || C2 || … || Cn.

Results: [shown as a figure in the original slides].

E.g. Randomly generated channels

• Component: a randomly generated 10×10 channel matrix C; the analyst does not know C.

Accuracy (MEL of 6 composed channels): [plots in the original slides; the bounds degrade when the prior is far from the uniform distribution, and when small probabilities are removed too aggressively by the input approximation.]

Time (using our extension of the tool leakiEst):
• Calculating true leakages: exponential in the number of component channels.
• Calculating our leakage bounds: linear in the number of component channels.
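The exponential-vs-linear gap can be seen concretely in a sketch (Python, illustrative sizes, not the paper's experiment): the exact composed channel for n shared-input 10×10 components has 10^n columns, while the component leakages are just n cheap computations:

```python
import math
import random

def random_channel(n_in, n_out, rng):
    # Random stochastic matrix: each row drawn positive and normalised.
    rows = []
    for _ in range(n_in):
        w = [rng.random() for _ in range(n_out)]
        s = sum(w)
        rows.append([v / s for v in w])
    return rows

def compose_shared(C1, C2):
    # Shared-input composition: same x feeds both, outputs paired.
    return [[p1 * p2 for p1 in r1 for p2 in r2] for r1, r2 in zip(C1, C2)]

def mel(pi, C):
    v_prior = max(pi)
    v_post = sum(max(pi[x] * C[x][y] for x in range(len(pi)))
                 for y in range(len(C[0])))
    return math.log2(v_post / v_prior)

rng = random.Random(0)
comps = [random_channel(10, 10, rng) for _ in range(4)]
pi = [0.1] * 10

joint = comps[0]
for Ci in comps[1:]:
    joint = compose_shared(joint, Ci)  # column count multiplies: 10**4 here

per_component = [mel(pi, Ci) for Ci in comps]  # linear in n
exact = mel(pi, joint)                         # exponential in n
```

Observing more executions of the same secret can only increase the posterior vulnerability, so the exact composed MEL is at least each component's MEL; the point of the compositionality theorems is to bound it from the cheap per-component values without ever building `joint`.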


Summary
• Compositionality results for “g-leakage”.
• Improvement of these results by an input approximation technique.
• Evaluation of our results:
  • on an example of the Crowds protocol,
  • on randomly generated channels.

Future work
• To find how to determine the parameters that yield the optimal bounds.
• To explore possible relationships between our input approximation and “smooth entropy” in information theory.
• Compositionality results for more complicated compositions (ongoing).

Thank you for your attention.