Compositionality Results for Quantitative Information Flow
Yusuke Kawamoto (École Polytechnique & INRIA, France)
Joint work with Konstantinos Chatzikokolakis (École Polytechnique & CNRS, France) and Catuscia Palamidessi (École Polytechnique & INRIA, France)
In QEST 2014
Background: Information flow
• Goal: to avoid the leakage of secret information in systems.
• Problem: there is almost always some leakage due to observable outputs.
• In reality, it is difficult or impossible to prevent all secret information from being leaked. E.g. a credit-card receipt displays the last 4 digits: **** **** **** 1234 — the information leakage exists but is harmless.
• How can we tell whether an information leakage is harmless or not?

10 September 2014, QEST 2014
Background: Quantitative information flow
• Quantitative information flow analysis
– quantifies how much information about secret inputs is leaked by public outputs. [Clark et al.'01] [Boreale'06] [Malacaria'07] [Köpf & Basin'07] [Chatzikokolakis'08] [Smith'09] …
– models a system as an information-theoretic channel C that receives a secret input from X and returns a public output on Y:

  X (secret input) → [ System C ] → Y (public output)
E.g. X = high value → [ Program C ] → Y = low value
E.g. X = secret key → [ Cipher C ] → Y = decryption time
E.g. X = sender of a message → [ Anonymous communication protocol C ] → Y = observable messages

Crowds protocol (on a mobile ad-hoc network): a participant forwards a message through others toward the server; the server and some corrupted participants are controlled by the attacker. [Figure: the initiator forwards messages through the network with probabilities 0.3 / 0.7.]

The channel matrix C describes Prob( recv | initiator ):

                 Receive from A   Receive from B   Receive from C
initiator = A        0.3              0.7              0
initiator = B        0.35             0.3              0.35
initiator = C        0.35             0.35             0.3
Background: Quantitative information flow
What the analyst should know in advance:
• Prior distribution π (on inputs). E.g. π[initiator = A] = 1/3, π[initiator = B] = 1/3, π[initiator = C] = 1/3.
• Channel matrix C (conditional probabilities of outputs given inputs). E.g. C[initiator = A, Receive from A] = 0.3, …

What the analyst wants to know:
• How much information about the secret input value is leaked by the output value.
• From the prior distribution π and the channel C, we calculate the prior uncertainty and the posterior distribution πC (and hence the posterior uncertainty). For many kinds of measures:

  Leakage = (Prior uncertainty) − (Posterior uncertainty)
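As a small illustration (a sketch, not from the slides), the posterior distribution πC can be computed from π and C by Bayes' rule; a minimal Python example using the Crowds channel matrix above:

```python
# Prior over initiators and the Crowds channel matrix Prob(recv | initiator).
PRIOR = {"A": 1/3, "B": 1/3, "C": 1/3}
CHANNEL = {
    "A": {"recvA": 0.3,  "recvB": 0.7,  "recvC": 0.0},
    "B": {"recvA": 0.35, "recvB": 0.3,  "recvC": 0.35},
    "C": {"recvA": 0.35, "recvB": 0.35, "recvC": 0.3},
}

def output_dist(prior, channel):
    """Marginal distribution on outputs: p(y) = sum_x pi[x] * C[x][y]."""
    ys = next(iter(channel.values())).keys()
    return {y: sum(prior[x] * channel[x][y] for x in channel) for y in ys}

def posterior(prior, channel, y):
    """Posterior distribution on inputs after observing output y (Bayes)."""
    p_y = output_dist(prior, channel)[y]
    return {x: prior[x] * channel[x][y] / p_y for x in channel}

post = posterior(PRIOR, CHANNEL, "recvB")
# After observing "receive from B", initiator A becomes the most likely sender.
```

Observing one output already shifts the attacker's belief: here the posterior probability of A rises from 1/3 to about 0.52.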
Motivation: Info leakage of large systems
Composed channel C ≝ C1 || C2 || … || Cn: the prior distribution π feeds the component channels C1, C2, …, Cn (with component priors π1, π2, …, πn), yielding the posterior distribution πC.

Different kinds of systems and what is feasible for each:

                                    Small      Large, known   Large, unknown
Input distribution π                known      known          known
Component channels Ci               known      known          approx. statistically
Computing Leak(πi, Ci)              feasible   feasible       approx. statistically*
Computing C = C1 || C2 || … || Cn   feasible   unfeasible     unfeasible
Computing Leak(π, C)                feasible   unfeasible**   unfeasible**

* By running Ci with πi many times, e.g. with the tool leakiEst [Chothia et al.'13].
** But lower & upper bounds are obtained by our compositionality theorems.
Contribution of our work
• Compositionality results for “g-leakage”, a general notion of information-leakage measure.
• Improvement of these results by an input approximation technique.
• Evaluation of our results
  • on an example of the Crowds protocol,
  • on randomly generated channels.
Outline of the talk
• Background: Quantitative information flow
• Leakage measures
• Channel compositions
• Main results: Compositionality for g-leakage
• Improving the results by input approximation
• Evaluation
• Summary & future work
Leakage measure: Min-entropy leakage [Smith'09]
• MEL (min-entropy leakage) quantifies the information leakage under one-try guessing attacks: the attacker can guess the secret value (and learn whether the guess is correct) only once.
• Prior vulnerability: V(π) = max_x π[x] (the maximum probability of guessing the secret).
• Posterior vulnerability: V(π, C) = Σ_y max_x π[x] C[x, y] (the average probability of guessing the secret after observing the output from C).
• MEL: I∞(π, C) = (− log V(π)) − (− log V(π, C)), i.e. prior uncertainty minus posterior uncertainty.
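These definitions translate directly into code. A minimal Python sketch (channel matrices as nested lists of rows), applied to the Crowds matrix from earlier with a uniform prior:

```python
from math import log2

def prior_vulnerability(prior):
    """V(pi) = max_x pi[x]."""
    return max(prior)

def posterior_vulnerability(prior, channel):
    """V(pi, C) = sum_y max_x pi[x] * C[x][y]."""
    n_outputs = len(channel[0])
    return sum(max(p * row[y] for p, row in zip(prior, channel))
               for y in range(n_outputs))

def min_entropy_leakage(prior, channel):
    """I_inf(pi, C) = (-log V(pi)) - (-log V(pi, C)) = log2(V(pi, C) / V(pi))."""
    return log2(posterior_vulnerability(prior, channel)
                / prior_vulnerability(prior))

# Crowds channel from earlier (rows: initiator A, B, C), uniform prior.
C = [[0.3, 0.7, 0.0], [0.35, 0.3, 0.35], [0.35, 0.35, 0.3]]
pi = [1/3, 1/3, 1/3]
# V(pi) = 1/3, V(pi, C) = (0.35 + 0.7 + 0.35)/3, so MEL = log2(1.4) ≈ 0.485 bits.
```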
Leakage measure: g-leakage [Alvim et al.'12]
• g-leakage quantifies information leakage under an operational scenario specified by a gain function g: when the secret value is x and the attacker makes a guess w on x, g(w, x) ∈ [0, 1] represents the gain of the attacker.
• Prior g-vulnerability: Vg(π) = max_w Σ_x π[x] g(w, x) (the average gain of the best guess).
• Posterior g-vulnerability: Vg(π, C) = Σ_y max_w Σ_x π[x] C[x, y] g(w, x) (the average gain after observing the output from C).
• g-leakage: Ig(π, C) = (− log Vg(π)) − (− log Vg(π, C)), i.e. prior uncertainty minus posterior uncertainty.
Examples of g-leakage [Alvim et al.'12]
Identity gain function gid:
• The attacker makes only one guess w about the secret x.
• gid(w, x) = 1 if w = x, and 0 otherwise.
• g-leakage with gid quantifies the leakage under one-try guessing attacks, i.e. it coincides with MEL (min-entropy leakage).

k-tries gain function gWk:
• The attacker makes k guesses w1, w2, …, wk about the secret x.
• gWk({w1, w2, …, wk}, x) = 1 if x ∈ {w1, w2, …, wk}, and 0 otherwise.
• k-tries g-leakage quantifies the leakage under k-tries guessing attacks.
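As an illustrative sketch (not from the slides), g-vulnerability and g-leakage can be computed by brute force over the set of allowed guesses; with gid the result coincides with MEL, while gW2 models the 2-tries attacker:

```python
from itertools import combinations
from math import log2

def g_vulnerability(prior, gain, guesses):
    """V_g(pi) = max_w sum_x pi[x] * g(w, x)."""
    return max(sum(p * gain(w, x) for x, p in enumerate(prior))
               for w in guesses)

def posterior_g_vulnerability(prior, channel, gain, guesses):
    """V_g(pi, C) = sum_y max_w sum_x pi[x] * C[x][y] * g(w, x)."""
    n_outputs = len(channel[0])
    return sum(
        max(sum(prior[x] * channel[x][y] * gain(w, x)
                for x in range(len(prior)))
            for w in guesses)
        for y in range(n_outputs))

def g_leakage(prior, channel, gain, guesses):
    """I_g(pi, C) = log2( V_g(pi, C) / V_g(pi) )."""
    return log2(posterior_g_vulnerability(prior, channel, gain, guesses)
                / g_vulnerability(prior, gain, guesses))

g_id = lambda w, x: 1.0 if w == x else 0.0   # one-try guessing (gid)
g_w2 = lambda w, x: 1.0 if x in w else 0.0   # 2-tries guessing (gW2)

C = [[0.3, 0.7, 0.0], [0.35, 0.3, 0.35], [0.35, 0.35, 0.3]]
pi = [1/3, 1/3, 1/3]
singles = range(3)                           # guesses for g_id: single secrets
pairs = list(combinations(range(3), 2))      # guesses for g_w2: 2-element sets
# g-leakage with g_id equals MEL; g-leakage with g_w2 is the 2-tries leakage.
```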
Parallel compositions of channels
Parallel composition with distinct inputs, C1 × C2: distinct inputs x1 and x2 enter the component channels C1 and C2, which produce the outputs y1 and y2 respectively.

Parallel composition with shared input, C1 || C2: the same input x enters both component channels C1 and C2, which produce the outputs y1 and y2. (Shared input is a special case of distinct inputs.)
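The two composed channel matrices can be built from the components; a minimal sketch, under the standard assumption that the two components execute independently (given their inputs), with output pairs (y1, y2):

```python
def compose_distinct(C1, C2):
    """C1 x C2: input pair (x1, x2), output pair (y1, y2),
    entry C1[x1][y1] * C2[x2][y2] (a Kronecker-style product)."""
    return [[a * b for a in row1 for b in row2]
            for row1 in C1 for row2 in C2]

def compose_shared(C1, C2):
    """C1 || C2: the same input x feeds both components;
    the entry for (x, (y1, y2)) is C1[x][y1] * C2[x][y2]."""
    return [[a * b for a in row1 for b in row2]
            for row1, row2 in zip(C1, C2)]

# Two small illustrative 2x2 component channels.
C1 = [[0.9, 0.1], [0.2, 0.8]]
C2 = [[0.6, 0.4], [0.5, 0.5]]
# Each row of a composed channel is still a probability distribution.
```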
Compositions of gain functions
• The attacker's gain gi(wi, xi) depends on the secret xi and a guess wi about xi. For each input domain Xi, we prepare a gain function gi on Xi × Wi (for the component channel Ci with input xi and output yi).
• For the joint input domain X1 × X2, we need to define a composed gain function g on (X1 × X2) × (W1 × W2) for the composed channel C1 × C2 with input (x1, x2) and outputs (y1, y2).
• The only assumption: a joint guess (w1, w2) is worthless iff at least one of w1 or w2 is worthless, i.e. g((w1, w2), (x1, x2)) = 0 iff g1(w1, x1) = 0 or g2(w2, x2) = 0.
Compositionality for g-leakage (distinct inputs)
Setting: distinct inputs x1, x2 enter C1 × C2 (component channels C1, C2), producing outputs y1, y2.

Theorem 1: For any jointly supported prior π (i.e. π1[x1] π2[x2] = 0 implies π[x1, x2] = 0), the g-leakage Ig(π, C1 × C2) is bounded from below and from above in terms of the component leakages Ig1(π1, C1) and Ig2(π2, C2), up to an additive factor M(π, g, g1, g2) that depends on the correlation between the marginal distributions πi of the prior π.

Corollary: When the marginal priors π1 and π2 of π are independent, Ig(π, C1 × C2) ≤ Ig1(π1, C1) + Ig2(π2, C2).

Remark:
• If the marginal priors πi are far from independence, the leakage bounds tend to be worse.
• We re-obtain the compositionality for min-capacity [Barthe & Köpf'11]: ML(C1 × C2) ≤ ML(C1) + ML(C2).
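A numeric sanity check consistent with the corollary (a sketch, not from the slides): for the gain function gid, i.e. MEL, and an independent prior π = π1 × π2, the leakage of C1 × C2 is exactly the sum of the component leakages, because V(π1 × π2, C1 × C2) = V(π1, C1) · V(π2, C2):

```python
from math import log2

def mel(prior, channel):
    """Min-entropy leakage I_inf(pi, C) = log2( V(pi, C) / V(pi) )."""
    v_post = sum(max(p * row[y] for p, row in zip(prior, channel))
                 for y in range(len(channel[0])))
    return log2(v_post / max(prior))

def compose_distinct(C1, C2):
    """C1 x C2 on input pairs (x1, x2) and output pairs (y1, y2)."""
    return [[a * b for a in row1 for b in row2]
            for row1 in C1 for row2 in C2]

# Illustrative components and independent marginal priors.
C1 = [[0.9, 0.1], [0.2, 0.8]]
C2 = [[0.6, 0.4], [0.5, 0.5]]
pi1, pi2 = [0.7, 0.3], [0.4, 0.6]
pi = [p1 * p2 for p1 in pi1 for p2 in pi2]   # independent joint prior
# MEL of the composition equals the sum of the component MELs here.
```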
Compositionality for g-leakage (shared input)
Setting: a shared input x enters both components of C1 || C2, producing outputs y1, y2.

Theorem 3: For any prior π, the g-leakage Ig(π, C1 || C2) is bounded in terms of the component leakages Ig(π, C1) and Ig(π, C2), up to an additive factor K(π, g).

Corollary (MEL): For any prior π,
  I∞(π, C1 || C2) ≤ I∞(π, C1) + I∞(π, C2) + Hmin(π),
where Hmin(π) is determined by the minimum of all non-zero probabilities in π.

Remark:
• We re-obtain the min-capacity result of [Espinoza & Smith'13]: ML(C1 || C2) ≤ ML(C1) + ML(C2).
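A numeric check of the corollary's shape (a sketch under the assumption that the additive factor is Hmin(π) = −log2 of the minimum non-zero prior probability), on randomly sampled small channels and priors:

```python
import random
from math import log2

def mel(prior, channel):
    """Min-entropy leakage I_inf(pi, C) = log2( V(pi, C) / V(pi) )."""
    v_post = sum(max(p * row[y] for p, row in zip(prior, channel))
                 for y in range(len(channel[0])))
    return log2(v_post / max(prior))

def compose_shared(C1, C2):
    """C1 || C2: entry for (x, (y1, y2)) is C1[x][y1] * C2[x][y2]."""
    return [[a * b for a in r1 for b in r2] for r1, r2 in zip(C1, C2)]

def random_channel(n_in, n_out, rng):
    """Random stochastic matrix: each row is a distribution on outputs."""
    rows = [[rng.random() for _ in range(n_out)] for _ in range(n_in)]
    return [[v / sum(row) for v in row] for row in rows]

rng = random.Random(0)
violations = 0
for _ in range(200):
    C1, C2 = random_channel(3, 3, rng), random_channel(3, 3, rng)
    raw = [rng.random() for _ in range(3)]
    pi = [p / sum(raw) for p in raw]
    h_min = -log2(min(pi))               # Hmin(pi): minimum non-zero prob
    lhs = mel(pi, compose_shared(C1, C2))
    rhs = mel(pi, C1) + mel(pi, C2) + h_min
    if lhs > rhs + 1e-9:
        violations += 1
# No sampled case exceeds the bound.
```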
Input approximation technique
• Our leakage bounds are not good when the prior π contains very small probabilities: a very small π[x] makes Hmin(π) (determined by the minimum of all non-zero probabilities) very large.
• On the other hand, very small probabilities affect MEL only a little: only the large (in fact, only the maximum) probabilities matter. This is reminiscent of “smooth entropy” [Cachin'97].

Input approximation technique:
1. We calculate a sub-distribution π' by replacing small probabilities with 0 in π.
2. We calculate bounds using our theorems for π'.
3. We evaluate the error introduced by replacing π with π': ε = the summation of all removed probabilities.
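Steps 1 and 3 can be sketched directly in code (the threshold value below is an illustrative choice, not from the slides):

```python
def approximate_prior(prior, threshold):
    """Steps 1 & 3: zero out probabilities below the threshold and report
    the removed mass epsilon (pi' is a sub-distribution: it may sum to < 1)."""
    sub = [p if p >= threshold else 0.0 for p in prior]
    epsilon = sum(p for p, q in zip(prior, sub) if q == 0.0)
    return sub, epsilon

pi = [0.5, 0.3, 0.15, 0.03, 0.02]
pi_sub, eps = approximate_prior(pi, threshold=0.05)
# pi_sub = [0.5, 0.3, 0.15, 0.0, 0.0] and eps = 0.05; the bounds are then
# computed for pi_sub (step 2) and corrected by the error eps.
```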
Input approximation technique: Example
• For a prior distribution π and a channel matrix C, consider the MEL I∞(π, C10) of the composed channel C10:
  • True value = 0.1319
  • Our upper bound = 0.7444
  • Min-capacity bound in [Barthe & Köpf'11] = 4.114 (note that MEL ≤ min-capacity)
• Our bound is better than the previous work based on min-capacity.
Compositionality of MEL (unknown channels)
• We cannot directly apply the input approximation when the analyst does not know the component channels Ci:

                              Small      Large, known   Large, unknown
Approximate input dist. π'    known      known          known
Component channels Ci         known      known          approx. statistically
Computing I∞(π'i, Ci)         feasible   feasible       cannot run Ci with the sub-distribution π'i

• So we first estimate I∞(πi, Ci) statistically, and then use the previous inequality to obtain bounds on I∞(π'i, Ci).
• Then, by the input approximation technique, we obtain leakage bounds.
• However, the bounds are worse than when the analyst knows Ci. (See our paper for details.)
Evaluation (by using the extension of our tool leakiEst)
E.g. Sender anonymity in the Crowds protocol
Crowds protocol (on a mobile ad-hoc network, where the network topology changes frequently):
• For anonymous communication, participants forward messages through others toward the server. [Figure: the initiator forwards messages with probabilities 0.3 / 0.7; the server and corrupted participants belong to the attacker.]
• The attacker
  • knows the network topology,
  • controls the server & the corrupted participants,
  • can guess the initiator only twice (2-tries attack).

Model:
• We model the network topology as a channel Ci.
• We consider repeated executions of Crowds (with possibly changing topologies) and model them as a composed channel C1 || C2 || … || Cn.

2-tries g-leakage (anonymity of the initiator):
• The attacker makes 2 guesses w1, w2 about the initiator x.
• gW2({w1, w2}, x) = 1 if x ∈ {w1, w2}, and 0 otherwise.

Results: [plots omitted]
E.g. Randomly generated channels
• Component: a randomly generated 10×10 channel matrix C; the analyst does not know C.
• Accuracy: MEL of 6 composed channels. [Plots omitted: the bounds degrade when the input distribution is far from uniform, and when small probabilities are removed too aggressively.]
• Time (using our extension of the tool leakiEst):
  • Calculating the true leakages: exponential in the number of component channels.
  • Calculating our leakage bounds: linear in the number of component channels.
Summary
• Compositionality results for “g-leakage”.
• Improvement of these results by the input approximation technique.
• Evaluation of our results
  • on an example of the Crowds protocol,
  • on randomly generated channels.

Future work
• To find how to determine the parameters that yield the optimal bounds.
• To explore possible relationships between our input approximation and “smooth entropy” in information theory.
• Compositionality results for more complicated compositions (ongoing).
Thank you for your attention.