slides Wu final

On Balancing Middlebox Set-up Cost and Bandwidth Consumption in NFV Jie Wu (PhD student: Yang Chen) Center for Networked...

2 downloads 115 Views 8MB Size
On Balancing Middlebox Set-up Cost and Bandwidth Consumption in NFV Jie Wu (PhD student: Yang Chen) Center for Networked Computing Temple University, USA

1. Introduction of Middlebox l Network Function Virtualization (NFV) ¡

Technology of virtualizing network functions into software building blocks

l Middlebox: software implementation of network services ¡

Improve the network performance: l Web proxy and video transcoder, load balancer, …

¡

Enhance the security: l Firewall, IDS/IPS, passive network monitor, …

l Examples

Web Proxy

Firewall

NAT

Middlebox Dependency Relations [1] l Multiple middleboxes may/may not have a serving order ¡

Examples l

Firewall usually before Proxy

l

Virus scanner either before or after NAT gateway

l Categories ¡

Non-ordered middlebox set

¡

Totally-ordered middlebox set (service chain)

¡

Partially-ordered middlebox set

[1] Dynamic Service Function Chaining in SDN-Enabled Networks with Middleboxes (ICNP ’16)

Middlebox Traffic Changing Effects [2] l Middleboxes may change flow rates in different ways ¡

Citrix CloudBridge WAN accelerator: 20% (diminishing)

¡

BCH(63,48) encoder: 130% (expanding) Checksum

Data 1

0

1

1

1

0

1

0

[2] Traffic Aware Placement of Interdependent NFV Middleboxes (INFOCOM ’17)

Middlebox Placement Overview l Problem ¡

Placing middleboxes to satisfy all flows’ middlebox service requests

l Objectives: ¡

Minimizing middlebox setup cost

¡

Minimizing bandwidth consumption

[3] [2]

l Constraints ¡

Dependency relations

¡

Traffic-changing effects

¡

Vertex capacity and middlebox processing volume

[2] Traffic Aware Placement of Interdependent NFV Middleboxes (INFOCOM ’17) [3] Provably Efficient Algorithms for Joint Placement and Allocation of Virtual Network (INFOCOM ’17)

A Middlebox Placement Model [4] l Cost

Setup cost

m

Communication cost

f2 l Objective ¡

f1 Minimizing sum of middlebox setup cost and communication cost

l Two special cases ¡

Facility location problem l Single middlebox placement

¡

Generalized assignment problem l Each middlebox has a limited processing volume l Placing middleboxes and assigning to flows

[4] Near Optimal Placement of Virtual Network Functions (INFOCOM ’15)

A Service Chain Model [2] l Objective ¡

Minimizing the total bandwidth consumption

l Solutions ¡ ¡

m1

Consider traffic-changing effects Place middleboxes for a single flow

m2

m3

m1

m2

m3

Non-ordered

Totally-ordered

(Optimal greedy: sort

(Optimal DP: latter

traffic-changing ratios in increasing order)

middleboxes must be after front ones)

m1

m2

m3

Partially-ordered (NP-hard: reduced from the Clique Problem)

[2] Traffic Aware Placement of Interdependent NFV Middleboxes (INFOCOM ’17)

2. Our Model l Problem ¡

Placing middleboxes to satisfy all flows’ network service requests

l Network service requests ¡

Multiple middleboxes l Middlebox set with or without dependency relations

l Cost ¡ ¡

Middlebox setup l Sum of middlebox setup cost Bandwidth consumption l Sum of each flow’s bandwidth consumption cost on each link

l Objective ¡

Minimizing total cost of middlebox setup and bandwidth consumption

A Motivating Example m: 0.8 (diminishing) m’: 1.3 (expanding)

Independent middleboxes

Dependent middleboxes: m’ before m

Red flow with high rate

A flow covered by multiple middleboxes (Multiple coverage: when additional setup cost is less than the reduced bandwidth consumption cost)

3. Problem Formulation l Middlebox setup cost ¡

l

cm: unit setup cost of middlebox m

l Bandwidth consumption cost ¡

l

w(bfe): bandwidth cost function of flow f on link e

¡

l l

rf: initial traffic rate of flow f : traffic-changing ratio of middlebox m

l Objective ¡

Minimizing c1 +c2

Problem Formulation (cont’d) l Translog bandwidth cost function on each link

l Reasons ¡

Widely used in Cisco EIGRP and OSPF protocols

¡

Log-linear for easy calculation

l The weight of setup cost and bandwidth consumption ¡

Adjusting the traffic-changing ratios and unit setup costs of middleboxes

Problem Complexity NP-hard l

Even with no traffic-changing effects

l

Even when placing a single middlebox

Proof l

Reduction from set-cover problem

l

Use minimum number of middleboxes to “cover” all flows ¡ Flows as elements: F= {f1,f2,…, f|F|} ¡ Placed middleboxes as sets: {S1, S2,…} ¡ S1={f1, f2, f4}, S2={f1, f2}, S3={f3}

Problem Complexity (cont’d) Yang Chen and Jie Wu

l In this paper, we focus on tree-structured topologies 2 (0, 1) implies the bandwidth ey do not in�uLeft Right mplies that the Triangle Triangle idth consumpe calculated as

m)

(a) Complete tree. data centers 8 Tree-based e (6)

(b) Hierarchical data centers center. Hierarchical data

Figure 2: Tree-structured topologies in data centers.

e simplify the Each triangle is st of each �ow mostly a perfect topologies like Fat-tree [1] or BCube [11] in data centers. More genor complete tree erally, data centers always use symmetric, hierarchical topologies e (7) m 8 to balance tra�c load [15]. Because of the bi-directional links, the Complete tree Perfect tree topology can be abstracted as two connected, complete trees, as on by the selecshown in Fig. 2(b). The up and down links separate a hierarchical

4. Placing a Single Middlebox Solution l

Local Greedy Algorithm (LGA)

Steps l

Calculate each total cost of placing middleboxes in a whole level

l

Select the level with the minimum total cost

l

Iterative implementation ¡ From top level to bottom, total costs will decrease and then increase ¡ Select the level with the local minimum

4. Placing a Single Middlebox (cont’d) Time complexity (|V|: #node) ¡

O(|V|)

Optimal for perfect tree topologies ¡

Symmetry of placement

¡

No multiple “coverage” situation

Also optimal for complete tree topologies ¡

Also multiple “coverage” situation

¡

The most unbalanced traffic distribution: left and right subtrees of root have a depth difference of 1

Illustration Total cost

Calculate level by level

Optimal

Level

5. Placing Multiple Middleboxes Non-ordered middlebox set placement l Solution ¡

Combined Local Greedy Algorithm (CLGA)

l Insight ¡

Place each middlebox independently by applying LGA

l Time complexity (|V|: #node, |M|: #middlebox) ¡

O(|V||M|)

l Optimal for complete trees

Totally-ordered Middlebox Set Placement l Solution: Dynamic Programming (DP) l Works for infinite and finite vertex capacity l OPT(i, j) ¡ ¡

Minimum cost of subtree with root vi when placing first j middleboxes in the set if capacity is not enough (k+1)th to jth middleboxes First j middleboxes

. -

-

capacity (in�nite as a special case). Each vertex is numbered sequentially by BFS and each middlebox is numbered in service chain order. Suppose n = |V |. OPT(i, j) denotes the minimum cost of the placement in tree with the root i when we have placed the �rst j middleboxes for all paths from leaf nodes to i . If the capacity is not enough to place j middleboxes, the cost is 1. The optimal gives a recursive formula in the left triangle: lsubstructure Left triangle

Dynamic Programming Formulation 8 min {OPT(2i, k ) + OPT(2i + 1, k ) > > > 0k j > > P P > n c. > > + c + blog ic }, 1  i  b > l l 2 > > > k > > P > < cl + blog icr f , b n2 c < i  n. OPT(i, j) => > 0l j > > > > > > > 1 if not enough node capacity. > > > > > > > > otherwise. : 0

Left subtree

Right subtree

(8)

After placing in the left triangle, we place the remaining middlein thetriangle totally-ordered set in the right triangle in the reverse lboxes Right order of the service chain. The nodes are also numbered by BFS. Similarformula to thefor left The ¡recursive thetriangle’s placementformulation in the right triangle is shown in Eq. (9). Little di�erence exists between Eq. (8) and Eq. (9). For simplicity, we only discuss Eq. (8) in the following. P 8 min {OPT(2i, k ) + OPT(2i + 1, k ) + cl > > > > k 0k j

Bandwidth consumption

Newly placed middleboxes

An Example m1

m2

m3

Traffic-changing ratio 0.5 Setup cost 0.2

0.8 0.4

1.1 0.3

l Dependency relations ¡ m1 ->m2 ->m3 l Initial traffic rate ¡

r1=r2 =r3 =1

Totally-ordered Middlebox Set Placement (cont’d) Insights l

The optimal placement with root vi by placing first j and its two subtrees by placing no more than j middleboxes

Perfect tree ¡ ¡

Transformed to a line Similar to a single flow placement

Complete tree ¡

No multiple “coverage” situation

Time complexity (|V|: #node, |M|:#middlebox) ¡

O(|V||M|3 )

Partially-ordered Middlebox Set Placement l NP-hard even for a single flow

[2]

l One heuristic solution ¡

Insight l

¡

Steps l l

¡

Transform into a totally-ordered middlebox set ( : traffic-changing ratio)

Treat middleboxes with dependencies as a single middlebox Sort middleboxes in increasing order of

Example

m1 m2 m3 m4 0.7 1.1 0.8 1.1

m5 0.5

m6 1.4

l

Middlebox set

l

Dependency relationship m2 -> m3, m4 -> m5-> m6 m1 0.7

m4

m5

m6

1.1*0.5*1.4=0.77

m2

m3

1.1*0.8=0.88

[2] Traffic aware placement of interdependent NFV middleboxes (INFOCOM ’17)

Partially-ordered Middlebox Set Placement (cont’d) l Another heuristic solution ¡

Insight l

¡

Steps l

l

¡

Transform into a non-ordered middlebox set Treat middleboxes with dependencies as a single middlebox by a topological order No dependency relations among new middleboxes

Example

m1 m2 m3 m4 0.7 1.1 0.8 1.1

m5 0.5

m6 1.4

l

Middlebox set

l

Dependency relationship m2 -> m3, m4 -> m5-> m6 m1

m2

m3

0.7

1.1*0.8=0.88

m4

m5

m6

1.1*0.5*1.4=0.77

6. Handling Heterogeneous flows for Non-ordered Middlebox Set l Group Flows by Initial Bandwidths (GFIB) ¡

Group flows by initial traffic rates (rf : f’s traffic rate) l

#group:

l

The traffic rate range of the ith group:

¡

Treat flows in each group as homogeneous

¡

Apply CLGA for each group

l An example Combine CLGA

r1=1

CLGA

r2=2

r3=3

max rf= 7 min rf= 1 Group 1: [1,2) Group 2: [2,4) Group 3: [4,8)

CLGA

r4=4

r5=6

r6=7

6. Handling Heterogeneous Flows for Non-ordered Middlebox Set (cont’d) l Time complexity

l Performance-guaranteed algorithm ¡

Approximation ratio

[5]

:

[5] On Optimal Scheduling of Multiple Mobile Chargers in Wireless Sensor Networks (MSCC ’14)

7. Simulation l Our algorithms ¡

¡

LGA l

Single middlebox

l

Select the level with the minimum cost

CLGA l l

¡

DP l l

¡

Non-ordered middlebox set Apply LGA independently Totally-ordered middlebox set Dynamic programming

GFIB l l l

Heterogeneous flows Group flows by initial traffic rates Combine placement by applying CLGA for each group

7. Simulation l Comparison algorithms ¡

Random-fit l Randomly place middleboxes until all flows are satisfied

¡

NOSP [2] l Place middleboxes in increasing order of traffic-changing effects for each flow from source to destination independently l For single middlebox or non-ordered middlebox set

¡

TOSP l l

[2]

Dynamic programming based algorithm for each flow independently For totally-ordered middlebox set with or without vertex capacity

[2] Traffic aware placement of interdependent NFV middleboxes (INFOCOM ’17)

Settings l Topology ¡

Perfect 5-layer binary tree for each triangle

l Facebook data center traffic trace ¡

Single-flow initial traffic rate: 1~6 Mb

l Middlebox set m1 Traffic-changing ratio 0.7 Setup cost 0.4

l Dependency relationship ¡

m2 -> m3 -> m1 -> m4

m2 0.8 0.6

m3 1.1 0.2

m4 1.2 0.8

0

2

3 4 5 Traffic rate (Mbps)

0

6

1

(a) Total cost value.

Simulation Results

(b

Chen Placement and Jie Wuwith Balanced NFV Yang Middlebox Set-up Cost and Bandwidth Consum Figure 7: Totally-ordered middlebox

30 20 10 0

1

2

3 4 5 Traffic rate (Mbps)

6

4 3 2

30 20 10

1 0

40

1

0 21

32 43 54 65 Traffic rate (Mbps) Traffic rate (Mbps)

6

25 20 15 10

Middlebox set-up cost

40

5

40 20

5 0

1

0 2 1 3 2 4 3 5 4 6 5 Traffic rateTraffic (Mbps) rate (Mbps)

T

50

GFIB CLGA NOSP NOSP Random-fit Random-fit 60

Total cost value

Middlebox set-up cost

Total cost value

50

80

30

CLGA LGA NOSP NOSP 50 Random-fit Random-fit

6

Middlebox set-up cost

LGA NOSP Random-fit

60

Total cost value

60

e r If

To

40 30 20 10 0

6

1

(a) Total cost value. (b) Middlebox setup cost. (a) Total cost value. setup cost cost.value. (a) Total Single middlebox Non-ordered middlebox set(b) Middlebox Bandwidth heterogeneity50 (b (LGA) (CLGA) (GFIB) set under ban Figure 4: Single middlebox under bandwidth Figure homogeneity. 5: Non-ordered middlebox Figure 8:set. Non-ordered

l

LGA costs 20.3% less than NOSP and 35.1% less than Random-fit.

Total cost value

eig e al ot s, A 1

1

40 30

setup cost of GFIB is much lower than only handle �ows one at a time. For each, we select one �ow and to it addressing the middlebox-sharing lfollower CLGA the even heavy traffic. 20 andperforms a Hadoop node [22].best More 80%with �owsuntil are less run than the algorithms all �ows have been selected. timal, GFIB is still worth applying to th than 6 Mbps. As a result, the tra�c rate (2) ranges from 1 randomly to 6 Mbps places middleboxes Random-�t on random nodes 10 lwith The performance of Random-fit is not steady. demonstrates the e�ciency and e�ectiv a stride of 1 Mbps in this paper. The of alluntil �owsallare the are “covered”. The middleboxes 1 onsources the paths �ows are In summary, the experiments verify nodesheterogeneous and their destinations are the root. Wetheir assume each about link sorted by tra�c-changing ratios. lleafFor flows, GFIB saves 36.9% and 34.0% ciency of our proposed algorithms in th is bidirectional and has enough bandwidth to hold all �ows, which compared to and NOSP and Random-fit. and in the shared-root-double-tree topo eliminates congestion ensures that of all �ows 7.3the routing Evaluation forisHomogeneous Flows Fig only considering bandwidth consumptio successful. This is because routing failure is not the concern in this We start with homogeneous �ows. Thesharing independent variable in �ows save middleboxes among paper. The selections of the parameters are based on [10]. x-axis is the uni�ed bandwidth for all �ows. a consumptio TakingFirst, bothwe theevaluate bandwidth For most cases, the capacities of all servers are in�nite since the dyn single type of middlebox placement usingusage LGA into in Fig. 4. All curves it is worth consideration, m the numbers of middleboxes are relatively small compared to the both m in Fig. 4(a) are increasing because heavier tra�c consumes more and CLGA can be used as e�cient, gree

Table 4: Di�erent dependencies with r f = 3 Mbps.

20 15 10 5 0

CLGA NOSP Random-fit

Totally-ordered middleboxes Total cost Set-up cost 0.8 ! 1.1 ! 0.7 ! 1.2 20.9 10.4 1.1 ! 0.7 ! 0.8 ! 1.2 23.7 12.0 0.7 ! 1.2 ! 1.1 ! 0.8 22.8 9.6 0.7 ! 0.8 ! 1.1 ! 1.2 11.9 4.4 24.7 10.2 ICPP’18, August 2018, Eugene, Oregon, USA1.2 ! 1.1 ! 0.8 ! 0.7

Simulation Results (cont’d)

1

2

Totally-ordered middlebox set

3 4 5 Traffic rate (Mbps)

6

Yang Chen and Jie Wu

(b) Middlebox setup cost. 60

DP TOSP Random-fit

Total cost value

dered middlebox set. 50 40 30

DPDP TOSP TOSP Random-fit 20 40 Random-fit

15

e. For each, 20 we select one �ow and ws have been selected. 10 ces middleboxes on random nodes 0 1 The 2 middleboxes 3 4 5 e “covered”. are Traffic rate (Mbps) ratios.

20

25 50

30

10 5

20

Middlebox set-up cost

6

25

Middlebox set-up cost Total cost value

Middlebox set-up cost

30

18 16

DP TOSP Random-fit

introduce a performance-guaranteed algorithm. Extensive simula tions show e�ciency and e�ectiveness of our algorithms.

14 12 10

REFERENCES

[1] A��F����, M., L��������, A., ��� V�����, A. A scalable, commodity data cente network architecture. SIGCOMM Comput. Commun. Rev. 38, 4 (Aug. 2008), 63–74 1 [2] C�����, 2 3 M., K������, 4 5 T., R���������, 6 6 2 2 3 3 4 4 5 5 6 6 R., ��� S������, S. Virtualizing the Traffic rate (Mbps) Traffic rate (Mbps) Traffic rate (Mbps) network forwarding plane. In Proceedings of the Workshop on Programmable (a) Total cost value. (b) Middlebox setup cost. Routers for Extensible Services of Tomorrow (New York, NY, USA, 2010), PRESTO (a) Total cost value. (b) Middlebox setup cost. Without vertex capacity With vertex capacity Middlebox order effect at 3 Mbps (DP) ’10, ACM, pp. 8:1–8:6. mogeneous Flows [3] C����. Cisco eigrp protocol. 6: Totally-ordered node Figure 7: Totally-ordered middleboxFigure set with node capacity. set without [4] C����. Cisco:capacity. Nat order of operation. ows. The independent variable in [5] C�����. Citrix cloudbridge product overview, 2015. [6] C����, R., L�����E����, L., N���, J. S., ��� R��, D. Near optimal placement o h for alll �ows. First, we evaluate a virtual functions. the dynamic programming, our proposed DP hasnetwork the lowest costInin2015 IEEE Conference on Computer Communication ment using 80 LGA in Fig. 4. All curves 50 GFIB (INFOCOM) (April 2015), pp. 1346–1354. GFIB both NOSP metrics. The cost is a little larger than that of the non-ordered NOSP consumes more use heavier [7] D����, S. IP over WDM: building the next-generation optical Internet. John Wiley l tra�c 40 Random-fit Random-fit 60 middlebox set. This is because the dependency relations make it & Sons, 2004. lowest cost value, and on average, F����������, S����, V., Y�, M., ��� M����, J. Flowtags: Enforcing network 30 more di�cult to place a single type of[8]middlebox atS.,its optimal NOSP and l4035.1% less than Randomwide policies in the presence of dynamic middlebox actions. In Proceedings of the location. We also �nd that when the tra�c becomes larger,Workshop the on Hot Topics in Software De�ned Networking IV, we state that LGA is optimal Second ACM SIGCOMM 20 (New York,itNY, USA, 2013),the HotSDN ’13, ACM, pp. 19–24. Random-�t algorithm performs much worse than does with middlebox 20 for homogeneous �ows [9] G������J�������, A., V����������, R., P������, C., G�����, R., K����� 10 independent set. The dependency relations limit the placement terms of middlebox setup, the cost J., D��, S., ��� A�����, A. Opennf: Enabling innovation in network function more, so the random placement needs morecontrol. middleboxes. Random0 In Proceedings of the 2014 ACM Conference on SIGCOMM (New York, NY it places 0a1 required middlebox in 1 2 3 4 5 6 2 3 4 5 6 USA, 2014), SIGCOMM ’14,the ACM, pp. 163–174. �t performs a little better in the second case. The limitation of Traffic rate (Mbps) e are 16 leaf nodes,Traffic NOSP rateneeds (Mbps) 16 [10] G�������, R. Network Functions Virtualisation: An Introduction, Bene�ts, Enablers server capacity eliminates .4 = 6.4. This is(a)the largest number Total cost value. (b) Middlebox setup cost. the location possibilities Challenges of andRandom-�t’s Call for Action, Introductory white paper. 2012 SDN and OpenFlow middleboxes. As a result, the performanceWorld di�erence among Congress, 2012. these e that all �ows are “covered”. Hence, [11] G��, C., L�, G., L�, H., Z����, X., S��, Y., T���, C., Z����, Y., ��� L� three methods is decreased, especially when the tra�c rateD.,isW�, large. 0 10 1 1

8

Limited vertex capacity increases the minimum cost. Middlebox set-up cost

Total cost value

The total cost is larger than the non-ordered middlebox set.

The order of a middlebox set matters not only for total cost but also for set-up cost.

OSP is Figure the largest. The middlebox 8: Non-ordered set under bandwidth heterogeneity.

8. Conclusion and Future Work l Middlebox constraints ¡

Traffic-changing effects Dependency relations

¡

Flow sharing

¡

l Middlebox placement ¡

Balancing middlebox set-up cost and bandwidth consumption

l Tree-structured topologies ¡

Optimal algorithms for homogeneous flows

¡

Performance-guaranteed algorithm for heterogeneous flows

l Future work ¡

General tree-structures

nd Jie Wu

Other Service Chain Models

g, Temple University, USA wu}@temple.edu

d1 = 1

d2 = 2

Processing time s

f

s'

f'

m1

m2 m3

m1

m2 d1 ...

m3 d2 ...

1 Anf illustrating f' Fig. example. m1 f m1: f' fm' 1

d d' ff'

f

Xm2f

m2f ' ff ' f ' m2 f'f f XXMiddleboxes XX m1 m2 f m 3 m3 f m3 f' f f Flows f XX f m3 ' f' X' 3 4 5 d1 fd2 d1 d2 d1 d2 d1d1 d2d2 d1 d2 0 f (a) f 0 before4 f 0 . 3 2 (b) ff0 . before f . (a) f before f . (b) f 0 before

. . . .. .. .. .. .. .. .. .. . . .. . . .. .. .. .. .. .. .. ... .. . . . .

prolong m1

p1j

m1

d1 j p1

m

p2

(a) Twoser o (a) Two ordered

t’ = 12 t = 10 f before f’ f‘ before f TABLE I: Processing times. scheduling orders. to Fig. 2: Different conformconform to a stric Fig. 2: Different scheduling orders. [12],prop th In [12],Inthey • Minimizing the makespan tc.) of middleboxes according to online changing traffic. in middlebox m2 be cannot less the completion of the chainin in middlebox m2 cannot less be than thethan completion time of time the chaining of net 0 little attention to the flow scheduling However, •they pay 0 Minimizing f in middlebox theddelay d1 , iswhich + 1 = 5. resource al the completion time 1 plus f in middlebox m1average plusmthe delay 4 + 1is=4 5. 1 , which resource allocation equence of the middleboxes, resulting in a topoor control 0 of f 0 in The same situation happens the completion time The same situation happens to the completion time of f in constrained constrained progra of the flow completion time. The flow completion time is middlebox m , which is 5 + 2 = 7. This illustrates that the

Q&A