[Cloud Networking Notes] Week4

CDN (Content Distribution Network)

Why a CDN?

To speed up page loads, one option is static caching. However, static caching faces two big problems:

  • Volume and diversity of content
  • Dynamic content, encrypted content

A CDN can solve these problems.

How to build a CDN?

Building a CDN roughly takes three steps:

  1. Distribute content servers around the globe.
  2. Network the sites and the origin.
  3. Direct clients to appropriate servers.

About step 2

The next two figures compare a TCP transfer without and with a CDN. Look at the first figure: the client already wastes a long time just on the handshake with the origin server.

Now look at the case with a CDN: the content server in Asia acts somewhat like a proxy for the origin server. Much faster, isn't it?
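The saving can be put into rough numbers. A minimal sketch, where the RTT values and the 4-RTT setup count are illustrative assumptions, not figures from the course:

```python
# Rough time-to-first-byte model: count the round trips spent before the
# first response byte arrives (TCP handshake + TLS setup + request/response).
# The RTT figures and the 4-RTT count are illustrative assumptions.

def time_to_first_byte_ms(rtt_ms, rtts_before_response=4):
    """Milliseconds until the first response byte arrives."""
    return rtts_before_response * rtt_ms

rtt_to_origin_ms = 150  # assumed: client in Asia, origin server far away
rtt_to_edge_ms = 20     # assumed: client in Asia, CDN content server nearby

print(time_to_first_byte_ms(rtt_to_origin_ms))  # 600
print(time_to_first_byte_ms(rtt_to_edge_ms))    # 80
```

With the nearby content server, the same connection-setup overhead costs 80 ms instead of 600 ms, before a single byte of content moves.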

About step 3 (Direct clients to appropriate servers)

Two factors influence server selection:

  • Estimated latency to the client from different locations
  • Load at different CDN locations

The discussion below frames this as "distance", but it should really be understood as a weighting factor.
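As a sketch of how latency and load could be folded into one score (the weights and site data here are assumptions for illustration, not values from the course):

```python
# Score each CDN site by a weighted combination of estimated latency to
# the client and current load; lower is better. Weights and site data
# are made up for illustration.

def score(site, latency_weight=1.0, load_weight=0.5):
    return latency_weight * site["latency_ms"] + load_weight * site["load_pct"]

sites = [
    {"name": "tokyo", "latency_ms": 30, "load_pct": 90},
    {"name": "singapore", "latency_ms": 45, "load_pct": 20},
    {"name": "us-west", "latency_ms": 120, "load_pct": 10},
]

best = min(sites, key=score)
print(best["name"])  # singapore: slightly farther than Tokyo, far less loaded
```

Note how the load term can override raw proximity: the nearest site is not always the best choice when it is heavily loaded.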

First, the site's hostnames are rewritten into CDN-style URLs.

[Figure 1]

Then, during DNS resolution, a content server close to the client is found and returned.
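A minimal sketch of this selection step, assuming the resolver's IP has already been geolocated to coordinates (server names and locations below are invented for illustration):

```python
import math

# Sketch of DNS-based server selection: geolocate the client's resolver,
# then return the content server nearest to it by great-circle distance.
# Server names and coordinates are illustrative assumptions.

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

CONTENT_SERVERS = {
    "edge-tokyo": (35.68, 139.69),
    "edge-frankfurt": (50.11, 8.68),
    "edge-virginia": (38.95, -77.45),
}

def resolve(client_lat, client_lon):
    """Return the name of the nearest content server."""
    return min(
        CONTENT_SERVERS,
        key=lambda name: haversine_km(client_lat, client_lon, *CONTENT_SERVERS[name]),
    )

print(resolve(31.23, 121.47))  # client near Shanghai -> edge-tokyo
```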

[Figure 2]

Note that the scheme above uses the location of the client's local DNS resolver as a proxy for the client's location. This raises a problem: what if the client uses a public DNS service, e.g. Google Public DNS (8.8.8.8)? Then the resolver's address no longer reveals where the client is, so how do we geolocate it?

Why is Google Public DNS (8.8.8.8)'s ping latency so low?

8.8.8.8 is not one host. Instead, it's an anycast address that is routed to the nearest of many hosts at locations around the world.

Client Connectivity

Bandwidth vs. Latency

Bandwidth certainly matters, especially for video-based applications. But even Ultra-HD video only needs about 15 Mbps, which is no strain at all in the fiber era. So latency, not bandwidth, is the real bottleneck for user experience.
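A toy model makes the point: past a modest bandwidth, page-load time is dominated by round trips, not transfer time. The page size, round-trip count, and link speeds below are illustrative assumptions, in the spirit of the blog post cited next:

```python
# Toy page-load model: fetching a page costs a fixed number of round trips
# (connection setup, the HTML, then dependent resources) plus raw transfer
# time. All numbers are illustrative assumptions.

def page_load_ms(bandwidth_mbps, rtt_ms, page_kb=1000, round_trips=10):
    transfer_ms = page_kb * 8 / (bandwidth_mbps * 1000) * 1000
    return round_trips * rtt_ms + transfer_ms

# Increasing bandwidth shows diminishing returns...
print(page_load_ms(5, 60))    # 600 + 1600 = 2200.0 ms
print(page_load_ms(10, 60))   # 600 + 800  = 1400.0 ms
print(page_load_ms(100, 60))  # 600 + 80   = 680.0 ms
# ...while halving RTT keeps cutting load time.
print(page_load_ms(100, 30))  # 300 + 80   = 380.0 ms
```

Once transfer time is small relative to the accumulated round trips, only lower latency moves the needle.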

See this measurement-driven blog post on latency becoming the web performance bottleneck:
Latency: The New Web Performance Bottleneck

However, when it comes to your web browsing experience, it turns out that latency, not bandwidth, is likely the constraining factor today.

Just for Fun

Internet speed showdown across mainland China, Hong Kong, and Taiwan. Mainland China comes out badly:

[Figure 3]

Global internet speed showdown. South Korea dominates, averaging 23.3 Mbps:

[Figure 4]

Measurement report: trends-visualizations-connectivity-global-heat-map-internet-speeds-broadband-adoption

Coping With Network Performance:

Application-layer Tweaks for Lower Latency

First, let's work through a few questions.

Question 15
Suppose you have a large pool of identical servers hosting a replicated service. Further, assume that the request-response time for the service is random, with the following distribution: 10ms for 99.8% of the requests, but 1 second for the remaining 0.2%. Assume requests are completely independent. If you make 100 requests in a batch in parallel, what is the probability that your batch of requests takes 1 second?

Solution:

P = 1 − (0.998)^100 ≈ 0.18

So when a batch issues 100 requests in parallel, there is about an 18% chance that the batch suffers a 1-second delay. Quite serious!
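This answer is easy to check numerically:

```python
# Question 15 check: probability that at least one of 100 independent
# requests hits the 1 s tail, each with probability 0.002.
p_slow = 0.002
p_batch_slow = 1 - (1 - p_slow) ** 100
print(round(p_batch_slow, 2))  # 0.18
```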

Question 16
Now, let’s try to alleviate the problem we see above – the above chance of a request-batch taking 1s is very poor! In the same scenario, suppose that to speed up our request-batch, we start a timer immediately after issuing the 100 requests in the batch and then replicate all the queries for which we don’t get responses within 10ms. What is the likelihood that we make 3 or more replicated requests? Assume that we only make one batch of replicated queries, and don’t do the replication for the replicated queries themselves, etc. Also, you can use this binomial calculator tool if you like.

Solution:
Let X be the number of replicated requests.

P(X ≥ 3)
= 1 − P(X < 3)
= 1 − (C(100,100)·0.998^100 + C(100,99)·0.998^99·0.002 + C(100,98)·0.998^98·0.002^2)
≈ 0.001

So the probability of making three or more replicated requests is only about 0.1%. Replicating many requests is thus very unlikely, which also means the chance of the batch taking a full second drops dramatically!
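The binomial tail sum can be verified directly:

```python
from math import comb

# Question 16 check: P(X >= 3), where X ~ Binomial(100, 0.002) counts
# the requests that miss the 10 ms deadline and must be replicated.
p = 0.002
p_lt_3 = sum(comb(100, k) * p**k * (1 - p) ** (100 - k) for k in range(3))
p_ge_3 = 1 - p_lt_3
print(round(p_ge_3, 3))  # ~0.001
```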

Question 17
Further, in the above scenario (with the replication after 10ms), let’s assume that we have to replicate exactly 3 requests after 10ms. What is the likelihood that our batch of requests still takes 1 second?
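The question is left open above. Under one reading, which is my assumption rather than an official solution, each slow original (which will take the full second) races against its fresh replica, so the batch still takes about 1 second only if at least one of the 3 replicas is itself slow:

```python
# Hedged sketch for Question 17 (one reading, not an official answer):
# each of the 3 replicated queries is independently slow with p = 0.002,
# so the batch still takes ~1 s iff at least one replica is also slow.
p = 0.002
p_still_slow = 1 - (1 - p) ** 3
print(round(p_still_slow, 4))  # ~0.006
```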
