Scaling CloudFlare’s Massive WAF

Reposted from: http://www.scalescale.com/scaling-cloudflares-massive-waf/
Posted on December 29th, 2014.
CloudFlare #statporn

14,000 blocked reqs/sec
1.2 billion blocked reqs/day
Goal: exec all rules <= 1ms
actual execution ~400µs
1,937 string matches
5,682 general rules
102 Cloudflare Rules

Application

HTTP Server nginx
App Server: OpenResty
JIT Compiler: LuaJIT

Algorithms

String Matching: Aho-Corasick

Rules

Open Rules: OWASP

System Profiling

FlameGraph: SystemTap generated
Real Time Analyzing: Nginx SystemTap Toolkit

I first heard John speak at the Nginx.Conf conference in San Francisco. He did an amazing job explaining the large-scale, high-volume WAF (Web Application Firewall) platform that he and his colleagues have built. In this interview he explains design goals, benchmarking, testing and the rollout of new WAF rules. The story here is really about performance and scale: squeezing every last drop out of Nginx and Lua. Enjoy.

–Chris / ScaleScale / MaxCDN

John is an engineer at CloudFlare who designed their WAF.

What is the vision behind the WAF?

CloudFlare wants to provide a WAF to a very large number of customers. Doing so meant being compatible with the existing mod_security WAF, which gave us two things: we could leverage existing rulesets, and people familiar with mod_security (both CloudFlare people and customers) could write new rules.

How CloudFlare WAF Works

CloudFlare’s WAF stops attacks at the network edge, protecting your website from common web threats and specialized attacks before they reach your servers. It covers both desktop and mobile websites as well as applications.

The Web Application Firewall (WAF) works by examining HTTP requests to your website. It looks at both GET and POST requests and applies rules to help filter out illegitimate traffic from legitimate website visitors. You can decide whether to block, challenge or simulate an attack. With blocking and challenging, CloudFlare’s WAF will block any traffic identified as illegitimate before it reaches your origin web server.
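
The flow described above (inspect the request's arguments, apply rules, pick an action) can be sketched roughly as follows. This is an illustration in Python, not CloudFlare's Lua code: the `Rule` class, the two rule IDs, their patterns and the `inspect` function are all hypothetical, and real rules are far more thorough.

```python
import re
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit


@dataclass(frozen=True)
class Rule:
    rule_id: str         # hypothetical identifier
    pattern: re.Pattern  # regex applied to each decoded argument value
    action: str          # "block", "challenge" or "simulate"


# Hypothetical rule set, for illustration only.
RULES = [
    Rule("xss-script-tag", re.compile(r"<\s*script\b", re.IGNORECASE), "block"),
    Rule("sqli-union-select", re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE), "challenge"),
]


def inspect(method: str, url: str, body: str = "") -> str:
    """Return "allow", or the action of the first rule that matches an argument."""
    values = [v for _, v in parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    if method == "POST":
        # Assumes a form-encoded body; other body types would need their own parser.
        values += [v for _, v in parse_qsl(body, keep_blank_values=True)]
    for rule in RULES:
        if any(rule.pattern.search(v) for v in values):
            return rule.action
    return "allow"
```

In this sketch a caller would reject the request on "block", present a challenge page on "challenge", and only log the match on "simulate".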


CloudFlare’s Web Application Firewall (WAF) automatically protects your website from these types of attacks:

• SQL injection, comment spam
• Cross-site scripting (XSS)
• Distributed denial of service (DDoS) attacks
• Application-specific attacks (WordPress, CoreCommerce)

Testing CloudFlare’s XSS Protection

Using www.jgc.org, it’s very easy to see the CloudFlare WAF in action. Using a simple GET operation with a dummy variable that contains a basic XSS script will trigger the security feature and show a page saying that you have been blocked.

Request Headers

GET /?user=<script>alert("test")</script> HTTP/1.1
Host: jgc.org
Connection: keep-alive
...

Response Headers

HTTP/1.1 403 Forbidden
Date: Wed, 10 Dec 2014 06:56:35 GMT
Content-Type: text/html; charset=UTF-8
...

Click here to see the error screen generated by the WAF

Where did the initial and new rules come from?

We use the open source OWASP ruleset as well as our own internal rules, which we developed based on attack traffic against CloudFlare customers. Today the majority of blocked requests are being stopped by our custom rules.

We develop rules internally based on attacks or vulnerabilities and then build a test suite (positive and negative tests to ensure that the rules are blocking only what we want). We have a large automatic test suite for the WAF which gets run across the entire rule set to ensure that it’s working correctly.
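
A positive/negative test suite of the kind described above might look roughly like this. The rule, the sample strings and the `run_suite` helper are hypothetical and written in Python for illustration; they are not taken from CloudFlare's test suite.

```python
import re

# Hypothetical rule: flag a <script> tag in a request argument.
XSS_RULE = re.compile(r"<\s*script\b", re.IGNORECASE)

# Positive cases must be caught; negative cases must pass untouched.
POSITIVE = ['<script>alert("test")</script>', "<SCRIPT src=//x.example/a.js>", "< script>"]
NEGATIVE = ["john", "a description of scripts", "1 < 2 and 3 > 2"]


def run_suite(rule, positive, negative):
    """Return a list of (kind, sample) failures; an empty list means the rule passed."""
    failures = [("missed attack", s) for s in positive if not rule.search(s)]
    failures += [("false positive", s) for s in negative if rule.search(s)]
    return failures
```

Running every rule against both lists on each change is what catches a rule that starts blocking legitimate traffic.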

Recently added WAF Rules

Description: Drupal 7 SQL injection
Exploit: SA-CORE-2014-005
Blog post: Drupal 7 SA-CORE-2014-005 SQL Injection Protection

Description: Shellshock
Exploit: Shellshock (software bug)
Blog posts: Inside Shellshock: How hackers are using it to exploit systems; Shellshock protection enabled for all customers

Description: WHMCS Zero Day Vulnerability
Exploit: WHMCS Security Advisory for 5.x
Blog posts: Patching a WHMCS zero day on day zero; Protect Your Sites With Rapidly Deployed WAF Rules

We process all requests (GETs, POSTs, etc.) and the bodies that go with them. We have a custom routine inside the WAF that looks at POST data (for example) and identifies it both by the MIME type and by sniffing the actual bytes to see what the data is.
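
Identifying a body by both its declared MIME type and its actual bytes could be sketched as below. The `DECLARED` mapping, the two sniffing heuristics and the function names are assumptions made for this Python illustration; the interview does not say which formats the real routine recognises.

```python
import json

# Hypothetical mapping from declared MIME type to a parser name.
DECLARED = {
    "application/json": "json",
    "application/xml": "xml",
    "text/xml": "xml",
    "application/x-www-form-urlencoded": "form",
}


def sniff(body: bytes):
    """Guess the format from the bytes themselves; None when nothing is recognised."""
    head = body.lstrip()
    if head.startswith(b"<?xml"):
        return "xml"
    if head[:1] in (b"{", b"["):
        try:
            json.loads(body.decode("utf-8"))
            return "json"
        except (UnicodeDecodeError, ValueError):
            return None
    return None


def classify_body(content_type: str, body: bytes) -> str:
    """Prefer what the bytes look like; fall back to the declared MIME type."""
    declared = DECLARED.get(content_type.split(";", 1)[0].strip().lower(), "unknown")
    return sniff(body) or declared
```

Trusting the bytes over the header means a JSON payload sent with a misleading Content-Type is still inspected as JSON.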

The WAF is not enabled for all customers. Only paying customers receive the WAF.

We work with our customers to define site-specific rules for them and regularly put in place WAF rules to block site-specific attacks. In the future, we plan to roll out a user interface where customers can write and upload their own rules for their sites.

Is speed important to you? What is your philosophy?

Yes, speed matters enormously because of the scale of CloudFlare and because part of our service is performance. We have a variety of benchmarking tools but perhaps more important is our metrics system that allows us to examine real-time and historical performance information (including WAF performance).

Our goal is to run on average in under 1ms for each request being processed by the WAF. Currently we are in the hundreds of microseconds (tenths of a millisecond) per request. As an example, in the last 24 hours we have blocked 1.2 billion HTTP requests (that’s about 14,000 per second).
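
The per-second figure follows directly from the daily total, as this short Python check shows:

```python
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400 seconds

blocked_per_day = 1_200_000_000      # daily total quoted above
blocked_per_second = blocked_per_day / SECONDS_PER_DAY

print(round(blocked_per_second))     # 13889, i.e. "about 14,000"
```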


When you first launched, what kind of latency did you see?

When the code was first written and tested we were seeing about 10ms latency on a laptop. We optimized that using techniques like function memoization, followed by some architectural changes (mostly eliminating the use of closures), which brought the latency close to 1ms. After that the WAF was put into production and work was done using systemtap and internal tools to analyze LuaJIT and PCRE performance. We worked closely with Mike Pall (the LuaJIT maintainer) to ensure that the WAF-specific functions we need are JITed.
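
Function memoization, mentioned above, means caching a function's result so that repeated calls with the same input skip the work. The interview does not say which functions were memoized; the sketch below assumes, purely for illustration, that many rules ask for the same normalised form of one argument. It uses Python's `functools.lru_cache`, whereas the WAF itself is written in Lua, and `normalise` and `run_rules` are hypothetical names.

```python
from functools import lru_cache
from urllib.parse import unquote_plus


@lru_cache(maxsize=1024)
def normalise(value: str) -> str:
    """URL-decode and lowercase a request argument (a hypothetical transformation)."""
    return unquote_plus(value).lower()


def run_rules(value: str, rule_count: int) -> None:
    # Every rule asks for the normalised value; only the first call does the work.
    for _ in range(rule_count):
        normalise(value)


run_rules("%3CSCRIPT%3E", 5000)
print(normalise.cache_info())  # one miss, the remaining calls are cache hits
```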

The difference with LuaJIT is night and day. We would never use plain Lua in production. LuaJIT is far faster than Lua on x64 hardware (see http://luajit.org/performance_x86.html).

How do you speed things up and look for slow execution?

For the initial tuning of the WAF code we used Lua-based profiling tools (and wrote one ourselves) to look at the performance of the Lua code that implements the WAF. Once in production we used systemtap and flamegraphs to identify hotspots and optimize them. When launching into production, we did not need to change anything in our physical infrastructure. We did not purchase or use any new hardware. The WAF is mostly CPU intensive.

< 1ms Latency

Before we implemented the new WAF, CloudFlare had been running Apache alongside nginx just to be able to use mod_security. This combination was very slow and cumbersome. Ultimately it didn’t scale with CloudFlare’s growing business, so we started working on a new WAF using nginx + LuaJIT.

CloudFlare operates one of the world’s largest deployments of nginx + LuaJIT. Every fraction of a microsecond that can be shaved off the processing of a request has a significant impact, so we decided to sponsor some changes to the LuaJIT open source project.

The overall goal of the project was to get the median WAF block/allow decision made under 1ms in real world scenarios. Optimizations were made by examining the WAF’s performance under a test harness with line-level timing information. We ran the WAF in CloudFlare’s network with very detailed systemtap-based instrumentation.
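
A very small timing harness for the median decision time could look like the following. It times whole decisions rather than individual lines, so it is coarser than the line-level harness described above, and `median_latency_us` and its parameters are hypothetical names in a Python illustration.

```python
import statistics
import time


def median_latency_us(decide, requests, repeats=5):
    """Time decide(request) for each request and return the median in microseconds."""
    samples = []
    for _ in range(repeats):
        for request in requests:
            start = time.perf_counter_ns()
            decide(request)
            samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(samples)


# Example: a stand-in decision function that looks for a script tag.
latency = median_latency_us(lambda r: "<script" in r.lower(), ["/?user=john", "/?q=<script>"])
print(f"median decision time: {latency:.1f} µs")
```

The median is used because the stated goal is a median figure, and because it is less sensitive than the mean to the occasional slow outlier.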

Information from systemtap is fed into a pastebin, which parses it and produces a flame graph showing where the code is running.
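
The step between raw stack samples and a flame graph is usually a "folded" form: one line per unique stack, with frames joined by semicolons and followed by a sample count, which is the input format of the FlameGraph scripts. A minimal Python sketch of that folding step, using made-up frame names and a hypothetical `fold` function, is:

```python
from collections import Counter


def fold(samples):
    """Collapse stack samples (root-first lists of frame names) into folded lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples; real ones would come from the systemtap output.
samples = [
    ["nginx", "waf", "regex_match"],
    ["nginx", "waf", "regex_match"],
    ["nginx", "waf", "string_match"],
]
for line in fold(samples):
    print(line)
```

The wider a frame is in the resulting graph, the more samples landed in it, which is what makes hot paths stand out.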

The early flamegraphs showed extensive use of closures, which was causing slowness in LuaJIT. Some parts of the compiler were rewritten to remove them and make it run faster.

Here’s another view generated from the same information which identified hot functions. Here it shows that string matching and regular expressions are the most expensive operations.

To make these matching functions run faster, we have implemented our own version of the Aho-Corasick algorithm. Aho-Corasick is a fast string matching algorithm that can match a large set of keywords simultaneously against incoming text. Its advantage is that it matches multiple strings in a single pass over a large body of text, whereas searching for each string individually with Boyer-Moore requires multiple passes over the text. In this article, the author shows how Aho-Corasick is implemented using Haskell. CloudFlare has also open-sourced a custom Aho-Corasick implementation in Go and in C++ with Lua.
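
To make the single-pass idea concrete, here is a compact textbook version of Aho-Corasick in Python. It is a sketch for illustration only and is not CloudFlare's implementation; the class and attribute names are arbitrary.

```python
from collections import deque


class AhoCorasick:
    def __init__(self, keywords):
        # Trie stored as parallel lists: edges, failure link and matched
        # keywords per node. Node 0 is the root.
        self.goto = [{}]
        self.fail = [0]
        self.out = [[]]
        for word in keywords:
            self._insert(word)
        self._link()

    def _insert(self, word):
        node = 0
        for ch in word:
            nxt = self.goto[node].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto[node][ch] = nxt
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
            node = nxt
        self.out[node].append(word)

    def _link(self):
        # Breadth-first, so shallower nodes are finished before deeper ones.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                queue.append(child)
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit keywords that end at the failure target.
                self.out[child] += self.out[self.fail[child]]

    def search(self, text):
        """Yield (start_index, keyword) for every occurrence, in one pass over text."""
        node = 0
        for i, ch in enumerate(text):
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            for word in self.out[node]:
                yield i - len(word) + 1, word


matcher = AhoCorasick(["he", "she", "his", "hers"])
print(sorted(matcher.search("ushers")))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The text is read once regardless of how many keywords are loaded; the failure links let the matcher fall back to the longest suffix that is still a prefix of some keyword instead of rescanning.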

Optimizations in the Lua language, the LuaJIT compiler and the WAF core made for a very fast and flexible all-Lua WAF that runs within nginx’s core.

See an example in Lua »

Read & watch more about building a low-latency WAF inside NGINX using Lua

Watch John’s presentation on “Building a low-latency WAF inside NGINX using Lua” on YouTube. You can also download the presentation used in this video here.
