Anatomy of Cloudflare’s CloudBleed: what you need to know and fix
This post gathers what you need to know, and what you need to do, if you run a website behind Cloudflare or have personally used one.
Cloudflare is a CDN-like technology, deployed on over 2 million websites, that accelerates traffic and protects against denial-of-service (DoS) attacks. It also provides security features such as a Web Application Firewall.
A CDN (Content Delivery Network) is in charge of accelerating traffic by delivering static assets faster.
Unlike most CDNs, Cloudflare routes all of a website's traffic through its own network, where it inspects and modifies that traffic on the fly.
Recap of Cloudflare’s Cloudbleed security bug
Tavis Ormandy, from Google’s Project Zero, revealed on 2017-02-19 a critical security bug affecting Cloudflare's servers.
The bug meant that web pages fetched through Cloudflare could contain data belonging to any other Cloudflare customer or their users (more details in the technical analysis below).
Here is a timeline of the events:
- 2016-09-22 – First deploy of the vulnerable parser → First potential leak of data
- 2017-01-30 – Migration to a new, more vulnerable architecture → Leak becomes more prominent
- 2017-02-18 – Report by Google, fix deployed 7h later → Leak stops
Technical analysis of Cloudbleed
Use case: a client requests a resource from a service that uses Cloudflare. To rewrite HTTP links to HTTPS, inject Google Analytics tags, and so on, Cloudflare has to parse the HTML of the response.
The HTML parser used by Cloudflare involved C code that suffered from a buffer overrun bug.
As a result, the HTML sent to a user could include data from previous requests handled by the same server, possibly belonging to any other user.
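Cloudflare's incident report attributes the overrun to an end-of-buffer check that used equality rather than "greater than or equal", so a pointer that stepped over the end was never caught. The sketch below is a toy Python simulation of that class of bug, not Cloudflare's parser: a single byte string stands in for server memory, and every name and value in it (`emit_page`, `LEFTOVER`, the token) is hypothetical.

```python
# Toy simulation of an out-of-bounds read; not Cloudflare's actual parser.
# One flat buffer stands in for server memory shared by consecutive requests.
PAGE = b"<p>hi</p><script type="                  # page ends inside an attribute
LEFTOVER = b"Authorization: Bearer s3cr3t-token"  # hypothetical data from another request

MEMORY = PAGE + LEFTOVER
PAGE_END = len(PAGE)  # index one past the last byte of the page


def emit_page(memory: bytes, end: int, safe: bool) -> bytes:
    """Copy the page to the response, stopping at `end` (if the check works)."""
    out = bytearray()
    p = 0
    while p < len(memory):  # hard stop so the simulation itself stays in bounds
        # Buggy check: stops only when p lands exactly on `end`.
        # Safe check: stops as soon as p reaches or passes `end`.
        reached_end = (p >= end) if safe else (p == end)
        if reached_end:
            break
        out.append(memory[p])
        # After '=', the toy parser assumes a quote follows and steps over it,
        # so p can jump from end - 1 straight to end + 1.
        p += 2 if memory[p] == ord("=") else 1
    return bytes(out)


safe_response = emit_page(MEMORY, PAGE_END, safe=True)
buggy_response = emit_page(MEMORY, PAGE_END, safe=False)
print(safe_response)
print(buggy_response)
```

With the safe check the response contains only the page. With the equality check the pointer skips past `PAGE_END` and the response also carries the bytes that happened to sit next to the page in memory.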
Let’s take an example:
- A website (let’s call it website A) is protected with Cloudflare
- You browse to website A and you log in.
- Cloudflare handles your request. Thus, your inputs are stored in their server’s memory
- Later, Google crawls another website that uses Cloudflare. Your previous inputs may be included in any response sent back by Cloudflare.
- The Google crawler stores the retrieved data in the Google search result caches.
- Your inputs are now publicly available.
Cloudflare is a distributed CDN, so traffic is spread across many servers, each with its own memory and caches. Not every piece of data could therefore leak to every customer, but the complexity of this architecture makes it hard to predict what kinds of mixing could occur.
Status and scope of the attack
Since the leak was disclosed, Google and most major US-based search engines have been purging cached pages that contain leaked data.
Any website protected by Cloudflare has potentially been affected since September 2016. The amount of potentially leaked information is huge. The leaked bytes were not encrypted and were immediately usable by anyone lucky enough to find them. Even now, people looking at search engine caches are still finding passwords, various tokens, and even credit card numbers…
The bug at Cloudflare is now fixed, but the leaked data may remain available for a very long time in caches: search engine caches, corporate HTTP caches, ISP HTTP caches… Until all these caches are cleared or overwritten, they may contain any piece of data that went through the Cloudflare network.
In the coming weeks, we can expect people to mine these caches for sensitive information they can exploit.
As an Internet user, how am I impacted?
If you use websites that rely on Cloudflare (a list can be found here; well-known ones include Uber, 1Password, and Hacker News), here are some examples of potentially leaked data:
- HTTP request contents
- Credentials and authentication tokens
- Personally identifiable information
There is, unfortunately, no way to know for sure if your data was exposed.
So, what can you do?
CloudFlare user checklist
To make sure none of your accounts gets compromised, you need to:
- Reset your passwords on the impacted websites you logged in to between 2016-09-22 and 2017-02-18
- Reconfigure your device-based 2FA (e.g. Google Authenticator, but not text-message-based 2FA) on websites where you set it up between 2016-09-22 and 2017-02-18
- If you are using any other secret from these websites (e.g. a Slack token, secret question, etc.), you also need to replace it.
As a developer using Cloudflare on my website, what should I do?
If the website you own is using Cloudflare, your users, and therefore your application, are at higher risk.
Everything you provide to your users needs to be considered at risk:
- Technical data: API keys, API tokens, URL-based secrets issued to your users,
- User credentials.
If you are using 3rd party services that rely on Cloudflare (performance monitoring, exception management, log storage, business data…), this may also put your infrastructure at risk.
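One rough way to take stock of which third-party endpoints sit behind Cloudflare is to look at the response headers they return: proxied responses typically carry a `CF-RAY` header and a `Server` value starting with `cloudflare`. The sketch below works on headers you have already collected, so it makes no network calls. The function name and the sample hosts are hypothetical, and a missing header does not prove a service is unaffected.

```python
from typing import Mapping


def looks_like_cloudflare(headers: Mapping[str, str]) -> bool:
    """Heuristic: does this set of response headers look Cloudflare-proxied?"""
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    server = normalized.get("server", "")
    return "cf-ray" in normalized or server.startswith("cloudflare")


# Hypothetical headers observed from services your application depends on.
observed = {
    "monitoring.example": {"Server": "cloudflare-nginx", "CF-RAY": "0000000000000000-CDG"},
    "logs.example": {"Server": "nginx"},
}

at_risk = sorted(host for host, hdrs in observed.items() if looks_like_cloudflare(hdrs))
print(at_risk)
```

Any host flagged this way is a candidate for rotating the API keys and tokens you exchange with it.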
My website uses HTTPS, does it mean I’m safe?
No. Cloudflare decrypts your SSL traffic as soon as it reaches its infrastructure. All data processed by Cloudflare is therefore exposed to this bug, whether you use HTTPS or not.
CloudFlare application owner checklist
- If any service in your company uses Cloudflare, ask your colleagues to reset their passwords.
- Tell your users their data may have been compromised.
- Ask your users to change their passwords if they logged in between 2016-09-22 and 2017-02-18. If you handle highly sensitive information, you may want to proactively lock their accounts and email them a password reset link.
- Users who set up 2FA between 2016-09-22 and 2017-02-18 should redo their 2FA setup.
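The checklist above can be turned into a small script that decides, per user, which actions to request. This is a minimal sketch assuming you can export login dates and 2FA enrollment dates for each account. The `User` type, its fields, and the action names are hypothetical and will differ in your own data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Exposure window taken from the timeline in this post.
WINDOW_START = date(2016, 9, 22)
WINDOW_END = date(2017, 2, 18)


@dataclass
class User:
    email: str
    login_dates: List[date] = field(default_factory=list)
    twofa_enrolled_on: Optional[date] = None


def in_window(day: Optional[date]) -> bool:
    """True if `day` is set and falls inside the exposure window (inclusive)."""
    return day is not None and WINDOW_START <= day <= WINDOW_END


def required_actions(user: User) -> List[str]:
    actions = []
    if any(in_window(day) for day in user.login_dates):
        actions.append("reset_password")
    if in_window(user.twofa_enrolled_on):
        actions.append("redo_2fa")
    return actions


users = [
    User("a@example.com", [date(2016, 8, 1)]),
    User("b@example.com", [date(2016, 12, 24), date(2017, 3, 1)]),
    User("c@example.com", [date(2017, 1, 5)], twofa_enrolled_on=date(2017, 1, 5)),
]

todo = {u.email: required_actions(u) for u in users if required_actions(u)}
print(todo)
```

Checking every recorded login, rather than only the most recent one, matters here: a user who logged in during the window and again afterwards is still affected.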
Cloudflare’s “Cloudbleed” is a major security issue that affected millions of Internet users and application developers. We explained the Cloudbleed bug and its implications for Internet users and developers regarding data privacy and security.
Analyzing the root cause of this issue raises the question of whether web application traffic should be redirected through a third party in order to protect it against attacks.
CDNs and web application firewalls are great solutions to mitigate risks such as denial of service (DoS, DDoS). But there are better ways to protect applications against application vulnerabilities without redirecting traffic.
At Sqreen, we enforce security from inside the application.
Sqreen relies on existing frameworks for parsing the incoming HTTP requests. This has many advantages:
- Performance: parsing and decoding are only done once, by the application
- Security: no new code (e.g. a parser) is added, so security is not lowered by new code. Security checks are performed on the data that the application is processing, not on raw HTTP requests.
- No false positives: having the application context allows Sqreen to only block attacks and not legitimate traffic.
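To illustrate the difference between checking raw HTTP data and checking the data the application actually processes, here is a small sketch using Python's standard `urllib.parse`. It is not Sqreen's implementation: `naive_check` is a deliberately simplistic stand-in for a security rule, and the query string is a made-up example.

```python
from urllib.parse import parse_qs

# Raw query string as it appears on the wire (percent-encoded).
RAW_QUERY = "comment=%3Cscript%3Ealert(1)%3C%2Fscript%3E"


def naive_check(text: str) -> bool:
    """Hypothetical rule: flag anything that contains an opening script tag."""
    return "<script" in text.lower()


# Checking the raw bytes misses the payload because it is still encoded.
raw_hit = naive_check(RAW_QUERY)

# The framework's own parser has already decoded the value the application sees.
decoded_comment = parse_qs(RAW_QUERY)["comment"][0]
decoded_hit = naive_check(decoded_comment)

print(raw_hit, decoded_hit)
```

A check that runs on the decoded value sees exactly what the application will process, without a second parser having to guess at the encoding.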