Thursday, July 20. 2017
Lately, some attention was drawn to a widespread problem with TLS certificates. Many people are accidentally publishing their private keys. Sometimes they are released as part of applications, in Github repositories or with common filenames on web servers.
If a private key is compromised, a certificate authority is obliged to revoke it. The Baseline Requirements – a set of rules that browsers and certificate authorities agreed upon – regulate this and say that in such a case a certificate authority shall revoke the key within 24 hours (Section 18.104.22.168 in the current Baseline Requirements 1.4.8). These rules exist despite the fact that revocation has various problems and doesn’t work very well, but that’s another topic.
I reported various key compromises to certificate authorities recently and while not all of them reacted in time, they eventually revoked all certificates belonging to the private keys. I wondered however how thorough they actually check the key compromises. Obviously one would expect that they cryptographically verify that an exposed private key really is the private key belonging to a certificate.
I registered two test domains at a provider that would allow me to hide my identity and not show up in the whois information. I then ordered test certificates from Symantec (via their brand RapidSSL) and Comodo. These are the biggest certificate authorities and they both offer short term test certificates for free. I then tried to trick them into revoking those certificates with a fake private key.
Forging a private key
To understand this we need to get a bit into the details of RSA keys. In essence a cryptographic key is just a set of numbers. For RSA a public key consists of a modulus (usually named N) and a public exponent (usually called e). You don’t have to understand their mathematical meaning, just keep in mind: They’re nothing more than numbers.
An RSA private key is also just numbers, but more of them. If you have heard any introductory RSA descriptions you may know that a private key consists of a private exponent (called d), but in practice it’s a bit more. Private keys usually contain the full public key (N, e), the private exponent (d) and several other values that are redundant, but they are useful to speed up certain things. But just keep in mind that a public key consists of two numbers and a private key is a public key plus some additional numbers. A certificate ultimately is just a public key with some additional information (like the host name that says for which web page it’s valid) signed by a certificate authority.
A naive check whether a private key belongs to a certificate could be done by extracting the public key parts of both the certificate and the private key for comparison. However it is quite obvious that this isn’t secure. An attacker could construct a private key that contains the public key of an existing certificate and the private key parts of some other, bogus key. Obviously such a fake key couldn’t be used and would only produce errors, but it would survive such a naive check.
I created such fake keys for both domains and uploaded them to Pastebin. If you want to create such fake keys on your own here’s a script. To make my report less suspicious I searched Pastebin for real, compromised private keys belonging to certificates. This again shows how problematic the leakage of private keys is: I easily found seven private keys for Comodo certificates and three for Symantec certificates, plus several more for other certificate authorities, which I also reported. These additional keys allowed me to make my report to Symantec and Comodo less suspicious: I could hide my fake key report within other legitimate reports about a key compromise.
Symantec revoked a certificate based on a forged private key
Comodo didn’t fall for it. They answered me that there is something wrong with this key. Symantec however answered me that they revoked all certificates – including the one with the fake private key.
No harm was done here, because the certificate was only issued for my own test domain. But I could’ve also fake private keys of other peoples' certificates. Very likely Symantec would have revoked them as well, causing downtimes for those sites. I even could’ve easily created a fake key belonging to Symantec’s own certificate.
The communication by Symantec with the domain owner was far from ideal. I first got a mail that they were unable to process my order. Then I got another mail about a “cancellation request”. They didn’t explain what really happened and that the revocation happened due to a key uploaded on Pastebin.
I then informed Symantec about the invalid key (from my “real” identity), claiming that I just noted there’s something wrong with it. At that point they should’ve been aware that they revoked the certificate in error. Then I contacted the support with my “domain owner” identity and asked why the certificate was revoked. The answer: “I wanted to inform you that your FreeSSL certificate was cancelled as during a log check it was determined that the private key was compromised.”
To summarize: Symantec never told the domain owner that the certificate was revoked due to a key leaked on Pastebin. I assume in all the other cases they also didn’t inform their customers. Thus they may have experienced a certificate revocation, but don’t know why. So they can’t learn and can’t improve their processes to make sure this doesn’t happen again. Also, Symantec still insisted to the domain owner that the key was compromised even after I already had informed them that the key was faulty.
How to check if a private key belongs to a certificate?
In case you wonder how you properly check whether a private key belongs to a certificate you may of course resort to a Google search. And this was fascinating – and scary – to me: I searched Google for “check if private key matches certificate”. I got plenty of instructions. Almost all of them were wrong. The first result is a page from SSLShopper. They recommend to compare the MD5 hash of the modulus. That they use MD5 is not the problem here, the problem is that this is a naive check only comparing parts of the public key. They even provide a form to check this. (That they ask you to put your private key into a form is a different issue on its own, but at least they have a warning about this and recommend to check locally.)
Furthermore we get the same wrong instructions from the University of Wisconsin, Comodo (good that their engineers were smart enough not to rely on their own documentation), tbs internet (“SSL expert since 1996”), ShellHacks, IBM and RapidSSL (aka Symantec). A post on Stackexchange is the only result that actually mentions a proper check for RSA keys. Two more Stackexchange posts are not related to RSA, I haven’t checked their solutions in detail.
Going to Google results page two among some unrelated links we find more wrong instructions and tools from Symantec (Update 2020: Link offline), SSL247 (“Symantec Specialist Partner Website Security” - they learned from the best) and some private blog. A documentation by Aspera (belonging to IBM) at least mentions that you can check the private key, but in an unrelated section of the document. Also we get more tools that ask you to upload your private key and then not properly check it from SSLChecker.com, the SSL Store (Symantec “Website Security Platinum Partner”), GlobeSSL (“in SSL we trust”) and - well - RapidSSL.
Documented Security Vulnerability in OpenSSL
So if people google for instructions they’ll almost inevitably end up with non-working instructions or tools. But what about other options? Let’s say we want to automate this and have a tool that verifies whether a certificate matches a private key using OpenSSL. We may end up finding that OpenSSL has a function
x509_check_private_key() that can be used to “check the consistency of a private key with the public key in an X509 certificate or certificate request”. Sounds like exactly what we need, right?
Well, until you read the full docs and find out that it has a BUGS section: “The check_private_key functions don't check if k itself is indeed a private key or not. It merely compares the public materials (e.g. exponent and modulus of an RSA key) and/or key parameters (e.g. EC params of an EC key) of a key pair.”
I think this is a security vulnerability in OpenSSL (discussion with OpenSSL here). And that doesn’t change just because it’s a documented security vulnerability. Notably there are downstream consumers of this function that failed to copy that part of the documentation, see for example the corresponding PHP function (the limitation is however mentioned in a comment by a user).
So how do you really check whether a private key matches a certificate?
Ultimately there are two reliable ways to check whether a private key belongs to a certificate. One way is to check whether the various values of the private key are consistent and then check whether the public key matches. For example a private key contains values p and q that are the prime factors of the public modulus N. If you multiply them and compare them to N you can be sure that you have a legitimate private key. It’s one of the core properties of RSA that it’s secure based on the assumption that it’s not feasible to calculate p and q from N.
You can use OpenSSL to check the consistency of a private key:
openssl rsa -in [privatekey] -check
For my forged keys it will tell you:
RSA key error: n does not equal p q
You can then compare the public key, for example by calculating the so-called SPKI SHA256 hash:
openssl pkey -in [privatekey] -pubout -outform der | sha256sum
openssl x509 -in [certificate] -pubkey |openssl pkey -pubin -pubout -outform der | sha256sum
Another way is to sign a message with the private key and then verify it with the public key. You could do it like this:
openssl x509 -in [certificate] -noout -pubkey > pubkey.pem
dd if=/dev/urandom of=rnd bs=32 count=1
openssl rsautl -sign -pkcs -inkey [privatekey] -in rnd -out sig
openssl rsautl -verify -pkcs -pubin -inkey pubkey.pem -in sig -out check
cmp rnd check
rm rnd check sig pubkey.pem
If cmp produces no output then the signature matches.
As this is all quite complex due to OpenSSLs arcane command line interface I have put this all together in a script. You can pass a certificate and a private key, both in ASCII/PEM format, and it will do both checks.
Symantec did a major blunder by revoking a certificate based on completely forged evidence. There’s hardly any excuse for this and it indicates that they operate a certificate authority without a proper understanding of the cryptographic background.
Apart from that the problem of checking whether a private key and certificate match seems to be largely documented wrong. Plenty of erroneous guides and tools may cause others to fall for the same trap.
Update: Symantec answered with a blog post.
Thursday, June 15. 2017
Coredumps are a feature of Linux and other Unix systems to analyze crashing software. If a software crashes, for example due to an invalid memory access, the operating system can save the current content of the application's memory to a file. By default it is simply called
While this is useful for debugging purposes it can produce a security risk. If a web application crashes the coredump may simply end up in the web server's root folder. Given that its file name is known an attacker can simply download it via an URL of the form
https://example.org/core. As coredumps contain an application's memory they may expose secret information. A very typical example would be passwords.
PHP used to crash relatively often. Recently a lot of these crash bugs have been fixed, in part because PHP now has a bug bounty program. But there are still situations in which PHP crashes. Some of them likely won't be fixed.
How to disclose?
With a scan of the Alexa Top 1 Million domains for exposed core dumps I found around 1.000 vulnerable hosts. I was faced with a challenge: How can I properly disclose this? It is obvious that I wouldn't write hundreds of manual mails. So I needed an automated way to contact the site owners.
Abusix runs a service where you can query the abuse contacts of IP addresses via a DNS query. This turned out to be very useful for this purpose. One could also imagine contacting domain owners directly, but that's not very practical. The domain whois databases have rate limits and don't always expose contact mail addresses in a machine readable way.
Using the abuse contacts doesn't reach all of the affected host operators. Some abuse contacts were nonexistent mail addresses, others didn't have abuse contacts at all. I also got all kinds of automated replies, some of them asking me to fill out forms or do other things, otherwise my message wouldn't be read. Due to the scale I ignored those. I feel that if people make it hard for me to inform them about security problems that's not my responsibility.
I took away two things that I changed in a second batch of disclosures. Some abuse contacts seem to automatically search for IP addresses in the abuse mails. I originally only included affected URLs. So I changed that to include the affected IPs as well.
In many cases I was informed that the affected hosts are not owned by the company I contacted, but by a customer. Some of them asked me if they're allowed to forward the message to them. I thought that would be obvious, but I made it explicit now. Some of them asked me that I contact their customers, which again, of course, is impractical at scale. And sorry: They are your customers, not mine.
How to fix and prevent it?
If you have a coredump on your web host, the obvious fix is to remove it from there. However you obviously also want to prevent this from happening again.
There are two settings that impact coredump creation: A limits setting, configurable via
ulimit and a sysctl interface that can be found under
The limits setting is a size limit for coredumps. If it is set to zero then no core dumps are created. To set this as the default you can add something like this to your
* soft core 0
The sysctl interface sets a pattern for the file name and can also contain a path. You can set it to something like this:
This would store all coredumps under
/var/log/core/ and add the executable name, process id, host name and timestamp to the filename. The directory needs to be writable by all users, you should use a directory with the sticky bit (
If you set this via the proc file interface it will only be temporary until the next reboot. To set this permanently you can add it to
kernel.core_pattern = /var/log/core/core.%e.%p.%h.%t
Some Linux distributions directly forward core dumps to crash analysis tools. This can be done by prefixing the pattern with a pipe (|). These tools like apport from Ubuntu or abrt from Fedora have also been the source of security vulnerabilities in the past. However that's a separate issue.
Look out for coredumps
My scans showed that this is a relatively common issue. Among popular web pages around one in a thousand were affected before my disclosure attempts. I recommend that pentesters and developers of security scan tools consider checking for this. It's simple: Just try download the
/core file and check if it looks like an executable. In most cases it will be an ELF file, however sometimes it may be a Mach-O (OS X) or an a.out file (very old Linux and Unix systems).
Image credit: NASA/JPL-Université Paris Diderot
Friday, May 19. 2017
Update (2020-09-16): While three years old, people still find this blog post when looking for information about Stapling problems. For Apache the situation has improved considerably in the meantime: mod_md, which is part of recent apache releases, comes with a new stapling implementation which you can enable with the setting MDStapling on.
Today the OCSP servers from Let’s Encrypt were offline for a while. This has caused far more trouble than it should have, because in theory we have all the technologies available to handle such an incident. However due to failures in how they are implemented they don’t really work.
We have to understand some background. Encrypted connections using the TLS protocol like HTTPS use certificates. These are essentially cryptographic public keys together with a signed statement from a certificate authority that they belong to a certain host name.
CRL and OCSP – two technologies that don’t work
Certificates can be revoked. That means that for some reason the certificate should no longer be used. A typical scenario is when a certificate owner learns that his servers have been hacked and his private keys stolen. In this case it’s good to avoid that the stolen keys and their corresponding certificates can still be used. Therefore a TLS client like a browser should check that a certificate provided by a server is not revoked.
That’s the theory at least. However the history of certificate revocation is a history of two technologies that don’t really work.
One method are certificate revocation lists (CRLs). It’s quite simple: A certificate authority provides a list of certificates that are revoked. This has an obvious limitation: These lists can grow. Given that a revocation check needs to happen during a connection it’s obvious that this is non-workable in any realistic scenario.
The second method is called OCSP (Online Certificate Status Protocol). Here a client can query a server about the status of a single certificate and will get a signed answer. This avoids the size problem of CRLs, but it still has a number of problems. Given that connections should be fast it’s quite a high cost for a client to make a connection to an OCSP server during each handshake. It’s also concerning for privacy, as it gives the operator of an OCSP server a lot of information.
However there’s a more severe problem: What happens if an OCSP server is not available? From a security point of view one could say that a certificate that can’t be OCSP-checked should be considered invalid. However OCSP servers are far too unreliable. So practically all clients implement OCSP in soft fail mode (or not at all). Soft fail means that if the OCSP server is not available the certificate is considered valid.
That makes the whole OCSP concept pointless: If an attacker tries to abuse a stolen, revoked certificate he can just block the connection to the OCSP server – and thus a client can’t learn that it’s revoked. Due to this inherent security failure Chrome decided to disable OCSP checking altogether. As a workaround they have something called CRLsets and Mozilla has something similar called OneCRL, which is essentially a big revocation list for important revocations managed by the browser vendor. However this is a weak workaround that doesn’t cover most certificates.
OCSP Stapling and Must Staple to the rescue?
There are two technologies that could fix this: OCSP Stapling and Must-Staple.
OCSP Stapling moves the querying of the OCSP server from the client to the server. The server gets OCSP replies and then sends them within the TLS handshake. This has several advantages: It avoids the latency and privacy implications of OCSP. It also allows surviving short downtimes of OCSP servers, because a TLS server can cache OCSP replies (they’re usually valid for several days).
However it still does not solve the security issue: If an attacker has a stolen, revoked certificate it can be used without Stapling. The browser won’t know about it and will query the OCSP server, this request can again be blocked by the attacker and the browser will accept the certificate.
Therefore an extension for certificates has been introduced that allows us to require Stapling. It’s usually called OCSP Must-Staple and is defined in RFC 7633 (although the RFC doesn’t mention the name Must-Staple, which can cause some confusion). If a browser sees a certificate with this extension that is used without OCSP Stapling it shouldn’t accept it.
So we should be fine. With OCSP Stapling we can avoid the latency and privacy issues of OCSP and we can avoid failing when OCSP servers have short downtimes. With OCSP Must-Staple we fix the security problems. No more soft fail. All good, right?
The OCSP Stapling implementations of Apache and Nginx are broken
Well, here come the implementations. While a lot of protocols use TLS, the most common use case is the web and HTTPS. According to Netcraft statistics by far the biggest share of active sites on the Internet run on Apache (about 46%), followed by Nginx (about 20 %). It’s reasonable to say that if these technologies should provide a solution for revocation they should be usable with the major products in that area. On the server side this is only OCSP Stapling, as OCSP Must Staple only needs to be checked by the client.
What would you expect from a working OCSP Stapling implementation? It should try to avoid a situation where it’s unable to send out a valid OCSP response. Thus roughly what it should do is to fetch a valid OCSP response as soon as possible and cache it until it gets a new one or it expires. It should furthermore try to fetch a new OCSP response long before the old one expires (ideally several days). And it should never throw away a valid response unless it has a newer one. Google developer Ryan Sleevi wrote a detailed description of what a proper OCSP Stapling implementation could look like.
Apache does none of this.
If Apache tries to renew the OCSP response and gets an error from the OCSP server – e. g. because it’s currently malfunctioning – it will throw away the existing, still valid OCSP response and replace it with the error. It will then send out stapled OCSP errors. Which makes zero sense. Firefox will show an error if it sees this. This has been reported in 2014 and is still unfixed.
Now there’s an option in Apache to avoid this behavior: SSLStaplingReturnResponderErrors. It’s defaulting to on. If you switch it off you won’t get sane behavior (that is – use the still valid, cached response), instead Apache will disable Stapling for the time it gets errors from the OCSP server. That’s better than sending out errors, but it obviously makes using Must Staple a no go.
It gets even crazier. I have set this option, but this morning I still got complaints that Firefox users were seeing errors. That’s because in this case the OCSP server wasn’t sending out errors, it was completely unavailable. For that situation Apache has a feature that will fake a tryLater error to send out to the client. If you’re wondering how that makes any sense: It doesn’t. The “tryLater” error of OCSP isn’t useful at all in TLS, because you can’t try later during a handshake which only lasts seconds.
This is controlled by another option: SSLStaplingFakeTryLater. However if we read the documentation it says “Only effective if SSLStaplingReturnResponderErrors is also enabled.” So if we disabled SSLStapingReturnResponderErrors this shouldn’t matter, right? Well: The documentation is wrong.
There are more problems: Apache doesn’t get the OCSP responses on startup, it only fetches them during the handshake. This causes extra latency on the first connection and increases the risk of hitting a situation where you don’t have a valid OCSP response. Also cached OCSP responses don’t survive server restarts, they’re kept in an in-memory cache.
There’s currently no way to configure Apache to handle OCSP stapling in a reasonable way. Here’s the configuration I use, which will at least make sure that it won’t send out errors and cache the responses a bit longer than it does by default:
I’m less familiar with Nginx, but from what I hear it isn’t much better either. According to this blogpost it doesn’t fetch OCSP responses on startup and will send out the first TLS connections without stapling even if it’s enabled. Here’s a blog post that recommends to work around this by connecting to all configured hosts after the server has started.
To summarize: This is all a big mess. Both Apache and Nginx have OCSP Stapling implementations that are essentially broken. As long as you’re using either of those then enabling Must-Staple is a reliable way to shoot yourself in the foot and get into trouble. Don’t enable it if you plan to use Apache or Nginx.
Certificate revocation is broken. It has been broken since the invention of SSL and it’s still broken. OCSP Stapling and OCSP Must-Staple could fix it in theory. But that would require working and stable implementations in the most widely used server products.
Wednesday, April 19. 2017
A while ago I wanted to report a bug in one of Nextcloud's apps. They use the Github issue tracker, after creating a new issue I was welcomed with a long list of things they wanted to know about my installation. I filled the info to the best of my knowledge, until I was asked for this:
The content of config/config.php:
Which made me stop and wonder: The config file probably contains sensitive information like passwords. I quickly checked, and yes it does. It depends on the configuration of your Nextcloud installation, but in many cases the configuration contains variables for the database password (dbpassword), the smtp mail server password (mail_smtppassword) or both. Combined with other information from the config file (e. g. it also contains the smtp hostname) this could be very valuable information for an attacker.
A few lines later the bug reporting template has a warning (“Without the database password, passwordsalt and secret”), though this is incomplete, as it doesn't mention the smtp password. It also provides an alternative way of getting the content of the config file via the command line.
However... you know, this is the Internet. People don't read the fineprint. If you ask them to paste the content of their config file they might just do it.
User's passwords publicly accessible
The issues on github are all public and the URLs are of a very simple form and numbered (e. g. https://github.com/nextcloud/calendar/issues/[number]), so downloading all issues from a project is trivial. Thus with a quick check I could confirm that some users indeed posted real looking passwords to the bug tracker.
Nextcoud is a fork of Owncloud, so I checked that as well. The bug reporting template contained exactly the same words, probably Nextcloud just copied it over when they forked. So I reported the issue to both Owncloud and Nextcloud via their HackerOne bug bounty programs. That was in January.
I proposed that both projects should go through their past bug reports and remove everything that looks like a password or another sensitive value. I also said that I think asking for the content of the configuration file is inherently dangerous and should be avoided. To allow users to share configuration options in a safe way I proposed to offer an option similar to the command line tool (which may not be available or usable for all users) in the web interface.
The reaction wasn't overwhelming. Apart from confirming that both projects acknowledged the problem nothing happened for quite a while. During FOSDEM I reached out to members of both projects and discussed the issue in person. Shortly after that I announced that I intended to disclose this issue three months after the initial report.
Disclosure deadline was nearing with passwords still public
The deadline was nearing and I didn't receive any report on any actions being taken by Owncloud or Nextcloud. I sent out this tweet which received quite some attention (and I'm sorry that some people got worried about a vulnerability in Owncloud/Nextcloud itself, I got a couple of questions):
In all fairness to NextCloud, they had actually started scrubbing data from the existing bug reports, they just hadn't informed me. After the tweet Nextcloud gave me an update and Owncloud asked for a one week extension of the disclosure deadline which I agreed to.
The outcome by now isn't ideal. Both projects have scrubbed all obvious passwords from existing bug reports, although I still find values where it's not entirely clear whether they are replacement values or just very bad passwords (e. g. things like “123456”, but you might argue that people using such passwords have other problems).
Nextcloud has changed the wording of the bug reporting template. The new template still asks for the config file, but it mentions the safer command line option first and has the warning closer to the mentioning of the config. This is still far from ideal and I wouldn't be surprised if people continue pasting their passwords. However Nextcloud developers have indicated in the HackerOne discussion that they might pick up my idea of offering a GUI version to export a scrubbed config file. Owncloud has changed nothing yet.
If you have reported bugs to Owncloud or Nextcloud in the past and are unsure whether you may have pasted your password it's probably best to change it. Even if it's been removed now it may still be available within search engine caches or it might have already been recorded by an attacker.
Saturday, April 8. 2017
I want to tell a little story here. I am usually relatively savvy in IT security issues. Yet I was made aware of a quite severe mistake today that caused a security issue in my web page. I want to learn from mistakes, but maybe also others can learn something as well.
I have a private web page. Its primary purpose is to provide a list of links to articles I wrote elsewhere. It's probably not a high value target, but well, being an IT security person I wanted to get security right.
Of course the page uses TLS-encryption via HTTPS. It also uses HTTP Strict Transport Security (HSTS), TLS 1.2 with an AEAD and forward secrecy, has a CAA record and even HPKP (although I tend to tell people that they shouldn't use HPKP, because it's too easy to get wrong). Obviously it has an A+ rating on SSL Labs.
I also thought about SQL injection. While an SQL injection would be quite unlikely – you remember, no user supplied input – I'm using prepared statements, so SQL injections should be practically impossible.
All in all I felt that I have a pretty secure web page. So what could possibly go wrong?
Well, this morning someone send me this screenshot:
And before you ask: Yes, this was the real database password. (I changed it now.)
So what happened? The mysql server was down for a moment. It had crashed for reasons unrelated to this web page. I had already taken care of that and hadn't noted the password leak. The crashed mysql server subsequently let to an error message by PDO (PDO stands for PHP Database Object and is the modern way of doing database operations in PHP).
The PDO error message contains a stack trace of the function call including function parameters. And this led to the password leak: The password is passed to the PDO constructor as a parameter.
There are a few things to note here. First of all for this to happen the PHP option display_errors needs to be enabled. It is recommended to disable this option in production systems, however it is enabled by default. (Interesting enough the PHP documentation about display_errors doesn't even tell you what the default is.)
display_errors wasn't enabled by accident. It was actually disabled in the past. I made a conscious decision to enable it. Back when we had display_errors disabled on the server I once tested a new PHP version where our custom config wasn't enabled yet. I noticed several bugs in PHP pages. So my rationale was that disabling display_errors hides bugs, thus I'd better enable it. In hindsight it was a bad idea. But well... hindsight is 20/20.
The second thing to note is that this only happens because PDO throws an exception that is unhandled. To be fair, the PDO documentation mentions this risk. Other kinds of PHP bugs don't show stack traces. If I had used mysqli – the second supported API to access MySQL databases in PHP – the error message would've looked like this:
PHP Warning: mysqli::__construct(): (HY000/1045): Access denied for user 'test'@'localhost' (using password: YES) in /home/[...]/mysqli.php on line 3
While this still leaks the username, it's much less dangerous. This is a subtlety that is far from obvious. PHP functions have different modes of error reporting. Object oriented functions – like PDO – throw exceptions. Unhandled exceptions will lead to stack traces. Other functions will just report error messages without stack traces.
If you wonder about the impact: It's probably minor. People could've seen the password, but I haven't noticed any changes in the database. I obviously changed it immediately after being notified. I'm pretty certain that there is no way that a database compromise could be used to execute code within the web page code. It's far too simple for that.
Of course there are a number of ways this could've been prevented, and I've implemented several of them. I'm now properly handling exceptions from PDO. I also set a general exception handler that will inform me (and not the web page visitor) if any other unhandled exceptions occur. And finally I've changed the server's default to display_errors being disabled.
While I don't want to shift too much blame here, I think PHP is making this far too easy to happen. There exists a bug report about the leaking of passwords in stack traces from 2014, but nothing happened. I think there are a variety of unfortunate decisions made by PHP. If display_errors is dangerous and discouraged for production systems then it shouldn't be enabled by default.
PHP could avoid sending stack traces by default and make this a separate option from display_errors. It could also introduce a way to make exceptions fatal for functions so that calling those functions is prevented outside of a try/catch block that handles them. (However that obviously would introduce compatibility problems with existing applications, as Craig Young pointed out to me.)
So finally maybe a couple of takeaways:
Wednesday, March 15. 2017
I've complained in the past about the lack of rigorous science in large parts of IT security. However there's no lack of reports and publications that claim to provide data about this space.
Recently RAND Corporation, a US-based think tank, published a report about zero day vulnerabilities. Many people praised it, an article on Motherboard quotes people saying that we finally have “cold hard data” and quoting people from the zero day business who came to the conclusion that this report clearly confirms what they already believed.
I read the report. I wasn't very impressed. The data is so weak that I think the conclusions are almost entirely meaningless.
The story that is spun around this report needs some context: There's a market for secret security vulnerabilities, often called zero days or 0days. These are vulnerabilities in IT products that some actors (government entities, criminals or just hackers who privately collect them) don't share with the vendor of that product or the public, so the vendor doesn't know about them and can't provide a fix.
One potential problem of this are bug collisions. Actor A may find or buy a security bug and choose to not disclose it and use it for its own purposes. If actor B finds the same bug then he might use it to attack actor A or attack someone else. If A had disclosed that bug to the vendor of the software it could've been fixed and B couldn't have used it, at least not against people who regularly update their software. Depending on who A and B are (more or less democratic nation states, nation states in conflict with each other or simply criminals) one can argue how problematic that is.
One question that arises here is how common that is. If you found a bug – how likely is it that someone else will find the same bug? The argument goes that if this rate is low then stockpiling vulnerabilities is less problematic. This is how the RAND report is framed. It tries to answer that question and comes to the conclusion that bug collisions are relatively rare. Thus many people now use it to justify that zero day stockpiling isn't so bad.
The data is hardly trustworthy
The basis of the whole report is an analysis of 207 bugs by an entity that shared this data with the authors of the report. It is incredibly vague about that source. They name their source with the hypothetical name BUSBY.
We can learn that it's a company in the zero day business and indirectly we can learn how many people work there on exploit development. Furthermore we learn: “Some BUSBY researchers have worked for nation-states (so
their skill level and methodology rival that of nation-state teams), and many of BUSBY’s products are used by nation-states.” That's about it. To summarize: We don't know where the data came from.
The authors of the study believe that this is a representative data set. But it is not really explained why they believe so. There are numerous problems with this data:
Naturally BUSBY has an interest in a certain outcome and interpretation of that data. This creates a huge conflict of interest. It is entirely possible that they only chose to share that data because they expected a certain outcome. And obviously the reverse is also true: Other companies may have decided not to share such data to avoid a certain outcome. It creates an ideal setup for publication bias, where only the data supporting a certain outcome is shared.
It is inexcusable that the problem of conflict of interest isn't even mentioned or discussed anywhere in the whole report.
A main outcome is based on a very dubious assumption
The report emphasizes two main findings. One is that the lifetime of a vulnerability is roughly seven years. With the caveat that the data is likely biased, this claim can be derived from the data available. It can reasonably be claimed that this lifetime estimate is true for the 207 analyzed bugs.
The second claim is about the bug collision rate and is much more problematic:
“For a given stockpile of zero-day vulnerabilities, after a year, approximately 5.7 percent have been discovered by an outside entity.”
Now think about this for a moment. It is absolutely impossible to know that based on the data available. This would only be possible if they had access to all the zero days discovered by all actors in that space in a certain time frame. It might be possible to extrapolate this if you'd know how many bugs there are in total on the market - but you don't.
So how does this report solve this? Well, let it speak for itself:
Ideally, we would want similar data on Red (i.e., adversaries of Blue, or other private-use groups), to examine the overlap between Blue and Red, but we could not obtain that data. Instead, we focus on the overlap between Blue and the public (i.e., the teal section in the figures above) to infer what might be a baseline for what Red has. We do this based on the assumption that what happens in the public groups is somewhat similar to what happens in other groups. We acknowledge that this is a weak assumption, given that the composition, focus, motivation, and sophistication of the public and private groups can be fairly different, but these are the only data available at this time. (page 12)
Okay, weak assumption may be the understatement of the year. Let's summarize this: They acknowledge that they can't answer the question they want to answer. So they just answer an entirely different question (bug collision rate between the 207 bugs they have data about and what is known in public) and then claim that's about the same. To their credit they recognize that this is a weak assumption, but you have to read the report to learn that. Neither the summary nor the press release nor any of the favorable blog posts and media reports mention that.
If you wonder what the Red and Blue here means, that's also quite interesting, because it gives some insights about the mode of thinking of the authors. Blue stands for the “own team”, a company or government or anyone else who has knowledge of zero day bugs. Red is “the adversary” and then there is the public. This is of course a gross oversimplification. It's like a world where there are two nation states fighting each other and no other actors that have any interest in hacking IT systems. In reality there are multiple Red, Blue and in-between actors, with various adversarial and cooperative relations between them.
Sometimes the best answer is: We don't know
The line of reasoning here is roughly: If we don't have good data to answer a question, we'll just replace it with bad data.
I can fully understand the call for making decisions based on data. That is usually a good thing. However, it may simply be that this is a scenario where getting reliable data is incredibly hard or simply impossible. In such a situation the best thing one can do is admit that and live with it. I don't think it's helpful to rely on data that's so weak that it's basically meaningless.
The core of the problem is that we're talking about an industry that wants to be secret. This secrecy is in a certain sense in direct conflict with good scientific practice. Transparency and data sharing are cornerstones of good science.
I should mention here that shortly afterwards another study was published by Trey Herr and Bruce Schneier which also tries to answer the question of bug collisions. I haven't read it yet, from a brief look it seems less bad than the RAND report. However I have my doubts about it as well. It is only based on public bug findings, which is at least something that has a chance of being verifiable by others. It has the same problem that one can hardly draw conclusions about the non-public space based on that. (My personal tie in to that is that I had a call with Trey Herr a while ago where he asked me about some of my bug findings. I told him my doubts about this.)
The bigger picture: We need better science
IT security isn't a field that's rich of rigorous scientific data.
There's a lively debate right now going on in many fields of science about the integrity of their methods. Psychologists had to learn that many theories they believed for decades were based on bad statistics and poor methodology and are likely false. Whenever someone tries to replicate other studies the replication rates are abysmal. Smart people claim that the majority scientific outcomes are not true.
I don't see this debate happening in computer science. It's certainly not happening in IT security. Almost nobody is doing replications. Meta analyses, trials registrations or registered reports are mostly unheard of.
Instead we have cargo cult science like this RAND report thrown around as “cold hard data” we should rely upon. This is ridiculous.
I obviously have my own thoughts on the zero days debate. But my opinion on the matter here isn't what this is about. What I do think is this: We need good, rigorous science to improve the state of things. We largely don't have that right now. And bad science is a poor replacement for good science.
Friday, July 15. 2016
In early April I reported security problems with the update process to the security contact of Joomla. While the issue has been fixed in Joomla 3.6, the communication process was far from ideal.
The issue itself is pretty simple: Up until recently Joomla fetched information about its updates over unencrypted and unauthenticated HTTP without any security measures.
The update process works in three steps. First of all the Joomla backend fetches a file list.xml from update.joomla.org that contains information about current versions. If a newer version than the one installed is found then the user gets a button that allows him to update Joomla. The file list.xml references an URL for each version with further information about the update called extension_sts.xml. Interestingly this file is fetched over HTTPS, while - in version 3.5 - the file list.xml is not. However this does not help, as the attacker can already intervene at the first step and serve a malicious list.xml that references whatever he wants. In extension_sts.xml there is a download URL for a zip file that contains the update.
Exploiting this for a Man-in-the-Middle-attacker is trivial: Requests to update.joomla.org need to be redirected to an attacker-controlled host. Then the attacker can place his own list.xml, which will reference his own extension_sts.xml, which will contain a link to a backdoored update. I have created a trivial proof of concept for this (just place that on the HTTP host that update.joomla.org gets redirected to).
I think it should be obvious that software updates are a security sensitive area and need to be protected. Using HTTPS is one way of doing that. Using any kind of cryptographic signature system is another way. Unfortunately it seems common web applications are only slowly learning that. Drupal only switched to HTTPS updates earlier this year. It's probably worth checking other web applications that have integrated update processes if they are secure (Wordpress is secure fwiw).
Now here's how the Joomla developers handled this issue: I contacted Joomla via their webpage on April 6th. Their webpage form didn't have a way to attach files, so I offered them to contact me via email so I could send them the proof of concept. I got a reply to that shortly after asking for it. This was the only communication from their side. Around two months later, on June 14th, I asked about the status of this issue and warned that I would soon publish it if I don't get a reaction. I never got any reply.
In the meantime Joomla had published beta versions of the then upcoming version 3.6. I checked that and noted that they have changed the update url from http://update.joomla.org/ to https://update.joomla.org/. So while they weren't communicating with me it seemed a fix was on its way. I then found that there was a pull request and a Github discussion that started even before I first contacted them. Joomla 3.6 was released recently, therefore the issue is fixed. However the release announcement doesn't mention it.
So all in all I contacted them about a security issue they were already in the process of fixing. The problem itself is therefore solved. But the lack of communication about the issue certainly doesn't cast a good light on Joomla's security process.
Monday, April 4. 2016
The Owncloud web application has an encryption module. I first became aware of it when a press release was published advertising this encryption module containing this:
“Imagine you are an IT organization using industry standard AES 256 encryption keys. Let’s say that a vulnerability is found in the algorithm, and you now need to improve your overall security by switching over to RSA-2048, a completely different algorithm and key set. Now, with ownCloud’s modular encryption approach, you can swap out the existing AES 256 encryption with the new RSA algorithm, giving you added security while still enabling seamless access to enterprise-class file sharing and collaboration for all of your end-users.”
To anyone knowing anything about crypto this sounds quite weird. AES and RSA are very different algorithms – AES is a symmetric algorithm and RSA is a public key algorithm - and it makes no sense to replace one by the other. Also RSA is much older than AES. This press release has since been removed from the Owncloud webpage, but its content can still be found in this Reuters news article. This and some conversations with Owncloud developers caused me to have a look at this encryption module.
First it is important to understand what this encryption module is actually supposed to do and understand the threat scenario. The encryption provides no security against a malicious server operator, because the encryption happens on the server. The only scenario where this encryption helps is if one has a trusted server that is using an untrusted storage space.
When one uploads a file with the encryption module enabled it ends up under the same filename in the user's directory on the file storage. Now here's a first, quite obvious problem: The filename itself is not protected, so an attacker that is assumed to be able to see the storage space can already learn something about the supposedly encrypted data.
The content of the file starts with this:
It is then padded with further dashes until position 0x2000 and then the encrypted contend follows Base64-encoded in blocks of 8192 bytes. The header tells us what encryption algorithm and mode is used: AES-256 in CFB-mode. CFB stands for Cipher Feedback.
Authenticated and unauthenticated encryption modes
In order to proceed we need some basic understanding of encryption modes. AES is a block cipher with a block size of 128 bit. That means we cannot just encrypt arbitrary input with it, the algorithm itself only encrypts blocks of 128 bit (or 16 byte) at a time. The naive way to encrypt more data is to split it into 16 byte blocks and encrypt every block. This is called Electronic Codebook mode or ECB and it should never be used, because it is completely insecure.
Common modes for encryption are Cipherblock Chaining (CBC) and Counter mode (CTR). These modes are unauthenticated and have a property that's called malleability. This means an attacker that is able to manipulate encrypted data is able to manipulate it in a way that may cause a certain defined behavior in the output. Often this simply means an attacker can flip bits in the ciphertext and the same bits will be flipped in the decrypted data.
To counter this these modes are usually combined with some authentication mechanism, a common one is called HMAC. However experience has shown that this combining of encryption and authentication can go wrong. Many vulnerabilities in both TLS and SSH were due to bad combinations of these two mechanism. Therefore modern protocols usually use dedicated authenticated encryption modes (AEADs), popular ones include Galois/Counter-Mode (GCM), Poly1305 and OCB.
Cipher Feedback (CFB) mode is a self-correcting mode. When an error happens, which can be simple data transmission error or a hard disk failure, two blocks later the decryption will be correct again. This also allows decrypting parts of an encrypted data stream. But the crucial thing for our attack is that CFB is not authenticated and malleable. And Owncloud didn't use any authentication mechanism at all.
Therefore the data is encrypted and an attacker cannot see the content of a file (however he learns some metadata: the size and the filename), but an Owncloud user cannot be sure that the downloaded data is really the data that was uploaded in the first place. The malleability of CFB mode works like this: An attacker can flip arbitrary bits in the ciphertext, the same bit will be flipped in the decrypted data. However if he flips a bit in any block then the following block will contain unpredictable garbage.
Backdooring an EXE file
How does that matter in practice? Let's assume we have a group of people that share a software package over Owncloud. One user uploads a Windows EXE installer and the others download it from there and install it. Let's further assume that the attacker doesn't know the content of the EXE file (this is a generous assumption, in many cases he will know, as he knows the filename).
EXE files start with a so-called MZ-header, which is the old DOS EXE header that gets usually ignored. At a certain offset (0x3C), which is at the end of the fourth 16 byte block, there is an address of the PE header, which on Windows systems is the real EXE header. After the MZ header even on modern executables there is still a small DOS program. This starts with the fifth 16 byte block. This DOS program usually only shows the message “Th is program canno t be run in DOS mode”. And this DOS stub program is almost always the exactly the same.
Therefore our attacker can do the following: First flip any non-relevant bit in the third 16 byte block. This will cause the fourth block to contain garbage. The fourth block contains the offset of the PE header. As this is now garbled Windows will no longer consider this executable to be a Windows application and will therefore execute the DOS stub.
The attacker can then XOR 16 bytes of his own code with the first 16 bytes of the standard DOS stub code. He then XORs the result with the fifth block of the EXE file where he expects the DOS stub to be. Voila: The resulting decrypted EXE file will contain 16 bytes of code controlled by the attacker.
I created a proof of concept of this attack. This isn't enough to launch a real attack, because an attacker only has 16 bytes of DOS assembler code, which is very little. For a real attack an attacker would have to identify further pieces of the executable that are predictable and jump through the code segments.
The first fix
I reported this to Owncloud via Hacker One in January. The first fix they proposed was a change where they used Counter-Mode (CTR) in combination with HMAC. They still encrypt the file in blocks of 8192 bytes size. While this is certainly less problematic than the original construction it still had an obvious problem: All the 8192 bytes sized file blocks where encrypted the same way. Therefore an attacker can swap or remove chunks of a file. The encryption is still malleable.
The second fix then included a counter of the file and also avoided attacks where an attacker can go back to an earlier version of a file. This solution is shipped in Owncloud 9.0, which has recently been released.
Is this new construction secure? I honestly don't know. It is secure enough that I didn't find another obvious flaw in it, but that doesn't mean a whole lot.
You may wonder at this point why they didn't switch to an authenticated encryption mode like GCM. The reason for that is that PHP doesn't support any authenticated encryption modes. There is a proposal and most likely support for authenticated encryption will land in PHP 7.1. However given that using outdated PHP versions is a very widespread practice it will probably take another decade till anyone can use that in mainstream web applications.
Don't invent your own crypto protocols
The practical relevance of this vulnerability is probably limited, because the scenario that it protects from is relatively obscure. But I think there is a lesson to learn here. When people without a strong cryptographic background create ad-hoc designs of cryptographic protocols it will almost always go wrong.
It is widely known that designing your own crypto algorithms is a bad idea and that you should use standardized and well tested algorithms like AES. But using secure algorithms doesn't automatically create a secure protocol. One has to know the interactions and limitations of crypto primitives and this is far from trivial. There is a worrying trend – especially since the Snowden revelations – that new crypto products that never saw any professional review get developed and advertised in masses. A lot of these products are probably extremely insecure and shouldn't be trusted at all.
If you do crypto you should either do it right (which may mean paying someone to review your design or to create it in the first place) or you better don't do it at all. People trust your crypto, and if that trust isn't justified you shouldn't ship a product that creates the impression it contains secure cryptography.
There's another thing that bothers me about this. Although this seems to be a pretty standard use case of crypto – you have a symmetric key and you want to encrypt some data – there is no straightforward and widely available standard solution for it. Using authenticated encryption solves a number of issues, but not all of them (this talk by Adam Langley covers some interesting issues and caveats with authenticated encryption).
The proof of concept can be found on Github. I presented this vulnerability in a talk at the Easterhegg conference, a video recording is available.
Update (2020): Kevin Niehage had a much more detailed look at the encryption module of Owncloud and its fork Nextcloud. Among other things he noted that a downgrade attack allows re-enabling the attack I described. He found several other design flaws and bad design decisions and has written a paper about it.
Tuesday, January 26. 2016
Update: When I wrote this blog post it was an open question for me whether using Address Sanitizer in production is a good idea. A recent analysis posted on the oss-security mailing list explains in detail why using Asan in its current form is almost certainly not a good idea. Having any suid binary built with Asan enables a local root exploit - and there are various other issues. Therefore using Gentoo with Address Sanitizer is only recommended for developing and debugging purposes.
Address Sanitizer is a remarkable feature that is part of the gcc and clang compilers. It can be used to find many typical C bugs - invalid memory reads and writes, use after free errors etc. - while running applications. It has found countless bugs in many software packages. I'm often surprised that many people in the free software community seem to be unaware of this powerful tool.
Address Sanitizer is mainly intended to be a debugging tool. It is usually used to test single applications, often in combination with fuzzing. But as Address Sanitizer can prevent many typical C security bugs - why not use it in production? It doesn't come for free. Address Sanitizer takes significantly more memory and slows down applications by 50 - 100 %. But for some security sensitive applications this may be a reasonable trade-off. The Tor project is already experimenting with this with its Hardened Tor Browser.
One project I've been working on in the past months is to allow a Gentoo system to be compiled with Address Sanitizer. Today I'm publishing this and want to allow others to test it. I have created a page in the Gentoo Wiki that should become the central documentation hub for this project. I published an overlay with several fixes and quirks on Github.
I see this work as part of my Fuzzing Project. (I'm posting it here because the Gentoo category of my personal blog gets indexed by Planet Gentoo.)
I am not sure if using Gentoo with Address Sanitizer is reasonable for a production system. One thing that makes me uneasy in suggesting this for high security requirements is that it's currently incompatible with Grsecurity. But just creating this project already caused me to find a whole number of bugs in several applications. Some notable examples include Coreutils/shred, Bash (, ), man-db, Pidgin-OTR, Courier, Syslog-NG, Screen, Claws-Mail (, ), ProFTPD (, ) ICU, TCL (), Dovecot. I think it was worth the effort.
I will present this work in a talk at FOSDEM in Brussels this Saturday, 14:00, in the Security Devroom.
Friday, December 11. 2015
Important notice: After I published this text Adam Langley pointed out that a major assumption is wrong: Android 2.2 actually has no problems with SHA256-signed certificates. I checked this myself and in an emulated Android 2.2 instance I was able to connect to a site with a SHA256-signed certificate. I apologize for that error, I trusted the Cloudflare blog post on that. This whole text was written with that assumption in mind, so it's hard to change without rewriting it from scratch. I have marked the parts that are likely to be questioned. Most of it is still true and Android 2 has a problematic TLS stack (no SNI), but the specific claim regarding SHA256-certificates seems wrong.
This week both Cloudflare and Facebook announced that they want to delay the deprecation of certificates signed with the SHA1 algorithm. This spurred some hot debates whether or not this is a good idea – with two seemingly good causes: On the one side people want to improve security, on the other side access to webpages should remain possible for users of old devices, many of them living in poor countries. I want to give some background on the issue and ask why that unfortunate situation happened in the first place, because I think it highlights some of the most important challenges in the TLS space and more generally in IT security.
SHA1 broken since 2005
The SHA1 algorithm is a cryptographic hash algorithm and it has been know for quite some time that its security isn't great. In 2005 the Chinese researcher Wang Xiaoyun published an attack that would allow to create a collision for SHA1. The attack wasn't practically tested, because it is quite expensive to do so, but it was clear that a financially powerful adversary would be able to perform such an attack. A year before the even older hash function MD5 was broken practically, in 2008 this led to a practical attack against the issuance of TLS certificates. In the past years browsers pushed for the deprecation of SHA1 certificates and it was agreed that starting January 2016 no more certificates signed with SHA1 must be issued, instead the stronger algorithm SHA256 should be used. Many felt this was already far too late, given that it's been ten years since we knew that SHA1 is broken.
A few weeks before the SHA1 deadline Cloudflare and Facebook now question this deprecation plan. They have some strong arguments. According to Cloudflare's numbers there is still a significant number of users that use browsers without support for SHA256-certificates. And those users are primarily in relatively poor, repressive or war-ridden countries. The top three on the list are China, Cameroon and Yemen. Their argument, which is hard to argue with, is that cutting of SHA1 support will primarily affect the poorest users.
Cloudflare and Facebook propose a new mechanism to get legacy validated certificates. These certificates should only be issued to site operators that will use a technology to separate users based on their TLS handshake and only show the SHA1 certificate to those that use an older browser. Facebook already published the code to do that, Cloudflare also announced that they will release the code of their implementation. Right now it's still possible to get SHA1 certificates, therefore those companies could just register them now and use them for three years to come. Asking for this legacy validation process indicates that Cloudflare and Facebook don't see this as a short-term workaround, instead they seem to expect that this will be a solution they use for years to come, without any decided end date.
It's a tough question whether or not this is a good idea. But I want to ask a different question: Why do we have this problem in the first place, why is it hard to fix and what can we do to prevent similar things from happening in the future? One thing is remarkable about this problem: It's a software problem. In theory software can be patched and the solution to a software problem is to update the software. So why can't we just provide updates and get rid of these legacy problems?
Windows XP and Android Froyo
According to Cloudflare there are two main reason why so many users can't use sites with SHA256 certificates: Windows XP and old versions of Android (SHA256 support was added in Android 2.3, so this affects mostly Android 2.2 aka Froyo). We all know that Windows XP shouldn't be used any more, that its support has ended in 2014. But that clearly clashes with realities. People continue using old systems and Windows XP is still alive in many countries, especially in China.
But I'm inclined to say that Windows XP is probably the smaller problem here. With Service Pack 3 Windows XP introduced support for SHA256 certificates. By using an alternative browser (Firefox is still supported on Windows XP if you install SP3) it is even possible to have a relatively safe browsing experience. I'm not saying that I recommend it, but given the circumstances advising people how to update their machines and to install an alternative browser can party provide a way to reduce the reliance on broken algorithms.
The Updatability Gap
But the problem with Android is much, much worse, and I think this brings us to probably the biggest problem in IT security we have today. For years one of the most important messages to users in IT security was: Keep your software up to date. But at the same time the industry has created new software ecosystems where very often that just isn't an option.
In the Android case Google says that it's the responsibility of device vendors and carriers to deliver security updates. The dismal reality is that in most cases they just don't do that. But even if device vendors are willing to provide updates it usually only happens for a very short time frame. Google only supports the latest two Android major versions. For them Android 2.2 is ancient history, but for a significant portion of users it is still the operating system they use.
What we have here is a huge gap between the time frame devices get security updates and the time frame users use these devices. And pretty much everything tells us that the vendors in the Internet of Things ignore these problems even more and the updatability gap will become larger. Many became accustomed to the idea that phones get only used for a year, but it's hard to imagine how that's going to work for a fridge. What's worse: Whether you look at phones or other devices, they often actively try to prevent users from replacing the software on their own.
This is a hard problem to tackle, but it's probably the biggest problem IT security is facing in the upcoming years. We need to get a working concept for updates – a concept that matches the real world use of devices.
Substandard TLS implementations
But there's another part of the SHA1 deprecation story. As I wrote above since 2005 it was clear that SHA1 needs to go away. That was three years before Android was even published. But in 2010 Android still wasn't capable of supporting SHA256 certificates. Google has to take a large part of the blame here. While these days they are at the forefront of deploying high quality and up to date TLS stacks, they shipped a substandard and outdated TLS implementation in Android 2. (Another problem is that all Android 2 versions don't support Server Name Indication, a technology that allows to use different certificates for different hosts on one IP address.)
This is not the first such problem we are facing. With the POODLE vulnerability it became clear that the old SSL version 3 is broken beyond repair and it's impossible to use it safely. The only option was to deprecate it. However doing so was painful, because a lot of devices out there didn't support better protocols. The successor protocol TLS 1.0 (SSL/TLS versions are confusing, I know) was released in 1999. But the problem wasn't that people were using devices older than 1999. The problem was that many vendors shipped devices and software that only supported SSLv3 in recent years.
One example was Windows Phone 7. In 2011 this was the operating system on Microsoft's and Nokia's flagship product, the Lumia 800. Its mail client is unable to connect to servers not supporting SSLv3. It is just inexcusable that in 2011 Microsoft shipped a product which only supported a protocol that was deprecated 12 years earlier. It's even more inexcusable that they refused to fix it later, because it only came to light when Windows Phone 7 was already out of support.
The takeaway from this is that sloppiness from the past can bite you many years later. And this is what we're seeing with Android 2.2 now.
But you might think given these experiences this has stopped today. It hasn't. The largest deployer of substandard TLS implementations these days is Apple. Up until recently (before El Capitan) Safari on OS X didn't support any authenticated encryption cipher suites with AES-GCM and relied purely on the CBC block mode. The CBC cipher suites are a hot candidate for the next deprecation plan, because 2013 the Lucky 13 attack has shown that they are really hard to implement safely. The situation for applications other than the browser (Mail etc.) is even worse on Apple devices. They only support the long deprecated TLS 1.0 protocol – and that's still the case on their latest systems.
There is widespread agreement in the TLS and cryptography community that the only really safe way to use TLS these days is TLS 1.2 with a cipher suite using forward secrecy and authenticated encryption (AES-GCM is the only standardized option for that right now, however ChaCha20/Poly1304 will come soon).
For the specific case of the SHA1 deprecation I would propose the following: Cloudflare and Facebook should go ahead with their handshake workaround for the next years, as long as their current certificates are valid. But this time should be used to find solutions. Reach out to those users visiting your sites and try to understand what could be done to fix the situation. For the Windows XP users this is relatively easy – help them updating their machines and preferably install another browser, most likely that'd be Firefox. For Android there is probably no easy solution, but we have some of the largest Internet companies involved here. Please seriously ask the question: Is it possible to retrofit Android 2.2 with a reasonable TLS stack? What ways are there to get that onto the devices? Is it possible to install a browser app with its own TLS stack on at least some of those devices? This probably doesn't work in most cases, because on many cheap phones there just isn't enough space to install large apps. In the long term I hope that the tech community will start tackling the updatability problem.
In the TLS space I think we need to make sure that no more substandard TLS deployments get shipped today. Point out the vendors that do so and pressure them to stop. It wasn't acceptable in 2010 to ship TLS with long-known problems and it isn't acceptable today.
Image source: Wikimedia Commons
Monday, November 30. 2015
tl;dr Older GnuTLS versions (2.x) fail to check the first byte of the padding in CBC modes. Various stable Linux distributions, including Ubuntu LTS and Debian wheezy (oldstable) use this version. Current GnuTLS versions are not affected.
A few days ago an email on the ssllabs mailing list catched my attention. A Canonical developer had observed that the SSL Labs test would report the GnuTLS version used in Ubuntu 14.04 (the current long time support version) as vulnerable to the POODLE TLS vulnerability, while other tests for the same vulnerability showed no such issue.
A little background: The original POODLE vulnerability is a weakness of the old SSLv3 protocol that's now officially deprecated. POODLE is based on the fact that SSLv3 does not specify the padding of the CBC modes and the padding bytes can contain arbitrary bytes. A while after POODLE Adam Langley reported that there is a variant of POODLE in TLS, however while the original POODLE is a protocol issue the POODLE TLS vulnerability is an implementation issue. TLS specifies the values of the padding bytes, but some implementations don't check them. Recently Yngve Pettersen reported that there are different variants of this POODLE TLS vulnerability: Some implementations only check parts of the padding. This is the reason why sometimes different tests lead to different results. A test that only changes one byte of the padding will lead to different results than one that changes all padding bytes. Yngve Pettersen uncovered POODLE variants in devices from Cisco (Cavium chip) and Citrix.
I looked at the Ubuntu issue and found that this was exactly such a case of an incomplete padding check: The first byte wasn't checked. I believe this might explain some of the vulnerable hosts Yngve Pettersen found. This is the code:
for (i = 2; i <= pad; i++)
The padding in TLS is defined that the rightmost byte of the last block contains the length of the padding. This value is also used in all padding bytes. However the length field itself is not part of the padding. Therefore if we have e. g. a padding length of three this would result in four bytes with the value 3. The above code misses one byte. i goes from 2 (setting block length minus 2) to pad (block length minus pad length), which sets pad length minus one bytes. To correct it we need to change the loop to end with pad+1. The code is completely reworked in current GnuTLS versions, therefore they are not affected. Upstream has officially announced the end of life for GnuTLS 2, but some stable Linux distributions still use it.
The story doesn't end here: After I found this bug I talked about it with Juraj Somorovsky. He mentioned that he already read about this before: In the paper of the Lucky Thirteen attack. That was published in 2013 by Nadhem AlFardan and Kenny Paterson. Here's what the Lucky Thirteen paper has to say about this issue on page 13:
for (i = 2; i < pad; i++)
It is not hard to see that this loop should also cover the edge case i=pad in order to carry out a full padding check. This means that one byte of what should be padding actually has a free format.
If you look closely you will see that this code is actually different from the one I quoted above. The reason is that the GnuTLS version in question already contained a fix that was applied in response to the Lucky Thirteen paper. However what the Lucky Thirteen paper missed is that the original check was off by two bytes, not just one byte. Therefore it only got an incomplete fix reducing the attack surface from two bytes to one.
In a later commit this whole code was reworked in response to the Lucky Thirteen attack and there the problem got fixed for good. However that change never made it into version 2 of GnuTLS. Red Hat / CentOS packages contain a backport patch of those changes, therefore they are not affected.
You might wonder what the impact of this bug is. I'm not totally familiar with the details of all the possible attacks, but the POODLE attack gets increasingly harder if fewer bytes of the padding can be freely set. It most likely is impossible if there is only one byte. The Lucky Thirteen paper says: "This would enable, for example, a variant of the short MAC attack of  even if variable length padding was not supported.". People that know more about crypto than I do should be left with the judgement whether this might be practically exploitabe.
Fixing this bug is a simple one-line patch I have attached here. This will silence all POODLE checks, however this doesn't apply all the changes that were made in response to the Lucky Thirteen attack. I'm not sure if the code is practically vulnerable, but Lucky Thirteen is a tricky issue, recently a variant of that attack was shown against Amazon's s2n library.
The missing padding check for the first byte got CVE-2015-8313 assigned. Currently I'm aware of Ubuntu LTS (now fixed) and Debian oldstable (Wheezy) being affected.
Monday, November 23. 2015
tl;dr Dell laptops come preinstalled with a root certificate and a corresponding private key. That completely compromises the security of encrypted HTTPS connections. I've provided an online check, affected users should delete the certificate.
It seems that Dell hasn't learned anything from the Superfish-scandal earlier this year: Laptops from the company come with a preinstalled root certificate that will be accepted by browsers. The private key is also installed on the system and has been published now. Therefore attackers can use Man in the Middle attacks against Dell users to show them manipulated HTTPS webpages or read their encrypted data.
The certificate, which is installed in the system's certificate store under the name "eDellRoot", gets installed by a software called Dell Foundation Services. This software is still available on Dell's webpage. According to the somewhat unclear description from Dell it is used to provide "foundational services facilitating customer serviceability, messaging and support functions".
The private key of this certificate is marked as non-exportable in the Windows certificate store. However this provides no real protection, there are Tools to export such non-exportable certificate keys. A user of the plattform Reddit has posted the Key there.
For users of the affected Laptops this is a severe security risk. Every attacker can use this root certificate to create valid certificates for arbitrary web pages. Even HTTP Public Key Pinning (HPKP) does not protect against such attacks, because browser vendors allow locally installed certificates to override the key pinning protection. This is a compromise in the implementation that allows the operation of so-called TLS interception proxies.
I was made aware of this issue a while ago by Kristof Mattei. We asked Dell for a statement three weeks ago and didn't get any answer.
It is currently unclear which purpose this certificate served. However it seems unliklely that it was placed there deliberately for surveillance purposes. In that case Dell wouldn't have installed the private key on the system.
Affected are only users that use browsers or other applications that use the system's certificate store. Among the common Windows browsers this affects the Internet Explorer, Edge and Chrome. Not affected are Firefox-users, Mozilla's browser has its own certificate store.
Users of Dell laptops can check if they are affected with an online check tool. Affected users should immediately remove the certificate in the Windows certificate manager. The certificate manager can be started by clicking "Start" and typing in "certmgr.msc". The "eDellRoot" certificate can be found under "Trusted Root Certificate Authorities". You also need to remove the file Dell.Foundation.Agent.Plugins.eDell.dll, Dell has now posted an instruction and a removal tool.
This incident is almost identical with the Superfish-incident. Earlier this year it became public that Lenovo had preinstalled a software called Superfish on its Laptops. Superfish intercepts HTTPS-connections to inject ads. It used a root certificate for that and the corresponding private key was part of the software. After that incident several other programs with the same vulnerability were identified, they all used a software module called Komodia. Similar vulnerabilities were found in other software products, for example in Privdog and in the ad blocker Adguard.
This article is mostly a translation of a German article I wrote for Golem.de.
Image source and license: Wistula / Wikimedia Commons, Creative Commons by 3.0
Update (2015-11-24): Second Dell root certificate DSDTestProvider
I just found out that there is a second root certificate installed with some Dell software that causes exactly the same issue. It is named DSDTestProvider and comes with a software called Dell System Detect. Unlike the Dell Foundations Services this one does not need a Dell computer to be installed, therefore it was trivial to extract the certificate and the private key. My online test now checks both certificates. This new certificate is not covered by Dell's removal instructions yet.
Dell has issued an official statement on their blog and in the comment section a user mentioned this DSDTestProvider certificate. After googling what DSD might be I quickly found it. There have been concerns about the security of Dell System Detect before, Malwarebytes has an article about it from April mentioning that it was vulnerable to a remote code execution vulnerability.
Update (2015-11-26): Service tag information disclosure
Another unrelated issue on Dell PCs was discovered in a tool called Dell Foundation Services. It allows webpages to read an unique service tag. There's also an online check.
Saturday, September 5. 2015
At the recent Chaos Communication Camp I held a talk summarizing the problems with TLS interception or Man-in-the-Middle proxies. This was initially motivated by the occurence of Superfish and my own investigations on Privdog, but I learned in the past month that this is a far bigger problem. I was surprised and somewhat shocked to learn that it seems to be almost a default feature of various security products, especially in the so-called "Enterprise" sector. I hope I have contributed to a discussion about the dangers of these devices and software products.
There is a video recording of the talk avaliable and I'm also sharing the slides (also on Slideshare).
I noticed after the talk that I had a mistake on the slides: When describing Filippo's generic attack on Komodia software I said and wrote SNI (Server Name Indication) on the slides. However the feature that is used here is called SAN (Subject Alt Name). SNI is a feature to have different certificates on one IP, SAN is a feature to have different domain names on one certificate, so they're related and I got confused, sorry for that.
I got a noteworthy comment in the discussion after the talk I also would like to share: These TLS interception proxies by design break client certificate authentication. Client certificates are rarely used, however that's unfortunate, because they are a very useful feature of TLS. This is one more reason to avoid any software that is trying to mess with your TLS connections.
Thursday, August 13. 2015
In February the discovery of a software called Superfish caused widespread attention. Superfish caused a severe security vulnerability by intercepting HTTPS connections with a Man-in-the-Middle-certificate. The certificate and the corresponding private key was shared amongst all installations.
The use of Man-in-the-Middle-proxies for traffic interception is a widespread method, an application installs a root certificate into the browser and later intercepts connections by creating signed certificates for webpages on the fly. It quickly became clear that Superfish was only the tip of the iceberg. The underlying software module Komodia was used in a whole range of applications all suffering from the same bug. Later another software named Privdog was found that also intercepted HTTPS traffic and I published a blog post explaining that it was broken in a different way: It completely failed to do any certificate verification on its connections.
In a later blogpost I analyzed several Antivirus applications that also intercept HTTPS traffic. They were not as broken as Superfish or Privdog, but all of them decreased the security of the TLS encryption in one way or another. The most severe issue was that Kaspersky was at that point still vulnerable to the FREAK bug, more than a month after it was discovered. In a comment to that blogpost I was asked about a software called Adguard. I have to apologize that it took me so long to write this up.
Different certificate, same key
The first thing I did was to install Adguard two times in different VMs and look at the root certificate that got installed into the browser. The fingerprint of the certificates was different. However a closer look revealed something interesting: The RSA modulus was the same. It turned out that Adguard created a new root certificate with a changing serial number for every installation, but it didn't generate a new key. Therefore it is vulnerable to the same attacks as Superfish.
I reported this issue to Adguard. Adguard has fixed this issue, however they still intercept HTTPS traffic.
I learned that Adguard did not always use the same key, instead it chose one out of ten different keys based on the CPU. All ten keys could easily be extracted from a file called ProtocolFilters.dll that was shipped with Adguard. Older versions of Adguard only used one key shared amongst all installations. There also was a very outdated copy of the nss library. It suffers from various vulnerabilities, however it seems they are not exploitable. The library is not used for TLS connections, its only job is to install certificates into the Firefox root store.
Meet Privdog again
The outdated nss version gave me a hint, because I had seen this before: In Privdog. I had spend some time trying to find out if Privdog would be vulnerable to known nss issues (which had the positive side effect that Filippo created proof of concept code for the BERserk vulnerability). What I didn't notice back then was the shared key issue. Privdog also used the same key amongst different installations. So it turns out Privdog was completely broken in two different ways: By sharing the private key amongst installations and by not verifying certificates.
The latest version of Privdog no longer intercepts HTTPS traffic, it works as a browser plugin now. I don't know whether this vulnerability was still present after the initial fix caused by my original blog post.
Now what is this ProtocolFilters.dll? It is a commercial software module that is supposed to be used along with a product called Netfilter SDK. I wondered where else this would be found and if we would have another widely used software module like Komodia.
ProtocolFilters.dll is mentioned a lot in the web, mostly in the context of Potentially Unwanted Applications, also called Crapware. That means software that is either preinstalled or that gets bundled with installers from other software and is often installed without users consent or by tricking the user into clicking some "ok" button without knowing that he or she agrees to install another software. Unfortunately I was unable to get my hands on any other software using it.
Lots of "Potentially Unwanted Applications" use ProtocolFilters.dll
Software names that I found that supposedly include ProtocolFilters.dll: Coupoon, CashReminder, SavingsDownloader, Scorpion Saver, SavingsbullFilter, BRApp, NCupons, Nurjax, Couponarific, delshark, rrsavings, triosir, screentk. If anyone has any of them or any other piece of software bundling ProtocolFilters.dll I'd be interested in receiving a copy.
I'm publishing all Adguard keys and the Privdog key together with example certificates here. I also created a trivial script that can be used to extract keys from ProtocolFilters.dll (or other binary files that include TLS private keys in their binary form). It looks for anything that could be a private key by its initial bytes and then calls OpenSSL to try to decode it. If OpenSSL succeeds it will dump the key.
Finally an announcement for visitors of the Chaos Communication Camp: I will give a talk about TLS interception issues and the whole story of Superfish, Privdog and friends on Sunday.
Update: Due to the storm the talk was delayed. It will happen on Monday at 12:30 in Track South.
Tuesday, June 23. 2015
tl;dr Most servers running a multi-user webhosting setup with Apache HTTPD probably have a security problem. Unless you're using Grsecurity there is no easy fix.
I am part of a small webhosting business that I run as a side project since quite a while. We offer customers user accounts on our servers running Gentoo Linux and webspace with the typical Apache/PHP/MySQL combination. We recently became aware of a security problem regarding Symlinks. I wanted to share this, because I was appalled by the fact that there was no obvious solution.
Apache has an option FollowSymLinks which basically does what it says. If a symlink in a webroot is accessed the webserver will follow it. In a multi-user setup this is a security problem. Here's why: If I know that another user on the same system is running a typical web application - let's say Wordpress - I can create a symlink to his config file (for Wordpress that's wp-config.php). I can't see this file with my own user account. But the webserver can see it, so I can access it with the browser over my own webpage. As I'm usually allowed to disable PHP I'm able to prevent the server from interpreting the file, so I can read the other user's database credentials. The webserver needs to be able to see all files, therefore this works. While PHP and CGI scripts usually run with user's rights (at least if the server is properly configured) the files are still read by the webserver. For this to work I need to guess the path and name of the file I want to read, but that's often trivial. In our case we have default paths in the form /home/[username]/websites/[hostname]/htdocs where webpages are located.
So the obvious solution one might think about is to disable the FollowSymLinks option and forbid users to set it themselves. However symlinks in web applications are pretty common and many will break if you do that. It's not feasible for a common webhosting server.
Apache supports another Option called SymLinksIfOwnerMatch. It's also pretty self-explanatory, it will only follow symlinks if they belong to the same user. That sounds like it solves our problem. However there are two catches: First of all the Apache documentation itself says that "this option should not be considered a security restriction". It is still vulnerable to race conditions.
But even leaving the race condition aside it doesn't really work. Web applications using symlinks will usually try to set FollowSymLinks in their .htaccess file. An example is Drupal which by default comes with such an .htaccess file. If you forbid users to set FollowSymLinks then the option won't be just ignored, the whole webpage won't run and will just return an error 500. What you could do is changing the FollowSymLinks option in the .htaccess manually to SymlinksIfOwnerMatch. While this may be feasible in some cases, if you consider that you have a lot of users you don't want to explain to all of them that in case they want to install some common web application they have to manually edit some file they don't understand. (There's a bug report for Drupal asking to change FollowSymLinks to SymlinksIfOwnerMatch, but it's been ignored since several years.)
So using SymLinksIfOwnerMatch is neither secure nor really feasible. The documentation for Cpanel discusses several possible solutions. The recommended solutions require proprietary modules. None of the proposed fixes work with a plain Apache setup, which I think is a pretty dismal situation. The most common web server has a severe security weakness in a very common situation and no usable solution for it.
The one solution that we chose is a feature of Grsecurity. Grsecurity is a Linux kernel patch that greatly enhances security and we've been very happy with it in the past. There are a lot of reasons to use this patch, I'm often impressed that local root exploits very often don't work on a Grsecurity system.
Grsecurity has an option like SymlinksIfOwnerMatch (CONFIG_GRKERNSEC_SYMLINKOWN) that operates on the kernel level. You can define a certain user group (which in our case is the "apache" group) for which this option will be enabled. For us this was the best solution, as it required very little change.
I haven't checked this, but I'm pretty sure that we were not alone with this problem. I'd guess that a lot of shared web hosting companies are vulnerable to this problem.
Here's the German blog post on our webpage and here's the original blogpost from an administrator at Uberspace (also German) which made us aware of this issue.
You can find my web page with links to my work as a journalist at https://hboeck.de/.
You may also find my newsletter about climate change and decarbonization technologies interesting.
Hanno on Mastodon
Show tagged entries