Monday, April 4. 2016
The Owncloud web application has an encryption module. I first became aware of it when a press release was published advertising it and containing this passage:
“Imagine you are an IT organization using industry standard AES 256 encryption keys. Let’s say that a vulnerability is found in the algorithm, and you now need to improve your overall security by switching over to RSA-2048, a completely different algorithm and key set. Now, with ownCloud’s modular encryption approach, you can swap out the existing AES 256 encryption with the new RSA algorithm, giving you added security while still enabling seamless access to enterprise-class file sharing and collaboration for all of your end-users.”
To anyone who knows anything about crypto this sounds quite weird. AES and RSA are very different algorithms – AES is a symmetric algorithm and RSA is a public key algorithm – and it makes no sense to replace one with the other. Also, RSA is much older than AES. This press release has since been removed from the Owncloud webpage, but its content can still be found in this Reuters news article. This and some conversations with Owncloud developers caused me to have a look at this encryption module.
First it is important to understand what this encryption module is actually supposed to do and what its threat scenario is. The encryption provides no security against a malicious server operator, because the encryption happens on the server. The only scenario where this encryption helps is when a trusted server uses an untrusted storage space.
When one uploads a file with the encryption module enabled it ends up under the same filename in the user's directory on the file storage. Now here's a first, quite obvious problem: The filename itself is not protected, so an attacker who is assumed to be able to see the storage space can already learn something about the supposedly encrypted data.
The content of the file starts with a plaintext header that tells us what encryption algorithm and mode is used: AES-256 in CFB mode. CFB stands for Cipher Feedback. The header is padded with dashes until position 0x2000, and then the encrypted content follows, Base64-encoded in blocks of 8192 bytes.
Authenticated and unauthenticated encryption modes
In order to proceed we need some basic understanding of encryption modes. AES is a block cipher with a block size of 128 bits. That means we cannot just encrypt arbitrary input with it; the algorithm itself only encrypts blocks of 128 bits (or 16 bytes) at a time. The naive way to encrypt more data is to split it into 16 byte blocks and encrypt every block separately. This is called Electronic Codebook mode (ECB) and it should never be used, because it is completely insecure.
Common modes for encryption are Cipher Block Chaining (CBC) and Counter mode (CTR). These modes are unauthenticated and have a property called malleability: An attacker who is able to manipulate encrypted data can manipulate it in a way that causes a defined change in the output. Often this simply means an attacker can flip bits in the ciphertext and the same bits will be flipped in the decrypted data.
To counter this, these modes are usually combined with some authentication mechanism; a common one is HMAC. However, experience has shown that combining encryption and authentication can go wrong. Many vulnerabilities in both TLS and SSH were due to bad combinations of these two mechanisms. Therefore modern protocols usually use dedicated authenticated encryption modes (AEADs); popular ones include Galois/Counter Mode (GCM), ChaCha20/Poly1305 and OCB.
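To make the pattern concrete, here is a minimal sketch of the encrypt-then-MAC composition using OpenSSL's HMAC (the function name and parameters are made up for illustration): The MAC is computed over the ciphertext and verified in constant time before anything gets decrypted – getting this order or the comparison wrong is exactly the kind of mistake that caused trouble in TLS and SSH.

#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <openssl/crypto.h>

/* Sketch: Verify an HMAC-SHA256 tag over the ciphertext before decrypting */
int verify_then_decrypt(const unsigned char *mac_key, int mac_key_len,
                        const unsigned char *ciphertext, size_t ct_len,
                        const unsigned char tag[32])
{
    unsigned char computed[32];
    unsigned int len = sizeof(computed);
    HMAC(EVP_sha256(), mac_key, mac_key_len, ciphertext, ct_len, computed, &len);
    if (CRYPTO_memcmp(computed, tag, sizeof(computed)) != 0)
        return -1; /* reject the data before touching the plaintext */
    /* ... only now decrypt the ciphertext ... */
    return 0;
}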
Cipher Feedback (CFB) mode is a self-correcting mode: When an error happens, which can be a simple data transmission error or a hard disk failure, the decryption will be correct again two blocks later. This also allows decrypting parts of an encrypted data stream. But the crucial thing for our attack is that CFB is not authenticated and is malleable. And Owncloud didn't use any authentication mechanism at all.
Therefore the data is encrypted and an attacker cannot see the content of a file (however he learns some metadata: the size and the filename), but an Owncloud user cannot be sure that the downloaded data is really the data that was uploaded in the first place. The malleability of CFB mode works like this: An attacker can flip arbitrary bits in the ciphertext and the same bits will be flipped in the decrypted data. However, if he flips a bit in any block then the following block will contain unpredictable garbage.
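This behavior can be demonstrated with a few lines of C (a self-contained sketch using OpenSSL; key, IV and plaintext are made up for illustration): Flipping one ciphertext bit flips the same plaintext bit in that block, turns the next block into garbage – and decryption reports no error whatsoever.

/* cfb-flip.c: demonstrate CFB malleability. Compile: gcc cfb-flip.c -lcrypto */
#include <stdio.h>
#include <openssl/evp.h>

static void cfb_crypt(int enc, const unsigned char *in, unsigned char *out,
                      int len, const unsigned char *key, const unsigned char *iv)
{
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int outl;
    EVP_CipherInit_ex(ctx, EVP_aes_256_cfb128(), NULL, key, iv, enc);
    EVP_CipherUpdate(ctx, out, &outl, in, len);
    EVP_CIPHER_CTX_free(ctx);
}

int main(void)
{
    const unsigned char key[32] = "an example key, 32 bytes long!!!";
    const unsigned char iv[16] = "16 byte IV......";
    unsigned char plain[48] = "block zero......block one.......block two.......";
    unsigned char cipher[48], recovered[48];

    cfb_crypt(1, plain, cipher, 48, key, iv);
    cipher[16] ^= 0x01; /* the attacker flips one bit in ciphertext block 1 */
    cfb_crypt(0, cipher, recovered, 48, key, iv);
    /* Block 0 is intact, block 1 has the same bit flipped and
       block 2 is unpredictable garbage - no error is raised */
    fwrite(recovered, 1, 48, stdout);
    printf("\n");
    return 0;
}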
Backdooring an EXE file
How does that matter in practice? Let's assume we have a group of people that share a software package over Owncloud. One user uploads a Windows EXE installer and the others download it from there and install it. Let's further assume that the attacker doesn't know the content of the EXE file (this is a generous assumption; in many cases he will know it, as he knows the filename).
EXE files start with a so-called MZ header, which is the old DOS EXE header that usually gets ignored. At a certain offset (0x3C), which is at the end of the fourth 16 byte block, there is the address of the PE header, which on Windows systems is the real EXE header. Even on modern executables there is still a small DOS program after the MZ header. It starts with the fifth 16 byte block and usually only shows the message “This program cannot be run in DOS mode”. And this DOS stub program is almost always exactly the same.
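For reference, here is the relevant layout condensed into a sketch (the real IMAGE_DOS_HEADER has many more named fields, but the offsets are the standard ones):

/* Condensed layout of the start of an EXE file */
struct dos_header {
    unsigned char e_magic[2];  /* 0x00: "MZ" */
    unsigned char ignored[58]; /* 0x02: legacy DOS fields, mostly unused */
    unsigned int  e_lfanew;    /* 0x3C: offset of the PE header - the last
                                  four bytes of the fourth 16 byte block */
};
/* 0x40, the fifth 16 byte block: Start of the DOS stub program that
   prints "This program cannot be run in DOS mode" */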
Therefore our attacker can do the following: First flip any non-relevant bit in the third 16 byte block. This will cause the fourth block to contain garbage. The fourth block contains the offset of the PE header. As this is now garbled, Windows will no longer consider this executable to be a Windows application and will therefore execute the DOS stub.
The attacker can then XOR 16 bytes of his own code with the first 16 bytes of the standard DOS stub code. He then XORs the result into the fifth ciphertext block of the EXE file, where he expects the DOS stub to be. Voila: The resulting decrypted EXE file will contain 16 bytes of code controlled by the attacker.
I created a proof of concept of this attack. It isn't enough to launch a real attack, because the attacker only controls 16 bytes of DOS assembly code, which is very little. For a real attack an attacker would have to identify further pieces of the executable that are predictable and jump through those code segments.
The first fix
I reported this to Owncloud via HackerOne in January. The first fix they proposed was a change where they used Counter mode (CTR) in combination with HMAC. They still encrypt the file in blocks of 8192 bytes. While this is certainly less problematic than the original construction, it still had an obvious problem: All the 8192 byte file blocks were encrypted the same way. Therefore an attacker can swap or remove chunks of a file. The encryption is still malleable.
The second fix then included a counter for the file blocks and also prevented attacks where an attacker rolls a file back to an earlier version. This solution is shipped in Owncloud 9.0, which has recently been released.
Is this new construction secure? I honestly don't know. It is secure enough that I didn't find another obvious flaw in it, but that doesn't mean a whole lot.
You may wonder at this point why they didn't switch to an authenticated encryption mode like GCM. The reason is that PHP doesn't support any authenticated encryption modes. There is a proposal, and support for authenticated encryption will most likely land in PHP 7.1. However, given that using outdated PHP versions is a very widespread practice, it will probably take another decade until anyone can use that in mainstream web applications.
Don't invent your own crypto protocols
The practical relevance of this vulnerability is probably limited, because the scenario it protects against is relatively obscure. But I think there is a lesson to learn here: When people without a strong cryptographic background create ad-hoc designs of cryptographic protocols, it will almost always go wrong.
It is widely known that designing your own crypto algorithms is a bad idea and that you should use standardized and well tested algorithms like AES. But using secure algorithms doesn't automatically create a secure protocol. One has to know the interactions and limitations of crypto primitives, and this is far from trivial. There is a worrying trend – especially since the Snowden revelations – that new crypto products that never saw any professional review get developed and advertised in large numbers. A lot of these products are probably extremely insecure and shouldn't be trusted at all.
If you do crypto you should either do it right (which may mean paying someone to review your design or to create it in the first place) or you'd better not do it at all. People trust your crypto, and if that trust isn't justified you shouldn't ship a product that creates the impression it contains secure cryptography.
There's another thing that bothers me about this: Although this seems to be a pretty standard use case of crypto – you have a symmetric key and you want to encrypt some data – there is no straightforward and widely available standard solution for it. Using authenticated encryption solves a number of issues, but not all of them (this talk by Adam Langley covers some interesting issues and caveats of authenticated encryption).
The proof of concept can be found on Github. I presented this vulnerability in a talk at the Easterhegg conference, a video recording is available.
Friday, December 11. 2015
Important notice: After I published this text Adam Langley pointed out that a major assumption is wrong: Android 2.2 actually has no problems with SHA256-signed certificates. I checked this myself, and in an emulated Android 2.2 instance I was able to connect to a site with a SHA256-signed certificate. I apologize for that error; I trusted the Cloudflare blog post on that point. This whole text was written with that assumption in mind, so it's hard to change without rewriting it from scratch. I have marked the parts that are likely to be questioned. Most of it is still true, and Android 2 has a problematic TLS stack (no SNI), but the specific claim regarding SHA256 certificates seems wrong.
This week both Cloudflare and Facebook announced that they want to delay the deprecation of certificates signed with the SHA1 algorithm. This spurred some hot debates about whether or not this is a good idea – with two seemingly good causes in conflict: On the one side people want to improve security, on the other side access to webpages should remain possible for users of old devices, many of them living in poor countries. I want to give some background on the issue and ask why this unfortunate situation happened in the first place, because I think it highlights some of the most important challenges in the TLS space and more generally in IT security.
SHA1 broken since 2005
The SHA1 algorithm is a cryptographic hash algorithm and it has been known for quite some time that its security isn't great. In 2005 the Chinese researcher Wang Xiaoyun published an attack that would allow creating a collision for SHA1. The attack wasn't carried out in practice, because doing so is quite expensive, but it was clear that a financially powerful adversary would be able to perform it. A year earlier the even older hash function MD5 had been broken in practice; in 2008 this led to a practical attack against the issuance of TLS certificates. In the past years browsers pushed for the deprecation of SHA1 certificates and it was agreed that starting January 2016 no more certificates signed with SHA1 may be issued; instead the stronger algorithm SHA256 should be used. Many felt this was already far too late, given that we have known for ten years that SHA1 is broken.
A few weeks before the SHA1 deadline Cloudflare and Facebook are now questioning this deprecation plan. They have some strong arguments. According to Cloudflare's numbers there is still a significant number of users whose browsers lack support for SHA256 certificates. And those users are primarily in relatively poor, repressive or war-ridden countries; the top three on the list are China, Cameroon and Yemen. Their argument, which is hard to dismiss, is that cutting off SHA1 support will primarily affect the poorest users.
Cloudflare and Facebook propose a new mechanism to get legacy validated certificates. These certificates would only be issued to site operators that use a technology to separate users based on their TLS handshake and only show the SHA1 certificate to those using an older browser. Facebook has already published the code to do that, and Cloudflare announced that they will release the code of their implementation as well. Right now it's still possible to get SHA1 certificates, so those companies could just register them now and use them for the three years to come. Asking for this legacy validation process indicates that Cloudflare and Facebook don't see this as a short-term workaround; instead they seem to expect that this will be a solution they use for years to come, without any decided end date.
It's a tough question whether or not this is a good idea. But I want to ask a different question: Why do we have this problem in the first place, why is it hard to fix and what can we do to prevent similar things from happening in the future? One thing is remarkable about this problem: It's a software problem. In theory software can be patched and the solution to a software problem is to update the software. So why can't we just provide updates and get rid of these legacy problems?
Windows XP and Android Froyo
According to Cloudflare there are two main reasons why so many users can't use sites with SHA256 certificates: Windows XP and old versions of Android (SHA256 support was added in Android 2.3, so this mostly affects Android 2.2 aka Froyo). We all know that Windows XP shouldn't be used any more and that its support ended in 2014. But that clearly clashes with reality. People continue using old systems and Windows XP is still alive in many countries, especially in China.
But I'm inclined to say that Windows XP is probably the smaller problem here. Windows XP introduced support for SHA256 certificates with Service Pack 3. By using an alternative browser (Firefox is still supported on Windows XP if you install SP3) it is even possible to have a relatively safe browsing experience. I'm not saying that I recommend it, but given the circumstances, advising people how to update their machines and install an alternative browser can partly provide a way to reduce the reliance on broken algorithms.
The Updatability Gap
But the problem with Android is much, much worse, and I think this brings us to probably the biggest problem in IT security we have today. For years one of the most important messages to users in IT security was: Keep your software up to date. But at the same time the industry has created new software ecosystems where very often that just isn't an option.
In the Android case Google says that it's the responsibility of device vendors and carriers to deliver security updates. The dismal reality is that in most cases they just don't do that. But even if device vendors are willing to provide updates it usually only happens for a very short time frame. Google only supports the latest two Android major versions. For them Android 2.2 is ancient history, but for a significant portion of users it is still the operating system they use.
What we have here is a huge gap between the time frame in which devices get security updates and the time frame in which users use these devices. And pretty much everything tells us that vendors in the Internet of Things ignore these problems even more, so the updatability gap will become larger. Many have become accustomed to the idea that phones only get used for a year, but it's hard to imagine how that's going to work for a fridge. What's worse: Whether you look at phones or other devices, they often actively try to prevent users from replacing the software on their own.
This is a hard problem to tackle, but it's probably the biggest problem IT security is facing in the upcoming years. We need to get a working concept for updates – a concept that matches the real world use of devices.
Substandard TLS implementations
But there's another part of the SHA1 deprecation story. As I wrote above, it has been clear since 2005 that SHA1 needs to go away. That was three years before Android was even released. But in 2010 Android still wasn't capable of supporting SHA256 certificates. Google has to take a large part of the blame here. While these days they are at the forefront of deploying high quality and up to date TLS stacks, they shipped a substandard and outdated TLS implementation in Android 2. (Another problem is that no Android 2 version supports Server Name Indication, a technology that allows using different certificates for different hosts on one IP address.)
This is not the first such problem we are facing. With the POODLE vulnerability it became clear that the old SSL version 3 is broken beyond repair and that it's impossible to use it safely. The only option was to deprecate it. However, doing so was painful, because a lot of devices out there didn't support better protocols. The successor protocol TLS 1.0 (SSL/TLS version numbering is confusing, I know) was released in 1999. But the problem wasn't that people were using devices older than 1999. The problem was that many vendors had shipped devices and software that only supported SSLv3 in recent years.
One example was Windows Phone 7. In 2011 this was the operating system of Microsoft's and Nokia's flagship product, the Lumia 800. Its mail client is unable to connect to servers that don't support SSLv3. It is just inexcusable that in 2011 Microsoft shipped a product which only supported a protocol that had been superseded 12 years earlier. It's even more inexcusable that they refused to fix it later, because the issue only came to light when Windows Phone 7 was already out of support.
The takeaway from this is that sloppiness from the past can bite you many years later. And this is what we're seeing with Android 2.2 now.
But you might think that given these experiences this has stopped today. It hasn't. The largest deployer of substandard TLS implementations these days is Apple. Up until recently (before El Capitan) Safari on OS X didn't support any authenticated encryption cipher suites with AES-GCM and relied purely on the CBC block mode. The CBC cipher suites are a hot candidate for the next deprecation plan, because in 2013 the Lucky 13 attack (http://www.isg.rhul.ac.uk/tls/Lucky13.html) showed that they are really hard to implement safely. The situation for applications other than the browser (Mail etc.) is even worse on Apple devices: They only support the long deprecated TLS 1.0 protocol – and that's still the case on their latest systems.
There is widespread agreement in the TLS and cryptography community that the only really safe way to use TLS these days is TLS 1.2 with a cipher suite using forward secrecy and authenticated encryption (AES-GCM is the only standardized option for that right now, however ChaCha20/Poly1305 will come soon).
For the specific case of the SHA1 deprecation I would propose the following: Cloudflare and Facebook should go ahead with their handshake workaround for the next years, as long as their current certificates are valid. But this time should be used to find solutions. Reach out to those users visiting your sites and try to understand what could be done to fix the situation. For the Windows XP users this is relatively easy – help them update their machines and preferably install another browser, most likely Firefox. For Android there is probably no easy solution, but we have some of the largest Internet companies involved here. Please seriously ask these questions: Is it possible to retrofit Android 2.2 with a reasonable TLS stack? What ways are there to get that onto the devices? Is it possible to install a browser app with its own TLS stack on at least some of those devices? This probably won't work in most cases, because on many cheap phones there just isn't enough space to install large apps. In the long term I hope that the tech community will start tackling the updatability problem.
In the TLS space I think we need to make sure that no more substandard TLS deployments get shipped today. Point out the vendors that do so and pressure them to stop. It wasn't acceptable in 2010 to ship TLS with long-known problems and it isn't acceptable today.
Image source: Wikimedia Commons
Monday, November 30. 2015
tl;dr Older GnuTLS versions (2.x) fail to check the first byte of the padding in CBC modes. Various stable Linux distributions, including Ubuntu LTS and Debian wheezy (oldstable), use this version. Current GnuTLS versions are not affected.
A few days ago an email on the ssllabs mailing list caught my attention. A Canonical developer had observed that the SSL Labs test would report the GnuTLS version used in Ubuntu 14.04 (the current long-term support version) as vulnerable to the POODLE TLS vulnerability, while other tests for the same vulnerability showed no such issue.
A little background: The original POODLE vulnerability is a weakness of the old SSLv3 protocol that's now officially deprecated. POODLE is based on the fact that SSLv3 does not specify the padding of the CBC modes, so the padding bytes can contain arbitrary values. A while after POODLE, Adam Langley reported that there is a variant of POODLE in TLS. However, while the original POODLE is a protocol issue, the POODLE TLS vulnerability is an implementation issue: TLS specifies the values of the padding bytes, but some implementations don't check them. Recently Yngve Pettersen reported that there are different variants of this POODLE TLS vulnerability: Some implementations only check parts of the padding. This is the reason why different tests sometimes lead to different results. A test that only changes one byte of the padding will lead to different results than one that changes all padding bytes. Yngve Pettersen uncovered POODLE variants in devices from Cisco (Cavium chip) and Citrix.
I looked at the Ubuntu issue and found that this was exactly such a case of an incomplete padding check: The first byte wasn't checked. I believe this might explain some of the vulnerable hosts Yngve Pettersen found. This is the code:
for (i = 2; i <= pad; i++)
if (ciphertext.data[ciphertext.size - i] != pad)
pad_failed = GNUTLS_E_DECRYPTION_FAILED;
The padding in TLS is defined such that the rightmost byte of the last block contains the length of the padding. This value is also used in all padding bytes. However, the length field itself is not part of the padding. Therefore if we have e.g. a padding length of three, this would result in four bytes with the value 3. The above code misses one byte: i goes from 2 (checking the byte at block length minus 2) to pad (the byte at block length minus pad), so it checks only pad minus one bytes. To correct it we need to change the loop to end with pad+1. The code is completely reworked in current GnuTLS versions, therefore they are not affected. Upstream has officially announced the end of life for GnuTLS 2, but some stable Linux distributions still use it.
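Spelled out, the corrected loop covers one more byte (this matches the one-line fix described below, though the exact patch formatting may differ):

for (i = 2; i <= pad + 1; i++)
    if (ciphertext.data[ciphertext.size - i] != pad)
        pad_failed = GNUTLS_E_DECRYPTION_FAILED;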
The story doesn't end here: After I found this bug I talked about it with Juraj Somorovsky. He mentioned that he had already read about this before: In the paper on the Lucky Thirteen attack, published in 2013 by Nadhem AlFardan and Kenny Paterson. Here's what the Lucky Thirteen paper has to say about this issue on page 13:
for (i = 2; i < pad; i++)
if (ciphertext->data[ciphertext->size - i] != ciphertext->data[ciphertext->size - 1])
pad_failed = GNUTLS_E_DECRYPTION_FAILED;
It is not hard to see that this loop should also cover the edge case i=pad in order to carry out a full padding check. This means that one byte of what should be padding actually has a free format.
If you look closely you will see that this code is actually different from the one I quoted above. The reason is that the GnuTLS version in question already contained a fix that was applied in response to the Lucky Thirteen paper. However, what the Lucky Thirteen paper missed is that the original check was off by two bytes, not just one. Therefore it only got an incomplete fix, reducing the attack surface from two bytes to one.
In a later commit this whole code was reworked in response to the Lucky Thirteen attack and there the problem got fixed for good. However that change never made it into version 2 of GnuTLS. Red Hat / CentOS packages contain a backport patch of those changes, therefore they are not affected.
You might wonder what the impact of this bug is. I'm not totally familiar with the details of all the possible attacks, but the POODLE attack gets increasingly harder the fewer bytes of the padding can be freely set. It is most likely impossible if there is only one such byte. The Lucky Thirteen paper says: "This would enable, for example, a variant of the short MAC attack of [...] even if variable length padding was not supported." I'll leave the judgement of whether this might be practically exploitable to people who know more about crypto than I do.
Fixing this bug is a simple one-line patch, which I have attached here. This will silence all POODLE checks, however it doesn't apply all the changes that were made in response to the Lucky Thirteen attack. I'm not sure whether the code is practically vulnerable, but Lucky Thirteen is a tricky issue; recently a variant of that attack was shown against Amazon's s2n library.
The missing padding check for the first byte got CVE-2015-8313 assigned. Currently I'm aware of Ubuntu LTS (now fixed) and Debian oldstable (Wheezy) being affected.
Monday, November 23. 2015
tl;dr Dell laptops come preinstalled with a root certificate and a corresponding private key. That completely compromises the security of encrypted HTTPS connections. I've provided an online check, affected users should delete the certificate.
It seems that Dell hasn't learned anything from the Superfish scandal earlier this year: Laptops from the company come with a preinstalled root certificate that will be accepted by browsers. The private key is also installed on the system and has now been published. Therefore attackers can use Man-in-the-Middle attacks against Dell users to show them manipulated HTTPS webpages or read their encrypted data.
The certificate, which is installed in the system's certificate store under the name "eDellRoot", gets installed by a piece of software called Dell Foundation Services. This software is still available on Dell's webpage. According to the somewhat unclear description from Dell it is used to provide "foundational services facilitating customer serviceability, messaging and support functions".
The private key of this certificate is marked as non-exportable in the Windows certificate store. However, this provides no real protection: There are tools to export such non-exportable certificate keys. A user of the platform Reddit has posted the key there.
For users of the affected laptops this is a severe security risk. Any attacker can use this root certificate to create valid certificates for arbitrary web pages. Even HTTP Public Key Pinning (HPKP) does not protect against such attacks, because browser vendors allow locally installed certificates to override the key pinning protection. This is a compromise in the implementation that allows the operation of so-called TLS interception proxies.
I was made aware of this issue a while ago by Kristof Mattei. We asked Dell for a statement three weeks ago and didn't get any answer.
It is currently unclear what purpose this certificate served. However, it seems unlikely that it was placed there deliberately for surveillance purposes – in that case Dell wouldn't have installed the private key on the system.
Affected are only users of browsers or other applications that use the system's certificate store. Among the common Windows browsers this affects Internet Explorer, Edge and Chrome. Firefox users are not affected, as Mozilla's browser has its own certificate store.
Users of Dell laptops can check if they are affected with an online check tool. Affected users should immediately remove the certificate in the Windows certificate manager. The certificate manager can be started by clicking "Start" and typing in "certmgr.msc". The "eDellRoot" certificate can be found under "Trusted Root Certificate Authorities". You also need to remove the file Dell.Foundation.Agent.Plugins.eDell.dll; Dell has now posted instructions and a removal tool.
This incident is almost identical to the Superfish incident. Earlier this year it became public that Lenovo had preinstalled a piece of software called Superfish on its laptops. Superfish intercepts HTTPS connections to inject ads. It used a root certificate for that, and the corresponding private key was part of the software. After that incident several other programs with the same vulnerability were identified; they all used a software module called Komodia. Similar vulnerabilities were found in other software products, for example in Privdog and in the ad blocker Adguard.
This article is mostly a translation of a German article I wrote for Golem.de.
Image source and license: Wistula / Wikimedia Commons, Creative Commons by 3.0
Update (2015-11-24): Second Dell root certificate DSDTestProvider
I just found out that there is a second root certificate installed with some Dell software that causes exactly the same issue. It is named DSDTestProvider and comes with a piece of software called Dell System Detect. Unlike Dell Foundation Services, this one does not need a Dell computer to be installed, therefore it was trivial to extract the certificate and the private key. My online test now checks both certificates. This new certificate is not yet covered by Dell's removal instructions.
Dell has issued an official statement on their blog, and in the comment section a user mentioned this DSDTestProvider certificate. After googling what DSD might be I quickly found it. There have been concerns about the security of Dell System Detect before; Malwarebytes has an article about it from April mentioning that it was vulnerable to a remote code execution vulnerability.
Update (2015-11-26): Service tag information disclosure
Another, unrelated issue on Dell PCs was discovered in the Dell Foundation Services tool. It allows webpages to read a unique service tag. There's also an online check for this.
Thursday, August 13. 2015
In February the discovery of a piece of software called Superfish attracted widespread attention. Superfish caused a severe security vulnerability by intercepting HTTPS connections with a Man-in-the-Middle certificate. The certificate and the corresponding private key were shared among all installations.
The use of Man-in-the-Middle proxies for traffic interception is a widespread method: An application installs a root certificate into the browser and later intercepts connections by creating signed certificates for webpages on the fly. It quickly became clear that Superfish was only the tip of the iceberg. The underlying software module Komodia was used in a whole range of applications, all suffering from the same bug. Later another piece of software named Privdog was found that also intercepted HTTPS traffic, and I published a blog post explaining that it was broken in a different way: It completely failed to do any certificate verification on its connections.
In a later blog post I analyzed several antivirus applications that also intercept HTTPS traffic. They were not as broken as Superfish or Privdog, but all of them decreased the security of the TLS encryption in one way or another. The most severe issue was that Kaspersky was at that point still vulnerable to the FREAK bug, more than a month after it was discovered. In a comment to that blog post I was asked about a software called Adguard. I have to apologize that it took me so long to write this up.
Different certificate, same key
The first thing I did was install Adguard twice in different VMs and look at the root certificate that got installed into the browser. The fingerprints of the certificates were different. However, a closer look revealed something interesting: The RSA modulus was the same. It turned out that Adguard created a new root certificate with a changing serial number for every installation, but it didn't generate a new key. Therefore it is vulnerable to the same attacks as Superfish.
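This is easy to check yourself: OpenSSL can print the modulus of a certificate, and for certificates from two independent Adguard installations the output was identical (the filenames here are placeholders):

openssl x509 -in adguard-install1.pem -noout -modulus
openssl x509 -in adguard-install2.pem -noout -modulus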
I reported this issue to Adguard. They have fixed it, however they still intercept HTTPS traffic.
I learned that Adguard did not always use the same key. Instead it chose one out of ten different keys based on the CPU. All ten keys could easily be extracted from a file called ProtocolFilters.dll that was shipped with Adguard. Older versions of Adguard only used one key, shared among all installations. There was also a very outdated copy of the NSS library. It suffers from various vulnerabilities, however it seems they are not exploitable here. The library is not used for TLS connections; its only job is to install certificates into the Firefox root store.
Meet Privdog again
The outdated NSS version gave me a hint, because I had seen this before: In Privdog. I had spent some time trying to find out whether Privdog would be vulnerable to known NSS issues (which had the positive side effect that Filippo created proof of concept code for the BERserk vulnerability). What I didn't notice back then was the shared key issue. Privdog also used the same key among different installations. So it turns out Privdog was completely broken in two different ways: By sharing the private key among installations and by not verifying certificates.
The latest version of Privdog no longer intercepts HTTPS traffic; it works as a browser plugin now. I don't know whether this vulnerability was still present after the initial fix prompted by my original blog post.
Now what is this ProtocolFilters.dll? It is a commercial software module that is supposed to be used together with a product called Netfilter SDK. I wondered where else it would be found and whether we have another widely used vulnerable software module like Komodia on our hands.
ProtocolFilters.dll is mentioned a lot on the web, mostly in the context of Potentially Unwanted Applications, also called crapware. That means software that is either preinstalled or gets bundled with installers of other software and is often installed without the user's consent, or by tricking users into clicking some "ok" button without knowing that they are agreeing to install yet another piece of software. Unfortunately I was unable to get my hands on any other software using it.
Lots of "Potentially Unwanted Applications" use ProtocolFilters.dll
Software names that I found that supposedly include ProtocolFilters.dll: Coupoon, CashReminder, SavingsDownloader, Scorpion Saver, SavingsbullFilter, BRApp, NCupons, Nurjax, Couponarific, delshark, rrsavings, triosir, screentk. If anyone has any of them, or any other piece of software bundling ProtocolFilters.dll, I'd be interested in receiving a copy.
I'm publishing all Adguard keys and the Privdog key together with example certificates here. I also created a trivial script that can be used to extract keys from ProtocolFilters.dll (or other binary files that include TLS private keys in their binary form). It looks for anything that could be a private key by its initial bytes and then calls OpenSSL to try to decode it. If OpenSSL succeeds it will dump the key.
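The idea behind that script can be sketched in C like this (a simplified illustration, not the published script itself): Scan for bytes that look like the start of a DER-encoded RSA key and ask OpenSSL to parse them.

/* keyscan.c: Scan a binary for embedded DER-encoded RSA private keys.
   Compile: gcc keyscan.c -lcrypto */
#include <stdio.h>
#include <stdlib.h>
#include <openssl/rsa.h>
#include <openssl/pem.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    unsigned char *buf = malloc(size);
    if (fread(buf, 1, size, f) != (size_t)size) return 1;
    fclose(f);

    for (long i = 0; i + 4 < size; i++) {
        /* 0x30 0x82: A DER SEQUENCE with two length bytes, the typical
           start of an RSA private key larger than 255 bytes */
        if (buf[i] != 0x30 || buf[i + 1] != 0x82)
            continue;
        const unsigned char *p = buf + i;
        RSA *rsa = d2i_RSAPrivateKey(NULL, &p, size - i);
        if (rsa) {
            printf("candidate RSA private key at offset 0x%lx:\n",
                   (unsigned long)i);
            PEM_write_RSAPrivateKey(stdout, rsa, NULL, NULL, 0, NULL, NULL);
            RSA_free(rsa);
        }
    }
    free(buf);
    return 0;
}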
Finally an announcement for visitors of the Chaos Communication Camp: I will give a talk about TLS interception issues and the whole story of Superfish, Privdog and friends on Sunday.
Update: Due to the storm the talk was delayed. It will happen on Monday at 12:30 in Track South.
Sunday, May 17. 2015
tl;dr The news about a broken 4096 bit RSA key is not true. It is just a faulty copy of a valid key.
Earlier today a blog post claiming the factoring of a 4096 bit RSA key was published and quickly made it to the top of Hacker News. The key in question was the PGP key of a well-known Linux kernel developer. I already commented on Hacker News on why this is most likely wrong, but I thought I'd write up some more details. To understand what is going on I have to explain some background, both on RSA and on PGP keyservers. This by itself is pretty interesting.
RSA public keys consist of two values called N and e. The N value, called the modulus, is the interesting one here. It is the product of two very large prime numbers. The security of RSA relies on the fact that these two numbers are secret. If an attacker were able to gain knowledge of these numbers he could use them to calculate the private key. That's the reason why RSA depends on the hardness of the factoring problem: If someone can factor N, he can break RSA. For all we know today factoring is hard enough to make RSA secure (at least as long as there are no large quantum computers).
Now imagine you have two RSA keys that have been generated with bad random numbers. They are different, but one of their primes is the same. That means we have N1=p*q1 and N2=p*q2. In this case RSA is no longer secure, because calculating the greatest common divisor (GCD) of two large numbers can be done very quickly with the Euclidean algorithm, so one can calculate the shared prime value.
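With a bignum library the whole attack is a few lines of code. Here is a toy illustration using GMP (the moduli are tiny made-up numbers for readability; the code is the same for real 1024 or 2048 bit moduli):

/* shared-prime.c: Recover a shared prime via GCD.
   Compile: gcc shared-prime.c -lgmp */
#include <stdio.h>
#include <gmp.h>

int main(void)
{
    mpz_t n1, n2, p, q1, q2;
    mpz_inits(n1, n2, p, q1, q2, NULL);

    mpz_set_ui(n1, 101UL * 103UL); /* N1 = p * q1, sharing the prime 101 */
    mpz_set_ui(n2, 101UL * 107UL); /* N2 = p * q2 */

    mpz_gcd(p, n1, n2);      /* Euclidean algorithm: Fast even for huge numbers */
    mpz_divexact(q1, n1, p); /* knowing p immediately yields the other primes */
    mpz_divexact(q2, n2, p);

    gmp_printf("p = %Zd, q1 = %Zd, q2 = %Zd\n", p, q1, q2);
    mpz_clears(n1, n2, p, q1, q2, NULL);
    return 0;
}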
It is not only possible to break RSA keys if you have two keys with a shared factor, it is also possible to take a large set of keys and find shared factors between any of them. In 2012 Arjen Lenstra and his team published a paper applying this attack to large scale key sets, and at the same time Nadia Heninger and a team at the University of Michigan independently performed the same attack. This uncovered a lot of vulnerable keys on embedded devices, but these were mostly SSH and TLS keys. Lenstra's team however also found two vulnerable PGP keys. For more background you can watch this 29C3 talk by Nadia Heninger, Dan Bernstein and Tanja Lange.
PGP keyservers have been around for quite some time and they have a property that makes them especially interesting for this kind of research: They usually never delete anything. You can add a key to a keyserver, but you cannot remove it; you can only mark it as invalid by revoking it. Therefore the data from the keyservers gives you a large set of cryptographic keys.
Okay, so back to the news about the supposedly broken 4096 bit key: There is a service called Phuctor where you can upload a key and it'll check it against a set of known vulnerable moduli. This service identified the supposedly vulnerable key.
The key in question has the key id e99ef4b451221121 and belongs to the master key bda06085493bace4. Here is the vulnerable modulus:
c844a98e3372d67f 562bd881da8ea66c a71df16deab1541c e7d68f2243a37665 c3f07d3dd6e651cc d17a822db5794c54 ef31305699a6c77c 043ac87cafc022a3 0a2a717a4aa6b026 b0c1c818cfc16adb aae33c47b0803152 f7e424b784df2861 6d828561a41bdd66 bd220cb46cd288ce 65ccaf9682b20c62 5a84ef28c63e38e9 630daa872270fa15 80cb170bfc492b80 6c017661dab0e0c9 0a12f68a98a98271 82913ff626efddfb f8ae8f1d40da8d13 a90138686884bad1 9db776bb4812f7e3 b288b47114e486fa 2de43011e1d5d7ca 8daf474cb210ce96 2aafee552f192ca0 32ba2b51cfe18322 6eb21ced3b4b3c09 362b61f152d7c7e6 51e12651e915fc9f 67f39338d6d21f55 fb4e79f0b2be4d49 00d442d567bacf7b 6defcd5818b050a4 0db6eab9ad76a7f3 49196dcc5d15cc33 69e1181e03d3b24d a9cf120aa7403f40 0e7e4eca532eac24 49ea7fecc41979d0 35a8e4accea38e1b 9a33d733bea2f430 362bd36f68440ccc 4dc3a7f07b7a7c8f cdd02231f69ce357 4568f303d6eb2916 874d09f2d69e15e6 33c80b8ff4e9baa5 6ed3ace0f65afb43 60c372a6fd0d5629 fdb6e3d832ad3d33 d610b243ea22fe66 f21941071a83b252 201705ebc8e8f2a5 cc01112ac8e43428 50a637bb03e511b2 06599b9d4e8e1ebc eb1e820d569e31c5 0d9fccb16c41315f 652615a02603c69f e9ba03e78c64fecc 034aa783adea213b
In fact this modulus is easily factorable, because it can be divided by 3. However if you look at the master key bda06085493bace4 you'll find another subkey with this modulus:
c844a98e3372d67f 562bd881da8ea66c a71df16deab1541c e7d68f2243a37665 c3f07d3dd6e651cc d17a822db5794c54 ef31305699a6c77c 043ac87cafc022a3 0a2a717a4aa6b026 b0c1c818cfc16adb aae33c47b0803152 f7e424b784df2861 6d828561a41bdd66 bd220cb46cd288ce 65ccaf9682b20c62 5a84ef28c63e38e9 630daa872270fa15 80cb170bfc492b80 6c017661dab0e0c9 0a12f68a98a98271 82c37b8cca2eb4ac 1e889d1027bc1ed6 664f3877cd7052c6 db5567a3365cf7e2 c688b47114e486fa 2de43011e1d5d7ca 8daf474cb210ce96 2aafee552f192ca0 32ba2b51cfe18322 6eb21ced3b4b3c09 362b61f152d7c7e6 51e12651e915fc9f 67f39338d6d21f55 fb4e79f0b2be4d49 00d442d567bacf7b 6defcd5818b050a4 0db6eab9ad76a7f3 49196dcc5d15cc33 69e1181e03d3b24d a9cf120aa7403f40 0e7e4eca532eac24 49ea7fecc41979d0 35a8e4accea38e1b 9a33d733bea2f430 362bd36f68440ccc 4dc3a7f07b7a7c8f cdd02231f69ce357 4568f303d6eb2916 874d09f2d69e15e6 33c80b8ff4e9baa5 6ed3ace0f65afb43 60c372a6fd0d5629 fdb6e3d832ad3d33 d610b243ea22fe66 f21941071a83b252 201705ebc8e8f2a5 cc01112ac8e43428 50a637bb03e511b2 06599b9d4e8e1ebc eb1e820d569e31c5 0d9fccb16c41315f 652615a02603c69f e9ba03e78c64fecc 034aa783adea213b
You may notice that these look pretty similar. But they are not the same: The second one is the real subkey; the first one is just a copy of it with errors.
If you run a batch GCD analysis on the full PGP key server data you will find a number of such keys (Nadia Heninger has published code to do a batch GCD attack). I don't know how they end up on the key servers. I assume they are produced by network errors, hard disk failures or software bugs. It may also be that someone just created them in some experiment.
The important thing is: Everyone can generate a subkey to any PGP key and upload it to a key server. That's just the way the key servers work; they don't check keys in any way. However these keys should pose no threat to anyone. The only case where this could matter would be a broken implementation of the OpenPGP key protocol that does not check whether subkeys really belong to a master key.
However, you won't be able to easily import such a key into your local GnuPG installation. If you try to fetch this faulty subkey from a key server, GnuPG will just refuse to import it. The reason is that every subkey has a signature that proves that it belongs to a certain master key. For those faulty keys this signature is obviously invalid.
Now here's my personal tie-in to this story: Last year I started a project to analyze the data on the PGP key servers. And at some point I thought I had found a large number of vulnerable PGP keys – including the key in question here. In a rush I wrote a mail to all people affected. Only later did I find out that something was not right, and I wrote to all affected people again, apologizing. Most of the keys I thought I had found were just faulty keys on the key servers.
The code I used to parse the PGP key server data is public, I also wrote a background paper and did a talk at the BsidesHN conference.
Tuesday, April 7. 2015
tl;dr With a reasonably simple fuzzing setup I was able to rediscover the Heartbleed bug. This uses state-of-the-art fuzzing and memory protection technology (american fuzzy lop and Address Sanitizer), but it doesn't require any prior knowledge about the specifics of the Heartbleed bug or the TLS Heartbeat extension. We can learn from this how to find similar bugs in the future.
Exactly one year ago a bug in the OpenSSL library became public that is one of the most well-known security bugs of all time: Heartbleed. It is a bug in the code of a TLS extension that up until then was barely known to anybody. A read buffer overflow allowed an attacker to extract parts of the memory of every server using OpenSSL.
Can we find Heartbleed with fuzzing?
Heartbleed was introduced in OpenSSL 1.0.1, which was released in March 2012, two years earlier. Many people wondered how it could have stayed hidden there for so long. David A. Wheeler wrote an essay discussing how fuzzing and memory protection technologies could have detected Heartbleed. It covers many aspects in detail, but in the end he only offers speculation on whether or not fuzzing would have found Heartbleed. So I wanted to try it out.
Of course it is easy to find a bug if you know what you're looking for. As far as reasonably possible I tried not to use any specific information I had about Heartbleed. I created a setup that's reasonably simple and similar to what someone might try without knowing anything about the specifics of Heartbleed.
Heartbleed is a read buffer overflow. What that means is that an application is reading outside the boundaries of a buffer. For example, imagine an application has a space in memory that's 10 bytes long. If the software tries to read 20 bytes from that buffer, you have a read buffer overflow: It will read whatever is in the memory located after the 10 bytes. These bugs are fairly common, and the basic concept of exploiting buffer overflows is pretty old. Just to give you an idea how old: Recently the Chaos Computer Club celebrated the 30th anniversary of a hack of the German BtX system, an early online service. They used a buffer overflow that was in many aspects very similar to the Heartbleed bug. (It is actually disputed whether this is really what happened, but it seems reasonably plausible to me.)
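As a toy example, the 10 byte buffer from above looks like this in C (a deliberately buggy sketch):

/* read-overflow.c: A read buffer overflow that usually doesn't crash */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buffer[10] = "AAAAAAAAA"; /* 10 bytes including the trailing zero */
    char out[20];
    memcpy(out, buffer, 20);    /* reads 10 bytes past the end of buffer */
    fwrite(out, 1, 20, stdout); /* prints whatever followed it in memory */
    return 0;
}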
Fuzzing is a widely used strategy to find security issues and bugs in software. The basic idea is simple: Give the software lots of inputs with small errors and see what happens. If the software crashes you likely found a bug.
When buffer overflows happen an application doesn't always crash. Often it will just read (or, in the case of a write overflow, write) whatever memory happens to be there. Whether it crashes depends on a lot of circumstances; most of the time read overflows won't crash your application. That's also the case with Heartbleed. There are a couple of technologies that improve the detection of memory access errors like buffer overflows. An old and well-known one is the debugging tool Valgrind. However, Valgrind slows applications down a lot (around 20 times slower), so it is not really well suited for fuzzing, where you want to run an application millions of times on different inputs.
Address Sanitizer finds more bugs
A better tool for our purpose is Address Sanitizer. David A. Wheeler calls it “nothing short of amazing”, and I want to reiterate that: I think it is a tool that every C/C++ software developer should know and should use for testing.
Address Sanitizer is part of the C compiler and has been included in the two most common compilers in the free software world, gcc and llvm. To use Address Sanitizer one has to recompile the software with the command line parameter -fsanitize=address. It slows applications down, but only by a relatively small amount; according to their own numbers an application using Address Sanitizer is around 1.8 times slower. This makes it feasible for fuzzing tasks.
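Applied to the toy read overflow above, the difference is immediately visible (a sketch – the exact report output varies by compiler version):

gcc -fsanitize=address -g read-overflow.c -o read-overflow
./read-overflow

Instead of silently printing leaked memory, the program now aborts with a stack-buffer-overflow report pointing at the memcpy.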
For the fuzzing itself, a tool that recently gained a lot of popularity is american fuzzy lop (afl). It was developed by Michal Zalewski from the Google security team, who is also known by his nickname lcamtuf. As far as I'm aware the approach of afl is unique: It adds instrumentation to an application during compilation that allows the fuzzer to detect new code paths while running the fuzzing tasks. If a new interesting code path is found, the sample that triggered it is used as the starting point for further fuzzing.
Currently afl only uses file inputs and cannot directly fuzz network input. OpenSSL has a command line tool that accepts all kinds of file inputs, so you can use it for example to fuzz the certificate parser. But this approach does not allow us to directly fuzz the TLS connection, because that only happens on the network layer. By fuzzing various file inputs I recently found two issues in OpenSSL, but both had been found before by Brian Carpenter, who was fuzzing OpenSSL at the same time.
Let OpenSSL talk to itself
So to fuzz the TLS network connection I had to create a workaround. I wrote a small application that creates two instances of OpenSSL that talk to each other. This application doesn't do any real networking; it just passes buffers back and forth, thus doing a TLS handshake between a server and a client. Each message packet is written out to a file. This results in six files, but the last two are empty, because at that point the handshake is finished and no more data is transmitted. So we have four files that contain actual data from a TLS handshake. If you want to dig into this, a good description of a TLS handshake is provided by the developers of OCaml-TLS and MirageOS.
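My tool additionally dumps every message to a packet file, but the core trick – two OpenSSL instances talking through memory buffers instead of sockets – can be sketched like this (error handling and cleanup omitted; the function name is made up for illustration):

/* Sketch: A TLS handshake without any networking, via memory BIOs */
#include <openssl/ssl.h>

void handshake_in_memory(SSL_CTX *server_ctx, SSL_CTX *client_ctx)
{
    BIO *c2s = BIO_new(BIO_s_mem()); /* client writes, server reads */
    BIO *s2c = BIO_new(BIO_s_mem()); /* server writes, client reads */
    SSL *server = SSL_new(server_ctx);
    SSL *client = SSL_new(client_ctx);
    SSL_set_accept_state(server);
    SSL_set_connect_state(client);
    SSL_set_bio(client, s2c, c2s);
    SSL_set_bio(server, c2s, s2c);
    /* Pump both sides until the handshake is complete. This is where
       the handshake messages can be dumped to files or swapped out. */
    while (!SSL_is_init_finished(client) || !SSL_is_init_finished(server)) {
        SSL_do_handshake(client);
        SSL_do_handshake(server);
    }
}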
Then I added the possibility of switching out parts of the handshake messages with files I pass on the command line. By calling my test application selftls with a number and a filename, a handshake message gets replaced by this file. So to test just the first part of the server handshake I'd call the test application, take the output file packet-1 and pass it back to the application by running selftls 1 packet-1. Now we have all the pieces we need to use american fuzzy lop and fuzz the TLS handshake.
I compiled OpenSSL 1.0.1f, the last version that was vulnerable to Heartbleed, with american fuzzy lop. This can be done by calling ./config and then replacing gcc in the Makefile with afl-gcc. We also want to use Address Sanitizer; to do so we have to set the environment variable AFL_USE_ASAN to 1.
There are some issues when using Address Sanitizer with american fuzzy lop. Address Sanitizer needs a lot of virtual memory (many terabytes), while american fuzzy lop limits the amount of memory an application may use. It is not trivially possible to limit only the real amount of memory an application uses and not the virtual amount, therefore american fuzzy lop cannot handle this flawlessly. Different solutions for this problem have been proposed and are currently being developed. I usually go with the simplest solution: I just disable the memory limit of afl (parameter -m -1). This poses a small risk: A fuzzed input may lead an application to a state where it will use all available memory and thereby cause other applications on the same system to malfunction. Based on my experience this is very rare, so I usually just ignore that potential problem.
After having compiled OpenSSL 1.0.1f we have two files, libssl.a and libcrypto.a. These are static versions of OpenSSL and we will use them for our test application. We now also use afl-gcc to compile our test application:
AFL_USE_ASAN=1 afl-gcc selftls.c -o selftls libssl.a libcrypto.a -ldl
Now we run the application. It needs a dummy certificate; I have put one in the repo. To make things faster I'm using a 512 bit RSA key. This is completely insecure, but as we don't want any security here – we just want to find bugs – this is fine: A smaller key makes things faster. However, if you want to try fuzzing the latest OpenSSL development code you need to create a larger key, because it'll refuse to accept such small keys.
The application will give us six packet files, however the last two will be empty. We only want to fuzz the very first step of the handshake, so we're interested in the first packet. We will create an input directory for american fuzzy lop called in and place packet-1 in it. Then we can run our fuzzing job:
afl-fuzz -i in -o out -m -1 -t 5000 ./selftls 1 @@
We pass the input and output directories, disable the memory limit and increase the timeout value, because TLS handshakes are slower than common fuzzing tasks. On my test machine afl found the first crash around six hours later. Now we can manually pass our output to the test application and will get a stack trace from Address Sanitizer:
==2268==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000013748 at pc 0x7f228f5f0cfa bp 0x7fffe8dbd590 sp 0x7fffe8dbcd38
READ of size 32768 at 0x629000013748 thread T0
#0 0x7f228f5f0cf9 (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x2fcf9)
#1 0x43d075 in memcpy /usr/include/bits/string3.h:51
#2 0x43d075 in tls1_process_heartbeat /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/t1_lib.c:2586
#3 0x50e498 in ssl3_read_bytes /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_pkt.c:1092
#4 0x51895c in ssl3_get_message /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_both.c:457
#5 0x4ad90b in ssl3_get_client_hello /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:941
#6 0x4c831a in ssl3_accept /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:357
#7 0x412431 in main /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/selfs.c:85
#8 0x7f228f03ff9f in __libc_start_main (/lib64/libc.so.6+0x1ff9f)
#9 0x4252a1 (/data/openssl/openssl-handshake/openssl-1.0.1f-nobreakrng-afl-asan-fuzz/selfs+0x4252a1)
0x629000013748 is located 0 bytes to the right of 17736-byte region [0x62900000f200,0x629000013748)
allocated by thread T0 here:
#0 0x7f228f6186f7 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x576f7)
#1 0x57f026 in CRYPTO_malloc /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/crypto/mem.c:308
We can see here that the crash is a heap buffer overflow: An invalid read access of around 32 kilobytes in the function tls1_process_heartbeat(). It is the Heartbleed bug. We found it.
I want to mention a couple of things that I found out while trying this. I did some things that I thought were necessary, but later it turned out they weren't. After Heartbleed broke the news, a number of reports stated that Heartbleed was partly the fault of OpenSSL's memory management. A mail by Theo de Raadt claiming that OpenSSL has “exploit mitigation countermeasures” was widely quoted. I was aware of that, so I first tried to compile OpenSSL without its own memory management. That can be done by calling ./config with the option no-buf-freelist.
But it turns out that although OpenSSL uses its own memory management, that doesn't defeat Address Sanitizer. I could replicate my fuzzing finding with OpenSSL compiled with its default options. Although it does its own allocation management, it will still call the system's normal malloc() function for every new memory allocation. A blog post by Chris Rohlf digs into the details of the OpenSSL memory allocator.
Breaking random numbers for deterministic behaviour
When fuzzing the TLS handshake, american fuzzy lop will report a red number counting variable runs of the application. The reason for that is that a TLS handshake uses random numbers to create the master secret that's later used to derive cryptographic keys. The RSA functions also use random numbers. I wrote a patch for OpenSSL that deliberately breaks the random number generator and lets it only output ones (it didn't work with zeros, because OpenSSL will wait for non-zero random numbers in the RSA functions).
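My patch changed OpenSSL's internals directly, but the same effect can be sketched through OpenSSL's public RAND_METHOD interface (a sketch, not the actual patch):

/* fakerand.c: Replace OpenSSL's RNG with one that only outputs ones */
#include <openssl/rand.h>

static int fake_bytes(unsigned char *buf, int num)
{
    for (int i = 0; i < num; i++)
        buf[i] = 0x01; /* all ones - zeros would make the RSA code wait forever */
    return 1;
}
static void fake_seed(const void *buf, int num) { (void)buf; (void)num; }
static void fake_add(const void *buf, int num, double entropy)
{ (void)buf; (void)num; (void)entropy; }
static int fake_status(void) { return 1; } /* claim to be seeded */

static const RAND_METHOD fake_rand = {
    fake_seed,
    fake_bytes,
    NULL,       /* cleanup */
    fake_add,
    fake_bytes, /* pseudorand */
    fake_status
};

/* Call this once at startup, before any handshake happens */
void use_fake_rand(void)
{
    RAND_set_rand_method(&fake_rand);
}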
During my tests this had no noticeable impact on the time it took afl to find Heartbleed. Still, I think it is a good idea to remove nondeterministic behavior when fuzzing cryptographic applications. Later in the handshake timestamps are also used; this can be circumvented with libfaketime, but for the initial handshake processing that I fuzzed to find Heartbleed it doesn't matter.
You may now ask what the point of all this is. Of course we already know where Heartbleed is; it has been patched, fixes have been deployed and it is mostly history. It's been analyzed thoroughly.
The question has been asked whether Heartbleed could have been found by fuzzing. I'm confident to say the answer is yes. One thing I should mention here however: American fuzzy lop was already available back then, but it was barely known. It only received major attention later in 2014, after Michal Zalewski used it to find two variants of the Shellshock bug. Earlier versions of afl were much less handy to use; e.g. they didn't have 64 bit support out of the box. I remember that I failed to use an earlier version of afl with Address Sanitizer; it was only possible after a couple of issues were fixed. A lot of other things have been improved in afl, so at the time Heartbleed was found american fuzzy lop probably wasn't in a state that would have allowed finding it in an easy, straightforward way.
I think the takeaway message is this: We have powerful tools freely available that are capable of finding bugs like Heartbleed. We should use them and look for the other Heartbleeds still lingering in our software. Take a look at the Fuzzing Project if you're interested in further fuzzing work. There are beginner tutorials that I wrote with the idea in mind to show people that fuzzing is an easy way to find bugs and improve software quality.
I have already used my sample application to fuzz the latest OpenSSL code. Nothing has been found yet, but of course this could be further tweaked by trying different protocol versions, extensions and other variations of the handshake.
I also wrote a German article about this finding for the IT news webpage Golem.de.
I want to point out some feedback I got that I think is noteworthy.
On Twitter it was mentioned that Codenomicon actually found Heartbleed via fuzzing. There's a Youtube video from Codenomicon's Antti Karjalainen explaining the details. However, they did it quite differently: they built a protocol-specific fuzzer. The remarkable feature of afl is that it is very powerful without knowing anything specific about the protocol in use. It should also be noted that Heartbleed was found twice; the first finder was Neel Mehta from Google.
Kostya Serebryany mailed me that he was able to replicate my findings with his own fuzzer which is part of LLVM, and it was even faster.
In the comments Michele Spagnuolo mentions that by compiling OpenSSL with -DOPENSSL_TLS_SECURITY_LEVEL=0 one can use very short and insecure RSA keys even in the latest version. Of course this shouldn't be done in production, but it is helpful for fuzzing and other testing efforts.
Sunday, March 15. 2015
Just wanted to quickly announce two talks I'll give in the upcoming weeks: One at BSidesHN (Hannover, 20th March) about some findings related to PGP and keyservers and one at the Easterhegg (Braunschweig, 4th April) about the current state of TLS.
A look at the PGP ecosystem and its keys
PGP-based e-mail encryption is widely regarded as an important tool to provide confidential and secure communication. The PGP ecosystem consists of the OpenPGP standard, different implementations (mostly GnuPG and the original PGP) and keyservers.
The PGP keyservers operate on an add-only basis. That means keys can only be uploaded and never removed. We can use these keyservers as a tool to investigate potential problems in the cryptography of PGP-implementations. Similar projects regarding TLS and HTTPS have uncovered a large number of issues in the past.
The talk will present a tool to parse the data of PGP keyservers and put them into a database. It will then have a look at potential cryptographic problems. The tools used will be published under a free license after the talk.
A look at the PGP ecosystem through the key server data (background paper)
Some tales from TLS
The TLS protocol is one of the foundations of Internet security. In recent years it's been under attack: Various vulnerabilities, both in the protocol itself and in popular implementations, showed how fragile that foundation is.
On the other hand, new features allow using TLS in a much more secure way these days than ever before. Features like Certificate Transparency and HTTP Public Key Pinning allow us to avoid many of the security pitfalls of the Certificate Authority system.
Update: Slides and video available. Bonus: Contains rant about DNSSEC/DANE.
Slides PDF, LaTeX, Slideshare
Video recording, also on Youtube
Monday, February 23. 2015
tl;dr There is a piece of software called PrivDog. It totally breaks HTTPS security in a similar way as Superfish.
In case you haven't heard about it: in the past days an adware called Superfish made headlines. It was preinstalled on Lenovo laptops and it is bad: It totally breaks the security of HTTPS connections. The story became bigger when it became clear that a lot of other software packages were using the same technology by Komodia, with the same security risk.
What Superfish and similar tools do is intercept encrypted HTTPS traffic to insert advertising on webpages. They do so by breaking the HTTPS encryption with a Man-in-the-Middle attack, which is possible because they install their own certificate into the operating system.
A number of people gathered in a chatroom and we noted a thread on Hacker News where someone asked whether a tool called PrivDog is like Superfish. PrivDog's functionality is to replace advertising in web pages with its own advertising "from trusted sources". That by itself already sounds weird even without any security issues.
A quick analysis shows that it doesn't have the same flaw as Superfish, but it has another one which is arguably even bigger. While Superfish used the same certificate and key on all hosts, PrivDog recreates a key/cert on every installation. However, here comes the big flaw: PrivDog will intercept every certificate and replace it with one signed by its root key, and that includes certificates that weren't valid in the first place. It will turn your browser into one that just accepts every HTTPS certificate out there, whether it's been signed by a certificate authority or not. We're still trying to figure out the details, but it looks pretty bad. (With some trickery you can do something similar on Superfish/Komodia, too.)
There are some things that are completely weird: When one surfs to a webpage that has a self-signed certificate (really self-signed, not signed by an unknown CA), PrivDog adds another self-signed cert with 512 bit RSA into the root certificate store of Windows. All other certs get replaced by 1024 bit RSA certs signed by a locally created PrivDog CA.
US-CERT writes: "Adtrustmedia PrivDog is promoted by the Comodo Group, which is an organization that offers SSL certificates and authentication solutions." A variant of PrivDog that is not affected by this issue is shipped with products produced by Comodo (see below). This makes this case especially interesting because Comodo itself is a certificate authority (they had issues before). As ACLU technologist Christopher Soghoian points out on Twitter the founder of PrivDog is the CEO of Comodo. (See this blog post.)
We will try to collect information on this and other similar software in a Wiki on Github. Discussions also happen on irc.ringoflightning.net #kekmodia.
Thanks to Filippo, slipstream / raylee and others for all the analysis that has happened on this issue.
Update/Clarification: The dangerous TLS interception behaviour is part of the latest version of PrivDog 18.104.22.168, which can be downloaded from the PrivDog webpage. Comodo Internet Security bundles an earlier version of PrivDog that works with a browser extension, so it is not directly vulnerable to this threat. According to online sources PrivDog 22.214.171.124 was released in December 2014 and changed the TLS interception technology.
Update 2: PrivDog published an advisory.
Tuesday, November 4. 2014
The world of SSL/TLS Internet encryption is in trouble again. You may have heard that recently a new vulnerability called POODLE has been found in the ancient SSLv3 protocol. Shortly before that, another vulnerability called BERserk was found (it didn't receive the attention it deserved because it was published on the same day as Shellshock).
The latest SSL attack was called POODLE.
I think it is crucial to understand what led to these vulnerabilities. I find POODLE and BERserk so interesting because these two vulnerabilities were both unnecessary and could've been avoided by intelligent design choices. Okay, let's start by investigating what went wrong.
The mess with CBC
POODLE (Padding Oracle On Downgraded Legacy Encryption) is a weakness in the CBC block mode and the padding of the old SSL protocol. If you've followed previous stories about SSL/TLS vulnerabilities this shouldn't be news. There have been a whole number of CBC-related vulnerabilities, most notably the Padding oracle (2003), the BEAST attack (2011) and the Lucky Thirteen attack (2013) (Lucky Thirteen is kind of my favorite, because it was already more or less mentioned in the TLS 1.2 standard). The POODLE attack builds on ideas already used in previous attacks.
CBC is a so-called block mode. For now it should be enough to understand that we have two kinds of ciphers we use to authenticate and encrypt connections – block ciphers and stream ciphers. Block ciphers need a block mode to operate. There's nothing necessarily wrong with CBC itself; it's the way CBC is used in SSL/TLS that causes problems. There are two weaknesses in it: Early versions (before TLS 1.1) use a so-called implicit Initialization Vector (IV), and they use a method called MAC-then-Encrypt (used up until the very latest TLS 1.2, but there's a new extension to fix it) which turned out to be quite fragile when it comes to security. The CBC details would be a topic of their own, so I won't go into them now. The long-term goal should be to get rid of all these (old-style) CBC modes, however that won't be possible for quite some time due to compatibility reasons. As most of these problems have been known since 2003, it's about time.
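For reference, this is what CBC looks like in formulas, with $E_K$/$D_K$ the block cipher, $P_i$ the plaintext blocks and $C_i$ the ciphertext blocks:

\[ C_i = E_K(P_i \oplus C_{i-1}), \qquad P_i = D_K(C_i) \oplus C_{i-1}, \qquad C_0 = IV \]

With an implicit IV, $C_0$ is simply the last ciphertext block of the previous record, so an attacker who can watch the connection can predict it; that predictability is exactly what BEAST exploited.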
The evil Protocol Dance
The interesting question with POODLE is: Why does a security issue in an ancient protocol like SSLv3 bother us at all? SSL was developed by Netscape in the mid 90s; it has two public versions: SSLv2 and SSLv3. In 1999 (15 years ago) the old SSL was deprecated and replaced with TLS 1.0 (RFC 2246, https://tools.ietf.org/html/rfc2246), standardized by the IETF. Now people still used SSLv3 up until very recently, mostly for compatibility reasons. But even that in itself isn't the problem. SSL/TLS has a mechanism to safely choose the best protocol available. In a nutshell it works like this:
a) A client (e. g. a browser) connects to a server and may say something like "I want to connect with TLS 1.2"
b) The server may answer "No, sorry, I don't understand TLS 1.2, can you please connect with TLS 1.0?"
c) The client says "Ok, let's connect with TLS 1.0"
The point here is: Even if both server and client support the ancient SSLv3, they'd usually not use it. But this is the idealized world of standards. Now welcome to the real world, where things like this happen:
a) A client (e. g. a browser) connects to a server and may say something like "I want to connect with TLS 1.2"
b) The server thinks "Oh, TLS 1.2, never heard of that. What should I do? I better say nothing at all..."
c) The browser thinks "Ok, server doesn't answer, maybe we should try something else. Hey, server, I want to connect with TLS 1.1"
d) The browser will retry all SSL versions down to SSLv3 till it can connect.
So here's our problem: There are broken servers out there that don't answer at all if they see a connection attempt with an unknown protocol. The well-known SSL test by Qualys checks for this behaviour and calls it "Protocol intolerance" (but "Protocol brokenness" would be more precise). On connection failures the browsers will try all old protocols they know until they can connect. This behaviour is now known as the "Protocol Dance" - and it causes all kinds of problems.
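Expressed as code, the fallback logic looks roughly like this (a schematic, self-contained sketch, not actual browser code):

/* Schematic of the "Protocol Dance": on *any* connection failure,
 * retry with the next-older protocol version, down to SSLv3. */
#include <stdbool.h>
#include <stdio.h>

/* hypothetical helper: true if a handshake with this version worked */
static bool try_handshake(const char *host, const char *version)
{
    printf("trying %s with %s\n", host, version);
    return false;   /* simulate a server that never answers */
}

int main(void)
{
    const char *versions[] = { "TLS 1.2", "TLS 1.1", "TLS 1.0", "SSLv3" };
    for (int i = 0; i < 4; i++)
        if (try_handshake("example.com", versions[i]))
            break;  /* an attacker can force the loop down to SSLv3
                       simply by blocking the earlier attempts */
    return 0;
}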
The Protocol Dance is a Dance with the Devil.
I first encountered the Protocol Dance back in 2008. Back then I already used a technology called SNI (Server Name Indication) that allows having multiple websites with multiple certificates on a single IP address. I regularly got complaints from people who saw the wrong certificates on those SNI webpages. A bug report to Firefox and some analysis revealed the reason: The protocol downgrades don't just happen when servers don't answer to new protocol requests, they can also happen on faulty or weak internet connections. SSLv3 does not support SNI, so when a downgrade to SSLv3 happens you get the wrong certificate. This was quite frustrating: A compatibility feature that was purely there to support broken hardware caused my completely legit setup to fail every now and then.
But the more severe problem is this: The Protocol Dance will allow an attacker to force downgrades to older (less secure) protocols. He just has to stop connection attempts with the more secure protocols. And this is why the POODLE attack was an issue after all: The problem was not backwards compatibility. The problem was attacker-controlled backwards compatibility.
The idea that the Protocol Dance might be a security issue wasn't completely new either. At the Black Hat conference this year Antoine Delignat-Lavaud presented a variant of an attack he calls "Virtual Host Confusion“ where he relied on downgrading connections to force SSLv3 connections.
"Whoever breaks it first“ - principle
The Protocol Dance is an example for something that I feel is an unwritten rule of browser development today: Browser vendors don't want things to break – even if the breakage is the fault of someone else. So they add all kinds of compatibility technologies that are purely there to support broken hardware. The idea is: When someone introduced broken hardware at some point – and it worked because the brokenness wasn't triggered at that point – the broken stuff is allowed to stay and all others have to deal with it.
To avoid the Protocol Dance a new feature is now on its way: It's called SCSV and the idea is that the Protocol Dance is stopped if both the server and the client support this new protocol feature. I'm extremely uncomfortable with that solution because it just adds another layer of duct tape and increases the complexity of TLS, which is already much too complex.
There's another recent example which is very similar: At some point people found out that BIG-IP load balancers by the company F5 had trouble with TLS connection attempts larger than 255 bytes. However, it was later revealed that connection attempts bigger than 512 bytes succeed again. So a padding extension was invented, and it's now widespread behaviour of TLS implementations to pad handshakes so their size never falls between 256 and 511 bytes. To make matters completely insane: It was later found out that there is other broken hardware – SMTP servers by Ironport – that breaks when the handshake is larger than 511 bytes.
I have a principle when it comes to fixing things: Fix it where it's broken. But the browser world works differently. It works with the "whoever breaks it first defines the new standard of brokenness" principle. This is partly due to an unhealthy competition between browsers. Unfortunately they often don't compete very well on the security level. What you'll constantly hear is that browsers can't break any webpages because that will lead to people moving to other browsers.
I'm not sure if I entirely buy this kind of reasoning. For a couple of months the support for the ftp protocol in Chrome / Chromium was broken. I'm no fan of plain, unencrypted ftp and its only legit use case – unauthenticated file download – can just as easily be fulfilled with unencrypted http, but there are a number of live ftp servers that implement a legit and working protocol. I like Chromium and it's my everyday browser, but for a while the broken ftp support was the most prevalent reason for me to start Firefox. This little episode makes it hard for me to believe that they can't break connections to some (broken) ancient SSL servers. (I just noted that the very latest version of Chromium has fixed ftp support again.)
BERserk, small exponents and PKCS #1 1.5
Okay, now let's talk about the other recent TLS vulnerability: BERserk. Antoine Delignat-Lavaud and researchers at Intel independently found this vulnerability, which affected NSS (and thus Chrome and Firefox), CyaSSL, some unreleased development code of OpenSSL and maybe others.
We have a problem with weak keys.
BERserk is actually a variant of a quite old vulnerability (you may begin to see a pattern here): the Bleichenbacher attack on RSA signatures, first presented at Crypto 2006. Now here things get confusing, because the cryptographer Daniel Bleichenbacher found two independent vulnerabilities in RSA: one in RSA encryption in 1998 and one in RSA signatures in 2006. For convenience I'll call them BB98 (encryption) and BB06 (signatures). Both of these vulnerabilities expose faulty implementations of the old RSA standard PKCS #1 1.5. And both are what I like to call "zombie vulnerabilities": They keep coming back, no matter how often you try to fix them. In April the BB98 vulnerability was re-discovered in the code of Java, and it was silently fixed in OpenSSL some time last year.
But BERserk is about the other one: BB06. BERserk exposes the fact that an algorithm identifier for the hash function is embedded inside the signed data of the RSA function, and it's encoded with BER. BER is part of ASN.1. I could tell horror stories about ASN.1, but I'll spare you that for now; maybe this is a topic for another blog entry. It's enough to know that it's a complicated format, and that is what bites us here: With some trickery in the BER encoding one can smuggle further data into the RSA function - and in certain situations this allows creating forged signatures.
One thing should be made clear: Both the original BB06 attack and BERserk are flaws in the implementation of PKCS #1 1.5. If you do everything correctly then you're fine. These attacks exploit the relatively simple structure of the old PKCS standard and they only work when RSA is used with a very small exponent. RSA public keys consist of two large numbers: the modulus N (which is a product of two large primes) and the exponent.
In his presentation at Crypto 2006 Daniel Bleichenbacher already proposed what would have prevented this attack: Just don't use RSA keys with very small exponents like three. This advice also went into various recommendations (e. g. by NIST) and today almost everyone uses 65537 (the reason for this number is that it is 2^16+1; with only two bits set, calculations with it are reasonably fast).
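To get an intuition why e=3 is dangerous while 65537 is not, here's a rough sketch of the attack idea in simplified notation (P is the expected padding-plus-hash prefix, k the number of trailing bits a sloppy verifier leaves unchecked):

\[ \text{A sloppy verifier only checks that } s^3 \bmod N \text{ starts with } P\text{, so any } s \text{ with } P \cdot 2^k \le s^3 < (P+1) \cdot 2^k \text{ passes.} \]
\[ \text{Because } s^3 < N \text{ no modular reduction happens, and } s = \lceil (P \cdot 2^k)^{1/3} \rceil \text{ can be found with a plain integer cube root; for realistic parameters the interval is large enough to contain a cube.} \]
\[ \text{With } e = 65537 \text{ this fails: } s^{65537} < N \text{ would force } s < N^{1/65537} < 2 \text{ for a 2048 bit } N. \]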
There's just one problem: A small number of keys that use the exponent e=3 are still out there. And six of them are used by root certificates installed in every browser. These root certificates are the trust anchors of TLS (which in itself is a problem, but that's another story). Here's our problem: As long as there is one single root certificate with e=3, such an attack lets you create as many fake certificates as you want. If we had deprecated e=3 keys, BERserk would've been mostly a non-issue.
There is one more aspect of this story: What's this PKCS #1 1.5 thing anyway? It's an old standard for RSA encryption and signatures. I want to quote Adam Langley on the PKCS standards here: "In a modern light, they are all completely terrible. If you wanted something that was plausible enough to be widely implemented but complex enough to ensure that cryptography would forever be hamstrung by implementation bugs, you would be hard pressed to do better."
Now there's a successor to the PKCS #1 1.5 standard: PKCS #1 2.1, which is based on technologies called PSS (Probabilistic Signature Scheme) and OAEP (Optimal Asymmetric Encryption Padding). It's from 2002 and in many aspects it's much better. I am kind of a fan here, because I wrote my thesis about this. There's just one problem: Although it was already standardized in 2002, people still prefer to use the much weaker old PKCS #1 1.5. TLS doesn't have any way to use the newer PKCS #1 2.1, and even the current drafts for TLS 1.3 stick to the older - and weaker - variant.
What to do
I would take bets that POODLE wasn't the last TLS/CBC-issue we saw and that BERserk wasn't the last variant of the BB06-attack. Basically, I think there are a number of things TLS implementers could do to prevent further similar attacks:
* The Protocol Dance should die. Don't put another layer of duct tape around it (SCSV), just get rid of it. It will break a small number of already broken devices, but that is a reasonable price for avoiding the next protocol downgrade attack scenario. Backwards compatibility shouldn't compromise security.
* More generally, I think the working around of broken devices has to stop. Replace the "whoever broke it first" paradigm with a "fix it where it's broken" paradigm. That also means I think the padding extension should be scrapped.
* Keys with weak choices need to be deprecated at some point. In a long process browsers removed most certificates with short 1024 bit keys. They're working hard on deprecating signatures with the weak SHA1 algorithm. I think e=3 RSA keys should be next on the list for deprecation.
* At some point we should deprecate the weak CBC modes. This is probably the trickiest part, because up until very recently TLS 1.0 was all that most major browsers supported. The only way to avoid them is either using the GCM mode of TLS 1.2 (most browsers just got support for that in recent months) or using a very new extension that's rarely used at all today.
* If we have better technologies we should start using them. PKCS #1 2.1 is clearly superior to PKCS #1 1.5, at least if new standards get written people should switch to it.
Update: I just read that Mozilla Firefox devs disabled the protocol dance in their latest nightly build. Let's hope others follow.
Saturday, July 12. 2014
Yesterday the LibreSSL project released the first portable version that works on Linux. LibreSSL is a fork of OpenSSL and was created by the OpenBSD team in the aftermath of the Heartbleed bug.
Yesterday and today I played around with it on Gentoo Linux. I was able to replace my system's OpenSSL completely with LibreSSL and with few exceptions was able to successfully rebuild all packages using OpenSSL.
After getting this running on my own system I installed it on a test server. The Webpage tlsfun.de runs on that server. The functionality changes are limited, the only thing visible from the outside is the support for the experimental, not yet standardized ChaCha20-Poly1305 cipher suites, which is a nice thing.
A warning ahead: This is experimental, in no way stable or supported and if you try any of this you do it at your own risk. Please report any bugs you have with my overlay to me or leave a comment and don't disturb anyone else (from Gentoo or LibreSSL) with it. If you want to try it, you can get a portage overlay in a subversion repository. You can check it out with one of these commands:
svn co https://svn.hboeck.de/libressl-overlay/
git clone https://github.com/gentoo/libressl.git
This is what I had to do to get things running:
First of all the Gentoo tree contains a lot of packages that directly depend on openssl, so I couldn't just replace that. The correct solution to handle such issues would be to create a virtual package and change all packages depending directly on openssl to depend on the virtual. This is already discussed in the appropriate Gentoo bug, but this would mean patching hundreds of packages so I skipped it and worked around it by leaving a fake openssl package in place that itself depends on libressl.
LibreSSL deprecates some APIs from OpenSSL. The first thing that stopped me was that various programs use the functions RAND_egd() and RAND_egd_bytes(). I didn't know until yesterday what egd is. It stands for Entropy Gathering Daemon and is a tool written in perl meant to replace the functionality of /dev/(u)random on non-Linux-systems. The LibreSSL developers consider it insecure, and after having read what it is I have to agree. However, the removal of those functions causes many packages not to build, among them wget, python and ruby. My workaround was to add two dummy functions that just return -1, which is the error code if the Entropy Gathering Daemon is not available. So the API still behaves as expected. I also posted the patch upstream, but the LibreSSL devs don't like it. So in the long term it's probably better to fix applications to stop trying to use egd, but for now these dummy functions make it easier for me to build my system.
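Schematically the dummy functions look like this (a sketch of the idea, not the exact patch):

/* Stubs so applications calling the removed egd functions still
 * build; -1 is the documented "EGD not available" error code. */
int RAND_egd(const char *path)
{
    (void)path;
    return -1;
}

int RAND_egd_bytes(const char *path, int bytes)
{
    (void)path;
    (void)bytes;
    return -1;
}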
The second issue popping up was that the libcrypto.so from libressl contains an undefined main() function symbol which causes linking problems with a couple of applications (subversion, xorg-server, hexchat). According to upstream this undefined symbol is intended and most likely these are bugs in the applications having linking problems. However, for now it was easier for me to patch the symbol out instead of fixing all the apps. Like with the egd issue, in the long term fixing the applications is the better solution.
The third issue was that LibreSSL doesn't ship pkg-config (.pc) files, some apps use them to get the correct compilation flags. I grabbed the ones from openssl and adjusted them accordingly.
OpenSSH
This was the most interesting issue of them all.
To understand this you have to understand how both LibreSSL and OpenSSH are developed. They are both from OpenBSD and they use some functions that are only available there. To allow them to be built on other systems they release portable versions which ship the missing OpenBSD-only functions. One of them is arc4random().
Both LibreSSL and OpenSSH ship their compatibility version of arc4random(). The one from OpenSSH calls RAND_bytes(), which is a function from OpenSSL. The RAND_bytes() function from LibreSSL however calls arc4random(). Due to the linking order OpenSSH uses its own arc4random(). So what we have here is a nice recursion. arc4random() and RAND_bytes() try to call each other. The result is a segfault.
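Here is a self-contained toy version of the cycle (heavily simplified, not the actual LibreSSL/OpenSSH sources):

/* Toy model of the link-order recursion described above. */
#include <stdio.h>

unsigned int arc4random(void);

/* "LibreSSL's" RAND_bytes(): gets its randomness from arc4random() */
int RAND_bytes(unsigned char *buf, int num)
{
    for (int i = 0; i < num; i++)
        buf[i] = (unsigned char)arc4random(); /* calls the OpenSSH copy */
    return 1;
}

/* "OpenSSH's" compat arc4random(): gets its randomness from RAND_bytes() */
unsigned int arc4random(void)
{
    unsigned char b[4];
    RAND_bytes(b, sizeof(b));   /* ...which calls right back in here */
    return b[0];
}

int main(void)
{
    /* running this crashes with a stack overflow, just like the real thing */
    printf("%u\n", arc4random());
    return 0;
}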
I fixed it by using the LibreSSL arc4random.c file for OpenSSH. I had to copy another function called arc4random_stir() from OpenSSH's arc4random.c and the header file thread_private.h. Surprisingly, this seems to work flawlessly.
Another issue popped up in the package that contains the perl bindings for openssl: a check for the openssl version string expected the name OpenSSL and a version number with three numbers and a letter (like 1.0.1h). LibreSSL prints the version 2.0. I just hardcoded the OpenSSL version number, which is not a real fix, but it works for now.
SpamAssassin's code for spamc requires SSLv2 functions to be available. SSLv2 is heavily insecure and should not be used at all and therefore the LibreSSL devs have removed all SSLv2 function calls. Luckily, Debian had a patch to remove SSLv2 that I could use.
libesmtp / gwenhywfar
Some DES-related functions (DES is the old Data Encryption Standard) in OpenSSL are available in two forms: With uppercase DES_ and with lowercase des_. I can only guess that the des_ variants are for backwards compatibility with some very old versions of OpenSSL. According to the docs the DES_ variants should be used. LibreSSL has removed the des_ variants.
For gwenhywfar I wrote a small patch and sent it upstream. For libesmtp all the code was in ntlm. After reading that ntlm is an ancient, proprietary Microsoft authentication protocol I decided that I don't need that anyway so I just added --disable-ntlm to the ebuild.
In Dovecot two issues popped up. LibreSSL removed the SSL Compression functionality (which is good, because since the CRIME attack we know it's not secure). Dovecot's configure script checks for it, but the check doesn't work. It checks for a function that LibreSSL keeps as a stub. For now I just disabled the check in the configure script. The solution is probably to remove all remaining stub functions. The configure script could probably also be changed to work in any case.
The second issue was that the Dovecot code has some #ifdef clauses that check the openssl version number for the ECDH auto functionality that has been added in OpenSSL 1.0.2 beta versions. As the LibreSSL version number 2.0 is higher than 1.0.2 it thinks it is newer and tries to enable it, but the code is not present in LibreSSL. I changed the #ifdefs to check for the actual functionality by checking a constant defined by the ECDH auto code.
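The change, schematically (example identifiers, not the literal Dovecot patch): test for a macro that only exists when the ECDH auto code is present, instead of comparing version numbers.

#include <openssl/ssl.h>

static void setup_ecdh(SSL_CTX *ssl_ctx)
{
#ifdef SSL_CTRL_SET_ECDH_AUTO   /* defined only when ECDH auto exists */
        SSL_CTX_set_ecdh_auto(ssl_ctx, 1);
#else
        (void)ssl_ctx;          /* older OpenSSL/LibreSSL: nothing to do */
#endif
}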
The Apache httpd build complained about a missing ENGINE_CTRL_CHIL_SET_FORKCHECK. I have no idea what it does, but I found a patch to fix the issue, so I didn't investigate further.
Someone else tried to get things running on Sabotage Linux.
Update: I've abandoned my own libressl overlay, a LibreSSL overlay by various Gentoo developers is now maintained at GitHub.
Wednesday, June 18. 2014
Today I had a little twitter conversation which made me think about the responsibilities a science journalist has. It all started with a quote from Ivan Oransky (the editor of Retraction Watch), who said reporting on a study without reading it is 'journalist malpractice'. The source of this is another person who probably just heard him say that, so I'm not sure what his exact words were.
Admittedly my first thought was: "He is right, too many journalists report about things they don't understand." My second thought was: "If he is right then I am probably guilty of 'journalist malpractice'." So I gave it a second thought, and now I probably don't agree with the statement any more.
I had a quick look at articles I wrote in the past and identified the last ten that were more or less coverage of a scientific piece of work. I have marked the ones I actually read with a [Y] and the ones I didn't read with a [N]. I've linked the appropriate scientific works and my articles (all in German). I must admit that I defined "read" widely, meaning that I haven't necessarily read the whole study/article in detail; I sometimes just tried to parse the parts important to me.
Now the first thing that comes to mind is that I seem to have become lazier about reading studies recently. I hope this isn't the case and I honestly think it's mostly coincidence. Now let's get into some details: The first example (the Turing Test) is interesting because it seems there is no scientific publication at all, just a press release. This probably tells you something about the quality of that "research", but while I read the press release I haven't even bothered to check if there is a scientific publication I could read.
The second example is more interesting. I understand enough to know what a "quasi-polynomial algorithm for discrete logarithm in finite fields of small characteristic" actually is and I think I also understand what it means, but there's just no way I could understand the paper itself. This is complex mathematics. I seriously doubt that any journalist who covered this work actually read it. If anyone did, I'd like to meet that person. I'm also very sure that the people who wrote the press release overselling this research have neither read this paper nor understood its implications.
I think this example gets to the point of why I would disagree with the very general statement that a journalist should've read every scientific piece he writes about: It's sometimes so specialized that it's basically impossible. And I don't think this is an out-of-the-ordinary example. Just think about the Higgs Boson: Certainly this is something we want journalists to write about. But I'm pretty sure there are very few - if any - journalists who are able to read the scientific publications that are the basis of this discovery.
Some quick notes on the others: Number 4 was part of a 200-page thesis and the press release was already pretty detailed and technical; I think it was legitimate not to read the original source in that case. Number 5 is somewhat similar to 2, because it is about an algorithm that involves complex math. Number 8 is not really a scientific paper, it is merely a news item on the Nature webpage. In the above list, the only case where I think maybe I should've read the scientific paper and didn't is the Cochrane review on Tamiflu.
Conclusion: Don't get me wrong. I certainly welcome the idea that science journalists should have a look into the original scientific papers they write about more often - and this doesn't exclude myself. However, as shown above I doubt that this works in all cases.
Sunday, June 15. 2014
I recently held a workshop about cryptography for web developers at the company Internations. I am publishing the slides here.
Part 1: Crypto and Web [PDF] [LaTeX], [Slideshare]
Part 2: How broken is TLS? [PDF] [LaTeX], [Slideshare]
Part 3: Don't do this yourself [PDF] [LaTeX], [Slideshare]
Part 4: Hashing, Tokens, Randomness [PDF] [LaTeX], [Slideshare]
Part 5: Don't believe the Crypto Hype [PDF] [LaTeX] [Slideshare]
Part 2 is the same talk I recently gave at the Easterhegg conference about TLS.
Friday, June 6. 2014
I recently switched my personal web page and my blog to deliver content exclusively encrypted via HTTPS. I want to take this opportunity to give some facts about enabling TLS encryption by default and problems you may face.
First of all the non-problems: Enabling HTTPS by default is almost never a significant performance problem. If people tell me that they can not possibly enable HTTPS due to performance reasons the first thing I ask is if they believe this or if they have real benchmark data showing this. If you don't believe me on that, I can quote Adam Langley from Google here: "In January this year (2010), Gmail switched to using HTTPS for everything by default. Previously it had been introduced as an option, but now all of our users use HTTPS to secure their email between their browsers and Google, all the time. In order to do this we had to deploy no additional machines and no special hardware. On our production frontend machines, SSL/TLS accounts for less than 1% of the CPU load, less than 10KB of memory per connection and less than 2% of network overhead."
Enabling HTTPS may cause a number of compatibility issues you may not instantly think about. First of all, we know that IPs in the IPv4 space are limited and expensive these days, so many people probably can't afford a distinct IP for their web page. The solution to that is a TLS extension called SNI (Server Name Indication) which allows having different certificates for different domain names on the same IP. It works in all major browsers and has been working for quite some time. The only major browser you'll face these days that doesn't support SNI is the Android 2.x browser.
There are some subtle issues with SNI. One is that browsers have fallback modes if they cannot connect via TLS, and that may lead to a connection downgrade to SSLv3. That ancient protocol doesn't support extensions, and thus no SNI. So you may see irregular certificate errors if you are on a bad connection. A solution to that on the server side is to just disable SSLv3; it will make SNI much more reliable.
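At the OpenSSL API level, disabling SSLv3 is a one-line option; web servers expose the same switch in their configuration. A minimal sketch:

#include <openssl/ssl.h>

/* create a server context that accepts TLS 1.0 and up, never SSLv2/SSLv3 */
SSL_CTX *make_server_ctx(void)
{
    SSL_CTX *ctx = SSL_CTX_new(SSLv23_server_method());
    SSL_CTX_set_options(ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
    return ctx;
}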
I don't really have a clear picture how many browsers will fail with SNI. There are probably a number of embedded devices out there like smart TVs with browsers or things alike that have problems. If you have any experiences feel free to post them in the comments.
The first issue I only noticed after I switched to HTTPS: I had an application called RSS Graffiti set up to automatically post all articles I write to a facebook fan page. After changing to HTTPS only it silently stopped working. Re-adding my feed didn't work. I now found a similar service called dlvr.it that I now use to post my RSS feed to facebook. I can only assume that this is a glimpse of a much bigger problem: There are probably tons of applications and online services out there not prepared for an encrypted Internet. If we want more people to deploy encryption by default we need to find these issues, document them and hopefully put enough pressure on their developers to fix them.
Another yet unfixed issue is the Yandex Bot. Yandex is a search engine and although you may never have heard of it, it's probably one of the few companies in this area that can claim to be a serious competitor to Google. The reason you may not know it is that it mostly operates in the Russian language. Depending on who your page visitors are, this may matter more or less.
The Yandex Bot speaks SSL but according to the Qualys SSL test it only supports the ancient SSLv3. So you have a choice between three possibilities: Don't enable HTTPS by default, enable HTTPS with a shitty configuration supporting ancient technology that will cause trouble for SNI or enable HTTPS with a sane configuration and get no traffic from the leading Russian search engine. None of them sounds very good to me.
Tuesday, April 29. 2014
A number of people seem to be confused how to correctly install certificate chains for TLS servers. This happens quite often on HTTPS sites and to avoid having to explain things again and again I thought I'd write up something so I can refer to it. A few days ago flattr.com had a missing certificate chain (fixed now after I reported it) and various pages from the Chaos Computer Club have no certificate chain (not the main page, but several subdomains like events.ccc.de and frab.cccv.de). I've tried countless times to tell someone, but the problem persists. Maybe someone in charge will read this and fix it.
Web browsers ship a list of certificate authorities (CAs) that are allowed to issue certificates for HTTPS websites. The whole system is inherently problematic, but right now that's not the point I want to talk about. Most of the time, people don't get their certificate from one of the root CAs but instead from a subordinate CA. Every CA is allowed to have unlimited numbers of sub CAs.
The correct way of delivering a certificate issued by a sub CA is to deliver both the host certificate and the certificate of the sub CA. This is necessary so the browser can check the complete chain from the root to the host. For example, if you buy your certificate from RapidSSL then the RapidSSL cert is not in the browser. However, the RapidSSL certificate is signed by GeoTrust and that is in your browser. So if your HTTPS website delivers both its own certificate from RapidSSL and the RapidSSL certificate, the browser can validate the whole chain.
However, and here comes the tricky part: If you forget to deliver the chain certificate you often won't notice. The reason is that browsers cache chain certificates. In our example above if a user first visits a website with a certificate from RapidSSL and the correct chain the browser will already know the RapidSSL certificate. If the user then surfs to a page where the chain is missing the browser will still consider the certificate as valid. Such certificates with missing chain have been called transvalid, I think the term was first used by the EFF for their SSL Observatory.
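If you want to see exactly which chain a server delivers, a few lines with the OpenSSL API will print it (a minimal sketch, error handling mostly omitted; the hostname is an example; build with -lssl -lcrypto):

#include <stdio.h>
#include <openssl/bio.h>
#include <openssl/ssl.h>
#include <openssl/x509.h>

int main(void)
{
    SSL_library_init();
    SSL_load_error_strings();

    SSL_CTX *ctx = SSL_CTX_new(SSLv23_client_method());
    BIO *bio = BIO_new_ssl_connect(ctx);
    SSL *ssl = NULL;
    BIO_get_ssl(bio, &ssl);
    SSL_set_tlsext_host_name(ssl, "www.example.com");  /* SNI */
    BIO_set_conn_hostname(bio, "www.example.com:443");

    if (BIO_do_connect(bio) <= 0 || BIO_do_handshake(bio) <= 0) {
        fprintf(stderr, "connection failed\n");
        return 1;
    }

    /* the chain exactly as the server sent it; on the client side
     * this includes the server certificate itself */
    STACK_OF(X509) *chain = SSL_get_peer_cert_chain(ssl);
    for (int i = 0; i < sk_X509_num(chain); i++) {
        char name[256];
        X509_NAME_oneline(X509_get_subject_name(sk_X509_value(chain, i)),
                          name, sizeof(name));
        printf("%d: %s\n", i, name);
    }

    BIO_free_all(bio);
    SSL_CTX_free(ctx);
    return 0;
}

If only the host certificate shows up in the output, the chain certificate is missing and browsers without a cached copy of the sub CA will fail.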
Now the CCC uses certificates from CAcert.org. Two more issues pop up here that make things even more complicated. First of all, the root certificate of CAcert is not in browsers, users have to manually import it. But CAcert offers both their root (Class 1) and sub (Class 3) certificate on the same webpage and doesn't really tell users that they usually only have to import the root. So everyone who imports both certificates will see transvalid CAcert certificates as valid. The second issue that pops up is that browsers sometimes do weird things when it comes to certificate error messages. I have no idea why exactly this is happening, but if you have the CAcert root installed and use Chromium to surf to a page with a transvalid CAcert certificate, it'll warn you about a weak signature algorithm. This doesn't make any sense, I can only assume that it has something to do with the fact that the CAcert root is self-signed with MD5 (which isn't a security issue, because self-signatures don't really have any meaning, they're just there because X.509 doesn't allow certificates without a signature).
Chromium with bogus error message on a transvalid certificate
So how can you check if you have a transvalid certificate? One way is to use a fresh browser installation without anything cached. If you then surf to a page with a transvalid certificate, you'll get an error message (however, as we've just seen, not necessarily a meaningful one). An easier way is to use the SSL Test from Qualys. It has a line "Chain issues": if it shows "None" you're fine. If it shows "Incomplete" then your certificate is most likely transvalid. If it shows anything else you have other things to look after (a common issue is that people unnecessarily send the root certificate, which doesn't cause problems but may make things slower). The Qualys test will tell you all kinds of other things about your TLS configuration. If it tells you something is insecure you should probably look after that, too.