SHA-1 chosen prefix collisions and DNSSEC

2020-01-09 - News - Tony Finch

Thanks to Viktor Dukhovni for helpful discussions about some of the details that went into this post.

On the 7th January, a new, more flexible and efficient collision attack against SHA-1 was announced: SHA-1 is a shambles. SHA-1 is deprecated but still used in DNSSEC, and this collision attack means that some attacks against DNSSEC are now merely logistically challenging rather than being cryptographically infeasible.

As a consequence, anyone who is using a SHA-1 DNSKEY algorithm (algorithm numbers 7 or less) should upgrade. The recommended algorithms are 13 (ECDSAP256SHA256) or 8 (RSASHA256, with 2048 bit keys).

Update: I have written a follow-up note about SHA-1 and DNSSEC validation

The SHAmbles attack

In 2017 the SHAttered attack demonstrated the first SHA-1 collision. This was not an immediate disaster for DNSSEC because SHAttered required the start of the input to have a special structure that causes a collision, and there are not many input formats that are malleable enough to accommodate the attack. SHAttered made PDF files with colliding SHA-1 hashes, but DNSSEC avoided the worst.

The SHAmbles attack is a chosen prefix collision, so an attacker can construct two input prefixes with complete freedom, and calculate suffixes that will make their SHA-1 hashes collide.

For an attack against DNSSEC, the two prefixes will be a pair of RRsets: one consisting of superficially benign trojan horse records with an owner name that is under the control of the attacker; and the other being attack records with an owner name that the attacker is targeting. Something like:

trojan-horse.example.  3600 IN TXT ...

attack-target.example. 3600 IN TXT ...

The signature metadata, owner names and other DNS rubric appear at the start of the input to the signature algorithm, and the suffixes that make them collide need to be smuggled into the right-hand-side of two DNS records.
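As a rough illustration of that layout, the following Python sketch assembles the octets that an RRSIG signature covers, following the construction in RFC 4034 section 3.1.8.1 (the RRSIG RDATA without the signature, then each record in canonical form). The key tag, timestamps and TXT contents are made-up placeholders, the function names are my own, and wildcard owner names are not handled.

```python
import hashlib
import struct

TYPE_TXT, CLASS_IN = 16, 1

def wire_name(name: str) -> bytes:
    """Canonical wire form: lower-case labels, no compression."""
    labels = [l for l in name.lower().split(".") if l]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"

def txt_rdata(*strings: bytes) -> bytes:
    """TXT RDATA: each substring is a length octet followed by its data."""
    return b"".join(bytes([len(s)]) + s for s in strings)

def signed_data(owner, ttl, rdatas, algorithm, expiration, inception, key_tag, signer):
    """Octets covered by an RRSIG over a TXT RRset (no wildcard support)."""
    label_count = len([l for l in owner.split(".") if l])
    # RRSIG RDATA minus the signature: 18 fixed octets, then the signer name.
    header = struct.pack("!HBBIIIH", TYPE_TXT, algorithm, label_count,
                         ttl, expiration, inception, key_tag) + wire_name(signer)
    records = b"".join(
        wire_name(owner) + struct.pack("!HHIH", TYPE_TXT, CLASS_IN, ttl, len(r)) + r
        for r in sorted(rdatas)  # canonical RR order is by RDATA octets
    )
    return header + records

# Placeholder signing parameters (algorithm 5 is RSASHA1); all values are invented.
params = dict(algorithm=5, expiration=1581163200, inception=1578567600,
              key_tag=12345, signer="example.")
trojan = signed_data("trojan-horse.example.", 3600, [txt_rdata(b"benign")], **params)
attack = signed_data("attack-target.example.", 3600, [txt_rdata(b"malicious")], **params)

header_len = 18 + len(wire_name("example."))
print(trojan[:header_len] == attack[:header_len])  # the RRSIG metadata is shared
print(hashlib.sha1(trojan).hexdigest())
print(hashlib.sha1(attack).hexdigest())
```

The two inputs share the RRSIG metadata and first diverge at the owner name, so everything from there onward is part of the two chosen prefixes that the collision blocks have to reconcile.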

Attacker capabilities

For the purpose of this analysis, we assume that the attacker has full control over the network, so they are able to intercept and spoof traffic at will. In this situation, DNSSEC should be able to detect and prevent spoofed DNS records, so the attacker can only deny service without being able to successfully alter the DNS.

(The argument is that pwning the network should look easy compared to the difficulty of breaking the cryptography.)

Our attacker doesn't have access to any endpoints or people in their target, except for a small toehold. They want to use DNSSEC to expand their toehold to compromise other systems, for example an intranet site that accepts passwords from staff in the target organization.

Targeting the DNS

One way the attacker can get interesting effects from spoofing the DNS is to make some third party believe that the attacker is in control of a domain name that they are not. For example, cloud providers such as Amazon, Google, and Microsoft commonly authenticate control over a domain name using TXT records in the DNS; this is also how the ACME dns-01 challenge authenticates a request for a TLS certificate.

Our attacker wants to use a SHAmbles collision to make it possible to spoof the DNS despite DNSSEC. They need to get their trojan horse records into the same zone that contains the target name. The zone owner will sign the trojan horse records, and if their signature algorithm uses SHA-1, the attacker can take the trojan horse signature and attach it to their attack RRset to make it look cryptographically legitimate.

SHAmbling TXT records

TXT records are ideal for a SHAmbles collision. Section 1.1 of the SHAmbles paper says, "Our attack uses one partial block for the birthday stage, and 9 near-collision blocks." This is illustrated in figure 7 (p.20) in the paper, which shows the attack needs 588 bytes.

A TXT record can contain arbitrary binary data. It consists of a series of substrings, each with a 1-byte count followed by that much data. The SHAmbles collision needs more than one 255 byte substring to work, so some trickery is needed.
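To make the substring format concrete, here is a small Python sketch that encodes a byte string as TXT RDATA and parses it back. The helper names are my own, and the payload is arbitrary filler rather than real collision blocks. Unlike this encoder, a real attacker cannot choose where the length octets fall inside the collision blocks.

```python
def txt_rdata(data: bytes) -> bytes:
    """Encode data as TXT RDATA using as few substrings as possible."""
    if not data:
        return b"\x00"  # a TXT record needs at least one (empty) substring
    out = bytearray()
    for start in range(0, len(data), 255):
        piece = data[start:start + 255]
        out.append(len(piece))  # 1-byte count ...
        out += piece            # ... followed by that much data
    return bytes(out)

def parse_txt(rdata: bytes) -> list:
    """Split TXT RDATA back into its substrings."""
    strings, pos = [], 0
    while pos < len(rdata):
        length = rdata[pos]
        end = pos + 1 + length
        if end > len(rdata):
            raise ValueError("substring runs past the end of the RDATA")
        strings.append(rdata[pos + 1:end])
        pos = end
    return strings

payload = bytes(range(256)) * 2 + bytes(76)  # 588 arbitrary octets
rdata = txt_rdata(payload)
print([len(s) for s in parse_txt(rdata)])  # [255, 255, 78]
print(len(rdata))                          # 591
```

So 588 bytes need at least three substrings, and the length octets add three bytes of overhead.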

Our attacker's trojan horse records might look roughly like:

$ORIGIN _acme-challenge.toehold.example.
@  3600  IN  TXT  "innocuous stuff"
@  3600  IN  TXT  "\255\123...\42\69\0" ""...

The second record contains the collision blocks that make the attack work. The attacker has to ensure that the collision blocks sort after the innocuous prefix (in DNSSEC canonical sorting order), which can be as simple as ensuring that the second record is longer than the first.
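The ordering rule can be sketched in a few lines of Python. RFC 4034 section 6.3 sorts the records in an RRset by treating each RDATA as a left-justified string of unsigned octets, which is how Python compares bytes objects. The "collision" record here is placeholder filler, not real collision blocks.

```python
def txt_rdata(*strings: bytes) -> bytes:
    """TXT RDATA: each substring is a length octet followed by its data."""
    return b"".join(bytes([len(s)]) + s for s in strings)

def canonical_order(rdatas):
    """Sort RDATA as left-justified unsigned octet strings (RFC 4034 section 6.3)."""
    return sorted(rdatas)

innocuous = txt_rdata(b"innocuous stuff")
# Placeholder filler standing in for the collision blocks.
collision = txt_rdata(b"\xff" * 255, b"\x7b" * 255, b"\x2a" * 78)

ordered = canonical_order([collision, innocuous])
print(ordered[0] == innocuous)        # True
print(innocuous[0], collision[0])     # 15 255
```

The first octet of TXT RDATA is the length of the first substring, so a record whose first substring is longer sorts later, whatever the contents are.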

The record containing the collision blocks may need to start with some arbitrary padding to align with SHA-1's 512-bit input block boundaries. The substring length octets inside the collision blocks probably cannot be controlled, so our attacker needs to add a trailer to re-align the substring lengths to the end of the TXT record. The trailer is just 255 zero bytes: the first part of the trailer uses up any remaining space in the last substring of the collision blocks, and the rest of the trailer is interpreted as zero-length substrings.
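The trailer argument can be checked with a short Python experiment. This is only a sketch: random bytes stand in for the collision blocks, and the 588-byte size is taken from the paper's figure quoted above. However the length octets happen to fall, the last substring can overrun the blob by at most 255 bytes, so 255 zero bytes always bring the parser back to a clean end.

```python
import os

def parse_txt(rdata: bytes) -> list:
    """Split TXT RDATA into substrings, rejecting a truncated final substring."""
    strings, pos = [], 0
    while pos < len(rdata):
        length = rdata[pos]
        end = pos + 1 + length
        if end > len(rdata):
            raise ValueError("substring runs past the end of the RDATA")
        strings.append(rdata[pos + 1:end])
        pos = end
    return strings

def parses_cleanly(rdata: bytes) -> bool:
    try:
        parse_txt(rdata)
        return True
    except ValueError:
        return False

TRAILER = bytes(255)  # 255 zero bytes

blob = b"\xff" * 588  # worst-case filler: every length octet is 255
print(parses_cleanly(blob))            # False
print(parses_cleanly(blob + TRAILER))  # True
print(all(parses_cleanly(os.urandom(588) + TRAILER) for _ in range(1000)))  # True
```

Because the same trailer works for any preceding bytes, it can be identical in the trojan horse record and the attack record, even though their collision blocks differ.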

The attack records need to be constructed to line up with the collision blocks in a similar way to the trojan horse records, but everything else can be different. Our attacker mainly cares about having a different owner name and different TXT contents:

$ORIGIN _acme-challenge.intranet.example.
@  3600  IN  TXT  "gfj9Xq...Rg85nM"
@  3600  IN  TXT  "\255\222...\53\88\0\0" ""...

In these examples @ is short for the current $ORIGIN setting. This notation is just to stop the lines from bumping into the right margin and wrapping.

Signing oracle

How does our attacker get their trojan horse records into the DNS zone so that they are signed? A plausible scenario is that the attacker's toehold is a host on some shared infrastructure that provides hosts with limited DNS update capabilities. Perhaps each host has a TSIG key that allows it to publish an ACME dns-01 challenge for its own name but nothing else.

A DNSSEC signature covers a bit more than just the signed records: there is also some RRSIG metadata. This identifies the key that signed the records, and has inception and expiration times.

To make the chosen prefix collision work, our attacker has to correctly guess what these RRSIG fields will be when the zone signs the trojan horse records, so that the attacker can ensure that the attack records collide. In practice, it is easy to predict the RRSIG fields by controlling the time when the trojan horse DNS update is submitted.
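As a hedged sketch of that prediction, suppose the signer uses the BIND defaults mentioned in the hardening section of this post (inception one hour before signing, expiration 30 days after) and signs in the same second that the update arrives. Both of those are assumptions, and the submit time is an arbitrary example.

```python
from datetime import datetime, timedelta, timezone

def predicted_rrsig_times(submit: datetime):
    """Guess the RRSIG validity window for an update submitted at `submit`.

    Assumes fixed offsets (1 hour back, 30 days forward) and no added jitter.
    """
    inception = submit - timedelta(hours=1)
    expiration = submit + timedelta(days=30)
    return inception, expiration

def rrsig_time(t: datetime) -> str:
    """RRSIG presentation format for timestamps: YYYYMMDDHHmmSS in UTC."""
    return t.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")

submit = datetime(2020, 1, 9, 12, 0, 0, tzinfo=timezone.utc)  # example time
inception, expiration = predicted_rrsig_times(submit)
print(rrsig_time(inception), rrsig_time(expiration))
# 20200109110000 20200208120000
```

With a signer like this, the attacker picks the submit time first, computes the collision for the resulting timestamps, and then sends the update at exactly that moment.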

Putting it together

Our attacker has pwned the network and gained access to a low-value host; they want to gain access to something more juicy. They plan to get a TLS certificate for their target so that they can intercept logins over https without being detected.

In this situation they would normally use ACME http-01, but our attacker is thwarted because the site publishes CAA validationmethods and accounturi records protected with DNSSEC. But it's only RSASHA1, so they have an opening.

Our attacker asks Let's Encrypt for a certificate for their target, and gets an ACME challenge. They rapidly calculate a chosen-prefix SHA-1 collision to construct their trojan horse records, and at a carefully controlled time they update the DNS zone using the TSIG key they found in their toehold.

The attacker gets the signature from the trojan horse records and uses it to make an RRSIG record for their attack records. Then they spoof DNS responses to Let's Encrypt containing their attack records, convincing Let's Encrypt that our attacker has legitimate control over the target, thereby getting a TLS certificate.

Then the attacker uses the certificate to intercept TLS traffic to the target, and get some privileged login credentials from the decrypted https traffic.

Back to reality

This attack is supposed to be approximately as plausible as the tech scenes in action movies. One of the ways tech in movies is often implausible is that cryptographic keys are broken by brute force in a ridiculously short period of time, instead of taking longer than the heat death of the universe.

In our attacker's scenario, they need to run a SHAmbles attack within the expiry time of an ACME challenge. This expiry time is at least a few days, so the attack is less than a factor of 10 more difficult than the proof-of-concept SHAmbles attack.

I think this is enough to argue that SHA-1 in DNSSEC is practically broken, in cases where permission to update a zone is shared.

Other record types

Our example scenario used TXT records, but there are a number of other DNS record types that provide enough space to smuggle in SHA-1 collision blocks, and which an attacker might be able to use for mischief.

TLSA records can be used to authenticate TLS certificates for mail servers and other protocols. TLSA records can contain RSA keys (selector = 1 for public key, matching type = 0 for no digest) which are large enough to hide collision blocks.

DNSKEY records

The most juicy target for an attacker is to get a signature over their choice of DNSKEY records, which would allow them to freely create their own signatures for any records in the zone.

Most zones have two kinds of keys, KSKs (key-signing keys) which only sign the DNSKEY records, and ZSKs (zone-signing keys) which sign the rest of the zone. The DS records in the parent zone make the link in the chain of trust. The DS records contain hashes of the KSKs, and the DNSKEY records are only trusted if they are signed by at least one of those KSKs.
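To show how a DS record points at a KSK, here is a minimal Python sketch of the key tag calculation (RFC 4034 appendix B) and the DS digest (RFC 4034 section 5.1.4, using SHA-256 as the digest type). The public key bytes are a made-up placeholder, not a usable key, and the helper names are my own.

```python
import hashlib
import struct

KSK_FLAGS, ZSK_FLAGS = 257, 256  # the SEP bit distinguishes a KSK from a ZSK

def wire_name(name: str) -> bytes:
    """Canonical wire form: lower-case labels, no compression."""
    labels = [l for l in name.lower().split(".") if l]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"

def dnskey_rdata(flags: int, algorithm: int, public_key: bytes) -> bytes:
    """DNSKEY RDATA: flags, protocol (always 3), algorithm, public key."""
    return struct.pack("!HBB", flags, 3, algorithm) + public_key

def key_tag(rdata: bytes) -> int:
    """Key tag checksum from RFC 4034 appendix B."""
    acc = 0
    for i, octet in enumerate(rdata):
        acc += octet if i & 1 else octet << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF

def ds_rdata_sha256(owner: str, rdata: bytes) -> bytes:
    """DS RDATA with digest type 2: SHA-256 over owner name plus DNSKEY RDATA."""
    digest = hashlib.sha256(wire_name(owner) + rdata).digest()
    return struct.pack("!HBB", key_tag(rdata), rdata[3], 2) + digest

ksk = dnskey_rdata(KSK_FLAGS, 8, b"\x01\x02\x03\x04")  # placeholder key material
ds = ds_rdata_sha256("example.", ksk)
print(key_tag(ksk), len(ds))  # 2063 36
```

A validator only trusts the DNSKEY RRset if it is signed by a key whose digest matches one of the parent's DS records, which is why a signature from a ZSK alone does not help the attacker here.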

In a zone set up with a KSK/ZSK split like this, our attacker can only get records signed by a ZSK, so they are unable to get a working malicious DNSKEY.

Some zones are set up with CSKs (combined signing keys) which sign the whole zone including the DNSKEY records. In a zone using CSKs our attacker can obtain a working DNSKEY under their control.

CDNSKEY and CDS

Alongside a zone's DNSKEY records, there may be CDNSKEY and CDS records. These records are instructions to the parent zone saying what the DS records should look like. If our attacker can get signed malicious CDNSKEY or CDS records, they may be able to persuade the zone's parent to install the attacker's choice of DS records. That would also allow the attacker to freely create their own signatures for any records in the zone.

Like DNSKEY records, CDNSKEY and CDS records must be signed by the zone's KSK. So zones with separate KSK and ZSK keys are safer against collision attacks than zones with a CSK.

Top-level domains

Our scenario was something like an enterprise environment; but the most prominent situation where a zone can be updated by multiple parties is a top-level domain. Any domain registrant can get an arbitrary set of DS records signed by a TLD.

At the time of writing there are 274 TLDs using algorithm 5 (RSASHA1) or 7 (RSASHA1-NSEC3-SHA1).

What prevents TLDs being vulnerable to SHAmbles is that the payload of a well-formed DS record is too small to hold enough collision blocks. So, provided there are enough syntax checks, it is probably not feasible at the moment for an attacker to make a trojan horse DS RRset that collides with an attack DS RRset for a different domain.
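One of the syntax checks in question might look like the following Python sketch, which insists that the digest in a DS record has exactly the length implied by its digest type. The function name and the choice of accepted digest types (SHA-1, SHA-256, SHA-384) are my assumptions about what a registry would allow.

```python
# Digest lengths in bytes for DS digest types 1 (SHA-1), 2 (SHA-256), 4 (SHA-384).
DIGEST_LENGTHS = {1: 20, 2: 32, 4: 48}

def check_ds_rdata(rdata: bytes) -> None:
    """Reject DS RDATA whose digest is not exactly the expected size."""
    if len(rdata) < 4:
        raise ValueError("DS RDATA is too short")
    digest_type = rdata[3]  # after key tag (2 bytes) and algorithm (1 byte)
    expected = DIGEST_LENGTHS.get(digest_type)
    if expected is None:
        raise ValueError(f"unsupported digest type {digest_type}")
    if len(rdata) - 4 != expected:
        raise ValueError(f"digest must be {expected} bytes, got {len(rdata) - 4}")

good = bytes([0x30, 0x39, 8, 2]) + bytes(32)       # well-formed SHA-256 DS
oversized = bytes([0x30, 0x39, 8, 2]) + bytes(588)  # room for collision blocks

check_ds_rdata(good)  # passes silently
try:
    check_ds_rdata(oversized)
except ValueError as err:
    print("rejected:", err)

print(4 + max(DIGEST_LENGTHS.values()))  # 52
```

Under this check the largest acceptable DS RDATA is 52 bytes, far short of the 588 bytes that the collision needs.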

However, there are some TLDs (such as .de) that allow some subdomains to insert records directly into the TLD without a delegation. This greatly increases the risk of a chosen-prefix collision attack (though .de uses RSASHA256 so it is safe against SHAmbles).

Shared keys

Another risky practice is hosting providers that use the same public/private key pair for large numbers of zones. There are well over 200,000 zones which share keys, and some keys are shared by over 140,000 zones.

In this situation an attacker might be able to get legitimate control over a zone with the same key as their target's zone, even though those zones are different. This attacker does not need to be surreptitious with their trojan horse records: they can set up chosen prefix collision attacks in their own zone and use their signatures to attack other zones in the same hosting setup.

Hardening RRSIGs

In 2008, a chosen prefix collision attack against MD5 was used to create a rogue X.509 CA certificate. By 2015, it was evident that SHA-1 would soon be vulnerable to a similar attack. In 2016 the CA/Browser Forum baseline requirements were updated to require that certificate serial numbers are assigned using at least 64 bits of randomness. This protects certificates against chosen prefix attacks because an attacker cannot predict the prefix on the trojan horse certificate.

The predictability of RRSIG records is similar to the predictability of X.509 certificates 10 years ago. Can we add randomness to RRSIG records to make chosen prefix collisions harder? There is some space to add randomness to the inception and expiration times.

In BIND by default the inception time is one hour before the current time (to allow for inaccurate clocks). You could subtract about 10 to 12 bits of randomness (less than 1 hour 10 minutes) without causing problems. And by default the expiration time is 30 days in the future. This could have about 16 bits of randomness added (up to about 18 hours).
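A minimal Python sketch of this kind of jitter follows. It is not how BIND or any other signer works today; the function name and parameters are hypothetical, and the bit counts are the ones suggested above.

```python
import secrets
import time

HOUR, DAY = 3600, 86400

def jittered_rrsig_times(now: int, inception_bits: int = 12, expiration_bits: int = 16):
    """Return (inception, expiration) in Unix seconds with random jitter.

    The inception time moves further into the past and the expiration time
    further into the future, so the validity window never shrinks.
    """
    inception = now - HOUR - secrets.randbelow(1 << inception_bits)       # up to 4095 s earlier
    expiration = now + 30 * DAY + secrets.randbelow(1 << expiration_bits)  # up to 65535 s later
    return inception, expiration

now = int(time.time())
inception, expiration = jittered_rrsig_times(now)
print(now - HOUR - inception, expiration - now - 30 * DAY)  # the two random offsets
```

Together the two offsets give about 28 bits that an attacker would have to guess before computing a collision, which is still well short of the 64 bits required for certificate serial numbers.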

More adventurously, it might be possible to randomise the original TTL field, for up to 32 more bits of randomness. The original TTL field is required to match the TTL of the records being signed, so it would be a protocol violation to randomise it. But validators might not be able to detect that a violation has occurred, in which case randomising it would be benign. Some experimentation is needed to find out if this guess matches reality!

Conclusions

Whenever a DNS zone is signed with a SHA-1 DNSKEY algorithm it is vulnerable to chosen prefix collision attacks. This is a problem when a zone accepts updates from multiple parties, such as:

  • TLDs
  • enterprises
  • hosting providers

It is also a problem when a key is re-used by multiple zones.

Zones using algorithm numbers 7 or less should be upgraded. The recommended algorithms are 13 (ECDSAP256SHA256) or 8 (RSASHA256, with 2048 bit keys).

For extra protection against chosen prefix collision attacks, zones should not share keys, and they should have separate ZSKs and KSKs.

DNSSEC zone signing software should provide extra protection against chosen prefix collisions by adding more randomness to the inception and expiration times in RRSIG records.

Software implementing CDNSKEY and CDS checks must ensure that the records are properly signed by a KSK, not just a ZSK.

Top-level domain registry software must not accept over-sized DS records.


Corrections

The number of domains with shared keys was erroneously large.

I originally thought the SHA-1 input block size is 20 bytes, like its output size, but in fact the block size is 64 bytes. This means a collision requires more space than stated in earlier versions of this article, but this does not significantly affect the implications for DNSSEC. The attack outline now explains how to accommodate the larger collision blocks.

Clarified reference to SHAttered colliding PDF files.