Microsoft's e-mail servers break DKIM on non-ASCII characters

There seems to be a character encoding issue in the DKIM signatures applied by Office 365 servers. We just had a DKIM failure ('body hash did not verify') and tracked it down to non-ASCII quotes in the body. When we replaced them with ASCII quotes, the DKIM signature was valid again.

This is our chain:

appserver -> on-premise smtp -> Office365 (which signs the message with DKIM)

I can reproduce it with this bash script:

#!/bin/bash

# A test script for triggering the character encoding DKIM bug.
# Note: no MIME headers (charset, transfer encoding) are sent, and the
# body contains raw non-ASCII bytes.

{
  echo "HELO origindomain.nl"
  sleep 0.5
  echo "MAIL FROM:<wiebetest@ourdomain.com>"
  sleep 0.5
  echo "RCPT TO:<wiebe@someotherdomain.nl>"
  sleep 0.5
  echo "DATA"
  sleep 0.5
  echo "Subject: $RANDOM Conversation buster: My DKIM test"
  sleep 0.5
  echo "From: wiebetest@ourdomain.com"
  echo "To: receiver@ourdomain.com"
  # An empty line separates the headers from the body.
  echo ""
  sleep 0.5
  echo "Hello receiver,"
  echo ""
  echo "Non-ascii char: “"
  sleep 0.5
  echo ""
  # A single dot on its own line ends the DATA section.
  echo "."
  sleep 0.5
  echo "QUIT"
  sleep 0.5
} | telnet smtp.onpremise.local 25

When I remove that non-ASCII char, DKIM is valid:

Authentication-Results: mx.google.com; dkim=pass arc=pass; header.s=selector2

But when the non-ASCII char is there:

Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify); arc=fail (body hash mismatch); header.s=selector2

The first Outlook server in the received chain is:

Received: from smtp.onpremise.local (1.2.3.4) (Wiebe: redacted) by HE1EUR01FT055.mail.protection.outlook.com (10.152.1.28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.3455.23 via Frontend Transport; Tue, 13 Oct 2020 10:47:51 +0000

When people send me non-ASCII chars from Outlook clients it works fine, so it appears to happen when the message does not specify a character encoding for the body.
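
A rough way to check whether a captured message falls into that category is sketched below. It counts the bytes outside 7-bit ASCII and checks whether the header section declares a charset at all. The function names count_non_ascii_bytes and declares_charset are made up for this example, and the charset check is a plain text match, not a real MIME parser.

```shell
#!/bin/bash

# Count the bytes on stdin that fall outside 7-bit ASCII (0x00-0x7F).
count_non_ascii_bytes() {
  LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

# Succeed if the header section of the message file "$1" (everything
# before the first empty line) mentions a charset. This is a crude text
# match (assumption: good enough for a quick check).
declares_charset() {
  tr -d '\r' < "$1" | sed '/^$/q' | grep -qi 'charset='
}

# The curly quote from the test script is three bytes in UTF-8 (E2 80 9C).
printf 'Non-ascii char: \342\200\234\n' | count_non_ascii_bytes   # prints 3
```

A message with a non-zero count and no declared charset matches the situation that the test script creates.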

11 Replies

I am not convinced that it is specifically Microsoft that breaks DKIM, but your point about character encoding gave me an idea. My issue is that messages from one mailing system get dkim=pass in Gmail but dkim=fail in Outlook (body hash fail).
I compared the HTML source and found that Gmail shows encoding utf-8 in the HTML header, while Outlook shows iso-8859-1. Bodies in different encodings produce different hashes, of course. I am investigating whether the email-generating system specifies an encoding before sending. If it does not, I will check whether specifying one solves the issue.
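
To illustrate why a charset change alone is enough to break the body hash, here is a minimal sketch (not the code of any actual signer). It hashes the same line of text once as UTF-8 bytes and once as ISO-8859-1 bytes. The function name body_digest is made up for this example, and it assumes GNU coreutils sha256sum is available. It prints the digest as hex, whereas DKIM stores the same digest base64-encoded in the bh= tag.

```shell
#!/bin/bash

# Hash stdin with SHA-256, the hash used by rsa-sha256 DKIM signatures.
# Assumption: the input is already a canonicalized body (CRLF line endings).
body_digest() {
  sha256sum | cut -d' ' -f1
}

# "µ" as UTF-8 is the two bytes C2 B5 (octal 302 265);
# as ISO-8859-1 it is the single byte B5 (octal 265).
utf8_digest=$(printf 'Non-ascii char: \302\265\r\n' | body_digest)
latin1_digest=$(printf 'Non-ascii char: \265\r\n' | body_digest)

echo "utf-8:      $utf8_digest"
echo "iso-8859-1: $latin1_digest"

if [ "$utf8_digest" != "$latin1_digest" ]; then
  echo "digests differ: a signature made over one encoding fails on the other"
fi
```

The text looks identical to a reader in both cases, but the signer and the verifier only ever see bytes, so any re-encoding after signing invalidates the hash.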

I dove into it a bit more. The mail client is supposed to deliver the content using quoted-printable encoding, so only 7-bit ASCII goes over the wire (printable characters, with every other byte written as = followed by two hex digits). This is the 'Content-Transfer-Encoding', which is different from the character set (charset) of the content; that can still be UTF-8 (in which case µ will be =C2=B5).
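
For anyone who wants to see that mapping, below is a simplified sketch of quoted-printable encoding in shell. The function name qp_encode is made up for this example. It only shows the byte-to-=XX mapping and is not a complete encoder: it does not handle line breaks, the 76-character line limit, or trailing whitespace.

```shell
#!/bin/bash

# Simplified quoted-printable encoder (illustration only).
# Printable ASCII and space pass through unchanged; "=" and every other
# byte become "=" followed by two uppercase hex digits.
qp_encode() {
  local out="" hex dec
  # od prints each input byte as two hex digits.
  for hex in $(od -An -v -tx1); do
    dec=$(printf '%d' "0x$hex")
    if [ "$dec" -ge 32 ] && [ "$dec" -le 126 ] && [ "$dec" -ne 61 ]; then
      # Turn the byte value back into its literal character via an octal escape.
      out="$out$(printf "\\$(printf '%03o' "$dec")")"
    else
      out="$out$(printf '=%02X' "$dec")"
    fi
  done
  printf '%s\n' "$out"
}

# "µ" in UTF-8 is the bytes C2 B5.
printf 'Non-ascii char: \302\265' | qp_encode   # prints: Non-ascii char: =C2=B5
```

The charset decides which bytes represent µ, and the transfer encoding only decides how those bytes are written on the wire.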

I tried specifying 8bit as the Content-Transfer-Encoding, but that didn't help.

However, ultimately, the mail client is the original offender, because it apparently just dumps UTF-8 bytes into SMTP.

Actually, this is still not it. I tested clients that do proper quoted-printable, and it still fails. And Postfix + OpenDKIM does handle the above shell script properly.

Since you mention Content-Transfer-Encoding: I also observe in my case that the message lands in Gmail with Content-Transfer-Encoding: 8bit, while in Outlook (where DKIM fails to verify the body hash) it arrives as "quoted-printable". I am investigating whether the code generating the message specifies Content-Transfer-Encoding in the header.
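
One way to make this comparison repeatable is to save the raw source of both received copies and pull out just the MIME headers. The sketch below does that. The function name mime_headers is made up for this example, and it only looks at the top-level header section, so headers of nested MIME parts are ignored.

```shell
#!/bin/bash

# Print the Content-Type and Content-Transfer-Encoding headers from the
# top-level header section of the raw message file "$1".
mime_headers() {
  awk '
    { sub(/\r$/, "") }                  # tolerate CRLF line endings
    /^$/ { exit }                       # first empty line ends the headers
    /^[ \t]/ {                          # folded continuation line: unfold it
      sub(/^[ \t]+/, " ")
      line = line $0
      next
    }
    { if (line != "") print line; line = $0 }
    END { if (line != "") print line }
  ' "$1" | grep -iE '^content-(type|transfer-encoding):'
}
```

With both copies saved to files (the names here are placeholders), `diff <(mime_headers gmail.eml) <(mime_headers outlook.eml)` shows whether the charset or the transfer encoding changed somewhere along the way.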

So we set Content-Transfer-Encoding: 8bit in the email-generating app. The message was delivered to the Outlook inbox with Content-Transfer-Encoding: 8bit, but the charset still got changed to "iso-8859-1", even though the generating code specifies UTF-8. DKIM body hash verification still fails, of course.
I am working with the support team of the cloudy MTA provider; my suspicion is that the charset change (and therefore the DKIM failure) occurs at the cloudy MTA.

Just got confirmation from cloudy MTA support that they are the ones who switch the encoding from UTF-8 to iso-8859-1 because of non-ASCII characters in the body.

Still, it's Microsoft's servers that add the body hash, and I think it's a bug that injecting certain bytes can break it. In-band bytes should not affect the protocol.

We are facing exactly the same problem.
Is there a solution for this issue yet?
For us, the vendor of the software that generates the e-mail did confirm that there was something wrong with the encoding. It's been a while and it would require digging up a bunch of old conversations, so at this point, I can't tell you the details.

In your case, you may want to look at your mail sending software as well.
Hello halfgaar,
thank you for pointing us in the right direction.
If you can find the time, it would be great if you could post the solution for this issue.
Any deeper insight would be highly appreciated.
Many thanks in advance!