Exchange message transport and routing has at times been a bit challenging to understand, whether it was the X.400-based Message Transfer Agent (MTA), Open Shortest Path First (OSPF) based routing in Exchange 2000+, or the effect of Database Availability Groups (DAGs) on mail delivery and resilience. Of course, if your mailboxes are completely hosted in Office 365, then this is no longer your problem to worry about. But for many of our customers, Hybrid or some form of coexistence is still a concern. If your Office 365 mail routing is anything but the most basic case, you probably need to know a little something about how we do message attribution.
One of the harder things to wrap your mind around is how Office 365 attributes messages to particular organizations (tenants). Office 365 is a completely multi-tenant environment, meaning virtually all infrastructure can be shared with other tenants. This includes IP addresses, certificates, and transport servers, to name just a few components.
When a message arrives at Office 365, one of the first things we need to do is figure out which organization it belongs to. At first, this sounds simple – just look at the recipient, right? Well, it is more complicated than that, because of Hybrid and complex routing scenarios.
For this entire post, we’ll refer to a message that is sent by Office 365 customer contoso.com, destined to another Office 365 customer, tailspintoys.com. Both customers may be hybrid, and so it is possible that the mailbox of the initial sender at contoso.com is hosted on-premises:
Based on connectors, Office 365 has to quickly determine which topology is the intent. Both organizations may have rules that they’d like to apply to the message, and both organizations would probably like to keep tabs on the message as it passes through their organization and Office 365 tenant. So, we need to be able to distinguish which tenant the message is currently traveling through. Back to our example, each time the message comes into Office 365 we need to decide if it is incoming to tailspintoys.com or originating from contoso.com. That way we can decide which set of rules to run, connectors to apply, and which customer to provide with the message trace data.
By the way, it’s also possible that the configuration could be this:
Or even this:
There are many other possibilities, some of which are more complex. The point is that Office 365 is extremely flexible, so knowing the rules can keep you out of trouble. For a list of all the scenarios which we support, see our best practices document.
Important Note: just because it is possible to route mail in a complex manner does not necessarily mean that the required headers and permissions will be preserved. We frequently see customers configured in complex routing scenarios who are having a poor experience with spam, spoof, and phishing – either false positives or false negatives. This blog only focuses on the mail attribution related pieces. Successful attribution doesn’t make any statement about the security of your configuration.
Before we get into the exact logic, I think it’s important to dispel some myths we frequently hear.
When you pick your initial onmicrosoft.com domain, you also pick a primary geography for your tenant. Office 365 then assigns you a unique DNS entry for your onmicrosoft.com domain (and any other domains you verify). This means:
In other words – we can’t use DNS as an attribution method.
Office 365 has a feature that allows you to relay messages from your on-premises environment. The way you enable this is by creating an inbound connector of type on-premises. This connector can authorize relay either by IP address or by certificate. The latter is highly preferred, particularly when you're sending email for domains which you don't own or haven't added to your tenant. Certificates are significantly more secure and identify you better than IP addresses alone, as it is very difficult to prove IP ownership definitively. If two Office 365 tenants specify the same IP address or overlapping ranges, things start to go wrong. This can happen, for example, when several tenants create inbound on-premises connectors for the same shared service.
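As a rough sketch of the certificate-based option described above, an inbound connector of type on-premises might be created along the following lines. The connector name and the certificate name mail.contoso.com are placeholders chosen for illustration, and the command assumes you are already connected to Exchange Online PowerShell.

```powershell
# Sketch only: authorize relay from on-premises by certificate rather than by IP.
# "mail.contoso.com" is a hypothetical subject/SAN on the certificate that your
# on-premises servers present; replace it with your own value.
New-InboundConnector -Name "Relay from on-premises (certificate)" `
    -ConnectorType OnPremises `
    -SenderDomains * `
    -RequireTls $true `
    -TlsSenderCertificateName "mail.contoso.com"
```

Using a certificate name here avoids the shared-IP problem described above, because the match no longer depends on an address that another tenant could also list.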
Sometimes when checking to see if Office 365 is an open relay or not, the IP address from which the test is being run is listed on a connector – inside your tenant, or someone else’s. And since the relay feature is enabled for that IP, the ‘open relay’ test will fail. More on how to protect against abuse later.
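If you suspect that the IP address you are testing from is listed on a connector in your own tenant, one way to check is to list the inbound connectors that authorize by IP address. This is only a sketch: it assumes an existing Exchange Online PowerShell session, and it cannot show connectors that belong to other tenants.

```powershell
# List inbound connectors in this tenant that have at least one sender IP
# address or range configured, so you can compare them with the test IP.
Get-InboundConnector |
    Where-Object { $_.SenderIPAddresses.Count -gt 0 } |
    Format-List Name,ConnectorType,Enabled,SenderIPAddresses
```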
Office 365 will delay the rejection of an “unauthorized relay” until “End Of Data” instead of rejecting immediately after the RCPT TO command in the SMTP conversation. This may cause some third-party tools to incorrectly flag Office 365 as a potential open relay. You can be assured, however, that this is not the case.
Yes - except when we’re not expected to. Going back to our example, if a message is truly attributed to contoso.com initially, then when we’re ready to send the message to tailspintoys.com, we will absolutely look up the MX record. However, there are some exceptions you should be aware of:
If tailspintoys.com does not point their MX record to Office 365, they can tell Office 365 to reject messages which do not originate from the endpoint through which they want all mail to route first (including mail for onmicrosoft.com domains). You can do this by creating an inbound connector something like this:
New-InboundConnector -Name "Block delivery unless via MX record" -ConnectorType Partner -SenderDomains * -RequireTls $true -RestrictDomainsToCertificate $true -TlsSenderCertificateName <the domain or SAN of the certificate used by the endpoint that all mail should route through first>
Or, if you prefer to use IP address instead of certificate:
New-InboundConnector -Name "Reject mail not routed through MX" -ConnectorType Partner -SenderDomains * -RestrictDomainsToIPAddresses $true -SenderIpAddresses <static, full list of on-premises IPs, or IP ranges of third-party filtering service their MX may be pointing to>
Note that creating a connector like this will cause NDRs to be generated for messages which are submitted directly to your tenant and don’t match the certificate name or IP addresses you provide above. Certificates are always the preferred method. Whichever you end up using, if the certificate or the IP addresses change, you will reject good messages.
If you create a tenant in Office 365/Azure and add production domains to the tenant, you need to be careful. It is likely that your organization already sends email to many other Office 365 customers. As such, real production email will be sent to Office 365 at some point. As soon as you verify the domain, it is possible that mail will be associated with your test tenant, especially if there is an inbound connector inside your test tenant, or if the other party has a configuration issue that causes the mail to be attributed differently than they intend.
Put another way, we’ll start accepting mail for verified domains – unless you tell us not to accept or tell us to only accept conditionally.
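Before verifying a production domain in a test tenant, it may help to review which domains that tenant already accepts mail for. A minimal check, assuming an existing Exchange Online PowerShell session in the test tenant, could look like this:

```powershell
# Show the domains this tenant accepts mail for and how each one is treated
# (for example Authoritative or InternalRelay).
Get-AcceptedDomain |
    Format-Table DomainName,DomainType,Default -AutoSize
```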
First, SMTP is inherently not very secure. It is also possible that someone else is spoofing the sender of the message, whether this spoofing is legitimate (like a bulk email service, a partner organization, 3rd party service, etc.) or not (spam, phish, etc.). While EOP does its best to protect you from the latter, we must be extremely careful not to block the legitimate messages. This is no different than any other SMTP server, multi-tenant or not. Very few email servers will outright block email at the SMTP level based on a spoofed MAIL FROM, because there are so many legitimate uses.
There are two main standards for detecting bad spoofing. SPF-based spoof detection isn’t the best method for a multi-tenant service and is certainly the weaker of the two. We don’t typically use SPF alone to determine that mail is suspicious, because it has proven too unreliable. The other standard which we use, DKIM, is more secure. We have some level of anti-spoofing by default using DKIM, and you can always make it better by publishing and using both DKIM and DMARC. We also have a Spoof Intelligence feature that draws on our knowledge of mail flow patterns as the world’s largest email provider, and it takes SPF, DKIM, and DMARC as inputs.
When you use one (or both) of these standards to authorize Office 365 to send mail on your behalf, you are essentially saying that you trust us to prevent bad spoofing of your domain by other Office 365 customers. Read through the links above to build your confidence that we do not take this trust lightly.
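To see what a domain currently publishes for these standards, you can query DNS directly. The sketch below assumes a Windows machine with the Resolve-DnsName cmdlet available, uses contoso.com as a stand-in for your own domain, and assumes the default Office 365 DKIM selector name selector1, which may not apply if you use custom selectors.

```powershell
# Hypothetical domain; replace with your own.
$domain = "contoso.com"

# SPF: the TXT record at the domain root that starts with v=spf1.
Resolve-DnsName -Name $domain -Type TXT |
    Where-Object { $_.Strings -like 'v=spf1*' }

# DMARC: the TXT record published at _dmarc.<domain>.
Resolve-DnsName -Name "_dmarc.$domain" -Type TXT

# DKIM: the selector record (assumed selector name: selector1).
Resolve-DnsName -Name "selector1._domainkey.$domain" -Type CNAME
```

A lookup that returns nothing for the DKIM or DMARC name suggests that record has not been published yet.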
Most customers have been using certificates on-premises that take the form of <host>.<rootdomain>, for example, exchange.contoso.com. However, exchange.contoso.com is not usually registered inside of the Office 365 tenant when the hybrid wizard is run. In order to make sure that mail works, we optimized the wizard to instead create the connector with *.contoso.com as the matching certificate. That way (a) the on-premises certificate matches and (b) the accepted domain (contoso.com) also matches. The hybrid wizard does tell you that it is going to do this.
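To check whether a connector in your tenant uses a wildcard certificate name like the one the wizard creates, a filter along these lines may help. It is a sketch that assumes an existing Exchange Online PowerShell session; the property is converted to a string first because its exact type can vary between session types.

```powershell
# Find inbound connectors whose certificate name starts with "*.",
# for example *.contoso.com. In the -like pattern, [*] matches a literal asterisk.
Get-InboundConnector |
    Where-Object { "$($_.TlsSenderCertificateName)" -like '[*].*' } |
    Format-List Name,ConnectorType,Enabled,TlsSenderCertificateName
```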
Unfortunately, for customers with multiple tenants, or complicated outbound on-premises routing, having a wildcard match may be catching too much mail and attributing it all to the matching Office 365 tenant. The simplest fix – if the hybrid wizard created the connector – is:
So, back to the reason for the title of the post - when a message is sent to Office 365, the Front-end transport servers that receive the message need to determine if the message is originating (coming from a customer) or incoming (going to a customer). In other words, is the current hop inside of contoso.com’s routing configuration or inside of tailspintoys.com?
This is the basic logic used. This has been simplified to a small degree in order to cover the top scenarios and make it easier to understand.
So now you know how Office 365 Message Attribution works.
Scott Landry
With contributions & feedback from many others, especially Bruce Wilson, Stan Aleksiev, Markus Dahlweid