Heads up, gateway developers and consumers of raw MIME data from drop directories! If you do not develop gateways for Exchange 2007 or write MIME-devouring applications, I highly suggest you use this opportunity to test Internet Explorer 8's new "Back" functionality. In IE 8, clicking that "Back" button moves you back to the page you were at before. Oh, OK, so it's not really new functionality, but it's important, and if you don't develop gateways you are unlikely to find this topic of interest. For the three of you still with me, read on, because in the words of the immortal Dylan, "The times, they are a-changin'." We're going to talk about Address Encapsulation today and a change in Service Pack 1, Rollup 7: what we are doing with it, why, and how to cope. Before we can go where we're headed we have to know where we've been, so I invite you to come along with me as I set the time machine for 1996 and take a look at
Now comes the part where the encoding algorithm is invoked. If you can locate your Tardis, you can zip forward to 2009 and read all about this algorithm on TechNet: http://technet.microsoft.com/en-us/library/bb430743.aspx. (If you can find a Tardis, could you please do two things? First, go back in time and find out where I left mine. Second, there's a Dalek invasion that is going to happen last Thanksgiving unless someone prevents it.)
The algorithm works like so (for the click-challenged). Alphanumerics: OK. Slashes get converted to underscores (_). Everything else gets "plus" encoded: a + followed by the two-digit hex value of the character. Finally, the Exchange server's primary domain is appended (so that, hopefully, replies get back to it). So Bob's encapsulated address is
IMCEAMHS-ceobob+7BPicardForever_HUB+7D@contoso.com.
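For the gateway folks who prefer code to prose, here is a minimal Python sketch of the encoding rules as described above. The function name `imcea_encode` and its parameters are made up for illustration. It assumes "alphanumerics" means ASCII letters and digits, and it simply refuses anything that does not fit in two hex digits. The shipping algorithm may treat some characters differently, so check TechNet before relying on it.

```python
def imcea_encode(addr_type: str, address: str, domain: str) -> str:
    """Encapsulate a non-SMTP address following the rules described in the post.

    Hypothetical helper: the name and signature are illustrative only.
    """
    out = []
    for ch in address:
        if ch.isascii() and ch.isalnum():
            # Alphanumerics pass through untouched.
            out.append(ch)
        elif ch == "/":
            # Slashes become underscores.
            out.append("_")
        else:
            code = ord(ch)
            if code > 0xFF:
                # Assumption: the classic format has no way to express this.
                raise ValueError(f"cannot plus-encode {ch!r} in two hex digits")
            # Everything else: '+' and the two-digit hex value.
            out.append(f"+{code:02X}")
    # Finally, append the server's primary domain.
    return f"IMCEA{addr_type}-{''.join(out)}@{domain}"


print(imcea_encode("MHS", "ceobob{PicardForever/HUB}", "contoso.com"))
# IMCEAMHS-ceobob+7BPicardForever_HUB+7D@contoso.com
```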
When his message arrives it immediately generates a firestorm of replies ("A bald man in a jumpsuit? Are you serious?") addressed to Bob's encapsulated address. As these pour into the Exchange Server, it will perform reverse osmosis on the address, "De-encapsulating" it. First it looks for the super secret key, "IMCEA". Check!
Next it looks for a dash that could function as an address type separator. There it is, right after MHS.
So the address type is MHS. Now it goes through, turns underscores back into slashes, and "plus decodes" everything else. The resulting address is reconstituted as "MHS:ceobob{PicardForever/HUB}" and dutifully passed on. Where once a person had to wait for WWIV boards to replicate in order to make an idiot out of themselves electronically, through Address Encapsulation it is now faster and easier. That is what we call "progress."
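The de-encapsulation walk-through above can be sketched in Python roughly as follows. This is a sketch under assumptions, not Exchange's actual code: `imcea_decode` is a hypothetical name, the "IMCEA" check is case-sensitive, and the domain is simply discarded.

```python
import re


def imcea_decode(encapsulated: str) -> str:
    """Reverse the classic encapsulation (hypothetical helper)."""
    # Drop the appended domain; only the local part carries the address.
    local = encapsulated.rsplit("@", 1)[0]
    # First, look for the super secret key.
    if not local.startswith("IMCEA"):
        raise ValueError("not an encapsulated address")
    # Next, the first dash separates the address type from the payload.
    addr_type, dash, payload = local[len("IMCEA"):].partition("-")
    if not dash:
        raise ValueError("no address type separator found")
    # Underscores go back to slashes *before* plus decoding, so that a
    # literal underscore (encoded as +5F) is not mistaken for a slash.
    payload = payload.replace("_", "/")
    # Plus decode: '+' followed by two hex digits becomes one character.
    payload = re.sub(
        r"\+([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        payload,
    )
    return f"{addr_type}:{payload}"
```

Feeding it Bob's encapsulated address from earlier should hand back his original MHS address.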
Back to the Future
Here in 2009, address encapsulation isn't a mystery. It's documented on TechNet. Exchange servers are now tied to the Active Directory, they speak primarily SMTP, and the thought of there being a controversy about who is the better captain is a long distant oddity that people laugh about.
Exchange Server is widely used across the world in all kinds of businesses, all kinds of languages and many, many different address formats, all of this still using the same basic algorithm that first crept up in Exchange 4.0. This is where things that worked fine in the world of a decade ago don't work so well now. You see, Exchange 4.0 was not the world's most localized product. It shipped in four languages, English, German, French and Japanese, and it worked in English. We also supported a number of different character sets, as long as they worked fine when converted directly to raw ASCII. Exchange 2007 lives in a different world. Unicode should work. UTF is the standard format, and generally speaking any place where the code handles a string it had better be Unicode.
There are problems with Unicode addresses and Address Encapsulation. In particular, address encapsulation supports single-byte characters really well, and anything that's not a single-byte character not so well (where by "not so well" I mean "not at all").
Our CEO from the previous example has grown up as well. He no longer argues over things like Star Trek captains. He understands that James Tiberius Kirk would capture an alien enemy and beat the secret of their doomsday device from them with his bare hands. If Picard were to capture an alien enemy it would be to negotiate with, surrender to, or preferably negotiate an unconditional surrender to. No, our CEO now has more mature tastes in entertainment. He mails his video on demand service and they mail him the stream address.
So now when he takes a lunch break, he sends a mail to [VOD: Super Psychic Battle Angel さくら Part 2: ディレクターズエディション]. And he is in for a nasty surprise. See, Sakura's name consists of decidedly non English characters. The Exchange Server will start to encapsulate the address to transfer it to the Video On Demand gateway and things will not go well.
The CEO gets an NDR. It's pretty, in HTML, with helpful diagnostic information and a generous description of what is wrong:
Delivery has failed to these recipients or distribution lists:
Super Psychic Battle Angel さくら Part 2: ディレクターズエディション
The format of the recipient's e-mail address isn't valid. A valid address looks like this: username@contoso.com. Microsoft Exchange will not try to redeliver this message for you. Please check the e-mail address and try sending the message again, or provide the following diagnostic text to your system administrator.
While it's a very nice NDR it contains neither super psychics nor battle angels and you can bet money it's not what he had in mind when he pushed "Send". Furthermore, there's absolutely nothing the CEO can do about it. It's a limitation of the encoding.
It is a limitation of IMCEA encoding that multibyte characters cannot be encoded, so if we want to be able to address them in encoded format, we need an extension, and an extension we have made. Under the new rules, an existing address without multibyte characters will be encoded exactly as before. That's right: IMCEA<Address Space>-<encoded string in + format>.
What about our Battle Angel Video? Well, it definitely has extended characters. And this is ok. Now the conversion system will apply a new encoding: The address will first be converted to UTF8. The UTF8 address will then be plus encoded as needed.
So, [VOD: Super Psychic Battle Angel さくら Part 2: ディレクターズエディション] becomes
IMCEAVOD-UTF8-Super+20Psychic+20Battle+20Angel+20+E3+81+95+E3+81+8F+E3+82+89+20Part+202+3A+20+E3+83+87+E3+82+A3+E3+83+AC+E3+82+AF+E3+82+BF+E3+83+BC+E3+82+BA+E3+82+A8+E3+83+87+E3+82+A3+E3+82+B7+E3+83+A7+E3+83+B3@contoso.com
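Here is a hedged Python sketch of the extended encoding just described. The name `imcea_encode_extended` is hypothetical, and the test for "has multibyte characters" is assumed to mean "contains anything outside ASCII", which may not match the product's exact rule. ASCII-only addresses come out exactly as they did under the old rules.

```python
def imcea_encode_extended(addr_type: str, address: str, domain: str) -> str:
    """Encapsulate an address, using the UTF8- form when needed.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if address.isascii():
        # No extended characters: encode exactly as before.
        marker = ""
        data = address.encode("ascii")
    else:
        # Extended characters: convert to UTF-8 first, then plus encode.
        marker = "UTF8-"
        data = address.encode("utf-8")

    out = []
    for byte in data:
        ch = chr(byte)
        if byte < 0x80 and ch.isalnum():
            out.append(ch)              # ASCII alphanumerics pass through
        elif ch == "/":
            out.append("_")             # slashes become underscores
        else:
            out.append(f"+{byte:02X}")  # every other byte is plus encoded
    return f"IMCEA{addr_type}-{marker}{''.join(out)}@{domain}"
```

Each Japanese character occupies three bytes in UTF-8, which is why every one of them shows up as a run of three +XX groups in the encapsulated address.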
The key addition here is the "UTF8-" key after IMCEA<TYPE>-. One side effect of this is that if you had one off addresses that begin with UTF8-, Exchange Server will treat them as "Special" and attempt to decode them. This means that DLs or recipients whose actual, non encoded non SMTP address is "UTF8-<something>" will NOT work correctly if the address is ever encapsulated.
Exchange Server 2003 does not understand the extended encoding format. It does not create them, nor does it decode them. It could not deliver to them before and it cannot do so now. Exchange Server 2007, on the other hand, will be able to address and deliver to these addresses. Such is the price of change. If you are a gateway developer reading from the pickup directory, you too may encounter these addresses. Fear not. By turning the encoding steps backwards you can in fact decode the address. In other words:
1. Look for "IMCEA" as the start. If it's not there, bail, this isn't an encoded address of either flavor.
2. Everything between IMCEA and "-" is the address type.
3. Plus Decode as normal.
4. Look for "UTF8-". If it's not there, this is a standard IMCEA encoded address. Process (or fail to process) as normal.
5. If it is a UTF-8 encoded encapsulated address, strip the "UTF8-" prefix and then decode the rest of the string as UTF-8.
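The five steps above might look something like this in Python. Treat it as a sketch under assumptions rather than the official decoder: `decode_encapsulated` is a made-up name, it returns `None` when the address is not encapsulated at all, and for the classic flavor it assumes each decoded byte maps straight to one character (Latin-1).

```python
import re

# '+' followed by two hex digits, matched against bytes.
PLUS_GROUP = re.compile(rb"\+([0-9A-Fa-f]{2})")


def decode_encapsulated(address: str):
    """Return (address_type, decoded_address), or None if not IMCEA."""
    local = address.rsplit("@", 1)[0]

    # Step 1: look for "IMCEA" at the start. If it's not there, bail.
    if not local.startswith("IMCEA"):
        return None

    # Step 2: everything between IMCEA and the first "-" is the type.
    addr_type, dash, payload = local[len("IMCEA"):].partition("-")
    if not dash:
        return None

    # Step 3: plus decode as normal, working in bytes so that multi-byte
    # UTF-8 sequences survive intact.
    raw = payload.replace("_", "/").encode("ascii")
    raw = PLUS_GROUP.sub(lambda m: bytes([int(m.group(1), 16)]), raw)

    # Step 4: look for the "UTF8-" key.
    if raw.startswith(b"UTF8-"):
        # Step 5: strip the key and decode the rest as UTF-8.
        return addr_type, raw[len(b"UTF8-"):].decode("utf-8")

    # Standard IMCEA address (assumption: one byte per character).
    return addr_type, raw.decode("latin-1")
```

Note that this follows the steps in the order listed, so an address whose real, unencoded form starts with "UTF8-" gets treated as the extended flavor. That is the same side effect described earlier for DLs and recipients with such addresses.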
There we have it. TechNet will be updated to reflect this extension in time. Mail to your video on demand services, your fax services, even your Star Trek mailing lists. Use English, bad English, and non-English characters. We don't mind. Just don't claim that Picard was a better captain than Kirk. Some things software simply can't accommodate.
- Jason Nelson