In Exchange Server 2007 SP1, the EWS team added a new feature called EWS Request Proxying. It is sometimes also referred to as EWS Proxy or HTTP Proxy. Not much has been said about this feature, but if you care about performance, it is important to understand why it is there, why you want to avoid it if at all possible, and if you cannot avoid it, why it is important to have.
What is it?
EWS Request Proxying only comes into play in Exchange installations that have multiple Active Directory sites. If you have a single AD site, you can stop reading this post right now. Now that our readership has become smaller, let's continue.
Active Directory sites are typically created to define a boundary of highly connected computers and devices. This implies that accessing resources cross-site is more (potentially much more) expensive than accessing resources within a single site. EWS lives on a Client Access Server (CAS) within a specific site. If the mailbox that EWS is trying to communicate with resides on a mailbox server in a different Active Directory site, the ensuing RPC calls between EWS and the mailbox server will be made across a potentially expensive cross-site link. Depending on the EWS operation being performed, there may be *many* resulting cross-site RPC calls.
To combat this potentially expensive RPC chattiness, when EWS encounters a request that needs to be serviced in another AD site, it packages the entire request up and makes an EWS request to a CAS in the destination site rather than face cross-site RPC calls. The initial CAS box is proxying the inbound HTTP request to a more appropriate CAS, which is why you will sometimes hear this feature referred to as "HTTP Proxy".
In Exchange 2007 SP1, EWS determines the "best CAS" to proxy to by scoring all the items in the request and determining which site can service the most items. The downside is that IF your request needs to access mailboxes in multiple AD sites, you WILL encounter cross-site RPC calls, since EWS does not "split up" requests.
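To make the "most items wins" idea concrete, here is a minimal Python sketch. It is not the actual EWS scoring code, which this post does not spell out. It assumes each item in a request has already been mapped to the AD site of its mailbox, and the function names and site names are hypothetical.

```python
from collections import Counter


def pick_destination_site(item_sites):
    """Return the site that can serve the most items in one request.

    item_sites: one (hypothetical) AD site name per item in the request.
    Ties go to the site seen first, an arbitrary choice for this sketch.
    """
    if not item_sites:
        raise ValueError("request contains no items")
    counts = Counter(item_sites)
    # most_common(1) returns the (site, count) pair with the highest count.
    site, _ = counts.most_common(1)[0]
    return site


def cross_site_items(item_sites):
    """Count the items that still need cross-site RPCs after proxying."""
    best = pick_destination_site(item_sites)
    return sum(1 for site in item_sites if site != best)
```

For a request touching two mailboxes in one site and one in another, the request is proxied to the first site and the remaining item still pays the cross-site cost, which is exactly the downside described above.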
Although the EWS service is generally stateless, some state is maintained to reduce the number of Active Directory and mailbox lookups that EWS has to do when servicing requests. Continued requests from a given user will be quicker and cheaper on a CAS that is maintaining such state. Although any of the CAS servers in the Active Directory site *could* service the request, it is preferable to maintain affinity between the caller and the CAS so that this cached state can be taken advantage of. I will call this "nice affinity" given that it is not a necessity, but rather something that should be preferred if at all possible.
In contrast to "nice affinity", there are certain web methods such as GetEvents and Unsubscribe that have very strong CAS affinity, given that subscriptions reside in memory on the CAS box where they are created. The Subscribe call itself is governed by "nice affinity". However, GetEvents and Unsubscribe calls for the created subscription must be serviced by that specific CAS box. I will call this "necessary affinity".
When CAS servers sit behind a Network Load Balancer (NLB), it is the NLB that determines which CAS will receive a proxied request that is sent to the NLB's URL. Due to "nice affinity", it is preferable that all requests from a given user be proxied to the same destination CAS box so that the cached information can be taken advantage of. If EWS cannot proxy the call to the CAS maintaining the "nice affinity" caches, all is not lost: EWS can just choose another CAS in the site, which will cause that second CAS to look up and cache the same information. As a result, EWS takes some state from the inbound call and caller and hashes it across all the available CAS boxes in the destination site.
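As an illustration only, hashing a caller across the available CAS boxes might look like the Python sketch below. The post does not say which state EWS hashes or which hash function it uses, so the caller key, the SHA-256 digest, and the modulo step are all assumptions made for this example, and the function name is hypothetical.

```python
import hashlib


def choose_cas(caller_key, cas_urls):
    """Map a caller to one CAS box in a repeatable way.

    caller_key: any stable string identifying the caller (an assumption).
    cas_urls: direct URLs of the CAS boxes in the destination site.
    """
    if not cas_urls:
        raise ValueError("no CAS boxes available in the destination site")
    # Sort so the result does not depend on the order the list arrives in.
    ordered = sorted(cas_urls)
    digest = hashlib.sha256(caller_key.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]
```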
Assuming that the topology in the destination site has not changed, a given caller will always be routed to the same CAS box.
In contrast, if a call has "necessary affinity", EWS cannot fall back to another CAS box, but *must* proxy the request to the applicable CAS box or fail the request. In this case, EWS does not hash the request across CAS boxes, but rather examines the subscription Id directly to determine which CAS box to proxy to. As you can see, whether for "nice" or "necessary" affinity purposes, EWS must be able to bypass the NLB and send proxied requests to the CAS boxes directly.
As part of the EWS proxy feature, the *-WebServicesVirtualDirectory PowerShell cmdlets include a new parameter called InternalNLBBypassUrl, which allows you to specify the direct URL to get to a CAS box. When a CAS is configured as part of an NLB, the ExternalUrl or InternalUrl (depending on the location of the NLB) is set to the URL of the NLB. However, as we have seen, EWS must not use this address for proxying and instead relies on the InternalNLBBypassUrl. When an Exchange CAS box is installed, the InternalUrl and InternalNLBBypassUrl are both set to the same value: the direct URL for that machine. It is important that IF a CAS is put behind an NLB, you do NOT modify the InternalNLBBypassUrl for that virtual directory to point to the NLB.
EWS will only proxy requests to CAS servers that have the InternalNLBBypassUrl set to a non-null value. This also means that if you do *not* want to allow servers to proxy to each other, you can turn off proxying altogether by setting the InternalNLBBypassUrl of all your CAS servers to null.
If you want to see all the InternalNLBBypassUrls for web service virtual directories in your topology, run the following from the Exchange Management Shell:
When EWS proxies a request from an initial to a destination CAS box, the thread on the initial CAS box is blocked for up to a minute until the request to the destination CAS box returns. As such, you will be using resources on *two* CAS servers in order to fulfill your request. In addition, due to the additional routing, your request will take longer to process than if you talked to the destination CAS box directly.
So how can you talk directly to the destination CAS box? You should call Autodiscover *first* to obtain the correct EWS URL that should be used for the mailbox in question. You do Autodiscover, right? In fact, if you are dealing with multiple mailboxes, you should call Autodiscover once for *each* mailbox you are trying to access. Of course, it is reasonable to cache this information on the client side. However, note that due to mailbox moves and migrations, such cached information may go stale, so don't hold onto it too long.
So when *should* you re-autodiscover? As a general practice, we recommend calling Autodiscover once per day. In addition, if you encounter any of the following failure response codes, you should re-autodiscover:
When you cannot avoid it
There are times, however, when you cannot avoid HTTP proxy. If the destination site has NO externally facing CAS boxes and your client is outside of the corporate network, the only way to access a mailbox on the internal-only site is to have it routed through an externally facing CAS box in another site. And this is precisely why HTTP proxy was added: to allow external access to resources on internal-facing sites. Even though the mailbox you are trying to access exists in an Active Directory site with no externally facing CAS boxes, you should *still* call Autodiscover to determine the EWS Url to send requests to. Why? Because if there is no externally facing CAS box for the site in question, Autodiscover returns the closest and cheapest external CAS.
Rules to Live By
Autodiscover every distinct mailbox you are going to access and cache the EWS Url for a limited amount of time.
Group batched requests by EWS Url. In other words, if two mailboxes return different EWS Urls, do not include both in a single request, as it will likely result in cross-site RPCs.
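On the client side, the two rules above could be sketched roughly as follows in Python. This is only an outline under stated assumptions: `discover` is a hypothetical callable that performs the real Autodiscover lookup for one mailbox and returns its EWS Url, the class and function names are made up for this example, and the one-day default simply mirrors the once-per-day guidance given earlier.

```python
import time
from collections import defaultdict


class EwsUrlCache:
    """Cache of mailbox -> EWS Url with a time limit on each entry."""

    def __init__(self, discover, max_age_seconds=24 * 60 * 60,
                 clock=time.monotonic):
        self._discover = discover   # hypothetical: mailbox -> EWS Url
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}          # mailbox -> (url, time stored)

    def get(self, mailbox):
        """Return the cached Url, re-running discovery once it is too old."""
        key = mailbox.lower()
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._max_age:
            return entry[0]
        url = self._discover(mailbox)
        self._entries[key] = (url, now)
        return url

    def invalidate(self, mailbox):
        """Drop an entry, e.g. after a failure that suggests a mailbox move."""
        self._entries.pop(mailbox.lower(), None)


def group_by_ews_url(mailboxes, cache):
    """Group mailboxes so each batch only targets a single EWS Url."""
    groups = defaultdict(list)
    for mailbox in mailboxes:
        groups[cache.get(mailbox)].append(mailbox)
    return dict(groups)
```

Each group returned by `group_by_ews_url` can then be sent as its own batched request to the Url it is keyed by, and `invalidate` gives you a hook for re-autodiscovering a single mailbox when a request fails.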