Dan...
#1 - No, you can set the /3GB switch and have the PAE kernel loaded together. However, both of these squeeze kernel memory, which is why it's important to tune the server appropriately. The PAE kernel uses wider page table entries (PTEs) so that it can accommodate physical memory above the 4GB boundary. As an aside, these wider PTEs are also needed to support the Data Execution Prevention (DEP) feature in Windows 2003 SP1. The /3GB switch changes how the 4GB of virtual address space is allocated for each process: instead of each app getting 2GB of user and 2GB of kernel space, the app gets 3GB of user and only 1GB of kernel space. Because Exchange uses a single store.exe process, it requires the larger user address space to scale. You can think of the PAE kernel as changing physical memory allocation, and the /3GB switch as changing virtual memory allocation.
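To make the /3GB arithmetic concrete, here is a minimal sketch in Python. It only models the split of a 32-bit process's 4GB virtual address space as described above; the function name address_space_split is made up for illustration, and the code does not read or change any real system setting.

```python
GB = 1024 ** 3
TOTAL_VIRTUAL = 4 * GB  # 2**32 bytes addressable by a 32-bit process


def address_space_split(three_gb_switch: bool) -> dict:
    """Return the user/kernel split of the 4GB virtual address space, in bytes.

    Hypothetical helper: it models only the 2GB/2GB vs. 3GB/1GB split.
    """
    user = 3 * GB if three_gb_switch else 2 * GB
    return {"user": user, "kernel": TOTAL_VIRTUAL - user}


for switch in (False, True):
    split = address_space_split(switch)
    print(f"/3GB={switch}: user={split['user'] // GB}GB, "
          f"kernel={split['kernel'] // GB}GB")
```

Note that PAE does not appear anywhere in this calculation: it affects how much physical memory the OS can address, not the size of each process's virtual address space.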
#2 - Great question. As a general rule, we recommend leaving the cache size at its default of ~900MB. While it's true that a scale-up server will perform better with a larger cache size (e.g. 1.2GB), you run the risk of depleting the virtual address space, which can cause a service outage. As with most tuning parameters, you have to balance performance vs. stability. You are correct that ExBPA doesn't provide a concrete recommendation right now. This is mainly because there continue to be different schools of thought on tuning this parameter. Here at Microsoft, we're running one large server with a 1.2GB cache (>4000 users), whereas all other Exchange servers are running with the default cache size. This one server is running closer 'to the wire' in terms of stability, and we have seen one instance where the lack of virtual memory caused an outage. NOTE: Just because you have 4GB of physical RAM in a server, it doesn't mean that it's safe to increase cache settings. At the end of the day, you'll run out of virtual memory in the process before physical memory. The amount of virtual memory is constrained by the 32-bit OS, which is a big reason why we're looking to 64-bit with E12.
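As a rough illustration of the trade-off, the sketch below subtracts the cache size from the user address space of the process. The assumptions are mine: a 3GB user address space (the /3GB case), the default cache taken as 900MB, and 1.2GB rounded to 1200MB. The helper remaining_after_cache is hypothetical, and the model ignores everything else about how store.exe actually uses its address space, so treat the numbers as an upper bound on headroom rather than a sizing guide.

```python
USER_SPACE_MB = 3 * 1024  # assumed user address space with /3GB, in MB


def remaining_after_cache(cache_mb: int) -> int:
    """Virtual address space (MB) left for everything other than the cache.

    Hypothetical helper: a simple subtraction, not a model of real usage.
    """
    return USER_SPACE_MB - cache_mb


# Cache sizes in MB; 1.2GB is approximated as 1200MB for this sketch.
for label, cache_mb in (("default (~900MB)", 900), ("1.2GB", 1200)):
    left = remaining_after_cache(cache_mb)
    print(f"cache {label}: {left}MB of virtual address space left")
```

The amount of physical RAM never enters the calculation, which is the point of the NOTE above: the limit being hit is the virtual address space of the process.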