Exchange Server 2007 uses new transport logs file formats that are easy to parse with one of our favorite Swiss Army knife tools, "Log Parser". Log Parser (also referred to as LP in the rest of this blog post) is commonly used for analyzing IIS log files or Windows Event logs. This first post will get you started on how to use Log Parser in the context of analyzing Exchange 2007 transport logs. The second part will lead you in-depth on how extract the essence of the log as it provides a lot of valuable information.
I have tried to write this post in a way that you can see how we came to final queries that we wanted to use, so we might walk through several versions of queries and show what different options we are adding.
To get started first download and install Log Parser from the following link:
I mentioned looking at IIS logs... if you have IIS installed, from a command line you can run:
logparser "select * from <1>" -o:DATAGRID
This query will return all fields from your IIS logs into a data grid where IIS SiteID=1. As you can see LP allows you to parse IIS logs using common SQL commands such as: select, where, group by, order by, etc.
The "logparser -h" switch returns the main help information. Below are the most useful help commands:
logparser-h FUNC - returns all LP functions
logparser-h GRAM - returns all LP grammar keywords
The beauty of this tool is that it can parse virtually any logs. As Exchange 2007 was released after Log Parser shipped we will use the CSV importer for our parsing.
So if you type "logparser-h -i:CSV" LP will return all additional options for the CSV input format.
-i:CSV - specify the CSV input parser of LP as there is no current Exchange 2007 LP Input parser. The CSV allows specifying the separator and in the case of the Exchange 2007 logs it is comma separated.
Note: The Exchange Management Shell extensions options of Set-TransportServer allow you to control the log maximum age time and size in order to control the used disk size. For more information you may refer to: SendProtocolLogMaxAge, ReceiveProtocolLogMaxAge, SendProtocolLogMaxDirectorySize, ReceiveProtocolLogMaxDirectorySize.
If you open an Exchange Transport Log file the headers should look similar to this:
As you can see above (click to see full size), LP returns well formatted data. Looking at that - y
ou might have asked yourself: "How do I select the date and time" as it is labeled by LP as "#Fields: date-time"; the answer is by using brackets as bellow:
logparser "select [#Fields: date-time] from re*.log" -i:CSV -nSkipLines:4 -o:DATAGRID
logparser "select [#Fields: date-time] as date-time from re*.log" -i:CSV -nSkipLines:4 -o:DATAGRID
Brackets are required because LP does not recognize natively the Exchange 2007 file format because it is processed as CSV, but using the [field] syntax allows you to access the right information.
First interesting thing to note is that we are getting remote IP name with the remote port number so if we want to get the full FQDN, we need to extract the prefix information delimited by ':' by doing EXTRACT_PREFIX(remote-endpoint,0,':') and then get the revered IP of the remote sender by using the REVERSEDNS LP function. For example:
logparser "select REVERSEDNS(EXTRACT_PREFIX(remote-endpoint,0,':')) from RECV*.log" -i:CSV -nSkipLines:4 -o:DATAGRID
Now let's say we want to know all the senders that are submitting the most mail to our system. We need to group by RemoteSendingHost that is a reversed IP and for our convenience group by and order in descending order:
logparser "select REVERSEDNS(EXTRACT_PREFIX(remote-endpoint,0,':')) as RemoteSendingHost, count(*) as Hits from RECV*.log group by RemoteSending
Host order by Hits DESC" -i:CSV -nSkipLines:4 -o:DATAGRID
Note: This query might take a while, depending on the size of your logs and the variety of remote hosts that are trying to send you mail. This is because doing a reverse DNS query is time costly. Especially as spammers often do not provide valid reverse name resolution and LP will wait for a DNS time-out. For this to work properly, the machine running the query must have full Internet name resolution capability.
In part 2 of the series, I will cover more advanced techniques and how to use Office Web Components with Log Parser.
I would like to thank Jeffrey A. Johnson and Nino Bilic for their help in writing, testing and reviewing this!