I haven't blogged for the last couple of days, both because I wanted my desktop search testing to progress before reporting back and because I've just finished an article on rootkits for Windows IT Pro Magazine's June issue. If you're not familiar with rootkits, check out RootkitRevealer, a tool that Bryce and I recently released that tries to detect them. Its home page gives a brief description of rootkits.
The June issue of Windows IT Pro Magazine is going to result in a nice synergy for me because it will be released at TechEd US in Orlando, where I'll be delivering a breakout session that discusses rootkits, "Understanding and Fighting Malware: Viruses, Spyware, and Rootkits". If you're going to TechEd you might want to check out Dave Solomon (my coauthor on Windows Internals and Inside Windows 2000) and me copresenting a preconference tutorial on advanced Windows troubleshooting. The precon is a lot of fun with a lot of practical information and we highlight some of the advanced uses of a number of Sysinternals tools, including Process Explorer, Filemon and Regmon.
Picking up on my desktop search testing, shortly after I learned that Google Desktop Search (GDS) only does partial document indexing, I uninstalled it and decided to give Microsoft a chance by installing MSN Desktop Search, which is also in beta. My experiences with it have been a little better, though Google did a much better job of telling me where it was during the initial indexing process. Even finding MSN Desktop Search's status window is difficult. I expected to find it on the context menu associated with the deskbar search window that you can have it display in the task bar, but it’s only accessible through the context menu of its tray icon. Well, I was going to tell you the current status of the indexing on my laptop, where I installed MSN Search this morning, but it looks like they've got a bug in their context menu: it appears, but is only partially drawn and I can't interact with it. Anyway, on my desktop system it said for hours that it had indexed 16575 items with 0 left to index, even though a search for text I knew to be in e-mails failed, indicating that it hadn't indexed Outlook yet.
Desktop searches with MSN Search are working great, though, and it's finding the text within the PowerPoint documents that GDS couldn't. MSN Search's documentation states that it will index only the first 1 MB of documents, which is annoying, but most (not all) of mine are smaller than that. There's no reason for Microsoft and Google to leave out advanced options for adjusting limits like that. Speaking of advanced options, MSN Search also has an easily accessible index for selecting what you want to index (someone posted a comment to my previous blog posting explaining that GDS has a way, which in my short usage I didn't come across, to specify what you want to exclude, which would require more work for me).
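To make the effect of a size cap concrete, here is a minimal sketch of a toy indexer that reads only the first part of each document. It is not how MSN Search or GDS is implemented; the function names, the sample documents, and the 1,000,000-byte default are assumptions chosen purely for illustration.

```python
import re

def build_index(documents, max_bytes=1_000_000):
    """Map each lowercase word to the names of the documents containing it,
    looking only at the first max_bytes bytes of every document.
    (Hypothetical toy indexer, not any real product's algorithm.)"""
    index = {}
    for name, text in documents.items():
        # Truncate the raw bytes, then drop any character cut in half.
        head = text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
        for word in re.findall(r"\w+", head.lower()):
            index.setdefault(word, set()).add(name)
    return index

def search(index, word):
    """Return the sorted names of documents in which the word was indexed."""
    return sorted(index.get(word.lower(), set()))

docs = {
    "small.txt": "rootkit notes",
    # About 1.4 MB of filler, so "rootkit" sits past the cap.
    "big.txt": "filler " * 200_000 + "rootkit",
}
index = build_index(docs)
print(search(index, "rootkit"))  # big.txt is missing from the result
print(search(index, "filler"))
```

The search reports no hit in big.txt even though the word is there, which is exactly the kind of silent miss that makes an undocumented or fixed limit a problem.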
Another reader pointed out that this month's issue of PC Magazine has a comparative review of desktop search engines that you can read on-line at http://www.pcmag.com/article2/0,1759,1771684,00.asp. They chose Yahoo Search as their winner, but then again, they don't mention either the GDS or MSN Search document size limitations, so I'll see if MSN Search works for me before trying another engine that might have other gotchas.
As a side note, as I was trying to open MSN Search's tray icon context menu I was reminded of just how annoying it is when the “hide inactive icons” task bar configuration option is selected. I’ve already switched the option off on my desktop system and now have done so on my laptop. When I’ve tried to interact with a hidden icon by clicking the open button on the tray I’ve usually had to race to click on the icon before the tray closes up on me. Hiding the inactive icons is nice visually, but the implementation is flawed.
MSN Desktop Search won't work for me and, I suspect, a lot of other folks, because there is no IFilter for Firefox or Thunderbird files. In addition, the only IFilter I could find for OpenOffice files costs $50.
3/16/2005 7:07:00 PM by Anonymous
X1 (and therefore Yahoo, I presume) will index content in files of up to 2GB. But it's a user-definable global (files and email attachments) setting, with a default of 30MB.
dtSearch has a 2GB limit too, but no user-set max. However, it uses a 'report' display rather than 'preview' to manage resource issues on displaying big text files (by default more than 16MB, but user-configurable). Big binary files (e.g. databases) can be set to be broken into bite-size chunks (though it will automatically cope with popular formats like MBOX or Eudora MBX). And specific big files/types can be set to be ignored.
Still, there's really only one way to find a glove that fits ...
3/17/2005 12:59:00 AM by Milly
I've been using MSN Desktop Search since the beginning. It rocks since I installed the IFilters from http://www.ifilter.org/Links.htm and http://www.citeknet.com.
MS should have included them by default with their setup I reckon.
3/17/2005 2:35:00 AM by Xi
Those developers should be heroes. Without the free IFilters available on the web, MSN Desktop Search would be completely useless. I find Microsoft is always lucky to be able to rely on third parties to do the dirty work.
3/17/2005 3:17:00 AM by Gert Van Rousel
I too just analyzed several: Copernic, Yahoo! and Google, plus Microsoft without the Desktop Search. (Just indexer and search.) My results are in my blog at http://www.livejournal.com/users/bloggit/3764.html
I was surprised at the results. Google is pretty lousy, though it does a good job on Thunderbird. Yahoo! doesn't support Thunderbird and hangs a lot, but otherwise is okay. Copernic handles Thunderbird and most other stuff well, but has no zip file support. And Microsoft (non-Desktop) does just about as well. Amazing. I'll give MSN Desktop a try soon.
3/21/2005 12:46:00 PM by Anonymous
Mark, you can choose which icons are displayed and which stay hidden.
Right-click on Start, then Properties, Taskbar, select hide icons, Customize:
there you can select which you want to stay and which you want to hide. That is, if Explorer does not lock up on you when trying to change it...
You have some great tips and info,
thanks, Doug
4/12/2005 4:02:00 AM by Anonymous
"MSN Desktop Search won't work for me... because there is no IFilter for Firefox or Thunderbird files. ...the only IFilter I could find for OpenOffice files costs $50."
There are some difficulties in making an IFilter for programs like Firefox and Thunderbird, which store many discrete chunks of data (bookmarks, messages, history items) in a single file. It's doable, it just hasn't been done yet. For more information, see this comment on Microsoft's Channel9 wiki. The poster is talking specifically about the mbox mailbox format used by Thunderbird (and Eudora and others, too).
As for the OpenOffice.org IFilter, this one is now available for free to non-commercial users.
5/18/2005 4:56:00 PM by Adam Messinger
I can't use MSN Desktop Search because it won't index Outlook 2000 pst's. Stupid. And no--I'm not planning to take Microsoft's suggestion and switch to C/W mode in order to get it to work. Instead, I would suggest that Microsoft get to work and get their program to work.
6/10/2005 1:59:00 PM by Anonymous
I appreciate your blog, Mark, as well as the other comments. This is the first time I've heard about the file size limitation being a *feature. I recently noticed by accident that Google Desktop Search was not finding things past, say, page 10 of a 200 page document. That upset me as I naively thought that, if GDS tells me it's not there, it's not there.
Could you or someone else comment on 1) what the default size limitations are for the big desktop search players;
2) which desktop search apps allow you to modify this default;
3) how these defaults are modified.
Just to start the ball rolling, I see Copernic has a default of 50 MB, and allows end users to modify this in their Options. I can't find the equivalent for GDS or MS-WDS. I would appreciate any feedback and/or tips to other websites that discuss this. Thanks.
11/27/2005 2:57:00 AM by Anderbek
P.S. More digging... I notice in GDS support they say, "However, if you're searching for a word within the file, please note that Google Desktop searches only about the first 10,000 words." It doesn't seem like there is a way to modify this setting. I've also looked at the plugins TweakGDS and Google Desktop Extreme but no luck.
I haven't found any information about file size limitations for MS's Windows Desktop Search or how to modify them. Although I will say that I tried searching for a keyword at the end of a 644-page document (21 MB file) and WDS was able to find it.
11/27/2005 4:45:00 AM by Anderbek
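The end-of-document test described above can be made more systematic. The following is only a sketch of one possible approach, not a feature of any of the tools mentioned: it generates filler text with a unique marker word at regular intervals, so that searching for each marker in turn shows roughly where an indexer stops reading. The function names, the marker format, and the word counts are all made up for illustration.

```python
def make_probe_text(total_words, marker_every):
    """Return (text, markers): filler text with a unique marker word
    inserted after every `marker_every` filler words.
    (Hypothetical helper; names and marker format are illustrative.)"""
    words, markers = [], []
    for i in range(1, total_words + 1):
        words.append("lorem")
        if i % marker_every == 0:
            marker = f"probemarker{i}"
            words.append(marker)
            markers.append(marker)
    return " ".join(words), markers

def first_missing(markers, found):
    """Given the set of markers the search tool did find, return the first
    marker it missed, or None if every marker was found."""
    for marker in markers:
        if marker not in found:
            return marker
    return None

text, markers = make_probe_text(total_words=50_000, marker_every=5_000)
print(markers)

# Save `text` to a file in an indexed folder, let the indexer finish, then
# search for each marker by hand and record the hits, for example:
found = {"probemarker5000", "probemarker10000"}
print(first_missing(markers, found))
```

If the first missing marker is, say, the one after 15,000 filler words, the indexing cutoff lies somewhere between the previous marker and that one; a smaller `marker_every` narrows it down.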
The initial indexing only starts/continues after 30 seconds idle time.
It stops directly again upon user action (like checking the completion percentage :)
12/14/2005 3:13:00 AM by glenn_dm
Does anyone know if the size limit is internal to the GDS/MSN engines or the filters? I.e., can one use a PDF IFilter with GDS to get the complete file?
5/8/2006 1:54:00 PM by FlyingRat
This is how GDS limits the pdf file size when pdftotext is called from GoogleDesktopCrawl.exe (captured with procexp.exe):
"..\Google\Google Desktop Search\pdftotext.exe" -enc UTF-8 -htmlmeta -layout -l 20 infile.. outfile.."
Notice the "-l 20", which means only bring out the first 20 pages!
FlyingRat (l a r s @ t w i n o x . s e)
5/8/2006 2:06:00 PM by FlyingRat
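For comparison, here is a small sketch that builds the same pdftotext command line with and without the page limit. It assumes a standalone pdftotext on the PATH and uses only the flags visible in the captured command; the file names and the helper function are hypothetical, and nothing here changes what GDS itself does.

```python
import shlex

def pdftotext_args(infile, outfile, last_page=None, exe="pdftotext"):
    """Build a pdftotext argument list using the flags from the captured
    command. `-l N` stops conversion at page N; omitting it converts the
    whole document. (Helper name and defaults are illustrative.)"""
    args = [exe, "-enc", "UTF-8", "-htmlmeta", "-layout"]
    if last_page is not None:
        args += ["-l", str(last_page)]
    args += [infile, outfile]
    return args

limited = pdftotext_args("report.pdf", "report.html", last_page=20)
full = pdftotext_args("report.pdf", "report.html")
print(shlex.join(limited))
print(shlex.join(full))
```

Passing either list to `subprocess.run(args, check=True)` would run the conversion, provided pdftotext is installed; comparing the sizes of the two output files shows how much text the 20-page cap leaves out.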
I wish reviewers of desktop search products would perform more in-depth analysis and expose all kinds of limitations to the public.
FlyingRat.
5/8/2006 2:13:00 PM by Anonymous
MSN Desktop Search rocks. I tried Google Desktop Search and it basically freezes my PC every time I try to index my PC. MSN Desktop Search rocks! It does its work perfectly fine. No FREEZING!!!
5/13/2006 10:07:00 PM by Anonymous