Mark Gerow has written a great article on the practical aspects of implementing SharePoint search. Great Reading.

http://www.law.com/jsp/legaltechnology/pubArticleLT.jsp?id=1202426774552#

Here are some of his topics:

WHY SEARCH IS DIFFERENT?

Why can't it just work like my favorite Internet search engine?

How do I know others won't be able to find my secured documents?

Does it search my Inbox? (Yes, it can, btw, with the BA-Insight connector: http://www.ba-insight.com/exchange.html)

Do I have to stop searching the old way? I liked it.

OF PROTOCOL HANDLERS, IFILTERS AND THE BDC

CONCEPTUAL VS. KEYWORD SEARCH

FACETED SEARCH

 

 

Posted by notorioustech | with no comments

I have just discovered a new PDF IFilter from a German company that supports both 32-bit and 64-bit platforms.

http://www.pdflib.com/download/tet-pdf-ifilter/

I ran it through testing on both platforms, could not break it, and was impressed with the speed.

As many of you know, the Adobe IFilters are not 64-bit compatible. There is a "workaround", but I don't recommend it. That leaves you with either Foxit or this new company for 64-bit support.

For 32-bit you can use the Adobe ones for free, but you should know that they process only one PDF at a time and thus are not as fast as the paid versions from the other vendors.

 

UPDATE: Adobe has released a true 64-bit IFilter: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025


After updating one of my QA servers with the latest MOSS security patches from Windows Update, I am seeing some strange results.

The scope counts that are displayed are now login dependent. If the account you are logged into the SSP with doesn't have permission to the files or items that were crawled, those items aren't included in the scope count. This also applies to the counts displayed in the scope rules.
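To make the behavior concrete, here is a toy model of a security-trimmed count. It is only a sketch of the idea, not how MOSS computes scope counts; the `CrawledItem` class, the `readers` field, and the sample URLs and group names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawledItem:
    url: str
    readers: frozenset  # accounts or groups allowed to read the item (hypothetical ACL)


def trimmed_scope_count(items, principals):
    """Count only the items that at least one of the caller's principals can read."""
    caller = frozenset(principals)
    return sum(1 for item in items if item.readers & caller)


# Hypothetical index contents.
index = [
    CrawledItem("http://intranet/docs/handbook.docx", frozenset({"Everyone"})),
    CrawledItem("http://intranet/hr/salaries.xlsx", frozenset({"HR"})),
    CrawledItem("http://intranet/legal/contract.pdf", frozenset({"Legal", "HR"})),
]

print(trimmed_scope_count(index, {"Everyone"}))        # 1
print(trimmed_scope_count(index, {"Everyone", "HR"}))  # 3
```

The same three crawled items produce different counts depending on who is asking, which matches what I am seeing in the SSP.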

I am trying to determine which update introduced this issue. Can anyone shed some light on this, or at least confirm I am not imagining this new feature?

 

UPDATE: It will also apply your real-time security trimmers.

 


So we (sharepointworks) have been getting client requests to develop a 64-bit Lotus Notes connector on our framework. It would be developed in partnership with an IBM Lotus technology partner and thus would be very robust and scalable. More importantly, it would benefit from our advanced security mapping framework to ensure that users, groups and real-time security are honored.

We are now trying to decide if there is enough of a market to do this. I have not had the opportunity to evaluate the current packaged MS offering, and we don't really want to compete with free. But if demand keeps increasing we will add it to our Enterprise offerings (Hummingbird, PCDocs, Worksite, Exchange Mailboxes, SQL Server, OLEDB, Dynamics (in development), Symantec Enterprise Vault (in planning)).

If any of my readers have any insight into the market for this, or if you have an interest in it (or any feature requests), please let me know through this blog. As with all connectors developed on our framework, it will be code complete in a matter of weeks once we start development.

 

 


Yes, I am one of them.

More info: http://sharepointsearch.com/cs/blogs/enterprisesearch/archive/2008/01/08/microsoft-announces-offer-to-acquire-fast-search-amp-transfer.aspx 

I guess I was right. http://sharepointsearch.com/cs/blogs/notorioustech/archive/2007/07/19/fast-search-integration-announcement-could-be-interesting.aspx

 

Until the full nature of the future integration of SharePoint and Fast is announced, we will all be just guessing.

One guess (from a vendor) is that it will become the basis for Office 14, but who knows.

One thing is sure: Microsoft was missing conceptual/semantic (http://en.wikipedia.org/wiki/Latent_semantic_analysis) search capabilities, and now they have them.

Stay tuned.


Don't know how I missed this one before.

After you install the latest Search Server 2008 RC 64-bit (I didn't check the 32-bit), go to Central Administration > Operations > Logging.

By default Collect Error Reports is enabled and so is the checkbox to silently send error reports to Microsoft.

NOTE: it doesn't just say SharePoint error reports and could possibly mean all error reports on the computer.

Here is the text: 

Change this computer's error collection policy to silently send all reports. This changes the computer's error reporting behavior to automatically send reports to Microsoft without prompting users when they log on.

I personally have no problem helping out the development and QA process of Microsoft. I am just not sure all my clients want this on by default.

 


Thanks, Calvin, for the heads up. I am sure many of us have wondered why deleted items were still in the index for so long, and about the painful response times.

http://calvin998.wordpress.com/2007/11/07/findings-on-sharepoint-search-bdc/

After a content source is deleted, the index items of the content source will NOT be deleted immediately. Instead, the SharePoint “Gather” will delete them one by one in the background, and a warning message will be generated for each deleted item! It seems no full crawl can be done before this process is finished (at least no BDC crawling). It will take more than a day to remove 1.5 M records, at about 600 records a minute. Slooooooow! If one just wants to erase all indexed content, one can click on the “Reset all crawled content” link on the search settings page. That is very fast.

There seems to be no way to purge the humongous crawl log. The MS site says it will be deleted after 5 days (to be confirmed).
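For a rough feel of what that deletion rate means, the arithmetic can be sketched as below. The function name is my own, and the steady 600 records a minute is simply the figure quoted above; real rates will vary.

```python
def removal_time_hours(total_items: int, items_per_minute: float) -> float:
    """Estimate the hours needed to delete total_items at a steady rate."""
    if items_per_minute <= 0:
        raise ValueError("items_per_minute must be positive")
    return total_items / items_per_minute / 60


# Figures quoted above: 1.5 M records at about 600 records a minute.
hours = removal_time_hours(1_500_000, 600)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 41.7 hours (~1.7 days)
```

So "more than a day" is, if anything, generous: at that rate the cleanup is closer to two days during which crawling appears to be blocked.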


Did a quick search and I am not the only one suffering from the latest MS security patch to MOSS. It contained a rollup of prior patches, including some search-related ones. It seems that MSSearch.exe now has a slow memory leak when crawling and searching. I can confirm that it wasn't happening last week at a client site. Then I installed the patch, and now I have to either roll back, bug MS Support, or kill the process periodically when it gets too big.

Can anybody else confirm this bug?

 

UPDATE: The problem seems to be directly related to crawl speeds. I was able to reduce the impact by throttling back the crawler using Central Administration > Application Management > Search Service > Crawler Impact Rules.

 


http://weblog.infoworld.com/tcdaily/archives/2007/11/post.html

 


Microsoft just announced the next release of SharePoint Search at the Enterprise Search Summit, which I am attending:

http://blogs.msdn.com/enterprisesearch/archive/2007/11/06/announcing-microsoft-search-server-2008-express.aspx

 

There are many partners already on board creating OpenSearch definition files that will allow integration with the newest release of SharePoint Search.

http://www.microsoft.com/enterprisesearch/connectors/federated.aspx#fscp

What this means is that existing search providers, those with their own query engines and indexes, can now integrate much more easily without having to develop their own custom search interface in SharePoint.

Here is a great, somewhat technical overview: http://msdn2.microsoft.com/en-us/library/bb931080.aspx
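The core idea of an OpenSearch-style definition is a URL template in which the `{searchTerms}` placeholder is replaced with the user's query. Here is a minimal sketch of that substitution. The provider URL and the function name are made up for illustration, and a real federated location definition carries more settings than just the template.

```python
from urllib.parse import quote_plus


def build_federated_url(template: str, search_terms: str) -> str:
    """Fill the {searchTerms} placeholder with a URL-encoded query."""
    return template.replace("{searchTerms}", quote_plus(search_terms))


# Hypothetical provider that returns its results as RSS.
template = "https://search.example.com/results?q={searchTerms}&format=rss"
print(build_federated_url(template, "sharepoint ifilter"))
# https://search.example.com/results?q=sharepoint+ifilter&format=rss
```

Because the provider only has to answer a templated URL with a feed, it keeps its own engine and index and does not need a custom results page inside SharePoint.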

We will have more updates on SHAREPOINTSearch.com soon.


It has been on my agenda for quite a while to do formal product reviews and ratings of all the search products listed on the SHAREPOINTSearch.com site. Unfortunately, I haven't had much time available. We did receive a great demo from the CEO of Interse on their iBox product. THANKS. That review will be coming soon.

I am hoping to be able to schedule time with some of the vendors at the Enterprise Search Summit on either Monday or Tuesday to start the process.

If you are a vendor and will have a representative there who can demo your product(s) to me, and you would like a formal review done, please contact me through this blog and I will reply with my contact information.


http://www.enterprisesearchsummit.com/west/ 

The last one in NY was fun and very interesting. See my post. I got to meet all of the Enterprise Search vendors, not just the SharePoint-focused ones.

Anyways, I am planning on being there the 5th - 7th, so if any readers out there are also going to be there, let me know. I would love to meet up for a beer.

Note: I will be bringing sharepointsearch.com toys to hand out :)

 


So by now you must have heard through the grapevine that Mondosoft was going to be liquidated by its VCs because of insolvency.

Well, yesterday the company was purchased by Surfray (www.surfray.com), which is a Mondosearch competitor.

It appears that they have no interest in the Ontolica product and are considering selling it off.

If you do a search on Google for mondosoft and surfray you will find a lot of Danish articles about this. What is interesting is that it appears that in March of this year Surfray attempted a hostile takeover of Mondosoft by going directly to its shareholders and stating: "Mondosoft ist eine Firma, die in einer Krise steckt aus der wir Ihnen helfen können," sagt SurfRay´s CEO Martin Veise zur dänischen IT Zeitschrift Computerworld. Roughly translated: "Mondosoft is a company in a crisis, out of which we can help you," says SurfRay's CEO Martin Veise to the Danish IT magazine Computerworld.

And how did we all miss this? For me at least, it's because I don't speak Danish.

I hope it all works out for the Ontolica product and especially for Lars (who is a friend).

How many SIs, search consultants (yes, me too), vendors and even Microsoft have recommended the Ontolica product to their customers? I suppose if the current installation works....

 

 


I have used the MS Exchange content source to crawl both public folders and private mailboxes and folders.

Here is what I found out: it works great, except that the search results are security-constrained to the crawling account. Only the account that was used to crawl the folders has permission to search for those items. This means that, to get this working, you have to make your public folders truly public by allowing anonymous access, which opens a security hole on your Exchange server. It also does not crawl more than one subfolder deep, so if you have a multilevel folder structure you will have to add the folders one by one to the content sources. NOTE: A patch has been released that you can get from MS Support. I didn't try it, but it may address some of these issues.
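If you do end up adding folders one by one, generating the list beats typing it. The sketch below flattens a nested folder tree into one start address per folder. The base URL, the folder names, and the idea of describing the tree as a nested dict are all assumptions for illustration; you would build the tree from your own Exchange folder listing.

```python
def flatten_folders(tree, base):
    """Return one address per folder, walking the nested tree depth-first."""
    addresses = []
    for name, children in tree.items():
        path = f"{base}/{name}"
        addresses.append(path)
        addresses.extend(flatten_folders(children, path))
    return addresses


# Hypothetical public folder hierarchy: folder name -> subfolders.
folders = {
    "Projects": {"2007": {"Q4": {}}, "2008": {}},
    "HR": {},
}

for address in flatten_folders(folders, "http://exchange.example.com/public"):
    print(address)
```

Each printed line is a candidate start address to paste into the content source, so a deep hierarchy no longer depends on the crawler following subfolders.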

But the official MS solution for including emails in your searches is to run Windows Desktop Search and use that as your search interface. It can also include SharePoint Search results in the same interface.

Neither of these approaches was going to work in reality for our customers, and instead of turning to external search engines like X1, Fast or Autonomy, we chose to develop our own solution to the problem. I usually don't talk about my own products in my blog, but I think there is such a need for this that I will do so ahead of our formal announcement.

We are now beta testing a SharePoint search connector for Exchange (http://www.sharepointworks.com/pages/escexchange.aspx), which allows all of your private mailbox content to be included in your SharePoint index (up to the 50 million item limit) and provides a true Enterprise Search experience. Security is fully adhered to, and you can configure master accounts (for discovery purposes) that have search access to all mailboxes if you choose. Besides serving as a replacement for Desktop Search, it also provides very useful help desk functionality for searching history across multiple mailboxes (thanks, Martin, for the idea). We took a unique approach to crawling the Exchange content: a buffer sits between SharePoint and the Exchange servers to minimize the impact on production systems and to allow more frequent updates to the search index (even with 50 million emails, incrementals can be done hourly). Contact me for more information or if you would like to participate in trials of this product.


You can find some great resources about search relevancy in our Big Resource List

http://sharepointsearch.com/pages/bigresourcelist.aspx?category=&focus=Relevancy&class=&member=

In particular check out this article

Evaluating and Customizing Search Relevance

It is written by Dmitriy Meyerzon, Avi Schmueli and Jo-Anne West

- I have met Dmitriy and Avi at a MS Search workshop and they are THE MS Search guys to listen to.

So after you have read what they say about improving Relevancy, download this free tool I just uploaded to allow you to manipulate the weights and parameters they talk about.

http://sharepointsearch.com/cs/files/folders/searchtools/entry2527.aspx

Addendum:

Read these forum discussions. They are very enlightening about search relevancy: http://sharepointsearch.com/cs/forums/t/2397.aspx , http://sharepointsearch.com/cs/forums/t/2546.aspx

 

- Cheers.
