Zac Smith
SharePoint, WSS and MOSS development.

SharePoint Search for Public Websites

by Zac Smith 18-Aug-09, 4 Comments

Configuring search on a public facing Web Content Management (WCM) site is quite a different task compared with your typical SharePoint intranet. Searching over internal content largely works out of the box; setting up a few content sources and basic scopes is usually enough to satisfy most users.

With a public website we want a simpler, more Bing- or Google-like search experience. The user types a basic keyword phrase, and the power of the search resides in how the content is indexed. We do not want to rely on a user’s ability to construct complicated search terms. The goal is that everybody can use it, and use it effectively.

What follows is a basic guide to setting up SharePoint search on an anonymously accessed SharePoint publishing site. It assumes some experience configuring search; if you don't have any, take a look at this TechNet webcast on installing and configuring search in SharePoint Server 2007.

Creating Scopes

Creating scopes is the most important step in configuring public search. There are usually a number of resource files such as CSS, JavaScript, XSL and images, as well as objects like user profiles, that you wouldn’t want showing up in your search results. However, we do want to be able to search over all of our document libraries, including aspx pages. So our first step is to create a scope that returns all pages and documents, which we can create like this:

[Figure: Public Search Scope Example]

A search using this scope will return anything that is in the content source “Local Office SharePoint Server sites” AND (the content is a publishing page OR the content is a document). Note the brackets used in this statement.

As you can see the rule behaviour is being used to create logical conditions. The logic of the rules can be applied as follows:

  • Include = OR
  • Require = AND
  • Exclude = AND NOT

The ‘contentclass’ property specifies what type the indexed item is and will be automatically available for any content item in SharePoint. The two types that we are usually concerned with in a public site are:

  • STS_ListItem_850 (Publishing Pages)
  • STS_ListItem_DocumentLibrary (Documents)

Check out this post from Dan Attis for a complete list of contentclass values.
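To make the rule logic concrete, here is a minimal sketch in Python. It is not SharePoint code and does not call any SharePoint API; it only models how Include, Require and Exclude rules combine. The assignment of behaviours to the three rules is inferred from the description of the scope above, the `STS_ListItem_GenericList` item is just an illustrative non-matching value, and treating a scope with no Include rules as unrestricted is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Item = Dict[str, str]  # an indexed item reduced to property name -> value


@dataclass
class Rule:
    behaviour: str                   # "include", "require" or "exclude"
    matches: Callable[[Item], bool]  # the rule's test against one item


def property_rule(behaviour: str, name: str, value: str) -> Rule:
    """Rule that matches when the item's property equals the given value."""
    return Rule(behaviour, lambda item: item.get(name) == value)


def in_scope(item: Item, rules: List[Rule]) -> bool:
    includes = [r for r in rules if r.behaviour == "include"]
    requires = [r for r in rules if r.behaviour == "require"]
    excludes = [r for r in rules if r.behaviour == "exclude"]

    # Include rules are OR-ed together. Assumption of this sketch: a scope
    # with no Include rules does not block anything.
    included = not includes or any(r.matches(item) for r in includes)
    required = all(r.matches(item) for r in requires)   # Require = AND
    excluded = any(r.matches(item) for r in excludes)   # Exclude = AND NOT
    return included and required and not excluded


# Assumed rule set for the "All Pages and Documents" scope described above.
PUBLIC_SCOPE = [
    property_rule("require", "ContentSource", "Local Office SharePoint Server sites"),
    property_rule("include", "contentclass", "STS_ListItem_850"),
    property_rule("include", "contentclass", "STS_ListItem_DocumentLibrary"),
]

page = {
    "ContentSource": "Local Office SharePoint Server sites",
    "contentclass": "STS_ListItem_850",
}
list_item = {
    "ContentSource": "Local Office SharePoint Server sites",
    "contentclass": "STS_ListItem_GenericList",  # illustrative non-matching value
}

print(in_scope(page, PUBLIC_SCOPE))       # True
print(in_scope(list_item, PUBLIC_SCOPE))  # False
```

The publishing page satisfies the Require rule and one of the two Include rules, so it is in scope; the list item satisfies the Require rule but neither Include rule, so it is not.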

Tip

I would recommend against allowing list items in your search scopes. The basic reason is that to view a list item you need to browse to its display form (/Forms/DispForm.aspx), and on a public site that form should be locked down by the Form Lock down feature. Unfortunately it is common for lists to be used to store content for a public web site; for example when using WSS collaboration features such as blogs, wikis and discussion lists. At the end of the day the collaboration and publishing features in SharePoint don’t play very nicely together. When making design decisions for a SharePoint based solution and the question comes up - “Should we put this content in a simple list or create aspx pages?”, you should consider whether you want the content to be searchable or not.

Scope Examples

What if we wanted to create a scope that returned everything under a specific web? In this example I have added a folder rule that will include all results in or beneath the 'about-us' site:

[Figure: Public Scope for a web]

What if we had a shared server environment that hosted multiple websites? In this example I have added a domain rule so that any results for my site 'http://trinkit' will be returned:

[Figure: Public Scope for a Site]
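As a rough illustration of what these two rule types test, the following Python sketch checks a result URL against a folder and against a hostname. It is a simplified model rather than SharePoint's own matching code: the function names and the example URLs under 'http://trinkit' are hypothetical, comparison is assumed to be case-insensitive, and the domain check only handles an exact hostname, not subdomains.

```python
from urllib.parse import urlsplit


def folder_rule_matches(item_url: str, folder_url: str) -> bool:
    """True when item_url is the folder itself or anything beneath it."""
    item, folder = urlsplit(item_url), urlsplit(folder_url)
    if item.netloc.lower() != folder.netloc.lower():
        return False
    folder_path = folder.path.rstrip("/").lower()
    item_path = item.path.rstrip("/").lower()
    # Append "/" before the prefix test so that /about-us-old is not
    # mistaken for something beneath /about-us.
    return item_path == folder_path or item_path.startswith(folder_path + "/")


def domain_rule_matches(item_url: str, hostname: str) -> bool:
    """True when item_url is served from exactly the given hostname."""
    return (urlsplit(item_url).hostname or "").lower() == hostname.lower()


# Hypothetical URLs, for illustration only.
print(folder_rule_matches("http://trinkit/about-us/Pages/team.aspx",
                          "http://trinkit/about-us"))      # True
print(folder_rule_matches("http://trinkit/about-us-old/Pages/team.aspx",
                          "http://trinkit/about-us"))      # False
print(domain_rule_matches("http://trinkit/Pages/default.aspx", "trinkit"))    # True
print(domain_rule_matches("http://othersite/Pages/default.aspx", "trinkit"))  # False
```

The point of the folder check is the "in or beneath" behaviour: a plain string prefix test would also catch sibling sites whose names merely start with 'about-us'.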

If you don’t know how to create scopes then have a look at this help page from Microsoft Office Online.

Tip

When indexing document libraries make sure that the documents are of a file type known to SharePoint, otherwise SharePoint will crawl the document as a list item and use the form display page rather than the actual document itself. Check out the filter pack from Microsoft if you want to add additional file types.

Creating a Simple, Deployable Layout

Armed with our public search scopes we already have enough information to return the right results. The next step is to create a simple search page to display search results.

When you create a search centre using the out-of-the-box search site template, you get a whole bunch of features that just aren’t that well suited to a public facing scenario (RSS Feeds, Alerts, Advanced Search). My recommendation is to take a lightweight, minimal approach - why use a whole search centre when a single results page will do? Creating a single page layout that is part of an easily deployable SharePoint solution is often the cleanest way to go.

Web Parts

Web Part zones often cause issues when it comes to repeatable deployment and they add additional HTML bloat. If you want the simplest HTML output possible then web part zones should be avoided. When it comes down to it we only really need a page layout with a few basic web parts – SearchBoxEx, CoreResultsWebPart and the SearchPagingWebPart.

Here is an example of using the CoreResultsWebPart in a search page layout without a web part zone.

<Search:CoreResultsWebPart runat="server" 
    ID="SearchResults" 
    ShowActionLinks="True" 
    Scope="All Pages and Documents"  
    HighestResultPage="1000" 
    DuplicatesRemoved="True" 
    DisplayDiscoveredDefinition="True" 
    ShowSearchResults="True" 
    FrameType="None" 
    NoiseIgnored="True" 
    StemmingEnabled="True" 
    View="Relevance" 
    QueryNumber="Query1" 
    SentencesInSummary="3" 
    ResultsPerPage="10" 
    DateFormat="DateOnly" 
    DisplayAlertMeLink="False" 
    DisplayRSSLink="False" 
    RelevanceView="True" 
    WebPart="true">
    <XslLink>/XSL/CoreSearchResults.xsl</XslLink>        
    <SelectColumns>
        <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <Columns>
                <Column Name="WorkId"/>
                <Column Name="Rank"/>
                <Column Name="Title"/>
                <Column Name="HitHighlightedProperties"/>
                <Column Name="Size"/>
                <Column Name="Path"/>
                <Column Name="Description"/>
                <Column Name="PictureThumbnailURL"/>
                <Column Name="SiteName"/>
                <Column Name="CollapsingStatus"/>
                <Column Name="HitHighlightedSummary"/>                
                <Column Name="ContentClass"/>
                <Column Name="IsDocument"/>
                <Column Name="Write"/>
                <Column Name="Author"/>
                <Column Name="ContentType"/>
            </Columns>
        </root>
    </SelectColumns>        
</Search:CoreResultsWebPart>

The other web parts can be added to the page layout in the same way.

Tip

Make sure search.js is included in a custom search page layout as it is needed for logging search statistics:

<asp:Content ContentPlaceHolderID="PlaceHolderAdditionalPageHead" runat="server">
    <SharePoint:ScriptLink ID="ScriptLink1" name="search.js" runat="server"/>
</asp:Content>

  

Additional Branding Considerations

The majority of the branding is quite easy due to the core search results web part using an XSL transformation to style the results. Unfortunately the other web parts will require tedious battling with overriding of SharePoint’s CSS properties. Not ideal but you can still get it looking pretty decent if you know what you are doing.

For full control of the HTML structure and styling you would need to create a bespoke solution that uses the search SQL Syntax API that comes with MOSS. This is also the only solution if you require some advanced sorting or filtering functionality. This isn't overly difficult, but it's a tough one to explain to the business owner who is forking out for SharePoint.

So what about advanced search? I think we’ll leave that one for another day.

I hope this post gives you a few ideas and some "best practices" on how you can go about creating a decent search solution for your public SharePoint website.

Good luck!

Categories: Development, MOSS, Search, SharePoint, WCM
4 responses so far:
  • Wednesday, 16 Sep 2009 10:20 by James
    Hi there, nice article. I was particularly interested when I read your comments about the search.js file. We have a governmental client who has had their Intranet and Internet built for them by another company using completely customised SharePoint with no web parts. So we have a custom master page, custom search results page layout, custom search controls, custom xsl and custom code calling into the search API. The problem is that we are not getting any usage statistics generated. We have normal team sites set up on the same farm which are generating usage statistics so it's definitely something to do with the customisations. When I saw your search.js comment I thought that must be it. I've added it in and it doesn't seem to have had any effect. Is there anything else we need to do to get this working? I notice when using Fiddler and a default search results page that the search.asmx web service is called. Do we need to wire this up somewhere? If you have created a custom master page / page layout etc. and are getting usage statistics I would very much like to compare code with you so that we can get this working. Thanks in advance. James.
  • Thursday, 17 Sep 2009 08:33 by Zac Smith
    Hi James The search.asmx call that you are seeing will be the search stats update. My guess would be that the custom search results web part does not output the HTML/JS required to update the search stats. I would compare the default page output with your custom page to determine what is missing and add it back in. Ideally you should go back to the vendor and get them to fix this. Hope that helps.
  • Friday, 18 Sep 2009 12:07 by James
    Hi Zac, I've tracked down the missing code but I don't think I can "easily" recreate it. It seems the code is generated by the Core Results web part. It generates a "window.onunload" event function that sends a SOAP request containing all of the search details. I've looked at the web part code (using Reflector) and it is generating the required JavaScript through the use of internal objects that I don't have access to. It would be a lot of work to recreate this. I don't understand why the logging of these statistics is not done as part of the Query call in the object model, rather than as an afterthought bit of JavaScript on the results page. I can't believe I'm the first to have noticed this problem. I appreciate that the ClickThrough statistics can only come from the client side but surely the query itself could/should be logged server side? I've also noticed that the ClickThrough stats will only get logged if you maintain the "CSR" in the id of each search result hyperlink. So anyone who edits the XSL beware. I'd love to hear how other people have got around this. Do you yourself use the default web parts? I notice that your company have built a starter framework which sounds very similar to the one we're using. Do you use web parts or custom controls only? Are you generating search statistics? I appreciate your help. Thanks, James.
  • Friday, 18 Sep 2009 09:58 by Zac Smith
    Hi James I'm guessing that not many people have created custom search web parts and actually bothered looking at the search usage stats. I have been using the default web parts; unless there is a specific requirement for heavy customisation or compliance, I think it is better to use the OOTB search web parts (as discussed in this post). Search stats are being generated fine. When it comes to non-OOTB components I generally use custom user controls in public facing scenarios. I would have thought it wouldn't be too difficult to generate that JavaScript yourself using the RegisterClientScriptBlock method from .NET 2.0. Good luck.

 
