Monday, August 3, 2015

Sitecore: Indexing Associated Content

Setting up indexes and indexing content for site search in Sitecore is a fairly straightforward task, and the community has put together an extensive knowledge base with various examples. One useful construct we often find ourselves implementing for keyword search is a computed field. I typically use it to index additional content referenced by a page, such as content blocks, promos, and callouts added to the page via presentation details.


The Need: Index content that a page item references externally
The Solution: Create a computed field that indexes the TextField and HtmlField type fields of the referenced items

This implementation fetches all the renderings in the current item's presentation for the default device and checks each rendering's datasource item for indexable content.
As a suggestion, check whether the current item inherits from one of your page templates, and skip the computation if it does not.
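
The template check suggested above could look something like the following. This is only a sketch: the PageTemplateGuard class and the BasePageTemplateId value are hypothetical and not part of the original implementation, so substitute the ID of your own base page template.

```csharp
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Data.Managers;
using Sitecore.Data.Templates;

public static class PageTemplateGuard
{
    // Hypothetical placeholder: replace with the ID of your own base page template.
    public static readonly ID BasePageTemplateId = new ID("{00000000-0000-0000-0000-000000000000}");

    // Returns true when the item's template is, or descends from, the base page template.
    public static bool IsPage(Item item)
    {
        if (item == null) return false;

        Template template = TemplateManager.GetTemplate(item);
        return template != null && template.DescendsFromOrEquals(BasePageTemplateId);
    }
}
```

With a helper like this, the computed field can return null early for anything that is not a page, for example `if (!PageTemplateGuard.IsPage(item)) return null;` at the top of ComputeFieldValue.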

Step 1: Create a class that implements the IComputedIndexField interface as shown below:

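    using System.Linq;
    using System.Text;
    using Sitecore.ContentSearch;
    using Sitecore.ContentSearch.ComputedFields;
    using Sitecore.Data.Fields;
    using Sitecore.Data.Items;
    using Sitecore.Layouts;

    public class RelatedContent : IComputedIndexField
    {
        public const string DefaultDeviceId = "{FE5D7FDF-89C0-4D99-9AA3-B5FBD009C9F3}";
        public string FieldName { get; set; }
        public string ReturnType { get; set; }

        public object ComputeFieldValue(IIndexable indexable)
        {
            // Not every indexable is a Sitecore item; skip anything that is not.
            Item item = indexable as SitecoreIndexableItem;
            if (item == null) return null;

            //add condition to skip if the current item does not belong to a page template
            var sb = new StringBuilder();

            // Resolve the default device from the database being indexed instead of a
            // hard-coded "master", which may not be available on every server.
            DeviceItem defaultDevice = item.Database.GetItem(DefaultDeviceId);
            if (defaultDevice == null) return null;

            RenderingReference[] renderings = item.Visualization.GetRenderings(defaultDevice, true);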
            foreach (RenderingReference rendering in renderings)
            {
                if (string.IsNullOrEmpty(rendering.Settings.DataSource))
                    continue;

                Item datasourceItem = item.Database.GetItem(rendering.Settings.DataSource);
                if (datasourceItem == null) continue;

                //add an if condition to get indexable content for certain template types
                sb = GetIndexableContent(sb, datasourceItem);
            }
            // Return null when nothing was collected so the field is left empty.
            return sb.Length == 0 ? null : sb.ToString().TrimEnd();
        }

        private StringBuilder GetIndexableContent(StringBuilder sb, Item item, bool indexAllTextFields = true, string fieldName = "")
        {
            if (sb == null) sb = new StringBuilder();

            if (!string.IsNullOrEmpty(fieldName))
            {
                // Append the field's value explicitly rather than the Field object itself.
                if (item.Fields[fieldName] != null
                    && !string.IsNullOrEmpty(item.Fields[fieldName].Value))
                    sb.Append(item.Fields[fieldName].Value.Trim() + " ");
            }

            if (indexAllTextFields)
            {
                //skip standard fields, whose names start with "__"
                foreach (Field field in item.Fields.Where(x => !x.Name.StartsWith("__")))
                {
                    var customField = FieldTypeManager.GetField(field);

                    // GetField can return null for field types it does not recognize.
                    if (customField != null && !string.IsNullOrEmpty(customField.Value))
                    {
                        if (customField is TextField)
                            sb.Append(customField.Value.Trim() + " ");
                        else if(customField is HtmlField)
                            sb.Append(Sitecore.StringUtil.RemoveTags(customField.Value.Trim()) + " ");
                    }
                }
            }
            return sb;
        }
    }


Step 2: Add the following to your custom index configuration:

For Sitecore Lucene:

<configuration ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration" >
    <fields hint="raw:AddComputedIndexField">
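        <!-- replace the placeholders with your fully qualified class name and assembly name -->
        <field fieldName="related_content">Your.Namespace.RelatedContent, Your.AssemblyName</field>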
    </fields>
</configuration>

For Solr:

<configuration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration" >
    <fields hint="raw:AddComputedIndexField">
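        <!-- replace the placeholders with your fully qualified class name and assembly name -->
        <field fieldName="related_content">Your.Namespace.RelatedContent, Your.AssemblyName</field>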
    </fields>
</configuration>


That's it! Oh, and don't forget to rebuild your index.
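
Once the index has been rebuilt, a keyword search can match against the new field. The snippet below is a rough sketch rather than a finished search implementation: the RelatedContentSearch class is hypothetical, and the "sitecore_web_index" index name is an assumption, so use the name of your own custom index.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

public static class RelatedContentSearch
{
    // indexName is an assumption; pass the name of the index that holds the computed field.
    public static List<SearchResultItem> Find(string keyword, string indexName = "sitecore_web_index")
    {
        ISearchIndex index = ContentSearchManager.GetIndex(indexName);
        using (IProviderSearchContext context = index.CreateSearchContext())
        {
            // The string indexer reads the computed field by the name set in the configuration.
            return context.GetQueryable<SearchResultItem>()
                .Where(result => result["related_content"].Contains(keyword))
                .ToList();
        }
    }
}
```

In practice you would usually combine this predicate with your existing content or title matches so that a page is found whether the keyword appears on the page itself or in one of its datasource items.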

If you like this approach or would like to suggest a better approach, please leave a comment.
