Tree branches

More and more often we are asked to host a single-page app (SPA) within Sitecore alongside regular server-generated pages. However, Sitecore is used to being the only show in town and assumes that it is responsible for every part of the URL path up to the hash. For example:

  • /myReactApp (SPA root page loaded by Sitecore)
  • /myReactApp/step1 (404 page not found)
  • /myReactApp/step1/itemId1/subItem4 (404 page not found)

Since the pages below /myReactApp don’t exist in Sitecore, the user will see a 404 if they hit one of those URLs directly. Instead you can use hash routing. For example:

  • /myReactApp (SPA root page loaded by Sitecore, SPA loads its start page)
  • /myReactApp#step1 (SPA root page loaded by Sitecore, step1 loaded by the SPA)
  • /myReactApp#step1/itemId1/subItem4 (SPA root page loaded by Sitecore, step1 plus specific items loaded by the SPA)

This will work just fine, but it’s a little bit ugly. Wouldn’t it be nice to just have regular-looking URLs so that the SPA is more transparent to the end user? You could do this by creating child pages in Sitecore with names that match the SPA routes, but it would be a nightmare to maintain. A neater option is to add a processor to the httpRequestBegin pipeline that climbs up the tree to find the SPA host page that exists in Sitecore. For example:

  • /myReactApp (Page is found by Sitecore, SPA start)
  • /myReactApp/step1 (Sitecore checks for /myReactApp/step1 and can’t find it, so tries /myReactApp, finds that and loads that page. step1 loaded by the SPA)
  • /myReactApp/step1/itemId1/subItem4 (Sitecore checks for subItem4, then itemId1, then step1, then finds myReactApp and loads that page, step1 plus specific data loaded by the SPA)

How it works

The ParentItemResolver pipeline component loosely follows what the default ItemResolver class does, but walks UP the path to find a match. It…

  • Checks whether an item has already been found earlier in the pipeline, or whether the path ends with /undefined or /pagenotfound. If so, there is nothing more to do and it stops.
  • Reads the requested path and breaks it up from most specific to least specific, so given the path /myReactApp/step1/itemId1/subItem4, it works through the list below. In the code, the loop starts at the second entry; the full path has already been tried by the default ItemResolver by that point.
    • /myReactApp/step1/itemId1/subItem4
    • /myReactApp/step1/itemId1
    • /myReactApp/step1
    • /myReactApp
  • Tries to match these paths to an item from most specific to least as shown above.
  • If it finds an item, it checks whether that item has the field MapChildrenToParent and whether the field is ticked. If yes, then this is the item we’re looking for. If not, no item is resolved and the request carries on as a normal 404.
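
The steps above can be sketched without any Sitecore dependencies. This is a minimal illustration rather than the processor itself: the dictionary standing in for the content tree, and the names PathsFromMostSpecific and ResolveHost, are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the content tree:
// item path -> value of its MapChildrenToParent checkbox.
var contentTree = new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase)
{
    ["/myReactApp"] = true,
    ["/about"] = false,
};

// Lists the path itself and each of its ancestors, most specific first.
List<string> PathsFromMostSpecific(string path)
{
    var parts = path.Trim('/').Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
    var result = new List<string>();
    for (var count = parts.Length; count > 0; count--)
        result.Add("/" + string.Join("/", parts.Take(count)));
    return result;
}

// Returns the host page path, or null when nothing suitable is found.
string ResolveHost(string requestedPath, IDictionary<string, bool> tree)
{
    foreach (var candidate in PathsFromMostSpecific(requestedPath))
    {
        if (!tree.TryGetValue(candidate, out var mapsChildren))
            continue; // no item at this path, keep climbing

        // The first existing item decides the outcome:
        // it is only used when its checkbox is ticked.
        return mapsChildren ? candidate : null;
    }
    return null;
}

Console.WriteLine(ResolveHost("/myReactApp/step1/itemId1/subItem4", contentTree)); // /myReactApp
Console.WriteLine(ResolveHost("/about/team", contentTree) ?? "(not found)");        // (not found)
```

Note that the climb stops at the first item that exists, whether or not it is flagged, which is also how the real processor behaves.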

The code

\Pipelines\HttpRequest\ParentItemResolver.cs

using Sitecore.Abstractions;
using Sitecore.Configuration;
using Sitecore.Data.ItemResolvers;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.SecurityModel;
using System.Collections.Generic;
using System.Linq;
using Sitecore;
using Sitecore.Pipelines.HttpRequest;
using Sitecore.StringExtensions;

namespace MyProject.ParentItemResolver.Pipelines.HttpRequest
{
    class ParentItemResolver : HttpRequestProcessor
    {
        private int MaxChildDepth =>
            Settings.GetIntSetting("MyProject.ParentItemResolver.MaxChildDepth", 4);

        /// <summary>
        /// Initializes a new instance of the <see cref="ParentItemResolver" /> class.
        /// </summary>
        /// <param name="itemManager">The item manager.</param>
        /// <param name="pathResolver">The path resolver.</param>
        public ParentItemResolver(BaseItemManager itemManager, ItemPathResolver pathResolver)
            : this(itemManager, pathResolver, Settings.ItemResolving.FindBestMatch)
        {
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="ParentItemResolver" /> class.
        /// </summary>
        /// <param name="itemManager">The item manager.</param>
        /// <param name="pathResolver">The path resolver.</param>
        /// <param name="itemNameResolvingMode">The item name resolving mode.</param>
        protected ParentItemResolver(
            BaseItemManager itemManager,
            ItemPathResolver pathResolver,
            MixedItemNameResolvingMode itemNameResolvingMode)
        {
            Assert.ArgumentNotNull(itemManager, nameof(itemManager));
            Assert.ArgumentNotNull(pathResolver, nameof(pathResolver));
            ItemManager = itemManager;
            ItemNameResolvingMode = itemNameResolvingMode;
            PathResolver =
                (itemNameResolvingMode & MixedItemNameResolvingMode.Enabled) == MixedItemNameResolvingMode.Enabled
                    ? new MixedItemNameResolver(pathResolver)
                    : pathResolver;
        }

        /// <summary>Gets the item manager.</summary>
        /// <value>The item manager.</value>
        protected BaseItemManager ItemManager { get; }

        /// <summary>Gets the item name resolving mode.</summary>
        /// <value>The item name resolving mode.</value>
        protected MixedItemNameResolvingMode ItemNameResolvingMode { get; }

        /// <summary>Gets or sets item path resolver.</summary>
        /// <value>Item path resolver.</value>
        protected ItemPathResolver PathResolver { get; set; }

        /// <summary>
        /// Resolves the context item by walking up the requested item path (<see cref="P:Sitecore.Web.RequestUrl.ItemPath" />) until an existing item is found.
        /// <para>The item is only used if its MapChildrenToParent checkbox is ticked; otherwise no item is resolved.</para>
        /// <para>Stops without resolving anything if read access to a candidate item is denied.</para>
        /// </summary>
        /// <param name="args">The arguments.</param>
        public override void Process(HttpRequestArgs args)
        {
            Assert.ArgumentNotNull(args, nameof(args));
            if (SkipItemResolving(args))
                return;
            var isPermissionDenied = false;
            Item foundItem = null;
            var str = string.Empty;
            try
            {

                StartProfilingOperation("Resolve parent item.", args);
                var stringSet = new HashSet<string>();
                var candidatePaths = GetCandidatePaths(args.Url.ItemPath);
                foreach (var candidatePath in candidatePaths)
                {
                    if (stringSet.Add(candidatePath))
                    {
                        if (TryResolveItem(candidatePath, args, out foundItem, out isPermissionDenied))
                        {
                            str = candidatePath; // I am your father
                            break;
                        }

                        if (isPermissionDenied)
                            return;
                    }
                }

                if (foundItem != null)
                {
                    //check whether the found item has enabled being returned for child paths
                    if (foundItem.Fields["MapChildrenToParent"] == null || foundItem.Fields["MapChildrenToParent"].Value != "1")
                    {
                        foundItem = null; // That's impossible
                    }
                }
            }
            finally
            {
                if (foundItem != null)
                    TraceInfo("Parent item is {0}.".FormatWith((object)str));
                args.PermissionDenied |= isPermissionDenied;
                Context.Item = foundItem;
                EndProfilingOperation(null, args);
            }
        }

        /// <summary>
        /// Gets the ancestor paths of <paramref name="itemPath" /> that will be tested, from most specific to least specific.
        /// <para>Only the first MaxChildDepth + 1 segments of the path are considered.</para>
        /// <para>The root, /sitecore and /sitecore/content are never returned.</para>
        /// </summary>
        /// <param name="itemPath">The requested item path.</param>
        /// <returns>Candidate ancestor paths; empty if there is no context site or the path ends with /undefined or /pagenotfound.</returns>
        protected IEnumerable<string> GetCandidatePaths(string itemPath)
        {
            var candidatePaths = new List<string>();
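            // Compare case-insensitively for both suffixes, matching SkipItemResolving.
            var lowerPath = itemPath.ToLowerInvariant();
            if (Context.Site == null || lowerPath.EndsWith("/undefined") || lowerPath.EndsWith("/pagenotfound"))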
            {
                return candidatePaths;
            }

            var pathParts = itemPath.Trim('/').Split('/').Take(MaxChildDepth + 1).ToList();
            var parentPaths = new List<string>();

            for (var index = pathParts.Count - 1; index > -1; index--)
            {
                parentPaths.Add("/" + string.Join("/", pathParts.Take(index)));
            }

            foreach (var path in parentPaths)
            {
                candidatePaths.Add(DecodeName(path));
            }

            var excludedPaths = new[] { "/", "/sitecore", "/sitecore/content" };

            return candidatePaths.Where(x => !excludedPaths.Contains(x));
        }

        /// <summary>
        /// Tries to resolve item by provided path. Reports if item read permission was denied.
        /// <para>Returns <c>true</c> if item was found by <paramref name="itemPath" />, and user has access to it.</para>
        /// </summary>
        /// <param name="itemPath">The item path.</param>
        /// <param name="args">The arguments.</param>
        /// <param name="item">Will be resolved from provided parameters.</param>
        /// <param name="permissionDenied">If set to <c>true</c> read permission was denied for an item.</param>
        /// <returns><c>true</c> if item was resolved by path; <c>false</c> when item does not exist, or user does not have read access to it.</returns>
        protected virtual bool TryResolveItem(
            string itemPath,
            HttpRequestArgs args,
            out Item item,
            out bool permissionDenied)
        {
            permissionDenied = false;
            item = ItemManager.GetItem(itemPath, Context.Language, Sitecore.Data.Version.Latest, Context.Database,
                SecurityCheck.Disable);
            if (item == null)
                return false;
            if (item.Access.CanRead())
                return true;
            permissionDenied = true;
            item = null;
            return false;
        }

        /// <summary>
        /// Decodes the path to ensure all URL characters are escaped with respect to 'EncodeNameReplacements'.
        /// <para>Refer to <see cref="M:Sitecore.MainUtil.DecodeName(System.String)" /> for more details.</para>
        /// </summary>
        /// <param name="itemPath">The item path.</param>
        /// <returns>An escaped path with all the characters from 'EncodeNameReplacements' replaced.</returns>
        protected virtual string DecodeName(string itemPath)
        {
            return MainUtil.DecodeName(itemPath);
        }

        /// <summary>
        /// Checks if item resolving should be skipped:
        /// <para>If context item is already set - do not want to override it.</para>
        /// <para>If no context database is set - do not know where to fetch item from.</para>
        /// <para>If no item path is provided.</para>
        /// <para>If the path ends with /undefined or /pagenotfound.</para>
        /// </summary>
        /// <param name="args">The arguments.</param>
        /// <returns><c>true</c> if item resolution should be skipped;<c>false</c> otherwise.</returns>
        protected virtual bool SkipItemResolving(HttpRequestArgs args)
        {
            return Context.Item != null 
                   || Context.Database == null 
                   || args.Url.ItemPath.Length == 0
                   || args.Url.ItemPath.ToLower().EndsWith("/undefined") 
                   || args.Url.ItemPath.ToLower().EndsWith("/pagenotfound");
        }
    }
}

We’re going to wire up the processor so that it runs immediately after the ItemResolver. There’s also a setting for the maximum depth that Sitecore should go to, which stops someone entering a ridiculous path and Sitecore trying to resolve every part of it.

\App_Config\Include\MyProject.ParentItemResolver.config

<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <setting name="MyProject.ParentItemResolver.MaxChildDepth" value="9" />
    </settings>
    <pipelines>
      <httpRequestBegin>
        <processor type="MyProject.ParentItemResolver.Pipelines.HttpRequest.ParentItemResolver, MyProject.ParentItemResolver" resolve="true" patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']" />
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>
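
To see what the depth cap does to a very long path, here is a small sketch that reuses the trimming expression from GetCandidatePaths. CapSegments is a hypothetical helper name, and the cap is passed in directly rather than read from the Sitecore setting.

```csharp
using System;
using System.Linq;

// Same expression as in GetCandidatePaths, with the cap as a parameter.
string[] CapSegments(string itemPath, int maxChildDepth)
{
    return itemPath.Trim('/').Split('/').Take(maxChildDepth + 1).ToArray();
}

// A deliberately silly 50-segment path: /seg1/seg2/.../seg50
var veryLongPath = "/" + string.Join("/", Enumerable.Range(1, 50).Select(i => "seg" + i));
var kept = CapSegments(veryLongPath, 9);

Console.WriteLine(kept.Length); // 10
Console.WriteLine(kept.Last()); // seg10
```

The cap keeps the segments nearest the root, so for an over-long path only its topmost ancestors are ever looked up, however many segments were requested.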

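You’ll need to either create a new template that your SPA host page can inherit, or extend an existing template, so that the host page has a checkbox field called MapChildrenToParent. Then tick that checkbox on the SPA host page; the processor ignores any item where it is missing or unticked.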

SEO considerations

When implementing this solution, we were careful to consult with analytics and SEO specialists to understand the impact that this might have on SEO in particular. Search engines don’t like it if you return exactly the same content at different URLs in the same site, and may penalise sites that do so in the rankings. In this case, we chose to go ahead with this solution because:

  • Using hash routing would have a similar effect on SEO because search engines don’t consider anything in the URL fragment.
  • The ability to link to a forward point within the SPA is for users’ and marketing convenience. We may like to link from another page to a particular part of the SPA, but we are not expecting users to come directly from search engines to anywhere other than the beginning of the SPA.

I hope this has been helpful. Thanks for reading. :-)