Content Deployment explained

on Tuesday, March 2, 2010

In this post, I’ll cover how a particular WCM feature called Content Deployment supports multi-farm topologies and can be used to enable authoring -> staging -> production scenarios. I’ll talk a little about previous solutions to the problem, give you an overview of the Content Deployment feature, and discuss the architecture of the feature in depth.

I would like to point out before we get started that even though I will be discussing the Content Deployment feature’s application in Internet-facing scenarios, the feature is most definitely usable for intranet sites as well. Because content is deployed from site collection to site collection, it can even be used to deploy content to another location on the same machine. Bottom line: the Content Deployment feature is very flexible and has uses beyond the scenarios that I’ll be discussing in this post.

First, let’s make sure we’re all on the same page by presenting a typical IT scenario. If you’re running an Internet-facing site that contains content authored by people on your internal network, chances are that you have a network separation (i.e. firewall(s)) between the intranet and the Internet-facing network. You want your internal authors to have access to the site so they can author, edit, and approve content, but you want that network (the intranet) shielded from incoming Internet traffic for security purposes. After all, the Internet is a big scary place no doubt populated with people who’d like nothing better than to get onto your internal network and wreak havoc. However, if Internet users can’t access your site, then what good is it?

The most common solution to this problem is to have two server farms: one internal farm (in the intranet) dedicated to your authors/editors/designers and a second farm (in the Internet-facing network) that hosts your production site. Your internal farm is read/write, while your production farm is most likely read-only. If you have this sort of topology though, you need a way to get the authored content from your authoring farm to your production farm. This is where Content Deployment comes in. In a nutshell, deployment allows you to push out your content from one server farm to another. For the purposes of this post, I’m going to focus on a two-tier topology (authoring -> production), but you can also have a three-tier topology (authoring -> staging -> production) or even an n-tier topology if you need it.

If you have used Content Management Server (CMS) 2002, then you’re somewhat familiar with Content Deployment already. In CMS, you would export your content from the authoring farm to an .sdo file, transport it via your own method to the production farm, then initiate an import of the .sdo file on the production farm. There were three basic steps: export, transport, and import, but there wasn’t any UI to help you configure this process – it was all manual.

In MOSS, things work in much the same way. The three basic phases of the process remain the same; however, Content Deployment takes care of transporting the content across the wire for you and initiates the import on the remote farm as well. We even provide a UI in the SharePoint Central Administration site that allows you to configure, run, and monitor the deployment of your content.
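To make the three phases concrete, here’s a toy model in Python. To be clear, this is not the SharePoint object model or file format – every name here is illustrative – it just sketches the export -> transport -> import flow described above:

```python
# Conceptual model of the three deployment phases: export, transport, import.
# All names here are illustrative -- this is NOT the SharePoint API.

def export_content(site, since=None):
    """Package the site's content (optionally only changes after a timestamp)."""
    items = [i for i in site["items"] if since is None or i["modified"] > since]
    return {"package": items, "source": site["url"]}

def transport(package, destination_url):
    """Ship the package across the wire to the destination farm."""
    return {"received_at": destination_url, **package}

def import_content(package, destination_site):
    """Apply the packaged items to the destination site collection."""
    destination_site["items"].extend(package["package"])
    return len(package["package"])

# Usage: deploy two items from an authoring farm to a production farm.
authoring = {"url": "http://authoring/sites/pub", "items": [
    {"name": "default.aspx", "modified": 2},
    {"name": "press.aspx", "modified": 5},
]}
production = {"url": "http://production/sites/pub", "items": []}

pkg = export_content(authoring)
pkg = transport(pkg, production["url"])
imported = import_content(pkg, production)
print(imported)  # 2
```

The point of the sketch is simply that in MOSS, unlike CMS 2002, the middle and final steps are wired together for you.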

So how exactly does this all work? So glad you asked! :-) There are two core conceptual objects to understand: paths and jobs. A path is basically a connection between a source farm and a destination farm. The path contains information about which source web application and site collection you are deploying, authentication information for the destination farm, and the web application and site collection on the destination farm. In short, a path represents the mapping between your authoring and production site collections.

However, a path by itself doesn’t actually deploy any content. In order to do that, you create a job. A job is associated with a path, and it determines exactly which sites in the source site collection will be deployed and on what schedule. You can have many different jobs for a given path, each running on different schedules and deploying specific sections of your site. That’s right – a job has a schedule and can deploy content updates regularly without the need to manually kick it off every time. For example, let’s say you have a Press Releases site that needs to be updated every hour, and an Employee Bios site that only needs to be updated every month. You would create two different jobs, one that runs every hour and deploys the Press Releases site, and another that runs monthly and deploys the Employee Bios site.
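The path/job relationship can be sketched as a pair of small data structures. Again, this is a hypothetical Python model for illustration, not the real configuration objects; the credential string and schedule values are made up:

```python
# Toy model of paths and jobs (illustrative only; not the SharePoint API).
from dataclasses import dataclass

@dataclass
class Path:
    """A one-way mapping from a source site collection to a destination."""
    source_site: str
    destination_site: str
    destination_auth: str = "DOMAIN\\deployaccount"  # hypothetical credentials

@dataclass
class Job:
    """Deploys selected sites over a path on a schedule."""
    path: Path
    sites: list
    schedule: str  # e.g. "hourly", "monthly"

# One path from the authoring site collection to production...
authoring_to_prod = Path("http://authoring/sites/pub",
                         "http://production/sites/pub")

# ...and two jobs on that same path, each with its own scope and schedule.
press_job = Job(authoring_to_prod, sites=["Press Releases"], schedule="hourly")
bios_job = Job(authoring_to_prod, sites=["Employee Bios"], schedule="monthly")

for job in (press_job, bios_job):
    print(job.sites, job.schedule)
```

Note how the path carries the farm-to-farm mapping while each job carries only its scope and schedule – that separation is what lets many jobs share one path.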

Let’s say you also need to push out your data to a third farm, perhaps a read-only extranet. No problem! You simply create another path that maps your authoring site to the extranet site, and create jobs that deploy the appropriate content on the appropriate schedule. One important thing to note is that deployment is always one way: source -> destination. It’s a “single-master” system.

Deployment is also pretty smart. By default, it only deploys the changes since the last successful deployment, which saves bandwidth and time. And if there aren’t any changes, the deployment will complete without redoing any unnecessary work. Of course, full deployments every time can be configured if that’s what you really want.
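The incremental behavior boils down to simple bookkeeping against the last successful run. A minimal sketch (again, conceptual Python, not the product’s actual change-tracking mechanism):

```python
# Sketch of incremental ("changes only") deployment selection.
def select_for_deployment(items, last_successful_run, full=False):
    """Return the items to deploy: everything for a full deployment,
    otherwise only items changed since the last successful run."""
    if full:
        return list(items)
    return [i for i in items if i["modified"] > last_successful_run]

items = [
    {"name": "home.aspx", "modified": 10},
    {"name": "news.aspx", "modified": 25},
]

# Incremental: only news.aspx changed after the run at t=20.
incremental = select_for_deployment(items, last_successful_run=20)
print([i["name"] for i in incremental])  # ['news.aspx']

# A full deployment redeploys everything regardless of timestamps.
full = select_for_deployment(items, last_successful_run=20, full=True)
print(len(full))  # 2
```

And if nothing has changed since the last successful run, the incremental selection is simply empty, which is why a no-change deployment completes without redoing work.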

What about dependencies? What if a page is dependent on a page layout or image that has been updated? No cause for concern; deployment automatically picks up the dependent page layout and packages it up along with the page itself – even if the dependent resources aren’t in the same site. Let me clarify what that means. In the above scenario with the Press Releases and Employee Bios sites, let’s say you configured the two jobs I talked about on two different schedules. Whenever the Press Releases job runs, it will check to see if any of the content it’s deploying is dependent on other resources, and will pick up those resources regardless of where they live in your site collection. This ultimately means that you don’t have to worry about content not rendering on your production environment because your jobs run on different schedules and have interdependent content – Content Deployment takes care of it all for you!
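Conceptually, this is a transitive walk over each item’s dependencies. Here’s a hedged sketch in Python (the file names and dependency map are invented for illustration; the real feature discovers dependencies from the content itself):

```python
# Sketch of dependency gathering: starting from the pages a job deploys,
# follow each item's dependencies (layouts, images) wherever they live
# in the site collection, so the package is self-consistent.
def collect_with_dependencies(start_items, dependencies):
    """dependencies maps an item to the items it requires."""
    package, stack = set(), list(start_items)
    while stack:
        item = stack.pop()
        if item not in package:
            package.add(item)
            stack.extend(dependencies.get(item, []))  # follow transitively
    return package

# Hypothetical dependency map: a press release uses a layout and a logo,
# and the layout itself uses a banner image from another site.
deps = {
    "press/release1.aspx": ["layouts/PressLayout.aspx", "images/logo.png"],
    "layouts/PressLayout.aspx": ["images/banner.png"],
}

pkg = collect_with_dependencies(["press/release1.aspx"], deps)
print(sorted(pkg))
```

Note that the banner image lands in the package even though the job only asked for the press release – that’s the transitive part doing its work.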

As I alluded to earlier, Content Deployment is configured and managed in the SharePoint Central Administration site, so the person configuring it is a Central Administrator. Usually this is OK, but there might be instances where a page needs to be deployed from authoring to production as quickly as possible. The site owner, who probably isn’t a Central Administrator, needs to get that content out ASAP, and doesn’t have time to wait for the next scheduled deployment. What does the site owner do?

For scenarios like this, Content Deployment has a special job called “Quick Deploy” that is automatically created for every path in any site collection with the Publishing Resources feature enabled. This job, once enabled in a path, wakes up every few minutes (15 minutes, by default) and checks a special list for content that should be deployed. If the site owner has the Quick Deploy right, he or she can deploy pages quickly to production by using the “Quick Deploy” link on the Page Editing toolbar. This adds that page to the special list, which the Quick Deploy job will check the next time it wakes up. Pretty nifty, eh? By default, only the site owner has the Quick Deploy right. In order to give other users the same privileges, you simply add them to the Quick Deploy Users group.
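The Quick Deploy pattern is essentially a permission-checked queue drained by a recurring job. A minimal sketch in Python (names are invented; the wake-up “timer” is just a function call here, standing in for the 15-minute default interval):

```python
# Sketch of the Quick Deploy pattern: authorized users queue pages into a
# special list, and a recurring job drains the list on its next wake-up.
quick_deploy_list = []

def request_quick_deploy(page, user, authorized_users):
    """Only users with the Quick Deploy right may queue a page."""
    if user not in authorized_users:
        raise PermissionError(f"{user} lacks the Quick Deploy right")
    quick_deploy_list.append(page)

def quick_deploy_job_wakeup():
    """Deploy and clear everything queued since the last wake-up."""
    deployed = list(quick_deploy_list)
    quick_deploy_list.clear()
    return deployed

# Usage: the site owner queues an urgent page; the job picks it up.
request_quick_deploy("press/urgent.aspx", "owner", {"owner"})
print(quick_deploy_job_wakeup())  # ['press/urgent.aspx']
```

Adding a user to the Quick Deploy Users group corresponds, in this toy model, to adding them to the `authorized_users` set.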

And that, ladies and gents, is Content Deployment in a nutshell – albeit a fairly large nutshell. :-) I should also mention that if the Content Deployment feature doesn’t meet your needs for some reason (perhaps you have physically separated networks and require an alternative transport method such as “sneaker-net”), you can always use our APIs, which are well documented in the MOSS SDK, to craft a custom solution that meets your exact needs. I think that you’ll find this feature compelling and exciting, especially if your IT infrastructure leverages a multi-farm topology.