Content management systems at the BBC
Steve Elson, executive product manager in BBC Future Media, looks back on the history of CMS development within the BBC
Back in 2007, the BBC was working with 24 different content management systems – now it’s down to just a handful. Thanks to the continuing efforts of various Future Media teams, the BBC’s content management systems are not just fewer in number, but are also much more efficient and user-friendly.
In this article we focus on the evolutionary journey towards iSite 2, the BBC’s new multi-purpose content management system (CMS).
The story begins in 2002 with Flip, an XML-based CMS. Flip was a desktop application, built using Perl, which functioned mainly as a text editor for arbitrary types of XML document. Flip also had a server component which transformed these documents into flat HTML pages using XSLT style sheets. The server would then publish the HTML to the old BBC static server platform using FTP.
- The Flip interface
Some might say it was primitive, but it was incredibly flexible and managed many thousands of pages of content across different BBC websites. It was mainly being used for the BBC’s interactive learning websites such as History, CBBC and CBeebies.
Contrary to most people’s expectations, editorial users found authoring XML documents quite straightforward once they got used to it; it’s a very flexible way of authoring structured content. It’s also quick and cheap to add new content types, because it’s possible to build one editing interface that works across virtually any type of XML content.
We wanted the content to be free of presentation information so it could be displayed across different platforms and in different formats; being multi-platform was important to us long before the rise of responsive web design. Since the content wasn’t specific to any one platform it could be rendered appropriately for whichever device required it, such as a desktop browser, phone, interactive TV, and other formats such as RSS.
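As a rough illustration of this separation, the sketch below renders one presentation-free XML document in two ways: as an HTML fragment and as an RSS item. It is a minimal example rather than Flip's actual pipeline, which used XSLT style sheets; the element names, the example URL and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical content document: it describes the content only,
# with no markup that belongs to any particular output format.
ARTICLE_XML = """
<article>
  <title>The Battle of Hastings</title>
  <summary>How 1066 changed England.</summary>
  <link>https://example.org/history/hastings</link>
</article>
"""


def load_article(xml_text):
    """Parse the content into a plain dict, independent of any output format."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


def render_html(article):
    """Render the content as an HTML fragment for a desktop browser."""
    page = ET.Element("div", {"class": "article"})
    ET.SubElement(page, "h1").text = article["title"]
    ET.SubElement(page, "p").text = article["summary"]
    return ET.tostring(page, encoding="unicode")


def render_rss_item(article):
    """Render the same content as an RSS <item> element."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = article["title"]
    ET.SubElement(item, "description").text = article["summary"]
    ET.SubElement(item, "link").text = article["link"]
    return ET.tostring(item, encoding="unicode")


if __name__ == "__main__":
    article = load_article(ARTICLE_XML)
    print(render_html(article))
    print(render_rss_item(article))
```

Adding another output, such as a phone or interactive TV view, would mean adding another renderer while the stored content stays unchanged.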
Flip was a multi-tenanted CMS, which means the same application hosted multiple logically distinct editorial projects. This was ideal for teams who had multiple projects they wanted to switch between, without having to learn how to use new tools. Flip was also quick for teams to start using, as all they had to provide was an XML schema (or DTD) and some XSLT files – there were no complex forms or user interface components to configure.
The need for change
The move away from Flip came when the BBC’s architecture began to change. The infrastructure, servers and programming languages that the BBC used to deliver content to the public had been based on Apache, SSI and Perl. That was now being replaced by a platform known as Forge, which was more service-oriented in its architecture. It used Java at the service layer, PHP at the rendering layer, and various relational and object databases for storage. Flip wasn’t really suited to this modern architecture, and the publishing of flat HTML documents was no longer desirable.
Flip was built using Perl, whereas the majority of software engineers in the BBC were moving over to Java as the uptake of the Forge platform began to really take off. This skills mismatch was going to make maintaining and developing Flip a problem in the long term.
Finally, the installation and updating of Flip as a desktop application was a continuing (if minor) struggle. We wanted to move to a web-based client because that would be easier to deploy to our users, and faster to roll out updates and fixes.
The beginning of iSite 1
In 2009, there were lots of teams using the Forge platform to provide services for our end users, but hardly any tools for editorial teams to actually author and manage content. We needed to find something that would meet these needs.
The creation of iSite 1 involved the installation of a third-party content management system called Alfresco, which provided the specific web content management features we needed. It was a Java application based on Spring, and since these were standard technologies on the Forge platform it was easy to install.
We retained a lot of principles that had served us well with Flip, such as having a multi-tenanted system. We also formalised a new principle of “devolved development”. We didn’t want a big centralised content management team who had to do all the work setting up projects and modelling content types, because they would inevitably become a bottleneck. That was the experience in a previous attempt to centralise CMS; it was just too slow and did not support the agile, iterative development processes we use.
Our solution was to have a CMS team responsible for the core CMS application, while the actual work to set up, configure and manage projects, including content types, would be done by ‘tenant’ teams. If a BBC team needed a content management solution, and had developers building their public-facing website, those developers could set up and manage the team’s space within the CMS. That meant not having to wait on the CMS team to do anything. This was the model in Flip, and something we wanted to carry forward as it allowed rapid parallel development. Alfresco was one of very few systems that supported this way of working.
It was also important that the CMS be schema-agnostic. The CMS itself shouldn’t need to know much about the actual content it stores - it should be flexible enough to contain and manage a large variety of different content models. A lesson from a previous CMS project was that one-size-fits-all content models didn’t really work at the BBC. We have such a variety of content types that modelling them all consistently is a huge and never-ending task. So, we let tenant teams define their own content models, and let them worry about consistency when it made sense for them.
The BBC web site serves millions of users every day, and we have to think carefully about the scalability of our systems. This was always on our mind when designing iSite 1, so we decided to isolate it entirely from the delivery chain. Our plan was to keep iSite as a back office tool that deployed content to another system, which was focused solely on high-volume, robust delivery of data.
To meet this goal we built Electron, an XML repository based on Atom and Solr.
Atom is a web standard which defines both a web feed syntax, and a simple HTTP protocol for creating and updating resources. Its simplicity, flexibility and performance made it an ideal technology for our needs. However, it has limited querying features so we turned to Solr, a highly scalable search engine which can index all sorts of content and perform fairly complex queries across it. In Electron, as new Atom resources are published they get indexed by Solr, which we can then query to retrieve whatever data we need. This simple combination of two common technologies has been very successful, and incredibly stable.
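The publish-then-index pattern can be sketched in a few lines. The example below is an in-memory stand-in, not Electron's implementation: a dictionary plays the part of the Atom resource store and a small inverted index plays the part of Solr. The class name, the indexed fields and the simplified Atom entries (real entries carry more required elements) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM_NS = "{http://www.w3.org/2005/Atom}"


class TinyRepository:
    """In-memory stand-in: a resource store plus a simple inverted index."""

    def __init__(self):
        self.resources = {}            # entry id -> raw Atom XML
        self.index = defaultdict(set)  # (field, value) -> set of entry ids

    def publish(self, atom_xml):
        """Store an Atom entry and (re)index it, as happens on each publish."""
        entry = ET.fromstring(atom_xml)
        entry_id = entry.findtext(ATOM_NS + "id")
        self._unindex(entry_id)  # drop stale index terms on update
        self.resources[entry_id] = atom_xml
        title = entry.findtext(ATOM_NS + "title", default="")
        for word in title.lower().split():
            self.index[("title", word)].add(entry_id)
        for category in entry.findall(ATOM_NS + "category"):
            self.index[("category", category.get("term"))].add(entry_id)
        return entry_id

    def _unindex(self, entry_id):
        for ids in self.index.values():
            ids.discard(entry_id)

    def query(self, **terms):
        """Return sorted ids matching every field=value term (AND semantics)."""
        sets = [self.index.get(item, set()) for item in terms.items()]
        return sorted(set.intersection(*sets)) if sets else []


def make_entry(entry_id, title, category):
    """Build a deliberately minimal Atom entry for the example."""
    return (
        '<entry xmlns="http://www.w3.org/2005/Atom">'
        f"<id>{entry_id}</id><title>{title}</title>"
        f'<category term="{category}"/></entry>'
    )


if __name__ == "__main__":
    repo = TinyRepository()
    repo.publish(make_entry("urn:example:1", "Roman Britain", "history"))
    repo.publish(make_entry("urn:example:2", "Roman Numerals", "maths"))
    print(repo.query(title="roman", category="history"))
```

The store answers "give me this resource", while the index answers "which resources match these terms", which is the division of labour the paragraph above describes between Atom and Solr.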
Another advantage of separating the CMS (iSite 1) from the content repository (Electron) was that we could take the CMS down for maintenance and upgrades without affecting the public website.
Web-based v desktop-based editing
One of the big differences from Flip is that iSite 1 provided a web user interface (UI) with a forms-based interface for editing content. Moving from a desktop XML editor to a web UI was one of the biggest changes we made.
We wanted to keep XML as the underlying data format in iSite 1 because it worked well, providing the schema-agnostic format that we wanted. We considered using JSON, which is another very popular document format, but the sorts of documents we manage are better suited to XML.
For our form engine, we turned to XForms, a supported feature of Alfresco. XForms is a W3C standard which allows you to define forms for editing XML data. It’s not the most popular technology in the world, but it is used frequently in document management systems. Alfresco has a convenient feature which takes an XML schema document and automatically generates an XForm for that type of content. That worked for 95% of the things we needed to do, and gave us flexible XML content with the addition of a forms-based interface on top.
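To give a feel for schema-driven form generation, the sketch below reads a small XML schema and derives a list of form field descriptions from it. This is not Alfresco's mechanism and it does not produce XForms markup; the example schema, the type-to-widget mapping and the assumption that the schema uses the `xs:` prefix are all illustrative.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed mapping from XSD simple types to form widgets (illustrative only).
WIDGETS = {
    "xs:string": "text",
    "xs:date": "date",
    "xs:boolean": "checkbox",
    "xs:integer": "number",
}

# Hypothetical tenant schema for a "recipe" content type.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="recipe">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="published" type="xs:date"/>
        <xs:element name="vegetarian" type="xs:boolean" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def form_fields(schema_text):
    """Derive one form field description per typed element in the schema."""
    root = ET.fromstring(schema_text.strip())
    fields = []
    for element in root.iter(XS + "element"):
        type_name = element.get("type")
        if type_name is None:
            # Container element with an inline complexType: no widget of its own.
            continue
        fields.append({
            "name": element.get("name"),
            "widget": WIDGETS.get(type_name, "text"),  # unknown types fall back to text
            "required": element.get("minOccurs", "1") != "0",
        })
    return fields


if __name__ == "__main__":
    for field in form_fields(SCHEMA):
        print(field)
```

Because the form is derived from the schema, a tenant team that changes its content model gets an updated editing form without writing any interface code.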
iSite 1 was hugely successful for a number of years, and is still up and running now. We went from installing Alfresco on the platform to having live web sites within the space of a few months, with more tenant projects being added regularly from then on. We’ve now got around 200 projects in iSite 1, used by roughly 30 different teams in the BBC. There are two thousand registered users in the system, managing BBC blogs, CBBC, CBeebies, Programmes pages, corporate websites and many more.
This brings us to autumn 2011, when we started encountering the limits of what Alfresco could do. Projects and requirements were getting more complex and Alfresco was proving hard to extend. We could add custom form controls and other simple modifications, but we couldn’t really make the changes – namely, to the main publishing, authoring and workflow interfaces – which our users wanted most.
It was becoming increasingly clear that, unfortunately, the future of our CMS strategy did not lie with Alfresco, so we had some decisions to make.
We conducted a feasibility study to look at our options. These boiled down to sticking with Alfresco, adopting another tool “off the shelf”, building something ourselves, or various combinations of these. In the end we decided that building something ourselves was the best option.
The CMS software market is quite crowded, with a huge number of products competing with each other. Despite this, we struggled to find a CMS that met our requirements around flexibility, multi-tenant operation and integration with existing services. Those that did come close also provided a vast array of other features that we did not require (such as rendering systems), which would end up complicating our architecture and adding a support burden.
In addition, off-the-shelf CMS tools generally have quite fixed user interfaces, which are hard to customise. We knew that ill-fitting user interfaces could be a real problem for our users, and we were determined to provide them with a system that allowed them to get on with their jobs without tools getting in their way.
Building iSite 2
Before beginning development work on iSite 2, we spent some quality time with our users. We ran a series of workshops to gather feedback on what they liked and didn’t like about iSite 1, and also ran more formal usability studies to understand how they used the system, and what the problems were.
This provided us with a very good set of user goals for the new system, and we have used these to inform our development work ever since.
Another objective we set ourselves was to align our user interface with that of iBroadcast 2, another internal production tool which provides media management features. We think of iBroadcast 2 as our sister application, as the two are very often used in conjunction by BBC staff to manage the website.
We decided to collaborate with the iBroadcast team on a set of joint design guidelines, and if you compare the two products today they’re very consistent in their navigation, look and feel.
Off the shelf
Although we had decided to build a new CMS ourselves, there were of course many existing libraries, components and systems that we decided to use.
The main off-the-shelf component we used was MarkLogic, an XML database. MarkLogic was already integrated on the platform and was used widely during the Olympics for data ingest and management. With an emphasis on XML documents in iSite 2, it was logical to use it as our data repository.
MarkLogic is scalable enough to handle public-facing content delivery load, as well as being flexible enough to provide all the querying and searching needed at the CMS layer. This fact allowed us to use the same data repository for both scenarios, replacing Electron and significantly simplifying our architecture.
Another off-the-shelf component we used was Orbeon, an XForms engine. XForms is a good fit for our editing requirements, though we did also look at many other options. Orbeon gave us a very rich set of form controls for editing XML documents, and a visual form builder that allows our tenant teams to get up and running incredibly quickly.
Orbeon is only used in the content editing parts of iSite 2, and we have a pluggable architecture which means we can use other sorts of editor if that makes sense. In fact, we have also returned to our raw XML editing roots to some degree, by supplying an in-browser XML code editor based on the CodeMirror syntax highlighting library.
One of the drawbacks of devolving the presentation of data to our tenant teams, using a variety of PHP applications, is that providing CMS users with a meaningful preview is quite hard. Using Electron to store published content separately from the main CMS repository also complicated matters.
iSite 1 used an XSLT-based mechanism to provide a rough preview, but this required tenant teams to maintain both preview XSLTs and their main rendering mechanism in PHP. This meant doubling the amount of work the teams had to do, plus they had to keep the two synchronised. In practice, many teams failed to build the preview XSLTs, as it was just too much work.
Preview consistently came near the top of the list when we asked users for their most desired features. Using MarkLogic as a single data repository made the job a bit simpler, so we wanted to finally solve this problem in iSite 2.
We have designed a simple set of rules that allows the CMS to determine, with some configuration by tenants, the public-facing URL of any piece of content in the system. Using this, we can then push content a user wishes to preview into a temporary location, and call the tenant’s PHP application in a special preview mode which allows it to retrieve the temporary content in preference to actual published content.
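The preview flow described above can be sketched as follows. This is a simplified model under stated assumptions, not the iSite 2 implementation: the URL rule is reduced to a per-tenant template string, the temporary location is a second dictionary, and the class, method and tenant names are hypothetical.

```python
class PreviewStore:
    """Toy model of published content plus a temporary preview area."""

    def __init__(self, url_templates):
        # Tenant configuration: tenant name -> public URL pattern (assumed format).
        self.url_templates = url_templates
        self.published = {}  # public URL -> published content
        self.preview = {}    # public URL -> content staged for preview

    def public_url(self, tenant, content):
        """Apply the tenant's URL rule to work out where content will appear."""
        return self.url_templates[tenant].format(**content)

    def publish(self, tenant, content):
        url = self.public_url(tenant, content)
        self.published[url] = content
        return url

    def stage_preview(self, tenant, content):
        """Push unpublished content into the temporary location."""
        url = self.public_url(tenant, content)
        self.preview[url] = content
        return url

    def fetch(self, url, preview_mode=False):
        """What a rendering application would call to get content for a URL.

        In preview mode, staged content is returned in preference to the
        published version; otherwise only published content is visible.
        """
        if preview_mode and url in self.preview:
            return self.preview[url]
        return self.published.get(url)


if __name__ == "__main__":
    store = PreviewStore({"history": "https://example.org/history/{slug}"})
    url = store.publish("history", {"slug": "hastings", "body": "Published text"})
    store.stage_preview("history", {"slug": "hastings", "body": "Draft text"})
    print(store.fetch(url)["body"])
    print(store.fetch(url, preview_mode=True)["body"])
```

The point of the design is that the tenant's rendering code is the same in both modes; only the source of the content changes, so the preview matches what the public page would look like.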
We are building this feature right now, but we hope it will be one of the most popular features in iSite 2.
Our approach with iSite 2 could be described as “content management as a service”. Our goal is to provide BBC teams with a highly usable, flexible content management system that they can configure for their needs in minutes.
There should be no need to install software, and no worry about hosting, scalability or any of the other things you typically need to do when setting up a content management solution. iSite 2 will take care of all of this behind the scenes, allowing our tenants to get on with the important job of producing innovative, compelling products and services for our users.
As we work on delivering this vision we will also increase the level of integration with other internal BBC systems, providing a cohesive and simple experience for editorial teams no matter what sort of content (text, AV, images and so on) they want to provide in their products.
Most importantly, our aim is to remain flexible. Our content management systems must be able to rapidly adapt to support the changing landscape of the BBC’s digital output, and ensure that as our products and services evolve, so too do the tools behind them.