Search Engine Optimisation: Rebuilding Food

Tuesday 28 August 2012, 10:41

Oliver Bartlett, Product Manager

The 'apples' page on BBC Food

Hi, I'm Oli Bartlett and I was the product manager for BBC Food during the rebuild in 2009-10. This post is a follow-on to Duncan's SEO post to provide a little more context and detail on how we tried to maintain our audience reach during the re-launch.

In the BBC we often see temporary drops in audience reach after a major re-working of a website. In situations where a website is given such a significant overhaul that its structure and page URLs change, one major factor in this drop in traffic is the removal of the old URLs from the site.

Put simply, if you remove the pages and those pages were getting views, then you no longer get the views.

However, links to those pages continue to exist all over the web, most importantly for us in search engine indexes.

Once search engines discover that their indexed URLs are no longer valid (i.e. they receive a 4xx HTTP response code), they will remove those pages from their indexes. To maintain the traffic from search engines, it's important to put in place a good HTTP response strategy for those URLs. For example, where content has been moved rather than deleted, use a 301 response code to redirect to the new location.
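To make the mechanics concrete, here is a minimal sketch of a 301 redirect written as a plain Python WSGI application. It is an illustration only, not the code BBC Food ran: the `MOVED` table and the old path in it are made up, and a real site would more likely handle this in its web server or application framework.

```python
# Hypothetical mapping from removed paths to their new homes.
# The old path is invented for illustration.
MOVED = {
    "/food/old/apple_article.shtml": "/food/apple",
}


def app(environ, start_response):
    """Answer moved paths with a 301 and everything else with a 404."""
    path = environ.get("PATH_INFO", "")
    target = MOVED.get(path)

    if target is not None:
        # The Location header tells browsers and crawlers where the
        # content now lives.
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]

    body = b"Not Found"
    start_response(
        "404 Not Found",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

The status line and the `Location` header are the whole message: a crawler that requests the old path is told both that the page has moved for good and where to find it.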

bbc.co.uk/food

Part of the problem with the old BBC Food website was that there was too much content duplicated in different forms across the website - for example we often had two or more pages displaying the same recipe - which is really bad for users and SEO.

Additionally, a lot of the content was due a refresh in the context of the new product goals - finding recipes and food from your favourite BBC programmes. This led to the decision to cull around 2000 pages from the old website - these included recipes whose rights had expired, duplicate recipes, and articles and other content which simply didn't fit with the objectives for the new product.

Three Kinds of Deleted Page

Each deleted page, or group of deleted pages, required a different approach to HTTP responses:

  1. Expired recipes: 410 - Gone. We present a message explaining the situation regarding rights to BBC recipes, and giving links to similar recipes (where recipe rights have expired we still know the detail of the original recipe, so we can link to similar recipes, i.e. for the same dish, by the same chef, using the same ingredients etc.).
  2. Duplicate recipes: 301 - Moved Permanently. One of the duplicate recipes was kept; the other was deleted from the system and a 301 redirect put in place from the deleted recipe to the canonical one.
  3. Consolidated articles: 301 - Moved Permanently. We created 'food' pages (e.g. bbc.co.uk/food/apple) which acted as canonical resources containing the typical editorial content found in our old food articles (i.e. how to prepare, choose, store etc.). Each deleted article was 301 redirected to the most relevant food page, and in the case of articles about diets, occasions, cuisines etc. we had appropriate canonical pages for each.
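The three cases above can be sketched as a single lookup that decides which status code a removed URL should get. This is an assumption-laden illustration rather than the actual BBC Food implementation: the three tables, the `Decision` type, the `decide` function and every path except `/food/apple` are hypothetical.

```python
from dataclasses import dataclass, field
from http import HTTPStatus
from typing import List, Optional

# Hypothetical audit results; all old paths are invented for illustration.
EXPIRED_RECIPES = {
    # removed recipe -> similar recipes to suggest on the 410 page
    "/food/recipes/old/expired_tart": ["/food/recipes/apple_tart"],
}
DUPLICATE_RECIPES = {
    # deleted duplicate -> canonical recipe that was kept
    "/food/recipes/old/apple_pie_copy": "/food/recipes/apple_pie",
}
RETIRED_ARTICLES = {
    # deleted article -> most relevant canonical page
    "/food/articles/old/all_about_apples": "/food/apple",
}


@dataclass
class Decision:
    status: HTTPStatus
    location: Optional[str] = None
    similar_recipes: List[str] = field(default_factory=list)


def decide(path: str) -> Decision:
    """Pick the response for a URL from the old site."""
    if path in EXPIRED_RECIPES:
        # Rights expired: say so explicitly and offer alternatives.
        return Decision(HTTPStatus.GONE, similar_recipes=EXPIRED_RECIPES[path])

    target = DUPLICATE_RECIPES.get(path) or RETIRED_ARTICLES.get(path)
    if target is not None:
        # Content lives on elsewhere: redirect permanently.
        return Decision(HTTPStatus.MOVED_PERMANENTLY, location=target)

    # Anything not covered by the audit falls through to a plain 404.
    return Decision(HTTPStatus.NOT_FOUND)
```

For example, `decide("/food/articles/old/all_about_apples")` yields a 301 pointing at `/food/apple`, while a path that appears in none of the tables yields a 404 rather than a redirect to the homepage.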

Sometimes, 404 is the right answer

We tried to minimise the number of URLs that returned "404 Not Found" but inevitably there were some which were removed and had no suitable alternative.

In these cases we considered it better to return a 404 than to redirect to the food homepage.

Simply redirecting all removed pages to the homepage breaks the web. For example, if someone has posted a link to a page that subsequently gets removed, by putting a redirect to the homepage you give the impression (to users and search bots) that the post was about the BBC Food homepage.

Additionally, if a recipe search result links you to the BBC Food homepage, that's not helpful and you're less likely to click on a BBC link next time. We'd prefer those links to be removed from search engine indexes so people don't have that experience.

For the few weeks following the relaunch of BBC Food we were getting significant numbers of 404/410s reported on the site, but these were expected.

As the invalid page links were removed from search indexes, these errors tailed off very quickly.
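One simple way to watch that tail-off is to count 404 and 410 responses per day. The sketch below assumes the access log has already been parsed into `(date, status)` pairs; that record shape and the function name are hypothetical, and this is not a description of the monitoring the BBC actually used.

```python
from collections import Counter


def count_removed_page_hits(records):
    """Count 404 and 410 responses per day.

    `records` is an iterable of (date_string, status_code) pairs,
    a hypothetical pre-parsed form of an access log.
    """
    per_day = Counter()
    for day, status in records:
        if status in (404, 410):
            per_day[day] += 1
    # Return days in sorted order so a trend is easy to read off.
    return dict(sorted(per_day.items()))
```

Fed a few weeks of records after a relaunch, the daily totals should start high and shrink as search engines drop the dead links; a count that stays flat or grows suggests something other than stale index entries is producing the errors.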

The new pages were soon indexed and after a brief dip, our audience figures were back and rising healthily. We didn't completely avoid the post-launch dip, but it was predictable and reversible and so much easier to stomach.

Oliver Bartlett is Product Manager, Olympic Data, 2012

Comments

    Comment number 1.

    What reason is there for returning a 404 (Not Found) rather than a 410 (Gone) response for deleted pages? Surely the 410 communicates more (there was a page here but it's gone) than the generic 404 (there might never have been a page here).

    Comment number 2.

    Hi Frankie, good question! An audit of the most visited areas of the site allowed us to configure the new dynamic application to return a 410 or 301 for pages being replaced or consciously removed. However, we turned off the entire old site after the re-launch, and inevitably there were many pages removed as part of this that hadn’t been covered in the audit. These now return 404 (the default for non-existent URLs). You’re right, a 410 is a more appropriate response for all removed pages, but in practical terms we couldn’t explicitly capture all of the removed URLs.
