Designing for your least able user

Michael Smethurst | 20:00 UK time, Monday, 16 March 2009

Usability, accessibility and search engine optimisation from an information architect's perspective

A few reasons why we make websites how we make websites. Thanks to Martin Belam and Nick Holmes.

Also available as a Belorussian translation (by Softdroid).

Another dull presentation...

..in black and white with too much text and no pictures. I apologise - PowerPoint was never a key skill. If it's of any use please feel free to take it and tart it up as necessary.

Following an understandable complaint about the use of Flash in a post about accessibility I've added an S5 HTML version here. Also a note that there's nothing in the presentation that isn't in the post.

Some egg sucking

Way, way back in the day was the internet. It provided a way for machines to talk to machines across the globe. Software engineers no longer had to care about the wires or the packets - all the hard work was done for them. It changed the focus of development away from cables and back to machines.

In 1990-91 Tim Berners-Lee took the internet and added 2 new components. He took SGML, stripped it down and invented HTML. And he took academic theories of hypertext and invented URIs and HTTP. The result was the World Wide Web and it changed the focus from a network of machines to a web of documents and links.

All this is explained far, far better and with greater emphasis on the future by the man himself in his Giant Global Graph post.

All about pages

So the web has always been about 2 things: pages and links. It's pretty obvious but it's something we often lose track of. If you've ever worked on the web just think about all the time and effort we put into pages: wireframes, visual design, photoshop comps, semantic markup, CSS, flash components, 'widgets'... And compare that with the time and effort we put into URI schemas and link design.

In many ways it's understandable. It's far easier to get people to engage with things they can easily picture. And if they engage they sign-off. And if they sign-off we get paid. Unfortunately URI schemas and link designs are not by nature particularly engaging or picturesque.

But URIs and links are more not less important than page design. If you get your URIs right and your pages look shoddy you can always come back and make them nicer. But if you make nice pages but get the URIs wrong you've got a big rebuild job and lose the persistence of your URIs. And persistent URIs are vital for both user experience and search engines.

To use a tortuous analogy a well designed website should be like a cubist painting: the spaces between things are as important as the things themselves - where things in this case means pages and spaces means links. Sorry!

The early days of web search

Back in the days of AltaVista and first generation Yahoo!, web search was all about pages. Search bots crawled the web and indexed the content and metadata of each page. Results were ranked according to keyword density and the content of meta elements.

Pretty soon people realised that they could spam the search engines by including hidden data in pages. The meta keywords and description elements were designed to allow publishers to include metadata to describe the page but some publishers abused them to include popular search terms with no relevance to the page content.

Now search engines have the same attitude to being fooled as George W. Bush. They don't much like it. If a search engine allows itself to be spammed users will no longer trust its results and will go elsewhere. And since modern search organisations are really more in the advertising business than the search business getting spammed could be fatal to their bottom line. So they started to ignore meta keywords and description elements.

The major metric for search result ranking became keyword density on pages. Which was fine whilst the web was small. But last year Google announced they'd found 1 trillion pages on the web. As we see increasing content syndication via feed hubs and friend feeds and reaggregators how do search engines differentiate and rank the ever increasing list of results?

What Google did

To solve the problem Google went back to the 17th Century and some founding principles of modern science: peer review and citation. To differentiate between results with near identical keyword density they also ranked pages by how cited and therefore influential they were. Which for the web meant they counted the number of inbound links. The assumption is that the more a page is linked to (cited) the higher its relevance. This was the foundation of the famous PageRank algorithm. If you're in the mood for deep maths you might want to do some background reading on eigenvalues - if you're not then really don't.
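
If you'd rather read code than maths, the citation idea can be sketched in a few lines of Python. This is an illustration of the published PageRank idea and not anything Google actually runs: the four page names, the damping factor of 0.85 and the iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by repeatedly sharing each page's score across its outbound links.

    links maps a page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outbound links shares its score with everyone.
                for other in pages:
                    new_ranks[other] += damping * ranks[page] / n
        ranks = new_ranks
    return ranks


# Hypothetical four-page web: three pages cite "band", which cites only "film".
example_links = {
    "band": ["film"],
    "film": ["band"],
    "book": ["band"],
    "season": ["band"],
}
example_ranks = pagerank(example_links)
```

In this made-up web the most cited page ends up with the highest score, and the page it links to inherits much of that score. The scores always sum to 1, which is where the eigenvalue maths comes in if you do fancy the background reading.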

So Google brought URIs and HTTP into the world of search. What had been all about pages and content became about pages AND links - the 2 key components of the web working in tandem.

Web 1.0 > Web 2.0 - from link rot to persistence

The hypertext of academia had a few things that TimBL left out of the web. One of these was bidirectional linking. In academic hypertext theory if document A linked to document B then document B would also link to document A.

The lack of bidirectional linking removes an obvious overhead from web design and maintenance. It means that pages don't have to know anything about each other. It also means it's perfectly possible to link from one document to another document that doesn't (yet) exist. A broken link is a poor experience but it doesn't break the web. This permissive attitude to linking is probably one key to the rapid growth of the web - it just made life easier.

Unfortunately it also leads to the problem of link rot. Link rot happens when document A links to document B and document B is then removed, moved or changes meaning. Again the web doesn't break but the link does. Link rot is the enemy of search engines and the enemy of any organisation that hopes to make its content findable.

Back in page focussed web 1.0 days this happened all the time. If web 2.0 was about anything (and it was about lots of things) it was the resurrection of HTTP and URIs as key components of the web. Blog permalinks, blog outbound links, social bookmarking, wikis and AJAX all reestablished the primacy of HTTP, URIs and links as the backbone of the web.

Web 2.0 > Web 3.0 > onwards - from documents to things

So what happens next? Some people talk about web 3.0, some say Semantic Web, TimBL talks about the Giant Global Graph but it's all pretty much the same thing. Instead of a network of machines or a web of documents we're starting to move into a world of linked things or at least a web of data about things. And to build that we first need a web of identifiers - or URIs. Those things might be you or your friend or a great TV programme or a favourite music artist. But the key to identifying them on the web is to give them an HTTP URI and make that URI stable and persistent - cos cool URIs don't change.

Probably the first place to feel the effects of this will be you and your social network. Tired of having to constantly reenter your friends into social network sites? If you (not a document about you but actually you) have a URI and each of your friends has a URI and your social network can be expressed as links between these URIs, there's no need for any more data entry. The technology may be OpenID or FOAF + SSL or a combination thereof - whichever way the concept remains the same.

The future of search (is semantic?)

As clever as Google et al are (and they are very clever) search is still something of a brute force exercise. If I search for The Fall search engines don't know if I mean the fall of man or the film or the season or the book or the band. (In this case a document about the band is ranked first which is how it should be :-) .) But that doesn't help people who aren't fans of Salford based Krautrock.

So influence doesn't always map to authority - particularly when terms are ambiguous. To differentiate between films and seasons and books and bands we need to publish content as data that machines (in this case search bots) can understand. In other words we need Linked Data on the Semantic Web.

It's still early days for Linked Data and search engines. Most of the major players seem to be dipping their toes in the water. Yahoo! SearchMonkey is probably the most high profile effort to date. So far it indexes microformats and RDFa within HTML documents to extract semantic information. I dare say few SEO consultancies will advise you to add microformats or RDFa or full fat RDF to your site just yet - but those days may be coming.

Why this matters

When you're buried in the midst of a large project it's sometimes difficult to focus on anything but the implementation details. Your energy is expended on your site and you forget to consider how that site stitches into the rest of the web. How many user testing sessions have you sat through where the participant is shown to a browser open at the site homepage and asked to find things? If you're designing silo sites the assumption is always that visitors arrive via your homepage - that's why you've spent so much time and effort designing it.

But in real life how many of your journeys start at a homepage and how many start at Google? Every day 8 million users arrive on a BBC page via a search engine. 1 million of those come via BBC search, the rest via Google, Yahoo! etc.

So search is important. The prettiest, most useful site in the world is no use if your potential users can't find your pages. And the easiest way to find things on the web is via search.

Now I'm not saying homepages aren't important - just that the time and energy we spend making them is sometimes disproportionate to their value to the web and therefore to users.

So what can you do?

There are several routes open if you want to optimise your site for search engines. Some of them are vaguely dubious:

  1. You can increase the density of keywords. This often raises objections from journalists and editorial staff who don't like their copy style interfered with. And those objections are often dismissed by techy types and SEO consultants as creative whimsy. But it's not. You often see sites that have been SEOed to within an inch of their lives with keywords repeated ad nauseam. How many times do you have to read Prime Minister Gordon Brown before you understand that Gordon Brown is Prime Minister? If you keep repeating keywords you run the risk of making your content unpleasant to read and therefore less useful. And if it's less useful people won't pass it on to friends or link to it. And without links your PageRank suffers. It's a vicious circle. But from an IA perspective it's always better to write for humans not search bots. The usual style guides still apply - write for your intended audience and try to use words they'd use. But don't repeat yourself unnecessarily just to up your PageRank.
  2. You can add keywords to URIs. Google tells us they take no notice of this but no-one seems sure of what impact it has on Yahoo! etc. If you do choose to do this consider what happens if those keywords change over time. Can you honour your old URIs? Without too many redirects? Search engines see redirects as a potential spam mechanism - Google for example will only follow one 301. If you can't honour your old URIs then links to your pages will break and all your hard won search engine juice will leak away.

Other techniques are less controversial:

  1. You can provide XML search sitemaps. These let you exercise some control over how and when your site is crawled and indexed. Before a search bot crawls your site it first checks the sitemap to determine what's changed, what's expected to change frequently and what hasn't changed since its last visit. This lets you point the bot at new pages and updated pages and makes sure these are crawled and indexed first.
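
As a rough sketch of what such a sitemap contains, here's a small Python function that builds one using the element names from the sitemaps.org protocol (urlset, url, loc, lastmod, changefreq). The example.org URIs, the dates and the function name build_sitemap are all made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for a list of (loc, lastmod, changefreq) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod tells the bot what has changed since its last visit.
        ET.SubElement(url, "lastmod").text = lastmod
        # changefreq hints at how often the page is expected to change.
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: the URIs and dates are invented for the example.
sitemap_xml = build_sitemap([
    ("http://www.example.org/programmes", "2009-03-16", "daily"),
    ("http://www.example.org/about", "2008-11-02", "yearly"),
])
```

In practice you'd generate the entries from whatever knows when your pages last changed, probably your publishing system, so the sitemap never has to be maintained by hand.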

But the majority of techniques are more about URIs and links than pages:

  1. If you've read this far then you've probably guessed the first recommendation is to spend time designing your URI schema. Ensure that you can guarantee the persistence of URIs against all (predictable) eventualities. Don't sacrifice persistence for the sake of readability. If your pages move your search engine juice goes out the window.
  2. If you decide that readability is a prerequisite for you or your organisation and if you plan to publish a lot of pages you'll probably need to allow for editorial intervention in setting these labels. You'll also need to build an admin tool to allow this intervention to take place. Which means you'll need to build the cost of employing these people and building the tool into your project.
  3. If for some unforeseen reason your URIs do have to change spend time getting your redirects right. Sometimes there are redirects on your site that you're so used to you forget they exist. For instance if I link to http://bbc.co.uk/zanelowe the first redirect will be to http://www.bbc.co.uk/zanelowe and the second redirect will be to http://www.bbc.co.uk/radio1/zanelowe/. Since Google will only pass PageRank for one redirect the first of these links will pass no PageRank to Zane Lowe. Since even the addition / omission of trailing slashes will usually cause a redirect getting this wrong could lead to a serious leakage of search engine juice.
  4. Never expose your technology stack in your URIs - no .shtml, /cgi-bin/, .php, /struts/, .jsp etc. Technology is likely to change over time - when it does you don't want your URIs to change with it. As a side benefit keeping your technology out of your URIs also gives fewer clues to hackers.
  5. On the subject of security there was a recent suggestion on MSDN that you should change your URIs every 10 minutes to deter cross-site scripting (XSS), cross-site request forgery (XSRF), and open-redirect phishing. If you did choose to do this you'd be kissing your search engine findability goodbye.
  6. Never include session keys in your URIs. This is one of the options discussed in the MSDN article above. But it's also commonly used as a means of tracking users across a site. When you visit a site you're given a key that persists for the duration of your browser session. This is then used in every subsequent URI link to track your journey. But since every user gets a different key for every visit it means you can't link to, bookmark or email a link to a friend. Which means search engines can't see or index the page. It's a technique that's used on the BBC jobs site. Which means that this link to a Workstream Delivery Manager job (eh?) which works for me won't work for you and won't work for search engines. And next time I open my browser it won't work for me either. Nice.
  7. Only use https where you have to. https is http's more secure cousin. It should be used when you want users to submit sensitive information to your site. However, pages served with https aren't indexed by search engines so you don't want to use it for plain content pages. The BBC jobs site gets it right with application submissions which are https. Unfortunately it also uses https for everything else: search results, job listings, job descriptions... Another reason why BBC jobs don't turn up in search engines.
  8. One of the first rules in the O'Reilly Information Architecture book is don't expose your internal organisation structures in your public interface. It's still something that often happens and can take many forms: using labels that only your business units understand, reflecting your management structures in your site structure etc. The most pernicious examples usually happen when you think of your website as a set of stand-alone, self-sufficient products. The web really doesn't lend itself to this shrink-wrap mentality. The net result is often the creation of multiple pages / URIs across your site that talk about the same thing. In general your site should tend towards one page / URI per concept. When you get multiple pages about the same thing some will inevitably end up unmaintained and go stale. This all results in confused users. It also results in confused search engines and the splitting of your PageRank across multiple URIs. It's better to have one page with 10 inbound links than 10 pages with one inbound link.
  9. If for some reason you do end up with multiple pages about the same concept at least make sure there are links between them. Decide which one is the canonical page - the one you want to see turn up in search results. And add a rel="canonical" link element to the head of each of the other pages, pointing at the canonical one. If search engines find many similar pages they'll try to squash down the result set into one page. Telling them which page is canonical helps them to make the right decision.
  10. Connecting up your site on a data and interface level and breaking down the content silos results in a more usable, more search engine friendly experience. The first step is to agree on what you model, check your understanding with users and agree on identifiers. Once you've done this new linking opportunities arise, new user journeys become possible and you can slice and dice one set of data by many, many others. The more content aggregations you make, the more user journeys, the more links for search engines to get their teeth into. As an example, if part of your site is about programmes and some programmes contain recipes and another part of your site is about food and contains recipe pages then link from the programme episode page to the appropriate recipes and link from the recipe pages to the episode they were featured in. It's simple in principle - the key to good user experience and good SEO is to get your infrastructure and piping right.
  11. If we're agreed on one page per concept we should also agree on one concept per page. There'll probably always be pressure from marketing types to include lots of cross-promotion links from content page to content page. Which is fine in principle. In practice it can lead to pages that have more adverts than content. This waters down their keyword clustering and can also be confusing for users - what is this page and where am I? If you connect up your data you can start to build semantic links through content and minimise the need for clumsy advertising. Think Wikipedia, not right hand nav.
  12. You can encourage people to link to you by making every nugget of content addressable at a persistent URI. The analogy here is with Twitter. Every tweet, no matter how mindless or empty of content and meaning, has its own URI. Which means when someone does say something interesting people can link to it. And because every tweet links back to its tweeter it's all more links for search engines to chew on.
  13. Remember you don't have to mint your own identifiers. If you can, use common web-scale identifiers for the concepts you're interested in. It makes it easier for other sites to link to you if you share a common currency.
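
To make the redirect point (number 3 above) concrete, here's a minimal Python sketch that counts the hops in a redirect chain and then flattens the chain so that every old URI reaches its destination in a single redirect. The redirect table is just the Zane Lowe example written out as a dictionary; the function names resolve and flatten are invented for the illustration and a real site would do this in its web server configuration.

```python
def resolve(uri, redirects, max_hops=10):
    """Follow a redirect table and return (final_uri, number_of_hops)."""
    hops = 0
    seen = {uri}
    while uri in redirects:
        uri = redirects[uri]
        hops += 1
        if uri in seen or hops > max_hops:
            raise ValueError("redirect loop or chain too long")
        seen.add(uri)
    return uri, hops


def flatten(redirects):
    """Rewrite the table so every old URI reaches its destination in one hop."""
    return {old: resolve(old, redirects)[0] for old in redirects}


# The chain described in point 3, written out as a lookup table.
redirects = {
    "http://bbc.co.uk/zanelowe": "http://www.bbc.co.uk/zanelowe",
    "http://www.bbc.co.uk/zanelowe": "http://www.bbc.co.uk/radio1/zanelowe/",
}
final_uri, hops = resolve("http://bbc.co.uk/zanelowe", redirects)
flat = flatten(redirects)
```

Against the original table the first URI takes two hops to arrive. Against the flattened table it takes one, which is the most you want any inbound link to go through.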

The final tip is the most important. MAKE GOOD CONTENT!!! If your content is interesting, relevant or funny people will want to bookmark it, cite it and share it with friends. If it isn't they won't.

Share the love

So I've talked about inbound links - what about outbound links? According to a strict interpretation of the PageRank algorithm if inbound links are a tap pouring lovely search engine juice into your page then outbound links are an unplugged leak splurging it back out again.

There's a feeling amongst people who spend time discussing this stuff that search engines would be shooting themselves in the foot if they did penalise outbound links. The web in general and search engines in particular thrive on the density of links. So far I've found no evidence either way on this one but maybe I haven't looked in the right places - or maybe the right places weren't SEOed :-) If you know better (or you work at a search engine company) maybe you'd like to leave a comment. In the meantime we're working with search engine companies to get to the bottom of this - when we understand more I'll update this post.

Either way if you want to make the web a better place make links. If you find an article you like you could bookmark it in your browser. But if you do that only you benefit. If you delicious it or twine it or blog about it or whatever it is you do then your social network also benefits. And the links from delicious etc all count to the PageRank of the article so it becomes more findable and the publisher benefits too. It's worth noting that the higher the PageRank of your page the higher the PageRank its links pass on.

So if you think this post has been worth reading then please (social) bookmark it or blog it. Or if you think it's all rubbish blog it anyway but add a rel="nofollow" attribute to your link back.

Which brings us nicely to rel="nofollow". It's a way to link to something without passing PageRank. And since links are often seen as leaky many publishers choose to add it to all outbound links. Indeed some publishers are so convinced that links mean leaks they even add rel="nofollow"s to their own internal navigation - things like Terms and Conditions and Privacy Policies. It's a practice called PageRank sculpting and it verges on the paranoid. Given Google's advice on rel="nofollow" it's also pretty pointless.

rel="nofollow" is also commonly used on sites that accept user content. In order to stop people using links in comments to leech PageRank from the hosting site, publishers often add rel="nofollow" attributes to all links in comments. Twitter is one example amongst many - every link in a tweet is automatically made nofollow. The trouble with nofollow is that if everyone used it, PageRank would die and web search would die with it. So go easy on the rel nofollows or you might break the web.
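
If you want to see how liberally your own pages use nofollow, a quick audit is easy enough. The sketch below uses Python's standard html.parser module to list each link on a page and whether it carries rel="nofollow". The class name LinkAudit and the sample markup are invented for the example.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect each link's href and whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel can hold several space-separated values, e.g. "external nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


# Hypothetical page fragment for illustration.
sample_html = """
<p><a href="http://www.example.org/article">A good article</a></p>
<p><a href="/terms" rel="nofollow">Terms and Conditions</a></p>
<p><a href="http://www.example.com/" rel="external nofollow">A comment link</a></p>
"""
audit = LinkAudit()
audit.feed(sample_html)
followed = [href for href, nofollow in audit.links if not nofollow]
```

Here followed holds only the links that still pass PageRank. If that list is empty for most of your pages you're probably sculpting rather more than is good for the web.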

Some other things you can do

'Widgets' and APIs

Widgets and open data APIs allow users to take your content / data and reuse it in their own sites and applications. It seems counterintuitive to suggest that if your content can be found everywhere it'll be more findable on your site. Again it all comes back to links.

In the case of widgets they almost always come with links back to the content of the source site. The presentation at the top of this post for instance comes with 3 links back to slideshare. Every one of those links makes the slideshare content more findable.

Unfortunately it's not always so simple. The slideshare widget at the top of this page displays (mainly) static data. Which means it can be rendered in HTML with a Flash movie for the actual presentation. Because the links back to slideshare are plain HTML links they all contribute to PageRank. In many cases widgets need to display dynamic data. To do this they often use JavaScript and / or Flash to render themselves. In which case the usual Flash and JavaScript problems emerge.

A much better way to encourage links back to your content is to provide an open data API. Opening your data with APIs allows other people to take it, mash / mesh it up with other data sources and make things you've not thought of or didn't have the time to implement. The Twitter API is a great example. The site itself is stripped down to perfection. It does exactly what it needs to do and no more. But the API allows other people to take the data and use it in new and imaginative ways. 275 Twitter applications are listed here. Some of these are standalone applications that work on desktops and mobile phones. But others are websites that can be crawled by search engines. And they all link back to Twitter passing search engine juice as they go.

So opening your content / data for reuse can make your site more findable and drive traffic back to you. As ever, if you love your data, set it free. Everybody wins.

One web

Once you've modelled your data, given everything a URI and provided as many aggregations and user journeys as possible it would be silly to dilute your users' attention and links by providing the same content at a different set of URIs. But we still do this all the time with special sites for mobile and other non-desktop devices.

For now it's not too much of a problem. Most other devices don't have the rich web of support sites you get on the desktop web and device specific sites are still more walled gardens than their more woven-into-the-web desktop cousins. But as mobile support increases there's no reason to suppose that the complex ecosystem of support sites (social bookmarking tools, for example) won't evolve with it.

The iPhone in particular already raises many of these problems. It's quite capable of rendering standard desktop sites and integrating with social bookmark tools. But we often create a separate iPhone version at a different URI to take advantage of the iPhone's swishy JavaScript page transitions etc.

Clearly different devices call for different content prioritisation, different user journeys and different interaction patterns. But they don't need their own set of URIs. It's better to use content negotiation and device detection to return a representation of your content appropriate to the user's device. A single set of URIs means your users' attention and links aren't split, which increases the search engine juice to your pages. One web for all.
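
As a very rough sketch of the idea, here's a tiny WSGI application in Python that serves two representations of the same page from the same URI. The device detection is deliberately crude: the MOBILE_HINTS list, the function names and the two HTML bodies are all assumptions made for the example, and a real site would use a proper device database.

```python
MOBILE_HINTS = ("iphone", "mobile", "symbian")  # assumed, far from exhaustive


def choose_representation(user_agent):
    """Pick a representation name for the requesting device."""
    agent = (user_agent or "").lower()
    if any(hint in agent for hint in MOBILE_HINTS):
        return "mobile"
    return "desktop"


def app(environ, start_response):
    """WSGI app: one URI per page, several representations of it."""
    representation = choose_representation(environ.get("HTTP_USER_AGENT"))
    if representation == "mobile":
        body = "<html><body><h1>Programme</h1></body></html>"
    else:
        body = (
            "<html><body><h1>Programme</h1>"
            "<p>Full page with navigation.</p></body></html>"
        )
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Tell caches that the response differs by User-Agent.
        ("Vary", "User-Agent"),
    ]
    start_response("200 OK", headers)
    return [body.encode("utf-8")]
```

To try it locally you could hand app to wsgiref.simple_server.make_server from the standard library. The point is that the URI never changes: bookmarks and inbound links made on a phone and on a desktop all land on the same address.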

If content negotiation / device detection is too much work you need to decide which representation (usually desktop web) is canonical and mark it / link to it as such.

Erm, user generated content

So we all know that user generated content is a patronising and demeaning label. And I'm afraid I'm going to demean it further. Sorry.

Way back in 1995 Nicholas Negroponte wrote a book called Being Digital. In it he discussed what broadcasters would have to do to make their content findable in a digital world. He talked about bits (the content as digital data) and bits about bits (data about the content). (I guess bits about bits are what we now call metadata but I think I still prefer bits about bits.) On the subject of bits about bits he said:

[..] we need those bits that describe the narrative with keywords, data about the content, and forward and backward references. [..] These will be inserted by humans aided by machines, at the time of release (like closed caption today) or later (by viewers and commentators).

Emphasis my own. Remember this was well before everyone talked about social media, before Google existed and before most people had even heard of the web.

Nowadays we're all used to sites that ask us to log in and rate and tag and comment on content. This might seem cynical but in many cases (although not the BBC of course) site publishers invite these interactions not because they're interested in what you have to say but because it's a valuable source of additional data about their content. And this data can be used to make new aggregations (most popular; tagclouds for you, for your friends, for everyone; latest comments etc).

From an SEO perspective publishers benefit twice. Firstly search engines have more text and keywords to chew on without requiring much editorial intervention / expense. And secondly more aggregations means more links into content pages, more user journeys and more journeys for search engines. Every one of those inbound links pushes up the PageRank of the aggregated page.

Personalisation

There are 2 ways to personalise a site. The first is to change the content of your existing pages according to the instructions and behaviour (implicit and explicit) of the logged in user. So a page about a TV programme might include a list of your friends who've watched that programme. It's a good way to make your site feel lived in but if this is all you do you will sacrifice valuable social recommendation and search engine goodness.

The first problem with personalising only on existing content pages is that only you can see this data as presented - your friends and friends of your friends can't. So you sacrifice valuable recommendation from outside the user's immediate social graph.

The second problem with content page personalisation is that search engines can't see it either. Search engine bots can't register and can't log in. Which means that all your development work and all the work your users put in consuming and annotating your content can't be seen by search engines.

The answer again lies in URIs and links. In this case you should treat each user as a primary data object and give them a persistent URI. Make links from users, through their attention data to your content and from your content to your users. Obviously you should ask your users before you expose their attention data - how much is made visible to the web should always be under their control.

If you need an example of how to do personalisation properly take a look at The Guardian's user pages.

Accessibility, SEO and karma

This is probably the only section that's pertinent to the title of this post. But since I quite like the title I'll stick with it - even if it is non search engine optimised.

We've long accepted that accessibility is not an optional extra. Neither is it something you can just stitch over your site when all other development is done.

The same is true of SEO. And many of the rules we follow to make sites accessible will also make them more search engine friendly. So even if you don't design for accessibility because you know you should, self interest should take you in that direction anyway. It won't be as good for your karma but it will have the same effect.

Accessibility is not a set of WAI boxes to be ticked - it needs to be baked into your whole design, build and testing ethos. Building an accessible site is no use unless it's also usable. And even a usable site is pointless unless it's useful. Giving things persistent URIs, connecting your data, building non-siloed sites and providing new journeys across and out of your site all help to make your site usable and useful so maybe it's all connected...

Plain English

Always write in plain English or French or Welsh or [insert your chosen language here]. Use language your intended audience will understand. If you overcomplicate your text you run the risk of confusing users. The risk is doubly so for users with cognitive disabilities. This doesn't mean you have to write tabloid style - as always write to be understood.

If you stick to the language of your users chances are your chosen words will also be words they search for. Which will help with your search engine friendliness.

If your website has its own search functionality the standard SEO advice is to check your search logs to see what users are searching for and tailor your language appropriately. However, Google tell us that they already perform a certain amount of term association so if your site says 'TV listings' and a user searches for 'tv guide' they'll find your content anyway. There's really no need to cramp your writing style so long as you keep things clear.

Alt attributes

Probably too obvious to mention. People with visual disabilities struggle with images. If your page includes images you should include an alt attribute to describe the image. Search engine bots spend their lives chewing through pages like pacman on a diet of links. They can detect the presence of an image but not decipher what's depicted. So they need alt attributes too. Sometimes images are used for purely decorative purposes. You only need descriptive alt text if the image adds meaning to the document. For decorative images an empty alt attribute is the right choice.
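
Checking for missing alt attributes is easy to automate. Here's a minimal sketch using Python's standard html.parser module. The class name AltChecker, the image paths and the alt text in the sample are all made up for the example.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Record the src of every img element that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # alt="" is allowed: it marks an image as decorative.
            if "alt" not in attributes:
                self.missing_alt.append(attributes.get("src", "(no src)"))


# Hypothetical page fragment for illustration.
sample_page = """
<img src="/images/presenter.jpg" alt="Zane Lowe in the Radio 1 studio">
<img src="/images/divider.png" alt="">
<img src="/images/chart.png">
"""
checker = AltChecker()
checker.feed(sample_page)
```

After feeding it the sample, checker.missing_alt contains only the chart image. What a script like this can't tell you is whether the alt text you did write is any good - that still needs a human.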

Semantic HTML

Screen readers struggle with old style table layouts. It's best to keep your markup stripped down and simple. Get your document design right and use semantic (x)HTML.

For now search engines only really care about headings. Text found in h1s, h2s, h3s etc will be given extra weight. But you might as well go the whole hog. Separating out document design into semantic HTML and visual design into CSS will make your site easier to maintain and update. Go easy on those definition lists tho...
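
To make that concrete, here's a minimal sketch of the same content marked up both ways. The content itself is invented for the example:

```html
<!-- Old style: a table and font tags used purely for layout -->
<table>
  <tr>
    <td><font size="5"><b>TV listings</b></font></td>
  </tr>
  <tr>
    <td>Tonight on BBC One...</td>
  </tr>
</table>

<!-- Semantic: headings and paragraphs carry the structure, CSS carries the look -->
<h1>TV listings</h1>
<h2>BBC One</h2>
<p>Tonight on BBC One...</p>
```

In the second version a screen reader can announce and jump between headings, and a search engine can see which text is the heading.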

Hidden content

Screen readers are pretty inconsistent in their support for CSS content hiding. The majority of modern screen readers will ignore content hidden with visibility:hidden or display:none but still read out content hidden by positioning offscreen. Whether this is intentional or because screen readers are still catching up with modern CSS design techniques is unclear. Offscreen position hiding is often used in CSS image replacement of titles. Whilst screen readers will still read the offscreen text, the replacement images aren't rescaled in most browsers so they still have accessibility issues. Remember - more people have bad eyesight and need to increase font size than use screen readers.
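
A small sketch of the two hiding techniques, assuming made-up class names. Actual behaviour varies between screen readers so treat the comments as the usual case rather than a guarantee:

```html
<style>
  /* Removed from rendering: most screen readers skip this too */
  .gone { display: none; }

  /* Still in the page but moved out of view: usually still read out */
  .offscreen { position: absolute; left: -9999px; }
</style>

<p class="gone">Not shown and, in most screen readers, not read out.</p>
<p class="offscreen">Not shown but usually still read out.</p>
```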

If you've got this far you'll know it's possible to write a very dull article. It's also possible to add search friendly but non-contextual keywords to a dull article and hide them with CSS. Like meta keywords of old, search engines see any hidden content as a potential attempt to spam them. So hidden content is penalised. Keeping content visible will help accessibility and help your SEO.

Forms

Designing accessible forms has been a subject of long debate. It's usually framed in terms of screen reader users. But complex forms are confusing for all users and doubly confusing for users with cognitive disabilities. Sometimes forms are unavoidable (for search, for user signup etc) but if possible always provide routes to content that don't require form filling.

Search engine bots can't fill in forms. Which means they meet forms and refuse like a small horse at a large fence. Put simply search bots can't search. Sometimes we're asked why we make topic pages and don't just add the extra semantics into search results. And part of the reason is that search engines can see topic pages and the links from them - but they can't see BBC search results. Getting site search right is important. But it won't reward you with any more search engine juice.

Not to pick on the BBC jobs site too much but since its sole entry point is a search form there's no way to browse to jobs, which means even if the job pages were indexable (which they're not) search engines wouldn't be able to find them in the first place.

So for the sake of accessibility and SEO never use forms when links will do (or at least try to provide both). The only exception is when the action at the end of the link is destructive - you don't want Google (or Google Web Accelerator) deleting your data.
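
Here's a sketch of what "provide both" might look like on a jobs site, plus a destructive action kept behind a form. All the URLs and labels are hypothetical:

```html
<!-- Search form for people who know what they want -->
<form action="/jobs/search" method="get">
  <label for="q">Search jobs</label>
  <input type="text" id="q" name="q">
  <input type="submit" value="Search">
</form>

<!-- Browse links that people and search bots can both follow -->
<ul>
  <li><a href="/jobs/journalism">Journalism jobs</a></li>
  <li><a href="/jobs/technology">Technology jobs</a></li>
</ul>

<!-- Destructive action: a POST form, not a link, so nothing that
     simply follows links will trigger it -->
<form action="/jobs/alerts/delete" method="post">
  <input type="submit" value="Delete this job alert">
</form>
```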

Flash

Use of Flash obviously has its place on a modern website. If you want to deliver streaming video or audio it's the obvious choice. But overuse of Flash can lead to accessibility problems. It's possible to make Flash semi-accessible with tabbed navigation and keyboard shortcuts but you need to put the work in. If you choose to use Flash for your main site navigation you're making a lot of work for yourself if you also want your site to be even approaching accessible.

Even if you use HTML for site navigation and Flash for audio-visual content there'll be knock on effects for accessibility. Because Flash doesn't scale, if your movie contains lots of text or carries its message via moving images and video you'll make life difficult for users with visual disabilities. If it carries its message via audio you'll make life difficult for users with hearing disabilities.

The best way to make content locked up in Flash files accessible is to provide an HTML transcript.

Modern search engines are starting to be able to look inside Flash files and index their contents - so take care when you're adding abusive comments :-) So is Flash still incompatible with SEO?

If you use it as your primary navigation the answer is absolutely yes. The other day I was looking at a site that sold trainers. I spotted a pair that I rather liked. Normally I'd have bookmarked the page in delicious and come back later but in this case the whole site was rendered in Flash. Which meant there was no page to bookmark. Which meant the company lost not only a potential sale but also one tiny drop of search engine juice. Add that up across many potential users and the net effect is fewer sales and less findability for their products. So use Flash sparingly.

If you lock up a lot of your content in Flash the answer is still yes. Flash is primarily a visual, time based medium and search bots don't have much visual acumen. When we say they can index text based content in Flash files there are 2 caveats:

  • if the text is entered as a bitmap there's nothing a search engine can do to pull it apart,
  • if the semantic structure of the text is a product of the movie's timeline no search engine will be able to stitch this together.

The best way to make content locked up in Flash files search engine friendly is to provide an HTML transcript.
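
One way this could be marked up, as a sketch only. The file name and transcript URL are invented, and the fallback paragraph inside the object element is what browsers without the plugin would show:

```html
<object type="application/x-shockwave-flash" data="interview.swf"
        width="480" height="270">
  <!-- Fallback content, shown when the movie can't be played -->
  <p>This interview needs the Flash plugin to play.</p>
</object>

<!-- Transcript link outside the object so everyone (and every bot) gets it -->
<p><a href="/interviews/example/transcript">Read the full transcript of this interview</a></p>
```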

JavaScript and AJAX

JavaScript and AJAX can be an accessibility disaster zone. It's almost impossible to keep screen reader users updated when the page changes state. The result is confusion, confusion leads to frustration and your users go elsewhere. The best approach is to design and build your site as plain old HTML and progressively enhance by layering over JavaScript and AJAX. Always test your site with JavaScript turned on and off to make sure it works in all modes.

In search engine terms what goes for Flash goes for JavaScript and AJAX. Used appropriately they can make your user experience more dynamic and interactions flow more freely. Occasionally (although less these days than when it first appeared) JavaScript is used to render a whole site. Which means that URIs are not exposed to users or to the web. Now most search engines don't process JavaScript so can't fight their way through to your content. And even if they could, because individual URIs are not exposed, there'd be no pages to index. It also means that users can't (social) bookmark or blog your pages which cuts down the number of inbound links and reduces your search engine juice. Again the best approach is plain HTML first, with JavaScript and AJAX layered over the top. Even so JavaScript and AJAX should be used sparingly. Even if your site degrades gracefully you still need to expose individual page URIs to users so they can link to them.
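
A minimal sketch of the HTML-first approach, with hypothetical URLs and ids. The links work on their own. The script is only a layer on top and keeps the address bar in step so each state still has a URI people can bookmark and link to. A real site would probably fetch a fragment rather than a whole page:

```html
<!-- Works with JavaScript off: every item is a real link to a real URI -->
<ul id="episodes">
  <li><a href="/programmes/episode-1">Episode 1</a></li>
  <li><a href="/programmes/episode-2">Episode 2</a></li>
</ul>
<div id="detail"></div>

<script>
  // Enhancement layer: load the linked page into #detail without a full reload.
  // If the browser lacks these features the plain links above still work.
  document.getElementById('episodes').addEventListener('click', function (event) {
    var link = event.target.closest('a');
    if (!link || !window.fetch || !window.history.pushState) return;
    event.preventDefault();
    fetch(link.href)
      .then(function (response) { return response.text(); })
      .then(function (html) {
        document.getElementById('detail').innerHTML = html;
        // Update the address bar so this state can be bookmarked and linked to
        history.pushState(null, '', link.href);
      })
      .catch(function () {
        // On any failure fall back to ordinary navigation
        window.location.href = link.href;
      });
  });
</script>
```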

Link titles

Finally when screen readers encounter a link they'll read out "link - link title" where "link title" is the text found between the opening <a> tag and the closing </a> tag or the title attribute on the <a> tag if it has one. This means that if you use:

For more on {important key words} click <a href='..'>here</a>

users will hear "link - here" which isn't very informative. If instead you use:

<a href='..'>More on {important key words}</a>

users will hear "link - More on {important key words}" which is much more useful.

For search engines link titles are almost as important as link density and keyword density. There's little point peppering your documents with search keywords if you don't make your links descriptive. So again accessibility and SEO will both benefit if you make the titles of your links as descriptive of the link target as possible.

It's probably worth pointing out that you can only control the link titles on pages you publish. The rest of the web is free to link to your content with any label they see fit. A while back lots of people took it upon themselves to link to the official George W. Bush page on the White House website using the link title 'miserable failure'. For a little while the top result in Google for 'miserable failure' was this biography. It's called Google bombing and there's nothing you can do about it.

And the story doesn't end with the coming of Obama. With the change of administration the White House webmaster permanently redirected the Bush biography page which was at http://www.whitehouse.gov/president/gwbbio.html to the new Obama biography page at http://www.whitehouse.gov/administration/president_obama/. Which meant Obama inherited the 'miserable failure' search ranking. They've fixed the problem since but there are 2 lessons in this. The first is to beware of semantic drift. The 43rd president was not the 44th president; last year's Glastonbury Festival was not the same as this year's Glastonbury Festival. Every time you encounter a new concept you need to mint a new URI. The second lesson is be very careful with your redirects...

Finally

Since I seem to have spectacularly failed to write a pithy blog post I guess a few more paragraphs won't hurt...

In summary, your PageRank is outside your control. How well your site fares in search engines is pretty much at the discretion of the web. The best thing you can do is make lots of your own links and encourage other people to link to you. There are of course other options if you want to artificially inflate your search ranking (mainly keyword clustering) but...

...Google et al are cleverer than we are. They employ the best graduates from the best universities in the world. If a rival publisher makes a better page than you but your page gets a higher search ranking, users will find a new search engine that returns better results. And Google etc would lose their business. The clever people aren't about to let that happen.

So from an IA perspective the best advice is to keep things simple. Design and build with the established tools of the web: HTTP, URIs, HTML, CSS. If you make a site that's usable and accessible for people, chances are it'll be usable and accessible for search bots too. Search engines are only trying to reward good behaviour and good content. Don't make anyone's life harder than it has to be...

Magazines are made of pages, websites are made of links.

Comments

  • Comment number 1.

    A very nice read sir, and in terms of size, Mr Hill would be proud.

  • Comment number 2.

    You’re preaching about designing for your least able user, by uploading a presentation to a Flash-only Web-site.

  • Comment number 3.

    @johndrinkwater Hi John. I think preaching is a little strong and certainly not the intention. I can appreciate why you'd object to flash from a free software perspective and why you'd object to flash from an accessibility perspective but in this case there's nothing in the flash that isn't in the post. How would you prefer to see the presentation? Plain text, html transcript, s5, something else?

  • Comment number 4.

    I don’t feel preaching is overly harsh, though I am sorry if it has offended - we are talking about best practices and yet accepting some wiggle room for ease of use.

    Note that I can not see that the presentation contains the same material included in the post.

    Flash is still at the mercy of one development house, and we can not expect that in many many years we’ll have access to flash consumers - right now most pages on the BBC have been overcome by this timebomb.

    S5 is perfectly acceptable, as it's viewable with and without javascript (this point also includes your HTML transcript) - and the best thing, nearly all web-enabled devices can view s5 presentations. My mobile (opera mini @ p910i), an iphone(! w/o flash), a G1, a PS3, Firefox, lynx, a psp, need I go on? :)
    For offline viewing, PDF or ODF are equally acceptable.

    I would like an open web that everyone can view, share, reuse, learn, etc and the really funny thing is it's not that far away!

    Of course it's wrong of me to attack you when the majority of BBC bloggers are unaware of the problems it can bring - I confess it's from being brought up from a young age with the beeb and expecting the world from them.
    Would you mind passing my comments on to the general BBC IA?

  • Comment number 5.

    Hi John. An S5 version is now available linked from the top of the post. Apologies for my previous laziness. I'll bring up your Flash comments at the next IA meeting...

  • Comment number 6.

    Wow, that was a pretty thorough account of the history of the web from an SEO's perspective. I found it very interesting brought together like that. I have put a link to it in my blog.

    Monker
    Ethical Link Building

  • Comment number 7.

    Implementing best practices for semantic user interfaces is incredibly important today.

    This is the reason why I am writing a book on this very topic:

    http://flexewebs.com/semantix

  • Comment number 8.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 9.

    "Magazines are made of pages, websites are made of links." - Like it!

    The development of the general principle of SEO, i.e. content & links, was and I think still is the only logical path. Relevance is THE factor in determining ranking and the only way to be seen as relevant is relevant content & relevant links.

    Andy Maclean
    SEO Marketing Consultant (http://www.openeyemarketing.co.uk/seo-uk.html)

  • Comment number 10.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 11.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 12.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 13.

    Hi Michael - nice article - and extremely comprehensive.

    You mention that meta description and meta keywords don't contribute to page rank, but relevance to a user query is also important and perhaps you could give a bit more credit to the importance of page titles, which only seems to be mentioned in passing in slide 14.

    Although Google keeps specific details of its algorithms private, they do provide a lot of help and guidance for webmasters in their webmaster guidelines for example: 'Make sure that your title elements and ALT attributes are descriptive and accurate' so maybe it would be a good to include a reference to the guidelines in the 'What You Can Do' section.

    Also interesting is the question of whether a title that grabs attention is better than one that more directly reflects what is on the page.

    For example your choice of h1 ('Designing for your least able user') which, whilst intriguing for users, is less semantically accurate than your h2 ('Usability, accessibility and search engine optimisation from an information architect's perspective'). (I wonder if the search engines have semantically linked least able user to themselves yet :))

    At the end of the day, the proof is in the pudding and the fact that this article has attracted 52 backlinks according to Yahoo since it was published in March demonstrates how great content naturally attracts links.

    Nice article - Mark E. Smith would be proud :)

    Hit-The-North
    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 14.

    A brilliant article - will be recommending it to the guys at work.

    As I work in the field of SEO I would like to respond to your point on "search engines would be shooting themselves in the foot if they did penalize outbound links"

    With it often being said that you are effectively giving away your page rank by providing dofollow links to other websites, it would seem that in general outbound links would be a bad idea. Even some of the people I work with within the industry feel that unless it is in your interest to promote the site you are linking to then a nofollow attribute should be used. Personally I feel this concept would be against the principles that are promoted by Google.

    Google have done a lot for the validity of the internet. I think they have worked very hard to ensure that both their search engine and the internet in general are as accessible and useful as they can be to the user. As discussed in this article one of the methods used by Google to help promote useful and relevant search results is to analyse hyperlinks that flow between websites. So the more links there are the better their system works.

    Considering that Google relies on links to help to validate that the websites returned in the SERP are relevant to the search string, then why would they penalize webmasters for having outbound links? Would they not be more inclined to reward them in some way for their contribution toward helping improve the relevance of their search results?

    Would enjoy hearing other thoughts on this.


    Thanks for the article,

    Jon
    SEO Specialist

  • Comment number 15.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 16.

    oops!

    How do I delete my first comment? Any ideas please?

  • Comment number 17.

    What a great article.

    I love this line -

    "Magazines are made of pages, websites are made of links" - awesome.

    Thank you Jon for the link mate.

    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 18.

    As a Digital Marketing Consultant I have the daily task of searching for links with a follow attribute. This is possibly one of the most time consuming aspects of my work.

    Your Outbound link section is very interesting as this is something which I have always avoided. I do believe that the net should be about contribution, is it just a myth then that you should only have inbound links?

    *Thanks for the link Jon.

    Clare Brace
    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 19.

    Great Article. I pride myself on using clean and semantic code and trying to make my websites as accessible as possible, and I believe that if people keep to these core principles and pay attention to web standards then they should also reap the rewards. W3C's validator is a useful tool for validating your code and with a mixture of browsers and OS's available now, it makes sense for designers and developers to stick to these standards.

    James Myers
    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 20.

    Thinking about it...
    I'm having trouble getting Golf Equipment Repairs ranked in Google

    Does anyone have any suggestions?

    I'm working my way through the Word Tracker '50 Kick Ass Key Word Strategies' which is also a really good resource (but not sure if I should mention that on a BBC site (sorry))

    Cheers
    Clare
    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 21.

    Hello, I might be out of context here, but I just can't get over failing my theory test for the second time today. The thing is that I feel there's foul play going on with the computers in the centres. Reason being that I had 48/50 on the theory part and 42/75 on the hazard perception part. I still can't get my head round what went wrong in the hazard perception test to make me fail the test altogether as I still think I was brilliant there. The worst of it is that I came across 3 other candidates who have done the test thrice and still failed because they didn't score 44/75 in the HP test. What's more, you can't review the video to see where you went wrong or obtain any information from the examiners. My point is that these agencies carrying out these tests fail candidates willingly in order to make more money.

  • Comment number 22.

    Hi,

    I personally like your post; you have shared good insights and experiences. Keep it up.

    You obviously put a lot of work into that post and it’s very interesting to see the thought process that you went through to come up with those conclusions. Thanks for sharing your deep thoughts. I must admit that I think you nailed it on this one.

    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 23.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 24.

    Thank you Michael that is a gem of an article. My only criticism is that you could have made that in to at least four. As a digital marketing consultant with considerable web project management experience I found your refresher on web usability worthwhile.

    In terms of site visibility I still believe that Google is getting it wrong in a lot of cases. This is due to the ease of manipulation by SEO's. I was doing some research the other day and I found that the most relevant site in a particular sector was languishing on page 3 for a high traffic, highly relevant keyword whereas a site with a more tenuous case for ranking was on page one position 3.

    On investigation I noted the number of inbound links for both sites and the higher ranking site had more links (3k) and the other site only 50 or so. Digging deeper I checked the linking sites for anchor text, PageRank and then relevancy and found that of these 3k most were totally unrelated and often junky directories or links pages however they had been using keyword heavy anchor text. The other site had about 75% relevant links however they were more 'natural' links (click here, URL etc).

    My point is that the lower ranking site had better links, it's just that Google's algorithm just hadn't figured it out (so much for LSI!) and the upshot is that a crumby site with higher link volume was outranking it. As a digital marketer it is easy for me to say "just go and change the anchor text of those links and augment it with some directories etc" and that should sort it but for a regular business owner this approach just isn't apparent.

    This leads me in to the point that Google just like any other medium is fundamentally just another route to market manipulated by those with the know how and/or financial muscle to do so. A search engine that aspires to rank the most relevant sites first just cannot rely on citations and links...however I can't come up with anything better. If I could I'd be working for Google!

    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 25.

    Considering that Google relies on links to help to validate that the websites returned in the SERP are relevant to the search string, then why would they penalize webmasters for having outbound links? Would they not be more inclined to award them in some way for their contribution toward[Unsuitable/Broken URL removed by Moderator] helping improve the relevance their search results?

  • Comment number 26.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 27.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 28.

    I liked the content here and it was arranged as an easy read: you can click through the presentation at a good speed and gain insight in a short time.

    I've spent since 10:00am this morning building a site dedicated to search engine optimisation and it's now 01:26. This little nugget would have been well worth the read before I started this morning, just as a reminder.
    Thanks,
    Darren Boyle
    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 29.

    Would you mind listing the main criteria of usability since you are specialists in this field. I really need them for my project in college.

    Paul, from Max TD [Unsuitable/Broken URL removed by Moderator]


BBC © 2014 The BBC is not responsible for the content of external sites.