v0.5. Interactive Drama & Entertainment, BBC. March 2006 (last revised February 2007). Matt Chadburn.
This document provides an overview of the data available under the movies/syndication project.
The feed describes the cinema reviews and associated release dates of any new film reviewed by the BBC Movies production team in the seven days up to the feed's published date. Typically you might fetch this feed weekly.
You can use this feed to navigate to the associated ratings & comments data.
http://www.bbc.co.uk/movies/syndication/1/reviews/cinema/rss2.xml
This file validates to RSS 2.0; here are the conventions.
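For anyone consuming the feed programmatically, the following is a minimal sketch of reading the items out of an RSS 2.0 document with the Python standard library. The sample XML and the function name parse_rss_items are made up for illustration; the real feed's item contents are not reproduced in this document, so treat the field choice (title, link, description) as an assumption based on plain RSS 2.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, NOT copied from the BBC feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample channel</title>
    <link>http://www.example.com/</link>
    <description>Made-up data for illustration</description>
    <item>
      <title>Example Film</title>
      <link>http://www.example.com/example_film.xml</link>
      <description>Example description</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item>, holding its title, link and description."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["title"], entry["link"])
```

In practice you would pass in the text fetched from the feed URL above rather than the sample string.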
This feed describes the films that are opening in the next few weeks in UK cinemas. Typically you might fetch this feed weekly.
http://www.bbc.co.uk/movies/syndication/1/comingsoon/cinema/rss2.xml
This feed contains an approximate UK cinema film release schedule for the next 12 months. Typically you might fetch this feed weekly.
http://www.bbc.co.uk/movies/syndication/1/furtherahead/cinema/rss2.xml
Each film review, written by the BBC Movies production team, is held at the following location.
http://www.bbc.co.uk/movies/syndication/1/reviews/cinema/
Before October 2006 there was no requirement for the review 'XML' to be well-formed or validated, so you should only find articles that have been published after this date in this location. A DTD and associated entity document are available.
A complete archive of all film reviews is available internally at ...
http://nm-films.national.core.bbc.co.uk/films/syndication/1/reviews/... where you are likely to find the data is XML in spirit rather than practice. Expect malformations and validation errors.
User ratings allow users to impart an opinion about a film they have presumably watched. Their opinion is limited to giving a rating on a scale of 1 to 5, using stars as the currency. A rating of 5 is a positive recommendation and 1 a damnation! Ratings have a simple cookie-based mechanism to prevent multiple votes per user. The web interface to the rating system can be seen on a BBC Movies film review, V for Vendetta for example (top right of the page, '4 from 2612 votes' ...).
Typically you might fetch each vote feed daily, the data being updated as many times per day as there are votes.
http://www.bbc.co.uk/movies/syndication/1/user-rating/174521/rating.xml
Users of the /movies service can leave a short review about a film they have seen. This submission is moderated and then published (or rejected) by the production team. The comments appear in a chronological list, most recent first. Comments can be flagged as 'star comments' by the production team, ie. worthy of note.
The absence of a file (ie. 404) typically means no reviews have been published for the given film.
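A client can treat that 404 as "nothing published yet" rather than as a failure. The sketch below shows one way to do that with the Python standard library. The function name fetch_or_none and its opener parameter are hypothetical, introduced only so the behaviour can be exercised without a network connection; the URL you pass in is whichever feed location you are polling.

```python
import urllib.error
import urllib.request


def fetch_or_none(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the server answers 404.

    Any other HTTP error is re-raised, since only a 404 is described
    as meaning "no reviews published for this film".
    """
    try:
        with opener(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # assumption: treat as "no data yet", not as an error
        raise
```

Returning None keeps the "no data" case distinct from an empty but valid file, which would come back as an empty bytes object.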
This feed describes any upcoming films on BBC TV channels. The films are sorted chronologically. The feed should validate to RSS 2.0. Typically you might fetch this feed daily.
http://www.bbc.co.uk/movies/syndication/1/whatson/rss2.xml
Although we are using validating RSS 2.0, it might help to point out some conventions we have used.
> the film details [Films Review RSS]
> don't have a link to the BBC Page, so there is no way of linking back to
> you easily.
The copies you see are the inputs to the Movies production system from which the URLs are later derived, so as source documents they don't know the eventual URL and hence can't reference it. There are two options I can think of when deciding what to link to.
Firstly, linking to the dated index pages. The production cycle dictates the reviews will appear in a dated archive the next Friday after the date in the <datereviewed> node, eg.
if the review date is 1st Jan 2007, the review will be added to ... -> http://www.bbc.co.uk/films/gateways/release/review/cinema/20070105.shtml
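The "next Friday" rule can be sketched in a few lines of Python. The function name dated_index_url is hypothetical, and one detail is an assumption: the text does not say what happens when the review date is itself a Friday, so this sketch moves it on to the following Friday.

```python
from datetime import date, timedelta

INDEX_URL = "http://www.bbc.co.uk/films/gateways/release/review/cinema/{:%Y%m%d}.shtml"


def dated_index_url(reviewed):
    """Return the dated archive URL for the first Friday after `reviewed`."""
    # date.weekday() counts Monday as 0, so Friday is 4.
    days_ahead = (4 - reviewed.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7  # assumption: a Friday review appears a week later
    return INDEX_URL.format(reviewed + timedelta(days=days_ahead))


if __name__ == "__main__":
    print(dated_index_url(date(2007, 1, 1)))
```

For the 1st Jan 2007 example above (a Monday) this gives the 20070105.shtml page.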
Or secondly, linking directly to the review. The date in the <datereviewed> node + the XML file name (with .shtml in place of .xml) will also give you the URL of the HTML review.
if the review date is 1st Jan 2007, the XML document miss_potter_2006_review.xml -> http://www.bbc.co.uk/films/2007/01/01/miss_potter_2006_review.shtml
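As a rough illustration of the second option, the mapping from review date and XML file name to the HTML review URL might look like this. The function name review_html_url is hypothetical, and the sketch assumes the file name always ends in .xml as in the example.

```python
from datetime import date


def review_html_url(reviewed, xml_filename):
    """Build the HTML review URL from the review date and the XML file name."""
    # Swap the .xml extension for .shtml; leave other names untouched.
    if xml_filename.endswith(".xml"):
        stem = xml_filename[:-len(".xml")]
    else:
        stem = xml_filename
    return "http://www.bbc.co.uk/films/{:%Y/%m/%d}/{}.shtml".format(reviewed, stem)


if __name__ == "__main__":
    print(review_html_url(date(2007, 1, 1), "miss_potter_2006_review.xml"))
```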
> Image and document links in the XML reviews use relative URLs which don't
> appear to be valid e.g. the review for Blood Diamond has the links:
> images/blood_diamond_2007_review_middle.jpg
> films/2003/12/04/the_last_samurai_review.shtml
Images are stored in the directory named after reviewed date, so you can derive them easily enough.
Eg. Blood Diamond ...
http://www.bbc.co.uk/movies/syndication/1/reviews/cinema/blood_diamond_2007_review.xml
... contains ...
<datereviewed> <year>2007</year> <month>January</month> <day>22</day> </datereviewed>
... from which you can derive ...
http://www.bbc.co.uk/films/2007/01/22/images/blood_diamond_2007_review_middle.jpg
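The derivation above can be sketched as follows. The function names are hypothetical, and the code assumes the <month> node always holds a full English month name, as it does in the Blood Diamond example; the DTD is the place to confirm that.

```python
import xml.etree.ElementTree as ET

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def reviewed_date_path(datereviewed_xml):
    """Turn a <datereviewed> fragment into a 'YYYY/MM/DD' path segment."""
    node = ET.fromstring(datereviewed_xml)
    year = int(node.findtext("year"))
    # Assumption: months are spelt out in full, e.g. "January".
    month = MONTHS.index(node.findtext("month").strip()) + 1
    day = int(node.findtext("day"))
    return "{:04d}/{:02d}/{:02d}".format(year, month, day)


def image_url(datereviewed_xml, relative_image):
    """Resolve a relative image path against the dated review directory."""
    return "http://www.bbc.co.uk/films/{}/{}".format(
        reviewed_date_path(datereviewed_xml), relative_image)


if __name__ == "__main__":
    fragment = ("<datereviewed> <year>2007</year> <month>January</month> "
                "<day>22</day> </datereviewed>")
    print(image_url(fragment, "images/blood_diamond_2007_review_middle.jpg"))
```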
Each image is named as a ‘top’, ‘middle’ or ‘bottom’ image, as in blood_diamond_2007_review_middle.jpg.
Likewise, any relative document URL in the review copy is relative to the root of bbc.co.uk, so just expand it.
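Expanding a relative document link against the site root is a one-liner with the standard library. This is only a sketch; the name expand_relative is hypothetical, and it is meant for document links such as the Last Samurai one quoted above, not for the images/ paths, which live in the dated review directory.

```python
from urllib.parse import urljoin

BBC_ROOT = "http://www.bbc.co.uk/"


def expand_relative(href):
    """Resolve a document link from the review copy against the bbc.co.uk root."""
    # urljoin leaves links that are already absolute unchanged.
    return urljoin(BBC_ROOT, href)


if __name__ == "__main__":
    print(expand_relative("films/2003/12/04/the_last_samurai_review.shtml"))
```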
> The capitalisation of the genre field can be inconsistent.
I guess we can’t control the vocabulary of text nodes just using DTD validation, so this will have to be added as a caveat: assume case insensitivity.
It’s a guess that this was originally left free so that the production team could just build up a vocabulary of film genres they needed, and over time they have built their own controlled vocab.
> It would be useful to have some information about the genre taxonomy and
> the range of the rating field (0 to 5?)
Essentially they map to the genres on the archive page. From the DTD I gather each review must have 1 or more genres. To all intents and purposes it's a fixed list, though not validated as such.
Action, Adventure, Animation, Bollywood, Classic, Comedy, Crime, Documentary, Drama, Family, Fantasy, Horror, Musical, Romance, Science Fiction, Thriller, War, Western, World Cinema
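Given the capitalisation caveat, a consumer may want to map whatever the genre node contains onto one canonical spelling. The sketch below does a case-insensitive lookup against the list above. The function name canonical_genre is hypothetical, and the split of the list into individual genres (in particular treating "Science Fiction" and "World Cinema" as single entries, and "Action" and "Adventure" as separate ones) is an assumption based on the alphabetical ordering.

```python
# Assumption: this is how the flat genre list divides into entries.
GENRES = [
    "Action", "Adventure", "Animation", "Bollywood", "Classic", "Comedy",
    "Crime", "Documentary", "Drama", "Family", "Fantasy", "Horror",
    "Musical", "Romance", "Science Fiction", "Thriller", "War", "Western",
    "World Cinema",
]

_BY_LOWER = {genre.lower(): genre for genre in GENRES}


def canonical_genre(raw):
    """Return the canonical spelling of a genre, or None if it is not listed."""
    # Collapse runs of whitespace and ignore case before looking up.
    key = " ".join(raw.split()).lower()
    return _BY_LOWER.get(key)


if __name__ == "__main__":
    print(canonical_genre("science  fiction"))
```

Returning None for unknown values, rather than raising, reflects the fact that the list is not enforced by validation.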
Rating is 1 to 5, and the currency is ‘stars’. It implies 1 star = raspberry, 2 = bad, ... 5 stars = excellent.
> The "What's On" films have time & date information embedded in the
> description - it would really useful to have this in separate tags
We would have liked to move the [R], [S] etc. codes to RSS categories too, but we are serving this file directly from the Whats On CGI application (rather than fetching then processing it ourselves), so we are very limited by the format of the data that application throws out.
In order for the RSS to validate we chose to bundle the date inside the <description>, as the given date format from whatson (eg. ‘Tue 30 Jan, 23:35’) threw up validation errors when placed inside the <pubDate>. I can’t think of anything more sensible, apart from us processing it to get what we want ...
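Until the date is available in its own tag, a consumer can pull it out of the description text. The following is a sketch only: the function name extract_broadcast_time is hypothetical, the exact layout of the rest of the description is not documented here, so the pattern simply searches for the ‘Tue 30 Jan, 23:35’ shape anywhere in the string, and because that format carries no year the caller has to supply one.

```python
import re
from datetime import datetime

_MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
           "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# Matches e.g. "Tue 30 Jan, 23:35"; the weekday name is matched but not checked.
_PATTERN = re.compile(
    r"\b(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (\d{1,2}) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec), "
    r"(\d{1,2}):(\d{2})\b"
)


def extract_broadcast_time(description, year):
    """Return the first embedded date and time as a datetime, or None."""
    match = _PATTERN.search(description)
    if match is None:
        return None
    day, month, hour, minute = match.groups()
    # Assumption: the year is not in the text, so the caller provides it.
    return datetime(year, _MONTHS[month], int(day), int(hour), int(minute))


if __name__ == "__main__":
    print(extract_broadcast_time("Tue 30 Jan, 23:35 Example Film", 2007))
```

A feed fetched around the turn of the year would need some extra care in choosing the year, which this sketch leaves to the caller.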
> it would be really useful to have genre signalled in
> the RSS feed as well as the review.
Whats On and the Movies system are separate systems that don’t converse with one another. In the future I would hope a replacement system would provide this info, it should do shouldn’t it? But one for our wishlist too.