Posted by Libby Miller on
Welcome to weeknotes from the IRFS team in BBC R&D, where this week we are starting to talk with machines, and understand humans.
The Experiences team are working on three main projects. For Atomised Media, Tim, Lara, Chris N, Ant, Chrissy and Tom H have been putting the final polish on the Newsbeat Explains site, getting it up to a public-facing standard. Barbara has been negotiating with the stakeholders on when and how the site will go live, and liaising with our journalist Anna who will create the articles. Ant has been our sprint manager, and wrote up the run book (a document that tells the system administrators what to do if a problem with our site occurs). He also load tested the service using Gatling.
Meanwhile, Chris and Tim have been working with Tristan on improving the analytics. Lara and Chris N investigated ImageChef as a service to reduce page load times by providing responsive (i.e. appropriately sized) images for each device size. Tom H, Chrissy and Lara have been working with Tom M to improve the front end design and user experience. Alan has been improving the web interface to his script parser. Andrew worked with him to produce some wireframes for the new interface.
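To illustrate the idea behind responsive images, here's a minimal sketch of picking the smallest available rendition that covers a device's width. The rendition widths and function name below are invented for illustration; ImageChef's actual behaviour and breakpoints may differ.

```python
# Hypothetical set of pre-generated image widths (pixels).
RENDITIONS = [320, 480, 640, 976, 1280]

def pick_rendition(device_width, renditions=RENDITIONS):
    """Serve the smallest rendition at least as wide as the device,
    so we never upscale and never send more pixels than needed."""
    for w in sorted(renditions):
        if w >= device_width:
            return w
    return max(renditions)  # very wide screens get the largest we have

print(pick_rendition(375))   # a typical phone width
print(pick_rendition(2048))  # wider than anything we have
```

The saving comes from not shipping a 1280px image to a 375px-wide phone.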
For Tellybox, Henry, Joanne, Libby and Calliope have been transcribing the eight interviews we've run to help understand people's TV watching habits. Joanne designed and ran a workshop to help us group our detailed findings from the interviews. Libby managed to get some BARB data into SQL and discovered that about a third of TV watching happens with other people, with a higher proportion for on-demand viewing. She's also been planning the next steps for the project.
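To give a flavour of the kind of query involved, here's a toy sketch using SQLite with an invented, much-simplified viewing-events table; the real BARB data and schema are far richer than this.

```python
import sqlite3

# Hypothetical, heavily simplified table of viewing events.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE viewing_events (
        id INTEGER PRIMARY KEY,
        mode TEXT,          -- 'live' or 'on_demand'
        viewers INTEGER     -- number of people watching together
    )
""")
conn.executemany(
    "INSERT INTO viewing_events (mode, viewers) VALUES (?, ?)",
    [("live", 1), ("live", 2), ("live", 1), ("on_demand", 2),
     ("on_demand", 3), ("live", 1), ("on_demand", 1), ("live", 2)],
)

# Share of viewing events with more than one person present,
# overall and broken down by mode. (viewers > 1) is 0 or 1 in SQLite,
# so AVG of it gives the proportion directly.
overall = conn.execute(
    "SELECT AVG(viewers > 1) FROM viewing_events"
).fetchone()[0]
by_mode = conn.execute(
    "SELECT mode, AVG(viewers > 1) FROM viewing_events GROUP BY mode"
).fetchall()
print(overall, by_mode)
```

With this toy data the overall shared-viewing proportion is 0.5, and it comes out higher for on-demand than for live, mirroring the direction of the real finding.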
Henry pitched the Talking to Machines project to the team and it's going ahead, so he's now planning out the project in more detail. We got an Amazon Echo! Henry and Ant have been setting up and testing it. Henry's been mapping out behaviour and sketching VUI (voice user interface) flows for a possible radio application for Alexa.
We are holding a team day next Monday to look at our ongoing projects and discuss the future direction of the Experiences team. Joanne, Libby, Tristan and Ant have been working hard to plan it so we make the best use of our time, using Catwigs (pictured above), among other tools, to help us understand and evaluate our projects.
The Discovery team have been focused on developing and deploying new features, and understanding our short to medium term priorities.
Thanks to Manish, our Content Analysis Pipeline is more stable, and further work continuing into the next sprint will help with scalability. This stability has allowed us to roll out Frankie's feeds management work, which both lets us manage all the content feeds we receive (RSS, Twitter and YouTube) and suggests new sources based on links we find while analysing content.
We have been working with myBBC to understand how we might contribute to recommendation systems (Chris, Thomas and Manish) and with News and NewsLabs (Olivier) to understand and scope out their future requirements.
Katie has been leading the collaboration with The Arts and Events team in Scotland. Using the Hay Literary Festival, Chris explored how we could identify significant people, locations and creative works associated with events such as arts and music festivals. He developed a script which uses festival websites or downloadable programmes to suggest candidates which could then be vetted by the user to produce a formatted search query for our editorial algorithms. Using this technique we provided a constantly updating stream of content to the team covering Hay. The learning from this is feeding into our work with them on the upcoming Edinburgh Festivals (International, Fringe, Book and Film).
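Chris's actual script isn't shown here, but the general shape of the technique can be sketched like this, with a deliberately naive capitalised-phrase heuristic standing in for proper named-entity extraction, and an invented query format; every name below is illustrative.

```python
import re

def suggest_candidates(text):
    """Very naive candidate extraction: runs of two or more
    capitalised words (a stand-in for real entity recognition)."""
    pattern = r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+"
    return sorted(set(re.findall(pattern, text)))

def build_query(approved):
    """Format the vetted candidates as a simple OR query."""
    return " OR ".join(f'"{name}"' for name in approved)

programme = ("Hay Festival welcomes Margaret Atwood and Stephen Fry, "
             "with a panel on the future of the novel.")
candidates = suggest_candidates(programme)
# In the real workflow a person vets the suggestions before the
# query is built; here we simply keep everything.
print(build_query(candidates))
```

The vetting step matters: a crude extractor will surface junk alongside genuine people and works, and a human reviewer is cheap insurance before a query drives an editorial algorithm.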
David has been exploring different ways of presenting the content we find, trying to answer some of the requests from our users for time- and/or topic-based searching.
Michel has continued his work comparing tone analysis of the audio and video of a programme with sentiment analysis of the same content's subtitles. He's also broadening his knowledge with a course on neural networks.
Our third blog post, "Using Algorithms to Understand Content", in which we look at our attempts to teach algorithms to understand what content is about, was published by Georgios and Olivier, complete with Dalek graphic.
Finally, we started an exercise in Story Mapping to help us all understand and focus on our priorities. It's a technique that should help us see the details of the work without losing sight of the bigger picture - an oft-stated problem with agile teams and standard backlogs. We'll see how it works out for us.
Qiong Hu has joined the Data team for three months. She's come from the Centre for Speech Technology Research at Edinburgh University. She is looking at writing some text-to-speech software so we can trial synthesising the voices of 'BBC Talent'.
Qiong has prepared a test database of BBC Weather forecasts, chosen because they are recorded in consistently good conditions. She aligned the audio to the subtitles using Kaldi, the speech recognition toolkit we already use, and is now using that alignment to train a model to synthesise the presenter's voice.
Ben and Matt have been working with Thomas and Manish to set up Artifactory on the public cloud, which will let us host package repositories for almost any type of software package, for example Docker images, Debian packages or Rubygems. This sprint we got test uploads and installs of those three package types working. We have a few more issues to iron out, but it's looking good so far and we hope it will be ready for use soon.
Denise presented her work on Pace and Tone Music Playlist Classification to the team last Thursday. She is writing up her findings in a Technote and Executive Summary for R&M stakeholders. Denise has also delivered Pace and Tone data to MyBBC for consideration in their applications.
Ben has provided a CODAM 'micro service' to BBC News that will allow them to fingerprint videos from Jupiter. Over the last few weeks they have begun integration and are almost ready to begin testing.
For Speech to Text, we have finished integrating an improved voice activity detection algorithm into the software. It works alongside LIUM, the original tool we used for speaker diarization. The improved algorithm fixes some of LIUM's mistakes and yields just over a 2% improvement in error rate.
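Our improved algorithm is more sophisticated than we can show here, but a minimal energy-threshold detector illustrates the basic frame-by-frame idea behind voice activity detection; the frame length, threshold and test signal below are all illustrative.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared energy of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def detect_speech(samples, frame_len=160, threshold=0.01):
    """Mark a frame as 'speech' when its energy exceeds the threshold.
    Real VADs use spectral features and smoothing, not raw energy."""
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Synthetic signal: one frame of silence, then one frame of a loud
# 440 Hz tone at a 16 kHz sample rate.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(160)]
print(detect_speech(silence + tone))  # → [False, True]
```

A good VAD feeds diarization cleaner input: segments that are actually speech, with silences and noise already trimmed away.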
Some of our smaller projects:
Tim attended meetings to wrap up the World Service Archive project, with the aim of getting the outputs of our archive prototype into the BBC's systems so they appear on the main BBC /programmes site.
Chris N chairs the W3C TV Control Working Group, which discussed radio use cases and what's needed to publish a First Public Working Draft of the specification.
Henry helped Connected Studio run their VR demos.
Chris Newell and Libby are preparing to make a T in the Park "Shuffle" prototype, and made a quick version for Big Weekend to test the infrastructure.
We say goodbye to our UX trainee Calliope this week - all the best, Calliope!