Posted by Alex Robertson on
This week we're experimenting with our first fully remote hack, teaming up across BBC R&D to explore ideas around wellbeing in isolation - a topic on many of our minds.
Meanwhile, work across our core projects continues.
Voice and motion
Can we accurately detect body movements and gestures via webcam for multiplayer game interactions? That’s what we’re investigating as we adapt one of our prototype audio AR experiences for remote play, currently testing TensorFlow’s PoseNet for real-time pose estimation.
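To give a flavour of how pose estimation feeds into game interactions: PoseNet-style models return a set of named body keypoints, each with an image position and a confidence score, which game logic can then turn into gestures. The sketch below shows one such check in Python; the keypoint names follow PoseNet’s convention, but the data, thresholds and function names are our own illustrative assumptions, not the prototype’s actual code.

```python
MIN_SCORE = 0.5  # ignore keypoints the model is unsure about

def is_hand_raised(keypoints):
    """Return True if either wrist is confidently detected above its shoulder.

    `keypoints` maps a name (e.g. 'leftWrist') to (x, y, score),
    with y increasing downwards as in image coordinates.
    """
    for side in ("left", "right"):
        wrist = keypoints.get(f"{side}Wrist")
        shoulder = keypoints.get(f"{side}Shoulder")
        if not wrist or not shoulder:
            continue
        if wrist[2] < MIN_SCORE or shoulder[2] < MIN_SCORE:
            continue
        if wrist[1] < shoulder[1]:  # smaller y means higher in the frame
            return True
    return False

# One frame's worth of (hypothetical) PoseNet output:
pose = {
    "leftWrist": (120, 80, 0.9),
    "leftShoulder": (110, 200, 0.95),
    "rightWrist": (300, 260, 0.9),
    "rightShoulder": (290, 210, 0.95),
}
print(is_hand_raised(pose))  # left wrist is above the left shoulder → True
```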
Many of you have been taking part in our Synthetic Voice & Personality Study. As it draws to a close, we’re preparing to analyse the large amount of quantitative and qualitative data that’s been gathered, and hope to share the insights soon. In the meantime, our own Barbara Zambrini was interviewed about the project in TVBEurope’s May issue (page 42).
The recommendations challenge
We want to help join up recommendations for our audiences, between different content products (“cross-media recommendations”; iPlayer to News, for instance) and different content forms (“multimodal content similarity”; audio, video, text). To begin tackling these challenges, we’ve reviewed the BBC’s various metadata formats (discovering significant incompatibilities to overcome) and the degree of user overlap between each product, and we’re attempting to assemble a full month of multimodal content for analysis (itself a real challenge!). Meanwhile, we’ve helped to build an offline evaluation system to assess new recommendation algorithms before their deployment, based on our recent work using Apache Spark.
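As an illustration of what offline evaluation involves: a common approach is to replay held-out user interactions and score each algorithm’s ranked output against them, for example with precision@k. The plain-Python sketch below stands in for the Apache Spark pipeline mentioned above; the data and names are invented for illustration.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations the user actually interacted with."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Held-out test data: items each user went on to consume.
relevant = {"u1": {"a", "c", "d"}, "u2": {"x"}}

# Ranked output from the algorithm under evaluation.
recs = {"u1": ["a", "b", "c", "e"], "u2": ["y", "z", "x"]}

scores = {user: precision_at_k(recs[user], relevant[user], k=3) for user in recs}
print(scores)  # u1 hits 2 of its top 3; u2 hits 1 of 3
```

Averaging such per-user scores across a test period gives a single number to compare candidate algorithms before any of them reach production.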
What about how people feel when watching or listening to BBC content - could this help inform recommendations? That’s what our Sentiment User Study is seeking to explore, using a mixture of biometric analysis (via smartwatches for the participants) and self-reporting, the results of which we'll later compare with our automated analysis of programmes. The questions this study raises - data protection, research ethics, medical accuracy and interpretation amongst them - are all being carefully assessed by the team.
Improvements and solutions
The team working on our speech-to-text system are always striving for incremental improvements, approaching the challenge from multiple angles. Most recently these include: successfully separating and removing any non-speech audio in order to reduce the word error rate; increasing performance by merging small custom language models into the larger model; analysing approximately 800,000 BBC News articles to detect neologisms that need adding to the STT lexicon (along with their accurate pronunciations); evaluating the potential of OpenSeq2Seq.
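For readers unfamiliar with the metric: word error rate is the word-level edit distance (substitutions, insertions, deletions) between the recognised transcript and a reference transcript, divided by the reference length. A minimal sketch, unrelated to the team’s actual tooling:

```python
def word_error_rate(reference, hypothesis):
    """Edit distance between transcripts, in words, over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference words into the first j
    # hypothesis words (classic Levenshtein dynamic programming)
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("on" → "in") across six reference words:
print(word_error_rate("the cat sat on the mat", "the cat sat in the mat"))
```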
In our collaboration with News Labs - to design a discovery tool for BBC journalists searching news clips - we’re now focused on distilling the insights from our many conversations with prospective users across the organisation, whilst continuing to iterate.
On our Web Standards strand of work, we’ve been looking at web support on TV devices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video. The problem: is the user’s web browser able to decode HDR and WCG media, and if so is it connected to a compatible display? Our proposed solution: specify new CSS media queries to assess video display capability. We’re working through the details now - more to come!
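The decision such media queries would enable can be summarised as a simple conjunction: serve HDR/WCG media only when both the browser’s decoder and the connected display support it. The sketch below expresses that logic in Python purely for illustration; the function and capability names are our own, not the proposed CSS syntax.

```python
def pick_video_variant(can_decode_hdr, display_supports_hdr, display_gamut):
    """Choose a stream given decoder and display capabilities.

    `display_gamut` is an illustrative label such as "srgb", "p3" or "rec2020".
    """
    if can_decode_hdr and display_supports_hdr and display_gamut != "srgb":
        return "hdr-wcg"
    return "sdr"  # safe fallback when either half of the chain falls short

print(pick_video_variant(True, True, "rec2020"))  # full chain supported
print(pick_video_variant(True, False, "srgb"))    # capable decoder, SDR display
```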
To end, three bits of exciting news. You can hear our team’s Alicia Grandjean discuss her News Mood Filter prototype in the latest Freakonomics podcast (“Reasons to be Cheerful”). We’re into final preparations (design updates, load testing, release plan) for our synchronised viewing and listening pilot. And a warm welcome to our new Industrial Trainee Ben, who’s starting work on our ambition to support the public's understanding of machine learning, beginning with some bird identification using My Naturewatch!
This post is part of the Internet Research and Future Services section