Google Summer Of Code
For the third year in a row, I'm very pleased to announce that BBC Research is participating in Google's Summer Of Code [GSOC] open source mentoring programme.
For those who don't know what that is, here's a short summary:
Google encourages students to spend the summer working on projects that they propose, on the condition that the resulting work is made available as open source and is useful. Google pays each student $4500.
Google also invites organisations and groups that run open source projects to apply to be mentor organisations. In practice, this means - among other things - that they agree to post project ideas, to accept and rank student proposals, and ultimately to mentor the students, acting as the check and balance that ensures the work the student produces really is useful.
Students from around the world then apply, talk to the mentor organisations, write up proposals, and put them in, and hope that they get a chance to work on their project. This year over 7,000 students applied, over 1,100 students have been accepted and BBC Research is just one of 175 mentor organisations.
What do the students get out of it?
Well, as well as not having to look for a summer job, they get to work on the project idea they proposed. This could be anything from something as mundane (but vital) as building a test framework through to working on a physics engine - or, in some cases, on games.
Not only that, it is designed to allow students who are interested in getting involved in open source and free software to do just that. It's designed to help show them the ropes: how projects work; the best way to ask questions; how to use version control; and a number of the basic software engineering skills that revolve around communication, collaboration and code review.
What do the mentor organisations get?
Well, clearly they get some work done on their project. It may be absolutely critical and core to them, or it may be some new area of speculative functionality, and often something they'd not even considered working on. They also get new people joining their project, either temporarily or in some cases permanently. They get people who can themselves move on to become leaders in their own right. The mentors themselves also tend to learn what it means to be a mentor, which is an incredibly useful skill to learn.
And what does Google get?
Well, Google relies on open source for a significant amount of its infrastructure, and this grows the number of people who work in open source. It's a very concrete way for Google to give something back to the open source and free software communities, which is something their open source liaison team will very passionately talk about, given the chance.
So why does BBC Research get involved?
Well, fundamentally - it fits with our goals. We produce open source software of various kinds, take on a handful of industrial trainees each year, participate in work experience, and - on a wider-BBC scale - the BBC both participates in and runs mentoring projects.
It's a way to give something back, and to see some fresh work on speculative ideas and projects.
So we spoke about projects, and while this is our third year, I'd like to acknowledge some of the work done in previous years by students. In all three years, the Kamaelia & Dirac teams drove our participation in Summer Of Code, and as a result the student proposals reflect those projects.
So, very briefly, what are they?
Dirac is a video codec based on wavelets, which has recently been standardised by SMPTE as VC-2, and is taking significant steps forward in the video codec world. There is a Dirac Pro variant for high-end usage, with hardware encoders, and a new high-performance version called Schroedinger is being developed with community support.
Kamaelia is a component toolkit based around principles of sociable software to make software both easier to build and maintain, while also making concurrency (including multicore) easy and natural to work with. It includes tools as diverse as timeshifting, P2P whiteboarding, video annotations, database modelling and shot change detection, through to email greylisting.
So, what happened in our first year?
On both the Kamaelia and Dirac sides, we listed ideas that we speculatively thought would be useful and interesting, and student proposals generally matched that. On the Kamaelia side, we had students looking at ideas as diverse as BitTorrent integration, OpenGL support, and trusted communications (think secure telephone, rather than DRM). Beyond that, they also produced an assortment of tools for web integration, and basic tools for IRC (Internet Relay Chat). On the Dirac side, the Dirac team experimented with two different approaches to reimplementing Dirac in Java for potential use within a browser.
Last year, the Dirac team took a different approach. They had some ideas of their own, but mainly looked for students who brought ideas that they were keen to push themselves. They took on a student looking at core fundamental algorithm options within the codec - essentially a research project.
On the Kamaelia side, we decided to look at project ideas that would grow Kamaelia in terms of usability in other areas.
One looked at how to make Kamaelia functionality more usable in non-Kamaelia systems. This has also helped to make it easier to build certain kinds of Kamaelia systems such as simple swarming P2P systems, but also enabled core work on multicore support to move forward sooner than expected. Another student worked on functionality that enables Kamaelia systems to be remote controlled by IRC and AIM, so - for example - you could remote-control your video recorder from work using an instant messenger client.
Finally, another student worked on a piece of work that was highly speculative. Kamaelia components default to running concurrently with all other components and, using a composition tool, can be bolted together in a manner much like the toy K'Nex. The question was really whether we could extend this visual approach so that we can dive inside and create completely new components visually. Importantly, this project showed that it can be done, and while much work remains, it allowed us to consolidate in real terms what we mean by components, shards, and connectors.
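For readers who haven't met this style of system before, here is a rough sketch in plain Python of what "components bolted together" can look like. To be clear, this is not Kamaelia's actual API: the names Component, Producer, Upper, Collector and run_pipeline are hypothetical and invented purely for illustration, and the round-robin loop is a toy stand-in for real concurrency. The idea it tries to show is simply that each component only talks to its own inbox and outbox, and something outside the components decides how they are wired together.

```python
from collections import deque


class Component:
    """Hypothetical stand-in for a component: one inbox, one outbox."""

    def __init__(self):
        self.inbox = deque()
        self.outbox = deque()

    def main(self):
        # Subclasses supply a generator; each `yield` hands control back
        # to the scheduler so other components get a turn.
        raise NotImplementedError


class Producer(Component):
    """Emits a fixed list of items, one per turn."""

    def __init__(self, items):
        super().__init__()
        self.items = list(items)

    def main(self):
        for item in self.items:
            self.outbox.append(item)
            yield


class Upper(Component):
    """Upper-cases whatever arrives in its inbox."""

    def main(self):
        while True:
            while self.inbox:
                self.outbox.append(self.inbox.popleft().upper())
            yield


class Collector(Component):
    """Remembers everything it receives."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def main(self):
        while True:
            while self.inbox:
                self.seen.append(self.inbox.popleft())
            yield


def run_pipeline(components, steps=20):
    """Toy scheduler: give each component a turn, then deliver messages
    from each component's outbox to the next component's inbox."""
    running = [c.main() for c in components]
    for _ in range(steps):
        for gen in running:
            next(gen, None)  # a finished component is simply skipped
        for src, dst in zip(components, components[1:]):
            while src.outbox:
                dst.inbox.append(src.outbox.popleft())


sink = Collector()
run_pipeline([Producer(["hello", "world"]), Upper(), sink])
print(sink.seen)
```

Because none of the components knows which component it is connected to, the same pieces can be rearranged or swapped without changing their code, which is the property a visual composition tool relies on.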
It's worth noting that with all these project ideas, the overhead of support for projects we've mentored is incredibly low. Some of the ideas and tools they produce can be highly speculative, work we would never be able to justify spending time on, but are incredibly useful and grow our understanding of the projects. This grows the projects' usefulness to the BBC. Beyond that, it opens us up and enables others to see that we are actively interested in new contributors with new ideas.
We've worked (online) with students from around the globe (UK, USA, India, Austria), and we've all learned a lot from the students. I hope they learned from us too. I'm really excited by the ideas this year since they are all so very different from what's gone before.
So what stage are we at this year? Well, last week Google announced which projects had been allocated, and mentor organisations and students are starting to get to know each other better, and to plan how we're going to work together, share resources, and so on.
This year, the Dirac team has taken on two projects, and is working closely with the Schroedinger team, with mentors both from inside the BBC and outside.
Once again, a Java decoder is on the cards, but this time, rather than being based around the C++ implementation, it is based around taking the optimised Schroedinger codebase as a starting point. The other project they're mentoring revolves around using a Graphics Processing Unit (the core of your graphics card) to speed up Dirac primarily for decoding, but also for encoding.
In the Kamaelia project, we've taken a slightly different approach. Like Dirac, we're also lucky to have support from the community in mentoring students. One of our smaller goals in previous years of GSOC has been to test a core hypothesis of Kamaelia: that it should be possible to create tools that allow the average developer, or a competent novice, to build highly concurrent systems which can naturally take advantage of multicore hardware (when it becomes common), without having to worry about concurrency, and in a way that is fun.
Our perspective, 8 GSOC students and 2 industrial trainees later, is that Kamaelia does indeed enable this, so we thought "let's see what systems people can build and would want to build if we encourage unconstrained project ideas."
Now, since Kamaelia also encourages the creation of components which are naturally reusable even if designed with a specific system in mind, we primarily evaluated projects based on a few criteria: how ambitious they were, what spread of useful components would drop out, how interesting and useful an example they would make, and if they created any core facilities that we know would be useful, along with sheer enthusiasm for the idea from students and mentors.
We certainly got some surprises, and so we're mentoring five projects. Three are end-user application systems and two relate to core enhancements. The interesting thing about the end-user apps, however, is their randomness - and their scope. One of the core enhancements relates to 3D visualisation support, augmenting our existing 2D topology visualisation tools which have enabled novel interfaces. The other relates to a testing framework for Kamaelia components, drawing on inspiration from sources like unit test, Xtest and testing approaches for SOA-based systems.
Here's more detail about the three applications:
Kamaelia Jam: this is a project which revolves around creating tools for collaboratively creating music (both in person and over the network) in a highly visual and ultimately tactile way. Based on the aims of the project, this will give Kamaelia MIDI and OSC support, and may well provide a useful route for working with Arduino, webcams and potentially multitouch support.
Kamaelia Paint (tentative name): a tool to take the existing whiteboard code and extend it to become a full multiwindow paint tool. Now, due to the libraries in use (pygame), the only way this will work is if this is a naturally multicore ready system, which is part of the core aspect of this project. However, it also enables integration with the very basic video annotation tools.
Does the world need another paint program? Maybe. Will this project create a lot of interesting components? Almost certainly. The student involved has already created components for Lego Mindstorms integration, for example.
Kamaelia Publish (or Press): in the student's own words, the aim of this is to allow you to publish yourself on your own terms - essentially a peer-to-peer webserver and personal widgets. (A way of summarising this is "what would happen if you melded a webserver, a P2P client, an IM system and Facebook as a client side application?".)
Again, this is a highly speculative and interesting project that will produce a variety of interesting tools, such as potentially the capability to take Google App Engine applications and run them on people's desktops.
Then there's the other aspect to these three projects. Given that Kamaelia components can all be used with one another, what happens when you begin to mash up the functionality of these systems (new and old) in new and inventive ways? And these ideas were just the tip of the iceberg.
And on that note, I'd just like to thank our GSOC students of past years: Ryan, Thomas, Anagha, Devendra, Luis, Adam, Baishampayan, Patrick, Jinna, Tara and Andrew. And I'd like to welcome the students we're working with this year: David King, Joe Turner, Pablo Orduna, Chong Liu, Jason Baker, Matthias Bolte and Bart Weigmans.
I'd also like to thank David Schleef & Sylvain Hellegouarch for their enthusiasm in being community mentors this year :)
Google's Summer Of Code programme is innovative and surprising. It does actually manage to give back, year on year, in a fundamentally useful way to the communities that it works with. Our role at present is that of a mentor organisation: to provide guidance through our projects and to introduce students to how open source projects can and do work.
This is our third year, and it's proven to be useful, but it does leave a question.
How would you see the BBC's involvement with open source developing? Continuing as a net consumer, or growing into an even greater contributor (where it benefits the licence fee payer)? Closer interaction with programme areas that benefit directly from technology (sport, children's and sci-fi spring to mind), or with back-end systems?
If you had three suggestions as to how BBC Research could make a similar contribution to open source, beyond our existing involvement in mentoring schemes such as this, industrial placements and the release of some projects as open source, what would they be?
You can't assume infinite resources, and every penny of cost for such suggestions takes away from the BBC's core business of giving you, the audience, creative, innovative, trustworthy content. But given those restrictions, what would your three suggestions be?
I'd love to hear your answers, and I'm pretty sure my colleagues would too.
Have a great summer, whatever you do.
Michael Sparks is Senior Research Engineer, Research & Innovation, BBC Future Media & Technology.