Testing BBC iPlayer Release For Android Downloads
Hello, I'm Paul Rutter, Test Manager for POD Test in Mobile Platforms, Programmes and On-Demand, BBC Future Media.
My team and I are based in the great new buildings in MediaCity here in Salford.
In my team we look after testing for the following products:
- BBC iPlayer native applications on iOS and Android
- BBC iPlayer Radio native applications on iOS and Android
- BBC Media Player native application on Android
We also worked on the Antiques Roadshow playalong app and provide support to other component-level products built within the team.
I'd like to describe how we went about testing the latest release of BBC iPlayer on Android, version 2.0, which introduced support for downloads.
[Image: Some of the Android devices used in testing]
In the Test team we have Test Engineers and Developers-in-Test who are embedded in the Agile development team, sitting alongside developers and the rest of the project team.
We also have the ability to add additional test capacity through an external supplier, which enables us to run iterative regression testing rounds and schedule larger regression testing rounds in a shorter time period.
For this release, the Test resource consisted of three embedded Test Engineers and one Developer-in-Test.
Our Android products are installable on over 3000 different devices from the Google Play store. We can't possibly test on all of these devices, otherwise we'd never get round to releasing anything, so we prioritise testing based on audience usage statistics, which tell us which devices are most popular.
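As a rough illustration of prioritising by audience use, the sketch below ranks device models by how often they appear in usage data. It is not the team's actual tooling: the function name `top_devices`, the device names and the sample data are all hypothetical.

```python
from collections import Counter


def top_devices(sessions, limit):
    """Return the `limit` most-used device models, most popular first."""
    counts = Counter(sessions)
    return [model for model, _ in counts.most_common(limit)]


# Hypothetical sample: one entry per recorded app session.
sessions = ["phone-a", "tablet-b", "phone-a", "phone-c", "tablet-b", "phone-a"]
priority_devices = top_devices(sessions, 2)  # ["phone-a", "tablet-b"]
```

The devices at the top of such a list would be the first candidates for test coverage.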
In addition, the Product team chose to limit support for the downloads feature to a list of 13 prioritised devices, which would be whitelisted at launch. This further narrowed our focus for device coverage, with the aim of delivering the release within an acceptable time period.
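A whitelist of this kind can be pictured as a simple membership check. The sketch below is an assumption about how such a check might look, not the app's real implementation. The model names are placeholders, since the 13 whitelisted devices are not listed here.

```python
# Hypothetical model names; the real whitelist is not given in this post.
DOWNLOADS_WHITELIST = frozenset({"example-phone-1", "example-tablet-1"})


def downloads_enabled(device_model, whitelist=DOWNLOADS_WHITELIST):
    """Return True only when the device model is on the downloads whitelist."""
    return device_model.strip().lower() in whitelist
```

A check like this gives two things to test: whitelisted devices must offer downloads, and every other device must not.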
All Mobile Platforms hardware is SIM free and without operator customisation.
BDD and automated testing
We followed a Behaviour Driven Development (BDD) methodology, in which the expected behaviour of each feature is written down as Scenarios in a Feature file.
These Scenarios can then be implemented as automated tests or left as manual tests. Deciding which Scenarios to automate was based on discussion of the Cost (complexity of implementation) against the Value (to the project or product).
Automated Tests were integrated into our Continuous Integration build process and run against the Android emulator.
In addition, the Feature files built up into a living requirements specification of the product.
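The cost-against-value decision about which scenarios to automate was made through discussion, not by formula. Purely as an illustration, it could be modelled like this. The 1 to 5 scoring scale, the rule "automate when value is at least the cost" and the names used are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    cost: int   # assumed scale: 1 (trivial to automate) to 5 (very complex)
    value: int  # assumed scale: 1 (low value) to 5 (high value)


def should_automate(scenario):
    """Assumed rule: automate when the value is at least the cost."""
    return scenario.value >= scenario.cost


def split_scenarios(scenarios):
    """Return (automated, manual) lists of scenario names."""
    automated = [s.name for s in scenarios if should_automate(s)]
    manual = [s.name for s in scenarios if not should_automate(s)]
    return automated, manual
```

In practice the scores would come out of the team conversation, and the rule would be whatever the team agrees.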
A user story describes a feature which the team want to implement. Once the feature has been through the "three amigos" process it is ready to pull into a Sprint. The feature is then implemented by a developer, along with any automated tests, and then tested by a tester against the Acceptance Criteria. As this happens, tickets move across our task board, getting closer to 'done'.
If the feature passes testing then the ticket is closed out and moved to 'done'. Any manual test cases for that feature are added to our Test Case Management System.
If Testing demonstrates the feature isn't implemented as expected the tester will either fail the ticket and move it back to in-progress, or may (after discussion with the Product Owner) move it to 'done' and raise a defect or new story as necessary.
For feature testing we tested on one Phone and one Tablet from our prioritised list of devices to support downloads.
Iterative regression testing
At the end of a development sprint, a sprint build was released to the external test team, together with details of the last sprint's stories and the devices used for testing. The external test resource performed a round of regression testing covering the functionality of the last sprint's stories and its integration with existing functionality, execution of P1 tests, and exploratory testing across different devices from the prioritised list. At the end of the next sprint the process was repeated. We took this approach for several reasons:
- Test early! One of the fundamental rules of testing. Discovering issues sooner rather than later is always a good thing.
- Faster feedback to the development team around problem areas.
- Achieve greater device coverage sooner in the process – testing across more devices sooner rather than waiting for a pre-release regression round.
- Increase product knowledge in the Test team – giving the off-shore team early insight into the new functionality that we were building.
- Gain confidence in the product – more testing, testing sooner, more runtime of the builds and more feedback.
- By testing more, and sooner, we expected to decrease the likelihood of finding unknown issues in Release Candidate regression rounds.
Following this approach, the test team identified and logged 206 defects in total before reaching our Release Candidate. 81 of these were Severity 1 defects, which were prioritised for fixing throughout the development period, prior to launch.
Release Candidate regression testing
Once the team had completed all in-scope functionality a Release Candidate (RC) build was created.
The team worked together to plan a full regression testing round across all 13 prioritised devices, plus regression testing on non-whitelisted devices (does the original functionality still work?) and negative testing on non-whitelisted devices (confirming that the downloads option does not appear where we expect it not to).
This initial round of testing was estimated at around 60 man days! We managed to complete it in 2 working weeks. This was achieved by prioritising which tests and devices to execute first, by using the external test resource and by having developers change their hats and become Testers for a few days!
This full round of regression testing resulted in 53 defects being logged, but only 4 were prioritised as showstoppers by the Product Owner, David Berlin.
Once the 4 issues were fixed and re-tested we had a new Release Candidate.
Working with the development team, we talked through each issue and fix, understanding the risk to the application and where we might want to focus our testing. Having this information was vital and allowed a risk-based approach to be taken for the second round of regression testing, which was much lighter touch and completed in just a couple of days.
Following this, the team had a Release Board meeting, where stakeholders met to review the output of regression testing, discuss confidence in the build and ultimately decide whether to release or not!
Internal Beta testing
In addition to testing within the Mobile BBC iPlayer team, we also ran an internal Beta trial. Once we had our first RC, we were able to offer it to our internal audience within Future Media. This is around 1400 people and, assuming they're Android users, it allows them to install the build, play with it and provide direct feedback.
This is beneficial in many ways:
- Provides increased runtime of the applications
- Greater and varied device coverage
- Direct feedback to the development team
- Greater 'in the wild' testing
Defect management
Defects are logged following standard good testing practice, including details of the build, device and environment, screenshots, crash logs and steps to reproduce. In addition, we set a Severity value (1 - 4). This then feeds into a defect triage meeting, where representatives from Test, Development and Product review any new defects. Based on the severity, the explanation from Test, and Dev's understanding of the difficulty of the fix and the risk to the application, the Product representative can set a priority for fixing the defect.
The Test team advise at the end of stand-up each day whether there are enough new or high-severity defects to warrant a defect triage meeting. These are booked in daily but happen on an as-needed basis.
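The triage flow described above can be sketched as a small data model. This is only an illustration and does not describe how Jira was configured. It assumes Severity 1 is the most severe, and the `min_new` threshold and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defect:
    key: str
    severity: int                   # 1 to 4, set by Test (1 assumed most severe)
    priority: Optional[int] = None  # set by Product at triage; None until then


def awaiting_triage(defects):
    """Defects with no priority yet, most severe first."""
    pending = [d for d in defects if d.priority is None]
    return sorted(pending, key=lambda d: d.severity)


def triage_needed(defects, min_new=5):
    """Hold a triage meeting if there are enough new defects or any Severity 1."""
    pending = awaiting_triage(defects)
    return len(pending) >= min_new or any(d.severity == 1 for d in pending)
```

A check like `triage_needed` mirrors the daily question asked at the end of stand-up: is there enough in the queue to justify the meeting?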
Tools and technologies
We use various tools and technologies in the team, including:
- Jira (project and defect management)
- Confluence (documentation and knowledge sharing)
- TestRail (test case management, and manual and automated test execution)
- Calabash, Ruby, Cucumber (Open source automated Test tools, frameworks and languages)
- Charles (for network monitoring and manipulation)
- The Android SDK
- Eclipse, Sublime Text 2 and IntelliJ (IDEs and text editors)
Last but by no means least, the most important part of all of this is the great bunch of highly skilled, enthusiastic, expert Test Engineers and Developers-in-Test who love their product and routinely go above and beyond to improve the experience for our audience.
What went well?
Many of the team had worked on the downloads release of iOS BBC iPlayer so came with lots of experience and lessons learnt. Testing features against a single phone and tablet rather than all supported devices enabled us to get tickets tested quicker, with faster feedback to the Developers and Product Owner and faster progress across the board.
The iterative regression testing rounds were very successful and enabled the team to get earlier visibility of defects and earlier prioritisation of these by the Product Owner for fixing, as well as greater device coverage earlier in the project.
Regular defect triage! Regular (often 3 or more times a week) review and discussion of new, yet-to-be-prioritised defects for the project was invaluable. Fixes got prioritised and went in sooner. Don't sit on defects and build up an unhealthy defect debt.
What didn't go so well?
Adopting BDD is a hurdle and there's certainly an initial uphill 'lump' to get over before the team start feeling the benefit of the process. However, having the three amigos review of a feature before it hits development was eventually seen as very worthwhile.
Most of our testing is performed on the latest builds following a fresh install. This means we don't get much time to 'soak test' a specific build over a number of days, which in the past has made it difficult to uncover memory handling issues. In addition, we rarely watch programmes for the full length of the show (like a user would). As a result, we were not fully aware of a memory leak issue affecting playback of downloaded content on certain lower-memory devices, where playback could crash after around 60 minutes. This was discovered by members of the team using the device in their own time outside of the office.
On the day we were hoping to release the application, David Berlin discovered a show-stopper issue (in a real-world use case on the London Underground) around locking and unlocking the device during playback. This was so significant that it ultimately resulted in two of our 13 prioritised whitelisted devices being removed for downloads at launch. This issue will be fixed in a subsequent patch release.
Needless to say the edge cases and real-world issues mentioned above are now covered by formal Tests within our Test suite.
I hope you enjoyed this insight into the Test effort that goes into supporting one of our major releases. If you have any questions, don't hesitate to ask in the comments section.
Paul Rutter is Test Manager for POD Test in Mobile Platforms, Programmes and On-Demand, BBC Future Media.