Posted by Bruce Weir
Following on from our first public trial of our 5G Smart Tourism application in December, we have now completed our second public trial in which we tested an alternative method of presenting low-latency, high-bandwidth video to a smartphone.
The smartphone application in the second trial used the same ‘window in time’ concept as the first, in which a historical reconstruction of a real location was aligned with, and superimposed upon, the same location in the real world. This time, as an alternative to streaming pre-rendered 360 videos to a user’s phone, we also tested a “remote rendering” VR version. In this new version, the physical position of the smartphone was sent to a remote computer, which used that information to draw a live ‘VR’ view of the historical scene before streaming it back to the smartphone for display.
Creating the historical view in real-time has several advantages over using pre-rendered 360 videos. Firstly, the 360 video is drawn from a single physical position, which forces the user to stand in one location to accurately align the historical view on the phone with the real world. The VR version gives the user complete freedom to move around the physical environment for as long as the position of the phone can be determined. Secondly, because the VR version is being drawn live, it is possible to add interactive or personalised elements directly to it, as one would with a computer game. Thirdly, in the VR version, it is possible for users to ‘zoom’ their viewpoint (to examine details in the scene at a distance from them) without loss of image quality.
Of course, the VR scene could be drawn directly on the smartphone itself, but this has several drawbacks, including inferior rendering quality compared to that available on a decent PC, a much larger app that needs to be installed, and extremely high battery usage (performing the high-quality graphics drawing on our Android S8 smartphones caused the battery to drain in about 15 minutes).
The key technological requirement for this application was to keep the delay between a user moving their smartphone and the view on their screen updating to an absolute minimum. To this end, our partners at the University of Bristol provided a 60 GHz 5G mesh network between the Roman Baths trial location and the physical server cluster at the University of Bristol which performed the VR graphics drawing. The final ‘hop’ between the 5G network and the smartphones, however, was made over an LTE connection.
Bristol provisioned 10 virtual graphics-drawing PCs using OpenStack and optimised them to allow fast access to physical memory and graphics hardware. Onto each virtual machine, they installed our graphics application and streaming software. During the trial, a user’s smartphone would request graphics-drawing capability from Bristol’s servers and be assigned IP endpoints for sending the control data and receiving the drawn video. A description of the technical aspects of the phone app is included below.
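The allocation step described above can be pictured with a small sketch. This is only an illustration of the idea of handing each phone a pair of endpoints from a fixed pool of rendering machines. The class names, port numbers and addresses are hypothetical (the addresses come from the 192.0.2.0/24 documentation range) and do not describe the software actually used in the trial.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RenderEndpoints:
    """Hypothetical record of where a phone sends control data and receives video."""
    host: str
    control_port: int  # phone -> renderer: position, orientation, other control data
    video_port: int    # renderer -> phone: encoded video stream


class RendererPool:
    """Hands out one rendering machine per phone and takes it back on release."""

    def __init__(self, endpoints: List[RenderEndpoints]) -> None:
        self._free = list(endpoints)
        self._in_use: Dict[str, RenderEndpoints] = {}

    def acquire(self, phone_id: str) -> Optional[RenderEndpoints]:
        # A phone that asks twice is given the machine it already holds.
        if phone_id in self._in_use:
            return self._in_use[phone_id]
        if not self._free:
            return None  # every rendering machine is busy
        endpoints = self._free.pop(0)
        self._in_use[phone_id] = endpoints
        return endpoints

    def release(self, phone_id: str) -> None:
        endpoints = self._in_use.pop(phone_id, None)
        if endpoints is not None:
            self._free.append(endpoints)


# Ten placeholder machines, mirroring the ten virtual PCs mentioned in the post.
pool = RendererPool(
    [RenderEndpoints(f"192.0.2.{i}", 5000, 6000) for i in range(1, 11)]
)
```

With a fixed pool like this, an eleventh phone would simply be refused until another phone released its machine. How the real system handled that case is not covered here.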
Using Google’s ARCore SDK, each smartphone could locate itself in the Roman Baths and then send its position, orientation and other control data as raw UDP packets to Bristol’s graphics-drawing PCs. The graphics application on these PCs used Aardman’s historical 3D models of the Roman Baths and a server application based on the NVIDIA Capture SDK to draw the historical scene from the viewpoint of the smartphone, encode the resulting graphics as H.264 video and then stream this live view back to the smartphone using the SRT streaming protocol. This video was then decoded by the phone using an Android MediaCodec instance and displayed on an OpenGL surface in the app. The result of this process can be seen in the video below, in which a user is moving their phone around the upper balcony above the Great Bath, and the resulting historical view is being created about 15 miles away and sent back in real-time.
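To make the ‘raw UDP packets’ step more concrete, here is a minimal sketch of how a pose update might be packed into a single datagram. The wire format (a sequence number, a timestamp, a position and an orientation quaternion) is an assumption made for illustration, as are the function names. The actual packet layout used in the trial is not described in this post.

```python
import socket
import struct

# Hypothetical wire format, network byte order, no padding:
#   uint32  sequence number
#   uint64  timestamp in microseconds
#   3 x f32 position (x, y, z) in metres
#   4 x f32 orientation quaternion (x, y, z, w)
POSE_FORMAT = "!IQ3f4f"
POSE_SIZE = struct.calcsize(POSE_FORMAT)


def pack_pose(seq, timestamp_us, position, orientation):
    """Serialise one pose update into a fixed-size byte string."""
    return struct.pack(POSE_FORMAT, seq, timestamp_us, *position, *orientation)


def unpack_pose(datagram):
    """Inverse of pack_pose: returns (seq, timestamp_us, position, orientation)."""
    fields = struct.unpack(POSE_FORMAT, datagram)
    return fields[0], fields[1], fields[2:5], fields[5:9]


def send_pose(sock, address, seq, timestamp_us, position, orientation):
    """Fire-and-forget: UDP gives no delivery guarantee, so a lost pose is
    simply superseded by the next one."""
    sock.sendto(pack_pose(seq, timestamp_us, position, orientation), address)


if __name__ == "__main__":
    # 192.0.2.1 is a documentation-only address used here as a placeholder.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        send_pose(sock, ("192.0.2.1", 5000), 1, 0,
                  (0.0, 1.5, 0.0), (0.0, 0.0, 0.0, 1.0))
```

A sequence number lets the receiver discard updates that arrive out of order, which suits this kind of data because only the most recent pose matters.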
For this trial, the round-trip latency between a handset being moved and a video frame corresponding to that movement being displayed was ~200 milliseconds. Positive user feedback from the large number of participants who attended the trial confirms that this is satisfactory for a handheld VR application. It would not, however, be acceptable for a head-mounted VR display, where a latency of less than ~50ms between user motion and display update is required. Further methods for improving perceived latency would be required if this sort of technology were to be used for a headset-based VR experience. Possibilities include:
- Drawing a wider field-of-view than the display requires and moving the user viewpoint around that wider view using orientation from the fast smartphone sensors. User position latency would remain, but orientation latency would be greatly reduced.
- Using a sub-frame video encoder so that the streaming server doesn’t have to wait for a whole frame to be drawn before it can start sending data back to the phone.
- Increasing the frame rate of the graphics application (and accordingly the transmission bandwidth).
- Trading network reliability for speed by sending the video back to the phone using UDP only.
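The first of these possibilities, drawing a wider view and moving a window around it on the phone, can be sketched roughly as follows. This is a simplified illustration that assumes a standard perspective projection and handles horizontal rotation (yaw) only. It approximates the correction with a sideways shift of the crop window, which is reasonable for small angles but is not a full reprojection. The function names and the sign convention (positive yaw means the phone has turned to the right) are assumptions.

```python
import math


def focal_length_px(width_px, fov_deg):
    """Focal length in pixels of a perspective image with the given horizontal FOV."""
    return (width_px / 2) / math.tan(math.radians(fov_deg) / 2)


def crop_window(render_width_px, render_fov_deg, display_fov_deg, yaw_delta_deg):
    """Return (left_px, width_px) of the region of the wide frame to display.

    yaw_delta_deg is how far the phone has turned since the frame was drawn,
    taken from the phone's own sensors; it must lie strictly between -90 and 90.
    """
    f = focal_length_px(render_width_px, render_fov_deg)
    # Width, in pixels of the wide frame, covered by the display's narrower FOV.
    crop_width = 2 * f * math.tan(math.radians(display_fov_deg) / 2)
    # Spare pixels on each side of a centred crop.
    margin = (render_width_px - crop_width) / 2
    # Shift the crop towards the direction the phone has turned.
    shift = f * math.tan(math.radians(yaw_delta_deg))
    # Never slide the window off the edge of the rendered frame.
    shift = max(-margin, min(margin, shift))
    return round(margin + shift), round(crop_width)


if __name__ == "__main__":
    # A 1920-pixel-wide frame drawn with a 90 degree FOV, shown at 60 degrees,
    # after the phone has turned 5 degrees to the right.
    print(crop_window(1920, 90.0, 60.0, 5.0))
```

The margin limits how much rotation can be absorbed before the window reaches the edge of the rendered frame, so a wider rendered view tolerates more latency at the cost of drawing and transmitting pixels that are usually not shown.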
A large amount of user feedback was acquired through participant interviews and we are currently analysing this to identify areas for future development. One of the features that was particularly requested was the ability to ‘personalise’ the explanatory historical notes in the scenes depending on the level of knowledge and interest of the user. Another popular request was for the 5G content to be used alongside the existing explanations from the tour-guide. This and other feedback will determine how we design future deployments of this system.
This post is part of the Immersive and Interactive Content section