What does currentTime mean in HTML5?
In our previous post on implementing startOffsetTime in HTML5, we mentioned that Firefox and Chrome do not agree on a common interpretation of the currentTime attribute. This post expands on that point to explain in detail what the differences are (and why we think Firefox is right).
Explanation of join time (tj), earliest seek time (tes), current position (tp)
Let's look at how Firefox and Chrome differ in the way they interpret the 'earliest seekable time of the media resource' as defined in the HTML5 media elements specification.
As we'll see below, Firefox defines this as the time the browser joins the stream, whereas Chrome defines it as the time the stream started. We'll now look at these interpretations in detail.
The diagram below shows the salient points on the timeline.
Key to diagram
| tj  | browser joins stream |
| tp  | current playback position |
| te1 | earliest playback position (Chrome) |
| te2 | earliest playback position (Firefox) |
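To make the difference concrete, here is a minimal sketch of the two readings of currentTime as plain arithmetic. The `Timeline` type, the function names and the sample values are all hypothetical and chosen only for illustration; positions are assumed to be in seconds measured from the start of the stream.

```typescript
// Hypothetical timeline values, in seconds from the start of the stream.
interface Timeline {
  streamStart: number; // te1: earliest position in the stream
  joinTime: number;    // tj: when the browser joined the stream
  position: number;    // tp: current playback position
}

// Firefox-style reading: zero is the point at which the browser joined.
function currentTimeFromJoin(t: Timeline): number {
  return t.position - t.joinTime;
}

// Chrome-style reading: zero is the start of the stream.
function currentTimeFromStreamStart(t: Timeline): number {
  return t.position - t.streamStart;
}

// Assumed example: the stream has been running for 120 s when the browser
// joins, and playback has advanced a further 30 s since then.
const example: Timeline = { streamStart: 0, joinTime: 120, position: 150 };
console.log(currentTimeFromJoin(example));        // 30
console.log(currentTimeFromStreamStart(example)); // 150
```

With these assumed values the same playback position is reported as 30 seconds under one reading and 150 seconds under the other, which is the disagreement the rest of this post examines.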
Firefox's interpretation of currentTime
In Firefox, the currentTime attribute is interpreted as the offset into the stream from the time the browser joined the stream. In terms of the diagram above, currentTime = tp - tj. If currentTime were the only information available to us, we could not implement startOffsetTime, as we would have lost the information telling us how far tj is from the start of the stream. Fortunately for us, Firefox maintains an internal variable called startTime, which measures the interval from the beginning of the stream to the join point, i.e. tj - te1.
So, using the DateUTC field from the WebM header, we can calculate startOffsetTime = DateUTC + startTime. To calculate the actual current stream time, we simply add currentTime to startOffsetTime.
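The calculation described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function names are made up, DateUTC is assumed to have already been converted to Unix epoch milliseconds, and startTime and currentTime are assumed to be in seconds. It does not show how Firefox exposes startTime internally.

```typescript
// dateUtcMs:      WebM DateUTC header, assumed already converted to
//                 Unix epoch milliseconds (hypothetical input).
// startTimeSec:   interval from stream start to the join point, in seconds.
// currentTimeSec: the media element's currentTime (seconds since joining).

// startOffsetTime = DateUTC + startTime
function startOffsetTime(dateUtcMs: number, startTimeSec: number): Date {
  return new Date(dateUtcMs + startTimeSec * 1000);
}

// Actual current stream time = startOffsetTime + currentTime
function currentStreamTime(offset: Date, currentTimeSec: number): Date {
  return new Date(offset.getTime() + currentTimeSec * 1000);
}

// Assumed example: the stream started at 12:00:00 UTC, the browser joined
// 120 s later, and 30 s have been played since joining.
const dateUtcMs = Date.UTC(2012, 0, 1, 12, 0, 0);
const offset = startOffsetTime(dateUtcMs, 120);
const streamNow = currentStreamTime(offset, 30);
console.log(offset.toISOString());    // 2012-01-01T12:02:00.000Z
console.log(streamNow.toISOString()); // 2012-01-01T12:02:30.000Z
```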
Chrome's interpretation of currentTime
Chrome has a different interpretation of the 'earliest seekable time of the media resource' in that it implements currentTime = tp - te1, i.e. currentTime is the offset from the time the stream started streaming. This would appear to make things easier for us, as startOffsetTime is then simply the DateUTC read from the WebM header. However, there is a problem here. The relevant section of the HTML5 media elements specification states:
In the absence of an explicit timeline, the zero time on the media timeline should correspond to the first frame of the media resource. For static audio and video files this is generally trivial. For streaming resources, if the user agent will be able to seek to an earlier point than the first frame originally provided by the server, then the zero time should correspond to the earliest seekable time of the media resource; otherwise, it should correspond to the first frame received from the server (the point in the media resource at which the user agent began receiving the stream).
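One way to read the rule quoted above is as a small decision function. The sketch below is our own paraphrase, not text from the specification: the function and parameter names are hypothetical, and positions are assumed to be in seconds measured from the true start of the stream.

```typescript
// Returns the stream position that should be treated as zero on the media
// timeline when there is no explicit timeline.
//   firstFrameReceived: position of the first frame the server sent us
//   earliestSeekable:   earliest position the user agent can seek to
function zeroTime(firstFrameReceived: number, earliestSeekable: number): number {
  // If the user agent can seek to a point before the first frame it was
  // given, zero is the earliest seekable time; otherwise zero is the first
  // frame received.
  const canSeekEarlier = earliestSeekable < firstFrameReceived;
  return canSeekEarlier ? earliestSeekable : firstFrameReceived;
}

// Assumed example: the browser joins 120 s into the stream.
console.log(zeroTime(120, 120)); // 120: cannot seek earlier than the join point
console.log(zeroTime(120, 60));  // 60: could seek back to 60 s into the stream
```

Under this reading, a user agent that cannot seek before its join point places zero at the join point, so a playback position 150 seconds into the stream would be reported as 30 seconds.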
And the winner is...
Given that neither browser can seek earlier than the first frame it receives, this implies that Firefox has made the correct interpretation of currentTime. Chrome's interpretation appears to be that zero on the timeline represents the earliest position in the stream. However, we agree with Firefox that the specification implies that zero should represent the earliest seekable position in the stream.
There seems to be some confusion here in how the HTML5 media elements specification is dealing with logical stream addressing versus physical stream addressing. The excerpt above talks about a user agent being able to "seek to an earlier point than the first frame originally provided by the server" but does not explain how this could possibly happen without communication back to the server, in which case we are effectively dealing with a request for a different physical resource. At the very least, the fact that the Firefox and Chrome teams came up with different interpretations shows that this part of the specification would benefit from clarification.