There are three places to take the measurement: at the server, at the video-request level, or in the player.
Arguably the last is the most accurate, since it reflects what the viewer is actually experiencing; previous experiments have shown that measuring at the server is a very imprecise science.
Measuring at the video management system level is fine, but it is over-dependent on the protocol and server infrastructure to be accurate for anything beyond video requests.
The biggest issue is that of measuring time. It's easy enough to know whether a video has been requested, but what if the user then switches to another window on their PC and lets the video (and subsequent videos) continue playing in the background with the sound off (something I often do myself)?
A number of strategies have been deployed to address this. One is a 'heartbeat' that 'pings' the server as the video continues to play. Ideally, this would be embedded in the video file itself and would therefore work across platforms and contexts. But on its own it does not overcome the problem above.
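To make the idea concrete, here is a minimal sketch of a heartbeat check in a player. Everything here is illustrative rather than any real player's API: the `PlaybackState` fields and the rule that a beat only counts when the tab is visible and the sound is on (one possible answer to the background-window problem) are my own assumptions.

```typescript
// Illustrative sketch of the 'heartbeat' strategy. A real player would call
// shouldSendHeartbeat() on a timer and ping the server when it returns true.

interface PlaybackState {
  playing: boolean;     // the video is actually advancing
  tabVisible: boolean;  // e.g. derived from the browser's Page Visibility API
  muted: boolean;
}

// Only count the interval as 'watched' when the viewer could plausibly be
// watching: playing, in a visible tab, with the sound on.
function shouldSendHeartbeat(state: PlaybackState): boolean {
  return state.playing && state.tabVisible && !state.muted;
}

// Estimate watched time by summing the intervals whose heartbeat fired.
function watchedSeconds(states: PlaybackState[], intervalSec: number): number {
  return states.filter(shouldSendHeartbeat).length * intervalSec;
}
```

With a 10-second interval, a session where the viewer watched one interval and then backgrounded the tab for another would be credited 10 seconds, not 20, which is exactly the distinction the raw request count misses.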
Another way I've seen of tackling this is to compare the bytes delivered against the size of the source file; this involves deep integration with the metadata and the video management system.
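The arithmetic behind that heuristic can be sketched in a few lines. The function name and the clamping behaviour are my own assumptions; the sketch also hints at why the approach needs that deep integration, since buffering ahead, seeking, and adaptive bitrates all distort the ratio.

```typescript
// Rough sketch of the bytes-delivered heuristic: estimate the fraction of a
// video consumed as deliveredBytes / sourceBytes. Crude by design: a viewer
// who seeks back and forth can pull more bytes than the file contains, so
// the result is clamped to 1.
function estimatedFractionViewed(deliveredBytes: number, sourceBytes: number): number {
  if (sourceBytes <= 0) {
    throw new Error("source file size must be positive");
  }
  return Math.min(deliveredBytes / sourceBytes, 1);
}
```

So 50 MB delivered of a 200 MB file would read as roughly a quarter viewed, while 300 MB delivered of the same file would be clamped to 100% rather than reported as 150%.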
Another is to create an overlay on the video that is aware of the playback context; but this would again be player-specific.
I suspect that this narrow area is a potential goldmine for a single-strand company that can come up with an agnostic solution to the problem. Companies like Visible Measures and, to a more general extent, TubeMogul are working in this potentially lucrative space.
In the meantime it's unfortunate that online TV has, in some ways, become as inexact a medium as the traditional TV it replaced.