RTP and playout delay compensation
Henning Schulzrinne
Dept. of Computer Science
Columbia University
Fall 2003
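
RTP packet header

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |V=2|P|X|  CC   |M|     PT      |       sequence number         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                           timestamp                           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           synchronization source (SSRC) identifier            |
 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
 |            contributing source (CSRC) identifiers             |
 |                             ....                              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+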
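
To make the field layout concrete, the following is a minimal sketch of decoding the fixed header and the CSRC list with Python's standard `struct` module. The names `RtpHeader` and `parse_rtp_header` are hypothetical, chosen for illustration; header extensions and padding are flagged but not processed.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Illustrative container for the fields shown in the diagram."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed 12-byte header plus CC contributing-source ids."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed 12-byte RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    version = first >> 6
    if version != 2:
        raise ValueError("unsupported RTP version %d" % version)
    csrc_count = first & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % csrc_count, packet[12:end])
    return RtpHeader(
        version=version,
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=csrc_count,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrcs=csrcs,
    )
```

The payload starts at byte offset `12 + 4 * csrc_count` when the X bit is clear.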

RTP: timestamp
- Timestamp measured in sample units
- Reflects nominal sampling time of first sample in packet
  - e.g., 20 ms block size of 8,000 Hz audio = 160 timestamp units per packet
  - always 90 kHz for video
    - e.g., 3000 timestamp units per packet for 30 fps
    - 3600 for 25 fps
    - 3750 for 24 fps
- Even if real system clock is slower or faster
- Note: 32-bit integer may wrap around
  - if start at 0, after about 6 days for audio, ½ day for video
  - but starting value is supposed to be random
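
The per-packet increments and wrap-around times quoted above can be checked with a few lines of arithmetic. This is a sketch only; the function names are hypothetical, and `timestamp_diff` shows one common way (an assumption, not taken from the slides) to compare timestamps safely across a 32-bit wrap.

```python
def timestamp_step(clock_rate_hz: int, packets_per_second: float) -> int:
    """Timestamp units between consecutive packets (or video frames)."""
    return round(clock_rate_hz / packets_per_second)


def seconds_until_wrap(clock_rate_hz: int, counter_bits: int = 32) -> float:
    """Time for a counter starting at 0 to wrap at the given clock rate."""
    return 2 ** counter_bits / clock_rate_hz


def timestamp_diff(newer: int, older: int) -> int:
    """Signed difference newer - older, computed modulo 2**32."""
    d = (newer - older) & 0xFFFFFFFF
    return d - 0x100000000 if d >= 0x80000000 else d


audio_step = timestamp_step(8000, 50)                # 20 ms packets: 50 per second
video_steps = [timestamp_step(90000, fps) for fps in (30, 25, 24)]
audio_wrap_days = seconds_until_wrap(8000) / 86400   # a little over 6 days
video_wrap_hours = seconds_until_wrap(90000) / 3600  # a little over 13 hours
```

Because the starting value is random, a receiver cannot assume the wrap is days away; using a modular difference such as `timestamp_diff` everywhere avoids the special case.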

RTP sequence number
- Counts packets actually sent
- Wraps around much sooner than the timestamp (16 bits instead of 32)
  - e.g., for 20 ms packets, in about 22 minutes
- Also uses random starting value
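
Since the 16-bit counter wraps within minutes, loss counting has to work on an unwrapped ("extended") sequence number. The sketch below is one way to do that, assuming no duplicate packets and reordering of far fewer than 32768 packets; `seq_delta` and `count_lost` are hypothetical names.

```python
from typing import Sequence


def seq_delta(current: int, previous: int) -> int:
    """Signed distance from previous to current, modulo 2**16."""
    d = (current - previous) & 0xFFFF
    return d - 0x10000 if d >= 0x8000 else d


def count_lost(received: Sequence[int]) -> int:
    """Packets expected minus packets received, tolerant of wrap-around.

    `received` lists sequence numbers in arrival order. Assumes no
    duplicates; a duplicate would hide one loss.
    """
    if not received:
        return 0
    first = received[0]
    highest = first  # extended (unwrapped) highest sequence number seen
    for seq in received[1:]:
        delta = seq_delta(seq, highest & 0xFFFF)
        if delta > 0:          # late (reordered) packets do not move it
            highest += delta
    expected = highest - first + 1
    return expected - len(received)
```

For example, `count_lost([65534, 65535, 1, 2])` reports one missing packet (sequence number 0) even though the counter wrapped in between.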

RTP timestamp vs. sequence number
- Related, but different purposes
  - timestamp for timing reconstruction:
    - playout delay compensation (later)
    - synchronization with other sources (later)
  - sequence number for loss measurements and gap detection
- t = s*b + c, where
  - t = timestamp
  - s = sequence number
  - b = sample units per packet
  - c = offset
- Offset c is constant within a talkspurt, but changes after each talkspurt or after a transmission gap
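
One way to use the relation t = s*b + c is to watch for a change in c: between two packets of the same talkspurt the timestamp advances by exactly b units per sequence-number step, so any other advance signals a new talkspurt or a transmission gap. The sketch below compares deltas rather than computing c directly, so that wrap-around of either counter does not cause false alarms. The function names are hypothetical, and a fixed packet size b is assumed.

```python
from typing import List, Sequence, Tuple


def _signed_delta(newer: int, older: int, bits: int) -> int:
    """Signed difference newer - older, modulo 2**bits."""
    modulus = 1 << bits
    d = (newer - older) % modulus
    return d - modulus if d >= modulus // 2 else d


def talkspurt_starts(packets: Sequence[Tuple[int, int]],
                     units_per_packet: int) -> List[int]:
    """Indices of packets at which the offset c changed.

    `packets` holds (sequence number, timestamp) pairs in arrival order.
    The first packet always starts a talkspurt.
    """
    starts = []
    previous = None
    for index, (seq, timestamp) in enumerate(packets):
        if previous is None:
            starts.append(index)
        else:
            ds = _signed_delta(seq, previous[0], 16)
            dt = _signed_delta(timestamp, previous[1], 32)
            # Same c on both packets  <=>  dt == ds * b
            if dt != ds * units_per_packet:
                starts.append(index)
        previous = (seq, timestamp)
    return starts
```

A lost packet inside a talkspurt (sequence number skips by 2, timestamp by 2b) leaves c unchanged and is therefore not reported as a new talkspurt.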

Playout delay
- Converts variable network delay ("jitter") into fixed delay
  - thus, end-to-end delay is max(jitter) + propagation delay
  - or, if willing to tolerate some late packets:
    - delay = 95th percentile of jitter + propagation delay
- Propagation delay is invisible
  - and hard to measure without synchronized clocks
  - about 5 ms/1000 km one way
- Total delay should be less than 150 ms one-way
- End-to-end delay must remain constant within a talkspurt
  - otherwise gaps
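
As a worked illustration of the delay budget, the sketch below picks the jitter allowance either as the maximum observed jitter or as the value that lets a chosen fraction of packets arrive late, then adds propagation delay at 5 ms per 1000 km. The function name, the nearest-rank percentile rule, and the sample data are assumptions made for the example.

```python
from typing import Sequence

PROPAGATION_MS_PER_KM = 5.0 / 1000.0  # about 5 ms per 1000 km, one way
ONE_WAY_BUDGET_MS = 150.0             # target upper bound from the slide


def end_to_end_delay_ms(jitter_ms: Sequence[float],
                        distance_km: float,
                        tolerated_late_fraction: float = 0.0) -> float:
    """Fixed one-way delay = jitter allowance + propagation delay.

    With tolerated_late_fraction == 0 the allowance is max(jitter);
    with 0.05 roughly the worst 5% of packets would arrive too late.
    """
    if not jitter_ms:
        raise ValueError("need at least one jitter sample")
    ordered = sorted(jitter_ms)
    allowed_late = min(int(tolerated_late_fraction * len(ordered)),
                       len(ordered) - 1)
    allowance = ordered[len(ordered) - 1 - allowed_late]
    return allowance + distance_km * PROPAGATION_MS_PER_KM


samples = list(range(1, 21))                       # hypothetical jitter, 1..20 ms
strict = end_to_end_delay_ms(samples, 4000)        # no late packets tolerated
relaxed = end_to_end_delay_ms(samples, 4000, 0.05) # one of twenty may be late
within_budget = strict < ONE_WAY_BUDGET_MS
```

In practice the jitter samples come from differences of one-way delays, so the unknown clock offset between sender and receiver cancels out; the propagation term cannot be measured that way, which is why it has to be estimated separately.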

Playout delay
[Figure: packets plotted against time; jitter varies their arrival, and a packet arriving after its playout point is late = lost]

Playout buffer
- Logically infinite buffer
- Implemented as "circular buffer", with wrap around
- Takes care of jitter and reordering based on RTP timestamp t
- Playout point p = t*b + c
  - p = buffer position, measured in samples (typically, 16 bits if decoding is done before playout)
  - b = buffer positions per sample (usually, = 1)
  - c = offset
- [Figure: playout buffer; labels: silence, decoder (G.729, L16)]
- Usually, best to think of each talkspurt as an independently schedulable unit
- p = p_0 + (t - t_0) * b
  - t_0 = timestamp for first packet in talkspurt
  - p_0 = position for first packet in talkspurt
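
The mapping p = p_0 + (t - t_0) * b can be sketched as a small circular buffer. The class below is an illustration only: the name `PlayoutBuffer`, its methods, and the use of 0 as the silence sample are assumptions, and it stores already-decoded samples in a Python list.

```python
from typing import List, Sequence


class PlayoutBuffer:
    """Logically infinite sample buffer stored in a fixed-size ring."""

    def __init__(self, size: int, b: int = 1) -> None:
        self.size = size
        self.b = b                    # buffer positions per sample
        self.samples = [0] * size     # 0 stands for silence
        self.t0 = 0
        self.p0 = 0

    def start_talkspurt(self, t0: int, p0: int) -> None:
        """Fix the insertion point p0 for the first timestamp t0."""
        self.t0, self.p0 = t0, p0

    def position(self, t: int) -> int:
        """Logical (unwrapped) position p = p0 + (t - t0) * b."""
        dt = (t - self.t0) & 0xFFFFFFFF
        if dt >= 0x80000000:          # packet older than t0 (reordering)
            dt -= 0x100000000
        return self.p0 + dt * self.b

    def insert(self, t: int, decoded: Sequence[int]) -> None:
        """Place decoded samples by timestamp; arrival order is irrelevant."""
        p = self.position(t)
        for i, sample in enumerate(decoded):
            self.samples[(p + i * self.b) % self.size] = sample

    def play(self, p: int, n: int) -> List[int]:
        """Read n positions starting at logical position p.

        Each slot is reset to silence so stale audio is never replayed.
        """
        out = []
        for i in range(n):
            index = (p + i) % self.size
            out.append(self.samples[index])
            self.samples[index] = 0
        return out
```

Packets that arrive out of order land in the right place because only the timestamp decides the position, and a packet that never arrives simply leaves silence in its slots.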

Playout buffer, cont'd.
- Thus, hard part is computing insertion point for first packet in talkspurt
- Trying to predict future
  - late loss vs. excessive delay
- Conceptually, two approaches:
  - look at current playout point when first packet arrives
    - then, leave some margin of error
    - may be too conservative
  - compute based on last talkspurt and change c
    - avoids overestimation due to slow first packet
    - deals less well with jumps in delay after long pauses
- [Figure: buffer timeline t with insert and play pointers; labels t=100, t=140]
- Simple method: assume roughly normal distribution and take n times the delay variation (= jitter)
  - this becomes the extra delay
- Other mechanisms:
  - spike detection
  - optimal value for last talkspurt
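
The two approaches and the "n times the jitter" rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, delay variation is measured as the population standard deviation over the last talkspurt, n = 4 is an arbitrary default, and arrival times are expressed in timestamp units on the receiver's own clock (so only differences are meaningful).

```python
import statistics
from typing import Sequence


def insertion_point_from_play_point(play_point: int, margin: int) -> int:
    """Approach 1: insert the first packet a fixed margin ahead of the
    position currently being played."""
    return play_point + margin


def offset_from_last_talkspurt(timestamps: Sequence[float],
                               arrivals: Sequence[float],
                               n: float = 4.0) -> float:
    """Approach 2: derive the new offset c from the previous talkspurt.

    The relative delay of each packet is arrival - timestamp. The offset
    is the mean relative delay plus n times its spread; the second term
    is the extra delay that absorbs jitter.
    """
    delays = [a - t for t, a in zip(timestamps, arrivals)]
    if not delays:
        raise ValueError("need at least one packet from the last talkspurt")
    return statistics.mean(delays) + n * statistics.pstdev(delays)


# Previous talkspurt: relative delays of 90 and 110 units (mean 100, spread 10)
c = offset_from_last_talkspurt([0, 160], [90, 270], n=3)
```

A larger n trades fewer late packets for more delay; the first approach needs no history but inherits whatever delay the first packet happened to see.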