Multiplexing · Lesson 7 · Beginner

Media Time Slices

Audio and video can share one path by sending small timestamped packets and using a buffer to absorb jitter.

A conductor keeps instruments in time

A song has drums, voice, and guitar. They can travel as separate little packets, but playback must follow the song clock. If a guitar packet arrives early and a voice packet arrives late, the player holds the early one in a buffer and lines both up against the song clock.

Memorable label: “Clock beats labels.” Media packets need track labels and time labels.

Six-year-old version: Stickers say “audio” or “video,” but a clock sticker says when each piece should play.

Real-time multiplexing cares about jitter

For media, being correct is not enough. A video frame that arrives too late is useless. A jitter buffer waits briefly so packets that arrive unevenly can still play in timestamp order.
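A jitter buffer can be sketched in a few lines. This is a minimal illustration, not a production design: the class name, the packet fields (track, mediaMs, data), and the 50 ms delay are all assumptions made for this example. It also assumes media time starts at 0 and that the playback clock is anchored to the first arrival.

```javascript
// Minimal jitter buffer sketch. All names here are hypothetical.
// Each packet: { track, mediaMs, data }, where mediaMs is the media timestamp.
class JitterBuffer {
  constructor(delayMs) {
    this.delayMs = delayMs; // how long to wait before playing anything
    this.startMs = null;    // wall-clock time of the first arrival
    this.held = [];         // packets waiting for their play time
  }

  // Wall-clock time at which a packet should play.
  playAt(packet) {
    return this.startMs + this.delayMs + packet.mediaMs;
  }

  // Store a packet that arrived at wall-clock time nowMs.
  // Returns false if the packet is already past its play time.
  push(packet, nowMs) {
    if (this.startMs === null) this.startMs = nowMs;
    if (nowMs > this.playAt(packet)) return false; // too late, drop it
    this.held.push(packet);
    return true;
  }

  // Remove and return every packet due at nowMs, in timestamp order.
  popDue(nowMs) {
    const due = this.held.filter((p) => this.playAt(p) <= nowMs);
    this.held = this.held.filter((p) => this.playAt(p) > nowMs);
    return due.sort((a, b) => a.mediaMs - b.mediaMs);
  }
}

const buffer = new JitterBuffer(50);
buffer.push({ track: "video", mediaMs: 20, data: "v1" }, 100); // arrives first
buffer.push({ track: "audio", mediaMs: 0, data: "a0" }, 110);  // arrives second
const played = buffer.popDue(170).map((p) => p.data);
const accepted = buffer.push({ track: "audio", mediaMs: 10, data: "a1" }, 200);
```

Here `played` comes out in timestamp order even though the video packet arrived first, and the last packet is rejected because its play time has already passed.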

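RTP, the transport WebRTC uses for media, puts a sequence number and a timestamp on every packet. Sequence numbers reveal lost or reordered packets, and timestamps tell the receiver when each packet should play. MDN describes WebRTC as supporting real-time audio/video communication in browsers (MDN WebRTC).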
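To show what sequence information is good for, here is a small sketch that finds missing packets from a run of sequence numbers. RTP sequence numbers are 16 bits wide, so they wrap from 65535 back to 0. The function names are hypothetical, and the sketch assumes the numbers are listed in sending order, with reordering already undone.

```javascript
// Sketch of loss detection from sequence numbers. Function names are hypothetical.
// RTP sequence numbers are 16-bit, so they wrap from 65535 back to 0.
const SEQ_MOD = 65536;

// Forward distance from sequence number `from` to `to`, allowing for wraparound.
function seqDistance(from, to) {
  return (to - from + SEQ_MOD) % SEQ_MOD;
}

// Given received sequence numbers in sending order,
// list the numbers that never arrived.
function missingSeqs(received) {
  const missing = [];
  for (let i = 1; i < received.length; i++) {
    const gap = seqDistance(received[i - 1], received[i]);
    for (let step = 1; step < gap; step++) {
      missing.push((received[i - 1] + step) % SEQ_MOD);
    }
  }
  return missing;
}

const gaps = missingSeqs([65533, 65534, 0, 1]);
```

In this run, `gaps` contains only 65535: the jump from 65534 to 0 skips one packet across the wraparound point.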

Demo: arrival order vs playback order

Packets arrive unevenly. The player uses media time, not arrival order, to play smoothly.
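The difference between arrival order and playback order can be sketched as a sort on media time. The packet fields (track, mediaMs, data) and the sample values are assumptions made for this example.

```javascript
// Packets listed in arrival order. Field names are assumptions for this sketch.
const arrived = [
  { track: "video", mediaMs: 40, data: "frame 2" },
  { track: "audio", mediaMs: 0, data: "beat 1" },
  { track: "video", mediaMs: 0, data: "frame 1" },
  { track: "audio", mediaMs: 20, data: "beat 2" },
];

// Order packets by media time. Packets with equal media time keep their
// arrival order because Array.prototype.sort is stable.
function playbackOrder(packets) {
  return [...packets].sort((a, b) => a.mediaMs - b.mediaMs);
}

const schedule = playbackOrder(arrived).map(
  (p) => `${p.mediaMs} ${p.track} ${p.data}`
);
```

The packet that arrived first ("frame 2") ends up last in `schedule`, because its media time is the latest.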

Media mux road: an audio track and a video track feed one packet stream, which passes through a jitter buffer. Small packets carry track and media time.

Playback schedule: packets sorted by media time, shown with the columns Play at, Track, and Data.

Exercise

node --test tests/media-mux.test.mjs

Open src/media-mux.mjs. It interleaves packets by media timestamp, then models a jitter buffer.
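Before opening the file, it may help to see one way interleaving by media timestamp could look. This is a hypothetical sketch, not the contents of src/media-mux.mjs: the function name, the packet fields, and the rule that audio goes first on a tie are all assumptions. It expects each track to be sorted by media time already.

```javascript
// Hypothetical sketch, not the contents of src/media-mux.mjs.
// Merge two tracks, each already sorted by mediaMs, into one packet stream.
function interleave(audio, video) {
  const out = [];
  let a = 0;
  let v = 0;
  while (a < audio.length || v < video.length) {
    // Take audio when video is exhausted, or when the next audio packet
    // is not later than the next video packet.
    const takeAudio =
      v >= video.length ||
      (a < audio.length && audio[a].mediaMs <= video[v].mediaMs);
    if (takeAudio) {
      out.push({ track: "audio", ...audio[a] });
      a += 1;
    } else {
      out.push({ track: "video", ...video[v] });
      v += 1;
    }
  }
  return out;
}

const stream = interleave(
  [{ mediaMs: 0, data: "a0" }, { mediaMs: 20, data: "a1" }, { mediaMs: 40, data: "a2" }],
  [{ mediaMs: 0, data: "v0" }, { mediaMs: 33, data: "v1" }]
);
const order = stream.map((p) => p.data);
```

Each output packet carries both a track label and a media time, so a receiver can split the stream back into tracks and schedule every piece.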

Knowledge check

Why does media multiplexing need timestamps?

Interview payoff

Question: “Why is real-time media multiplexing harder than file transfer?”

Model answer: “Media packets must arrive and play on time. Multiplexing needs track labels, timestamps, sequence handling, and buffering to absorb jitter. A packet that arrives after its play time is as useless as a missing one, and waiting for it would delay everything queued behind it.”

Pro tip: Buffers trade latency for smoothness. A larger buffer means fewer glitches but more delay, so the stream feels less live.
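The trade-off can be put into numbers with a rough sketch. The network delays below are made up for illustration, and the model is simplified: it assumes playback is anchored to the fastest packet, so a packet is late when its delay exceeds the fastest delay plus the buffer size.

```javascript
// Hypothetical numbers: each packet's network delay in ms (send to arrival).
const networkDelaysMs = [20, 25, 90, 22, 60, 21, 130, 24];

// A packet is late when its delay exceeds the fastest delay plus the buffer.
// Assumption: playback is anchored to the fastest packet observed.
function lateCount(delaysMs, bufferMs) {
  const fastest = Math.min(...delaysMs);
  return delaysMs.filter((d) => d > fastest + bufferMs).length;
}

// Compare three buffer sizes: none, medium, large.
const tradeoff = [0, 50, 120].map((bufferMs) => ({
  bufferMs,
  late: lateCount(networkDelaysMs, bufferMs),
}));
```

With these sample delays, no buffer leaves 7 of 8 packets late, a 50 ms buffer leaves 2 late, and a 120 ms buffer leaves none late, at the cost of 120 ms of added delay.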

Next challenge: Can we build a tiny practical protocol using these labels, chunks, and channels?