When muxing to some format, duration of the packet is used. We need an API for encoder to return the frame size.