Extract, cut, join and merge audio and video streams

This is mostly a notepad for myself, with quick instructions on how to extract, cut, join and merge audio and video streams.

At Igalia we often hold meetings with several parties attending remotely. The simple setup for such meetings usually involves a desktop shared through VNC and a SIP call in a multi-user room hosted on our Asterisk installation.

When some of my Igalian mates cannot attend, we may want to record the meeting so they can play it back later. Fortunately, GNOME Shell provides integrated desktop recording out of the box, and we have Asterisk set up to automatically record our calls in specific multi-user rooms.

So, all that is left after a meeting is to get both files, edit them slightly, and sync them so they can be merged into a single multimedia container.

Usually, I would use Kdenlive for my video editing tasks. However, Kdenlive doesn’t support editing video without re-encoding, and I would really like to avoid re-encoding everything, especially the video stream. Therefore, I will still use Kdenlive, but only to sync both streams and to find the cutting points for the video and the audio files.

For most of these “without re-encoding” actions I will use the great avconv tool.

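First, I will cut the video, taking 00:07:45 as the starting point and 02:05:20 as the ending point. Note that -t takes a duration rather than an end position, so I pass the difference between the two points (01:57:35):

$ avconv -i screencast.webm -c:v copy -ss 00:07:45 -t 01:57:35 cut-screencast.mkv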

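This command basically demuxes the WebM container and extracts the video stream between those two points, then muxes it again into a Matroska container. Because the stream is copied rather than re-encoded, the actual cut may land on a nearby keyframe instead of on the exact timestamp.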
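avconv's -t option expects a duration, not an end position, so the value has to be derived from the two cutting points. The following is a minimal sketch of that arithmetic in POSIX shell. The helper names (hms_to_seconds, seconds_to_hms, duration_between) are made up for this note and are not part of avconv.

```shell
#!/bin/sh
# Hypothetical helpers (not part of avconv) to turn two HH:MM:SS
# cutting points into the duration that -t expects.

# Convert HH:MM:SS to seconds. awk reads the fields as decimal numbers,
# so values such as "08" are not mistaken for octal.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Convert a number of seconds back to HH:MM:SS.
seconds_to_hms() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Print the duration between a start point and an end point.
duration_between() {
    start=$(hms_to_seconds "$1")
    end=$(hms_to_seconds "$2")
    seconds_to_hms $(( end - start ))
}

# Cutting points of the screencast used in this post.
duration_between 00:07:45 02:05:20   # prints 01:57:35
```

The same duration comes out of the audio cutting points used further down (00:02:13 to 01:59:48), which is a handy sanity check that both cuts will have the same length.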

Then, I will cut the audio, taking 00:02:13 as the starting point and 01:59:48 as the ending point. For editing OGG files we can use Oggscissors or oggCut, from OGG Video Tools.

You won’t find Oggscissors in Debian (the distribution I use), so you will have to download it, install pyvorbis and pyogg and, maybe, slightly modify the script so that it uses the proper Python interpreter. You can install the missing packages like this:

root$ apt-get install python-pyvorbis

Once Oggscissors is working, we can get the interesting audio chunk, giving the positions in seconds, like this:

$ oggscissors.py --from=133 --upto=7188 conf-call.ogg cut-conf-call.ogg

or, with oggCut, which takes the positions in milliseconds, like this:

$ oggCut -s 133000 -e 7188000 conf-call.ogg cut-conf-call.ogg
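The two tools take the same cutting points in different units: seconds for Oggscissors and milliseconds for oggCut. As a rough sketch, the conversion from the HH:MM:SS positions read in the video editor could be scripted as below. hms_to_s and hms_to_ms are hypothetical helper names, and the script only prints the resulting command lines instead of running them.

```shell
#!/bin/sh
# Hypothetical helpers: convert an HH:MM:SS position into the unit
# each tool expects. awk parses the fields as decimal numbers.

# Seconds, as used in the oggscissors.py example.
hms_to_s() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Milliseconds, as used in the oggCut example.
hms_to_ms() {
    echo "$1" | awk -F: '{ print ($1 * 3600 + $2 * 60 + $3) * 1000 }'
}

from=00:02:13
upto=01:59:48

# Print the commands so they can be reviewed before running them.
echo "oggscissors.py --from=$(hms_to_s "$from") --upto=$(hms_to_s "$upto") conf-call.ogg cut-conf-call.ogg"
echo "oggCut -s $(hms_to_ms "$from") -e $(hms_to_ms "$upto") conf-call.ogg cut-conf-call.ogg"
```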

It may happen that we actually want to extract the audio from another video file. This has happened to us occasionally, when we wanted to reuse the audio from an already synced file in another video of higher quality.

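We will also use avconv for this. Here, -map 0:1 selects the second stream of the first input, which is assumed to be the audio stream: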

$ avconv -i synced-video.ogv -map 0:1 -c:a copy synced-audio-output.ogg

It may also happen that we want to join a couple of OGG files, since our SIP conf-calls sometimes have hiccups. With Oggscissors this is done as follows:

$ oggscissors.py --join first.ogg second.ogg joint-output.ogg

With oggCat, where the output file comes first, it is done like this:

$ oggCat joint-output.ogg first.ogg second.ogg

Finally, we will merge, or mux, the resulting video and audio files into a single media container. Again, with avconv this is done like this:

$ avconv -i cut-screencast.mkv -i cut-conf-call.ogg -c copy final-screencast-conf-call.mkv

Following the examples above, this will result in a Matroska video file which contains a VP8 video stream and a Vorbis audio stream.
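With this many similarly named intermediate files it is easy to mix up the inputs and the output of the mux step. As a small safety net, the command could be wrapped in a shell function like the sketch below. mux_copy is a hypothetical name of my own, not an avconv feature. It only checks the file names before handing over to the same avconv invocation used above.

```shell
#!/bin/sh
# Hypothetical wrapper around the mux step; mux_copy is not part of avconv.
# Usage: mux_copy VIDEO AUDIO OUTPUT
mux_copy() {
    video=$1
    audio=$2
    output=$3

    # Refuse to write over one of the inputs.
    if [ "$output" = "$video" ] || [ "$output" = "$audio" ]; then
        echo "mux_copy: output must differ from both inputs" >&2
        return 1
    fi

    # Refuse to clobber a file that already exists.
    if [ -e "$output" ]; then
        echo "mux_copy: $output already exists" >&2
        return 1
    fi

    # Copy both streams untouched into the new container.
    avconv -i "$video" -i "$audio" -c copy "$output"
}

# Example call (commented out so that sourcing this file runs nothing):
# mux_copy cut-screencast.mkv cut-conf-call.ogg final-screencast-conf-call.mkv
```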

Hope you find this useful!
