Quick Start Guide

Install

To use ffmpegio, the package must be installed in Python, and the FFmpeg binary files must be at a location where ffmpegio can find them.

Install the full ffmpegio package via pip:

pip install ffmpegio

If numpy.ndarray data I/O is not needed, instead use

pip install ffmpegio-core

If FFmpeg is not installed on your system, please follow the instructions on the Installation page.

Features

FFmpeg can read/write virtually any multimedia file out there, and ffmpegio uses FFmpeg’s prowess to perform media I/O (and other) operations in Python. It offers two basic modes of operation: block read/write and stream read/write. ffmpegio can also report the properties of media files using FFprobe.

Media Probe

To process a media file, you first need to know what’s in it. Within the FFmpeg ecosystem, this task is handled by ffprobe. ffmpegio’s ffmpegio.probe module wraps ffprobe with 5 basic functions:

>>> import ffmpegio
>>> from pprint import pprint

>>> url = 'mytestvideo.mpg'
>>> format_info = ffmpegio.probe.format_basic(url)
>>> pprint(format_info)
{'duration': 66.403256,
'filename': 'mytestvideo.mpg',
'format_name': 'mpegts',
'nb_streams': 2,
'start_time': 0.0}

>>> stream_info = ffmpegio.probe.streams_basic(url)
>>> pprint(stream_info)
[{'codec_name': 'mp2', 'codec_type': 'audio', 'index': 0},
{'codec_name': 'h264', 'codec_type': 'video', 'index': 1}]

>>> vst_info = ffmpegio.probe.video_streams_basic(url)
>>> pprint(vst_info)
[{'codec_name': 'h264',
'display_aspect_ratio': Fraction(22, 15),
'duration': 66.39972222222222,
'frame_rate': Fraction(15000, 1001),
'height': 240,
'index': 1,
'pix_fmt': 'yuv420p',
'sample_aspect_ratio': Fraction(1, 1),
'start_time': 0.0,
'width': 352}]

>>> ast_info = ffmpegio.probe.audio_streams_basic(url)
>>> pprint(ast_info)
[{'channel_layout': 'stereo',
'channels': 2,
'codec_name': 'mp2',
'duration': 66.40325555555556,
'index': 0,
'nb_samples': 2928384,
'sample_fmt': 'fltp',
'sample_rate': 44100,
'start_time': 0.0}]

To obtain the complete ffprobe output, use ffmpegio.probe.full_details(), and to obtain specific format or stream fields, use ffmpegio.probe.query(). For more information on probe, see Media Probe Function References.

Block Read/Write

Suppose you need to analyze short audio data in mytestfile.wav. You can read all its samples with

>>> fs, x = ffmpegio.audio.read('mytestfile.wav')

It returns the sampling rate fs and a numpy.ndarray x. The audio data is always represented by a 2-D array, each column of which represents an audio channel. So, a 2-second stereo recording at 8000 samples/second yields an x.shape of (16000, 2). The sample format is also preserved: if the samples in the wav file are 16-bit, x has the numpy.int16 dtype.

Now suppose you’ve processed this audio data and produced an 8000-sample 1-D array y at a reduced sampling rate of 4000 samples/second. To save this new audio data as a FLAC file, run:

>>> ffmpegio.audio.write('myoutput.flac', 4000, y)

There are video counterparts to these two functions:

>>> fs, F = ffmpegio.video.read('mytestvideo.mp4')
>>> ffmpegio.video.write('myoutput.avi', fs, F)

Let’s suppose mytestvideo.mp4 is 10 seconds long and contains a yuv420p-encoded color video stream with a frame size of 640x480 pixels and a frame rate of 29.97 (30000/1001) frames/second. Then video.read() returns a 2-element tuple: the first element, fs, is the frame rate as a fractions.Fraction, and the second element, F, holds all the frames of the video in a numpy.ndarray with shape (299, 480, 640, 3). Because the video is in color, each pixel is represented in 24-bit RGB, so F.dtype is numpy.uint8. The video write is the inverse of the read operation.
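As a rough check on the numbers in this example, the frame count and memory footprint of a block read can be estimated with plain arithmetic. The helper estimate_block_size below is hypothetical and not part of ffmpegio, and it assumes the stream holds exactly duration x frame rate whole frames; the actual count depends on the file.

```python
from fractions import Fraction
from math import floor


def estimate_block_size(duration, frame_rate, width, height, channels=3):
    """Estimate the array shape and byte size of a uint8 video block.

    Hypothetical helper for illustration; not part of ffmpegio.
    """
    nframes = floor(duration * frame_rate)  # whole frames that fit in the duration
    shape = (nframes, height, width, channels)
    nbytes = nframes * height * width * channels  # one byte per uint8 component
    return shape, nbytes


shape, nbytes = estimate_block_size(10, Fraction(30000, 1001), 640, 480)
print(shape)   # (299, 480, 640, 3)
print(nbytes)  # 275558400
```

Ten seconds of 640x480 RGB video already occupies about 276 MB as an array, so block reads are best kept to short clips.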

For image (or single video frame) I/O, there is a pair of functions as well:

>>> I = ffmpegio.image.read('myimage.png')
>>> ffmpegio.image.write('myoutput.bmp', I)

The image data I is like the video frame data, but without the leading dimension.

Stream Read/Write

Block read/write is simple and convenient for a short file, but it quickly becomes slow and inefficient as the data size grows; this is especially true for video. To enable on-demand data retrieval, ffmpegio offers stream read/write operation. It mimics Python’s familiar file I/O with ffmpegio.open():

>>> with ffmpegio.open('mytestvideo.mp4', 'rv') as f: # opens the first video stream
...     print(f.rate) # frame rate fraction in frames/second
...     F = f.read() # read the first frame
...     F = f.read(5) # read the next 5 frames at once

Another example, which uses read and write streams simultaneously:

>>> with ffmpegio.open('mytestvideo.mp4', 'rv', blocksize=100) as f, \
...      ffmpegio.open('myoutput.avi', 'wv', f.rate) as g:
...     for frames in f: # iterates over all frames, 100 frames at a time
...         output = my_processor(frames) # function to process data
...         g.write(output) # send the processed frames to 'myoutput.avi'

By default, ffmpegio.open() opens the first media stream available to read. However, the operation mode can be specified via the second argument, mode. The above example opens the mytestvideo.mp4 file in 'rv' or “read video” mode and myoutput.avi in 'wv' or “write video” mode. The file reader object f is an iterable, which returns the next set of frames (the number set by the blocksize argument) on each iteration. For more, see ffmpegio.open().

Specify Read Time Range

For both block and stream read operations, you can specify the time range to read data from. There are three options available:

Read Timing Options

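=====  ====================================================
Name   Description
=====  ====================================================
ss     Start time in seconds
t      Duration in seconds
to     End time in seconds (ignored if t is also specified)
=====  ====================================================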

Note that these timing options can also be specified for the input (i.e., using the options ss_in, t_in, and to_in). The input options, especially ss_in, may run faster but are potentially less accurate. See the FFmpeg documentation for an explanation.

>>> url = 'myvideo.mp4'

>>> # read only the first second
>>> fs, F = ffmpegio.video.read(url, t=1.0)

>>> # read from the 1.2-second mark to the 2.5-second mark
>>> fs, F = ffmpegio.video.read(url, ss=1.2, to=2.5)
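How the three options combine into a read window can be sketched in plain Python. The function resolve_time_range is a hypothetical illustration, not an ffmpegio API; it assumes the FFmpeg rule that the duration takes priority when both a duration and an end time are given.

```python
def resolve_time_range(ss=None, t=None, to=None):
    """Return (start, end) in seconds for the given timing options.

    Hypothetical helper for illustration; end is None when the read
    runs to the end of the stream.
    """
    start = 0.0 if ss is None else float(ss)
    if t is not None:  # duration takes priority over end time
        return start, start + float(t)
    if to is not None:
        return start, float(to)
    return start, None


print(resolve_time_range(t=1.0))                  # (0.0, 1.0)
print(resolve_time_range(ss=1.2, to=2.5))         # (1.2, 2.5)
print(resolve_time_range(ss=1.0, t=0.5, to=2.5))  # (1.0, 1.5)
```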

To specify the range by frame numbers for video or sample numbers for audio, you must convert the units to seconds using the rates reported by the probe functions. For example:

>>> # get frame rate of the (first) video stream
>>> info = ffmpegio.probe.video_streams_basic('myvideo.mp4')
>>> fs = info[0]['frame_rate']

>>> # read 30 frames starting from the 11th frame (remember Python uses 0-based indexing)
>>> with ffmpegio.open('myvideo.mp4', 'rv', ss=10/fs, t=30/fs) as f:
...     frame = f.read()
...     # do your thing with the frame data

Likewise, for an audio input stream:

>>> # get sampling rate of the (first) audio stream
>>> info = ffmpegio.probe.audio_streams_basic('myaudio.wav')
>>> fs = info[0]['sample_rate']

>>> # read the first 10000 audio samples
>>> fs, x = ffmpegio.audio.read('myaudio.wav', t=10000/fs)
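The unit conversion used in the two examples above can be collected into one small function. This is a minimal sketch; frames_to_time_range is a hypothetical name, not part of ffmpegio, and it works equally for video frames and audio samples because both divide a count by a rate.

```python
from fractions import Fraction


def frames_to_time_range(first_frame, nframes, frame_rate):
    """Convert a 0-based frame index and a frame count to (ss, t) in seconds.

    Hypothetical helper for illustration; not part of ffmpegio.
    """
    rate = Fraction(frame_rate)
    ss = float(first_frame / rate)  # start time of the first requested frame
    t = float(nframes / rate)       # duration spanning nframes frames
    return ss, t


# 30 frames starting from the 11th frame (index 10) at 30000/1001 frames/second
ss, t = frames_to_time_range(10, 30, Fraction(30000, 1001))
print(round(ss, 6), round(t, 6))  # 0.333667 1.001

# the first 10000 samples of audio at 44100 samples/second
_, t_audio = frames_to_time_range(0, 10000, 44100)
print(round(t_audio, 6))  # 0.226757
```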

Specify Output Frame/Sample Size

FFmpeg lets you change the video size or the number of audio channels via the output options s and ac, respectively, without setting up a filtergraph. For example,

>>> # auto-scale video frame
>>> fs, F = ffmpegio.video.read('myvideo.mp4', t=1.0) # natively 320x240
>>> F.shape
(30, 240, 320, 3)

>>> # halve the size
>>> width = 160
>>> height = 120
>>> _, G = ffmpegio.video.read('myvideo.mp4', t=1.0, s=(width,height))
>>> G.shape
(29, 120, 160, 3)

>>> # auto-convert to mono
>>> fs, x = ffmpegio.audio.read('myaudio.wav') # natively stereo
>>> _, y = ffmpegio.audio.read('myaudio.wav', ac=1) # to mono
>>> x.shape
(44100, 2)
>>> y.shape
(44100, 1)

To customize the conversion, use the vf output option with the scale filter, or the af output option with channelmap, pan, or another channel mixing filter.

Specify Sample Formats

FFmpeg can also convert the formats of video pixels and sound samples on the fly. This feature is enabled in ffmpegio via output options pix_fmt for video and sample_fmt for audio.

Video pix_fmt Option Values

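=======  ========================================
pix_fmt  Description
=======  ========================================
gray     grayscale
ya8      grayscale with transparent alpha channel
rgb24    RGB
rgba     RGB with transparent alpha channel
=======  ========================================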

Audio sample_fmt Option Values

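==========  ===============================  ===========  ==========
sample_fmt  Description                      min          max
==========  ===============================  ===========  ==========
u8          unsigned 8-bit integer           0            255
s16         signed 16-bit integer            -32768       32767
s32         signed 32-bit integer            -2147483648  2147483647
flt         single-precision floating point  -1.0         1.0
dbl         double-precision floating point  -1.0         1.0
==========  ===============================  ===========  ==========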

For example,

>>> # auto-convert video frames to grayscale
>>> fs, RGB = ffmpegio.video.read('myvideo.mp4', t=1.0) # natively rgb24
>>> _, GRAY = ffmpegio.video.read('myvideo.mp4', t=1.0, pix_fmt='gray')
>>> RGB.shape
(29, 640, 480, 3)
>>> GRAY.shape
(29, 640, 480, 1)

>>> # auto-convert PNG image to remove transparency with white background
>>> RGBA = ffmpegio.image.read('myimage.png') # natively rgba with transparency
>>> RGB = ffmpegio.image.read('myimage.png', pix_fmt='rgb24', fill_color='white')
>>> RGBA.shape
(100, 396, 4)
>>> RGB.shape
(100, 396, 3)

>>> # auto-convert audio samples to double precision
>>> fs, x = ffmpegio.audio.read('myaudio.wav') # natively s16
>>> _, y = ffmpegio.audio.read('myaudio.wav', sample_fmt='dbl')
>>> x.max()
2324
>>> y.max()
0.0709228515625
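The two maxima above are related by a fixed scale factor. The sketch below assumes the s16-to-dbl conversion divides each sample by 2**15 = 32768, which is consistent with the values shown; s16_to_dbl is a hypothetical name used only for illustration.

```python
def s16_to_dbl(sample):
    """Scale a signed 16-bit integer sample to a float in [-1.0, 1.0)."""
    return sample / 32768  # 2**15, the magnitude of the most negative s16 value


print(s16_to_dbl(2324))    # 0.0709228515625
print(s16_to_dbl(-32768))  # -1.0
print(s16_to_dbl(32767))   # 0.999969482421875
```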

Note that when converting from an image with an alpha channel (FFmpeg does not support alpha channel in video input), the background color may be specified with the fill_color option (which defaults to 'white'). See the FFmpeg color specification for the list of predefined color names.

Examples of changing image format

'rgba' (original)

_images/quick-1.png
ffmpegio.image.read('ffmpeg-logo.png')

'rgb24' with ‘Linen’ background

_images/quick-2.png
ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='rgb24', fill_color='linen')

'ya8'

_images/quick-3.png
ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='ya8')

'gray' with light gray background

_images/quick-4.png
ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='gray',
    fill_color='#F0F0F0')

Progress Callback

FFmpeg has a -progress option, which sends program-friendly progress information to a URL. ffmpegio takes advantage of this option to let the user monitor the transcoding progress with a callback, which can be set with the progress argument of all media operations. The callback function must have the following signature:

progress_callback(status:dict, done:bool) -> None|bool

The status dict contains information similar to what FFmpeg displays on the console. The second argument, done, is True only on the last progress call. Here is an example of the status dict:

{'bitrate': '61.9kbits/s',
'drop_frames': 0,
'dup_frames': 0,
'fps': 336.18,
'frame': 1014,
'out_time': '00:00:33.877914',
'out_time_ms': 33877914,
'out_time_us': 33877914,
'speed': '11.2x',
'stream_0_0_q': 29.0,
'total_size': 262192}

While FFmpeg does not report percent progress, it is possible to compute it from frame or out_time if you know the total number of output frames or the output duration, respectively.

If an FFmpeg media stream object is opened by ffmpegio.open() with a progress callback argument, the callback function can terminate the FFmpeg execution by returning True. This feature is useful for GUI programming.
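A callback that turns the frame count into a percentage, and optionally requests cancellation, might look like the following minimal sketch. The factory make_progress_callback and its stop_at_percent parameter are hypothetical names for illustration; the total frame count is assumed to be known in advance, for example from the probed duration and frame rate.

```python
def make_progress_callback(total_frames, stop_at_percent=None):
    """Build a callback with the progress_callback(status, done) signature.

    Hypothetical factory for illustration; not part of ffmpegio.
    """
    def progress_callback(status, done):
        # 'frame' is the number of frames processed so far
        percent = 100.0 * status.get('frame', 0) / total_frames
        print(f"{min(percent, 100.0):5.1f}%" + (" (done)" if done else ""))
        # Returning True asks a stream opened with ffmpegio.open() to stop
        return stop_at_percent is not None and percent >= stop_at_percent

    return progress_callback


callback = make_progress_callback(total_frames=2028)
cancel = callback({'frame': 1014, 'fps': 336.18}, False)  # prints " 50.0%"
```

The resulting function would then be passed as the progress argument of a media operation; here it is called by hand with a fragment of the status dict shown earlier.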