.. highlight:: python .. _quick: Quick Start Guide ================= Install ------- To use :py:mod:`ffmpegio`, the package must be installed on Python as well as having the FFmpeg binary files at a location :py:mod:`ffmpegio` can find. Install the full :py:mod:`ffmpegio` package via ``pip``: .. code-block:: bash pip install ffmpegio If `numpy.ndarray` data I/O is not needed, instead use .. code-block:: bash pip install ffmpegio-core If FFmpeg is not installed on your system, please follow the instructions on :ref:`Installation page ` Features -------- FFmpeg can read/write virtually any multimedia file out there, and :code:`ffmpegio` uses the FFmpeg's prowess to perform media I/O (and other) operations in Python. It offers two basic modes of operation: block read/write and stream read/write. For the read operations, it can output data either in a Numpy array or in a plain :code:`bytes`. The Numpy mode is enabled by default if Numpy is available in the system. Another feature of :code:`ffmpegio` is to report the properties of the media files, using FFprobe. Media Probe ----------- To process a media file, you first need to know what's in it. Within FFmpeg ecosystem, this task is handled by `ffprobe `__. :code:`ffmpegio`'s :ref:`ffmpegio:probe` module wraps ffprobe with 5 basic functions: .. code-block:: python >>> import ffmpegio >>> from pprint import pprint >>> url = 'mytestvideo.mpg' >>> format_info = ffmpegio.probe.format_basic(url) >>> pprint(format_info) {'duration': 66.403256, 'filename': 'mytestvideo.mpg', 'format_name': 'mpegts', 'nb_streams': 2, 'start_time': 0.0} >>> stream_info = ffmpegio.probe.streams_basic(url) >>> pprint(stream_info) [{'codec_name': 'mp2', 'codec_type': 'audio', 'index': 0}, {'codec_name': 'h264', 'codec_type': 'video', 'index': 1}] >>> vst_info = ffmpegio.probe.video_streams_basic(url) >>> pprint(vst_info) [{'codec_name': 'h264', 'display_aspect_ratio': Fraction(22, 15), 'duration': 66.39972222222222, 'frame_rate': Fraction(15000, 1001), 'height': 240, 'index': 1, 'pix_fmt': 'yuv420p', 'sample_aspect_ratio': Fraction(1, 1), 'start_time': 0.0, 'width': 352}] >>> ast_info = ffmpegio.probe.audio_streams_basic(url) >>> pprint(ast_info) [{'channel_layout': 'stereo', 'channels': 2, 'codec_name': 'mp2', 'duration': 66.40325555555556, 'index': 0, 'nb_samples': 2928384, 'sample_fmt': 'fltp', 'sample_rate': 44100, 'start_time': 0.0}] To obtain the complete ffprobe output, use :py:func:`ffmpegio.probe.full_details`, and to obtain specific format or stream fields, use :py:func:`ffmpegio.probe.query`. For more information on :py:mod:`probe`, see :ref:`probe`. Block Read/Write ---------------- Suppose you need to analyze short audio data in :code:`mytestfile.mp3`, you can read all its samples by .. code-block:: python >>> fs, x = ffmpegio.audio.read('mytestfile.wav') It returns the sampling rate :code:`fs` and :py:class:`numpy.ndarray` :code:`x`. The audio data is always represetned by a 2-D array, each of which column represents an audio channel. So, a 2-second stereo recording at 8000 samples/second yields :code:`x.shape` to be :code:`(16000,2)`. Also, the sample format is preserved: If the samples in the wav file is 16-bit, :code:`x` is of :code:`numpy.int16` dtype. Now, you've processed this audio data and produced the 8000-sample 1-D array :code:`y` at reduced sampling rate at 4000-samples/second. You want to save this new audio data as FLAC file. To do so, you run: .. code-block:: python >>> ffmpegio.audio.write('myoutput.flac', 4000, y) There are video counterparts to these two functions: .. code-block:: python >>> fs, F = ffmpegio.video.read('mytestvideo.mp4') >>> ffmpegio.video.write('myoutput.avi', fs, F) Let's suppose :code:`mytestvideo.mp4` is 10 seconds long, containing a :code:`yuv420p`-encoded color video stream with the frame size of 640x480 pixels, and the frame rate of 29.97 (30000/1001) frames/second. Then, the :py:func:`video.read` returns a 2-element tuple: the first element :code:`fs` is the frame rate in :py:class:`fractions.Fraction` and the second element :code:`F` contains all the frames of the video in :py:class:`numpy.ndarray` with shape :code:`(299, 480, 640, 3)`. Because the video is in color, each pixel is represented in 24-bit RGB, thus :code:`F.dtype` is :code:`numpy.uint8`. The video write is the reciprocal of the read operation. For image (or single video frame) I/O, there is a pair of functions as well: .. code-block:: python >>> I = ffmpegio.image.read('myimage.png') >>> ffmpegio.image.write('myoutput.bmp', I) The image data :code:`I` is like the video frame data, but without the leading dimension. .. _quick-streamio: Stream Read/Write ----------------- Block read/write is simple and convenient for a short file, but it quickly becomes slow and inefficient as the data size grows; this is especially true for video. To enable on-demand data retrieval, :code:`ffmpegio` offers stream read/write operation. It mimics the familiar Python's file I/O with :py:func:`ffmpegio.open()`: .. code-block:: python >>> with ffmpegio.open('mytestvideo.mp4', 'rv') as f: # opens the first video stream >>> print(f.rate) # frame rate fraction in frames/second >>> F = f.read() # read the first frame >>> F = f.read(5) # read the next 5 frames at once Another example, which uses read and write streams simultaneously: .. code-block:: python >>> with ffmpegio.open('mytestvideo.mp4', 'rv', blocksize=100) as f, >>> ffmpegio.open('myoutput.avi', 'wv', f.rate) as g: >>> for frames in f: # iterates over all frames, 100 frames at a time >>> output = my_processor(frames) # function to process data >>> g.write(output) # send the processed frames to 'myoutput.avi' By default, :code:`ffmpegio.open()` opens the first media stream available to read. However, the operation mode can be specified via the :code:`mode` second argument. The above example, opens :code:`mytestvideo.mp4` file in :code:`'rv'` or "read video" mode and :code:`myoutput.avi` in :code:`'wv'` or "write video" mode. The file reader object :code:`f` is an Iterable object, which returns the next set of frames (the number set by the :code:`blocksize` argument). For more, see :py:func:`ffmpegio.open`. Specify Read Time Range ----------------------- For both block and stream read operations, you can specify the time range to read data from. There are four options available: .. table:: Read Timing Options :class: tight-table ============= ======================================================================== Name Description ============= ======================================================================== :code:`ss` Start time in seconds :code:`t` Duration in seconds :code:`to` End time in seconds (ignored if :code:`t_in` is also specified) ============= ======================================================================== Note it is also possible to specify these timing options for the input (i.e., using the options :code:`ss_in`, :code:`t_in`, and :code:`to_in`). The input options, especially :code:`ss_in`, may run faster but potentially less accurate. See `FFmpeg documentation `__ for the explanation. .. code-block:: python >>> url = 'myvideo.mp4' >>> #read only the first 1 seconds >>> fs, F = ffmpegio.video.read(url, t=1.0) >>> #read from 1.2 second mark to 2.5 second mark >>> fs, F = ffmpegio.video.read(url, t=1.2, to=2.5) To specify by the frame numbers for video and sample numbers for audio, user must convert the units to seconds using :py:func:`probe`. For example: .. code-block:: python >>> # get frame rate of the (first) video stream >>> info = ffmpegio.probe.video_streams_basic('myvideo.mp4') >>> fs = info[0]['frame_rate'] >>> #read 30 frame from the 11th frame (remember Python uses 0-based index) >>> with ffmpegio.open('myvideo.mp4', 'rv', t=10/fs, t=30/fs) as f: >>> frame = f.read() >>> # do your thing with the frame data Likewise, for an audio input stream: .. code-block:: python >>> # get sampling rate of the (first) audio stream >>> info = ffmpegio.probe.audio_streams_basic('myaudio.wav') >>> fs = info[0]['sample_rate'] >>> #read first 10000 audio samples >>> fs, x = ffmpegio.audio.read('myaudio.wav', t=10000/fs) Specify Output Frame/Sample Size -------------------------------- FFmpeg let you change video size or the number of audio channels via output options :code:`s` and :code:`ac`, respectively, without setting up a filtergraph. For example, .. code-block:: python >>> # auto-scale video frame >>> fs, F = ffmpegio.video.read('myvideo.mp4', t=1.0) # natively 320x240 >>> F.shape (30, 240, 320, 3) >>> # halve the size >>> width = 160 >>> height = 120 >>> _, G = ffmpegio.video.read('myvideo.mp4', t=1.0, s=(width,height)) >>> G.shape (29, 120, 160, 3) >>> # auto-convert to mono >>> fs, x = ffmpegio.audio.read('myaudio.wav') # natively stereo >>> _, y = ffmpegio.audio.read('myaudio.wav', ac=1) # to mono >>> x.shape (44100, 2) >>> y.shape (44100, 1) To customize the conversion configuration, use :code:`vf` output option with with :code:`scale` filter or :code:`af` output option with :code:`channelmap` or :code:`pan` or other channel mixing filter Specify Sample Formats ---------------------- FFmpeg can also convert the formats of video pixels and sound samples on the fly. This feature is enabled in :py:mod:`ffmpegio` via output options :code:`pix_fmt` for video and :code:`sample_fmt` for audio. .. table:: Video :code:`pix_fmt` Option Values :class: tight-table =============== ======================================== :code:`pix_fmt` Description =============== ======================================== :code:`gray` grayscale :code:`ya8` grayscale with transparent alpha channel :code:`rgb24` RGB :code:`rgba` RGB with alpha transparent alpha channel =============== ======================================== .. table:: Audio :code:`sample_fmt` Option Values :class: tight-table ================== =============================== =========== ========== :code:`sample_fmt` Description min max ================== =============================== =========== ========== :code:`u8` unsigned 8-bit integer 0 255 :code:`s16` signed 16-bit integer -32768 32767 :code:`s32` signed 32-bit integer -2147483648 2147483647 :code:`flt` single-precision floating point -1.0 1.0 :code:`dbl` double-precision floating point -1.0 1.0 ================== =============================== =========== ========== .. highlight:: python For example, .. code-block:: python >>> # auto-convert video frames to grayscale >>> fs, RGB = ffmpegio.video.read('myvideo.mp4', t=1.0) # natively rgb24 >>> _, GRAY = ffmpegio.video.read('myvideo.mp4', t=1.0, pix_fmt='gray') >>> RGB.shape (29, 640, 480, 3) >>> GRAY.shape (29, 640, 480, 1) >>> # auto-convert PNG image to remove transparency with white background >>> RGBA = ffmpegio.image.read('myimage.png') # natively rgba with transparency .. >>> RGB = ffmpegio.image.read('myimage.png', pix_fmt='rgb24', fill_color='white') >>> RGB.shape (100, 396, 4) >>> RGB.shape (100, 396, 3) >>> # auto-convert to audio samples to double precision >>> fs, x = ffmpegio.audio.read('myaudio.wav') # natively s16 >>> _, y = ffmpegio.audio.read('myaudio.wav', sample_fmt='dbl') >>> x.max() 2324 >>> y.max() 0.0709228515625 Note when converting from an image with alpha channel (FFmpeg does not support alpha channel in video input) the background color may be specified with :code:`fill_color` option (which defaults to ``'white'``). See `the FFmpeg color specification `__ for the list of predefined color names. .. list-table:: Examples of changing image format :class: tight-table * - :code:`'rgba'` (original) - .. plot:: IM = ffmpegio.image.read('ffmpeg-logo.png') plt.figure(figsize=(IM.shape[1]/96, IM.shape[0]/96), dpi=96) plt.imshow(IM) plt.gca().set_position((0, 0, 1, 1)) plt.axis('off') .. code-block:: python ffmpegio.image.read('ffmpeg-logo.png') * - :code:`'rgb24'` with 'Linen' background - .. plot:: IM = ffmpegio.image.read('ffmpeg-logo.png') plt.figure(figsize=(IM.shape[1]/96, IM.shape[0]/96), dpi=96) plt.imshow(IM) plt.gca().set_position((0, 0, 1, 1)) plt.axis('off') .. code-block:: python ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='rgb24', fill_color='linen') * - :code:`'ya8'` - .. plot:: IM = ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='ya8') plt.figure(figsize=(IM.shape[1]/96, IM.shape[0]/96), dpi=96) plt.imshow(IM[...,0], alpha=IM[...,1]/255, cmap='gray') plt.gca().set_position((0, 0, 1, 1)) plt.axis('off') .. code-block:: python ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='ya8') * - :code:`'gray'` with light gray background - .. plot:: IM = ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='gray', fill_color='#F0F0F0') plt.figure(figsize=(IM.shape[1]/96, IM.shape[0]/96), dpi=96) plt.imshow(IM, cmap='gray') plt.gca().set_position((0, 0, 1, 1)) plt.axis('off') .. code-block:: python ffmpegio.image.read('ffmpeg-logo.png', pix_fmt='gray', fill_color='#F0F0F0') .. _quick-callback: Progress Callback ----------------- FFmpeg has :code:`-progress` option, which sends program-friendly progress information to url. :py:mod:`ffmpegio` takes advantage of this option to let user monitor the transcoding progress with a callback, which could be set with :code:`progress` argument of all media operations. The callback function must have the following signature: .. code-block:: python progress_callback(status:dict, done:bool) -> None|bool The :code:`status` dict containing the information similar to what FFmpeg displays on console. The second argument :code:`done` is only :code:`True` on the last progress call. Here is an example of :code:`status` dict: .. code-block:: python {'bitrate': '61.9kbits/s', 'drop_frames': 0, 'dup_frames': 0, 'fps': 336.18, 'frame': 1014, 'out_time': '00:00:33.877914', 'out_time_ms': 33877914, 'out_time_us': 33877914, 'speed': '11.2x', 'stream_0_0_q': 29.0, 'total_size': 262192} While FFmpeg does not report percent progress, it is possible to compute it from :code:`frame` or :code:`out_time` if you know the total number of output frames or the output duration, respectively. If an FFmpeg media stream object is invoked by :py:func:`ffmpegio.open` with :code:`progress` callback argument, the callback function can terminate the FFmpeg execution by returning :code:`True`. This feature is useful for GUI programming.