Capture – push – forward via SRS server – pull

Audio: microphone capture – effects/voice change – encode (compressed data) – RTMP push
Video: camera capture (RGB/YUV) – image processing – H.264 encode – RTMP push

Pull: WebRTC / HLS / HTTP-FLV
Decode + audio/video sync → output

Containers:

MPEG-4 is a family of audio/video compression and coding standards.

H.264 is a video compression algorithm.

MP4 is an audio/video container format; its contents do not have to follow MPEG-4.

AVI: the compression standard used inside can be chosen freely.

FLV: a streaming container format, commonly used for live streaming.

TS: a streaming container format, used for TV broadcasting.

Encoding

Video encoding

Video codecs:
H.264
WMV
Xvid
MJPEG

Audio encoding

Audio codecs:
AAC
MP3
APE
FLAC

Decoding:

Video is decoded to YUV → converted to RGB for display (on the GPU)


YUV: Y is the luminance (gray value); U and V carry the chrominance.

Resampling:

Pixel formats:

PCM audio parameters

Parameters:
sample rate sample_rate (samples captured per second)
channels (e.g. left/right)
sample size sample_size
sample format

The H.264/AVC video coding standard

FFmpeg directory structure

libavformat: generation and parsing of the various audio/video container formats, including reading the information needed to build a decoding context and reading audio/video frames; contains the demuxer and muxer libraries.
libavcodec: encoding and decoding of the various audio and image codec types.
libavutil: common utility functions.
libswscale: video scaling and color-space/pixel-format conversion.
libpostproc: post-processing effects.

ffmpeg: a command-line tool for converting media files between formats; it also supports real-time encoding from a TV card.
ffserver: an HTTP multimedia real-time broadcast streaming server with time-shifting support.
ffplay: a simple player that parses and decodes with the ffmpeg libraries and displays via SDL.
ffprobe: gathers information about a media file or stream and prints it in human- and machine-readable form.

Demuxing

```c
av_register_all();        // register all (de)muxers; deprecated since FFmpeg 4.0
avformat_network_init();  // required for network protocols such as RTMP/HTTP
```

```c
// Open a local file or a network stream
int avformat_open_input(AVFormatContext **ps, const char *url,
                        AVInputFormat *fmt, AVDictionary **options);
```
AVFormatContext **ps — the demuxer context (allocated automatically if *ps is NULL)
const char *url — local file path / remote URL
AVInputFormat *fmt — force a specific input format (NULL to autodetect)
AVDictionary **options — demuxer options (NULL for defaults)

AVFormatContext

```c
avformat_alloc_context();  // allocate the demuxer context
```

```c
AVFormatContext
{
    AVIOContext *pb;
    char *url;
    unsigned int nb_streams;
    AVStream **streams;
    int64_t duration;  // in AV_TIME_BASE units
    int64_t bit_rate;
}
```

```c
void avformat_close_input(AVFormatContext **s);  // close the input and free the context
```

Printing media information

```c
/**
 * Print detailed information about the input or output format, such as
 * duration, bitrate, streams, container, programs, metadata, side data,
 * codec and time base.
 *
 * @param ic        the context to analyze
 * @param index     index of the stream to dump information about
 * @param url       the URL to print, such as source or destination file
 * @param is_output Select whether the specified context is an input(0) or output(1)
 */
void av_dump_format(AVFormatContext *ic,
                    int index,
                    const char *url,
                    int is_output);
```

AVStream-related APIs

```c
avformat_find_stream_info();  // probe the file format and build stream indexes
av_find_best_stream();        // pick the best stream of a given media type
```

```c
AVStream {
    AVCodecContext *codec;
    AVRational time_base;
    int64_t duration;
    int64_t nb_frames;
    AVRational avg_frame_rate;
    AVCodecParameters *codecpar;
}
```
```c
AVCodecParameters
{
    /**
     * General type of the encoded data.
     */
    enum AVMediaType codec_type;
    /**
     * Specific type of the encoded data (the codec used).
     */
    enum AVCodecID codec_id;
    /**
     * Additional information about the codec (corresponds to the AVI FOURCC).
     */
    uint32_t codec_tag;
    /**
     * - video: the pixel format, the value corresponds to enum AVPixelFormat.
     * - audio: the sample format, the value corresponds to enum AVSampleFormat.
     */
    int format;
    /**
     * Video only. The dimensions of the video frame in pixels.
     */
    int width;
    int height;
    /**
     * Audio only. The channel layout bitmask. May be 0 if the channel layout is
     * unknown or unspecified, otherwise the number of bits set must be equal to
     * the channels field.
     * @deprecated use ch_layout
     */
    attribute_deprecated
    uint64_t channel_layout;
    int channels;
    /**
     * Audio only. The number of audio samples per second.
     */
    int sample_rate;
    /**
     * Audio only. Audio frame size, if known. Required by some formats to be static.
     */
    int frame_size;
}
```
```c
enum AVMediaType {
    AVMEDIA_TYPE_UNKNOWN = -1,  ///< Usually treated as AVMEDIA_TYPE_DATA
    AVMEDIA_TYPE_VIDEO,
    AVMEDIA_TYPE_AUDIO,
    AVMEDIA_TYPE_DATA,          ///< Opaque data information usually continuous
    AVMEDIA_TYPE_SUBTITLE,
    AVMEDIA_TYPE_ATTACHMENT,    ///< Opaque data information usually sparse
    AVMEDIA_TYPE_NB
};
```
```c
// Audio sample formats
enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,    ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,   ///< signed 16 bits
    AV_SAMPLE_FMT_S32,   ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,   ///< float
    AV_SAMPLE_FMT_DBL,   ///< double

    AV_SAMPLE_FMT_U8P,   ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,  ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,  ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,  ///< float, planar
    AV_SAMPLE_FMT_DBLP,  ///< double, planar
    AV_SAMPLE_FMT_S64,   ///< signed 64 bits
    AV_SAMPLE_FMT_S64P,  ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB     ///< Number of sample formats. DO NOT USE if linking dynamically
};
```
```c
enum AVCodecID {
    AV_CODEC_ID_NONE,
    AV_CODEC_ID_MJPEG,
    AV_CODEC_ID_H264,
    ......
}
```

Reading data

```c
int av_read_frame(AVFormatContext *s, AVPacket *pkt);  // read the next packet
```
```c
typedef struct AVPacket {
    /**
     * A reference to the reference-counted buffer where the packet data is stored.
     * May be NULL, then the packet data is not reference-counted.
     */
    AVBufferRef *buf;
    /**
     * Presentation timestamp in AVStream->time_base units; the time at which
     * the decompressed packet will be presented to the user.
     * Can be AV_NOPTS_VALUE if it is not stored in the file.
     * pts MUST be larger or equal to dts as presentation cannot happen before
     * decompression, unless one wants to view hex dumps. Some formats misuse
     * the terms dts and pts/cts to mean something different. Such timestamps
     * must be converted to true pts/dts before they are stored in AVPacket.
     */
    int64_t pts;
    /**
     * Decompression timestamp in AVStream->time_base units; the time at which
     * the packet is decompressed.
     * Can be AV_NOPTS_VALUE if it is not stored in the file.
     */
    int64_t dts;
    uint8_t *data;
    int size;
    int stream_index;
    /**
     * A combination of AV_PKT_FLAG values
     */
    int flags;
    /**
     * Additional packet data that can be provided by the container.
     * Packet can contain several types of side information.
     */
    AVPacketSideData *side_data;
    int side_data_elems;

    /**
     * Duration of this packet in AVStream->time_base units, 0 if unknown.
     * Equals next_pts - this_pts in presentation order.
     */
    int64_t duration;

    int64_t pos;  ///< byte position in stream, -1 if unknown

    /**
     * for some private data of the user
     */
    void *opaque;

    /**
     * AVBufferRef for free use by the API user. FFmpeg will never check the
     * contents of the buffer ref. FFmpeg calls av_buffer_unref() on it when
     * the packet is unreferenced. av_packet_copy_props() calls create a new
     * reference with av_buffer_ref() for the target packet's opaque_ref field.
     *
     * This is unrelated to the opaque field, although it serves a similar
     * purpose.
     */
    AVBufferRef *opaque_ref;

    /**
     * Time base of the packet's timestamps.
     * In the future, this field may be set on packets output by encoders or
     * demuxers, but its value will be by default ignored on input to decoders
     * or muxers.
     */
    AVRational time_base;
} AVPacket;
```

AVPacket

```c
// Initialization
AVPacket *av_packet_alloc(void);                               // allocate and initialize
void av_init_packet(AVPacket *pkt);                            // initialize a pre-allocated packet
int av_packet_from_data(AVPacket *pkt, uint8_t *data, int size);

// Copying
AVPacket *av_packet_clone(const AVPacket *src);
```
```c
/**
 * Setup a new reference to the data described by a given packet
 *
 * If src is reference-counted, setup dst as a new reference to the
 * buffer in src. Otherwise allocate a new buffer in dst and copy the
 * data from src into it.
 *
 * All the other fields are copied from src.
 *
 * @see av_packet_unref
 *
 * @param dst Destination packet. Will be completely overwritten.
 * @param src Source packet
 *
 * @return 0 on success, a negative AVERROR on error. On error, dst
 *         will be blank (as if returned by av_packet_alloc()).
 */
int av_packet_ref(AVPacket *dst, const AVPacket *src);  // add a reference
```

Dropping a reference

```c
/**
 * Wipe the packet.
 *
 * Unreference the buffer referenced by the packet and reset the
 * remaining packet fields to their default values.
 *
 * @param pkt The packet to be unreferenced.
 */
void av_packet_unref(AVPacket *pkt);
```

Freeing the packet

```c
/**
 * Free the packet, if the packet is reference counted, it will be
 * unreferenced first.
 *
 * @param pkt packet to be freed. The pointer will be set to NULL.
 * @note passing NULL is a no-op.
 */
void av_packet_free(AVPacket **pkt);
```

seek

```c
/**
 * Seek to the keyframe at timestamp.
 * 'timestamp' in 'stream_index'.
 *
 * @param s            media file handle
 * @param stream_index If stream_index is (-1), a default stream is selected,
 *                     and timestamp is automatically converted from
 *                     AV_TIME_BASE units to the stream specific time_base.
 * @param timestamp    Timestamp in AVStream.time_base units or, if no stream
 *                     is specified, in AV_TIME_BASE units.
 * @param flags        flags which select direction and seeking mode
 *
 * @return >= 0 on success
 */
int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp, int flags);
```


Decoding

Finding a decoder


The decoder context


AVFrame


AVFrame fields


linesize: an array of per-plane row strides (bytes per line for each plane)


Sending packets and receiving frames

avcodec_send_packet()


Pushes the packet into the decoder's input queue.

avcodec_receive_frame()


Pulls decoded data from the frame queue.

Pixel format conversion

```c
/**
 * Allocate and return an SwsContext. You need it to perform
 * scaling/conversion operations using sws_scale().
 *
 * @param srcW      the width of the source image
 * @param srcH      the height of the source image
 * @param srcFormat the source image format
 * @param dstW      the width of the destination image
 * @param dstH      the height of the destination image
 * @param dstFormat the destination image format
 * @param flags     specify which algorithm and options to use for rescaling
 * @param param     extra parameters to tune the used scaler
 *                  For SWS_BICUBIC param[0] and [1] tune the shape of the basis
 *                  function, param[0] tunes f(1) and param[1] f'(1)
 *                  For SWS_GAUSS param[0] tunes the exponent and thus cutoff
 *                  frequency
 *                  For SWS_LANCZOS param[0] tunes the width of the window function
 * @return a pointer to an allocated context, or NULL in case of error
 * @note this function is to be removed after a saner alternative is
 *       written
 */
struct SwsContext *sws_getContext(int srcW, int srcH, enum AVPixelFormat srcFormat,
                                  int dstW, int dstH, enum AVPixelFormat dstFormat,
                                  int flags, SwsFilter *srcFilter,
                                  SwsFilter *dstFilter, const double *param);
```

The flags parameter selects the scaling algorithm, e.g. SWS_FAST_BILINEAR, SWS_BILINEAR, SWS_BICUBIC, SWS_LANCZOS (trading speed against quality).

```c
/**
 * Check if context can be reused, otherwise reallocate a new one.
 *
 * If context is NULL, just calls sws_getContext() to get a new
 * context. Otherwise, checks if the parameters are the ones already
 * saved in context. If that is the case, returns the current
 * context. Otherwise, frees context and gets a new context with
 * the new parameters.
 *
 * Be warned that srcFilter and dstFilter are not checked, they
 * are assumed to remain the same.
 */
struct SwsContext *sws_getCachedContext(struct SwsContext *context,
                                        int srcW, int srcH, enum AVPixelFormat srcFormat,
                                        int dstW, int dstH, enum AVPixelFormat dstFormat,
                                        int flags, SwsFilter *srcFilter,
                                        SwsFilter *dstFilter, const double *param);
```
```c
/**
 * Scale the image slice in srcSlice and put the resulting scaled
 * slice in the image in dst. A slice is a sequence of consecutive
 * rows in an image.
 *
 * Slices have to be provided in sequential order, either in
 * top-bottom or bottom-top order. If slices are provided in
 * non-sequential order the behavior of the function is undefined.
 *
 * @param c         the scaling context previously created with
 *                  sws_getContext()
 * @param srcSlice  the array containing the pointers to the planes of
 *                  the source slice
 * @param srcStride the array containing the strides for each plane of
 *                  the source image
 * @param srcSliceY the position in the source image of the slice to
 *                  process, that is the number (counted starting from
 *                  zero) in the image of the first row of the slice
 * @param srcSliceH the height of the source slice, that is the number
 *                  of rows in the slice
 * @param dst       the array containing the pointers to the planes of
 *                  the destination image
 * @param dstStride the array containing the strides for each plane of
 *                  the destination image
 * @return          the height of the output slice
 */
int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[],
              const int srcStride[], int srcSliceY, int srcSliceH,
              uint8_t *const dst[], const int dstStride[]);
```

Audio resampling


```c
/**
 * Allocate SwrContext if needed and set/reset common parameters.
 *
 * This function does not require s to be allocated with swr_alloc(). On the
 * other hand, swr_alloc() can use swr_alloc_set_opts() to set the parameters
 * on the allocated context.
 *
 * @param s               existing Swr context if available, or NULL if not
 * @param out_ch_layout   output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  output sample format (AV_SAMPLE_FMT_*).
 * @param out_sample_rate output sample rate (frequency in Hz)
 * @param in_ch_layout    input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*).
 * @param in_sample_rate  input sample rate (frequency in Hz)
 * @param log_offset      logging level offset
 * @param log_ctx         parent logging context, can be NULL
 *
 * @see swr_init(), swr_free()
 * @return NULL on error, allocated context otherwise
 * @deprecated use @ref swr_alloc_set_opts2()
 */
attribute_deprecated
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
        int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
        int64_t in_ch_layout, enum AVSampleFormat in_sample_fmt, int in_sample_rate,
        int log_offset, void *log_ctx);
```