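Chapter 12: MPEG Video Coding II

Contents
- MPEG-4 overview
- Visual object coding
- Synthetic object coding

MPEG-4 overview

Features of MPEG-4 visual object coding
- Integration: natural and synthetic audio-visual objects are combined.
- Interactivity: selective playback, hyperlinks, and so on.
- Highly efficient compression: 1/5 to 1/10 of the MPEG-2 bit rate at nearly the same quality.

MPEG-4 visual object coding

First-generation video coding
The smallest entity in a picture is a pixel with its associated texture (color) and motion.
Message to be coded for every pixel: texture (color) + motion.

Shortcomings of first-generation video coding
- It differs from the way human vision actually works.
- Individual objects in a scene are hard to control.
- Its potential is limited.

Second-generation video coding
A scene is divided into a set of component objects, and each object is coded separately.
The smallest entity in a picture is an object with its associated shape, texture (color), and motion.
Message to be coded for every object: shape + texture (color) + motion.

The MPEG-4 audio-visual scene

Description of MPEG-4 audio-visual scenes
In MPEG-4, an audio-visual scene is described in an object-based way. The scene is composed of media objects organized hierarchically (as a tree), and the leaf nodes are primitive media objects, for example:
- still images (a fixed background);
- video objects (a talking person without the background);
- audio objects (the voice of that person);
- others, such as text and graphics.
Primitive media objects can be natural or artificial (synthetic), and 2D or 3D.
The BIFS (Binary Format for Scenes) language describes the composition of the scene and the spatio-temporal relationships among its audio-visual objects.

Advantages of MPEG-4 scene description
- Objects of all kinds can be integrated: natural media (captured by microphones, cameras, etc.) and artificial media (computer generated), as well as real-time and stored information, are combined seamlessly. An AVO can be mono, stereo, or multichannel audio, or single-, dual-, or multi-camera 2D/3D video.
- Stronger interactivity: the objects in a scene (a person, a table, a globe, a whiteboard, the person's voice) and the sound of the multimedia presentation are each coded independently as single objects, so the user can interact selectively with one or several of them.
- Good reusability: AVOs (Audio Visual Objects) can be recombined to construct new scenes.

BIFS example

MPEG-4 video stream structure
- Visual Object Sequence (VS)
- Video Object (VO)
- Video Object Layer (VOL)
- Group of VOPs (GOV)
- Video Object Plane (VOP)

VOP coding
A VOP is described by its shape, motion, and texture.

VOP-based motion compensation
MC-based VOP coding in MPEG-4 again involves three steps:
1. Motion estimation.
2. MC-based prediction.
3. Coding of the prediction error.
Only pixels within the current (Target) VOP are considered for matching in MC.
To facilitate MC, each VOP is divided into many macroblocks (MBs). MBs are by default 16×16 in luminance images and 8×8 in chrominance images.

Motion Compensation

Padding
An example of Repetitive Padding in a boundary macroblock of a Reference VOP: (a) original pixels within the VOP, (b) after Horizontal Repetitive Padding, (c) followed by Vertical Repetitive Padding.

Motion Vector
For each macroblock in the Target VOP, a best-matching macroblock is searched for in the Reference VOP.
In the matching criterion, N is the size of the MB, and Map(p, q) = 1 when C(p, q) is a pixel within the Target VOP; otherwise Map(p, q) = 0.
Motion vectors are coded predictively, similar to H.263.

Texture Coding
Texture coding in MPEG-4 can be based on the DCT or on the Shape Adaptive DCT (SA-DCT).
I. Texture coding based on DCT
In an I-VOP, the gray values of the pixels in each MB of the VOP are coded directly using the DCT followed by VLC, similar to what is done in JPEG.
In a P-VOP or B-VOP, MC-based coding is employed; it is the prediction error that is sent to the DCT and VLC.

Texture Coding (cont.)
Coding for Interior MBs:
Each MB is 16×16 in the luminance VOP and 8×8 in the chrominance VOP. Prediction errors from the six 8×8 blocks of each MB are obtained after the conventional motion estimation step.
Coding for Boundary MBs:
After MC, texture prediction errors within the Target VOP are obtained. For the portions of a Boundary MB that lie outside the Target VOP, zeros are padded to the block sent to the DCT, since ideally the prediction errors inside the VOP would be near zero.

Shape Adaptive DCT
- Advantage: no superfluous coefficients are produced.
- Disadvantage: an additional mask is needed to record the original shape.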
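The advantage and disadvantage listed above can be made concrete with a small sketch. It assumes the column-then-row scheme commonly described for SA-DCT: the opaque pixels of each column are shifted to the top and transformed with a DCT of matching length, then the resulting coefficients of each row are shifted to the left and transformed again. The slides do not spell this procedure out, so the function names (`dct_1d`, `sa_dct`, `count_coefficients`), the orthonormal DCT-II scaling, and the 4×4 example block are all illustrative assumptions, not the normative MPEG-4 algorithm.

```python
import math

def dct_1d(values):
    """Orthonormal DCT-II of a list of arbitrary length n >= 1."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, v in enumerate(values))
        out.append(scale * acc)
    return out

def sa_dct(block, mask):
    """Forward SA-DCT sketch (assumed column-then-row scheme).

    block, mask: N x N lists; mask[r][c] == 1 marks a pixel inside the VOP.
    Returns a list of coefficient rows of varying length.
    """
    n = len(block)
    # Step 1: per column, keep only opaque pixels (shift to top), then transform.
    columns = []
    for c in range(n):
        col = [block[r][c] for r in range(n) if mask[r][c]]
        columns.append(dct_1d(col) if col else [])
    # Step 2: per row of intermediate coefficients, shift to left, then transform.
    rows = []
    for r in range(n):
        row = [col[r] for col in columns if len(col) > r]
        if row:
            rows.append(dct_1d(row))
    return rows

def count_coefficients(rows):
    """Total number of coefficients produced."""
    return sum(len(row) for row in rows)

# Hypothetical 4x4 boundary block: 13 of its 16 pixels lie inside the VOP.
block = [[10, 12,  0,  0],
         [11, 13, 14,  0],
         [12, 14, 15, 16],
         [13, 15, 16, 17]]
mask = [[1, 1, 0, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]

coeffs = sa_dct(block, mask)
print("SA-DCT coefficients:", count_coefficients(coeffs))    # 13
print("Zero-padded DCT coefficients:", len(block) ** 2)      # 16
```

Under these assumptions the number of coefficients equals the number of opaque pixels, whereas a zero-padded block always yields a full N×N set. The shifts discard the pixel positions, which is why a decoder would need the shape mask to put the reconstructed pixels back in place.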
Shape Adaptive DCT (SA-DCT) is another texture coding method for boundary MBs. Due to its effectiveness, SA-DCT has been adopted for coding boundary MBs in MPEG-4 Version 2.

Shape Adaptive DCT (cont.)

Shape Coding
MPEG-4 supports two types of shape information: binary and gray scale.
Binary shape information can be in the form of a binary map (also known as a binary alpha map) that is of the same size as the rectangular bounding box of the VOP. A value of 1 (opaque) or 0 (transparent) in the bitmap indicates whether the pixel is inside or outside the VOP.
Alternatively, the gray-scale shape information actually refers to the transparency of the shape, with gray values ranging from 0 (completely transparent) to 255 (opaque).

Sprite Coding
- Before coding, the background image is extracted from a series of video frames and stitched together.
- The background is transmitted only once, with the first frame of the video sequence, and stored in a background buffer; after that, only the 8 parameters describing the camera motion are transmitted.
- Using the 8 parameters, an affine transformation is applied to the background to reconstruct the background of each frame.
- The segmented foreground image is coded as an arbitrarily shaped VO.

Synthetic object coding in MPEG-4

2D Mesh Coding
- Uniform mesh
- Delaunay mesh

Coding of Delaunay Triangulation
Except for the first location $(x_0, y_0)$, all subsequent coordinates are coded differentially; that is, for $n \ge 1$, $dx_n = x_n - x_{n-1}$ and $dy_n = y_n - y_{n-1}$. Afterward, $dx_n$ and $dy_n$ are variable-length coded.

2D Mesh Motion Coding
A new mesh structure can be created only in the Intra-frame, and its triangular topology will not alter in the subsequent Inter-frames, which enforces a one-to-one mapping in 2D mesh motion estimation.
For any MOP triangle $(P_i, P_j, P_k)$, if the motion vectors for $P_i$ and $P_j$ are known to be $MV_i$ and $MV_j$, then a prediction $Pred_k$ is made for the motion vector of $P_k$ and rounded to half-pixel precision:
$Pred_k = 0.5 \cdot (MV_i + MV_j)$
The prediction error $e_k$ is coded as
$e_k = MV_k - Pred_k$

Coding of 3D synthetic objects
Face animation
MPEG-4 defines Facial Definition Parameters (FDP) and Facial Animation Parameters (FAP), and also defines model parameters and animation parameters for the body. The face model in the decoder can produce various movements, such as expressions and speech, driven by the transmitted animation parameters. A specific face can also be generated from a generic face model by downloading the model parameters of that face.
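The two 2D mesh coding rules stated earlier, differential coding of node coordinates and motion-vector prediction within a triangle, can be sketched as follows. This is a minimal illustration, not the standard's bitstream syntax: the function names are hypothetical, the variable-length coding stage is omitted, and the tie-breaking rule used when rounding to half-pixel precision (round half up) is an assumption, since the text only says that the prediction is rounded.

```python
import math

def differential_node_coords(points):
    """Return the first node and the differences (dx_n, dy_n) for n >= 1."""
    first = points[0]
    deltas = [(x - px, y - py)
              for (px, py), (x, y) in zip(points, points[1:])]
    return first, deltas

def round_half_pel(value):
    """Round to the nearest multiple of 0.5 (ties rounded up: an assumption)."""
    return math.floor(value * 2 + 0.5) / 2

def predict_mv(mv_i, mv_j):
    """Pred_k = 0.5 * (MV_i + MV_j), rounded to half-pixel precision."""
    return tuple(round_half_pel(0.5 * (a + b)) for a, b in zip(mv_i, mv_j))

def prediction_error(mv_k, mv_i, mv_j):
    """e_k = MV_k - Pred_k, the quantity that is actually coded."""
    pred = predict_mv(mv_i, mv_j)
    return tuple(k - p for k, p in zip(mv_k, pred))

# Hypothetical mesh nodes and motion vectors, for illustration only.
nodes = [(10, 20), (13, 18), (13, 25)]
first, deltas = differential_node_coords(nodes)
print(first, deltas)

mv_i, mv_j, mv_k = (1.0, 2.0), (2.0, 3.0), (2.0, 2.5)
print(predict_mv(mv_i, mv_j))
print(prediction_error(mv_k, mv_i, mv_j))
```

When neighboring nodes lie close together and move similarly, both the coordinate differences and the prediction errors stay small, which is what makes the subsequent variable-length coding effective.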