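Introduction
The mp4 file format, also known as MPEG-4 Part 14, is defined in Part 14 of the MPEG-4 standard. It is a multimedia container format widely used to package video and audio streams, poster images, subtitles, and metadata. (Incidentally, the currently popular video coding format AVC/H.264 is defined in MPEG-4 Part 10.)
The mp4 file format is based on Apple's QuickTime format, so the QuickTime File Format Specification is also an important reference when studying mp4.
Material on the MP4 file structure
mp4box, a very handy analysis tool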
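Overview
An mp4 file is made up of boxes. Each box has a Header and Data: the Header holds the box's type and size, and the Data holds either child boxes or raw data, so boxes can be nested.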
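The header layout described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual header form: a 32-bit big-endian size, a 4-character type, size 1 meaning a 64-bit size follows the type, and size 0 meaning the box runs to the end of the data. The name `iter_boxes` and the sample bytes are illustrative only.

```python
import struct

def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:                      # 64-bit size stored after the type
            size, = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:                    # box extends to the end of the data
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size {size} at offset {pos}")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size

# Synthetic data: an empty 'free' box, then a 'moov' box holding one 'mvhd' box.
mvhd = struct.pack(">I4s", 12, b"mvhd") + bytes(4)
data = struct.pack(">I4s", 8, b"free") + struct.pack(">I4s", 8 + len(mvhd), b"moov") + mvhd

top = list(iter_boxes(data))
children = list(iter_boxes(data, top[1][1], top[1][2]))   # walk inside 'moov'
print(top, children)
```

Nested boxes are handled by calling the same function again on a parent's payload range, which mirrors how container boxes simply hold further boxes as their data.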
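Figure: basic structure of a typical mp4 file.
The basic building block of an MP4 file is the box: the file consists of many different boxes, some of them parents and some of them children, so the boxes form a hierarchy. The table below summarizes them and marks whether each box is mandatory or optional; √ means the box is mandatory.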
Box | Required | Description
---|---|---
ftyp | √ | file type and compatibility
pdin | | progressive download information
moov | √ | container for all the metadata
mvhd | √ | movie header, overall declarations
trak | √ | container for an individual track or stream
tkhd | √ | track header, overall information about the track (for example the video width and height)
tref | | track reference container
edts | | edit list container
elst | | an edit list
mdia | √ | container for the media information in a track
mdhd | √ | media header, overall information about the media
hdlr | √ | handler, declares the media (handler) type, i.e. how the media is presented
minf | √ | media information container
vmhd | | video media header, overall information (video track only)
hmhd | | hint media header, overall information (hint track only)
nmhd | | Null media header, overall information (some tracks only)
dinf | √ | data information box, container
dref | √ | data reference box, declares source(s) of media data in track (how to locate the media data)
stbl | √ | sample table box, container for the time/space map; holds all timing and location information for the samples in the track plus their codec information, so a parser can work out each sample's timing, type, size, and position within its container
stsd | √ | sample descriptions (codec types, initialization etc.); for video: codec type, width, height, and so on; for audio: channel count, sample rate, and so on
stts | √ | (decoding) time-to-sample; maps time to samples, so the sample for any given time can be found
ctts | | (composition) time to sample
stsc | √ | sample-to-chunk, partial data-offset information; grouping samples into chunks makes data access easier to optimize, and a chunk holds one or more samples
stsz | | sample sizes (framing); the size of each sample. Not ticked here, but still essential for mp4
stz2 | | compact sample sizes (framing)
stco | √ | chunk offset, partial data-offset information; gives the offset of each chunk in the media stream
co64 | | 64-bit chunk offset
stss | | sync sample table (random access points); identifies the key frames in the media
stsh | | shadow sync sample table
padb | | sample padding bits
stdp | | sample degradation priority
sdtp | | independent and disposable samples
sbgp | | sample-to-group
sgpd | | sample group description
subs | | sub-sample information
mvex | | movie extends box
mehd | | movie extends header box
trex | √ | track extends defaults
ipmc | | IPMP Control Box
moof | | movie fragment
mfhd | √ | movie fragment header
traf | | track fragment
tfhd | √ | track fragment header
trun | | track fragment run
sdtp | | independent and disposable samples
sbgp | | sample-to-group
subs | | sub-sample information
mfra | | movie fragment random access
tfra | | track fragment random access
mfro | √ | movie fragment random access offset
mdat | | media data container
free | | free space
skip | | free space
udta | | user-data
cprt | | copyright etc.
meta | | metadata
hdlr | √ | handler, declares the metadata (handler) type
dinf | | data information box, container
dref | | data reference box, declares source(s) of metadata items
ipmc | | IPMP Control Box
iloc | | item location
ipro | | item protection
sinf | | protection scheme information box
frma | | original format box
imif | | IPMP Information box
schm | | scheme type box
schi | | scheme information box
iinf | | item information
xml | | XML container
bxml | | binary XML container
pitm | | primary item reference
fiin | | file delivery item information
paen | | partition entry
fpar | | file partition
fecr | | FEC reservoir
segr | | file delivery session group
gitn | | group id to name
tsel | | track selection
meco | | additional metadata container
mere | | metabox relation
This article uses mediainfo and mp4box for the analysis.
As the figure shows, an mp4 file has several main parts. The file analyzed here is:
2_audio_track_5s
ftyp — File Type Box
Usually located at the very start of the file; it describes the file's version and the specifications it is compatible with.
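As a rough illustration, the ftyp payload can be decoded as a major brand, a minor version, and a list of compatible brands. This is a sketch only: `parse_ftyp` is a made-up helper name, and the example brands are hypothetical because the article does not show the ftyp contents of the sample file.

```python
import struct

def parse_ftyp(payload):
    """Decode the payload of an ftyp box (everything after the 8-byte header)."""
    major, minor = struct.unpack_from(">4sI", payload, 0)
    # The rest of the payload is a sequence of 4-character compatible brands.
    brands = [payload[i:i + 4].decode("latin-1")
              for i in range(8, len(payload) - 3, 4)]
    return major.decode("latin-1"), minor, brands

# Hypothetical payload, not taken from the sample file.
example = b"isom" + struct.pack(">I", 512) + b"isom" + b"iso2" + b"avc1" + b"mp41"
print(parse_ftyp(example))
```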
moov — Movie Box
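The Movie Box holds the overall description of all media data in the file as well as the details of each media track. It is usually placed at the end of the file, but to support playback over HTTP while the file is still downloading, moov has to be moved to the front.
Note that when moov is moved, some values inside it have to be recomputed, for example the chunk offsets in stco, because the media data shifts.
The boxes inside moov are the ones this analysis concentrates on.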
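Whether a file already has moov in front of mdat can be checked by reading only the top-level box headers. The sketch below assumes a seekable binary stream; `top_level_types`, `moov_before_mdat`, and the `box` helper are illustrative names, and the two test files are synthetic.

```python
import io
import struct

def top_level_types(f):
    """Return the types of the top-level boxes of a seekable binary stream, in order."""
    types = []
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    pos = 0
    while pos + 8 <= file_size:
        f.seek(pos)
        size, box_type = struct.unpack(">I4s", f.read(8))
        if size == 1:                       # 64-bit size stored after the type
            size, = struct.unpack(">Q", f.read(8))
        elif size == 0:                     # last box, runs to the end of the file
            size = file_size - pos
        if size < 8:
            raise ValueError(f"bad box size at offset {pos}")
        types.append(box_type.decode("latin-1"))
        pos += size
    return types

def moov_before_mdat(f):
    types = top_level_types(f)
    return ("moov" in types and "mdat" in types
            and types.index("moov") < types.index("mdat"))

def box(box_type, payload=b""):
    """Build a synthetic box for the demonstration below."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

tail_moov = box(b"ftyp", b"isom" + bytes(4)) + box(b"mdat", bytes(32)) + box(b"moov")
fast_start = box(b"ftyp", b"isom" + bytes(4)) + box(b"moov") + box(b"mdat", bytes(32))
print(moov_before_mdat(io.BytesIO(tail_moov)), moov_before_mdat(io.BytesIO(fast_start)))
```

Only headers are read and the payloads are skipped with `seek`, so the check stays cheap even when mdat is very large.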
mdat — Media Data Box
Holds the actual media data.
Moov Insider
An mp4 file's media information is stored mainly in the Movie Box, which is the focus of this analysis. The main components of moov are:
mvhd — Movie Header Box
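The Movie Header Box records information describing the whole media file, such as the creation time, modification time, time scale, and playable duration.
In the dump below, the Duration field is 5016 and the Time scale is 1000, i.e. 5016 ms.
Playback duration of the file: Duration / Time scale = 5016 / 1000 = 5.016 seconds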
14B2D6 Movie header (108 bytes)
14B2D6 Header (8 bytes)
14B2D6 Size: 108 (0x0000006C)
14B2DA Name: mvhd
14B2DE Version: 0 (0x00)
14B2DF Flags: 0 (0x000000)
14B2E2 Creation time: 0 (0x00000000) -
14B2E6 Modification time: 0 (0x00000000) -
14B2EA Time scale: 1000 (0x000003E8) - 1000 Hz
14B2EE Duration: 5016 (0x00001398) - 5016 ms
14B2F2 Preferred rate: 65536 (0x00010000) - 1.000
14B2F6 Preferred volume: 256 (0x0100) - 1.000
14B2F8 Reserved: (10 bytes)
14B302 Matrix structure (36 bytes)
14B302 a (width scale): 1.000
14B306 b (width rotate): 0.000
14B30A u (width angle): 0.000
14B30E c (height rotate): 0.000
14B312 d (height scale): 1.000
14B316 v (height angle): 0.000
14B31A x (position left): 0.000
14B31E y (position top): 0.000
14B322 w (divider): 1.000
14B326 Preview time: 0 (0x00000000)
14B32A Preview duration: 0 (0x00000000)
14B32E Poster time: 0 (0x00000000)
14B332 Selection time: 0 (0x00000000)
14B336 Selection duration: 0 (0x00000000)
14B33A Current time: 0 (0x00000000)
14B33E Next track ID: 4 (0x00000004)
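The duration calculation, and the fixed-point interpretation of the rate and volume fields in the dump above, can be sketched as follows. The sketch handles only a version-0 mvhd, takes the field values from the dump, and uses an illustrative function name, `parse_mvhd_v0`.

```python
import struct

def parse_mvhd_v0(payload):
    """Decode the leading fields of a version-0 mvhd payload (after the box header)."""
    if payload[0] != 0:
        raise ValueError("this sketch only handles version 0 (32-bit times)")
    creation, modification, timescale, duration, rate, volume = struct.unpack_from(
        ">IIIIIH", payload, 4)
    return {
        "timescale": timescale,
        "duration": duration,
        "seconds": duration / timescale,
        "rate": rate / 65536,        # 16.16 fixed point
        "volume": volume / 256,      # 8.8 fixed point
    }

# Field values copied from the dump above; the trailing bytes are not needed here.
payload = bytes(4) + struct.pack(">IIIIIH", 0, 0, 1000, 5016, 0x00010000, 0x0100)
info = parse_mvhd_v0(payload)
print(info["seconds"], info["rate"], info["volume"])
```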
udta — User Data Box
User-defined data.
track — Track Box
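A Track Box (four-character code trak) records the information for one media stream. A file can contain one or more tracks, and they are independent of each other.
Each track contains the following parts: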
tkhd — Track Header Box
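Contains the header information of the media stream. In the dump below, for example, the video stream has a width of 1920 and a height of 800.
Video tkhd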
14CEA6 Track Header - 3 (0x3) - 4875 (0x130B) ms (92 bytes)
14CEA6 Header (8 bytes)
14CEA6 Size: 92 (0x0000005C)
14CEAA Name: tkhd
14CEAE Version: 0 (0x00)
14CEAF Flags: 3 (0x000003)
14CEB2 Track Enabled: Yes
14CEB2 Track in Movie: 2 (0x0000000000000002)
14CEB2 Track in Preview: 0 (0x0000000000000000)
14CEB2 Track in Poster: 0 (0x0000000000000000)
14CEB2 Creation time: 0 (0x00000000) -
14CEB6 Modification time: 0 (0x00000000) -
14CEBA Track ID: 3 (0x00000003)
14CEBE Reserved: 0 (0x00000000)
14CEC2 Duration: 4875 (0x0000130B) - 4875 (0x130B) ms
14CEC6 Reserved: 0 (0x00000000)
14CECA Reserved: 0 (0x00000000)
14CECE Layer: 0 (0x0000)
14CED0 Alternate group: 2 (0x0002)
14CED2 Volume: 0 (0x0000) - 0.000
14CED4 Reserved: 0 (0x0000)
14CED6 Matrix structure (36 bytes)
14CED6 a (width scale): 1.000
14CEDA b (width rotate): 0.000
14CEDE u (width angle): 0.000
14CEE2 c (height rotate): 0.000
14CEE6 d (height scale): 1.000
14CEEA v (height angle): 0.000
14CEEE x (position left): 0.000
14CEF2 y (position top): 0.000
14CEF6 w (divider): 1.000
14CEFA Track width: 1920.000
14CEFE Track height: 800.000
Audio tkhd
14B34A Track Header - 1 (0x1) - 5016 (0x1398) ms (92 bytes)
14B34A Header (8 bytes)
14B34A Size: 92 (0x0000005C)
14B34E Name: tkhd
14B352 Version: 0 (0x00)
14B353 Flags: 3 (0x000003)
14B356 Track Enabled: Yes
14B356 Track in Movie: 2 (0x0000000000000002)
14B356 Track in Preview: 0 (0x0000000000000000)
14B356 Track in Poster: 0 (0x0000000000000000)
14B356 Creation time: 0 (0x00000000) -
14B35A Modification time: 0 (0x00000000) -
14B35E Track ID: 1 (0x00000001)
14B362 Reserved: 0 (0x00000000)
14B366 Duration: 5016 (0x00001398) - 5016 (0x1398) ms
14B36A Reserved: 0 (0x00000000)
14B36E Reserved: 0 (0x00000000)
14B372 Layer: 0 (0x0000)
14B374 Alternate group: 0 (0x0000)
14B376 Volume: 256 (0x0100) - 1.000
14B378 Reserved: 0 (0x0000)
14B37A Matrix structure (36 bytes)
14B37A a (width scale): 1.000
14B37E b (width rotate): 0.000
14B382 u (width angle): 0.000
14B386 c (height rotate): 0.000
14B38A d (height scale): 1.000
14B38E v (height angle): 0.000
14B392 x (position left): 0.000
14B396 y (position top): 0.000
14B39A w (divider): 1.000
14B39E Track width: 0.000
14B3A2 Track height: 0.000
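The flag bits and the fixed-point width and height in the two dumps above can be decoded as in this sketch. It assumes the version-0 tkhd layout (84 bytes after the box header, matching the 92-byte box size in the dump); `parse_tkhd_v0` is an illustrative name, and the payload is rebuilt from the video dump's values with all other bytes left at zero.

```python
import struct

def parse_tkhd_v0(payload):
    """Decode selected fields of a version-0 tkhd payload (after the box header)."""
    version_flags, = struct.unpack_from(">I", payload, 0)
    if version_flags >> 24 != 0:
        raise ValueError("this sketch only handles version 0")
    flags = version_flags & 0xFFFFFF
    track_id, = struct.unpack_from(">I", payload, 12)
    duration, = struct.unpack_from(">I", payload, 20)
    width, height = struct.unpack_from(">II", payload, 76)
    return {
        "enabled": bool(flags & 0x1),
        "in_movie": bool(flags & 0x2),
        "in_preview": bool(flags & 0x4),
        "track_id": track_id,
        "duration": duration,          # in units of the mvhd time scale
        "width": width / 65536,        # 16.16 fixed point
        "height": height / 65536,      # 16.16 fixed point
    }

payload = bytearray(84)
struct.pack_into(">I", payload, 0, 0x000003)              # version 0, flags 3
struct.pack_into(">I", payload, 12, 3)                    # track ID
struct.pack_into(">I", payload, 20, 4875)                 # duration
struct.pack_into(">II", payload, 76, 1920 << 16, 800 << 16)
video = parse_tkhd_v0(bytes(payload))
print(video)
```

For the audio track the same fields are present, but width and height are both 0, as the second dump shows.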
mdia — Media Box
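This is a container box holding the media information of the track.
Its child boxes are:
- mdhd: Media Header Box, stores the stream's creation time, duration, and similar information.
- hdlr: Handler Reference Box, declares how the media is to be presented.
- minf: Media Information Box, holds the handler-specific information that interprets the track's media data. minf is itself a container box; the part to focus on inside it is stbl, which is also the most complex part of moov.
Figure: mdia expanded.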
mdhd – Media Header Box
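Stores the stream's creation time, duration, and similar information.
Figure: video mdhd.
Figure: audio mdhd.
The audio mdhd is similar to the video one, but pay attention to the Time scale: every timestamp calculation for the track must use this time scale. It corresponds to AVStream->time_base of the stream in FFmpeg.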
hdlr — Handler Reference Box
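Declares how the media is to be presented.
Figure: video hdlr.
Key field: Component subtype (the handler type, vide for a video track and soun for an audio track).
Figure: audio hdlr.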
minf — Media Information Box
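Holds the handler-specific information that interprets the track's media data. minf is itself a container box; the part to focus on inside it is stbl, which is also the most complex part of moov. stbl records the file offset, pts, duration, and similar information for every sample of the media stream. To play an mp4 file, a player must use stbl to locate each sample correctly and feed it to the decoder.
Note also that the media header box inside minf differs between audio and video tracks:
- video track: vmhd
- audio track: smhd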
Stbl Insider — Sample Table Box
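As mentioned above, the most important part of mdia is stbl, which stores the information for every sample in the file. Before parsing stbl, the two concepts chunk and sample need to be distinguished.
In an mp4 file, a sample is the basic unit of a media stream; for a video stream, for example, one sample holds the actual NAL data. A chunk is the basic unit of data storage: it is a run of consecutive samples, and one chunk can contain one or more samples.
stbl describes every sample and contains the following main child boxes: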
stsd — Sample Description Box
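Stores the description information required for decoding.
In the dump below, for an H.264 video stream the sample entry type is avc1, and its extensions carry the information needed for decoding, such as the SPS and PPS.
Video stsd
It contains avc1, which in turn contains avcC and pasp:
- avc1: holds the video Width and Height
- avcC: holds the decoder configuration, including the SPS and PPS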
14CFDF Video (158 bytes)
14CFDF Header (8 bytes)
14CFDF Size: 158 (0x0000009E)
14CFE3 Name: avc1
14CFE7 Reserved: 0 (0x0000000000000000)
14CFED Data reference index: 1 (0x0001)
14CFEF Version: 0 (0x0000)
14CFF1 Revision level: 0 (0x0000)
14CFF3 Vendor:
14CFF7 Temporal quality: 0 (0x00000000)
14CFFB Spatial quality: 0 (0x00000000)
14CFFF Width: 1920 (0x0780)
14D001 Height: 800 (0x0320)
14D003 Horizontal resolution: 4718592 (0x00480000)
14D007 Vertical resolution: 4718592 (0x00480000)
14D00B Data size: 0 (0x00000000)
14D00F Frame count: 1 (0x0001)
14D011 Compressor name size: 0 (0x00)
14D012 Padding: (31 bytes)
14D031 Depth: 24 (0x0018)
14D033 Color table ID: 65535 (0xFFFF)
14D035 AVC decode (56 bytes)
14D035 Header (8 bytes)
14D035 Size: 56 (0x00000038)
14D039 Name: avcC
14D03D Version: 1 (0x01)
14D03E Specific (47 bytes)
14D03E Profile: 100 (0x64)
14D03F Compatible profile: 0 (0x00)
14D040 Level: 40 (0x28)
14D041 Reserved: 63 (0x3F) - (6 bits)
14D041 Size of NALU length minus 1: 3 (0x3) - (2 bits)
14D042 Reserved: 7 (0x7) - (3 bits)
14D042 seq_parameter_set count: 1 (0x01) - (5 bits)
14D043 seq_parameter_set (30 bytes)
14D043 Size: 28 (0x001C)
14D045 nal_ref_idc: 3 (0x3) - (2 bits)
14D045 nal_unit_type: 7 (0x7) - (5 bits)
14D046 profile_idc: 100 (0x64)
14D047 constraints (1 bytes)
14D047 constraint_set0_flag: No
14D047 constraint_set1_flag: No
14D047 constraint_set2_flag: No
14D047 constraint_set3_flag: No
14D047 constraint_set4_flag: No
14D047 constraint_set5_flag: No
14D047 reserved_zero_2bits: 0 (0x0)
14D048 level_idc: 40 (0x28) - (8 bits)
14D049 seq_parameter_set_id: 0 (0x0)
14D049 high profile specific (1 bytes)
14D049 chroma_format_idc: 1 (0x1) - 4:2:0
14D049 bit_depth_luma_minus8: 0 (0x0)
14D049 bit_depth_chroma_minus8: 0 (0x0)
14D049 qpprime_y_zero_transform_bypass_flag: No
14D049 seq_scaling_matrix_present_flag: No
14D04A log2_max_frame_num_minus4: 0 (0x0)
14D04A pic_order_cnt_type: 0 (0x0)
14D04A log2_max_pic_order_cnt_lsb_minus4: 2 (0x2)
14D04A max_num_ref_frames: 3 (0x3)
14D04B gaps_in_frame_num_value_allowed_flag: No
14D04B pic_width_in_mbs_minus1: 119 (0x077)
14D04D pic_height_in_map_units_minus1: 49 (0x031)
14D04E frame_mbs_only_flag: Yes
14D04E direct_8x8_inference_flag: Yes
14D04E frame_cropping_flag: No
14D04E vui_parameters_present_flag (17 bytes)
14D04E vui_parameters_present_flag: Yes
14D04E aspect_ratio_info_present_flag (2 bytes)
14D04E aspect_ratio_info_present_flag: Yes
14D04F aspect_ratio_idc: 1 (0x01) - (8 bits) - 1.000
14D050 overscan_info_present_flag: No
14D050 video_signal_type_present_flag (3 bytes)
14D050 video_signal_type_present_flag: Yes
14D050 video_format: 5 (0x5) - (3 bits) -
14D050 video_full_range_flag: 0 (0x0) - (1 bits) - Limited
14D050 colour_description_present_flag (3 bytes)
14D050 colour_description_present_flag: Yes
14D050 colour_primaries: 1 (0x01) - (8 bits) - BT.709
14D051 transfer_characteristics: 1 (0x01) - (8 bits) - BT.709
14D052 matrix_coefficients: 1 (0x01) - (8 bits) - BT.709
14D053 chroma_loc_info_present_flag: No
14D054 timing_info_present_flag (8 bytes)
14D054 timing_info_present_flag: Yes
14D054 num_units_in_tick: 1 (0x00000001) - (32 bits)
14D058 time_scale: 48 (0x00000030) - (32 bits)
14D05C fixed_frame_rate_flag: Yes
14D05C nal_hrd_parameters_present_flag: No
14D05C vcl_hrd_parameters_present_flag: No
14D05C pic_struct_present_flag: No
14D05C bitstream_restriction_flag (3 bytes)
14D05C bitstream_restriction_flag: Yes
14D05C motion_vectors_over_pic_boundaries_flag: Yes
14D05D max_bytes_per_pic_denom: 0 (0x0)
14D05D max_bits_per_mb_denom: 0 (0x0)
14D05D log2_max_mv_length_horizontal: 11 (0x0B)
14D05E log2_max_mv_length_vertical: 11 (0x0B)
14D05F max_num_reorder_frames: 2 (0x2)
14D05F max_dec_frame_buffering: 4 (0x4)
14D061 pic_parameter_set count: 1 (0x01)
14D062 pic_parameter_set (6 bytes)
14D062 Size: 5 (0x0005)
14D064 nal_ref_idc: 3 (0x3) - (2 bits)
14D064 nal_unit_type: 8 (0x8) - (5 bits)
14D065 pic_parameter_set_id: 0 (0x0)
14D065 seq_parameter_set_id: 0 (0x0)
14D065 entropy_coding_mode_flag: Yes
14D065 bottom_field_pic_order_in_frame_present_flag: No
14D065 num_slice_groups_minus1: 0 (0x0)
14D065 num_ref_idx_l0_default_active_minus1: 3 (0x3)
14D066 num_ref_idx_l1_default_active_minus1: 0 (0x0)
14D066 weighted_pred_flag: No
14D066 weighted_bipred_idc: 2 (0x2) - (2 bits)
14D066 pic_init_qp_minus26: 0 (0x0)
14D067 pic_init_qs_minus26: 0 (0x0)
14D067 chroma_qp_index_offset: 0 (0x0)
14D067 deblocking_filter_control_present_flag: Yes
14D067 constrained_intra_pred_flag: No
14D067 redundant_pic_cnt_present_flag: No
14D067 transform_8x8_mode_flag: Yes
14D067 pic_scaling_matrix_present_flag: No
14D067 second_chroma_qp_index_offset: 0 (0x0)
14D068 -------------------------
14D068 --- AVC, accepted ---
14D068 -------------------------
14D069 Padding?: (4 bytes)
14D06D Pixel Aspect Ratio (16 bytes)
14D06D Header (8 bytes)
14D06D Size: 16 (0x00000010)
14D071 Name: pasp
14D075 hSpacing: 1 (0x00000001)
14D079 vSpacing: 1 (0x00000001)
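The avcC fields shown in the dump (profile, level, NALU length size, SPS and PPS) can be pulled out of the raw payload roughly as follows. This is a sketch under the assumption that the payload follows the plain AVCDecoderConfigurationRecord layout seen above; `parse_avcc` is an illustrative name, and the SPS and PPS bytes in the example are short placeholders, not the real 28-byte and 5-byte parameter sets from the file.

```python
import struct

def parse_avcc(payload):
    """Decode an avcC payload into profile, level, NALU length size, SPS and PPS."""
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", payload, 0)
    pos = 6
    sps_list = []
    for _ in range(b5 & 0x1F):                  # low 5 bits: number of SPS
        size, = struct.unpack_from(">H", payload, pos)
        sps_list.append(payload[pos + 2:pos + 2 + size])
        pos += 2 + size
    pps_count = payload[pos]
    pos += 1
    pps_list = []
    for _ in range(pps_count):
        size, = struct.unpack_from(">H", payload, pos)
        pps_list.append(payload[pos + 2:pos + 2 + size])
        pos += 2 + size
    return {
        "profile": profile,
        "level": level,
        "nal_length_size": (b4 & 0x03) + 1,     # low 2 bits hold "length size minus 1"
        "sps": sps_list,
        "pps": pps_list,
    }

# Header bytes match the dump (profile 100, level 40, length size 4, one SPS, one PPS);
# the parameter set bodies are truncated placeholders.
record = (bytes([1, 100, 0, 40, 0xFF, 0xE1])
          + struct.pack(">H", 4) + b"\x67\x64\x00\x28"
          + bytes([1]) + struct.pack(">H", 2) + b"\x68\xee")
cfg = parse_avcc(record)
print(cfg["profile"], cfg["level"], cfg["nal_length_size"], len(cfg["sps"]), len(cfg["pps"]))
```

The `nal_length_size` value matters later: it tells a demuxer how many bytes precede each NAL unit inside a sample in mdat.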
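Audio stsd
Contains the audio-related information, such as the sample rate and the number of channels.
MP4 layout
In mp4, the index data and the actual data are stored separately:
Index data (moov)
Actual data (mdat)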
stts — Time-to-Sample Box
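Defines the duration of each sample; this is what timestamps are derived from.
A Time-To-Sample table entry is laid out as follows:
- sample count: the number of samples
- sample duration: the duration of each of those samples
Consecutive samples with the same duration can share one entry, which saves space.
The video stts is shown first. Note that the Number of entries field is not the number of samples; the real sample count is the sum of the sample count fields of all entries.
In the dump below, the first sample has a duration of 3720, expressed in the time scale of mdhd. With the video time scale of 90000, that is 3720/90000 = 0.0413333333333333 seconds.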
14D07D Time to Sample (664 bytes)
14D07D Header (8 bytes)
14D07D Size: 664 (0x00000298)
14D081 Name: stts
14D085 Version: 0 (0x00)
14D086 Flags: 0 (0x000000)
14D089 Number of entries: 81 (0x00000051)
14D08D Sample Count: 1 (0x00000001)
14D091 Sample Duration: 3720 (0x00000E88)
14D095 Sample Count: 1 (0x00000001)
14D099 Sample Duration: 3780 (0x00000EC4)
14D09D Sample Count: 1 (0x00000001)
14D0A1 Sample Duration: 3690 (0x00000E6A)
14D0A5 Sample Count: 2 (0x00000002)
14D0A9 Sample Duration: 3780 (0x00000EC4)
14D0AD Sample Count: 1 (0x00000001)
14D0B1 Sample Duration: 3690 (0x00000E6A)
14D0B5 Sample Count: 2 (0x00000002)
14D0B9 Sample Duration: 3780 (0x00000EC4)
14D0BD Sample Count: 1 (0x00000001)
14D0C1 Sample Duration: 3690 (0x00000E6A)
14D0C5 Sample Count: 2 (0x00000002)
14D0C9 Sample Duration: 3780 (0x00000EC4)
14D0CD Sample Count: 1 (0x00000001)
14D0D1 Sample Duration: 3690 (0x00000E6A)
14D0D5 Sample Count: 2 (0x00000002)
14D0D9 Sample Duration: 3780 (0x00000EC4)
14D0DD Sample Count: 1 (0x00000001)
14D0E1 Sample Duration: 3690 (0x00000E6A)
14D0E5 Sample Count: 2 (0x00000002)
...
14D2C5 Sample Count: 1 (0x00000001)
14D2C9 Sample Duration: 3780 (0x00000EC4)
14D2CD Sample Count: 1 (0x00000001)
14D2D1 Sample Duration: 3690 (0x00000E6A)
14D2D5 Sample Count: 2 (0x00000002)
14D2D9 Sample Duration: 3780 (0x00000EC4)
14D2DD Sample Count: 1 (0x00000001)
14D2E1 Sample Duration: 3690 (0x00000E6A)
14D2E5 Sample Count: 2 (0x00000002)
14D2E9 Sample Duration: 3780 (0x00000EC4)
14D2ED Sample Count: 1 (0x00000001)
14D2F1 Sample Duration: 3690 (0x00000E6A)
14D2F5 Sample Count: 2 (0x00000002)
14D2F9 Sample Duration: 3780 (0x00000EC4)
14D2FD Sample Count: 1 (0x00000001)
14D301 Sample Duration: 3690 (0x00000E6A)
14D305 Sample Count: 2 (0x00000002)
14D309 Sample Duration: 3780 (0x00000EC4)
14D30D Sample Count: 1 (0x00000001)
14D311 Sample Duration: 3750 (0x00000EA6)
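14D315 End position
Here is the audio stts as well. The only difference is the mdhd time scale, which is 44100 for the audio track, so the duration of the first sample is
1024/44100 = 0.0232199546485261 seconds.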
14B4C4 Time to Sample (1048 bytes)
14B4C4 Header (8 bytes)
14B4C4 Size: 1048 (0x00000418)
14B4C8 Name: stts
14B4CC Version: 0 (0x00)
14B4CD Flags: 0 (0x000000)
14B4D0 Number of entries: 129 (0x00000081)
14B4D4 Sample Count: 1 (0x00000001)
14B4D8 Sample Duration: 1024 (0x00000400)
14B4DC Sample Count: 1 (0x00000001)
14B4E0 Sample Duration: 1025 (0x00000401)
14B4E4 Sample Count: 2 (0x00000002)
14B4E8 Sample Duration: 1024 (0x00000400)
14B4EC Sample Count: 1 (0x00000001)
14B4F0 Sample Duration: 1023 (0x000003FF)
14B4F4 Sample Count: 1 (0x00000001)
14B4F8 Sample Duration: 1024 (0x00000400)
14B4FC Sample Count: 1 (0x00000001)
14B500 Sample Duration: 1025 (0x00000401)
14B504 Sample Count: 1 (0x00000001)
14B508 Sample Duration: 1024 (0x00000400)
14B50C Sample Count: 1 (0x00000001)
14B510 Sample Duration: 1023 (0x000003FF)
14B514 Sample Count: 2 (0x00000002)
14B518 Sample Duration: 1024 (0x00000400)
14B51C Sample Count: 1 (0x00000001)
14B520 Sample Duration: 1025 (0x00000401)
14B524 Sample Count: 1 (0x00000001)
14B528 Sample Duration: 1024 (0x00000400)
14B52C Sample Count: 1 (0x00000001)
14B530 Sample Duration: 1023 (0x000003FF)
14B534 Sample Count: 2 (0x00000002)
14B538 Sample Duration: 1024 (0x00000400)
14B53C Sample Count: 1 (0x00000001)
14B540 Sample Duration: 1025 (0x00000401)
14B544 Sample Count: 1 (0x00000001)
14B548 Sample Duration: 1024 (0x00000400)
14B54C Sample Count: 1 (0x00000001)
14B550 Sample Duration: 1023 (0x000003FF)
14B554 Sample Count: 2 (0x00000002)
14B558 Sample Duration: 1024 (0x00000400)
14B55C Sample Count: 1 (0x00000001)
14B560 Sample Duration: 1025 (0x00000401)
14B564 Sample Count: 1 (0x00000001)
14B568 Sample Duration: 1024 (0x00000400)
14B56C Sample Count: 1 (0x00000001)
14B570 Sample Duration: 1023 (0x000003FF)
14B574 Sample Count: 1 (0x00000001)
14B578 Sample Duration: 1024 (0x00000400)
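The two points made about stts (the sample count is the sum over all entries, and durations are converted with the mdhd time scale) can be sketched like this. The entry lists hold only the first few entries of the dumps above, and the function names are illustrative.

```python
from itertools import accumulate

def sample_count(stts):
    """Total number of samples described by run-length (count, duration) entries."""
    return sum(count for count, _ in stts)

def decode_times(stts):
    """Start time, in time scale ticks, of every sample."""
    deltas = [delta for count, delta in stts for _ in range(count)]
    return list(accumulate(deltas, initial=0))[:-1]

video_stts = [(1, 3720), (1, 3780), (1, 3690), (2, 3780)]   # first video entries
audio_stts = [(1, 1024), (1, 1025), (2, 1024), (1, 1023)]   # first audio entries

print(sample_count(video_stts), decode_times(video_stts))
print(video_stts[0][1] / 90000, audio_stts[0][1] / 44100)   # first durations in seconds
```

Four video entries already describe five samples, which is why Number of entries cannot be read as the sample count.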
stss — Sync Sample Box
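The sync sample table stores the list of key frames; key frames exist to support random access.
Each stss table entry is the number of one sync sample.
In the dump below, the video track has 3 key frames: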
14D315 Sync Sample (28 bytes)
14D315 Header (8 bytes)
14D315 Size: 28 (0x0000001C)
14D319 Name: stss
14D31D Version: 0 (0x00)
14D31E Flags: 0 (0x000000)
14D321 entry-count: 3 (0x00000003)
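For random access, a player typically needs the nearest key frame at or before the sample it wants to show. A minimal sketch with a binary search follows; the sync sample numbers are hypothetical, since the dump shows that there are 3 entries but not their values.

```python
from bisect import bisect_right

def sync_sample_at_or_before(sync_samples, sample_number):
    """Return the last sync sample number that is <= sample_number (1-based)."""
    i = bisect_right(sync_samples, sample_number)
    if i == 0:
        raise ValueError("no sync sample at or before this sample")
    return sync_samples[i - 1]

sync = [1, 49, 97]          # hypothetical values, sorted ascending as in stss
print(sync_sample_at_or_before(sync, 86))
```

Decoding then starts at the returned sample and runs forward to the requested one.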
stsc — Sample-To-Chunk Box
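The Sample-To-Chunk Box is the sample-to-chunk mapping table. As mentioned above, mp4 usually packs samples into chunks, and one chunk may contain one or several samples. A Sample-To-Chunk table entry has the following fields:
- First chunk: the number of the first chunk that uses this entry
- Samples per chunk: how many samples each chunk using this entry contains
- Sample description ID: the index of the stsd entry referenced by the chunks using this entry
In the example, the video track has a single stsc entry that covers chunks 1 to x, with one sample per chunk. In other words, every chunk in this file holds exactly one sample, although in general a chunk can hold several.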
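Because stsc is run-length coded, an entry applies to every chunk from its First chunk up to the chunk before the next entry's First chunk. The sketch below expands the table into a per-chunk sample count; it needs the total chunk count, which comes from stco. The second table is a made-up example to show the multi-entry case.

```python
def samples_per_chunk(stsc, chunk_count):
    """Expand (first_chunk, samples_per_chunk, description_index) runs per chunk."""
    result = []
    for i, (first_chunk, per_chunk, _desc) in enumerate(stsc):
        # A run ends just before the next entry's first chunk, or at the last chunk.
        next_first = stsc[i + 1][0] if i + 1 < len(stsc) else chunk_count + 1
        result.extend([per_chunk] * (next_first - first_chunk))
    return result

print(samples_per_chunk([(1, 1, 1)], 5))                # one entry, as in this file
print(samples_per_chunk([(1, 3, 1), (3, 1, 1)], 4))     # hypothetical two-entry table
```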
stsz — Sample Size Box
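The Sample Size Box specifies the size of every sample. It contains the total number of samples and a table holding the size of each one.
Figure: layout of a sample size table entry.
In the example shown, the video stream has 110 samples in total; the first sample is 42072 bytes and the second is 7354 bytes.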
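Reading the box payload can be sketched as follows, assuming the usual stsz layout: a version and flags word, a default sample size, a sample count, and a per-sample table that is present only when the default size is 0. `parse_stsz` is an illustrative name, and the payload holds just the two sizes quoted above.

```python
import struct

def parse_stsz(payload):
    """Return the list of sample sizes from an stsz payload (after the box header)."""
    _version_flags, sample_size, count = struct.unpack_from(">III", payload, 0)
    if sample_size != 0:                 # all samples share one size, no table stored
        return [sample_size] * count
    return list(struct.unpack_from(f">{count}I", payload, 12))

payload = struct.pack(">III", 0, 0, 2) + struct.pack(">II", 42072, 7354)
print(parse_stsz(payload))
```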
stco — Chunk Offset Box
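The Chunk Offset Box specifies where each chunk is located in the file; this table is the key to finding every sample's position. It contains the number of chunks and a table with the file offset of each chunk.
Note that stco only gives the file offset of each chunk, not of each sample. To get the offset of an individual sample, it has to be computed by combining the Sample Size Box (stsz) and the Sample-To-Chunk Box (stsc).
In this file, the first chunk of the video stream is at file offset 7544 (0x1D78). Since every chunk holds exactly one sample here, the first sample also starts at 7544; its size comes from stsz, where the first sample size is 172818.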
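How to compute a sample's offset
As noted above, stco alone does not give the offset of a particular sample. The following shows how to find the file position of the sample that corresponds to a given pts.
Roughly, these steps are needed:
- Convert the pts into the media's own time coordinate system
- Use stts ((decoding) time-to-sample) to find the sample number for that pts
- Use stsc (sample-to-chunk) to find which chunk holds that sample number
- Use stco (chunk offset) to get the file offset of that chunk
- Use stsz to get the sample's offset within the chunk and add the offset from step 4, giving the sample's offset in the file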
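The steps above can be sketched end to end as follows. This is a simplified illustration, not a full demuxer: the function names are made up, sample and chunk numbers are 1-based as in the boxes, and the tables are small synthetic ones rather than the tables of the sample file.

```python
def sample_at(stts, ticks):
    """Step 2: 1-based number of the sample whose time span contains ticks."""
    t = 0
    n = 0
    for count, delta in stts:
        run = count * delta
        if ticks < t + run:
            return n + (ticks - t) // delta + 1
        t += run
        n += count
    raise ValueError("time is past the end of the track")

def chunk_of(stsc, chunk_count, sample):
    """Step 3: (chunk number, first sample number in that chunk), both 1-based."""
    first_sample = 1
    for i, (first_chunk, per_chunk, _desc) in enumerate(stsc):
        next_first = stsc[i + 1][0] if i + 1 < len(stsc) else chunk_count + 1
        run = (next_first - first_chunk) * per_chunk
        if sample < first_sample + run:
            k = (sample - first_sample) // per_chunk
            return first_chunk + k, first_sample + k * per_chunk
        first_sample += run
    raise ValueError("sample number out of range")

def locate(ticks, stts, stsc, stco, stsz):
    """Steps 2 to 5: return (sample number, file offset, size)."""
    sample = sample_at(stts, ticks)
    chunk, first = chunk_of(stsc, len(stco), sample)
    # Steps 4 and 5: chunk offset plus the sizes of the samples before it in the chunk.
    offset = stco[chunk - 1] + sum(stsz[first - 1:sample - 1])
    return sample, offset, stsz[sample - 1]

# Synthetic tables: 6 samples in 4 chunks, time scale 1000.
stts = [(4, 40), (2, 20)]
stsc = [(1, 2, 1), (3, 1, 1)]
stco = [100, 500, 900, 1000]
stsz = [10, 20, 30, 40, 50, 60]
ticks = round(0.13 * 1000)                 # step 1: seconds to time scale ticks
print(locate(ticks, stts, stsc, stco, stsz))
```

When every chunk holds one sample, as in the file analyzed here, the sum in `locate` is always 0 and the sample offset equals the chunk offset.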
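For example, to find where the video sample at 3.64 seconds is stored in the file (2_audio_track_5s.mp4):
- Using the time scale, convert 3.64 seconds to the video track's time axis. The video track's time scale is 90000, so the timestamp is 3.64 × 90000 = 327600.
- Walk the stts entries shown below and accumulate the sample durations until the running total reaches 327600. The exact sum is 3720×1 + 3780×1 + 3690×1 + 3780×2 + … and so on, entry by entry.
- As a rough estimate, take 3780 as the average duration: 327600/3780 = 86.66666666666667, which truncates to 86. The walkthrough continues with sample 86.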
type stts
size 664
flags 0
version 0
sample_counts 1,1,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,1,1,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,1,1,1,2,1,2,1,2,1,2,1
sample_deltas 3720,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3750,3720,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3690,3780,3750,3720,3780,3690,3780,3690,3780,3690,3780,3690,3780,3750
- Look up the stsc entries shown below: sample 86 is in chunk 86 and is the first sample of that chunk (in this stream every chunk holds exactly one sample).
Property name Property value
type stsc
size 28
flags 0
version 0
first_chunk 1
samples_per_chunk 1
sample_description_index 1
- Look up the stco entries shown below: chunk 86 is at file offset 1004678 (inspected with Hexinator).
Property name Property value
type stco
size 484
flags 0
version 0
chunk_offsets 7544,182562,204381,206907,209520,236820,240924,242781,... (rest omitted)
- Look up the stsz entries shown below: sample 86 has a size of 20934. The video sample data for 3.64 seconds is therefore located in the file at
offset: 1004678 + 0 = 1004678
size: 20934
Property name Property value
type stsz
size 488
flags 0
version 0
sample_sizes 172818,20829,722,567,25207,1946,822,674,23828,2141,824,974,22426,2794,... (rest omitted)
sample_count 117
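Verification: open the mp4 file in a hex editor and go to file offset 1004678. The sample starts with a delimiter NAL unit (type 09); together with its 4-byte length field it takes up 6 bytes. The real data area follows, and its first 4 bytes are again a NALU length: 0x000051bc = 20924.
Total number of bytes: 4 + 2 + 4 + 20924 = 20934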
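The byte accounting above follows from how an AVC sample is stored in mp4: a sequence of NAL units, each preceded by a big-endian length field whose width is given by avcC (4 bytes in this file). A sketch of splitting such a sample is shown below; the sample is synthetic, built to have the same shape as the one described above, with dummy payload bytes and an assumed NAL type for the second unit.

```python
def split_nal_units(sample, length_size=4):
    """Split one AVC sample into NAL units using big-endian length prefixes."""
    units = []
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated length prefix")
        length = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + length > len(sample):
            raise ValueError("NAL unit runs past the end of the sample")
        units.append(sample[pos:pos + length])
        pos += length
    return units

# Synthetic sample: a 2-byte delimiter NAL unit (type 9), then one 20924-byte
# NAL unit whose first byte and payload are dummies.
sample = ((2).to_bytes(4, "big") + b"\x09\xf0"
          + (20924).to_bytes(4, "big") + b"\x65" + bytes(20923))
units = split_nal_units(sample)
print(len(sample), [(u[0] & 0x1F, len(u)) for u in units])
```

The low 5 bits of each unit's first byte give the nal_unit_type, which is how the delimiter (type 9) is recognized.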
Source: https://www.toymoban.com/news/detail-410697.html