Intel® Integrated Performance Primitives

PCM data size for audio encoding

skorolev
Beginner
Hello,

I've been trying to use the audio encoder classes included in the IPP samples to compress PCM samples I am getting from an audio board.

What I have is a stereo buffer: 2 channels, 16-bit, 48000 Hz. The size of the buffer is 3840 bytes for each frame, i.e. 1920 per channel ( = 48000Hz / 25 frames in PAL).

AACEncoder doesn't accept this data since it expects a buffer of at least 2048 bytes per channel. The same is true for MP3Encoder, though in that case the incoming uncompressed buffer is supposed to be about 2304 bytes per channel.

I am not quite sure what I am doing wrong. I looked into the source code of AACEncoder, and the size requirements do not seem to depend on the sample frequency but solely on the number of channels.

Thanks,

Serge
1 Reply
Vladimir_Dudnik
Employee

Hello, here is a comment from our expert:

Serge,

Yes, you are right: the size of an AAC audio frame is 1024 samples per channel (2048 bytes for 16-bit samples), and the size of an MP3 frame is 1152 samples per channel (2304 bytes). This means that every call to the encoder's GetFrame encodes exactly frame_size * num_channels samples of input data. This size doesn't depend on the sample frequency. As you can see, in your case AAC can encode 46.875 audio frames per second (48000/1024), and MP3 about 41.6667 (48000/1152). These numbers do not match the video frame rate (25 frames per second). This means that you should encode the audio and video streams independently and use time stamps for later synchronization.
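
To illustrate the point about encoding audio independently of video, here is a minimal sketch in Python of re-buffering captured PCM into encoder-sized frames. It is not the IPP sample code: the class name PcmRechunker, its push method, and the timestamp scheme (sample index divided by sample rate) are assumptions made for illustration only. In a real application each emitted chunk would be handed to the encoder's GetFrame.

```python
from fractions import Fraction


class PcmRechunker:
    """Hypothetical helper: accumulates interleaved PCM bytes and hands out
    fixed-size encoder frames, each tagged with a presentation timestamp."""

    def __init__(self, frame_samples, channels, sample_rate, bytes_per_sample=2):
        self.frame_samples = frame_samples      # 1024 for AAC, 1152 for MP3
        self.sample_rate = sample_rate
        # Bytes the encoder consumes per call: samples * channels * sample width.
        self.frame_bytes = frame_samples * channels * bytes_per_sample
        self.buffer = bytearray()               # leftover bytes between pushes
        self.frames_out = 0                     # encoder frames emitted so far

    def push(self, pcm):
        """Append captured bytes; return a list of (timestamp_seconds, frame)."""
        self.buffer += pcm
        out = []
        while len(self.buffer) >= self.frame_bytes:
            chunk = bytes(self.buffer[:self.frame_bytes])
            del self.buffer[:self.frame_bytes]
            # Timestamp of the first sample in this encoder frame, kept exact.
            ts = Fraction(self.frames_out * self.frame_samples, self.sample_rate)
            out.append((ts, chunk))
            self.frames_out += 1
        return out


# One second of capture: 25 PAL video frames, each carrying
# 1920 samples * 2 channels * 2 bytes = 7680 bytes of silence.
rechunker = PcmRechunker(frame_samples=1024, channels=2, sample_rate=48000)
encoded_frames = []
for _ in range(25):
    encoded_frames.extend(rechunker.push(bytes(7680)))
```

After this one second of input, 46 complete AAC-sized frames have been emitted and 3584 bytes remain buffered for the next call, which matches the 46.875 frames per second figure above. The timestamps come from the audio sample count rather than from the video frame counter, so the two streams can be aligned later.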

PS. You wrote "The size of the buffer is 3840 bytes for each frame, i.e. 1920 per channel ( = 48000Hz / 25 frames in PAL)." I think that you wanted to write "The size of the buffer is 3840 * 2 bytes for each frame, i.e. 1920 * 2 per channel ( = 48000Hz * (16 bit / 8 bit) / 25 frames in PAL)."
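
The arithmetic behind this correction and the frame rates quoted above can be checked with a few lines of Python. This is only a worked calculation under the stated assumptions (48000 Hz, stereo, 16-bit samples, 25 video frames per second); the constant and function names are made up for illustration.

```python
SAMPLE_RATE = 48000      # Hz
CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2     # 16 bit / 8 bit
VIDEO_FPS = 25           # PAL

# Audio captured during one video frame.
samples_per_video_frame = SAMPLE_RATE // VIDEO_FPS                # per channel
bytes_per_channel = samples_per_video_frame * BYTES_PER_SAMPLE
bytes_per_video_frame = bytes_per_channel * CHANNELS


def encoder_frames_per_second(sample_rate, frame_samples):
    """Encoder frames needed per second of audio (independent of channels)."""
    return sample_rate / frame_samples


aac_fps = encoder_frames_per_second(SAMPLE_RATE, 1024)
mp3_fps = encoder_frames_per_second(SAMPLE_RATE, 1152)

print(samples_per_video_frame, bytes_per_channel, bytes_per_video_frame)
print(aac_fps, round(mp3_fps, 4))
```

So each PAL video frame corresponds to 1920 samples per channel, which is 3840 bytes per channel and 7680 bytes for both channels, while the encoders consume 46.875 (AAC) or roughly 41.6667 (MP3) frames per second.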


Regards,
Vladimir