Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

PCM data size for audio encoding

skorolev
Beginner
2,574 Views
Hello,

I've been trying to use the audio encoder classes included in the IPP samples to compress PCM samples I am getting from an audio board.

What I have is a stereo buffer: 2 channels, 16 bits per sample, 48000 Hz sampling frequency. The size of the buffer is 3840 bytes for each frame, i.e. 1920 per channel ( = 48000Hz / 25 frames in PAL).

AACEncoder doesn't accept this data since it expects a buffer of at least 2048 bytes per channel. The same holds for MP3Encoder, although in that case the incoming uncompressed buffer is supposed to be about 2304 bytes per channel.

I am not quite sure what I am doing wrong. I looked into the source code of AACEncoder, and the size requirements do not seem to depend on the frequency but solely on the number of channels.

Thanks,

Serge
1 Reply
Vladimir_Dudnik
Employee

Hello, here is a comment from our expert:

Serge,

Yes, you are right: an AAC audio frame is 1024 samples per channel (2048 bytes at 16 bits per sample), and an MP3 frame is 1152 samples per channel (2304 bytes). This means that every call to the encoder's GetFrame encodes exactly frame_size * num_channels samples of input data. This size does not depend on the sample frequency. In your case, at 48000 Hz, AAC produces 46.875 audio frames per second (48000/1024) and MP3 about 41.6667 (48000/1152). Neither number equals the video frame rate (25 frames per second), so you should encode the audio and video streams independently and use time stamps to synchronize them afterwards.
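To make the mismatch concrete, here is a minimal Python sketch of the arithmetic above, together with one possible way to re-chunk video-sized PCM blocks into encoder-sized frames with time stamps. It is only an illustration: `FrameRebuffer` and `frames_per_second` are hypothetical names, not part of the IPP samples, and the sketch assumes interleaved 16-bit PCM.

```python
from fractions import Fraction

BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM, as in the question


def frames_per_second(sample_rate, frame_size):
    """Exact number of encoder frames per second of audio."""
    return Fraction(sample_rate, frame_size)


class FrameRebuffer:
    """Hypothetical helper (not an IPP class): collects interleaved PCM of
    any length and hands back blocks of exactly
    frame_size * channels samples, each with a time stamp in seconds."""

    def __init__(self, frame_size, channels, sample_rate):
        self.frame_size = frame_size
        self.sample_rate = sample_rate
        self.bytes_per_frame = frame_size * channels * BYTES_PER_SAMPLE
        self.pending = bytearray()  # PCM bytes not yet handed out
        self.frames_out = 0         # encoder frames emitted so far

    def push(self, pcm_bytes):
        """Append new PCM data; return a list of (timestamp, frame_bytes)."""
        self.pending += pcm_bytes
        ready = []
        while len(self.pending) >= self.bytes_per_frame:
            frame = bytes(self.pending[:self.bytes_per_frame])
            del self.pending[:self.bytes_per_frame]
            # Time stamp = index of the frame's first sample / sample rate.
            pts = Fraction(self.frames_out * self.frame_size, self.sample_rate)
            self.frames_out += 1
            ready.append((pts, frame))
        return ready
```

With 48000 Hz stereo input delivered in video-sized blocks of 1920 samples per channel (7680 bytes), the first `push` on a `FrameRebuffer(1024, 2, 48000)` yields one 4096-byte AAC-sized frame and keeps 3584 bytes pending; the second yields two more. Each returned block would then be passed to the encoder, and the time stamps are what let the audio be lined up with the 25 fps video later.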

PS. You wrote "The size of the buffer is 3840 bytes for each frame, i.e. 1920 per channel ( = 48000Hz / 25 frames in PAL)." I think you meant to write "The size of the buffer is 3840 * 2 bytes for each frame, i.e. 1920 * 2 bytes per channel ( = 48000Hz * (16 bit / 8 bit) / 25 frames in PAL)."
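The size arithmetic in this PS can be checked with a few lines of Python. This is only a sketch of the calculation; the function name `pcm_bytes_per_video_frame` is made up for the example.

```python
def pcm_bytes_per_video_frame(sample_rate, video_fps, bits_per_sample, channels):
    """Return (samples per channel, bytes per channel, total bytes)
    of PCM audio that accompany one video frame."""
    samples_per_channel, remainder = divmod(sample_rate, video_fps)
    if remainder:
        # Some rate combinations do not give a whole number of samples
        # per video frame; this sketch does not handle that case.
        raise ValueError("sample rate is not a multiple of the video frame rate")
    bytes_per_channel = samples_per_channel * bits_per_sample // 8
    return samples_per_channel, bytes_per_channel, bytes_per_channel * channels


# 48000 Hz, 25 fps PAL, 16-bit, stereo:
# 1920 samples per channel, 3840 bytes per channel, 7680 bytes in total.
samples, per_channel, total = pcm_bytes_per_video_frame(48000, 25, 16, 2)
```

So 1920 is the number of samples per channel, while the byte counts are twice as large because each 16-bit sample occupies two bytes.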


Regards,
Vladimir