Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

G.722.1 decoder and RTP Packetisation

jbaker
Beginner
2,456 Views
Hello,

I am having a spot of difficulty with the G.722.1 decoder sample source code.

We have an application that receives G.722.1 RTP packets and attempts to decode them into raw audio to write to a file. It seems as though I have mostly done things correctly, but the final raw audio is bad data: silence followed by bursts of random noise.

Is there anything in particular that I should be doing with the data from the RTP payload before passing it on to the apiG722Decode function? I am a bit confused by the phrase (from RFC 3047) "The G.722.1 encoder bit stream is split into a sequence of octets (60 or 80 depending on the bit rate), and each octet is in turn mapped into an RTP octet." If the output from the encoder is put in left-to-right order, MSB to LSB, directly into the RTP payload, then I have something else wrong here...

Any thoughts?

Cheers,

Jesse Baker
0 Kudos
18 Replies
Vladimir_Dudnik
Employee
2,456 Views
Hi, could you please specify which IPP version, OS and platform you are using?
Regards,
Vladimir
0 Kudos
Vyacheslav_Baranniko
New Contributor II
2,456 Views
Hi Jesse
My only guess at the root cause is that you are dealing with a G.722.1 RTP stream that carries multiple frames per packet, so the decode function has to be invoked multiple times for one packet.
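A minimal sketch of that idea, assuming the payload holds a whole number of equal-sized frames placed one after another. The names decode_payload_frames, frame_decoder_fn and tally_octets are hypothetical and not part of IPP; the callback stands in for whatever wraps the real decode call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: decodes one G.722.1 frame of frame_len octets. */
typedef void (*frame_decoder_fn)(const uint8_t *frame, size_t frame_len, void *user);

/* Walk an RTP payload made of equal-sized frames and hand each frame to the
 * decoder callback. Returns the number of frames processed, or -1 if the
 * payload length is not a whole multiple of frame_len. */
int decode_payload_frames(const uint8_t *payload, size_t payload_len,
                          size_t frame_len, frame_decoder_fn decode, void *user)
{
    if (frame_len == 0 || payload_len % frame_len != 0)
        return -1;

    int count = 0;
    for (size_t off = 0; off < payload_len; off += frame_len) {
        decode(payload + off, frame_len, user);
        count++;
    }
    return count;
}

/* Stand-in decoder for illustration only: tallies the octets it was given. */
void tally_octets(const uint8_t *frame, size_t frame_len, void *user)
{
    (void)frame;
    *(size_t *)user += frame_len;
}
```

With a 120-octet payload and 60-octet frames the callback runs twice; a payload that does not divide evenly is rejected rather than decoded partially.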
Could you attach the recorded RTP file and the file produced by the G.722.1 decoder? Or at least a single RTP packet and the data decoded from it?
I hope to be able to help you then.
Vyacheslav
0 Kudos
Vyacheslav_Baranniko
New Contributor II
2,456 Views
The IPP G.722.1 decoder follows the ITU reference G.722.1 C code and accesses the bitstream as 16-bit words. On a little-endian CPU the least significant byte of a 16-bit word is stored first in memory, followed by the most significant byte of the word. According to G.722.1/Figure A.1, the most significant byte is put into the payload first, so RTP transmits the 16-bit words of the bitstream in big-endian order. That means each 16-bit word of a received RTP G.722.1 frame must be reordered to little-endian before it is decoded by the IPP decoder.
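As a rough illustration of that reordering, here is a portable sketch that builds host-order 16-bit words from the received octets. The function name rtp_octets_to_words is made up for this example and is not an IPP call. Because each word is assembled arithmetically, the result is the same on any host; on a little-endian machine the bytes end up swapped in memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a received frame (octets in network order, most significant byte
 * first) into host-order 16-bit words. n_octets is expected to be even
 * (40, 60 or 80 for G.722.1); a trailing odd octet is ignored.
 * words must have room for n_octets / 2 entries. */
void rtp_octets_to_words(const uint8_t *octets, size_t n_octets, uint16_t *words)
{
    for (size_t i = 0; i + 1 < n_octets; i += 2)
        words[i / 2] = (uint16_t)((octets[i] << 8) | octets[i + 1]);
}
```

The words array can then be handed to a decoder that reads the bitstream as 16-bit values.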
Vyacheslav
0 Kudos
jbaker
Beginner
2,456 Views
Hello, thanks for your reply. Sorry for the delay in getting back to you.

We have tested this against several units sending 1, 2, and 3 frames per packet. I have also tried reordering incoming words and bytes, without any success. I haven't, however, tried reversing words on the unit that is sending me 1 frame per packet. I'll get onto that and get back to you.

Thanks again.

Jesse
0 Kudos
jbaker
Beginner
2,456 Views
Things are looking good, actually. I now have an output audio stream that plays at half the speed it should, which is probably just a buffer size problem. Thanks for your help.
0 Kudos
Vyacheslav_Baranniko
New Contributor II
2,456 Views
Hi Jesse
The 1/2 slowdown is most probably because you did not take into account that the G.722.1 codec is wideband, i.e. it applies to 16 kHz PCM audio. For example, 1 s of wideband audio takes 2 s to play at 8 kHz, so the voice sounds slowed down by half.
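The arithmetic behind this can be sketched as follows. The helper playback_ms is a hypothetical name; the figure of 320 samples per decoded frame comes from the codec's 20 ms frame at 16 kHz, which is not stated in this thread.

```c
/* Playback time in milliseconds for n_samples mono samples played at rate_hz.
 * The same samples take twice as long when the rate is halved. */
unsigned playback_ms(unsigned n_samples, unsigned rate_hz)
{
    return (unsigned)(((unsigned long long)n_samples * 1000u) / rate_hz);
}
```

One decoded frame of 320 samples lasts 20 ms at 16000 Hz but 40 ms if the output is labelled as 8000 Hz, which matches the half-speed symptom.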
Vyacheslav
0 Kudos
jbaker
Beginner
2,456 Views
Yep, you are completely correct. Previously we were sending 8 kHz linear PCM, and after correcting for 16 kHz linear PCM everything worked nicely.

An interesting by-product of this was that a Windows Media writer was consistently overwriting the last half of each audio sample, whilst a RealMedia writer would simply stack them one after another. The final result was a wmv file where the audio track was the same length as the video track and an rm file where the audio track was twice as long. And of course the audio track from the RealMedia file could be extracted, sped up and listened to as normal sound, whereas the extracted and sped-up audio track from the Windows Media file was horribly corrupt.

Anyway, thanks for your help. Now, onto the encoder...

Jesse
0 Kudos
telefonie
Beginner
2,456 Views
I have the same problem.
I am trying to decode Siren (= G.722.1 at 16000 bit/s).
Normal Siren files decode correctly.
The target is RTP data from Windows Messenger. Windows Messenger sends RTP data coded as Siren (SIP: Siren/16000, bitrate=16000) with one frame per packet. The packet size is correct at 40 octets.
The Intel demo decoder decoderG722 was tested with a Siren source generated by the tool from the G.722.1/Siren patent holder. It works.
I write the RTP data to a file and try to decode it. Result: only noise with some maximum peaks.
I rearranged the bytes as suggested (in short: network byte order to little-endian).
I rearranged the bits in every octet.
I rearranged the bytes and bits together.
Nothing works.
I need help.
Does Microsoft use a special byte order?
How should the bytes and/or bits be rearranged for correct decoding?
Thanks for your help,
Joachim
0 Kudos
Vyacheslav_Baranniko
New Contributor II
2,456 Views
It seems vmw records 8-bit A-law or mu-law PCM, while rmw records 16-bit linear (uniform) PCM.
0 Kudos
telefonie
Beginner
2,456 Views

Hello vbaranni,

First of all, thanks for your fast answer.

I don't know the meaning of the acronyms "vmw" and "rmw". Please explain them or write out their full names.

I thought the input of G.722.1 is 16-bit linear PCM. If a client has previously applied A-law or u-law compression, it has to expand the data to 16-bit linear before using another compression scheme such as G.722.1. The Intel IPP samples do the same if the output should be A-law or u-law. After encoding, the sample codecs may compress the result with A-law or u-law.

40 octets is correct for 16-bit linear PCM coded with Siren (G.722.1 at 16000 bit/s).
(Frames are 40/60/80 octets for G.722.1 at 16000/24000/32000 bit/s.)
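These sizes follow from the bit rate, as the sketch below shows. It assumes the 20 ms frame duration of G.722.1, which is not quoted in this thread; g7221_frame_octets and G7221_FRAME_MS are names invented for the example.

```c
/* Assumed G.722.1 frame duration in milliseconds. */
#define G7221_FRAME_MS 20u

/* Octets in one encoded frame: bits per second * frame seconds / 8. */
unsigned g7221_frame_octets(unsigned bit_rate_bps)
{
    return bit_rate_bps * G7221_FRAME_MS / 1000u / 8u;
}
```

Calling it with 16000, 24000 and 32000 gives 40, 60 and 80 octets, so a received payload length can be checked against the negotiated rate before decoding.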

Therefore I don't understand your answer. I would be glad if you could describe it in more detail,
and explain how this problem can be solved.

Regards
telefonie

Message Edited by telefonie on 10-13-2005 06:09 AM

0 Kudos
Vyacheslav_Baranniko
New Contributor II
2,456 Views
Hello telefonie,
Sorry for confusing you; that was my answer to Jesse's last post.
Regarding your question about Siren: IPP G.722.1 at 16000 bit/s is compatible with Siren in coding wideband (16 kHz) 16-bit PCM. It composes a 40-octet bitstream as a sequence of 16-bit words. The RTP G.722.1 payload is composed of octets in big-endian order, so to decode it with the IPP G.722.1 decoder on a little-endian machine the octets must be rearranged, because they will actually be decoded as a sequence of words. To rearrange the octets you can use the IPP function ippsSwapBytes_16u.
Vyacheslav
0 Kudos
telefonie
Beginner
2,456 Views

Hello,

I'll try the swap function as mentioned, but I have already tried rearranging the bytes in the same way as described before and it didn't work, so I think this won't work either.

After the rearrangement the stream should be:

byte2, byte1, byte4, byte3, byte6, byte5, ....

Regards

Joachim

0 Kudos
rporter
Beginner
2,456 Views
Have you been successful in your attempt to decode Siren? I'm interested in learning the problems you experienced, since I'm having difficulties with Siren as well.
I can decode a real live G722.1 voice session from MS Communicator (I did have to play with encoded byte order). As a note, it seems the RTP packets default to 60 bytes of encoded data.
However, my attempts to decode Siren are falling short. I can't quite tell what is going on: there is not enough correlation between the voice session and the PCM data to determine the cause (e.g. blips at the right time, but values that indicate a byte ordering issue). All ideas are welcome.
Thanks,
Rick
0 Kudos
telefonie
Beginner
2,456 Views
Hello Rick Porter,
How do you force MS Communicator to use a specific codec like G.722.1?
I use Windows Messenger. Messenger chooses the codec automatically. Only with sipp on the far-end side can I declare G.722.1 the only usable codec, and then Messenger uses G.722.1.
60 bytes is one frame of G.722.1 at 24 kbit/s.
Siren has 40-byte frames because of its 16 kbit/s rate.
My attempts at decoding failed:
I tried all 16 combinations of reversing the bytes of the frame, of each long and of each word, and reversing the bits in each byte. Nothing helps.
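For anyone who wants to repeat that experiment, the sketch below applies any of the 16 on/off combinations to a frame in place, selected by a 4-bit mask. All names (apply_reordering and the flag constants) are hypothetical, and the order in which the four steps are applied is an arbitrary choice for this example; some masks may produce identical output.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    REV_BITS   = 1u << 0,  /* reverse the bit order inside every octet     */
    SWAP_WORDS = 1u << 1,  /* swap the two octets of every 16-bit word     */
    SWAP_LONGS = 1u << 2,  /* reverse the four octets of every 32-bit long */
    REV_FRAME  = 1u << 3   /* reverse the octet order of the whole frame   */
};

/* Reverse the eight bits of one octet (0x01 becomes 0x80). */
static uint8_t reverse_bits(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}

/* Reverse the octet order inside each complete group of `group` octets. */
static void reverse_groups(uint8_t *buf, size_t len, size_t group)
{
    if (group < 2)
        return;
    for (size_t base = 0; base + group <= len; base += group) {
        for (size_t i = 0, j = group - 1; i < j; i++, j--) {
            uint8_t t = buf[base + i];
            buf[base + i] = buf[base + j];
            buf[base + j] = t;
        }
    }
}

/* Apply the transforms selected by mask (0..15) to the frame in place. */
void apply_reordering(uint8_t *frame, size_t len, unsigned mask)
{
    if (mask & REV_BITS)
        for (size_t i = 0; i < len; i++)
            frame[i] = reverse_bits(frame[i]);
    if (mask & SWAP_WORDS)
        reverse_groups(frame, len, 2);
    if (mask & SWAP_LONGS)
        reverse_groups(frame, len, 4);
    if (mask & REV_FRAME)
        reverse_groups(frame, len, len);
}
```

Looping mask from 0 to 15 over a copy of the same captured frame and decoding each result gives a systematic way to check whether any simple reordering yields clean audio.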
Does anyone have connections to Microsoft and can ask them about this problem? If so, please post the answer here.
Thanks,
Joachim
0 Kudos
rporter
Beginner
2,456 Views
Hi telefonie,
Covergence develops a SIP security/management device (named the Eclipse). One of the many things the Eclipse will do is allow or disallow codecs based on policy. I configure a policy on the Eclipse to disallow Siren. Then I have the MS Communicator clients go through the Eclipse. Since Siren is the only "higher priority" codec, the clients establish a session with the next codec (G.722.1). The same trick works for Windows Messenger; I only mentioned MS Communicator in the unlikely event that someone knew of a Communicator-specific bug. Sorry that this solution probably does not help you.
Hopefully we can figure out this Siren issue, or hear from someone with the answer.
later,
Rick
0 Kudos
avs9000
Beginner
2,456 Views

Hi! I have a problem using the G.726 encoding and decoding functions (I use IPP 4.0 on Windows XP SP2). I encode the input PCM signal with the function ippsEncode_G726_16s8u at 16 kbps and decode the resulting buffer with the function ippsDecode_G726_8u16s. After decoding, crackling and interference are added to the original sound. What is the problem here? And what buffer size should be used for these functions?

Thanks for the help
If need be, you can write to me in Russian by email.

Alexander

0 Kudos
telefonie
Beginner
2,456 Views

Hello Rick,

I see you can decode G.722.1 from Microsoft. Am I right? Siren should be the same at 16 kbit/s, but it didn't work. So Microsoft does something strange with Siren. Have you also tried the ACM codec Vivo-Active Siren?

You can also contact me directly at joachimneumann@t-online.de

Regards

Joachim

0 Kudos
telefonie
Beginner
2,456 Views

Hello Alexander,

Wrong discussion group (this one is about G.722.1).

Try the demo decoder decoderG726 and the encoder to see them working correctly. They are downloadable here at Intel under "codec samples" for ipps.

The PCM signal should be 16-bit linear.
Initialization has to be done for -16, -24, -32 or -40 before encoding/decoding.

Hint: there is another topic on RTP packetization of G.726 in this forum. Search for G726.

Regards

Joachim

0 Kudos