I'm playing a bit with the UMC library. I've written a very simple screen recorder (under Windows) that also captures the microphone. Since I'm generating the video myself, I cannot rely on a fixed fps to set the timestamps, so internally I use a timer with 1 ms (theoretical) resolution (timeGetTime() from the Windows SDK).
I use the H264 video codec and AAC audio codec with the MP4 muxer. Initially I had some problems with the timestamps for the video frames as well, but when I convert my internal milliseconds with timestamp = (ms / 1000.0), the video plays back at (visually) correct speed. The audio is an entirely different story. Using the same conversion, the audio plays at high speed, squeezed into the first 50% of the video. That is what happens if I feed PCM buffers to the audio codec in the size requested by the codec (audio_params_.m_SuggestedInputSize). If I instead feed the larger buffers I get from Windows, it seems like the encoder or muxer splits each buffer into smaller frames and then messes up the timestamps of the individual frames - at least that is what the VLC player indicates ("timing screwed, stopping resampling").
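For comparison, this is the kind of sample-count-based timestamp calculation I understand is usual for audio - deriving each buffer's timestamp from the number of PCM samples already submitted rather than from the wall clock. This is just a sketch to make the question concrete; the AudioClock name is mine, not part of the UMC API, and the 1024-samples-per-AAC-frame figure is my assumption:

```cpp
#include <cassert>
#include <cstdint>

// Sketch (not UMC code): derive the audio timestamp, in seconds, from the
// number of PCM samples (per channel) already handed to the encoder.
// An AAC frame normally covers 1024 samples per channel.
struct AudioClock {
    uint64_t samples_sent = 0;  // samples per channel delivered so far
    int sample_rate;            // e.g. 44100

    explicit AudioClock(int rate) : sample_rate(rate) {}

    // Timestamp for the buffer about to be submitted.
    double next_pts() const {
        return static_cast<double>(samples_sent) / sample_rate;
    }

    // Call after submitting a buffer of n samples (per channel).
    void advance(uint64_t n) { samples_sent += n; }
};
```

With this scheme, at 44100 Hz the third 1024-sample frame would start at 2048 / 44100 ≈ 0.0464 s, independent of any timer jitter - which is why I wonder whether the codec/muxer expects timestamps like these or the wall-clock ones I'm currently passing.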
So my question is - what are the actual timestamps expected by the AAC audio codec and the MP4 muxer? And are they relative to the start of the movie, or to the previous audio segment?
The complete source-code for my test-project can be found here: http://war.jgaa.com/files/IPP-test.7z
Sorry for such a long delay in responding to your question. Is this still an outstanding issue for you? When I check the file link you cite in your post, I find no file there.
Please let us know if this issue is still outstanding for you so we can provide help.