
Poor voice recognition with background noise

CLi37
Beginner
1,121 Views

In a quiet environment, voice recognition worked excellently. But in an environment with background noise, the recognition rate dropped sharply. This is a problem for real applications.

4 Replies
samontab
Valued Contributor II
1,121 Views

The good old cocktail party effect...

Brennon_W_
Beginner
1,121 Views

That is always going to be the case with most voice rec tech.

I've found that the recording quality of audio streams used for recognition is almost always quite low. My understanding is that this is a performance trade-off: a high bit rate would take much longer to process and would also cause memory constraints that translate to shorter lengths of audio that can be processed (at least in an async recognition service that takes the context of the entire sentence into account).
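To put rough numbers on the bit rate trade-off, here is a minimal sketch that compares the size of uncompressed PCM audio at two settings. The two formats (16 kHz, 16-bit, mono versus 44.1 kHz, 16-bit, stereo) are assumptions chosen for illustration, not settings confirmed for any particular recognition service, and `pcm_bytes` is a hypothetical helper name.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


# Assumed low-rate speech format: 16 kHz, 16-bit, mono, 10 seconds
speech = pcm_bytes(16_000, 16, 1, 10)

# Assumed high-quality format: 44.1 kHz, 16-bit, stereo, 10 seconds
cd_quality = pcm_bytes(44_100, 16, 2, 10)

print(speech)               # 320000
print(cd_quality)           # 1764000
print(cd_quality / speech)  # 5.5125
```

Under these assumptions the higher-quality stream carries about 5.5 times as much data for the same sentence, which is the extra processing and memory cost described above.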

The only way around this is to concentrate at a hardware level on noise reduction.

I wish it was much better also.

Best of luck.

 

Cheers

samontab
Valued Contributor II
1,122 Views

I think the problem is that our brains are so great at their job that most people assume that a lot of "trivial tasks", like listening to a conversation in a noisy environment, will be easy for a computer when in fact they are really hard.

It's good to have a brain :)

CLi37
Beginner
1,122 Views

Brennon W. wrote:

That is always going to be the case with most voice rec tech.

I've found that the recording quality of audio streams used for recognition is almost always quite low. My understanding is that this is a performance trade-off: a high bit rate would take much longer to process and would also cause memory constraints that translate to shorter lengths of audio that can be processed (at least in an async recognition service that takes the context of the entire sentence into account).

The only way around this is to concentrate at a hardware level on noise reduction.

But even white noise can make voice recognition worse in RS. There are many methods for noise cancellation, and the mic array was supposed to work better. The slow computing speed can be solved by software optimization or hardware acceleration. If the basic functions are not fast enough, complex applications may become unusable, e.g. running voice recognition, face recognition, and gesture recognition in multiple threads.
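As an illustration of why a mic array is expected to help with white noise, here is a toy delay-and-sum simulation. It is only a sketch under stated assumptions: the noise at each mic is independent Gaussian white noise, the per-mic delays are known whole numbers of samples, and the "voice" is a 200 Hz tone at 16 kHz. It is not how the RS SDK processes its mic array, and all function names are hypothetical.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Shift each channel back by its integer sample delay, then average."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def power(samples):
    """Mean squared amplitude."""
    return sum(s * s for s in samples) / len(samples)


def simulate(num_mics=4, num_samples=4000, noise_std=1.0, seed=0):
    """Return (single-mic noise power, beamformed noise power)."""
    rng = random.Random(seed)
    # Assumed stand-in for a voice: a 200 Hz tone sampled at 16 kHz
    clean = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(num_samples)]
    # Assumption: mic k hears the voice k samples later than mic 0
    delays = list(range(num_mics))
    channels = []
    for d in delays:
        delayed = [0.0] * d + clean
        # Each mic adds its own independent white noise
        channels.append([s + rng.gauss(0.0, noise_std) for s in delayed])
    output = delay_and_sum(channels, delays)
    single_noise = [channels[0][n] - clean[n] for n in range(num_samples)]
    beam_noise = [output[n] - clean[n] for n in range(num_samples)]
    return power(single_noise), power(beam_noise)


single, beam = simulate()
print(f"single mic noise power: {single:.3f}")
print(f"beamformed noise power: {beam:.3f}")
```

With independent noise, averaging N aligned mics reduces the noise power by roughly a factor of N (about 6 dB for four mics) while the aligned voice is preserved. Real background noise is often directional or correlated between mics, so the gain in practice can be quite different.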
