Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Usage of ippsRegExpReplace

evanyan
Beginner
727 Views
Hi,

I want to use ippsRegExpReplace_8u() to replace ALL matched substrings in the src string, but I haven't found an elegant way to do it.

From the doc, I see that ippsRegExpReplace_8u() doesn't allocate memory for the dest string or the IppRegExpFind array. So I have to call it again and again to see whether the result fits into the memory I allocated. This is what I did:

do { // loop until dest and finds are both large enough

    if (destUsedLen == destLen) {
        destLen *= 2; // double the destination buffer
        delete[] destStr;
        destStr = new char[destLen];
    }

    if (numFinds == findsLen) {
        findsLen *= 2; // double the finds array
        delete[] f;
        f = new IppRegExpFind[findsLen];
    }

    // these are in/out parameters, so reset them before every call
    srcLen = origStrLen;
    destUsedLen = destLen;
    numFinds = findsLen;
    status = ippsRegExpReplace_8u(srcStr, &srcLen, destStr, &destUsedLen, f, &numFinds, regexState, replaceState);

} while ((status == ippStsNoErr) && (numFinds > 0) && ((numFinds >= findsLen) || (destUsedLen >= destLen)));

Is that the right (but ugly) way? Or did I miss anything?

Thanks,
-Evan
6 Replies
Igor_B_Intel1
Employee
Hi,
It is better to check the srcLen parameter. On output it holds how many source elements were searched. If it is less than origStrLen and the whole dst buffer is filled, then you can advance the pSrc pointer by srcLen bytes and repeat the replacement. In that case there is no need to restart the replacement from the beginning of the source string.
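The resume idea above can be sketched without IPP. The block below is only an illustration built on std::regex: replace_chunk and replace_all_chunked are hypothetical names, not IPP functions, and the stand-in assumes the behavior described in this reply (the call reports how far it got in the source, and the caller continues from there). The replacement text is copied literally, with no group expansion.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for a fixed-buffer replace call (NOT the IPP API).
// Starting at src[srcOffset], writes as much output as fits into dst.
// On return, srcOffset is the position to resume from; the return value is
// the number of bytes written to dst.
std::size_t replace_chunk(const std::string& src, std::size_t& srcOffset,
                          char* dst, std::size_t dstCap,
                          const std::regex& re, const std::string& repl) {
    std::size_t written = 0;
    while (srcOffset < src.size()) {
        auto first = src.cbegin() + static_cast<std::ptrdiff_t>(srcOffset);
        // match_prev_avail keeps '^' and '\b' correct when resuming mid-string.
        auto flags = srcOffset > 0 ? std::regex_constants::match_prev_avail
                                   : std::regex_constants::match_default;
        std::smatch m;
        bool found = std::regex_search(first, src.cend(), m, re, flags);
        // Sketch limitation: an empty match ends matching for the rest of
        // the input, which guarantees forward progress.
        if (found && m.length(0) == 0) found = false;

        // Literal text before the match (or the whole tail if no match).
        std::size_t lit = found ? static_cast<std::size_t>(m.position(0))
                                : src.size() - srcOffset;
        std::size_t n = std::min(lit, dstCap - written);
        std::copy_n(src.data() + srcOffset, n, dst + written);
        srcOffset += n;
        written += n;
        if (n < lit || !found) break;               // buffer full, or done
        if (repl.size() > dstCap - written) break;  // replacement will not fit
        std::copy_n(repl.data(), repl.size(), dst + written);
        written += repl.size();
        srcOffset += static_cast<std::size_t>(m.length(0));
    }
    return written;
}

// Driver: call the chunked replace repeatedly, resuming from the reported
// offset instead of restarting at the beginning of the source.
std::string replace_all_chunked(const std::string& src, const std::regex& re,
                                const std::string& repl, std::size_t bufCap) {
    std::vector<char> buf(std::max<std::size_t>(bufCap, 1));
    std::string out;
    std::size_t offset = 0;
    while (offset < src.size()) {
        std::size_t before = offset;
        std::size_t n = replace_chunk(src, offset, buf.data(), buf.size(), re, repl);
        out.append(buf.data(), n);
        if (n == 0 && offset == before) {
            buf.resize(buf.size() * 2);  // the replacement alone did not fit
        }
    }
    return out;
}
```

For example, replace_all_chunked("ABCDE", std::regex("[BD]"), "##", 3) builds the result in three passes through a 3-byte buffer. The buffer only grows when a single replacement cannot fit, so the source is never rescanned from the start.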

Igor S. Belyakov
evanyan
Beginner
Hi Igor S.,

Thanks for your reply.

In my testing, when the match results cannot fit into the IppRegExpFind array, srcLen is still equal to origStrLen.

Could you please tell me what the values of srcUsedLen, destUsedLen and numFinds are in the following cases?

o the dst buffer is too small, but the IppRegExpFind array is big enough;
o the IppRegExpFind array is too small, but the dst buffer is big enough;
o both the IppRegExpFind array and the dst buffer are too small.

The answers for the above cases should really be documented. I've spent a lot of time struggling with the replace API.

It turns out my previous approach doesn't work either, because I found that when the IppRegExpFind array is too small, numFinds is 0. So far, I haven't figured out a way to use ippsRegExpReplace_8u() to replace all matched substrings in the above cases.

Thanks,
-Evan


evanyan
Beginner
At least two unexpected behaviors have been found so far:

1).
Before invoking ippsRegExpReplace_8u:

regex: "[BD]"; option: "g"; replacement: "##"; src: "ABCDE"; destLen: 14; numFinds: 3

After invoking:

srcUsedLen: 1; numFinds: 2; dest: "A##C##E"; destUsedLen: 7

The src string has been entirely parsed. I expected srcUsedLen to be 5.

2).
Before invoking:

regex: "([AB])([CD])([EF])"; option: "g"; replacement: "##"; src: "ACE"; destLen: 10; numFinds: 3

After invoking:

srcUsedLen: 3; numFinds: 0;

The pre-allocated finds array is too small to hold all the results, but I would expect numFinds to be 3 after the call, i.e. filled with as much as fits. With the current behavior, I cannot distinguish this case from no match at all: with "XYZ" as the src string, the result is identical.
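As a cross-check of what a global replace should produce for these inputs, here is a small reference sketch using std::regex from the C++ standard library, not IPP. The helper names slots_per_match and reference_replace are hypothetical. It also assumes, as std::regex does, that a match takes one result slot for the whole match plus one per capture group. Whether IPP counts slots the same way is an assumption, not something confirmed in this thread.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Result slots one match would need, ASSUMING one slot for the whole match
// plus one per capture group (the std::regex convention, not verified for IPP).
std::size_t slots_per_match(const std::regex& re) {
    return re.mark_count() + 1;
}

// Reference output of a global replace, computed with std::regex_replace.
std::string reference_replace(const std::string& src,
                              const std::string& pattern,
                              const std::string& repl) {
    return std::regex_replace(src, std::regex(pattern), repl);
}
```

With this reference, "ABCDE" with "[BD]" gives "A##C##E", "ACE" with the three-group pattern gives "##", and "XYZ" is returned unchanged, so a match and a non-match are distinguishable by their output. Under the slot assumption, the three-group pattern needs 4 slots per match, which is more than the numFinds of 3 passed in case 2.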

Thanks,
-Evan
evanyan
Beginner
Forgot to mention, I'm using IPP 7 Beta.

Thanks,
-Evan
evanyan
Beginner
Could someone confirm whether these are bugs in IPP? Thanks!
Vladimir_Dudnik
Employee
Hello,

Thanks for reporting the issue. We were able to reproduce the problem and will investigate it further. We will update you once we have results.

Regards,
Vladimir