Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Beginner
957 Views
Can MKL compute the summary stats below?

MdAPE => median absolute percentage error
MAPE => mean absolute percentage error
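
For reference, here is a minimal sketch of the two measures (MdAPE and the mean-based MAPE). It assumes the usual forecasting definitions, which the thread does not spell out: each absolute error is divided by the actual value, and the median or mean of those ratios is reported as a percentage. The function names and sample data are illustrative only, and this is plain Python, not an MKL call.

```python
from statistics import mean, median

def absolute_percentage_errors(actual, predicted):
    """Return 100 * |actual - predicted| / |actual| for each pair.

    Assumes no actual value is zero (the ratio is undefined there).
    """
    return [100.0 * abs(a - p) / abs(a) for a, p in zip(actual, predicted)]

def mape(actual, predicted):
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(actual, predicted))

def mdape(actual, predicted):
    """Median absolute percentage error."""
    return median(absolute_percentage_errors(actual, predicted))

# Illustrative data: the third prediction is badly off.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 200.0]

mape_value = mape(actual, predicted)    # pulled up by the one large error
mdape_value = mdape(actual, predicted)  # stays at the middle error
```

With this data the per-item errors are 10%, 5%, and 50%, so the median-based figure is far less sensitive to the single bad prediction than the mean-based one.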

thanks, Scott
7 Replies
Employee
Hi Scott,
The current version of Intel MKL Summary Stats does not provide functionality for median/mean absolute deviation/percentage error. At the same time, the library has "elementary" building blocks for those estimates, including mean and median. Could you please briefly clarify:

- Are those estimates a "bottleneck" in your code (that is, can your application spend a significant amount of time computing them, depending on the problem size)?

- What are the typical dimensions you work with (dimension of the random vector, number of observations, etc.)?

Andrey

Beginner
Hi Andrey,

Actually, the only one I really care about is the MdAD from which I can easily get MdAPE.
It would probably be more efficient to compute it at the same time as other statistics.
But, I can always calculate it in another step.
As a robust measure of spread, I consider MdAD to be the next most important summary stat after standard deviation.
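
As a rough illustration of why MdAD is a robust measure of spread, the sketch below computes it in two steps: the median first, then the median of the absolute deviations from it. It assumes the unscaled definition (no normal-consistency constant); the function name and data are made up for illustration, and none of this is MKL API.

```python
from statistics import median, stdev

def median_absolute_deviation(values):
    """Median of |x - median(values)|, an unscaled robust spread measure."""
    center = median(values)
    return median(abs(x - center) for x in values)

# Illustrative data with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

mdad = median_absolute_deviation(data)  # deviations 2, 1, 0, 1, 97 -> median 1.0
spread = stdev(data)                    # dominated by the outlier
```

The two-step structure is the point: the deviations cannot be formed until the median is known, so the second median needs its own pass over the data.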

Our software is applied to an incredibly broad array of datasets of which most are small or filtered. But, I'm planning for the future.

thanks, Scott
Employee
Hi Scott,
We will analyze your request to understand what could be done.
Best,
Andrey
Moderator
Hello Scott, this issue has been submitted to our internal development tracking database for further investigation. We will inform you once an update becomes available. Here is a bug tracking number for your reference: 200220845.

Honored Contributor III
>It would probably be more efficient to compute it at the same time as other statistics.

I doubt that this is so except for small arrays. The median and other measures of rank require different types of algorithms than the other statistics. The mean, variance, etc. all require that one keep running sums of expressions involving only the current data item and, possibly, other previously computed statistics.

The selection algorithm for computing the median, for example, has O(N) complexity on average, but can degenerate to O(N^2) when the array is already sorted. Yet had we known the array to be sorted, we could have simply picked the N/2-th element as the median.
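
To make the complexity point concrete, here is a small quickselect sketch that counts how many elements it scans. It is an illustration only, not how MKL or any particular library implements selection. The degenerate case shown assumes a naive pivot rule (always the first element); the helper name `quickselect` and the scan counter are hypothetical.

```python
import random

def quickselect(values, k, choose_pivot):
    """Return (k-th smallest value, elements scanned); k is 0-based."""
    items = list(values)
    scanned = 0
    while True:
        pivot = choose_pivot(items)
        scanned += len(items)  # count one pass over the remaining items per round
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        equal_count = len(items) - len(smaller) - len(larger)
        if k < len(smaller):
            items = smaller
        elif k < len(smaller) + equal_count:
            return pivot, scanned
        else:
            k -= len(smaller) + equal_count
            items = larger

n = 2001
sorted_data = list(range(n))
rng = random.Random(0)

# Naive pivot (first element) on already-sorted input: each round discards
# only one element, so the scan count grows roughly like n^2.
naive_median, naive_scanned = quickselect(sorted_data, n // 2, lambda items: items[0])

# Random pivot on the same input: a small multiple of n scans on average.
random_median, random_scanned = quickselect(sorted_data, n // 2, rng.choice)
```

Both calls return the same median; only the amount of work differs, which is why practical implementations avoid the first-element pivot.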

It would be fairly cheap for the library to compute an estimate of the median along with the other moments. For example, the median of 3 medians of 3 sampled triplets could provide such an estimate. Or, the median of a randomly chosen sample from the input array. In many cases, such an estimate may be all that is needed.
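
The two estimators mentioned above could be sketched as follows. The post does not say how the triplets would be sampled, so drawing nine distinct elements at random is an assumption, as are the function names and the sample size.

```python
import random
from statistics import median

def median_of_three_medians(values, rng):
    """Median of the medians of three randomly sampled triplets.

    Requires at least nine elements; random sampling is an assumption.
    """
    sample = rng.sample(values, 9)
    triplet_medians = [median(sample[i:i + 3]) for i in (0, 3, 6)]
    return median(triplet_medians)

def sample_median(values, sample_size, rng):
    """Median of a random sample drawn without replacement."""
    size = min(sample_size, len(values))
    return median(rng.sample(values, size))

rng = random.Random(0)
data = [rng.random() for _ in range(1001)]

cheap_estimate = median_of_three_medians(data, rng)  # touches 9 elements
better_estimate = sample_median(data, 101, rng)      # touches 101 elements
true_median = median(data)                           # needs the whole array
```

Both estimates cost a fixed amount of work regardless of the array length; the larger sample generally lands closer to the true median, at the price of sorting more elements.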

A user who wanted the true median could make a second call to the library with this estimated median, which could be used efficiently with the selection algorithm (or another algorithm that benefits from having an estimate of the median) to find the exact median. Normal users who have no interest in the median would not experience a noticeable loss of performance.
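
One possible reading of this two-call idea, sketched under assumptions: the estimate serves as the first pivot, a single pass splits the data around it, and only the side that contains the median is processed further. The function name is hypothetical, the remaining side is simply sorted for brevity (a real implementation would keep selecting), and for even-length input the lower of the two middle elements is returned.

```python
from statistics import median

def exact_median_from_estimate(values, estimate):
    """Return the element of rank (n - 1) // 2, using `estimate` as first pivot."""
    k = (len(values) - 1) // 2
    smaller = [x for x in values if x < estimate]
    equal_count = sum(1 for x in values if x == estimate)
    if k < len(smaller):
        # The median lies below the estimate.
        return sorted(smaller)[k]
    if k < len(smaller) + equal_count:
        # The estimate is itself the median.
        return estimate
    # The median lies above the estimate.
    larger = sorted(x for x in values if x > estimate)
    return larger[k - len(smaller) - equal_count]

values = [5, 1, 4, 2, 3]
from_high_guess = exact_median_from_estimate(values, 3.5)
from_low_guess = exact_median_from_estimate(values, 2)
from_exact_guess = exact_median_from_estimate(values, 3)
```

The closer the estimate is to the true median, the more evenly the first pass splits the data, and the less work remains on the side that still has to be searched.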
Beginner
You misunderstand.
I said that computing the Median Absolute Deviation at the same time that you compute the Median might be more efficient.

I was NOT referring to the computation of the mean or any other moment.
Honored Contributor III
Good to know; there is no misunderstanding, then, as to the features desired from the proposed enhancements.