Intel® Distribution for Python*

cumsum error on large float arrays

Michael_R_1
Beginner
Hi,

We've run into some inconsistencies when calculating the cumulative sum of large float arrays with numpy.cumsum, which don't seem to occur with Anaconda's default (non-Intel) distribution. When the array is longer than about 110,000 points, the 3rd item of the result is already wrong. This doesn't happen with a vector that's only 100,000 points long. I'm pasting the code and some results comparing the Intel distribution and the default distribution.

At first, I thought the problem could be resolved by pre-allocating the memory and using the "out" argument, because the 3rd item is then calculated correctly, but the end of the array is still wrong.

We've also had some trouble with the numpy.unwrap function in the Intel distribution, again on large arrays, but I'm still trying to put together a consistent example.

Are there specific options in the Intel distribution for setting precision that could influence this?

Below is some code with results in comments, showing the differences between the Intel distribution and the Anaconda distribution. Any help or suggestions would be appreciated!

Regards,
Michael
------
import numpy as np
xx=np.arange(1000000)*.1
yyy=np.cumsum(xx)
yyy
''' without intel distribution
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99997500e+10,   4.99998500e+10,   4.99999500e+10])
'''
''' with intel distribution
array([  0.00000000e+00,   1.00000000e-01,   2.00000000e-01, ...,
         1.99999300e+05,   9.99998000e+04,   9.99999000e+04])
'''


xx=np.arange(1000000)*.1
yy=xx
np.cumsum(xx,out=yy)
yy
''' without intel distribution
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99997500e+10,   4.99998500e+10,   4.99999500e+10])
'''
''' with intel distribution
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         2.99998800e+05,   1.99999500e+05,   1.99999700e+05])
'''


xx=np.arange(100000)*.1
yy=xx
np.cumsum(xx,out=yy)
yy
''' with intel distribution
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99975000e+08,   4.99985000e+08,   4.99995000e+08])
'''
''' without intel distribution
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99975000e+08,   4.99985000e+08,   4.99995000e+08])
'''
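
A side note on the snippets above: `yy=xx` binds a second name to the same array, so `np.cumsum(xx,out=yy)` writes its result over its own input instead of into pre-allocated memory. The following is a minimal sketch, not part of the original report, that uses a genuinely separate output buffer and compares the result against the closed form `0.1 * k * (k + 1) / 2`. The helper name `check_cumsum` and the tolerance are assumptions chosen for illustration.

```python
import numpy as np


def check_cumsum(n, step=0.1):
    """Return True if np.cumsum of arange(n) * step matches the closed form."""
    xx = np.arange(n) * step
    out = np.empty_like(xx)          # separate buffer, unlike `yy = xx`
    assert out is not xx             # confirms the buffers are distinct objects
    np.cumsum(xx, out=out)
    k = np.arange(n, dtype=np.float64)
    expected = step * k * (k + 1) / 2  # sum of 0..k, scaled by step
    # Loose relative tolerance (an assumption) to absorb ordinary rounding.
    return bool(np.allclose(out, expected, rtol=1e-9))


if __name__ == "__main__":
    for n in (100000, 1000000):
        print(n, check_cumsum(n))
```

On a correctly working NumPy build, both sizes should report True; the affected build described in this thread would be expected to fail for the larger size.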
20 Replies
DavidLiu
Employee

Hi Michael, 

I've attempted this on two different systems and have been unable to reproduce your issue. Could you give us some more details on the conda, numpy, and python versions, in addition to the hardware configuration(s) you are on?

Thanks,

David

Michael_R_1
Beginner

Hi David,

Interesting. We're seeing the same error on three different computers: a Surface Pro 4, a desktop with an i7-7700, and a Gigabyte laptop with an i7-4710HQ, each running Windows 10 Home x64 with the following packages: python 3.5.2, numpy 1.11.2, conda 4.3.14, intelpython 2017.0.2.

The computers have different amounts of RAM (16 GB to 64 GB), and on each of them we installed Anaconda first and then followed the straightforward Intel Distribution for Python installation described here (https://software.intel.com/en-us/articles/using-intel-distribution-for-python-with-anaconda).

This is running in the IPython shell via Spyder; I get the same error in the plain Python console (which starts up saying "Python 3.5.2 |Intel Corporation| (default, Feb  5 2017, 02:57:01) [MSC v.1900 64 bit (AMD64)] on win32...").

Michael

gaston-hillar
Valued Contributor I

Hi David and Michael,

I was curious about this issue, and I could reproduce the problem Michael reports on macOS El Capitan, on a MacBook Pro powered by an Intel® Core™ i5-4278U processor. The results differ just as Michael describes in his explanation.

 

gaston-hillar
Valued Contributor I

Hi David and Michael,

I used the default Python version that comes installed with macOS El Capitan, which is Python 2.7.10.

 

gaston-hillar
Valued Contributor I

Hi David and Michael,

Not sure whether it helps or not, but I also executed the code on the following console provided by PythonAnywhere (https://www.pythonanywhere.com/try-ipython/). The output is the same one that Michael reports, and it is different from the results generated by the Intel distribution.

gaston-hillar
Valued Contributor I

Hi David and Michael,

I executed the first example Michael reported on a Windows 10 laptop powered by an Intel Core i7-6700HQ CPU. The results do not show the differences that Michael reported. So Intel Distribution for Python produces different results either on macOS versus Windows or on different CPUs; I'm not sure which is the cause. In this case, Intel Distribution for Python produces the same results as Python 3.5.2 (non-Intel).

The non-Intel distribution of Python 3.5.2 (64-bit) produces the results that Michael has reported.

Sample output:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> xx=np.arange(1000000)*.1
>>> yyy=np.cumsum(xx)
>>> yyy
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99997500e+10,   4.99998500e+10,   4.99999500e+10])
>>>

The same code executed on Intel Distribution for Python produces the following output. It is no different from the previous output, but it differs from the results Michael reported.

Python 3.5.2 |Intel Corporation| (default, Feb  5 2017, 02:57:01) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>> import numpy as np
>>> xx=np.arange(1000000)*.1
>>> yyy=np.cumsum(xx)
>>> yyy
array([  0.00000000e+00,   1.00000000e-01,   3.00000000e-01, ...,
         4.99997500e+10,   4.99998500e+10,   4.99999500e+10])

So, Michael, it would be great if you can share OS and hardware info.

gaston-hillar
Valued Contributor I

Michael,

I hadn't seen your message in which you described the hardware, so forget about the last lines of my previous post, in which I asked you to provide hardware info. :)

Michael_R_1
Beginner

Hi Gaston,

Thanks for that. Very strange. I'm not seeing any pattern yet. How did you install the intel distribution for your last post? Simply Anaconda, then intel distribution followed possibly by updates? We followed that procedure for each of the three computers we're using here, all three with the same problem when using the intel distribution.

I've updated in the meantime to conda 4.3.16, same problem still.

Cheers,

Michael

Sergey_M_Intel2
Employee

Hello everybody,

Intel engineers reproduced the error, which appears to be related to how numpy computes the cumulative sum. Our recent numpy optimizations did not take that into account. Interestingly, internal tests did not reveal this issue during validation.

The engineers will report soon whether they see a workaround.

Sorry about that,

Sergey

gaston-hillar
Valued Contributor I

Michael Roelens wrote:

Hi Gaston,

Thanks for that. Very strange. I'm not seeing any pattern yet. How did you install the intel distribution for your last post? Simply Anaconda, then intel distribution followed possibly by updates? We followed that procedure for each of the three computers we're using here, all three with the same problem when using the intel distribution.

I've updated in the meantime to conda 4.3.16, same problem still.

Cheers,

Michael

Michael, I've used the installation provided by Intel for Windows to install Intel Distribution for Python on Windows 10.

Oleksandr_P_Intel

 

Dear Michael, 

Thank you very much for taking the time to bring this to our attention. We reproduced the problem, and it affects universal functions applied to large arrays of doubles, floats, or corresponding complexes.

`np.cumsum` is the chief mainstream operation affected, although non-standard uses of any universal functions can be affected. 

Regrettably, there is no setting within numpy to work around the issue. Chunking the array using slices, and applying the function to these chunks comes to mind, but this is too much to ask.
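
For readers who want to see what the chunking idea looks like, here is a minimal sketch. It is an illustration only, not Intel's fix: the function name `chunked_cumsum` and the default chunk size of 50,000 are assumptions (the thread only establishes that 100,000 points worked and roughly 110,000 did not on the affected build), and it handles 1-D floating-point input only.

```python
import numpy as np


def chunked_cumsum(a, chunk=50000):
    """Cumulative sum of a 1-D float array, computed slice by slice."""
    a = np.asarray(a, dtype=np.float64).ravel()
    out = np.empty_like(a)
    carry = 0.0  # running total at the end of the previous chunk
    for start in range(0, a.size, chunk):
        stop = min(start + chunk, a.size)
        block = out[start:stop]            # view into the output buffer
        np.cumsum(a[start:stop], out=block)
        block += carry                     # shift by the preceding total
        carry = block[-1]
    return out


if __name__ == "__main__":
    xx = np.arange(1000000) * 0.1
    print(chunked_cumsum(xx)[-3:])
```

Because the carry is added after each chunk is summed, the result can differ from a single `np.cumsum` call in the last few bits of rounding, so comparisons should use a tolerance rather than exact equality.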

We are working to provide a hotfix. 

The `np.unwrap` is affected because it uses `np.cumsum` underneath.
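
To make the dependency concrete, here is a simplified sketch of phase unwrapping built on `np.cumsum`. It assumes 1-D input and the default discontinuity of pi; the name `unwrap_sketch` is hypothetical, and the code is an illustration of the approach rather than a copy of NumPy's source.

```python
import numpy as np


def unwrap_sketch(p):
    """Unwrap a 1-D phase array by accumulating 2*pi corrections."""
    p = np.asarray(p, dtype=np.float64)
    d = np.diff(p)
    # Map each jump into the interval [-pi, pi).
    dmod = (d + np.pi) % (2 * np.pi) - np.pi
    # Keep +pi (not -pi) for jumps that were positive to begin with.
    dmod[(dmod == -np.pi) & (d > 0)] = np.pi
    correction = dmod - d
    correction[np.abs(d) < np.pi] = 0.0   # small jumps need no correction
    out = p.copy()
    out[1:] += np.cumsum(correction)      # the cumulative sum is the key step
    return out


if __name__ == "__main__":
    phase = np.linspace(0, 20 * np.pi, 1000)
    wrapped = (phase + np.pi) % (2 * np.pi) - np.pi
    print(np.allclose(unwrap_sketch(wrapped), phase))
```

Any error in the cumulative sum of the corrections propagates directly into the unwrapped phase, which is consistent with the trouble reported on large arrays.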

Our release process relies on community tests for validation, but evidently no test exercised the culprit optimization code we added. 

I will announce the hotfix on this thread as soon as it becomes available. 

Thank you for your understanding,
Oleksandr

Michael_R_1
Beginner
Hi Oleksandr and Sergey, I'm very impressed by how quickly you responded and figured out what's going wrong. Thanks for confirming, and looking forward to the hotfix! Kind regards, Michael
Oleksandr_P_Intel

Hi Michael, 

I am happy to report that the fix for this issue has been posted. Please try updating NumPy in the distribution by running

 conda update -c intel numpy

This should fix the issue underlying the observed erroneous behavior.
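
After updating, a quick sanity check along the following lines may help confirm that the installed NumPy behaves as expected. This is a sketch under assumptions: the helper name `cumsum_looks_sane` is hypothetical, and the two properties it tests (a non-decreasing result for non-negative input, and a final value that agrees with `math.fsum`) are chosen because the faulty output reported earlier in this thread violates both.

```python
import math

import numpy as np


def cumsum_looks_sane(n=1000000):
    """Check two properties a correct cumulative sum must satisfy."""
    xx = np.arange(n) * 0.1
    yyy = np.cumsum(xx)
    # Non-negative input means the running total can never decrease.
    monotonic = bool(np.all(np.diff(yyy) >= 0))
    # The last element must equal the total, computed here without NumPy.
    total_ok = bool(np.isclose(yyy[-1], math.fsum(xx), rtol=1e-9))
    return monotonic and total_ok


if __name__ == "__main__":
    print("numpy", np.__version__)
    print("cumsum sane:", cumsum_looks_sane())
```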

Michael_R_1
Beginner

Hi Oleksandr,

Thanks a lot for that. I had to uninstall intelpython3_core/full (2017.0.2) to be able to install the 1.11.3 version of numpy here though, because conda said the two weren't compatible at the moment. But the updated package does indeed fix the cumsum problem.

Kinda makes me wonder: what is the intelpython3_core or full package needed for? Is it some kind of wrapper that contains a bunch of packages?

Thanks again for the quick fix!

 

Michael

 

 

Todd_T_Intel
Employee

Michael,

You are essentially correct. The intelpython3_core package is a "metapackage": it contains no files of its own, but rather collects a set of other packages into a named unit for ease of installation. When you install a particular version of "intelpython3_core" or "intelpython3_full", you will get all the packages we released.

There should be no need for you to manually uninstall it. I will check the update logic to be certain it is working as expected.

Todd

gaston-hillar
Valued Contributor I

@Oleksandr,

Should I run this update on Intel Distribution for Python 3.5.2, too, or is it only necessary to run it for Intel Distribution for Python 2.7? I'm working with both versions on Windows, macOS and Linux.

Oleksandr_P_Intel

The changes in the update are not specific to any particular version of Python or to the platform. Updates were posted for all platforms, and for both Python 2.7 and Python 3.5.

Todd_T_Intel
Employee

The problem with the intelpython3_core package reporting a conflict when attempting to update to the repaired numpy should now be fixed.

Sorry for the trouble.

Todd

gaston-hillar
Valued Contributor I

@Oleksandr,

Thanks for the clarification. I've successfully updated all my versions.

Michael_R_1
Beginner
Hi Todd, Thanks for that. Indeed, after updating my package index, I was able to install the two together now. Thanks everyone for the quick fix and support! Michael