import numpy as np

xx=np.arange(1000000)*.1
yyy=np.cumsum(xx)
yyy
'''
without intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99997500e+10, 4.99998500e+10, 4.99999500e+10])
'''
'''
with intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 2.00000000e-01, ...,
        1.99999300e+05, 9.99998000e+04, 9.99999000e+04])
'''

xx=np.arange(1000000)*.1
yy=xx
np.cumsum(xx,out=yy)
yy
'''
without intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99997500e+10, 4.99998500e+10, 4.99999500e+10])
'''
'''
with intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        2.99998800e+05, 1.99999500e+05, 1.99999700e+05])
'''

xx=np.arange(100000)*.1
yy=xx
np.cumsum(xx,out=yy)
yy
'''
with intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99975000e+08, 4.99985000e+08, 4.99995000e+08])
'''
'''
without intel distribution
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99975000e+08, 4.99985000e+08, 4.99995000e+08])
'''
I've attempted this on two different systems, and have been unable to produce your issue. Could you give us some more details on the conda, numpy, and python versions, in addition to the hardware configuration(s) you are on?
Interesting. We're seeing the same error on three different computers: one surface pro 4, one desktop i7-7700, and one gigabyte laptop, i7-4710HQ, each with win10 Home-x64, and the following python packages: python 3.5.2, numpy 1.11.2, conda 4.3.14, intelpython 2017.0.2.
The computers have different amounts of RAM (16 GB to 64 GB), and on each of them we followed the straightforward Intel Python distribution installation as described here (https://software.intel.com/en-us/articles/using-intel-distribution-for-python-with-anaconda), after installing Anaconda.
This is running in the IPython shell, via Spyder. I get the same error if I run it in the Python console (the latter starts up saying "Python 3.5.2 |Intel Corporation| (default, Feb 5 2017, 02:57:01) [MSC v.1900 64 bit (AMD64)] on win32...").
Hi David and Michael,
I was curious about this issue, and I could reproduce the problem Michael reports on macOS El Capitan, on a MacBook Pro powered by the following Intel CPU: Intel® Core™ i5-4278U Processor. The results differ, as Michael reports in his explanation.
Hi David and Michael,
Not sure whether it helps or not. However, I also executed the code on the following console provided by PythonAnywhere (https://www.pythonanywhere.com/try-ipython/), and you can see the output is the same one that Michael reports and it is different than the results generated by the Intel distribution.
Hi David and Michael,
I executed the first example Michael reported on a Windows 10 laptop powered by an Intel Core i7-6700HQ CPU. The results do not show the differences that Michael reported. So, Intel Distribution for Python produces different results depending on the OS (macOS or Windows) or on the CPU. Not sure which is the issue. In this case, the Intel Distribution for Python produces the same results as Python 3.5.2 (not Intel).
Python 3.5.2 (64-bit), non-Intel distribution, produces the results that Michael has reported.
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> xx=np.arange(1000000)*.1
>>> yyy=np.cumsum(xx)
>>> yyy
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99997500e+10, 4.99998500e+10, 4.99999500e+10])
>>>
The same code executed on Intel Distribution for Python produces the following output. It is no different from the previous output, but it is different from the results reported by Michael.
Python 3.5.2 |Intel Corporation| (default, Feb 5 2017, 02:57:01) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>> import numpy as np
>>> xx=np.arange(1000000)*.1
>>> yyy=np.cumsum(xx)
>>> yyy
array([ 0.00000000e+00, 1.00000000e-01, 3.00000000e-01, ...,
        4.99997500e+10, 4.99998500e+10, 4.99999500e+10])
So, Michael, it would be great if you can share OS and hardware info.
Intel engineers reproduced the error, which appears to be related to how numpy computes the cumulative sum. Our recent numpy optimizations did not take that into account. Interestingly, internal tests did not reveal this issue during validation.
Engineers will report soon whether they see a workaround.
Sorry about that,
Michael Roelens wrote:
Thanks for that. Very strange. I'm not seeing any pattern yet. How did you install the intel distribution for your last post? Simply Anaconda, then intel distribution followed possibly by updates? We followed that procedure for each of the three computers we're using here, all three with the same problem when using the intel distribution.
I've updated in the meantime to conda 4.3.16, same problem still.
Michael, I've used the installation provided by Intel for Windows to install Intel Distribution for Python on Windows 10.
Thank you very much for taking the time to bring this to our attention. We reproduced the problem, and it affects universal functions applied to large arrays of doubles, floats, or corresponding complexes.
`np.cumsum` is the chief mainstream operation affected, although non-standard uses of any universal functions can be affected.
Regrettably, there is no setting within numpy to work around the issue. Chunking the array using slices, and applying the function to these chunks comes to mind, but this is too much to ask.
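For readers who want to see what the chunking idea could look like, here is a minimal sketch. The helper name `chunked_cumsum` and the default chunk size of 65536 are assumptions made for illustration; the thread only establishes that a 100000-element array was unaffected and does not state the exact size at which the problem starts. The sketch handles 1-D float arrays only, and it carries the running total into the first element of each chunk so that the additions happen in the same order as in a single `np.cumsum` call.

```python
import numpy as np


def chunked_cumsum(a, chunk_size=65536):
    """Cumulative sum of a 1-D float array, computed chunk by chunk.

    Hypothetical helper: the name and the default chunk size are
    illustrative assumptions, not values taken from the thread.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    a = np.asarray(a, dtype=float)
    if a.ndim != 1:
        raise ValueError("this sketch only handles 1-D arrays")

    out = np.empty_like(a)
    running = 0.0  # total of all elements in the chunks processed so far
    for start in range(0, a.size, chunk_size):
        stop = min(start + chunk_size, a.size)
        # Work on a private copy so the input is never modified or aliased.
        block = a[start:stop].copy()
        block[0] += running
        out[start:stop] = np.cumsum(block)
        running = out[stop - 1]
    return out


xx = np.arange(1000000) * .1
yyy = chunked_cumsum(xx)
print(yyy[:3], yyy[-1])
```

Each call to `np.cumsum` here sees at most `chunk_size` elements, which is the point of the workaround; whether a given chunk size actually avoids the faulty code path would need to be checked on an affected installation.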
We are working to provide a hotfix.
`np.unwrap` is also affected, because it uses `np.cumsum` underneath.
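A quick way to see whether `np.unwrap` is behaving on a given installation is to wrap a known phase ramp and check that unwrapping recovers it. The following is only a sketch: the function name `unwrap_looks_correct` is hypothetical, and the array size simply mirrors the 1000000-element example from the original report.

```python
import numpy as np


def unwrap_looks_correct(n=1000000, step=0.1):
    """Return True if np.unwrap recovers a linear phase ramp.

    Hypothetical check; n and step mirror the example in the report.
    """
    phase = np.arange(n) * step
    # Wrap the ramp into the interval (-pi, pi].
    wrapped = np.angle(np.exp(1j * phase))
    recovered = np.unwrap(wrapped)
    return bool(np.allclose(recovered, phase))


print(unwrap_looks_correct())
```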
Our release process relies on community tests for validation, but evidently no test exercised the culprit optimization code we added.
I will announce the hotfix on this thread as soon as it becomes available.
Thank you for your understanding,
I am happy to report that the fix for this issue has been posted. Please try updating NumPy in the distribution by running
conda update -c intel numpy
This should fix the issue underlying the observed erroneous behavior.
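To confirm that the update took effect, one option is to rerun the original example and compare it against values that can be worked out by hand. The sketch below does two things: it checks that the cumulative sum of non-negative inputs never decreases, and it compares the last element with the closed-form total step * n * (n - 1) / 2, which is 4.999995e+10 for the example above. The function name `cumsum_looks_correct` is hypothetical.

```python
import numpy as np


def cumsum_looks_correct(n=1000000, step=0.1):
    """Return True if np.cumsum passes two sanity checks.

    Hypothetical check based on the example from the original report.
    """
    xx = np.arange(n) * step
    yyy = np.cumsum(xx)
    # Closed-form sum of 0, step, 2*step, ..., (n - 1)*step.
    expected_last = step * n * (n - 1) / 2.0
    # All inputs are non-negative, so the running total must not decrease.
    non_decreasing = bool(np.all(np.diff(yyy) >= 0))
    last_matches = bool(np.isclose(yyy[-1], expected_last, rtol=1e-6))
    return non_decreasing and last_matches


print(np.__version__, cumsum_looks_correct())
```

The erroneous output quoted at the top of the thread would fail both checks, since its final values drop from about 2e+05 to about 1e+05.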
Thanks a lot for that. I had to uninstall intelpython3_core/full (2017.0.2) to be able to install the 1.11.3 version of numpy here though, because conda said the two weren't compatible at the moment. But the updated package does indeed fix the cumsum problem.
Kinda makes me wonder: what is the intelpython3_core or full package needed for? Is it some kind of wrapper that contains a bunch of packages?
Thanks again for the quick fix!
You are essentially correct. The intelpython3_core package is a "metapackage": it contains no files of its own, but rather collects a set of other packages into a named unit for ease of installation. When you install a particular version of "intelpython3_core" or "intelpython3_full", you will get all the packages we released.
There should be no need for you to manually uninstall it. I will check the update logic to be certain it is working as expected.
Should I run this update on Intel Distribution for Python 3.5.2, too, or is it only necessary to run it for Intel Distribution for Python 2.7? I'm working with both versions on Windows, macOS and Linux.
The changes in the update are not specific to any particular version of Python, or to the platform. Updates were posted for all platforms, and for both Python 2.7 and Python 3.5.