I'm putting together a Docker image for PyTorch, and of the 1.8 GB deployment size, 800 MB is the conda MKL install. When doing machine learning work I can end up creating hundreds of temporary containers on a platform that charges by the second, so pulling a big image is painful. I'm not the first one to notice this - here are two other complaint threads - but the only solution I've seen proposed is to switch to OpenBLAS. That's not very appealing because, well, MKL is far faster.
One solution I have seen mentioned for MKL in general is to build a version that only targets a single architecture. That's not an option for the Python MKL package because its source isn't available. So, questions:
Does anyone know of any other way of bringing the MKL install size down?
Does anyone know how much smaller a single-platform MKL variant would be?
If anyone from the MKL team comes across this, could you consider publishing 'slim' MKL variants that target just a single architecture and leave out any rarely-used features?
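For the second question, a rough first step might be to see where the 800 MB actually goes. As far as I can tell, MKL ships separate kernel libraries per instruction set (files such as libmkl_avx2 and libmkl_avx512), so listing the libmkl* files by size should give a feel for how much is architecture-specific. This is only a sketch: the function name `mkl_library_sizes` and the `/opt/conda` fallback path are my own assumptions, and file naming may differ on non-Linux platforms.

```python
import os
from pathlib import Path


def mkl_library_sizes(lib_dir):
    """Return (filename, size_in_bytes) for each libmkl* file, largest first."""
    lib_dir = Path(lib_dir)
    if not lib_dir.is_dir():
        # Nothing to report if the directory does not exist.
        return []
    sizes = []
    for path in lib_dir.iterdir():
        # Skip symlinks so versioned aliases are not counted twice.
        if path.is_symlink() or not path.is_file():
            continue
        if path.name.startswith("libmkl"):
            sizes.append((path.name, path.stat().st_size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: conda keeps shared libraries under $CONDA_PREFIX/lib.
    prefix = os.environ.get("CONDA_PREFIX", "/opt/conda")
    results = mkl_library_sizes(Path(prefix) / "lib")
    for name, size in results:
        print(f"{size / 1e6:8.1f} MB  {name}")
    total = sum(size for _, size in results)
    print(f"{total / 1e6:8.1f} MB  total")
```

Running this inside the container would show which files dominate. It doesn't say which of them are safe to remove, so it only gives an upper bound on what a single-architecture build could save.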