I was trying to adapt the Spark SVD example provided with the library so that it runs with Spark on a Kubernetes cluster.
Are there any existing attempts to do this?
So far, the issue seems to be that the official Spark Docker image is based on Alpine Linux, which uses musl libc and therefore conflicts with glibc-dependent binaries (this is a known issue). Even though I have spent a significant amount of time integrating the glibc compatibility layer and then installing MKL and DAAL on top of it, I have not been able to get a successful run.
Below I have appended the current Dockerfile for the image build. At the moment, I am getting the following error:
Exception in thread "main" java.lang.UnsatisfiedLinkError: /opt/intel/compilers_and_libraries_2018.3.222/linux/daal/lib/intel64_lin/libJavaAPI.so: Error relocating /opt/intel/compilers_and_libraries_2018.3.222/linux/daal/lib/intel64_lin/libJavaAPI.so: __snprintf_chk: symbol not found
FROM openjdk:8-alpine

RUN apk update && \
    apk add bash

ARG spark_jars=jars
ARG img_path=kubernetes/dockerfiles

ENV GLIBC_VERSION 2.27-r0
ENV ALPINE_VERSION 3.7

# Download and install glibc.
RUN apk add --update curl && \
    curl -Lo /etc/apk/keys/sgerrand.rsa.pub https://raw.githubusercontent.com/sgerrand/alpine-pkg-glibc/master/sgerrand.rsa.pub && \
    curl -Lo glibc.apk "https://github.com/sgerrand/alpine-pkg-glibc/releases/download/${GLIBC_VERSION}/glibc-${GLIBC_VERSION}.apk" && \
    curl -Lo glibc-bin.apk "https://github.com/sgerrand/alpine-pkg-glibc/releases/download/${GLIBC_VERSION}/glibc-bin-${GLIBC_VERSION}.apk" && \
    apk add glibc-bin.apk glibc.apk && \
    /usr/glibc-compat/sbin/ldconfig /lib /usr/glibc-compat/lib && \
    echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf && \
    apk del curl && \
    rm -rf glibc.apk glibc-bin.apk /var/cache/apk/*

# Install OpenBLAS and ARPACK from the community repository.
RUN apk update && set -ex ;\
    echo "@community http://dl-cdn.alpinelinux.org/alpine/v$ALPINE_VERSION/community" >> /etc/apk/repositories ;\
    apk update; \
    apk add --no-cache --update libstdc++ \
        g++ \
        gfortran \
        libgfortran \
        openblas-dev@community \
        openblas@community \
        arpack@community \
        arpack-dev@community \
    ;\
    rm /var/cache/apk/*;

# Since existing libraries might not be linked properly, force the linking manually.
# This variant is for locally built dependencies.
#RUN ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/libblas.so && \
#    ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/libblas.so.3 && \
#    ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/libblas.so.3.5 && \
#    ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/liblapack.so && \
#    ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/liblapack.so.3 && \
#    ln -s -f /usr/OpenBLAS/lib/libopenblas.so /usr/lib/liblapack.so.3.5 && \
#    ln -s -f /usr/OpenBLAS/include/cblas.h /usr/OpenBLAS/lib/clapack.h;

# These are for apk-installed dependencies.
RUN ln -s -f /usr/lib/libopenblas.so /usr/lib/libblas.so && \
    ln -s -f /usr/lib/libopenblas.so /usr/lib/libblas.so.3 && \
    ln -s -f /usr/lib/libopenblas.so /usr/lib/libblas.so.3.5 && \
    ln -s -f /usr/lib/libopenblas.so /usr/lib/liblapack.so && \
    ln -s -f /usr/lib/libopenblas.so /usr/lib/liblapack.so.3 && \
    ln -s -f /usr/lib/libopenblas.so /usr/lib/liblapack.so.3.5 && \
    ln -s -f /usr/include/cblas.h /usr/include/clapack.h;

RUN apk update && \
    /usr/glibc-compat/sbin/ldconfig /usr/lib;

# The painful process of installing DAAL (and MKL, and a whole lot of other stuff).
# Lots of hacky magic happening.
RUN cd /sbin && \
    mv ldconfig ldconfig_musl && \
    ln -s /usr/glibc-compat/sbin/ldconfig ldconfig && \
    apk add --no-cache --update curl && \
    curl -fL https://raw.githubusercontent.com/orctom/alpine-glibc-packages/master/usr/lib/libstdc++.so.6.0.21 -o /usr/lib/libstdc++.so.6.0.21 && \
    ln -sf /usr/lib/libstdc++.so.6.0.21 /usr/lib/libstdc++.so.6.new; \
    mv /usr/lib/libstdc++.so.6.new /usr/lib/libstdc++.so.6 && \
    wget http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/13005/l_mkl_2018.3.222.tgz && \
    tar -zxvf l_mkl_2018.3.222.tgz && \
    cd l_mkl_2018.3.222 && \
    sed -i 's/ACCEPT_EULA=decline/ACCEPT_EULA=accept/g' silent.cfg && \
    ./install.sh -s silent.cfg && \
    cd .. && \
    rm -rf l_mkl_2018.3.222 && \
    rm -rf l_mkl_2018.3.222.tgz && \
    wget http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/13007/l_daal_2018.3.222.tgz && \
    tar -zxvf l_daal_2018.3.222.tgz && \
    cd l_daal_2018.3.222 && \
    sed -i 's/ACCEPT_EULA=decline/ACCEPT_EULA=accept/g' silent.cfg && \
    ./install.sh -s silent.cfg && \
    cd .. && \
    rm -rf l_daal_2018.3.222 && \
    rm -rf l_daal_2018.3.222.tgz && \
    cd /opt/intel/compilers_and_libraries_2018.3.222/linux/daal/bin/ && \
    ./daalvars.sh intel64;

RUN echo "/opt/intel/mkl/lib/intel64" >> /usr/glibc-compat/etc/ld.so.conf && \
    touch ~/.profile;

RUN echo ". /opt/intel/bin/compilervars.sh intel64" >> ~/.profile && \
    echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/opt/intel/daal/lib/intel64" >> ~/.profile && \
    echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/usr/glibc-compat/lib" >> ~/.profile && \
    echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/usr/lib" >> ~/.profile && \
    echo "export PATH=\$PATH:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/tbb/lib/intel64_lin/gcc4.7:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/daal/../tbb/lib/intel64_lin/gcc4.4" >> ~/.profile && \
    /usr/glibc-compat/sbin/ldconfig;

## The following is provided mostly as-is by the Spark distro.
# Before building the docker image, first build and make a Spark distribution following
# the instructions in http://spark.apache.org/docs/latest/building-spark.html.
# If this docker file is being used in the context of building your images from a Spark
# distribution, the docker build command should be invoked from the top level directory
# of the Spark distribution. E.g.:
# docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .
RUN set -ex && \
    apk upgrade --no-cache && \
    apk add --no-cache bash tini && \
    mkdir -p /opt/spark && \
    mkdir -p /opt/spark/work-dir && \
    touch /opt/spark/RELEASE && \
    rm /bin/sh && \
    ln -sv /bin/bash /bin/sh && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd

COPY ${spark_jars} /opt/spark/jars
COPY bin /opt/spark/bin
COPY sbin /opt/spark/sbin
COPY conf /opt/spark/conf
COPY ${img_path}/spark/entrypoint.sh /opt/
COPY examples /opt/spark/examples
COPY data /opt/spark/data

ENV SPARK_HOME /opt/spark

# Make sure the Intel libraries are visible to the dynamic loader.
ENV LD_LIBRARY_PATH /opt/intel/daal/lib/intel64:/usr/glibc-compat/lib:/usr/lib/
ENV PATH $PATH:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/tbb/lib/intel64_lin/gcc4.7:/opt/intel/compilers_and_libraries_2018.3.222/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.3.222/linux/daal/../tbb/lib/intel64_lin/gcc4.4

WORKDIR /opt/spark/work-dir

RUN apk update

ENTRYPOINT [ "/opt/entrypoint.sh" ]
We didn't experiment with Alpine Linux. The problem appears to be common for Alpine Linux users - the glibc compatibility package is mostly experimental.
You may try to work around the undefined __snprintf_chk in the following way: write your own function that is actually a wrapper around vsnprintf. Something like this:
#include <stdarg.h>
#include <stdio.h>

/* Shim for the glibc fortify symbol __snprintf_chk, which musl does not
   provide: ignore the fortify arguments and forward to vsnprintf. */
int __snprintf_chk (char *s, size_t maxlen, int flags, size_t slen,
                    const char *format, ...)
{
  va_list arg;
  int done;
  va_start (arg, format);
  done = vsnprintf (s, maxlen, format, arg);
  va_end (arg);
  return done;
}
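To actually use such a shim, one plausible route (untested here, and the file names below are placeholders) is to compile it into a small shared object and preload it into the JVM process:

# Build the shim (snprintf_chk_shim.c holds the function above):
gcc -shared -fPIC -o libsnprintf_chk_shim.so snprintf_chk_shim.c
# Preload it so the loader resolves __snprintf_chk from the shim before
# libJavaAPI.so is opened:
export LD_PRELOAD=/path/to/libsnprintf_chk_shim.so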
Hm, I am not sure how much sense that would make, because it is likely that other symbols are broken, too, and I wouldn't want to implement all of them myself.
The reason I was asking about Alpine specifically is that it is the base of the default Docker image for Spark on Kubernetes. In the end, it cost me less time to switch to an Ubuntu-based image, with which everything seems to work properly.
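For anyone taking the same route, a minimal sketch of such an Ubuntu-based starting point (the image tag and package list here are assumptions, not the exact file I ended up with):

FROM ubuntu:18.04
# Native glibc, so the Intel MKL/DAAL silent installers run unmodified and
# no glibc-compat layer or symbol shims are needed.
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-8-jdk wget curl && \
    rm -rf /var/lib/apt/lists/*
# ... followed by the same MKL/DAAL silent install and Spark COPY steps as in
# the Alpine Dockerfile above ...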
