
OpenCL Runtime for Intel Xeon Processors on a VMware VPS

Istvan_V_
Beginner

I rent a VPS.

OS: Debian 9, CLI only.

My problem is that the server is very slow when I run hashcat, so I would like to know:

Should I install the OpenCL Runtime for Intel Core and Intel Xeon Processors?

  1. Do I need only the OpenCL™ 2.0 CPU Driver Package for Linux* (64-bit)?
  2. Should I also install the Intel® SDK for OpenCL™ Applications 2016 R2 for Linux* (64-bit)?
  3. My VPS is not a GPU-accelerated server. I would like to get more out of the server's CPU. Can installing these two packages improve its performance?

Thank you for your help.

freeroute@freeroute:~$ uname -a
Linux freeroute 4.9.0-kali4-amd64 #1 SMP Debian 4.9.30-2kali1 (2017-06-22) x86_64 GNU/Linux

freeroute@freeroute:~$ lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 08)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:07.7 System peripheral: VMware Virtual Machine Communication Interface (rev 10)
00:0f.0 VGA compatible controller: VMware SVGA II Adapter
00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
00:11.0 PCI bridge: VMware PCI bridge (rev 02)
00:15.0 PCI bridge: VMware PCI Express Root Port (rev 01)

00:18.7 PCI bridge: VMware PCI Express Root Port (rev 01)
02:00.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
02:01.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
02:02.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)

cat /proc/cpuinfo:
processor    : 0
vendor_id    : GenuineIntel
cpu family    : 6
model        : 45
model name    : Intel(R) Xeon(R) CPU E5-2650L v4 @ 1.70GHz
stepping    : 2
microcode    : 0xb00001f
cpu MHz        : 1699.076
cache size    : 35840 KB
physical id    : 0
siblings    : 1
core id        : 0
cpu cores    : 1
apicid        : 0
initial apicid    : 0
fpu        : yes
fpu_exception    : yes
cpuid level    : 13
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx hypervisor lahf_lm epb dtherm ida arat pln pts
bugs        :
bogomips    : 3399.99
clflush size    : 64
cache_alignment    : 64
address sizes    : 40 bits physical, 48 bits virtual
power management:

 

1 Solution
Jeffrey_M_Intel1
Employee

Installing both the CPU driver/runtime and the SDK will give you the full range of capabilities and tools. The distribution is split into two parts:

  • Driver/runtime packages: these are the redistributable components an end user would need to run OpenCL applications
  • SDK: tools, IDE integration, offline compiler, etc. for developing OpenCL applications
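
One way to see whether a driver/runtime package is visible to applications is to look at the ICD registry. The sketch below assumes the common Khronos ICD loader convention on Linux, where each installed runtime registers a small `.icd` text file under `/etc/OpenCL/vendors`. The helper name `list_opencl_icds` is made up for illustration, and your distribution may use a different directory.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Return {icd file name: library named inside it} for one directory.

    Assumption: the system uses the Khronos ICD loader layout, where every
    *.icd file holds the name or path of one OpenCL runtime library.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    return {
        icd.name: icd.read_text().strip()
        for icd in sorted(directory.glob("*.icd"))
    }


if __name__ == "__main__":
    found = list_opencl_icds()
    if not found:
        print("No OpenCL runtime registered in the default directory.")
    for name, library in found.items():
        print(f"{name}: {library}")
```

If the result is empty, an OpenCL application will most likely report that it found no platform, and installing the driver/runtime package is the first step. The SDK is not needed just to run an existing application.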

As you pointed out, one of the main reasons for using OpenCL is access to accelerators such as GPUs or FPGAs. On a CPU-only system there is no additional hardware to give you a boost. However, there are still reasons to consider CPU OpenCL:

  • Portability: OpenCL is not performance portable (code tuned for one device is not automatically fast on another), but a reference implementation written in OpenCL gets you much further toward an optimized OpenCL implementation than starting from ordinary non-OpenCL code.
  • Optimization: the CPU implementation has some nice automatic vectorization and threading. OpenCL is not the only way to achieve this on a CPU, but if your algorithm is a good match for OpenCL NDRange partitioning of SIMD operations, this could be a quicker path toward better CPU utilization than other methods.
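
To make "NDRange partitioning of SIMD operations" concrete, here is a purely conceptual Python model. It assumes a 1-D NDRange and a hypothetical vector width of 8 lanes. The function names are illustrative, and the real runtime's scheduling and vectorization are more involved than this.

```python
def work_groups(global_size, local_size):
    """Split a 1-D NDRange of work-items into work-groups of local_size.

    This sketch requires an even split to keep the model simple.
    """
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    return [range(start, start + local_size)
            for start in range(0, global_size, local_size)]


def simd_batches(group, lanes):
    """Pack consecutive work-items of one work-group into SIMD-width batches.

    Each batch stands for one vector instruction executing up to `lanes`
    work-items at once; the last batch may be only partly filled.
    """
    items = list(group)
    return [items[i:i + lanes] for i in range(0, len(items), lanes)]


if __name__ == "__main__":
    # 32 work-items, 16 per work-group, 8 lanes (hypothetical vector width).
    for index, group in enumerate(work_groups(32, 16)):
        print(f"work-group {index}: {simd_batches(group, 8)}")
```

In this model, separate work-groups are the units that could be spread across threads, while the batches inside a work-group are what the vectorizer would turn into SIMD instructions.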

Since your VPS gives you only 1 "processor", any performance improvement may come down to vectorization alone. Since hashcat appears to have CPU OpenCL optimizations built in, it should be an easy experiment to see how much improvement you get. Without more cores to distribute the work, you may simply be comparing the autovectorization of the C/C++ CPU compiler against that of the compiler in the OpenCL CPU runtime.
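
Because vectorization may be the only lever here, it is worth checking which SIMD extensions the virtual CPU actually exposes. This is a minimal sketch, assuming the `flags` line format shown in the `/proc/cpuinfo` output in the question. The helper name and the short list of extensions are illustrative choices, not a complete inventory.

```python
# Widest first: (flag in /proc/cpuinfo, vector register width in bits).
SIMD_LEVELS = [("avx512f", 512), ("avx2", 256), ("avx", 256), ("sse2", 128)]


def widest_simd(flags_line):
    """Return (flag, width in bits) of the widest listed SIMD extension found.

    Returns None if none of the listed extensions is present.
    """
    # Accept either the raw line "flags : fpu vme ..." or just the flag list.
    flags = set(flags_line.split(":", 1)[-1].split())
    for flag, bits in SIMD_LEVELS:
        if flag in flags:
            return flag, bits
    return None


if __name__ == "__main__":
    # Flags abbreviated from the /proc/cpuinfo output posted in the question.
    sample = ("flags : fpu sse sse2 pni ssse3 sse4_1 sse4_2 "
              "popcnt aes xsave avx hypervisor")
    print(widest_simd(sample))
```

On the VPS itself, pass the `flags` line from `/proc/cpuinfo` instead of the sample string. The posted flags include `avx` but not `avx2`; AVX without AVX2 widens floating-point operations only, so integer-heavy work such as hashing would still run on 128-bit instructions.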
