<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic: Installing both the CPU driver/runtime and the SDK in OpenCL* for CPU</title>
    <link>https://community.intel.com/t5/OpenCL-for-CPU/OpenCL-Runtime-for-Intel-Xeon-Processors-Wmvare-VPS/m-p/1153542#M6138</link>
    <description>&lt;P&gt;Installing both the CPU driver/runtime and the SDK will give you the full range of capabilities and tools. The distribution is split into:&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;Driver/runtime packages: these are the redistributable components an end user would need to run OpenCL applications&lt;/LI&gt;
	&lt;LI&gt;SDK: tools, IDE integration, offline compiler, etc. for developing OpenCL applications&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;As you pointed out, one of the main reasons for using OpenCL is access to accelerators such as GPUs or FPGAs. On a CPU you would not get a boost from additional hardware. However, there are still reasons to consider CPU OpenCL:&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;Portability: OpenCL is not performance portable, but a reference implementation written in OpenCL gets you much further toward an optimized OpenCL implementation than starting from standard code.&lt;/LI&gt;
	&lt;LI&gt;Optimization: the CPU implementation provides automatic vectorization and threading. OpenCL is not the only way to achieve this on a CPU, but if your algorithm is a good match for OpenCL's NDRange partitioning of SIMD operations, this can be a quicker path to better CPU utilization than other methods.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Since your VPS only gives you one "processor", any performance improvement may come down to vectorization alone. Since hashcat appears to have CPU OpenCL optimizations built in, it should be an easy experiment to see how much improvement you get. Without more cores to distribute the work across, you may effectively be comparing the autovectorization of the C/C++ compiler against the compiler in the OpenCL CPU runtime.&lt;/P&gt;</description>
    <pubDate>Tue, 25 Jul 2017 23:08:53 GMT</pubDate>
    <dc:creator>Jeffrey_M_Intel1</dc:creator>
    <dc:date>2017-07-25T23:08:53Z</dc:date>
    <item>
      <title>OpenCL Runtime for Intel Xeon Processors on a VMware VPS</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/OpenCL-Runtime-for-Intel-Xeon-Processors-Wmvare-VPS/m-p/1153541#M6137</link>
      <description>&lt;P&gt;I rent a VPS.&lt;/P&gt;

&lt;P&gt;OS: Debian 9, CLI only.&lt;/P&gt;

&lt;P&gt;My problem is that the server is very slow (I use &lt;A href="https://hashcat.net/hashcat/"&gt;hashcat&lt;/A&gt;), so I would like to know:&lt;/P&gt;

&lt;P&gt;Should I install the &lt;A href="https://software.intel.com/en-us/articles/opencl-drivers#latest_CPU_runtime"&gt;driver for the OpenCL Runtime&lt;/A&gt; for Intel Core and Intel Xeon Processors?&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;Do I need only the &lt;A href="https://software.intel.com/en-us/articles/opencl-drivers#latest_CPU_runtime"&gt;OpenCL™ 2.0 CPU Driver Package for Linux* (64-bit)&lt;/A&gt;?&lt;/STRONG&gt;&lt;/EM&gt;&lt;/LI&gt;
	&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;&lt;A href="https://software.intel.com/en-us/articles/opencl-drivers#latest_linux_SDK_release" rel="nofollow"&gt;Shall I also install the Intel® SDK for OpenCL™ Applications 2016 R2 for Linux* (64-bit)?&lt;/A&gt;&lt;/STRONG&gt;&lt;/EM&gt;&lt;/LI&gt;
	&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;My VPS is not a GPU-accelerated server. I would like to increase the server's CPU performance. Is it possible to do so by installing these two packages?&lt;/STRONG&gt;&lt;/EM&gt;&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;Thank you for your help.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;freeroute@freeroute:~$ &lt;STRONG&gt;uname -a&lt;/STRONG&gt;&lt;BR /&gt;
	&lt;STRONG&gt;Linux freeroute 4.9.0-kali4-amd64 #1 SMP Debian 4.9.30-2kali1 (2017-06-22) x86_64 GNU/Linux&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;freeroute@freeroute:~$&lt;STRONG&gt; lspci&lt;/STRONG&gt;&lt;BR /&gt;
	00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)&lt;BR /&gt;
	00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01)&lt;BR /&gt;
	00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 08)&lt;BR /&gt;
	00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)&lt;BR /&gt;
	00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)&lt;BR /&gt;
	00:07.7 System peripheral: VMware Virtual Machine Communication Interface (rev 10)&lt;BR /&gt;
	&lt;STRONG&gt;00:0f.0 VGA compatible controller: VMware SVGA II Adapter&lt;/STRONG&gt;&lt;BR /&gt;
	00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)&lt;BR /&gt;
	00:11.0 PCI bridge: VMware PCI bridge (rev 02)&lt;BR /&gt;
	00:15.0 PCI bridge: VMware PCI Express Root Port (rev 01)&lt;BR /&gt;
	&lt;BR /&gt;
	00:18.7 PCI bridge: VMware PCI Express Root Port (rev 01)&lt;BR /&gt;
	02:00.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)&lt;BR /&gt;
	02:01.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)&lt;BR /&gt;
	02:02.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;cat /proc/cpuinfo:&lt;/STRONG&gt;&lt;BR /&gt;
	processor&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0&lt;BR /&gt;
	vendor_id&amp;nbsp;&amp;nbsp; &amp;nbsp;: GenuineIntel&lt;BR /&gt;
	cpu family&amp;nbsp;&amp;nbsp; &amp;nbsp;: 6&lt;BR /&gt;
	model&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: 45&lt;BR /&gt;
	&lt;STRONG&gt;model name&amp;nbsp;&amp;nbsp; &amp;nbsp;: Intel(R) Xeon(R) CPU E5-2650L v4 @ 1.70GHz&lt;/STRONG&gt;&lt;BR /&gt;
	stepping&amp;nbsp;&amp;nbsp; &amp;nbsp;: 2&lt;BR /&gt;
	microcode&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0xb00001f&lt;BR /&gt;
	cpu MHz&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: 1699.076&lt;BR /&gt;
	cache size&amp;nbsp;&amp;nbsp; &amp;nbsp;: 35840 KB&lt;BR /&gt;
	physical id&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0&lt;BR /&gt;
	siblings&amp;nbsp;&amp;nbsp; &amp;nbsp;: 1&lt;BR /&gt;
	core id&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0&lt;BR /&gt;
	cpu cores&amp;nbsp;&amp;nbsp; &amp;nbsp;: 1&lt;BR /&gt;
	apicid&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0&lt;BR /&gt;
	initial apicid&amp;nbsp;&amp;nbsp; &amp;nbsp;: 0&lt;BR /&gt;
	fpu&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: yes&lt;BR /&gt;
	fpu_exception&amp;nbsp;&amp;nbsp; &amp;nbsp;: yes&lt;BR /&gt;
	cpuid level&amp;nbsp;&amp;nbsp; &amp;nbsp;: 13&lt;BR /&gt;
	wp&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: yes&lt;BR /&gt;
	&lt;STRONG&gt;flags&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;: &lt;/STRONG&gt;fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx hypervisor lahf_lm epb dtherm ida arat pln pts&lt;BR /&gt;
	bugs&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;:&lt;BR /&gt;
	bogomips&amp;nbsp;&amp;nbsp; &amp;nbsp;: 3399.99&lt;BR /&gt;
	clflush size&amp;nbsp;&amp;nbsp; &amp;nbsp;: 64&lt;BR /&gt;
	cache_alignment&amp;nbsp;&amp;nbsp; &amp;nbsp;: 64&lt;BR /&gt;
	address sizes&amp;nbsp;&amp;nbsp; &amp;nbsp;: 40 bits physical, 48 bits virtual&lt;BR /&gt;
	power management:&lt;/P&gt;

</description>
      <pubDate>Tue, 25 Jul 2017 13:00:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/OpenCL-Runtime-for-Intel-Xeon-Processors-Wmvare-VPS/m-p/1153541#M6137</guid>
      <dc:creator>Istvan_V_</dc:creator>
      <dc:date>2017-07-25T13:00:07Z</dc:date>
    </item>
    <item>
      <title>Installing both the CPU driver/runtime and the SDK</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/OpenCL-Runtime-for-Intel-Xeon-Processors-Wmvare-VPS/m-p/1153542#M6138</link>
      <description>&lt;P&gt;Installing both the CPU driver/runtime and the SDK will give you the full range of capabilities and tools. The distribution is split into:&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;Driver/runtime packages: these are the redistributable components an end user would need to run OpenCL applications&lt;/LI&gt;
	&lt;LI&gt;SDK: tools, IDE integration, offline compiler, etc. for developing OpenCL applications&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;As you pointed out, one of the main reasons for using OpenCL is access to accelerators such as GPUs or FPGAs. On a CPU you would not get a boost from additional hardware. However, there are still reasons to consider CPU OpenCL:&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;Portability: OpenCL is not performance portable, but a reference implementation written in OpenCL gets you much further toward an optimized OpenCL implementation than starting from standard code.&lt;/LI&gt;
	&lt;LI&gt;Optimization: the CPU implementation provides automatic vectorization and threading. OpenCL is not the only way to achieve this on a CPU, but if your algorithm is a good match for OpenCL's NDRange partitioning of SIMD operations, this can be a quicker path to better CPU utilization than other methods.&lt;/LI&gt;
&lt;/UL&gt;
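To make the NDRange point concrete, here is a minimal sketch (plain Python, with illustrative sizes of my own choosing, not anything from the runtime) of how a 1-D NDRange splits global work-item ids into work-group and local ids; a CPU runtime would typically map each work-group to a thread and vectorize across the local ids:

```python
# Sketch of OpenCL's 1-D NDRange decomposition: every global work-item id
# is split into a (work-group id, local id) pair. Sizes are illustrative.
def ndrange_partition(global_size, local_size):
    """Return (group_id, local_id, global_id) for each work-item."""
    if global_size % local_size != 0:
        raise ValueError("global size must be a multiple of local size")
    return [(gid // local_size, gid % local_size, gid)
            for gid in range(global_size)]

# 16 work-items in work-groups of 4: four groups, each a candidate for
# one thread, with SIMD lanes covering the four local ids.
items = ndrange_partition(16, 4)
```

On a single-core VPS the thread-level part of this split buys nothing, which is why any gain would have to come from the vectorization across local ids.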

&lt;P&gt;Since your VPS only gives you one "processor", any performance improvement may come down to vectorization alone. Since hashcat appears to have CPU OpenCL optimizations built in, it should be an easy experiment to see how much improvement you get. Without more cores to distribute the work across, you may effectively be comparing the autovectorization of the C/C++ compiler against the compiler in the OpenCL CPU runtime.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2017 23:08:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/OpenCL-Runtime-for-Intel-Xeon-Processors-Wmvare-VPS/m-p/1153542#M6138</guid>
      <dc:creator>Jeffrey_M_Intel1</dc:creator>
      <dc:date>2017-07-25T23:08:53Z</dc:date>
    </item>
  </channel>
</rss>

