Hello friends,
I am new to Intel Composer XE. I have good knowledge of parallel programming on GPUs with CUDA and OpenCL. I want to learn the Intel Composer XE tools: icc, MKL, and IPP. I have read all the installation guides and tutorials, but can anyone suggest how I should start programming?
That means:
How do I use a single core and multiple cores of my processor?
How do I divide my execution across different cores?
Please help me! I am using an Intel i7 processor.
Thanks,
48 Replies
Hi,
Welcome to x86 parallel programming. I would suggest starting with our compiler User and Reference Guides to learn how to use the Intel compiler (ICC).
http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-mac/index.htm
This document discusses the various multi-threading models supported by the Intel compiler, which you can use to run your code efficiently on all the available cores of your system. Have you purchased an Intel compiler or another suite such as Parallel Studio? If not, you can download an evaluation copy from http://software.intel.com/en-us/intel-parallel-studio-XE-2013-evaluation-options . Please feel free to post your questions.
Regards,
Sukruth H V
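As a rough illustration of dividing a loop across cores, here is a minimal sketch using OpenMP, one of the threading models the compiler supports. The function name `parallel_sum` is made up for this example, and it assumes OpenMP is enabled at compile time (`-openmp` on the icc versions of this era, `-fopenmp` with GCC; check your compiler's documentation). Without that flag the pragma is ignored and the loop simply runs on one core.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sum n ints, letting OpenMP split the loop
// iterations across the available cores. Each thread accumulates a
// private partial sum; the reduction clause combines them at the end.
long long parallel_sum(const int* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    long long total = 0;
    const long long count = static_cast<long long>(n);
    #pragma omp parallel for reduction(+ : total)
    for (long long i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling `parallel_sum(A, n)` on an array of `n` ones should return `n`, whether or not OpenMP is active; only the number of cores used changes.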
That's very good; I would like more.
I am checking the sample examples now.
Take a look at the folder [ CompilerDIR ]\Samples\en_US\C++... (or so); you will find several C/C++ examples there.
Hello,
take a look at our Content Library:
http://software.intel.com/en-us/search/site
Search for everything you're interested in (e.g. OpenMP*, Intel(R) Threading Building Blocks, Intel(R) Cilk(TM) Plus, ...).
Make sure to select the proper filters, as you'll get flooded with results otherwise. I suggest filtering for "Article", "Blog post", and "Courseware"; those three can also contain interesting examples.
Best regards,
Georg Zitzlsberger
Thanks :)
Hello,
Can anyone give me a link where all MKL library functions are listed WITH DESCRIPTIONS? I have read the following link:
http://software.intel.com/sites/products/documentation/hpc/mkl/mkl_userguide_lnx/index.htm
I've been searching myself, and I don't think there is a single satisfactory guide to "all" MKL functions. In several categories (e.g. BLAS, LAPACK, FFTW), compatibility with open source libraries is maintained, so literature on those libraries is applicable.
Specific questions about MKL should be posed on the MKL forum http://software.intel.com/en-us/forums/intel-math-kernel-library
Unfortunately, today I don't see the additional references which ought to be at the top of that forum.
Okay, thank you very much!!! :)
When I install Intel Composer XE, it gives me the following message:
ERROR: Package is corrupted. Installation cannot continue.
The full output is below (I downloaded the package from the Intel site).
Please tell me what I have to do.
Step no: 6 of 7 | Installation
--------------------------------------------------------------------------------
Each component will be installed individually. If you cancel the installation,
components that have been completely installed will remain on your system. This
installation may take several minutes, depending on your system and the options
you selected.
--------------------------------------------------------------------------------
Installing Amplifier XE Command line interface component...
ERROR: Package is corrupted. Installation cannot continue.
--------------------------------------------------------------------------------
Installing Inspector XE Command line interface component...
ERROR: Package is corrupted. Installation cannot continue.
--------------------------------------------------------------------------------
Installing Advisor XE Command line interface component...
ERROR: Package is corrupted. Installation cannot continue.
--------------------------------------------------------------------------------
Installing Intel C++ Compiler XE 13.0 Update 1 on IA-32 component... failed
--------------------------------------------------------------------------------
Installing Intel C++ Compiler XE 13.0 Update 1 on Intel(R) 64 component... done
--------------------------------------------------------------------------------
Installing Intel Debugger 13.0 Update 1 on IA-32 component... failed
--------------------------------------------------------------------------------
Installing Intel Debugger 13.0 Update 1 on Intel(R) 64 component... done
--------------------------------------------------------------------------------
Installing Intel Math Kernel Library 11.0 Update 1 on IA-32 component... failed
--------------------------------------------------------------------------------
Installing Intel Math Kernel Library 11.0 Update 1 on Intel(R) 64 component... done
--------------------------------------------------------------------------------
Installing Intel Integrated Performance Primitives 7.1 Update 1 on IA-32
component... failed
--------------------------------------------------------------------------------
Installing Intel Integrated Performance Primitives 7.1 Update 1 on Intel(R) 64
component... done
--------------------------------------------------------------------------------
Installing Intel Threading Building Blocks 4.1 Update 1 core files and examples
component... done
--------------------------------------------------------------------------------
Finalizing installation... done
--------------------------------------------------------------------------------
Hello,
in that case I'd recommend downloading it again. There are known issues where the download is interrupted. Please always verify the size of the downloaded tarball. In addition, you can compare the MD5 sum with that of the latest Intel(R) Composer XE 2013 Update 1 packages:
[bash]
$ md5sum l_ccompxe_2013.1.117.tgz
8796a1a1e5c98107ca69c75a7aa2b379 l_ccompxe_2013.1.117.tgz
$ md5sum l_fcompxe_2013.1.117.tgz
355c201ef30167580e5b0dfc217fbbe8 l_fcompxe_2013.1.117.tgz
[/bash]
Use the link "Start download with a download manager" when downloading the packages. This should always work.
Best regards,
Georg Zitzlsberger
Thanks, I will download the file again and then try to install it.
Hi Georg Zitzlsberger,
I am trying to write a program that calculates the aggregate sum of the elements of a vector.
Example:
Suppose A = {1,2,3,4,5,6,7,8,9,0}
Result = 45
The best run time I get is 140 to 150 milliseconds for 600,000,000 (600 million) numbers.
Can the execution time be reduced?
I have written the following program.
#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>
#include <cstdlib>
#include <ctime>
#include <iostream>
unsigned int compute(unsigned int i)
{
    return i; // return a value computed from i
}
int main(int argc, char* argv[])
{
    unsigned int n = 400000000;
    int *A = (int *)malloc(sizeof(int)*n);
    // Fill the array with ones so the expected sum is n
    for(unsigned int i = 0; i < n; i++)
    {
        A[i] = 1;
    }
    cilk::reducer_opadd<int> total;
    // Sum all n elements in parallel (i < n, so the loop stays in bounds)
    std::clock_t start = std::clock();
    cilk_for(unsigned int i = 0; i < n; ++i)
    {
        total += A[i];
    }
    std::cout << "Total (" << total.get_value()
              << ") is correct";
    std::cout << "Total Time : "<<( double( std::clock() - start ) /double(CLOCKS_PER_SEC/1000)) <<'\n';
    free(A);
    return 0;
}
Command for compilation : #icc -fast -parallel filename.c filename
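One caveat about the timing in the program above: `std::clock` reports processor time used by the program, and on some platforms (Linux, for example) that typically adds up the CPU time of all threads, so it can overstate the elapsed time of a threaded loop. A minimal sketch of a wall-clock measurement with `std::chrono::steady_clock` follows; the helper name `elapsed_ms` is invented for this example and is not part of any Intel library.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `work` once and return the elapsed
// wall-clock time in milliseconds. steady_clock is monotonic, so the
// result is not affected by adjustments to the system clock.
template <typename Work>
double elapsed_ms(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    assert(ms >= 0.0); // a monotonic clock never runs backwards
    return ms;
}
```

The summation loop can then be wrapped in a lambda and passed to `elapsed_ms`, which keeps the timing code separate from the code being measured.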
Is there any MKL function for the aggregate sum of vector elements?
Do you mean the BLAS ?sum functions? Evidently, with only 9 elements, those will be slower than any reasonable inlined method, such as accumulate() or __sec_reduce_add(). Even the compiler's inline sum-reduction optimizations may not be effective for such a short vector, and it may be worthwhile to prevent the compiler from using AVX.
If you succeed in forcing threading on such a small case, you may succeed in running slower than MKL.
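For reference, a minimal sketch of the inlined approach for a short vector, assuming accumulate() here means `std::accumulate` from `<numeric>`. The names `small_sum` and `check_small_sum` are invented for this example; the data is the ten-element example from the question.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Serial sum of a short vector. std::accumulate is a header-only
// template, so the compiler can inline it and there is no library
// call or threading overhead.
int small_sum(const std::vector<int>& values)
{
    return std::accumulate(values.begin(), values.end(), 0);
}

// Self-check with the example from the thread: the sum should be 45.
void check_small_sum()
{
    assert(small_sum({1, 2, 3, 4, 5, 6, 7, 8, 9, 0}) == 45);
}
```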
Yes, friend, that was only a small sample. In my program I used 600,000,000 elements and computed their sum; it gives the result in 140 to 150 ms, and I want more optimization.
Hello,
I doubt that this simple example has much room for improvement. The compiler should already create sufficiently fast code.
Using threading for this (trivial) workload is likely to add overhead. The Intel(R) Cilk(TM) Plus runtime takes care of balancing the grain size, though.
Hence, I don't see further room for improvement that would justify the effort for this example.
Best regards,
Georg Zitzlsberger
So that means the code is already optimized.
Thanks,
Hello,
you cannot download the videos. You need a Flash* player to view them.
Best regards,
Georg Zitzlsberger
A common practice is to reduce the frequency of reductions:
[cpp]
#include "stdafx.h"
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>
int _tmain(int argc, _TCHAR* argv[])
{
    unsigned int n = 400000000;
    int *A = (int *)malloc(sizeof(int)*n);
    // Fill the array with ones so the expected sum is n
    for(unsigned int i = 0; i < n; i++)
    {
        A[i] = 1;
    }
    cilk::reducer_opadd<int> total;
    // Version 1: update the reducer once per element
    total.set_value(0);
    clock_t start = clock();
    cilk_for(unsigned int i = 0; i < n; ++i)
    {
        total += A[i];
    }
    std::cout << "Total (" << total.get_value()
              << ") is correct";
    std::cout << "Total Time : "<< ( double( clock() - start ) /double(CLOCKS_PER_SEC/1000)) << '\n';
    //==============
    // Version 2: sum blocks of 65536 elements into a local variable
    // and update the reducer once per block
    total.set_value(0);
    start = clock();
    cilk_for(unsigned int i = 0; i < n; i+=65536)
    {
        unsigned int jend = std::min(i + 65536, n);
        int my_total = 0;
        for(unsigned int j=i; j < jend; ++j)
            my_total += A[j];
        total += my_total;
    }
    std::cout << "Total (" << total.get_value()
              << ") is correct";
    std::cout << " Total Time : "<< ( double( clock() - start ) /double(CLOCKS_PER_SEC/1000)) << '\n';
    free(A);
    return 0;
}
[/cpp]
Total (400000000) is correctTotal Time : 2944
Total (400000000) is correct Total Time : 354
Jim Dempsey
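The same idea can be sketched without Cilk Plus. This is an illustration only, not the code posted above: it assumes OpenMP in place of a Cilk reducer, and `blocked_sum` and its `block` parameter are names invented here. Each block is summed into a plain local variable, and the shared total is touched only once per block, which is what reduces the frequency of the synchronized updates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sum n ints in blocks of `block` elements.
// With OpenMP enabled, blocks are distributed across threads and the
// shared total receives one atomic update per block instead of one
// per element. Without OpenMP the pragmas are ignored and the code
// runs serially with the same result.
long long blocked_sum(const int* data, std::size_t n,
                      std::size_t block = 65536)
{
    assert(block > 0);
    const long long blocks =
        static_cast<long long>((n + block - 1) / block);
    long long total = 0;
    #pragma omp parallel for
    for (long long b = 0; b < blocks; ++b) {
        const std::size_t begin = static_cast<std::size_t>(b) * block;
        const std::size_t end = std::min(begin + block, n);
        long long local = 0; // private to this iteration
        for (std::size_t j = begin; j < end; ++j) {
            local += data[j];
        }
        #pragma omp atomic
        total += local;
    }
    return total;
}
```

The block size of 65536 is taken from the post above; the best value depends on the machine and is worth measuring rather than assuming.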