A simple C code for parallel multiplication of two 512×512 matrices was run with OpenMPI across 4 nodes (where applicable). Three builds of the same source code were compiled with the GNU C (gcc 4.4.7), Intel 11 and PGI 11.8 compilers. The results are presented in the graph below.
In each case, both OpenMPI 1.10.2 and the test code were compiled with the same compiler. In the graph, lower values mean better performance.
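The benchmark source itself is not reproduced here, so the following is only a rough sketch of the kind of row-block decomposition such a parallel matrix multiplication might use. All function names (`row_range`, `multiply_rows`, `multiply_partitioned`) are hypothetical, and the MPI calls are left out on purpose: the "ranks" are simulated by an ordinary loop so the sketch compiles with any of the three compilers without an MPI installation.

```c
/* Hypothetical sketch, not the benchmarked code: splits the rows of an
   n x n product C = A * B into contiguous blocks, one block per rank.
   Matrices are stored row-major in flat arrays of length n * n. */

/* Compute the half-open row range [*first, *last) owned by `rank`.
   If n is not divisible by nprocs, the lowest ranks get one extra row. */
void row_range(int rank, int nprocs, int n, int *first, int *last)
{
    int base = n / nprocs;   /* rows every rank gets */
    int extra = n % nprocs;  /* leftover rows to hand out */
    *first = rank * base + (rank < extra ? rank : extra);
    *last = *first + base + (rank < extra ? 1 : 0);
}

/* Multiply rows [first, last) of A by B and store them in the same rows of C. */
void multiply_rows(const double *a, const double *b, double *c,
                   int n, int first, int last)
{
    for (int i = first; i < last; i++) {
        for (int j = 0; j < n; j++)
            c[i * n + j] = 0.0;
        /* i-k-j loop order keeps the inner loop contiguous in memory */
        for (int k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (int j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}

/* Serial stand-in for the parallel run: each loop iteration does the work
   that one MPI rank would do on its own block of rows. */
void multiply_partitioned(const double *a, const double *b, double *c,
                          int n, int nprocs)
{
    for (int rank = 0; rank < nprocs; rank++) {
        int first, last;
        row_range(rank, nprocs, n, &first, &last);
        multiply_rows(a, b, c, n, first, last);
    }
}
```

With n = 512 and 4 ranks, each rank would own 128 rows. In a real MPI version, each process would call `multiply_rows` on its own range only and the result blocks would then be collected on one rank. The loop declarations assume C99, which older compilers such as gcc 4.4.7 need to be asked for explicitly (for example with `-std=c99`).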
The winner is the Intel ICC compiler! (Well, PGI is almost the same as Intel, but costs more money.)
Alex Pedcenko