Using C# generics can cause performance issues. When I started porting my matrix-matrix multiplication test bed to C++, I was curious if templates provided better performance. I had hoped C++’s compiled nature and more robust optimizer would make templates run nearly as fast as regular code.
But they don’t. Additionally, the performance hit is split differently than in C#. In C#, most of the performance loss is in going from the indexer method to the generic method. In C++, most of the performance loss is in going from the regular method to the indexer method.
My C++ matrices all inherit from the MatrixBase virtual class. Three indexer methods, GetValue, SetValue, and AddValue, allow for generic matrix-matrix multiplication routines. All three are implemented as inline methods in every class header file.
The execution time of multiplying two 1000 x 1000 double-precision matrices 20 times is disappointing. The code was compiled in Visual C++ under Visual Studio 2012 with /Ob2, /O2, /Oi, and /Ot. All listed methods use accumulators, store data in arrays of arrays, and run in parallel across four cores. The figures shown are elapsed time in seconds for the run.
Acc 5.361 Acc-Indexer 29.939 Acc-Template 40.211
Other multiplication and storage methods had similar differences. C# seemed to do well inlining the indexers then had an optimizer issue implementing the generics.
VC++ seemed to have issues inlining the indexers, then did an OK job implementing the templates. Why? Is this a compiler-specific issue? I hope to test this in g++ and ICL, though this is low on my priority list.
Since templates are not critical to the core sections of my code, I will not be going through the generated assembly code. Templates work well in the non-critical sections. I recommend using templates to simplify testing and in the routines that organize benchmarking. But I can’t recommend them for performance-critical applications, at least not using Visual C++.
These performance differences surprise me and could be my fault. If you have a suggestion to improve template performance, please leave a comment.