Hi,
Writing code and understanding someone else's code takes real effort, especially
at the beginning.
The structure of my code follows this outline:
NewCode directory:
  Makefile                       // to compile the code
  Add
    mat-addkernels.c/h           // matrix addition, matrix comparison
  Error
    doubly_compensated_sum.c     // error analysis/estimation
  Examples
    Example.3.c, example.4.c, example.6.c
  Executable
  MAT-ADD-Generator
    addgen.c                     // matrix addition kernel generator
  Matrices
    architecture.h               // architecture-specific macros
    mat-operands.h               // specifies how we store and access matrices,
                                 // row/column major ...
  Mul
    mat-mulkernels.c/h           // multiplication kernels
  Scaling
    scaling.c/h                  // for processors allowing frequency/voltage scaling
  Scripts
  Sort
    quicksort.h/c                // this is used for the error analysis
You will need to install the BLAS library you like and modify the Makefile
accordingly.
GotoBLAS: directory Linux_P4SSE2.
ATLAS: pre-built library for the P4.
You may use either one (pre-built or not), as you wish.
There is not much more to it. The example* files, as the name says, show
how to call the matrix multiplication routines.
All the code you will see is for double precision matrices (typedef double Mat).
The file example.4.c presents my implementation of the
parallel algorithm for an Opteron system with two
processors, each composed of two cores. This implementation is
specific to the operating system available on that particular machine.
Unfortunately, this package is not self-installing, and it will take some
work to understand its structure, installation and use. Nothing major.
The high-performance matrix multiplication (MM) routines should be installed
separately; then my code can be built and used. Some tuning of the Matrix
Addition (MA) is advised, but I have found that the optimized version available
fits most of the architectures I have used.
Makefile
Every architecture has its own compiler and libraries. My
code uses Matrix Multiplication (MM) routines, which can come from ATLAS, GotoBLAS or your preferred vendor library. So far I
have experimented (heavily) with ATLAS and GotoBLAS; in the past I used SGI BLAS and, recently, MKL BLAS.
architecture.h
This file holds the macros specifying which library I am going to use. For example,
the macro mm_leaf_computation identifies the leaf computation
(the point where Strassen/Winograd yields control to the fancy
library routines).
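A minimal sketch of how such a macro could select the leaf computation at compile time. Only the name mm_leaf_computation comes from the package; its argument list, the USE_CBLAS switch and the naive_leaf fallback are assumptions made for illustration and are not the actual contents of architecture.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fallback leaf: C = A * B for n x n row-major matrices,
   computed with the classical triple loop. */
static void naive_leaf(size_t n, const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            for (size_t k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
    }
}

#ifdef USE_CBLAS   /* hypothetical switch, e.g. -DUSE_CBLAS in the Makefile */
#include <cblas.h>
/* Leaf computation delegated to the installed BLAS via its CBLAS interface. */
#define mm_leaf_computation(n, A, B, C)                                   \
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,                \
                (int)(n), (int)(n), (int)(n),                             \
                1.0, (A), (int)(n), (B), (int)(n), 0.0, (C), (int)(n))
#else
/* No BLAS configured: fall back to the portable loop above. */
#define mm_leaf_computation(n, A, B, C) naive_leaf((n), (A), (B), (C))
#endif
```

With this arrangement the recursive kernels only ever mention mm_leaf_computation, so switching BLAS library means changing one compiler flag and the link line, not the kernels.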
mat-operands.h
The matrices are defined here, together with the basic routines for their
manipulation and division; for example, how
to get the sub-matrix A0 from the matrix A ...
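As an illustration of the idea, here is a sketch of a sub-matrix taken as a view (no copying) of a row-major matrix. The typedef double Mat is the one used in the package; the MatView structure, the functions mat_at and mat_quadrant, and the quadrant numbering (0 = top-left, i.e. A0, then top-right, bottom-left, bottom-right) are hypothetical and may differ from what mat-operands.h actually does.

```c
#include <assert.h>
#include <stddef.h>

typedef double Mat;   /* element type, as in the package */

/* Hypothetical descriptor: a rectangular view onto row-major storage. */
typedef struct {
    Mat   *data;   /* address of element (0,0) of the view */
    size_t rows;
    size_t cols;
    size_t ld;     /* leading dimension: elements between consecutive rows */
} MatView;

/* Address of element (i,j) of the view. */
static Mat *mat_at(MatView m, size_t i, size_t j)
{
    assert(i < m.rows && j < m.cols);
    return &m.data[i * m.ld + j];
}

/* Quadrant q of a: 0 = top-left (A0), 1 = top-right,
   2 = bottom-left, 3 = bottom-right.
   For odd sizes the top/left quadrants get the extra row/column. */
static MatView mat_quadrant(MatView a, int q)
{
    assert(q >= 0 && q < 4);
    size_t top  = (a.rows + 1) / 2;
    size_t left = (a.cols + 1) / 2;
    size_t r0 = (q / 2) ? top  : 0;   /* first row of the quadrant */
    size_t c0 = (q % 2) ? left : 0;   /* first column of the quadrant */

    MatView v;
    v.data = a.data + r0 * a.ld + c0;
    v.rows = (q / 2) ? a.rows - top  : top;
    v.cols = (q % 2) ? a.cols - left : left;
    v.ld   = a.ld;                    /* same storage, same row stride */
    return v;
}
```

Because a view keeps the leading dimension of the parent matrix, the same kernels can operate on A and on any of its quadrants.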
mat-addkernels.c
Matrix addition for matrices stored in row-major and column-major order.
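The difference between the two layouts can be sketched with a single routine that walks whichever dimension is contiguous in the inner loop. The Layout enum, the function mat_add and its argument list are hypothetical; the tuned kernels in the package (and the ones produced by the generator) are presumably more elaborate.

```c
#include <assert.h>
#include <stddef.h>

typedef double Mat;   /* element type, as in the package */

typedef enum { ROW_MAJOR, COL_MAJOR } Layout;   /* hypothetical */

/* C = A + B for rows x cols matrices stored with leading dimensions
   lda, ldb, ldc. In row-major storage a row is contiguous; in
   column-major storage a column is. The inner loop always runs over
   the contiguous dimension so memory is accessed with unit stride. */
static void mat_add(Layout layout, size_t rows, size_t cols,
                    const Mat *A, size_t lda,
                    const Mat *B, size_t ldb,
                    Mat *C, size_t ldc)
{
    size_t outer = (layout == ROW_MAJOR) ? rows : cols;
    size_t inner = (layout == ROW_MAJOR) ? cols : rows;
    assert(lda >= inner && ldb >= inner && ldc >= inner);

    for (size_t o = 0; o < outer; o++)
        for (size_t i = 0; i < inner; i++)
            C[o * ldc + i] = A[o * lda + i] + B[o * ldb + i];
}
```

Passing a leading dimension larger than the contiguous extent lets the same routine add sub-matrices in place, which is what the recursive multiplication kernels need.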
mat-mulkernels.c/h
Strassen, oblivious + Strassen,
dynamic Strassen, Winograd,
and oblivious + Winograd are all defined here. The
LEAF constant (defined in mat-mulkernels.h) is the recursion point: the
problem size at which the recursion stops and the leaf computation takes over.
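To make the role of LEAF concrete, here is a textbook Strassen recursion for square row-major matrices. It is a sketch and not the package's kernel: the names strassen, leaf_mul and add_block are hypothetical, the LEAF value is deliberately tiny (a tuned value would be far larger), odd sizes simply fall through to the leaf, and the leaf is a plain triple loop standing in for the BLAS call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef double Mat;   /* element type, as in the package */

#define LEAF 2   /* hypothetical value, tiny so that small inputs recurse */

/* Leaf computation: classical C = A * B on n x n row-major blocks. */
static void leaf_mul(size_t n, const Mat *A, size_t lda,
                     const Mat *B, size_t ldb, Mat *C, size_t ldc)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            Mat acc = 0.0;
            for (size_t k = 0; k < n; k++)
                acc += A[i * lda + k] * B[k * ldb + j];
            C[i * ldc + j] = acc;
        }
}

/* Z = X + sign * Y on n x n blocks; Z is dense (leading dimension n). */
static void add_block(size_t n, const Mat *X, size_t ldx, Mat sign,
                      const Mat *Y, size_t ldy, Mat *Z)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            Z[i * n + j] = X[i * ldx + j] + sign * Y[i * ldy + j];
}

/* C = A * B with Strassen's seven products; C must not overlap A or B. */
static void strassen(size_t n, const Mat *A, size_t lda,
                     const Mat *B, size_t ldb, Mat *C, size_t ldc)
{
    if (n <= LEAF || n % 2 != 0) {   /* recursion point: yield to the leaf */
        leaf_mul(n, A, lda, B, ldb, C, ldc);
        return;
    }
    size_t h = n / 2, hh = h * h;
    const Mat *A11 = A, *A12 = A + h, *A21 = A + h * lda, *A22 = A21 + h;
    const Mat *B11 = B, *B12 = B + h, *B21 = B + h * ldb, *B22 = B21 + h;
    Mat *C11 = C, *C12 = C + h, *C21 = C + h * ldc, *C22 = C21 + h;

    Mat *buf = malloc(9 * hh * sizeof *buf);   /* S, T and M1..M7 */
    assert(buf != NULL);
    Mat *S = buf, *T = buf + hh, *M[8];
    for (int k = 1; k <= 7; k++)
        M[k] = buf + (size_t)(k + 1) * hh;

    add_block(h, A11, lda, +1, A22, lda, S);
    add_block(h, B11, ldb, +1, B22, ldb, T);
    strassen(h, S, h, T, h, M[1], h);        /* M1 = (A11 + A22)(B11 + B22) */
    add_block(h, A21, lda, +1, A22, lda, S);
    strassen(h, S, h, B11, ldb, M[2], h);    /* M2 = (A21 + A22) B11 */
    add_block(h, B12, ldb, -1, B22, ldb, T);
    strassen(h, A11, lda, T, h, M[3], h);    /* M3 = A11 (B12 - B22) */
    add_block(h, B21, ldb, -1, B11, ldb, T);
    strassen(h, A22, lda, T, h, M[4], h);    /* M4 = A22 (B21 - B11) */
    add_block(h, A11, lda, +1, A12, lda, S);
    strassen(h, S, h, B22, ldb, M[5], h);    /* M5 = (A11 + A12) B22 */
    add_block(h, A21, lda, -1, A11, lda, S);
    add_block(h, B11, ldb, +1, B12, ldb, T);
    strassen(h, S, h, T, h, M[6], h);        /* M6 = (A21 - A11)(B11 + B12) */
    add_block(h, A12, lda, -1, A22, lda, S);
    add_block(h, B21, ldb, +1, B22, ldb, T);
    strassen(h, S, h, T, h, M[7], h);        /* M7 = (A12 - A22)(B21 + B22) */

    for (size_t i = 0; i < h; i++)
        for (size_t j = 0; j < h; j++) {
            size_t k = i * h + j, c = i * ldc + j;
            C11[c] = M[1][k] + M[4][k] - M[5][k] + M[7][k];
            C12[c] = M[3][k] + M[5][k];
            C21[c] = M[2][k] + M[4][k];
            C22[c] = M[1][k] - M[2][k] + M[3][k] + M[6][k];
        }
    free(buf);
}
```

Raising LEAF trades fewer matrix additions and temporaries for more classical multiplications at the leaves; the best value depends on the machine and on the MM library used at the leaf.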