Coding, and understanding someone else's code, will be a major effort, especially at the beginning.

The structure of my code follows this outline:
NewCode directory:
Makefile                       // to compile the code

Add directory:
    mat-addkernels.c/h         // matrix addition, matrix comparison
    doubly_compensated_sum.c   // error analysis/estimation
    example.3.c, example.4.c, example.6.c
    addgen.c                   // matrix addition kernel generator
    architecture.h             // architecture-specific macros
    mat-operands.h             // specifies how we store and access matrices, row/column major ...
    mat-mulkernels.c/h         // multiplication kernels
    scaling.c/h                // for processors allowing frequency/voltage scaling

quicksort.h/c                  // this is used for the error analysis
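
The outline pairs a doubly compensated sum with a sorting routine for the error analysis. As an illustration of how the two can fit together, here is a minimal sketch of doubly compensated summation as it is commonly described (sort by decreasing magnitude, then accumulate with two correction steps). It is not the code shipped in doubly_compensated_sum.c: the names dcsum and by_decreasing_magnitude are hypothetical, and the standard qsort stands in for quicksort.h/c.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Comparator for qsort: orders doubles by decreasing magnitude. */
static int by_decreasing_magnitude(const void *a, const void *b)
{
    double x = fabs(*(const double *)a);
    double y = fabs(*(const double *)b);
    return (x < y) - (x > y);
}

/* Hypothetical helper: doubly compensated sum of x[0..n-1].
   Sorts x in place; returns 0.0 when n == 0. */
double dcsum(double *x, size_t n)
{
    assert(x != NULL || n == 0);
    if (n == 0)
        return 0.0;

    qsort(x, n, sizeof x[0], by_decreasing_magnitude);

    double s = x[0];   /* running sum */
    double c = 0.0;    /* running correction (rounding error carried forward) */
    for (size_t k = 1; k < n; k++) {
        double y = c + x[k];          /* fold the old correction into the new term */
        double u = x[k] - (y - c);    /* error of that first addition */
        double t = y + s;             /* tentative new sum */
        double v = y - (t - s);       /* error of the second addition */
        double z = u + v;             /* combined error */
        s = t + z;                    /* corrected sum */
        c = z - (s - t);              /* what is still missing from s */
    }
    return s;
}
```

This sketch assumes the compiler is not allowed to reassociate floating-point operations (for instance, no -ffast-math), otherwise the correction terms can be optimized away.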


You will need to install the BLAS library of your choice and modify the Makefile accordingly.
Goto: GotoBLAS, directory Linux_P4SSE2.
ATLAS: pre-built library for P4.
You may use any one (either pre-built or not) as you wish.

There is not much more to it. The example* files, as the name says, show how to call the matrix multiplication routines.
All the code you will see is for double-precision matrices (typedef double Mat).

The file example.4.c presents my implementation of the parallel algorithm for an Opteron system with two processors, each composed of two cores. This implementation is specific to the operating system available on that particular machine.

Unfortunately, this package is not self-installing, and it will take some work to understand its structure, installation and use. Nothing major.


The high-performance matrix multiplication (MM) routines should be installed separately. Then my code can be built and used. Some tuning of the Matrix Addition (MA) is advised, but I have found that the optimized version available fits most of the architectures I have used.

Every architecture has its own compiler and libraries. My code uses Matrix Multiplication (MM) routines that can come from ATLAS, GotoBLAS or your preferred vendor library. So far, I have experimented heavily with ATLAS and GotoBLAS; in the past I used SGI BLAS and, recently, MKL BLAS.

architecture.h is the file with the macros specifying which library I am going to use. For example, the macro mm_leaf_computation is used to identify the leaf computation (when Strassen/Winograd yield control to the fancy library routines).
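
To make the idea of a library-selecting macro concrete, here is a minimal sketch of how a leaf macro could dispatch to a BLAS routine or to a portable fallback at compile time. The argument list given to mm_leaf_computation, the USE_CBLAS flag and the naive_mm fallback are all assumptions made for this illustration; the macro in the package may look quite different. Only cblas_dgemm is a real library call.

```c
#include <assert.h>

#ifdef USE_CBLAS   /* hypothetical build flag, e.g. -DUSE_CBLAS in the Makefile */
#include <cblas.h>
/* C = A * B for n x n row-major matrices, delegated to the BLAS dgemm. */
#define mm_leaf_computation(C, A, B, n)                                   \
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,                \
                (n), (n), (n), 1.0, (A), (n), (B), (n), 0.0, (C), (n))
#else
/* Portable fallback: plain triple loop, C = A * B, row-major n x n. */
static void naive_mm(double *C, const double *A, const double *B, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double acc = 0.0;
            for (int k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
}
#define mm_leaf_computation(C, A, B, n) naive_mm((C), (A), (B), (n))
#endif
```

With this arrangement the recursive code only ever mentions mm_leaf_computation, so switching from one BLAS to another is a matter of changing the macro and the link line.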

mat-operands.h defines the matrices, together with the basic routines for their manipulation and division. For example, how to get the sub-matrix A0 from the matrix A ...
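
Extracting a sub-matrix without copying usually comes down to pointer arithmetic plus a leading dimension. The following is a minimal sketch under that assumption, for row-major storage. The MatView struct, the quadrant function, the ELEM macro and the quadrant numbering are hypothetical and are not taken from mat-operands.h; only typedef double Mat comes from the package.

```c
#include <assert.h>

typedef double Mat;   /* element type, as in the package */

/* Hypothetical descriptor: a view into row-major storage. */
typedef struct {
    Mat *data;   /* pointer to element (0,0) of the view */
    int  rows;
    int  cols;
    int  ld;     /* leading dimension: elements between consecutive rows */
} MatView;

/* Element (i,j) of a view. */
#define ELEM(V, i, j) ((V).data[(i) * (V).ld + (j)])

/* Quadrant q of A: 0 = top-left, 1 = top-right,
   2 = bottom-left, 3 = bottom-right.
   When a dimension is odd, the top/left quadrants get the extra row/column. */
MatView quadrant(MatView A, int q)
{
    assert(q >= 0 && q < 4);
    int r0 = (A.rows + 1) / 2;   /* rows in the top half */
    int c0 = (A.cols + 1) / 2;   /* columns in the left half */
    int down = q / 2;
    int right = q % 2;

    MatView S;
    S.rows = down ? A.rows - r0 : r0;
    S.cols = right ? A.cols - c0 : c0;
    S.ld = A.ld;                 /* same storage, so same row stride */
    S.data = A.data + (down ? r0 * A.ld : 0) + (right ? c0 : 0);
    return S;
}
```

Because a quadrant shares storage with its parent, writing through the view modifies the original matrix, and the same function can be applied again to a quadrant to descend one more level.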

mat-addkernels.c/h provide matrix addition for matrices in row and column major.
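
The difference between the two layouts is only in which index is contiguous in memory, and therefore in which loop should be innermost. Here is a minimal, untuned sketch; the function names add_rowmajor and add_colmajor and their argument lists are hypothetical and do not reproduce the kernels in the package.

```c
#include <assert.h>

typedef double Mat;   /* element type, as in the package */

/* C = A + B for m x n row-major operands.
   ldc, lda, ldb are row strides, so each must be at least n. */
void add_rowmajor(Mat *C, int ldc, const Mat *A, int lda,
                  const Mat *B, int ldb, int m, int n)
{
    assert(ldc >= n && lda >= n && ldb >= n);
    for (int i = 0; i < m; i++)          /* rows */
        for (int j = 0; j < n; j++)      /* contiguous within a row */
            C[i * ldc + j] = A[i * lda + j] + B[i * ldb + j];
}

/* C = A + B for m x n column-major operands.
   ldc, lda, ldb are column strides, so each must be at least m. */
void add_colmajor(Mat *C, int ldc, const Mat *A, int lda,
                  const Mat *B, int ldb, int m, int n)
{
    assert(ldc >= m && lda >= m && ldb >= m);
    for (int j = 0; j < n; j++)          /* columns */
        for (int i = 0; i < m; i++)      /* contiguous within a column */
            C[i + j * ldc] = A[i + j * lda] + B[i + j * ldb];
}
```

Passing a separate stride for each operand lets the same kernel add sub-matrices that live inside larger matrices.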

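Strassen, oblivious + Strassen, dynamic Strassen, Winograd, and oblivious + Winograd are all defined in mat-mulkernels.c/h. The LEAF constant is the recursion point, that is, the problem size at which the recursion stops and the library routine takes over (it is defined in mat-mulkernels.h).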
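
To show how a recursion point works in practice, here is a minimal sketch of the classic Strassen recursion (seven half-size products) with a LEAF cutoff. It is a textbook version restricted to square, power-of-two, row-major matrices, and it is not one of the variants implemented in the package. The LEAF value of 2, the function names and the triple loop standing in for the library call are assumptions made so that the example is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

#define LEAF 2   /* hypothetical cutoff, tiny so that the recursion is exercised */

/* C = A + sign * B on n x n blocks, each with its own row stride. */
static void add(double *C, int ldc, const double *A, int lda,
                const double *B, int ldb, int n, double sign)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            C[i * ldc + j] = A[i * lda + j] + sign * B[i * ldb + j];
}

/* Leaf computation: a plain triple loop standing in for the BLAS call. */
static void leaf_mm(double *C, int ldc, const double *A, int lda,
                    const double *B, int ldb, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double acc = 0.0;
            for (int k = 0; k < n; k++)
                acc += A[i * lda + k] * B[k * ldb + j];
            C[i * ldc + j] = acc;
        }
}

/* C = A * B, n a power of two; C must not overlap A or B. */
void strassen(double *C, int ldc, const double *A, int lda,
              const double *B, int ldb, int n)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n <= LEAF) { leaf_mm(C, ldc, A, lda, B, ldb, n); return; }

    int h = n / 2;
    const double *A11 = A, *A12 = A + h, *A21 = A + h * lda, *A22 = A21 + h;
    const double *B11 = B, *B12 = B + h, *B21 = B + h * ldb, *B22 = B21 + h;
    double *C11 = C, *C12 = C + h, *C21 = C + h * ldc, *C22 = C21 + h;

    /* Scratch space: two operand temporaries S, T and seven products. */
    double *buf = malloc(sizeof(double) * 9 * (size_t)h * h);
    assert(buf != NULL);
    double *S = buf, *T = buf + h * h, *M[7];
    for (int i = 0; i < 7; i++) M[i] = buf + (2 + i) * h * h;

    add(S, h, A11, lda, A22, lda, h, +1); add(T, h, B11, ldb, B22, ldb, h, +1);
    strassen(M[0], h, S, h, T, h, h);        /* M1 = (A11+A22)(B11+B22) */
    add(S, h, A21, lda, A22, lda, h, +1);
    strassen(M[1], h, S, h, B11, ldb, h);    /* M2 = (A21+A22) B11 */
    add(T, h, B12, ldb, B22, ldb, h, -1);
    strassen(M[2], h, A11, lda, T, h, h);    /* M3 = A11 (B12-B22) */
    add(T, h, B21, ldb, B11, ldb, h, -1);
    strassen(M[3], h, A22, lda, T, h, h);    /* M4 = A22 (B21-B11) */
    add(S, h, A11, lda, A12, lda, h, +1);
    strassen(M[4], h, S, h, B22, ldb, h);    /* M5 = (A11+A12) B22 */
    add(S, h, A21, lda, A11, lda, h, -1); add(T, h, B11, ldb, B12, ldb, h, +1);
    strassen(M[5], h, S, h, T, h, h);        /* M6 = (A21-A11)(B11+B12) */
    add(S, h, A12, lda, A22, lda, h, -1); add(T, h, B21, ldb, B22, ldb, h, +1);
    strassen(M[6], h, S, h, T, h, h);        /* M7 = (A12-A22)(B21+B22) */

    add(C11, ldc, M[0], h, M[3], h, h, +1);  /* C11 = M1 + M4 - M5 + M7 */
    add(C11, ldc, C11, ldc, M[4], h, h, -1);
    add(C11, ldc, C11, ldc, M[6], h, h, +1);
    add(C12, ldc, M[2], h, M[4], h, h, +1);  /* C12 = M3 + M5 */
    add(C21, ldc, M[1], h, M[3], h, h, +1);  /* C21 = M2 + M4 */
    add(C22, ldc, M[0], h, M[1], h, h, -1);  /* C22 = M1 - M2 + M3 + M6 */
    add(C22, ldc, C22, ldc, M[2], h, h, +1);
    add(C22, ldc, C22, ldc, M[5], h, h, +1);
    free(buf);
}
```

In a real setting the cutoff would be far larger than 2, since the recursion only pays off once the saved multiplication outweighs the extra additions and memory traffic.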


Copyright (c) 2007, P. D'Alberto, A. Nicolau, and A. Kumar.