Using the new multi_arr class

A new multi_arr class has been created that allocates multi-dimensional arrays as a single block of memory, similar to the way the compiler lays out statically dimensioned arrays. Currently such structures are implemented as arrays of pointers to arrays of pointers to small blocks of the actual data. This creates a lot of indirection and likely also leads to poor CPU cache utilization. These structures are, however, easy to pass around. The new multi_arr structure promises to give the best of both worlds: one contiguous block of memory that is efficient to access, and that is also easy to pass around since all the dimensions are stored in the class.
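
To make the idea of a single contiguous block concrete, here is a minimal sketch of row-major indexing for a 3-dimensional array. It is not the multi_arr implementation: the struct name flat3d and its members (n1, n2, n3, data, offset) are hypothetical and chosen only for illustration, and the real class may lay out or check its indices differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is NOT the multi_arr implementation.
// A 3-D array stored as one contiguous block, indexed in row-major order.
struct flat3d
{
	std::size_t n1, n2, n3;     // extent of each dimension
	std::vector<double> data;   // single contiguous allocation, zero-filled

	flat3d(std::size_t d1, std::size_t d2, std::size_t d3)
		: n1(d1), n2(d2), n3(d3), data(d1*d2*d3, 0.) {}

	// map (i,j,k) to an offset in the flat block; the last index varies fastest
	std::size_t offset(std::size_t i, std::size_t j, std::size_t k) const
	{
		assert( i < n1 && j < n2 && k < n3 );
		return (i*n2 + j)*n3 + k;
	}

	// element access, usable as an l-value
	double& ind(std::size_t i, std::size_t j, std::size_t k)
	{
		return data[offset(i,j,k)];
	}
};
```

Each access costs a few integer multiplications and additions followed by one memory load, instead of one pointer dereference per dimension, and the dimensions travel with the object when it is passed to a function.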

The first task is to test whether this really produces better assembly and what kind of speedups are achievable. If the result is positive, the code should be converted gradually. This will be a laborious process.
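
A rough way to start such a test is to time the same traversal over both layouts. The sketch below is only an assumption about how that could be done: the function names (sum_nested, sum_flat, time_call) are hypothetical, nested std::vector objects stand in for the current arrays of pointers, and a plain std::vector stands in for the contiguous block.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical benchmark helpers; names and layouts are illustrative only.

// Sums a 3-D array stored as nested vectors (one indirection per dimension).
double sum_nested(const std::vector<std::vector<std::vector<double>>>& a)
{
	double s = 0.;
	for( const auto& plane : a )
		for( const auto& row : plane )
			for( double v : row )
				s += v;
	return s;
}

// Sums the same data stored as one contiguous block.
double sum_flat(const std::vector<double>& a)
{
	double s = 0.;
	for( double v : a )
		s += v;
	return s;
}

// Returns the wall-clock time in seconds taken by one call of f().
template<class F>
double time_call(F f)
{
	auto t0 = std::chrono::steady_clock::now();
	f();
	auto t1 = std::chrono::steady_clock::now();
	return std::chrono::duration<double>(t1 - t0).count();
}
```

When timing, the result of each sum should be stored or printed so the optimizer cannot discard the loop, and each measurement should be repeated several times. A sequential sweep like this is favorable to both layouts, so access patterns taken from the real code would give a more honest comparison.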

Some examples of the use of the multi_arr class:

	multi_arr<double> arr;
	arr.alloc(3,4,2); // memory will be allocated by MALLOC,
	                  // so may or may not be initialized to SNaN/invalid
	                  // depending on how the code was compiled 
	arr.invalidate(); // this will set float or double arrays to all SNaN
	                  // it will set any other array to all 0xff bytes.
	arr.zero();       // this will set the array to all zero
	arr.ind(0,0,0) = 1.; //arr.ind() can be used as l-value
	arr.ind(0,0,1) = 2.;
	double x = arr.ind(0,0,0)*arr.ind(0,0,1);
	arr.clear();      // this will deallocate the array