NAME

matrix_algebra -- evaluate matrix expression(s)

PROTOTYPE

unitptr matrix_algebra( int iNum, char *apcName[], int *piColDim, int *piRowDim, char *pcOpt, char *pcProg, unitptr uHost)

ARGUMENTS

int iNum
nr of input variables to use
char *apcName[]
their names (currently only single letters ok)
int *piColDim
their column dimensions
int *piRowDim
their row dimensions
char *pcOpt
option string
char *pcProg
the program to execute
unitptr uHost
host unit

RETURN VALUE:

A pointer to the created unit or NULL in the case of an error.

INTERFACE OF CREATED UNIT:

Input fields:

inp_0[]:
input field for setting first input variable. Dimension is given by piColDim[0]*piRowDim[0].
...
....
inp_k[]:
input field for setting input variable k=iNum-1. Dimension is given by piColDim[k]*piRowDim[k].
CTL_in[]:
control field (at position iNum). A value of 0 disables execution of the unit.

Output fields:

out_0[]:
...
...
out_f[]:
there is one output for each computed matrix expression. [see below]

SYNOPSIS:

Executes a program of linear algebra operations specified in string pcProg and copies results into output fields. Syntax is C-like, but with extensions for matrix operations. Identifiers are restricted to single letter variable names with juxtaposition denoting matrix multiplication (for further differences, see SUPPORTED SYNTAX below). Example: specifying pcProg as the string

     INPUT: A[10,10],y[10];
     (AA'+id(10))\y;

computes the solution vector x of the linear equation (AA'+I)x=y, where I is the 10x10 unit matrix, from a given matrix A and a given vector y, and returns it at output field 0. A is expected at input field 0 (100 elements, packed), and y is expected at input field 1 (10 elements, packed). In general, matrix, vector and scalar data are imported by declaring some identifiers (here, A and y) as "input variables" (see below). For each input variable, the created unit will have an input field of matching dimension through which the elements of the variable can be set. Results are returned by statements that are expressions not assigned to a left hand side variable, e.g.,

     result_expression1;
     result_expression2;
     ..
     result_expression_n;

This will give rise to the creation of n output fields with appropriate dimensions to hold the results of the n matrix expressions result_expression1..n above. NOTE: the above statements copy the current value of result_expression to the output, i.e., if result_expression is a single variable, subsequent changes will only be seen at the output if followed by a further statement "result_expression;". Intermediate and repeatedly needed expressions can be assigned to auxiliary variables (example: D=A+BC; ). Such assignments do not give rise to the creation of an output field for their result value; instead, the result value is copied into the variable specified on the left hand side of the assignment (this variable need not be among the set of names specified in apcName[]; if it is not, a properly dimensioned variable is created). Variables created in this way retain their values between successive invocations (exec, adapt, ..) of the unit. The input and output fields will be packed, unless they hold only a single element.
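The introductory example can be cross-checked outside of NST. The following Python sketch solves (AA'+I)x=y for a matrix A that is handed over as a packed field. It assumes that "packed" means the row-by-row element order described under "Index operator"; the helper names unpack and solve_regularized are hypothetical, and the Gauss-Jordan elimination is only a stand-in for whatever solver the unit uses internally.

```python
def unpack(field, rows, cols):
    """Turn a packed (row-by-row) field into a list of rows."""
    return [list(field[r * cols:(r + 1) * cols]) for r in range(rows)]


def solve_regularized(a_field, y_field, n):
    """Solve (A A' + I) x = y for an n x n matrix A given as a packed field."""
    a = unpack(a_field, n, n)
    # Build the augmented matrix [A A' + I | y].
    m = [[sum(a[i][k] * a[j][k] for k in range(n)) + (1.0 if i == j else 0.0)
          for j in range(n)] + [float(y_field[i])]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]
```

For a 10x10 problem, a_field would hold 100 values and y_field 10 values, matching the two input fields of the example.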

INPUT VARIABLES:

Such variables are associated with an NST input field of appropriate dimension to set their elements. They can be defined in two different ways (which may be combined). First, an input variable can be requested by specifying its name in an element of the array parameter apcName[], and its column and row dimensions in the corresponding elements of piColDim[] and piRowDim[]. iNum must be set to the array index plus 1 of the highest such definition. Second, the required input variables can be specified as part of the program in pcProg, using a declaration of the form

    INPUT: A, B[5], C[3,4];

This would define three input variables A, B and C of dimensions 1x1 (i.e., a scalar), 1x5 (i.e., a row vector) and 3x4 (i.e., a matrix). NOTE: there is no analogous keyword OUTPUT:.
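The dimension rules of the INPUT: declaration can be sketched in Python as follows. This is not the parser used by the unit; parse_input_decl and field_sizes are hypothetical helpers that only reproduce the three cases named above (bare name, one dimension, two dimensions) and the resulting number of elements of each input field.

```python
import re


def parse_input_decl(decl):
    """Parse e.g. 'INPUT: A, B[5], C[3,4];' into {name: (d1, d2)}.

    A bare name is 1x1, a single number n gives 1xn, two numbers are
    taken in the order in which they are written.
    """
    text = decl.strip()
    if not text.startswith("INPUT:") or not text.endswith(";"):
        raise ValueError("not an INPUT: declaration")
    body = text[len("INPUT:"):-1]
    pattern = r"([A-Za-z])\s*(?:\[\s*(\d+)\s*(?:,\s*(\d+)\s*)?\])?"
    dims = {}
    for name, d1, d2 in re.findall(pattern, body):
        if not d1:
            dims[name] = (1, 1)
        elif not d2:
            dims[name] = (1, int(d1))
        else:
            dims[name] = (int(d1), int(d2))
    return dims


def field_sizes(dims):
    """Number of elements of the input field created for each variable."""
    return {name: d1 * d2 for name, (d1, d2) in dims.items()}
```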

SUPPORTED SYNTAX:

Legal syntax is as in C, with the single data type float (including vectors and matrices), no subroutines, no block nesting, and the following further main differences.
comments:
// starts a comment that extends to the end of the current line. The usual C-style multi-line comments are not supported.
variables:
variables must be single letter names. Declarations are implicit: if an expression is assigned to a variable, the variable inherits the dimension of the assigned expression. Explicit declarations are used for defining "input variables" (cf. above). The function set() is provided to make variables with specified dimensions and values.
control structures:
there are only while(){}, for..{} and if(){}else{}. The for-syntax differs from C: there is a "simple form" for i=a to b step c {..} where step c may also be absent (all variables must be scalars). There is also a second "matrix form" (for more details, cf. section CONTROL STRUCTURES below). break, continue and return are as in C. The conditions for if and while may be matrices, in which case the boolean AND of all their elements is used as the truth value (NOTE: this makes A!=B true only if A and B differ in ALL their elements!).
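The consequence of the AND rule for A!=B is easy to overlook. A small Python sketch, with matrices written as lists of rows and the hypothetical helpers truth_value and not_equal standing in for the unit's own evaluation:

```python
def not_equal(a, b):
    """Elementwise a != b for two matrices of equal dimension."""
    return [[x != y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def truth_value(cond):
    """Truth value of a matrix-valued condition: AND over all elements."""
    return all(e for row in cond for e in row)


A = [[1, 2], [3, 4]]
B = [[1, 5], [6, 7]]  # differs from A in three of four elements
C = [[0, 5], [6, 7]]  # differs from A in every element
```

Here truth_value(not_equal(A, B)) is false although A and B are clearly different matrices, while truth_value(not_equal(A, C)) is true.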

Boolean and arithmetic operations:

The usual boolean (||, &&, !, ==, !=, <=, >=, <, >) operations are provided, extended to operate elementwise on matrix operands (which must be of the same dimension). For the comparison operations, this condition is relaxed: if one operand is a scalar s and the other an MxN matrix A, the result is again an MxN matrix, with elements given by the MxN comparisons of s with the elements of A. Addition and subtraction can be applied to any matrix operands, which then must all be of the same column and row dimensions. Sums of matrices may contain scalar values (or 1 x 1 matrices), which are then "promoted" to properly sized matrices in which all elements have the same value (thus, 2+A adds the value 2 to all elements of the matrix A). For the addition of a unit matrix, use the id() function. For matrices, there are two types of multiplication (as well as division): elementwise multiplication (division), denoted by the operators '*' and 'div', and matrix multiplication. The latter has no special operator: if two matrix variables follow each other, there is an implied matrix multiplication (binding stronger than '*' and 'div'; thus, A B*C D div E evaluates as ((AB)*(CD)) div E). Matrix division has two infix operators '/' and '\' which are interpreted as right or left multiplication with the inverse. Thus, A/B denotes the matrix product A inverseof(B), and A\B denotes inverseof(A) B. For instance, the solution vector x of the linear system Ax=y is obtained as x=A\y.
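The difference between scalar promotion, elementwise multiplication and matrix multiplication can be illustrated with a few lines of Python. The function names below are hypothetical and the code only mirrors the rules stated above (matrices are lists of rows); it says nothing about how the unit implements them.

```python
def add(a, b):
    """a + b, where a scalar operand is promoted to the other's shape."""
    if not isinstance(a, list):
        a = [[a] * len(b[0]) for _ in b]
    if not isinstance(b, list):
        b = [[b] * len(a[0]) for _ in a]
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def less_than(s, a):
    """Scalar s compared with every element of matrix a (s < a)."""
    return [[s < x for x in row] for row in a]


def elementwise_mul(a, b):
    """The '*' operator: product of corresponding elements."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Juxtaposition 'A B': ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]
```

With A = [[1,2],[3,4]], the call add(2, A) corresponds to 2+A, elementwise_mul(A, A) to A*A, and matmul(A, A) to the juxtaposition AA.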

Assignment operations:

Only =, += and -= are provided. The left hand side of an assignment must be a single variable or a single matrix element (such as, e.g., A_[2,3]) of the same dimension as the assigned expression (if the left hand side variable occurs for the first time, its dimension is defined to be that of the assigned expression). The additional assignment operator := allows an assignment if the dimensions of the left and right hand side differ but the number of elements is still the same (in this case, elements are assigned as if both sides were vectors obtained by concatenating their rows). Unlike in C, assignments have no value and cannot be used in places where a value is expected.
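The effect of := can be pictured as flattening the right hand side row by row and refilling the left hand side in the same order. A minimal Python sketch of this rule (reshape_assign is a hypothetical name; matrices are lists of rows):

```python
def reshape_assign(src, rows, cols):
    """Return a rows x cols matrix holding the elements of src in row order.

    Mirrors 'L := src' for a left hand side L of dimension rows x cols.
    """
    flat = [e for row in src for e in row]
    if len(flat) != rows * cols:
        raise ValueError("number of elements differs")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]
```

For example, a 2x3 source [[1,2,3],[4,5,6]] assigned to a 3x2 target yields [[1,2],[3,4],[5,6]].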

Conditional expressions:

The familiar A ? B : C is supported for matrices. If some of the operands are scalars, they are "promoted" to the dimension of the remaining matrices (which all must be of the same dimension). This means that they are treated as if they were a matrix of suitable dimension, with all elements of the same value. The result is a matrix with the same dimension as the non-scalar operands (or 1x1, if all operands are scalar). If A is a scalar, only one of B, C is evaluated (as in C). However, if A is a matrix, BOTH B and C are fully evaluated and the result matrix R is defined in an elementwise fashion: Rij=Aij?Bij:Cij.
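For a matrix-valued condition, the elementwise rule Rij=Aij?Bij:Cij can be written down directly. The Python sketch below covers only this matrix case and promotes scalar B or C as described; select is a hypothetical helper name.

```python
def select(a, b, c):
    """Elementwise a ? b : c for a matrix condition a (list of rows).

    b and c may each be a matrix of the same shape as a, or a scalar.
    """
    def at(v, i, j):
        # A scalar operand behaves like a matrix filled with that value.
        return v[i][j] if isinstance(v, list) else v

    return [[at(b, i, j) if a[i][j] else at(c, i, j)
             for j in range(len(a[0]))]
            for i in range(len(a))]


C = [[1, -2], [-3, 4]]
mask = [[x >= 0 for x in row] for row in C]  # corresponds to C>=0
positive_part = select(mask, C, 0)           # corresponds to C>=0 ? C : 0
```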

Index operator:

If A is a matrix, the ij-element of A is given by A_[i,j]. In memory, A is a linear array of size(A)=cols(A)*rows(A) elements, in the order obtained by concatenating all rows (starting with row 0). If a is a vector of length size(A), a:=A is legal and makes the linear array of elements of A accessible via a. However, usually this is not necessary, since "linear addressing" is also allowed for matrices, i.e., A_k and a_k (and A_[k] as well) would yield the same elements in the above example. Note that indices are treated like any other vector, e.g., A_b with b=[i,j] a 2-vector is legal syntax. "_" binds stronger than the other binary operations, i.e., A_bc and A_b*2 are interpreted as (A_b)c and (A_b)*2. If index values are out of range, periodic boundary conditions are assumed, i.e., for a 2x3-matrix A, A_[2,4]=A_[0,1] and A_6=A_0. To extract entire rows, columns or submatrices, use the row(), col() and rect() functions described below.
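The periodic boundary rule and the linear addressing can be checked with a short Python sketch. The helper names element and linear_element are hypothetical; only non-negative indices are considered, since the text does not say how negative indices are treated.

```python
def element(a, i, j):
    """A_[i,j] with periodic boundary conditions (a is a list of rows)."""
    rows, cols = len(a), len(a[0])
    return a[i % rows][j % cols]


def linear_element(a, k):
    """A_k: linear addressing into the row-by-row element order."""
    rows, cols = len(a), len(a[0])
    k %= rows * cols
    return a[k // cols][k % cols]


A = [[10, 11, 12],
     [13, 14, 15]]  # a 2x3 matrix
```

With this A, element(A, 2, 4) returns the same value as element(A, 0, 1), and linear_element(A, 6) the same as linear_element(A, 0), as in the example above.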

Further matrix operations:

"'"
a postfix operator to denote the transpose A' of a matrix A. Special case: if a, b are both n x 1 (i.e. column-) vectors, a'b yields their scalar product.
"@"
infix operator for convolution operation. The operand to the right of a "@" is the convolution kernel and must be a matrix with an odd number of rows and columns. "@" binds stronger than the above operators. A@B@C is evaluated as (A@B)@C. Since all matrices are of finite size, this is different from A@(B@C)!
[..]
concatenation of vectors/matrices: the contents between the brackets is a semicolon separated list of "row-lists". Each row-list is a comma-separated list of values. The result is a matrix. Example: [1,2,3;4,5,6] specifies the 2x3 matrix with (1,2,3) as its first and (4,5,6) as its second row. The elements of each row-list may be matrices, if they all have the same column dimension.

NST-related commands:

exec_opnd(i):
execute NST unit at relative position i (i.e., i-th successor unit for i>0, i-th predecessor unit for i<0; no operation is performed if i=0). i must be a scalar constant.
adapt_opnd(i),init_opnd(i),reset_opnd(i):
analogous calls for the NST adapt, init and reset methods (cf. also prog_unit).
EXEC:
jump mark where execution starts when invocation is via exec_unit(). Note that the colon is part of the jump mark!
ADAPT:
jump mark where execution starts when invocation is via adapt_unit().
INIT:
jump mark where execution starts when invocation is via ctrl_unit(NST_INIT).
RESET:
jump mark where execution starts when invocation is via ctrl_unit(NST_RESET).
SAVE:
a pseudo-jump mark (pseudo, since no real jump is involved here) for the save method: expects to be followed by a comma-separated list (ended by a semicolon) of variable names. The variables in this list will later be saved whenever a save command is used to write a save-file. If the SAVE: mark is absent, the default is to always save all variables. However, in particular for image data, this can lead to huge files (the matrix elements are currently saved as formatted floats!); therefore, the SAVE: mark is provided to restrict the set of to-be-saved variables selectively. A SAVE: that is directly followed by a semicolon (i.e., the variable list is empty) specifies that no variables shall be saved at all. NOTE: 1. any variables behind the SAVE: must already be defined. 2. When multiple SAVE: lists are given, the last one overrides any previous one.
If no jump marks are specified, there is an implicit EXEC: at the beginning of the program.

SPECIAL FUNCTIONS:

There is a large number of special functions, most of them allowing the use of matrices as their operands. If the argument to a function is a dimension specification, it can usually be given as a pair of (constant) scalar expressions or by giving a matrix that has the desired size. Some functions have optional arguments that may be omitted. Some functions return results via their arguments; such arguments must always be variables, never expressions or values.

Elementwise Functions:

In the following, d denotes a dimension specification. This can be a single constant scalar expression or a pair of them (e.g., "1+cols(A),2*rows(A)"), or a single variable identifier. In the latter case, the variable identifier, say, "A", specifies as dimension its own dimension (i.e., it is equivalent to the pair "cols(A),rows(A)").
exp(x), log(x), tanh(x), abs(x), sgn(x), sqrt(x), sin(x), cos(x), fermi(x), ceil(x), floor(x):
replace each matrix element by the corresponding function value.
fermi'(x), tanh'(x):
the derivatives of the fermi() and the tanh() function.
set(d,a):
returns a matrix of dimension d, but with elements set to value a. If a is absent, a value of 0 is assumed.
set(d,a,b,c):
returns a matrix of dimension d, but with element ij set to value a+bi+cj. If c is absent, c=0 is assumed.
ins(x,y,i,j):
return copy of matrix x in which a submatrix is replaced with the elements of matrix y such that y[00] is copied into the position of former element x[ij]. If j or both i,j are omitted, values of i=0 and/or j=0 are assumed. If i<0, -i is taken as an offset from the last row of the matrix x. Similarly, if j<0, -j is taken as an offset from the rightmost column of x.
rect(x,i,j,k,l):
return submatrix of matrix x that starts at row i and column j (i.e., x[ij] will be the 00-element of the returned submatrix) and that is a k x l matrix. If k and l are absent, the single element x[ij] is returned. If i<0, -i is taken as an offset from the last row of the matrix x. Similarly, if j<0, -j is taken as an offset from the rightmost column of x.
pow(x,y):
each element to the power y (this differs from the y-fold matrix product x^y!!)
flipud(x):
flip x as a pixel matrix upside down
fliprl(x):
flip x as pixel matrix, reversing left-right-order
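Some of the functions above are simple enough to be mirrored in a few lines of Python, which may help to check expectations about their results. The sketch assumes that in set(d,a,b,c) the index i counts rows and j counts columns (as in the x[ij] notation of rect()); the dimension is passed here as explicit rows and cols arguments, and all function names are hypothetical stand-ins.

```python
def set_matrix(rows, cols, a=0.0, b=0.0, c=0.0):
    """Mirror of set(d,a,b,c): element ij has the value a + b*i + c*j."""
    return [[a + b * i + c * j for j in range(cols)] for i in range(rows)]


def pow_elementwise(x, y):
    """Mirror of pow(x,y): each element raised to the power y."""
    return [[e ** y for e in row] for row in x]


def flipud(x):
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(x)]


def fliprl(x):
    """Reverse the order of the elements within each row."""
    return [list(reversed(row)) for row in x]
```

Note that pow_elementwise([[1,2],[3,4]], 2) gives [[1,4],[9,16]], whereas the matrix product of that matrix with itself would be [[7,10],[15,22]].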

Special constants:

#pi:
the value pi

Operations with scalar result:

cols(x):
nr of columns of x
rows(x):
nr of rows of x
size(x):
nr of elements of x (=cols(x)*rows(x))
norm(x):
returns euclidean norm
norm1(x):
returns max abs value

Operations operating on columns of a matrix or on elements of a vector:

The following operations yield a scalar result for a vector argument, and a row vector for a matrix argument:
sum(x):
sum of each column of x
avg(x):
average of each column of x
std(x):
standard deviation of each column of x
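The result shapes of the column operations can be sketched in Python as follows. The helper reduce_columns is hypothetical; std() is left out because the text does not state which normalization (n or n-1) is used.

```python
def reduce_columns(x, f):
    """Apply f per column of matrix x; a vector yields a single scalar."""
    if len(x) == 1 or len(x[0]) == 1:  # row or column vector
        return f([e for row in x for e in row])
    return [f(list(col)) for col in zip(*x)]


def col_sum(x):
    """Mirror of sum(x)."""
    return reduce_columns(x, sum)


def col_avg(x):
    """Mirror of avg(x)."""
    return reduce_columns(x, lambda v: sum(v) / len(v))
```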

Max and min operations of a matrix or elements of a vector:

min(x):
minimal element of x
min(x,i,j):
minimal element of x with index position returned in i,j which must be scalar variables. If i,j did not exist before, they will become defined by their use as arguments in min(x,i,j).
Finally, if instead two arguments (x,v) are given, and the second argument v is a row vector with the row dimension of x, or a column vector with the column dimension of x, we have
min(x,v):
vector of the same shape as v, holding the minimal elements of each row (column) of x, if v is a column (row) vector. Vector v itself will be used to return the index values of the minimal elements.
Again, v need not be defined before. In this case it will be defined and dimensioned by the min() call, with the smaller dimension of the matrix x determining its dimension and shape ( v becoming a column, if the column dimension is smaller, a row otherwise). Special case: if x is a vector, and i a scalar or not yet defined, min(x,i) will return the minimal element of x and its index position in i. The max() function is fully analogous.

Operations requiring or yielding a square shaped matrix operand:

id(n):
returns the n x n unit (identity) matrix
tr(x):
the sum of the diagonal elements (trace) of x
det(x):
matrix determinant
diag(x):
returns a diagonal matrix whose diagonal is given by x, which can be a vector or a square matrix.
diagv(x):
the diagonal part of the square matrix x, returned as a mx1 vector
diag0(x):
a copy of x with all diagonal elements zeroed (i.e., the matrix x-diag(x) )
eigv(x,w):
return real parts of eigenvalues of a general square matrix x and write imaginary parts into the vector w (argument w must be a previously defined variable name; it may be omitted if the imaginary parts are not needed).
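The relation between tr(), diag(), diagv() and diag0() can be sketched in Python. The functions below are hypothetical stand-ins that follow the descriptions above; matrices are lists of rows and a vector argument of diag() is passed as a flat list.

```python
def tr(x):
    """Sum of the diagonal elements of the square matrix x."""
    return sum(x[i][i] for i in range(len(x)))


def diagv(x):
    """Diagonal of the square matrix x as an m x 1 column vector."""
    return [[x[i][i]] for i in range(len(x))]


def diag(x):
    """Diagonal matrix built from a flat vector or from a square matrix."""
    d = [x[i][i] for i in range(len(x))] if isinstance(x[0], list) else list(x)
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]


def diag0(x):
    """Copy of x with all diagonal elements set to zero."""
    n = len(x)
    return [[0 if i == j else x[i][j] for j in range(n)] for i in range(n)]
```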

Operations for symmetric matrices:

eigs(x):
computes the vector of eigenvalues of a symmetric matrix x (this is not checked; if x is non-symmetric, the result is invalid). Eigenvalues are returned as a column vector and are sorted in descending order (can be reversed with flip() ).
eigs(x,T):
eigenvalues as before, but in addition writes into T the eigenvectors. T must be a previously defined matrix variable of the same size as x. eigs(x,x) is allowed (overwrites the original matrix with the eigenvectors).

Further special functions:

svd(a,w,v):
computes singular value decomposition a=uwv' with u a matrix of the same shape as a, but orthogonal columns, w the diagonal vector of singular values and v an orthogonal square matrix. Arguments w and v are optional and can be omitted if these values are not required.
rot(a):
if a is a scalar: a 2x2 rotation matrix by angle a;
if a is a 3-vector: a 3x3 rotation matrix about the axis a;
if a is a 4-vector: a 4x4 homogeneous rotation matrix about the axis a.
sort(a):
sort rows of matrix a so that first column is in ascending order
sort(a,c):
sort rows of matrix a so that column c is in ascending order (columns can be sorted by transposing a and the result).
normal(d):
random matrix, elements normally distributed.
random(d):
random matrix, elements uniformly distributed in [0,1].
tf4(a,d):
[not implemented] (a and d 3-vectors) the homogeneous transformation specified by a rotation a and a translation d

CONTROL STRUCTURES:

if (A) {..} else {..}:
an if-else construct (the else part may be missing). A is allowed to be a matrix (note that this includes simple boolean expressions such as A<B etc.). If so, the condition evaluates to TRUE if all individual matrix elements are TRUE (note that this leads to a somewhat counterintuitive evaluation of some expressions, such as A!=B). The {..} that follow an else may also be another if-statement:

     if (A==1) { B=2; } else if (A==2) { B=4; } else B=0;

If the if- or else-body consists only of a single statement, the surrounding {..} may be absent.
while (A) {..}:
the usual while-statement, with A being any simple (i.e., a single comparison at most) boolean (matrix-) expression. Again, if A is matrix valued, the condition evaluates to the AND of all individual matrix elements. If only a single statement is to be iterated, the surrounding {..} may be absent.
for i=a to b step c {..}:
usual for-statement (C-equivalent would be for (i=a;i<=b;i+=c) ). The "step c"-part may be absent, in which case a step size of 1 is assumed. Note that (i) a,b,c must all be scalars, and (ii), if b-a is an integer multiple of c, both limits will be included in the set of iterated values for i. NOTE: since the limit values are always included, for i=0 to 5 does 6 iterations!
There is a second form of the for statement with a vector-valued loop-variable:
for i=B {..}:
executes the statements within {..} for all columns of the matrix B. When the statement is encountered, B is evaluated; then, the "loop variable" i is set in turn to each column of B, and for each such assignment the body of the for statement is executed. Note that i must be a vector with as many elements as there are in a column of B. Example:

     for i = [1,2,3;
              1,4,9] { ... }

In both forms, a pair of parentheses around the for-condition is tolerated (to conform closer with C-syntax), but not required.
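The values taken by the loop variable in both for-forms can be listed with a short Python sketch. The helper names are hypothetical, and only positive step sizes are handled, since the text does not describe the behavior for c<=0.

```python
def simple_for_values(a, b, c=1):
    """Values of i in 'for i=a to b step c' (C: for (i=a; i<=b; i+=c))."""
    if c <= 0:
        raise ValueError("sketch covers positive step sizes only")
    values = []
    i = a
    while i <= b:
        values.append(i)
        i += c
    return values


def matrix_for_values(B):
    """Values of i in 'for i=B': the columns of B, one per iteration."""
    return [list(col) for col in zip(*B)]
```

simple_for_values(0, 5) has six entries, which is the "6 iterations" case noted above, and matrix_for_values([[1,2,3],[1,4,9]]) yields the three columns (1,1), (2,4) and (3,9) of the example matrix.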

EXAMPLES:

In the following examples, all upper case letters A,B,C are m x m input matrices, x,y,z are column vectors of dimension m, s is a scalar value and X,Y,Z are general, rectangular matrices.
det(A-s);
evaluates the characteristic polynomial of A at s and returns scalar value at a single output field of one pin.
X'/(XX');
evaluates pseudoinverse for a rectangular matrix
M = YX'/(XX'+ s id(X)); M;
finds the coefficient matrix M that solves the matrix equation MX=Y best in the sense of least squares. Here, s is a regularization parameter.
To extract the R,G and B channels from a 3MN vector (or matrix) C of RGB-triples:

   A = set(size(C)/3,3); // make MN x 3 array
   A := C; // reshape C into MN x 3 array
   R = col(A,0); // first column is red (els 0,3,6.. of orig. C)
   G = col(A,1);
   B = col(A,2);
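The same reshape-and-slice idea, written as a Python sketch for comparison. It assumes that C is given as a flat list of RGB triples (r0,g0,b0,r1,g1,b1,..); split_rgb is a hypothetical helper name.

```python
def split_rgb(c):
    """Split a flat list of RGB triples into three channel lists."""
    if len(c) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    # Reshape into an MN x 3 array: one row per pixel.
    a = [c[k:k + 3] for k in range(0, len(c), 3)]
    # Each channel is one column of that array.
    r = [row[0] for row in a]
    g = [row[1] for row in a]
    b = [row[2] for row in a]
    return r, g, b
```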

To compute analytic functions of symmetric matrices, such as the matrix exponential EXP(tA) [this is not meant as the elementwise function!], one can use the eigs() operation (we assume A is a square input matrix, t a scalar):

   U = set(A); // suitable matrix for the eigenvectors
   v = eigs(A,U); // eigenvalues v and eigenvectors U
   U diag(exp(tv)) U'; // returns matrix exponential of tA

To avoid the full diagonalization at each invocation (e.g., perhaps only t is changing), split the processing, using a label:

   RESET:
   U = set(A); // suitable matrix for the eigenvectors
   v = eigs(A,U); // eigenvalues v and eigenvectors U
   EXEC:
   U diag(exp(tv)) U'; // returns matrix exponential of tA

The following computes the convolution of an input image A and sends the positive and the negative parts to separate output fields:

   M = [ 0.5, 1, 0.5 ;
         1, -6, 1 ;
         0.5, 1, 0.5 ]; // define filter mask

   C = A @ M; // "@" is convolution
   C>=0 ? C : 0; // positive part to output field 0
   C< 0 ? -C : 0; // abs of negative part to output field 1


A slightly more involved "program" for a simple Hebb-network (W is assumed to be an m x m matrix):

    EXEC: y = fermi(Wx); y; return; // transform
    ADAPT: W += syx'/(x'x); return; // normalized hebb rule,
                                        // s=learning rate
    INIT: W = set(yx',0); // initialization

Note that in the INIT-statement setting W=0 would define W as a 1 x 1 matrix. To have the proper size for later, we use the set() function which returns a matrix of the same shape as its argument, but filled with a given value (this makes W automatically adjusting, if the dimensions of x and/or y are changed).
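For readers who want to trace the three entry points by hand, here is a Python sketch of the same Hebb network. It assumes that fermi() is the logistic function 1/(1+exp(-x)), which is not stated in this page; vectors are flat lists, W is a list of rows, and the function names are hypothetical.

```python
import math


def fermi(v):
    """Assumed definition of fermi(): the logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def hebb_init(x, y):
    """INIT: a zero matrix with the shape of the outer product y x'."""
    return [[0.0] * len(x) for _ in y]


def hebb_exec(W, x):
    """EXEC: y = fermi(W x)."""
    return [fermi(sum(w * xi for w, xi in zip(row, x))) for row in W]


def hebb_adapt(W, x, y, s):
    """ADAPT: W += s y x' / (x' x), the normalized Hebb rule."""
    norm = sum(xi * xi for xi in x)
    return [[w + s * yi * xj / norm for w, xj in zip(row, x)]
            for row, yi in zip(W, y)]
```

Starting from hebb_init, one call of hebb_exec followed by hebb_adapt corresponds to one exec and one adapt invocation of the unit.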

EFFICIENCY CONSIDERATIONS:

Evaluation of expressions is from left to right. Thus, if A is a matrix and x,y,z are scalars, the product Axyz is evaluated as ((Ax)y)z and thus needs three passes over all matrix elements, while both A(xyz) and xyzA need only one (since the remaining multiplications need to deal only with scalars). Likewise, if A is a square matrix, the sum 1+2+3+A evaluates faster than A+1+2+3, since in the first case the promotion of 1+2+3 to a (diagonal) matrix occurs only at the last step, while in the other case each addition involves a matrix addition (however, in the present case the loop would be restricted to the diagonal elements). Thus, in general one should move the scalar part of a computation "to the left" or bracket it suitably.
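The cost difference between Axyz and A(xyz) can be made concrete by counting element multiplications. The following Python sketch does this for matrices given as lists of rows; the counting scheme and the function names are illustrative assumptions, not a description of the unit's internals.

```python
def scale(a, s, counter):
    """Multiply every element of a by s, counting one pass over a."""
    counter[0] += len(a) * len(a[0])
    return [[e * s for e in row] for row in a]


def left_to_right(a, x, y, z):
    """Evaluate ((A x) y) z; returns (result, element multiplications)."""
    counter = [0]
    result = scale(scale(scale(a, x, counter), y, counter), z, counter)
    return result, counter[0]


def scalars_first(a, x, y, z):
    """Evaluate A (x y z); returns (result, element multiplications)."""
    counter = [0]
    result = scale(a, x * y * z, counter)
    return result, counter[0]
```

For a 2x2 matrix, left_to_right touches 12 elements and scalars_first only 4, while both return the same matrix.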

PITFALLS:

The use of single variable names and juxtaposition for matrix multiplication leads to succinct code but requires some care. If a product, such as tr, coincides with the name of a predefined function (here tr() ), then tr(A) is parsed as the function tr() of the argument A (since "tr(" is considered as a single token, which is different from "tr" ), but tr (A) (i.e., with a space separating tr and (A) ) is parsed as the matrix product of t, r and the expression (A) (since now the reserved token "tr(" can no longer be recognized). To safeguard against such cases, it is advisable to separate the factors of a matrix product by spaces.

SEE ALSO:

prog_unit

FILE

/amnt/loge/users/nistaff02/nistaff/rhaschke/nst7/man/../o.linux//../foldersrc/nst_matrix_algebra.c