NAME
matrix_algebra -- evaluate matrix expression(s)
PROTOTYPE
unitptr matrix_algebra( int iNum, char *apcName[], int *piColDim, int *piRowDim, char *pcOpt, char *pcProg, unitptr uHost)
ARGUMENTS
- int iNum
- nr of input variables to use
- char *apcName[]
- their names (currently only single letters ok)
- int *piColDim
- their column dimensions
- int *piRowDim
- their row dimensions
- char *pcOpt
- option string
- char *pcProg
- the program to execute
- unitptr uHost
- host unit
RETURN VALUE:
A pointer to the created unit or NULL in the case of an error.
INTERFACE OF CREATED UNIT:
Input fields:
- inp_0[]:
- input field for setting first input variable. Dimension is given by
piColDim[0]*piRowDim[0].
- ...
- ....
- inp_k[]:
- input field for setting input variable k=iNum-1. Dimension is given by
piColDim[k]*piRowDim[k].
- CTL_in[]:
- control field (at position iNum). A value of
0 disables execution of the unit.
Output fields:
- out_0[]:
-
- ...
- ...
- out_f[]:
- there is one output for each computed matrix expression.
[see below]
SYNOPSIS:
Executes a program of linear algebra operations specified in string pcProg and
copies results into output fields.
Syntax is C-like, but with extensions for matrix operations.
Identifiers are restricted to single letter
variable names with juxtaposition denoting matrix multiplication
(for further differences, see SUPPORTED SYNTAX below).
Example: specifying
pcProg as the string
INPUT: A[10,10],y[10];
(AA'+id(10))\y;
computes the solution vector x of the linear equation (AA'+1)x=y
from a given matrix A and a given vector y, and returns it
at output field 0. A is expected at input field 0 (100 elements, packed),
and y is expected at input field 1 (10 elements, packed).
In general, matrix, vector and scalar data are imported by declaring
some identifiers (here, A and y) as "input variables" (see below). For
each input variable,
the created unit will have an input field of matching
dimension through which the elements of the variable can be set.
Results are returned by statements that consist of an expression
not assigned to a left hand side variable, e.g.,
result_expression1;
result_expression2;
..
result_expression_n;
This will give rise to the creation of n output fields with
appropriate dimensions to hold the results of the n matrix
expressions result_expression1..n above.
NOTE: the above statements copy the current value of result_expression
to the output, i.e., if result_expression is a single variable, subsequent
changes will only be seen at the output if followed by a further
statement "result_expression;".
Intermediate and repeatedly needed expressions can be assigned
to auxiliary variables (example: D=A+BC; ). Such assignments
do not give rise to the creation of an output field for their
result value; instead, the result value is copied into the
variable specified on the left hand side of the assignment
(this variable need not be among the set of names specified
in apcName[]; in this case, a properly dimensioned variable
is created). Variables created in this way retain their values
between successive invocations ( exec, adapt, ..) of the unit.
The input and output fields will be packed, unless they hold only a single
element.
INPUT VARIABLES:
Input variables are associated
with an NST input field of appropriate dimension through which their elements are set.
They can be defined in two different ways (which may be combined).
Either, an input variable can be requested by specifying its
name in an element of array parameter apcName[], and its column and
row dimensions in corresponding elements of piColDim[], piRowDim[].
iNum must be set to the array index plus 1 of the highest such
definition. Or, the required input variables are specified as
part of the program in pcProg, using a declaration of the form
INPUT: A, B[5], C[3,4];
This would define three input variables A,B and C of dimensions
1x1 (i.e., a scalar), 1x5 (i.e., a row vector) and 3x4 (i.e., a matrix).
NOTE: there is no analogous OUTPUT: keyword!
SUPPORTED SYNTAX:
Legal syntax is as in C, with the single data type float
(including vectors and matrices), no subroutines,
no block nesting, and the following further main differences.
- comments:
- // starts a comment that extends to the end of the
current line. The usual C-style multi-line comments are not supported.
- variables:
- variables must be single letter names. Declarations
are implicit: if a variable is assigned to an expression, it inherits
the dimension of the assigned expression. Explicit declarations are
used for defining "input variables" (cf. above).
The function set() is provided to make variables
with specified dimensions and values.
- control structures:
- there is only while(){}, for..{}
and if(){}else{}. The for-syntax differs from C: there
is a "simple form" for i=a to b step c {..} where
step c may also be absent (all variables must be scalars).
There is also a second
"matrix form" (for more details, cf. section CONTROL STRUCTURES
below). break, continue and return
are as in C. The conditions for if and while may be matrices,
in which case the boolean AND of all their elements is used
as the truth value (NOTE: this makes A!=B true only if A and
B differ in ALL their elements!).
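The AND-reduction rule for matrix-valued conditions can be sketched in Python as follows. This is only an illustration of the rule stated above, not the unit's implementation; the helper names elementwise_ne and truth_value are hypothetical, and matrices are modelled as lists of rows.

```python
def elementwise_ne(a, b):
    """Elementwise != on two equally sized matrices (lists of rows)."""
    return [[float(x != y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def truth_value(m):
    """Truth value of a matrix condition: boolean AND over all elements."""
    return all(v != 0 for row in m for v in row)

A = [[1, 2], [3, 4]]
B = [[1, 5], [6, 7]]   # differs from A in three of four elements
C = [[0, 5], [6, 7]]   # differs from A in every element

cond_ab = truth_value(elementwise_ne(A, B))  # element (0,0) is equal
cond_ac = truth_value(elementwise_ne(A, C))  # all elements differ
```

Here cond_ab is false although A and B are clearly different matrices, which is the counterintuitive case the note warns about.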
Boolean and arithmetic operations:
The usual boolean ( ||, &&, !, ==, !=, <=, >=, <,
>) operations are provided, extended to operate elementwise on matrix operands
(which must be of the same dimension).
For the comparison operations, this condition is relaxed:
if one operand is a scalar s and the other an MxN matrix A, the result
is again an MxN matrix, with elements given by the MxN comparisons of
s with the elements of A.
Addition and subtraction can be applied to any matrix
operands, which then must all be of the same column and row dimensions.
Sums of matrices may contain scalar values (or 1 x 1 matrices),
which are then "promoted" to properly sized matrices in which all
elements have the same value
(thus, 2+A adds the value 2 to all elements of the matrix
A). For the addition of a unit matrix, use the id() function.
Dealing with matrices, we have two types of multiplication
(as well as division): elementwise
multiplication (division) denoted by the operators '*' and 'div', and
matrix multiplication. The latter has no special operator: if two matrix variables
follow each other, there is an implied matrix multiplication (binding stronger
than '*' and 'div', thus, A B*C D div E evaluates as ((AB)*(CD)) div E ).
Matrix division has two infix operators '/' and '\' which are interpreted
as right or left multiplication with the inverse. Thus,
A/B denotes the matrix product A inverseof(B), and
A\B denotes inverseof(A) B. For instance, the solution vector x of the
linear system Ax=y is obtained as x=A\y.
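As a small worked illustration of right and left division, the following Python sketch forms the inverse of a 2x2 matrix explicitly. The helpers inv2 and matmul are hypothetical names used only here; whether the unit itself builds an explicit inverse or uses a solver is not stated in this manual. Exact fractions are used so that the checks are free of rounding error.

```python
from fractions import Fraction as F

def inv2(m):
    """Inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

A = [[F(2), F(1)], [F(1), F(3)]]
B = [[F(1), F(2)], [F(3), F(4)]]
y = [[F(3)], [F(5)]]

x = matmul(inv2(A), y)          # left division: inverseof(A) y
right_div = matmul(A, inv2(B))  # right division: A inverseof(B)
```

Multiplying back, A times x reproduces y, and right_div times B reproduces A.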
Assignment operations:
Only =, += and -= are provided.
The left hand side of an assignment must be a single variable
or a single matrix element (e.g., A_[2,3])
of the same dimension as the assigned expression.
(if the left hand side variable occurs for the first time,
its dimension is defined to be that of the assigned expression).
The additional assignment operator := allows an assignment,
if the dimensions of the left and right hand side differ, but
the number of elements still is the same (in this case, elements
are assigned as if both sides were vectors obtained by
concatenating their rows). Unlike in C, assignments have no
value and cannot be used in places where a value is expected.
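The reshaping behaviour of := can be pictured with the following Python sketch, which flattens the right hand side row by row and refills the target shape in the same order. The function name reshape_assign and the explicit rows/cols parameters are assumptions made for this illustration only.

```python
def reshape_assign(rhs, rows, cols):
    """Copy rhs into a rows x cols matrix, treating both sides as
    vectors obtained by concatenating their rows."""
    flat = [v for row in rhs for v in row]
    if len(flat) != rows * cols:
        raise ValueError("number of elements must be the same on both sides")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

C = [[1, 2, 3, 4, 5, 6]]        # a 1x6 row vector
A = reshape_assign(C, 2, 3)     # reshaped into a 2x3 matrix
T = reshape_assign(A, 3, 2)     # and further into a 3x2 matrix
```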
Conditional expressions:
The familiar A ? B : C is supported for matrices. If
some of the operands are scalars, they are "promoted" to the
dimension of the remaining matrices (which all must be of the
same dimension). This means that they are treated as if they
were a matrix of suitable dimension, with all elements of the same
value. The result is a matrix with the same dimension as the
non-scalar operands (or 1x1, if all operands are scalar).
If A is a scalar, only one of B, C is evaluated (as in C).
However, if A is a matrix, BOTH B and C are fully evaluated
and the result matrix R is defined in an elementwise fashion:
Rij=Aij?Bij:Cij.
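For the matrix case, the elementwise rule and the promotion of scalar operands can be sketched in Python as below. The helper names promote and select are hypothetical; the sketch covers only a matrix-valued condition and does not model the lazy evaluation of the scalar case.

```python
def promote(v, rows, cols):
    """Turn a scalar into a rows x cols matrix of equal elements;
    matrices (lists of rows) are passed through unchanged."""
    if isinstance(v, list):
        return v
    return [[v] * cols for _ in range(rows)]

def select(a, b, c):
    """R[i][j] = b[i][j] if a[i][j] is nonzero, else c[i][j]."""
    rows, cols = len(a), len(a[0])
    b = promote(b, rows, cols)
    c = promote(c, rows, cols)
    return [[b[i][j] if a[i][j] != 0 else c[i][j]
             for j in range(cols)] for i in range(rows)]

C = [[1.5, -2.0], [0.0, -0.5]]
mask = [[float(v >= 0) for v in row] for row in C]   # plays the role of C>=0
positive_part = select(mask, C, 0)                   # plays the role of C>=0 ? C : 0
```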
Index operator:
If A is a matrix,
the ij-element of A is given by A_[i,j]. In memory, A is
a linear array of size(A)=cols(A)*rows(A) elements, in the
order obtained by concatenating all rows (starting with row 0).
If a is a vector of length size(A), a:=A is legal and makes
the linear array of elements of A accessible via a. However,
usually this is not necessary, since even for matrices
"linear addressing" is also allowed, i.e., A_k and a_k
(and A_[k] as well) would yield the same elements in the above example.
Note that indices are treated like any other vector,
e.g., A_b with b=[i,j] a 2-vector is legal syntax. "_" binds stronger than
the other binary operations, i.e., A_bc and A_b*2 are interpreted
as (A_b)c and (A_b)*2.
If index values are out of range, periodic
boundary conditions are assumed, i.e., for a 2x3-matrix A,
A_[2,4]=A_[0,1] and A_6=A_0.
To extract entire rows, columns or submatrices, use the
row(), col() and rect() functions described below.
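The periodic boundary rule for out-of-range indices amounts to taking each index modulo the corresponding dimension. A minimal Python sketch, with hypothetical helper names at and at_linear and matrices stored as lists of rows; the treatment of negative indices is not specified in this manual and is not modelled here:

```python
def at(m, i, j):
    """Element [i,j] with periodic boundary conditions."""
    rows, cols = len(m), len(m[0])
    return m[i % rows][j % cols]

def at_linear(m, k):
    """Element k under linear addressing (rows concatenated, row 0 first),
    again with periodic wrap-around."""
    cols = len(m[0])
    k %= len(m) * cols
    return m[k // cols][k % cols]

A = [[10, 11, 12],
     [13, 14, 15]]              # a 2x3 matrix

wrapped = at(A, 2, 4)           # same element as at(A, 0, 1)
wrapped_linear = at_linear(A, 6)  # same element as at_linear(A, 0)
```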
Further matrix operations:
- "'"
- a postfix operator to denote the transpose A' of a matrix A.
Special case: if a, b are both n x 1 (i.e. column-) vectors,
a'b yields their scalar product.
- "@"
- infix operator for convolution operation. The operand to the
right of a "@" is the convolution kernel and
must be a matrix with an odd number of rows and columns.
"@" binds stronger than the above operators.
A@B@C is evaluated as (A@B)@C. Since all matrices are of
finite size, this is different from A@(B@C)!
- [..]
- concatenation of vectors/matrices: the contents between the
brackets is a semicolon separated list of "row-lists". Each row-list
is a comma-separated list of values. The result is a matrix.
Example: [1,2,3;4,5,6] specifies the 2x3 matrix with (1,2,3)
as its first and (4,5,6) as its second row. The elements of each row-list may
be matrices, if they all have the same column dimension.
NST-related commands:
- exec_opnd(i):
- execute NST unit at relative position i (i.e.,
i-th successor unit for i>0, i-th predecessor unit
for i<0; no operation is performed if i=0). i must
be a scalar constant.
- adapt_opnd(i),init_opnd(i),reset_opnd(i):
- analogous calls
for the NST adapt, init and reset methods (cf. also
prog_unit).
- EXEC:
- jump mark where execution starts when invocation is via
exec_unit(). Note that the colon is part of the jump mark!
- ADAPT:
- jump mark where execution starts when invocation is via
adapt_unit().
- INIT:
- jump mark where execution starts when invocation is via
ctrl_unit(NST_INIT).
- RESET:
- jump mark where execution starts when invocation is via
ctrl_unit(NST_RESET).
- SAVE:
- a pseudo-jump mark (pseudo, since no real jump is involved here)
for the save method: expects to be followed by a
comma-separated list (ended by a semicolon) of variable names.
The variables in this list will later be
saved whenever a save command is used to write a save-file.
If the SAVE: mark is absent, the default is to always save
all variables. However, in particular for image data, this can
lead to huge files (the matrix elements are currently saved
as formatted floats!), therefore, the SAVE: mark is provided to
restrict the set of to-be-saved variables selectively.
A SAVE: that is directly followed by a semicolon (i.e.,
the variable list is empty) specifies that no variables
shall be saved altogether. NOTE: 1. any variables behind the
SAVE: must already be defined. 2. When multiple SAVE: lists
are given, the last one overrides any previous.
If no jump marks are specified, there is an implicit EXEC: at the
beginning of the program.
SPECIAL FUNCTIONS:
There is a large number of special functions, most of them allowing
the use of matrices as their operands. If the argument to a function
is a dimension specification, it can usually be given as a pair
of (constant) scalar expressions or by giving a matrix that has
the desired size. Some functions have optional arguments that may
be omitted. Some functions return results via their arguments;
such arguments must always be variables, never expressions or
values.
Elementwise Functions:
In the following, d denotes a dimension specification. This
can be a single or a pair of constant scalar expression(s)
(e.g., "1+cols(A),2*rows(A)" )
or a single variable identifier.
In the second case, the variable identifier, say, "A",
specifies as dimension its own dimension (i.e., it
is equivalent to the pair "cols(A),rows(A)" ).
- exp(x), log(x), tanh(x), abs(x), sgn(x), sqrt(x), sin(x), cos(x), fermi(x), ceil(x), floor(x):
- replace each matrix element by the
corresponding function value.
- fermi'(x), tanh'(x):
- the derivatives of the fermi() and the tanh()
function
- set(d,a):
- returns a matrix of dimension d, but with
elements set to value a. If a is absent, a value
of 0 is assumed.
- set(d,a,b,c):
- returns a matrix of dimension d, but with
element ij set to value a+bi+cj. If c is absent,
c=0 is assumed.
- ins(x,y,i,j):
- return copy of matrix x in which a submatrix is replaced
with the elements of matrix y such that y[00] is copied
into the position of former element x[ij].
If j or both i,j are omitted, values of
i=0 and/or j=0 are assumed. If i<0, -i is taken
as an offset from the last row of the matrix x. Similarly,
if j<0, -j is taken as an offset from the rightmost
column of x.
- rect(x,i,j,k,l):
- return submatrix of matrix x that starts at
row i and column j (i.e., x[ij] will be the 00-element of
the returned submatrix) and that is a k x l matrix.
If k and l are absent, the single element x[ij] is returned.
If i<0, -i is taken as an offset from the last row of the matrix x.
Similarly, if j<0, -j is taken as an offset from the rightmost
column of x.
- pow(x,y):
- each element to the power y (this differs from
the y-fold matrix product x^y!!)
- flipud(x):
- flip x as a pixel matrix upside down
- fliprl(x):
- flip x as pixel matrix, reversing left-right-order
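To make the set(d,a,b,c) rule concrete, the following Python sketch builds a matrix whose element ij equals a+bi+cj. The function name set_ramp is hypothetical, and the dimension is passed as explicit rows and cols arguments because the ordering inside a dimension specification d is not modelled here.

```python
def set_ramp(rows, cols, a=0.0, b=0.0, c=0.0):
    """Matrix with element [i][j] = a + b*i + c*j.
    With b = c = 0 this is the constant fill of set(d,a)."""
    return [[a + b * i + c * j for j in range(cols)] for i in range(rows)]

ramp = set_ramp(2, 3, 1, 10, 1)   # row index weighted by 10, column index by 1
const = set_ramp(2, 2, 7)         # all elements equal to 7
zeros = set_ramp(1, 3)            # default value 0
```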
Special constants:
- #pi:
- the value pi
Operations with scalar result:
- cols(x):
- nr of columns of x
- rows(x):
- nr of rows of x
- size(x):
- nr of elements of x (=cols(x)*rows(x))
- norm(x):
- returns euclidean norm
- norm1(x):
- returns max abs value
Operations operating on columns of a matrix or on elements of a vector:
The following operations yield a scalar result for a vector argument,
and a row vector for a matrix argument:
- sum(x):
- sum of each column of x
- avg(x):
- average of each column of x
- std(x):
- standard deviation of each column of x
Max and min operations of a matrix or elements of a vector:
- min(x):
- minimal element of x
- min(x,i,j):
- minimal element of x with index position
returned in i,j which must be
scalar variables. If i,j
did not exist before, they will become defined
by their use as arguments in min(x,i,j).
Finally, if instead two arguments (x,v) are given, and the second argument
v is a row vector with the row dimension of x, or a column vector
with the column
dimension of x, we have
- min(x,v):
- vector of the same shape as v, holding
the minimal elements of each row (column) of x, if
v is a column (row) vector. Vector v itself
will be used to return the index values
of the minimal elements.
Again, v need not be defined before. In this case it will be
defined and dimensioned by the min() call, with the smaller
dimension of the matrix x determining its dimension and
shape ( v becoming a column, if the column dimension is smaller,
a row otherwise).
Special case: if x is a vector, and i a scalar or not yet
defined, min(x,i)
will return the minimal element of x and its index position in i.
The max() function is fully analogous.
Operations requiring or yielding a square shaped matrix operand:
- id(n):
- returns the n x n unit (identity) matrix
- tr(x):
- the sum of the diagonal elements (trace) of x
- det(x):
- matrix determinant
- diag(x):
- returns a diagonal matrix whose diagonal is given by
x, which can be a vector or a square matrix.
- diagv(x):
- the diagonal part of the square matrix x, returned as a mx1 vector
- diag0(x):
- a copy of x with all diagonal elements zeroed (i.e.,
the matrix x-diag(x) )
- eigv(x,w):
- return real parts of eigenvalues of a general square matrix
x and write imaginary parts into the vector w (argument w
must be a previously defined variable name; it may be
omitted if the imaginary parts are not needed).
Operations for symmetric matrices:
- eigs(x):
- computes the vector of eigenvalues of a symmetric matrix x
(this is not checked; if x is non-symmetric, the result is
invalid). Eigenvalues are returned as a column vector and
are sorted in descending order
(can be reversed with flip() ).
- eigs(x,T):
- eigenvalues as before, but in addition writes into T
the eigenvectors. T must be a previously defined matrix variable
of the same size as x. eigs(x,x) is allowed (overwrites
the original matrix with the eigenvectors).
Further special functions:
- svd(a,w,v):
- computes singular value decomposition a=uwv'
with u a matrix of the same shape as a, but orthogonal
columns, w the diagonal vector of singular values and
v an orthogonal square matrix. Arguments w and v
are optional and can be omitted if these values are
not required.
- rot(a):
- if a is a scalar: a 2x2 rotation matrix by angle a
if a is a 3-vector: a 3x3 rotation matrix about the
axis a
if a is a 4-vector: a 4x4 homogeneous rotation matrix about the
axis a
- sort(a):
- sort rows of matrix a so that first column is in ascending order
- sort(a,c):
- sort rows of matrix a so that column c is in ascending order
(columns can be sorted by transposing a and the result).
- normal(d):
- random matrix, elements normally distributed.
- random(d):
- random matrix, elements uniformly distributed in [0,1].
- tf4(a,d):
- [not implemented] (a and d are 3-vectors) the homogeneous transformation
specified by a rotation a and a translation d
CONTROL STRUCTURES:
- if (A) {..} else {..}:
- an if-else construct (the else part may be
missing). A is allowed to be a matrix (note that this includes simple
boolean expressions such as A<B etc.). If so, the condition evaluates to TRUE
if all individual matrix elements are TRUE (note that this leads to a somewhat
counterintuitive evaluation of some expressions, such as A!=B).
The {..} that follow an else may also be another if-statement:
if (A==1) { B=2; } else if (A==2) { B=4; } else B=0;
If the if- or else-body consists only of a single statement,
the surrounding {..} may be absent.
- while (A) {..}:
- the usual while-statement, with A being any
simple (i.e., a single comparison at most) boolean (matrix-) expression.
Again, if A is matrix valued, the condition evaluates to the AND of
all individual matrix elements.
If only a single statement is to
be iterated, the surrounding {..} may be absent.
- for i=a to b step c {..}:
- usual for-statement (C-equivalent would
be for (i=a;i<=b;i+=c) ). The "step c"-part may be absent, in which case
a step size of 1 is assumed. Note that (i) a,b,c must all be scalars, and
(ii), if b-a is an integer multiple of c,
both limits will be included in the set of iterated values for i.
NOTE: since the limit values are always included, for i=0 to 5
does 6 iterations!
There is a second form of the for statement with a vector-valued
loop-variable:
- for i=B {..}:
- executes the statements within {..} for all columns
of the matrix B. When the statement is encountered, B is evaluated; then,
the "loop variable" i is set in turn to each column of B, and for
each such assignment the body of the for statement is executed.
Note that i must be a vector with as many elements as there are in
a column of B. Example:
for i = [1,2,3;
1,4,9] { ... }
In both forms, a pair of parentheses around the for-condition is tolerated
(to conform closer with C-syntax), but not required.
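The iteration sets of the two for forms can be sketched in Python as follows. The function names simple_for and matrix_for are hypothetical; the simple form follows the C equivalent given above and assumes a positive step, since the behaviour for negative steps is not described here.

```python
def simple_for(a, b, c=1):
    """Values taken by i in 'for i=a to b step c'
    (C equivalent: for (i=a; i<=b; i+=c)); assumes c > 0."""
    values = []
    i = a
    while i <= b:
        values.append(i)
        i += c
    return values

def matrix_for(B):
    """Values taken by the loop variable in 'for i=B': the columns of B."""
    return [list(col) for col in zip(*B)]

inclusive = simple_for(0, 5)       # both limits are included
stepped = simple_for(0, 5, 2)      # 5 is not reached exactly
columns = matrix_for([[1, 2, 3],
                      [1, 4, 9]])  # the matrix from the example above
```

In the matrix form the body therefore runs three times for the example matrix, with i holding a 2-element column each time.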
EXAMPLES:
In the following examples, all upper case letters A,B,C are m x m input matrices,
x,y,z are column vectors of dimension m, s is a scalar value and X,Y,Z are
general, rectangular matrices.
- det(A-s);
- evaluates the characteristic polynomial
of A at s and returns scalar value at a single output field
of one pin.
- X'/(XX');
- evaluates pseudoinverse for a rectangular matrix
- M = YX'/(XX'+ s id(X)); M;
- finds the coefficient matrix M that solves the matrix
equation MX=Y best in the sense of least squares. Here, s is
a regularization parameter.
To extract the R,G and B channels from a 3MN vector (or matrix) C of RGB-triples:
A = set(size(C)/3,3); // make MN x 3 array
A := C; // reshape C into MN x 3 array
R = col(A,0); // first column is red (els 0,3,6.. of orig. C)
G = col(A,1);
B = col(A,2);
To compute analytic functions of symmetric matrices, such as the matrix
exponential EXP(tA) [this is not meant as the elementwise function!], one can
use the eigs() operation (we assume A is a square input matrix, t a scalar):
U = set(A); // suitable matrix for the eigenvectors
v = eigs(A,U); // eigenvalues v and eigenvectors U
U diag(exp(tv)) U'; // returns matrix exponential of tA
To avoid the full diagonalization at each invocation (e.g., perhaps only t
is changing), split the processing, using a label:
RESET:
U = set(A); // suitable matrix for the eigenvectors
v = eigs(A,U); // eigenvalues v and eigenvectors U
EXEC:
U diag(exp(tv)) U'; // returns matrix exponential of tA
The following computes the convolution of an input image A and sends the
positive and the negative parts to separate output fields:
M = [ 0.5, 1, 0.5 ;
1, -6, 1 ;
0.5, 1, 0.5 ]; // define filter mask
C = A @ M; // "@" is convolution
C>=0 ? C : 0; // positive part to output field 0
C< 0 ? -C : 0; // abs of negative part to output field 1
A slightly more involved "program" for a simple Hebb-network (W is assumed
to be an m x m matrix):
EXEC: y = fermi(Wx); y; return; // transform
ADAPT: W += syx'/(x'x); return; // normalized hebb rule,
// s=learning rate
INIT: W = set(yx',0); // initialization
Note that in the INIT-statement setting W=0 would define W as a 1 x 1
matrix. To have the proper size for later, we use the set() function
which returns a matrix of the same shape as its argument, but filled with
a given value (this makes W automatically adjusting, if the dimensions of
x and/or y are changed).
EFFICIENCY CONSIDERATIONS:
Evaluation of expressions is from left to right. Thus, if A is a
matrix and x,y,z are scalars, the product Axyz is evaluated
as ((Ax)y)z and thus needs three passes over all matrix elements,
while both A(xyz) and xyzA need only one (since the remaining
multiplications need to deal only with scalars). Likewise, if
A is a square matrix, the sum 1+2+3+A evaluates faster than
A+1+2+3, since in the first case the promotion of 1+2+3 to a
(diagonal) matrix occurs only at the last step, while in the other
case each addition involves a matrix addition (however, in the
present case the loop would be restricted to the diagonal elements).
Thus, in general one should move the scalar part of a computation
"to the left" or bracket it suitably.
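The cost difference between Axyz and A(xyz) can be illustrated by counting passes over the matrix in a small Python sketch. The helper scale and the pass counters are hypothetical and only model the left-to-right evaluation order described above, not the unit's actual evaluator.

```python
def scale(m, s, counter):
    """Multiply every element of m by scalar s; counts one pass over m."""
    counter[0] += 1
    return [[v * s for v in row] for row in m]

A = [[1.0, 2.0], [3.0, 4.0]]
x, y, z = 2.0, 3.0, 4.0

passes_left = [0]      # Axyz evaluated as ((Ax)y)z
r_left = scale(scale(scale(A, x, passes_left), y, passes_left), z, passes_left)

passes_grouped = [0]   # A(xyz): scalars are multiplied first
r_grouped = scale(A, x * y * z, passes_grouped)
```

Both variants yield the same matrix, but the first makes three passes over the elements of A and the second only one.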
PITFALLS:
The use of single variable names and juxtaposition for matrix
multiplication leads to succinct code but requires some care.
If a product, such as tr, coincides with the name of a predefined
function (here tr() ), then tr(A) is parsed as the function
tr() of the argument A (since "tr(" is considered as a single
token, which is different from "tr" ), but tr (A)
(i.e., with a space separating
tr and (A) ) is parsed as the matrix product tr and the expression
(A) (since now the reserved token "tr(" cannot any longer be
recognized). To safeguard against such cases,
it is advisable to separate the factors of a matrix product
by spaces.
SEE ALSO:
prog_unit
FILE
/amnt/loge/users/nistaff02/nistaff/rhaschke/nst7/man/../o.linux//../foldersrc/nst_matrix_algebra.c