NAME

psom2 - parametric self-organizing map

PROTOTYPE

unitptr psom2( int iXDim, int iMDim, int *piKnotMax, float *ppfKnotVal[], int *piKnotMaxUsed, char *tOptions, unitptr host)

ARGUMENTS

int iXDim
dimension of embedding space and input/output vector
int iMDim
dimension of map manifold (iMdim<=iXdim).
int *piKnotMax
iMdim dimensional array holding for each map axis i, i=0..iMdim-1, the number of knots at which knot values are specified. If piKnotMax==NULL, the minimal values piKnotMax[i]=2 are assumed for each axis. Below, this array will be referred to as knots[].
float *ppfKnotVal[]
pointer to iMdim arrays holding knot values used for each axis direction (NOTE: you cannot use a two-dimensional array ppfKnotVal[iMdim][m] here!). If NULL, ppfKnotVal[] are assumed equidistant. Below, this array will be referred to as knot_val[][]
int *piKnotMaxUsed
maximal number of respected knots per axis (or NULL)
char *tOptions
option string, see below
unitptr host
host unit of generated unit.
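
The note on ppfKnotVal[] can be illustrated with a small sketch. It only builds the knot arguments and does not call psom2() itself; the helper lattice_size() and the variable names axis0, axis1, knots and knot_val are hypothetical and not part of NST.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of NST): returns the product of the knot
 * counts. A NULL piKnotMax means the minimal 2 knots per axis. */
static int lattice_size(int iMDim, const int *piKnotMax)
{
    int num = 1;
    assert(iMDim > 0);
    for (int i = 0; i < iMDim; i++)
        num *= (piKnotMax != NULL) ? piKnotMax[i] : 2;
    return num;
}

/* Knot values for a 2-d map with 3 knots on axis 0 and 2 knots on axis 1.
 * Each axis has its own array, so the rows may differ in length. */
static float axis0[] = { 0.0f, 0.5f, 1.0f };
static float axis1[] = { 0.0f, 1.0f };
static int   knots[] = { 3, 2 };

/* Array of pointers, matching the parameter type float *ppfKnotVal[].
 * A float[2][3] array would decay to float (*)[3], which is a different
 * type and cannot be passed here. */
static float *knot_val[] = { axis0, axis1 };
```

With these definitions, knots and knot_val have the types expected for the piKnotMax and ppfKnotVal arguments.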

INTERFACE OF CREATED UNIT:

X_in[iXdim]:
(input field 0) input vector
Y_in[iMdim]:
(input field 1) normalized map coordinates to start best match search (for mode=CTL_in[3]>0) or to specify adaptation center (see below)
Z_in[iXdim]:
(input field 2) mask for input vector
Eps_in[2]:
(input field 3) adaptation gains: lambda (initial Levenberg-Marquardt parameter) and eps2 (adjustment learning constant).
Ctl_in[5]:
(input field 4) control inputs: (0) time constant c (presently ignored), (1) minimal iteration step size threshold maxerr (in normalized map coordinates), (2) maxit, the maximal number of internal best-match search iterations per exec call, (3) mode parameter (default=0, see below), and (4) auto-adjust factor for lambda (see PIN MACROS).
X_out[iXdim]:
(output field 0) output vector = closest projection of input vector onto map manifold
Y_out[iMdim]:
(output field 1) normalized map coord belonging to output vector
Z_out[num*iXdim]:
(packed output field 2) components of internal weight vectors w[a], a=0..num-1.
out_4[iXdim]:
(unpacked output field 4) stores the scaling per x-dimension. If fmPSOM_KnotsXvolDiag is negative, this will be recomputed to the range (max-min)**-2 in order to weight each dimension equally.

EXECUTION OF CREATED UNIT:

Finds for an input vector X_in the closest point w=W(s) that lies on a user-specified "parametrized self-organizing map" ("PSOM") represented by an iMdim-dimensional manifold W embedded in an iXdim-dimensional embedding space. The computed point w is returned at output field X_out and satisfies the equation

d(x,w;Z_in) = min_s d(x,W(s); Z_in)

where

d(x,w;Z_in) = sum_j Z_in[j]*(x[j]-w[j])^2

and W(s) is from an iMdim-dimensional manifold that is defined by values on a lattice of num=knots[0]*knots[1]*..*knots[iMdim-1] knot points. The knot points are given by the num points (knot_val[0][a_0],knot_val[1][a_1],..,knot_val[n-1][a_(n-1)]), a_k=0..knots[k]-1, with n=iMdim. Each knot point can therefore be identified by the integer iMdim-tuple a_0...a_(n-1) ("lattice tuple"), or, alternatively, by the single integer


   a = a_0 + a_1*knots[0] + a_2*knots[0]*knots[1] + ...
      + a_(n-1)*knots[0]*..*knots[iMdim-2].
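
The mapping between a lattice tuple and the single integer a can be sketched as follows. The function names lattice_index() and lattice_tuple() are hypothetical and only illustrate the formula; they are not NST functions.

```c
#include <assert.h>

/* Lattice tuple -> single integer:
 * a = a_0 + a_1*knots[0] + a_2*knots[0]*knots[1] + ...
 * n is the number of map axes (iMdim). */
static int lattice_index(const int *a, const int *knots, int n)
{
    int idx = 0, stride = 1;
    assert(n > 0);
    for (int k = 0; k < n; k++) {
        assert(a[k] >= 0 && a[k] < knots[k]);
        idx += a[k] * stride;
        stride *= knots[k];   /* stride = knots[0]*..*knots[k] */
    }
    return idx;
}

/* Inverse mapping: single integer -> lattice tuple (written to a[]). */
static void lattice_tuple(int idx, const int *knots, int n, int *a)
{
    assert(n > 0 && idx >= 0);
    for (int k = 0; k < n; k++) {
        a[k] = idx % knots[k];
        idx /= knots[k];
    }
}
```

For knots[] = {3,2,4}, the tuple (2,1,3) maps to 2 + 1*3 + 3*6 = 23, and lattice_tuple() recovers the tuple from 23.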

The map manifold W is constructed using num reference vectors w[a], together with an equal number of internally generated basis functions B(a,s). Here, a=0..num-1 denotes a lattice tuple a_0..a_(n-1) as indicated by the previous equation, and B(a,s) is such that

    B(a,s_b) = 1 if a=b
    B(a,s_b) = 0 if a<>b

where s_a, s_b are knot points belonging to lattice tuples a, b, respectively (for a fuller description of the B(a,s) see below). The manifold of values W(s) is then defined by

W(s) = sum_a B(a,s) w_a

Here, s is a iMdim-dimensional vector of "normalized map coordinates" and has the iMdim-dimensional unit cube [0,1]^n as its range. Due to the choice of the B(a,s) the value of W at a knot point s_a is given by W(s_a)=w_a. Therefore, the map manifold W passes through the points w_a. These points -- and therefore the shape of the map -- can be assigned with the adapt-routine of this unit. The exec-routine uses the previous equation to determine for a given input x a vector s of normalized map coordinates that makes the distance d(W(s),x;Z_in) as small as possible. This is achieved by iterating the following equation ("best match search"):

s(t+1) = s(t) + eps1 * (DW/Ds)^T * (x - W(s(t)))

If CTL_in[3]=0 (the default) this iteration sequence is started with the normalized map coordinates s(0) that belong to the best matching reference vector (using the Euclidean metric and the metric coefficients of input field Z_in as weighting factors for the iXdim axes). If CTL_in[3]>0, the vector specified at input field Y_in is used as starting value instead. The exec-routine returns when either fmPSOM_MaxIter(u) iterations of this equation have been made or ||s(t+1)-s(t)|| < fmPSOM_ErrLim(u) occurs.
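
The iteration above can be sketched for a one-dimensional map (iMdim=1) with three knots in a two-dimensional embedding space. This is a minimal illustration of the displayed update equation with a fixed step size eps1. It is not the unit's implementation, which according to the pin description uses a Levenberg-Marquardt parameter. The names lagrange(), psom_eval() and best_match(), the finite-difference derivative, and the omission of the Z_in mask are assumptions of this sketch.

```c
#include <assert.h>

#define NK 3   /* number of knots on the single map axis */
#define XD 2   /* dimension of the embedding space */

/* Single-variable polynomial basis function f[j](x) on knots t[0..m-1]. */
static double lagrange(const double *t, int m, int j, double x)
{
    double f = 1.0;
    assert(m >= 2 && j >= 0 && j < m);
    for (int k = 0; k < m; k++)
        if (k != j)
            f *= (x - t[k]) / (t[j] - t[k]);
    return f;
}

/* W(s) = sum_a B(a,s) w[a] for a 1-d map, result written to out[]. */
static void psom_eval(const double *t, double w[NK][XD], double s, double *out)
{
    for (int j = 0; j < XD; j++)
        out[j] = 0.0;
    for (int a = 0; a < NK; a++) {
        double b = lagrange(t, NK, a, s);
        for (int j = 0; j < XD; j++)
            out[j] += b * w[a][j];
    }
}

/* s(t+1) = s(t) + eps1 * (DW/Ds)^T * (x - W(s(t))), started at s.
 * DW/Ds is approximated by a central finite difference (assumption). */
static double best_match(const double *t, double w[NK][XD], const double *x,
                         double s, double eps1, double errlim, int maxit)
{
    const double h = 1e-6;
    for (int it = 0; it < maxit; it++) {
        double w0[XD], wp[XD], wm[XD], g = 0.0;
        psom_eval(t, w, s, w0);
        psom_eval(t, w, s + h, wp);
        psom_eval(t, w, s - h, wm);
        for (int j = 0; j < XD; j++)
            g += (wp[j] - wm[j]) / (2.0 * h) * (x[j] - w0[j]);
        double step = eps1 * g;
        s += step;
        if ((step < 0.0 ? -step : step) < errlim)
            break;   /* ||s(t+1)-s(t)|| below the error limit */
    }
    return s;
}
```

With knots at 0, 0.5, 1 and reference vectors (0,0), (0.5,0.25), (1,1), the manifold is W(s)=(s,s^2). Starting at the knot s=0.5, the search for the input x=(0.3,0.09) moves toward s=0.3.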

ADAPTATION OF CREATED UNIT:

a. AUTOCOUNT-LOADING-MODE:

The adaptation routine of this unit allows the reference vectors w_a to be adjusted. If a call ctrl_unit(mode,u) with mode=NST_INIT has been made, the unit will be in "autocount-loading mode" for the next num=knots[0]*knots[1]*..*knots[iMdim-1] adapt_unit calls. In this mode, the exec_unit call has no effect and, starting with the first internal psom node, each adapt_unit call stores the current input vector at input field X_in as the reference vector of one of the internal psom nodes. The internal nodes are considered to be arranged on an iMdim-dimensional lattice with knots[i] lattice points along its i-th side. Nodes are visited in the following order: the i=0 lattice coordinate varies fastest, the i=1 lattice coordinate second fastest, and so on, up to the i=n-1 lattice coordinate. The autocount-loading mode ends, and the unit enters RAM mode, either after exactly num adapt_unit calls (after which each node has obtained a reference vector) or when ctrl_unit(mode,u) is issued with one of the other mode parameter values, NST_I_ZERO or NST_I_RND.
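
The visiting order of the nodes can be sketched with a small counter that advances a lattice tuple, axis 0 fastest. The function next_node() is a hypothetical helper for preparing training vectors in the right order; it is not part of NST.

```c
#include <assert.h>

/* Advances the lattice tuple a[0..n-1] to the next node in loading order
 * (axis 0 varies fastest). Returns 1 if a new node was reached, or 0 if
 * the tuple wrapped around after the last node. */
static int next_node(int *a, const int *knots, int n)
{
    assert(n > 0);
    for (int k = 0; k < n; k++) {
        if (++a[k] < knots[k])
            return 1;
        a[k] = 0;   /* carry into the next slower axis */
    }
    return 0;
}
```

Starting from the all-zero tuple, calling next_node() until it returns 0 enumerates all num nodes. For knots[] = {2,3} the order is (0,0), (1,0), (0,1), (1,1), (0,2), (1,2).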

b. RAM-ADAPT:

If a call ctrl_unit(mode,u) with mode=NST_I_RND or NST_I_ZERO has been made (and there was no intervening exec_unit(u) call since then), the unit will be in random-access (RAM) mode. In this mode, the unit uses the values at input fields x=X_in, s=Y_in to adapt its internal reference vectors according to the formula:

w_a = w_a + eps2 * B(a,s) *(X_in - w_a)

This is the learning rule for a self-organizing map with reference vectors w_a (defined on the lattice of knot points) and neighborhood function B(a,s).
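
The learning rule can be written out as a short loop. This sketch assumes the basis function values B(a,s) have already been computed and that the weights are stored in the packed order w[a][0..iXdim-1]; the function name ram_adapt() is hypothetical.

```c
#include <assert.h>

/* Applies w_a = w_a + eps2 * B(a,s) * (x - w_a) to all num reference
 * vectors. w is packed as w[a*xdim + j]; B[a] holds B(a,s). */
static void ram_adapt(double *w, const double *B, int num, int xdim,
                      const double *x, double eps2)
{
    assert(num > 0 && xdim > 0);
    for (int a = 0; a < num; a++)
        for (int j = 0; j < xdim; j++)
            w[a * xdim + j] += eps2 * B[a] * (x[j] - w[a * xdim + j]);
}
```

If s lies exactly on a knot point s_b, the orthogonality property gives B(b,s)=1 and B(a,s)=0 otherwise, so only w_b moves toward x.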

c. POST-ADAPT-MODE:

If an exec_unit(u) call has been issued (while not in autocount-loading mode), the computation is as before, except that now B(a,s), computed by the previous exec call (in response to the input vector X_in), is reused. IMPORTANT NOTE: adaptation in this mode will operate only on those dimensions of the embedding space that have been chosen with the -inp mask option, specifying a letter 'O' in the mask position of each dimension that shall become adapted (example: -inp OO for a 2d-embedding space makes all embedding dimensions adaptive).

DESCRIPTION:

Uses a set of num=knots[0]*knots[1]*..*knots[iMdim-1] iXdim-dimensional reference vectors w[0]..w[num-1] "attached" to a lattice of as many knot points to establish an iMdim-dimensional "parametrized self-organizing map" ("PSOM") W that is embedded in an iXdim-dimensional space V and that passes through the given reference vectors. The manifold that represents the map is set up by assigning suitable values to the reference vectors (using the adapt-routine of the created unit).

Once the manifold has been set up, the exec-routine of the created unit will determine, for a given input vector x, the closest vector w=W(s) of the manifold as the result vector. In addition, the "normalized map coordinates" s are returned (output field Y_out[iMdim]). For the distance computation the user can specify a diagonal metric at input field Z_in[iXdim]. In particular, if the k-th entry of that field is zero, the corresponding component of x will be ignored. However, after return of the exec-routine, w will contain a value in its k-th element that is determined by the shape of the manifold W and that can be considered as a kind of "associative completion" of the input vector x, using those elements for which nonzero mask values were specified.

The result vector w is computed by an iterative procedure. If the value at input field CTL_in(u,3)=0 (this is the default), the starting value is the lattice point of the best-match reference vector; otherwise the current value at input field Y_in (X(u,1)) is used. The user can specify both an error threshold maxerr (at input field CTL_in(u,1)) and a limit maxit (at input field CTL_in(u,2)) on the number of performed iterations. The exec-routine will return when either maxit iterations have been made, or the (root mean square) change of the result (in normalized map coordinates) per step has become smaller than maxerr. Once the manifold has been set up, it can still be changed adaptively. For details, see section ADAPTATION.

The internal weight values w[a], a=0..num-1, are available at the packed output field Z_out[iXdim*num] in the order w[a=0][0..iXdim-1], w[a=1][0..iXdim-1],..w[a=num-1][0..iXdim-1].
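
The effect of the diagonal metric at Z_in can be illustrated by writing out the distance d(x,w;Z_in) = sum_j Z_in[j]*(x[j]-w[j])^2. The function name masked_dist() is hypothetical; the sketch only shows how a zero mask entry removes a component from the distance.

```c
#include <assert.h>

/* d(x,w;z) = sum_j z[j]*(x[j]-w[j])^2 over n = iXdim components.
 * A component with z[j] == 0 does not contribute to the distance. */
static double masked_dist(const double *x, const double *w,
                          const double *z, int n)
{
    double d = 0.0;
    assert(n > 0);
    for (int j = 0; j < n; j++) {
        double diff = x[j] - w[j];
        d += z[j] * diff * diff;
    }
    return d;
}
```

For x=(1,2,3) and w=(0,0,0), the mask (1,1,1) gives 14, while the mask (1,0,1) ignores the second component and gives 10.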

BASIS FUNCTIONS:

To define the map, the unit uses a set of num internally generated basis functions B(a,s) (each such basis function can be imagined as specifying the shape of an 'activity bubble' centered at location s_a in the map).

Polynomial Basis Functions:

These are the default; they obey the 'orthogonality' property

    B(a,s_a) = 1, B(a,s_b) = 0 if a<>b

The multivariate basis functions B(a,.) are derived from products of single-variable basis functions f[i,j](.): [0,1]->R, i=0..iMdim-1, j=0..knots[i]-1 by forming the products

B(a,s) = prod_i f[i,a_i](s_i)

Here, the product runs over all map axes i=0..iMdim-1, a_i are the components of the lattice tuple associated with a=0..num-1, and s_i is the i-th component of the normalized map coordinate vector s. The functions f[i,j] obey f[i,j](knot_val[i][j])=1 and vanish at all knot values knot_val[i][k] for which k<>j. They are defined by


    f[i,j](x) = prod_(k<>j) (x-t_ik)/(t_ij-t_ik)
    t_ik = knot_val[i][k]
    d_ij = t_i1 for j = 0
         = 1 - t_(i,j-1) for j = knots[i]-1
         = (t_(i,j+1) - t_(i,j-1))/2 else.
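
The polynomial basis functions can be sketched for a two-dimensional map. The function names lagrange() and basis2d() are hypothetical; the code only restates the product formulas above and is not taken from the NST sources.

```c
#include <assert.h>

/* Single-variable basis function for one axis:
 * f[j](x) = prod_(k<>j) (x - t[k]) / (t[j] - t[k]),
 * where t[0..m-1] are the knot values of that axis. */
static double lagrange(const double *t, int m, int j, double x)
{
    double f = 1.0;
    assert(m >= 2 && j >= 0 && j < m);
    for (int k = 0; k < m; k++)
        if (k != j)
            f *= (x - t[k]) / (t[j] - t[k]);
    return f;
}

/* B(a,s) = f[0,a_0](s_0) * f[1,a_1](s_1) for a 2-d map, where the
 * lattice tuple (a_0,a_1) is decoded from a = a_0 + a_1*m0. */
static double basis2d(const double *t0, int m0, const double *t1, int m1,
                      int a, const double *s)
{
    int a0 = a % m0, a1 = a / m0;
    assert(a >= 0 && a < m0 * m1);
    return lagrange(t0, m0, a0, s[0]) * lagrange(t1, m1, a1, s[1]);
}
```

At a knot point the functions reproduce the orthogonality property: with knots 0, 0.5, 1 on axis 0 and 0, 1 on axis 1, B(4,s) is 1 at s=(0.5,1) and 0 at every other knot point.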

Gaussian Basis Functions:

Choosing gaussian basis functions of width sigma requires specifying the option -gauss sigma in tOptions. The B(a,s) will then be derived (see below) from the gaussian basis functions

G(a,s) = exp(-(s-s_a)^2/(2*sigma^2))

As for the polynomial basis functions, these are derived from products of 1d-gaussians along each axis. However, the G(a,s) are not orthogonal; therefore, internally the psom computes from the G(a,s) functions a dual function set B(a,s) that obeys the same orthogonality property as in the polynomial case. The B(a,s) are a linear combination of the G(a,s):

B(a,s) = sum_b M_ab G(b,s)

where matrix M obeys sum_b M_ab G(b,s_c)=1 if a=c and zero otherwise (M is the inverse of G when restricted to the lattice). In the current implementation, the transformation to the B(a,s) functions is actually carried out by equivalently applying the transposed transformation (which is again given by M, since M is symmetric) to the weights w. However, this transformation step is only carried out after the weights have been set in autocount-loading mode. As a result, the constructed manifold will then pass through the data points w originally given during autocount-loading mode. All other adaptation modes operate on the transformed weight values and thus act as if orthogonal basis functions B had been used from the outset. NOTE: the use of gaussian basis functions is currently not compatible with HYPO PSOM mode (cf. below).

OPERATOR MODE:

The PSOM unit can be made to call operands on each iteration step. To turn this feature on, pass the -operands and -operMode tokens in the option text input (also tCtrlPsom), see below. The -operMode code recognizes the following bits:

0x1:
compute the full X_out (otherwise only s=Y_out is valid)
0x2:
call twice on back steps (otherwise successful steps look the same as unsuccessful ones),
0x4:
do not issue NST_W_PENUP, NST_W_PENDOWN to operands. That is, by default the operands are signaled with the NST_W_PENDOWN code on the first iteration and with NST_W_PENUP on the last iteration.
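
The -operMode code is a bit mask, so the three bits can be combined and tested independently. The enum names and the helper oper_mode_has() below are hypothetical; only the bit values 0x1, 0x2 and 0x4 come from the list above.

```c
#include <assert.h>

/* Hypothetical names for the three documented -operMode bits. */
enum {
    OPER_FULL_XOUT     = 0x1,  /* compute the full X_out */
    OPER_TWICE_ON_BACK = 0x2,  /* call operands twice on back steps */
    OPER_NO_PEN_SIGNAL = 0x4   /* no NST_W_PENUP / NST_W_PENDOWN */
};

/* Returns 1 if the given bit is set in the operMode code, else 0. */
static int oper_mode_has(int code, int bit)
{
    assert(code >= 0);
    return (code & bit) != 0;
}
```

For example, the code 5 (0x1 | 0x4) requests the full X_out and suppresses the pen signals, but does not request the second call on back steps.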

CONTROL MODES:

NST_I_USER:
compute the embedding map location X_out[] based on the given position in the mapping manifold s=Y_in[].
NST_INIT:
switch to autoload adaptation mode, see above.
NST_I_RND or NST_RND:
turn autoload adaptation mode off, see above.

HYPO PSOM (Preliminary):

The number of respected knots per axis i can be restricted to piKnotMaxUsed[i]. This is ignored if piKnotMaxUsed==NULL. If one component is 0, this and all following are replaced by the previous max number. See also psomSetHypo().

DEFAULTS:

all mask values Z_in[i]=1. No autoscaling mode (THIS IS A CHANGE!).

PIN MACROS:

The following macros are defined in nst_psom.h

   fmPSOM2_Lambda0(u)      X((u),3,0) /* 1st Levenberg-Marquardt <1.0> */
   fmPSOM_Eps2_in(u)       X((u),3,1) /* eps2 <1> */
                           X((u),4,0) /* n.u. <default value> */
   fmPSOM_ErrLim(u)        X((u),4,1) /* min step limit <1.0e-5> */
   fmPSOM_MaxIter(u)       X((u),4,2) /* max num of iterations <50> */
   fmPSOM2_ExecMode(u)     X((u),4,3) /* exec mode <0=best match> */
   fmPSOM2_AdjLambda(u)    X((u),4,4) /* auto-adjust Lambda [0.05..0.5] <0.2> */

   mPSOM2_LambdaOut(u)     Y((u),3,1) /* last Lambda used */
   fmPSOM_Iter(u)          Y((u),3,0) /* num of iterations */
   fmPSOM_xyDist(u)        Y((u),3,2) /* dist in input subspace */
   fmPSOM_DeltaSabs(u)     Y((u),3,3) /* last step in map coord */
   fmPSOM_KnotsXvolDiag(u) Y((u),3,4) /* hyper box diagonal of vectors <recompute if negative> */

If fmPSOM_KnotsXvolDiag(u) is set negative, the stored reference vectors will be scanned and field out_4 will be recomputed. The Levenberg-Marquardt parameter Lambda is decreased by the factor fmPSOM2_AdjLambda(u) after each successful iteration, or otherwise increased by the factor 1.0/fmPSOM2_AdjLambda(u)+0.123. An additional field out_4[iXdim] contains information on the hyperbox volume spanned by the set of reference vectors:

(max-min)**-2 of knot values along each x-direction (0..iXdim-1).
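
The per-dimension scaling (max-min)**-2 can be sketched as a scan over the packed reference vectors. The function name axis_scaling() is hypothetical, and the handling of a dimension with zero extent (scale set to 0) is an assumption of this sketch, since the behaviour of the unit in that case is not documented here.

```c
#include <assert.h>

/* Computes scale[j] = (max_j - min_j)^-2 over all num reference vectors,
 * packed as w[a*xdim + j]. Assumption: a dimension in which all vectors
 * coincide gets scale 0. */
static void axis_scaling(const double *w, int num, int xdim, double *scale)
{
    assert(num > 0 && xdim > 0);
    for (int j = 0; j < xdim; j++) {
        double lo = w[j], hi = w[j];
        for (int a = 1; a < num; a++) {
            double v = w[a * xdim + j];
            if (v < lo) lo = v;
            if (v > hi) hi = v;
        }
        double range = hi - lo;
        scale[j] = (range > 0.0) ? 1.0 / (range * range) : 0.0;
    }
}
```

Multiplying the squared difference in each dimension by this factor weights every dimension by the extent of the reference vectors along it, so that a dimension with a large numeric range does not dominate the distance.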

FURTHER OPTIONS:

Note: only a subset of the options below will work for gaussian basis functions!
-gauss <sigma>:
use gaussian basis functions exp(-x^2/(2*sigma^2))
-limit <value>:
set extrapolation limit value (relative to s-range)
-abslimit <value>:
set extrapolation limit value absolute [-value...+value]
-dumpPsom:
dump to trace stderr
-iMode <mode>:
switch adapt mode
-defaultSpacing:
switch to default spacing
-linearMapSpacing:
switch to linear spacing
-chebyshevSpacing:
do chebyshev spacing
-hypoAll <useMaxNum>:
use useMaxNum for all axes
-BehaveOff_HypoOddCentered:
set global behave
-BehaveNocare_HypoOddCentered:
set global behave
-BehaveWantUpper_HypoOddCentered:
set global behave
-BehaveWantLower_HypoOddCentered:
set global behave
-maxShiftHypoPsom <maxNum>:
set unit centered behave
-knotvalExp <axisNum> <val>.. <val>:
set knotval explicit (axis<0=all)
-metricEuclidian:
set Euclidean distance metric
-metricEquiStrong:
set distance metric to equivalent strong axis (estimated from spanned interval)
-retractFromBorder <factor>:
best match search will not start at marginal nodes; it retracts within the node volume instead
-inputDef <--ii-->:
Z_in def string 'iIoOaA-'
-loadData <fileName>:
load data from FILE
-saveDesc <fileName>:
save PSOM description data for unit creation
-operands <num>:
number of operands called during iteration
-operMode <code>:
operator mode; bits: 0x1: compute full X_out, 0x2: call twice on back steps, 0x4: do not issue NST_W_PENUP, NST_W_PENDOWN to operands
-setTrace <fileName>:
open and start writing to tracefile ('-'=stdout, close on 'NULL')
-exec1:
exec one iteration
-completeS:
complete s (X_out[] by Y_in[], same as NST_I_USER)
-help:
help text

EXTRA TRACEFEATURES:

Each psom will print additional trace information to a given filepointer (e.g. stdout), default is a quiet operation.

   setTracePsomFP(FILE * fp, unitptr uPsom)

will activate tracing and direct the trace info into stream fp. (fp=NULL will turn it off).

   FILE* getTracePsomFP(unitptr uPsom)

will return the set trace info file pointer.

CHANGES:

Default for the distance computation is now -metricEuclidean (i.e., the plain Euclidean distance is used). To get back the former autoscaling default, specify option -metricEquiStrong. The option -limit value confines the best match point to an s-parameter box that extends by a fraction of value beyond the s-range limits along all axis dimensions. Example: if the s-range is a cube [-1,+1]^d, then -limit 1.2 will confine the best match point to the cube [-1.2,1.2]^d. In particular, value=1 confines the best match point precisely to the specified s-range (a zero or negative value will turn this feature off). The -abslimit option sets the s-limit to the absolute range plus/minus value.
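
One possible reading of the -limit option is sketched below for a single axis. It assumes that the box is obtained by scaling the s-range about its center by the factor value, which reproduces the example above ([-1,+1] with value 1.2 gives [-1.2,1.2]). This interpretation and the function name clamp_limit() are assumptions, not taken from the NST sources.

```c
#include <assert.h>

/* Confines s to the s-range [lo,hi] scaled about its center by value.
 * A zero or negative value turns the limit off. */
static double clamp_limit(double s, double lo, double hi, double value)
{
    assert(lo < hi);
    if (value <= 0.0)
        return s;                       /* feature off */
    double center = 0.5 * (lo + hi);
    double half   = 0.5 * (hi - lo) * value;
    if (s < center - half)
        return center - half;
    if (s > center + half)
        return center + half;
    return s;
}
```

With value=1 the point is confined exactly to [lo,hi]. Under this reading, the normalized range [0,1] with value 1.2 would become [-0.1,1.1].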

SEE ALSO:

iXdimGetCurrPsom loadPsomBlock savePsomBlock uPsomFromBlock image_psom

FILE

/local/homes/rhaschke/nst7/man/../o.linx86//../foldersrc/nst_psom2.c