NAME
psom2 - parametric self-organizing map
PROTOTYPE
unitptr psom2( int iXDim, int iMDim, int *piKnotMax, float *ppfKnotVal[], int *piKnotMaxUsed, char *tOptions, unitptr host)
ARGUMENTS
- int iXDim
- dimension of embedding space and input/output vector
- int iMDim
- dimension of map manifold (iMdim<=iXdim).
- int *piKnotMax
- iMdim dimensional array holding for each map axis i, i=0..iMdim-1, the number of knots at which knot values are specified. If piKnotMax==NULL, the minimal values piKnotMax[i]=2 are assumed for each axis. Below, this array will be referred to as knots[].
- float *ppfKnotVal[]
- pointer to iMdim arrays holding knot values used for each axis direction (NOTE: you cannot use a two-dimensional array ppfKnotVal[iMdim][m] here!). If NULL, ppfKnotVal[] are assumed equidistant. Below, this array will be referred to as knot_val[][]
- int *piKnotMaxUsed
- maximal number of respected knots per axis (or NULL)
- char *tOptions
- option string, see below
- unitptr host
- host unit of generated unit.
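The knot arguments can be prepared roughly as follows. This is a minimal sketch for a 2-dimensional map with 3 and 4 knots. The helper fill_equidistant() and the reading of "equidistant" as i/(n-1) on [0,1] are assumptions made for illustration and are not taken from nst_psom2.c; the psom2() call is shown only as a comment.

```c
#include <assert.h>

/* Hypothetical helper: fill out[0..n-1] with equidistant knot values
 * on [0,1] (assumed spacing; needs n >= 2, the minimal knot count). */
static void fill_equidistant(int n, float *out)
{
    int i;
    assert(n >= 2);
    for (i = 0; i < n; i++)
        out[i] = (float)i / (float)(n - 1);
}

/* Per-axis knot storage: one separate array per map axis. */
static float axis0[3];
static float axis1[4];

/* ppfKnotVal must be an array of pointers, one per axis; a true
 * two-dimensional array float[2][4] has a different memory layout. */
static float *knot_val[2] = { axis0, axis1 };
static int knots[2] = { 3, 4 };

static void setup_knots(void)
{
    fill_equidistant(knots[0], knot_val[0]);
    fill_equidistant(knots[1], knot_val[1]);
    /* A unit could then be created along the lines of
     *   u = psom2(iXDim, 2, knots, knot_val, NULL, tOptions, host);
     * where iXDim, tOptions and host depend on the application. */
}
```

Passing NULL for ppfKnotVal instead would leave the choice of equidistant values to the unit itself.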
INTERFACE OF CREATED UNIT:
- X_in[iXdim]:
- (input field 0) input vector
- Y_in[iMdim]:
- (input field 1) normalized map coordinates
to start best match search (for
mode=CTL_in[3]>0) or to specify adaptation
center (see below)
- Z_in[iXdim]:
- (input field 2) mask for input vector
- Eps_in[2]:
- (input field 3) adaptation gains:
lambda (initial Levenberg-Marquardt
parameter) and eps2 (adjustment learning
constant).
- Ctl_in[5]:
- (input field 4) control inputs: (0) time
constant c (presently ignored), (1)
minimal iteration step size threshold
maxerr (in normalized map coord), (2)
maxit, the maximal number of internal
best-match search iterations per exec-call,
(3) mode parameter (default=0, see
below) and (4) auto-adjust factor for
lambda (see PIN MACROS).
- X_out[iXdim]:
- (output field 0) output vector =
closest projection of input vector onto
map manifold
- Y_out[iMdim]:
- (output field 1) normalized map coord
belonging to output vector
- Z_out[num*iXdim]:
- (packed output field 2) components of
internal weight vectors w[a], a=0..num-1.
- out_4[iXdim]:
- (unpacked output field 4) stores
scaling per x-dimension. If
fmPSOM_KnotsXvolDiag is negative this
will be recomputed to the range
(max-min)**-2 in order to weight each dimension
equally.
EXECUTION OF CREATED UNIT:
Finds for an input vector X_in the closest point w=W(s)
that lies on a user-specified "parametrized self-organizing
map" ("PSOM") represented by an iMdim-dimensional manifold W
embedded in an iXdim-dimensional embedding space.
The computed point w is returned at output field X_out and
satisfies the equation
d(x,w;Z_in) = min_s d(x,W(s); Z_in)
where
d(x,w;Z_in) = sum_j Z_in[j]*(x[j]-w[j])^2
and W(s) is from an iMdim-dimensional manifold that is
defined by values on a lattice of
num=knots[0]*knots[1]*..*knots[iMdim-1] knot points. With
n=iMdim, the knot points are given by the num points
(knot_val[0][a_0],knot_val[1][a_1],..,knot_val[n-1][a_(n-1)]),
a_k=0..knots[k]-1. Each knot point can
therefore be identified by the integer n-tuple
a_0...a_(n-1) ("lattice tuple"), or, alternatively, by the
single integer
a = a_0 + a_1*knots[0] + a_2*knots[0]*knots[1] + ...
+ a_(n-1)*knots[0]*..*knots[n-2].
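The mapping between a lattice tuple and the single integer a can be sketched as follows. The function names are illustrative only and do not belong to the NST interface.

```c
#include <assert.h>

/* Flatten a lattice tuple a[0..m-1] into the single index
 *   a = a_0 + a_1*knots[0] + a_2*knots[0]*knots[1] + ...  */
static int lattice_to_index(const int *a, const int *knots, int m)
{
    int index = 0, stride = 1, k;
    for (k = 0; k < m; k++) {
        assert(a[k] >= 0 && a[k] < knots[k]);
        index += a[k] * stride;
        stride *= knots[k];
    }
    return index;
}

/* Inverse: recover the lattice tuple from the single index. */
static void index_to_lattice(int index, const int *knots, int m, int *a)
{
    int k;
    for (k = 0; k < m; k++) {
        a[k] = index % knots[k];
        index /= knots[k];
    }
}
```

For knots[] = {3,2,4} the tuple (2,1,3) maps to 2 + 1*3 + 3*6 = 23.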
The map manifold W is constructed using num reference vectors
w[a], together with an equal number of internally generated
basis functions B(a,s). Here, a=0..num-1 denotes a
lattice tuple a_0..a_(n-1) as indicated by the previous
equation, and B(a,s) is such that
B(a,s_b) = 1 if a=b
B(a,s_b) = 0 if a<>b
and s_a,s_b are knot points belonging to lattice tuples a,
b, respectively (for a fuller description of the B(a,s) see
below).
The manifold of values W(s) is then defined by
W(s) = sum_a B(a,s) w_a
Here, s is an iMdim-dimensional vector of "normalized map
coordinates" and has the iMdim-dimensional unit cube [0,1]^n
as its range.
Due to the choice of the B(a,s) the value of W at a knot
point s_a is given by W(s_a)=w_a. Therefore, the map manifold
W passes through the points w_a. These points -- and
therefore the shape of the map -- can be assigned with the
adapt-routine of this unit.
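The sum W(s) = sum_a B(a,s) w_a can be sketched as a plain accumulation, assuming the basis values B(a,s) for the current s have already been computed. The function name and argument layout are illustrative; the reference vectors are packed as described for output field Z_out.

```c
#include <assert.h>

/* W(s) = sum_a B(a,s) w_a.
 * basis[a] : value of B(a,s) for the current s, a = 0..num-1
 * w        : packed reference vectors, w[a*xdim + j]
 * out      : resulting point W(s), xdim components */
static void psom_combine(const float *basis, const float *w,
                         int num, int xdim, float *out)
{
    int a, j;
    assert(num > 0 && xdim > 0);
    for (j = 0; j < xdim; j++)
        out[j] = 0.0f;
    for (a = 0; a < num; a++)
        for (j = 0; j < xdim; j++)
            out[j] += basis[a] * w[a * xdim + j];
}
```

If basis[] is 1 for a single node and 0 elsewhere, as is the case at a knot point, the result is exactly that node's reference vector.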
The exec-routine uses the previous equation to determine for
a given input x a vector s of normalized map coordinates
that makes the distance d(W(s),x;Z_in) as small as possible.
This is achieved by iterating the following equation
("best match search"):
s(t+1) = s(t) + eps1 * (DW/Ds)^T * (x - W(s(t)))
If CTL_in[3]=0 (the default) this iteration sequence is
started with the normalized map coordinates s(0) that belong
to the best matching reference vector (using euclidean
metric and the metric coefficients of input field Z_in as
weighting factors for the iXdim axes). If CTL_in[3]>0, the
vector specified at input field Y_in is used as starting
value instead.
The exec-routine returns when either fmPSOM_MaxIter(u)
iterations of this equation have been made or
||s(t+1)-s(t)|| < fmPSOM_ErrLim(u) occurs.
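The iteration and its two stopping criteria can be illustrated on a toy manifold. The sketch below assumes a 1-dimensional map that is a straight line between two reference vectors, all mask values equal to 1, and a fixed step size eps1; it is a plain gradient step and does not reproduce the Levenberg-Marquardt scheme of the unit. All names are illustrative.

```c
#include <assert.h>

#define XDIM 2

/* Toy manifold: straight line through two reference vectors
 * (what a 1-dimensional map with knots at s=0 and s=1 reduces to). */
static const double w0[XDIM] = { 0.0, 0.0 };
static const double w1[XDIM] = { 1.0, 0.0 };

/* Iterate s(t+1) = s(t) + eps1 * (DW/Ds)^T * (x - W(s(t))) until
 * maxit steps were made or |s(t+1)-s(t)| < maxerr. Returns the number
 * of steps made; *s holds the start value on entry, the result on exit. */
static int best_match_search(const double *x, double *s,
                             double eps1, double maxerr, int maxit)
{
    int it, j;
    assert(maxit >= 0 && maxerr >= 0.0);
    for (it = 0; it < maxit; it++) {
        double ds = 0.0;
        for (j = 0; j < XDIM; j++) {
            double dw = w1[j] - w0[j];       /* DW/Ds, component j */
            double wj = w0[j] + (*s) * dw;   /* W(s), component j  */
            ds += dw * (x[j] - wj);
        }
        ds *= eps1;
        *s += ds;
        if ((ds < 0.0 ? -ds : ds) < maxerr)
            return it + 1;
    }
    return maxit;
}
```

For x=(0.5,1) the search converges towards s=0.5, the orthogonal projection of x onto the line.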
ADAPTATION OF CREATED UNIT:
a. AUTOCOUNT-LOADING-MODE:
The adaptation routine of this unit allows adjusting the
reference vectors w_a. If a call ctrl_unit(mode,u) with
mode=NST_INIT has been made, the unit will be for the next
num=knots[0]*knots[1]..knots[iMdim-1] adapt_unit-calls in
"autocount-loading mode". In this mode, the exec_unit-call
is without effect, and, starting with the first internal
psom-node, each adapt_unit-call has the effect to store the
current input vector at input field X_in as reference vector
for one of the internal psom-nodes. The internal nodes are
considered as arranged on an iMdim-dimensional lattice with
knots[i] lattice points along its i-th side. Nodes are
visited in the following order: the i=0 lattice coordinate
varies fastest, the i=1 lattice coordinate second fastest
and so on, up to the i=(n-1) lattice coordinate.
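The visiting order can be sketched as an odometer over the lattice tuple. The function below is an illustration of the ordering only and is not part of the NST interface.

```c
#include <assert.h>

/* Advance the lattice tuple a[0..m-1] to the next node in loading
 * order (axis 0 varies fastest). Returns 0 after the last node. */
static int next_lattice_tuple(int *a, const int *knots, int m)
{
    int k;
    for (k = 0; k < m; k++) {
        assert(knots[k] >= 1);
        if (++a[k] < knots[k])
            return 1;
        a[k] = 0;   /* wrap this axis and carry into the next one */
    }
    return 0;
}
```

Starting from the all-zero tuple, the training vectors would be supplied to successive adapt_unit-calls in exactly this order.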
The autocount-loading mode will end in the RAM-mode either
after exactly num adapt_unit-calls (after which each node
has obtained a reference vector), or when ctrl_unit(mode,u)
with one of the other mode-parameter values NST_I_ZERO
or NST_I_RND is issued.
b. RAM-ADAPT:
If a call ctrl_unit(mode,u) with mode=NST_I_RND or
NST_I_ZERO has been made (and there was no
intervening exec_unit(u) -call since then), the unit will be
in random-access mode. In this RAM mode, the unit uses the
values at input fields x=X_in, s=Y_in to adapt its internal
reference vectors according to the formula:
w_a = w_a + eps2 * B(a,s) *(X_in - w_a)
This is the learning rule for a self-organizing map with
reference vectors w_a (defined on the lattice of knot
points) and neighborhood function B(a,s).
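A single step of this learning rule can be sketched as follows, assuming the values B(a,s) are supplied by the caller. Function and argument names are illustrative.

```c
#include <assert.h>

/* One RAM-mode adaptation step:
 *   w_a = w_a + eps2 * B(a,s) * (X_in - w_a)   for all a = 0..num-1
 * w is packed as w[a*xdim + j]; basis[a] holds B(a,s). */
static void ram_adapt(float *w, const float *basis, const float *x_in,
                      int num, int xdim, float eps2)
{
    int a, j;
    assert(num > 0 && xdim > 0);
    for (a = 0; a < num; a++)
        for (j = 0; j < xdim; j++)
            w[a * xdim + j] += eps2 * basis[a] * (x_in[j] - w[a * xdim + j]);
}
```

With eps2=0.5 and B equal to 1 for one node only, that node moves halfway towards X_in and all other nodes stay unchanged.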
c. POST-ADAPT-MODE:
If an exec_unit(u) call has been issued (while not being in the
autocount-loading-mode) then the computation is as before,
except that now B(a,s), computed by the previous exec-call
(in response to the input vector X_in), is reused.
IMPORTANT NOTE: adaptation in this mode will operate
only on those dimensions of the embedding space that have
been chosen with the -inp mask option, specifying a letter
'O' in the mask position of each dimension that shall
become adapted (example: -inp OO for a 2d-embedding
space makes all embedding dimensions adaptive).
DESCRIPTION:
Uses a set of num=knots[0]*knots[1]*..*knots[iMdim-1]
iXdim-dimensional reference vectors v[0]..v[num-1]
"attached" to a lattice of as many knot points to establish
an iMdim-dimensional "parametrized self-organizing map"
("PSOM") W that is embedded in an iXdim-dimensional space V
and that passes through the given reference vectors. The
manifold that represents the map is set up by assigning
suitable values to the reference vectors (using the adapt-
routine of the created unit).
Once the manifold has been set up, the exec-routine of the
created unit will for a given input vector x determine as a
result vector the closest vector w=W(s) of the manifold. In
addition, the "normalized map coordinates" s are returned
(output field Y_out[iMdim]).
For the distance computation the user can specify a diagonal
metric at input field Z_in[iXdim]. In particular, if the
k-th entry of that field is zero, the corresponding component
of x will be ignored. However, after return of the exec-
routine, w will contain a value in its k-th element that is
determined by the shape of the manifold W and that can be
considered as a kind of "associative completion" of the
input vector x, using those elements for which nonzero mask
values were specified.
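The diagonal metric d(x,w;Z_in) = sum_j Z_in[j]*(x[j]-w[j])^2 and the effect of a zero mask entry can be sketched as follows. The function name is illustrative.

```c
#include <assert.h>

/* d(x,w;z) = sum_j z[j] * (x[j]-w[j])^2
 * A zero entry z[j] removes component j from the comparison. */
static double masked_distance(const double *x, const double *w,
                              const double *z, int xdim)
{
    double d = 0.0;
    int j;
    assert(xdim > 0);
    for (j = 0; j < xdim; j++) {
        double diff = x[j] - w[j];
        d += z[j] * diff * diff;
    }
    return d;
}
```

For x=(1,2,3), w=(0,0,0) and mask (1,0,2) the second component is ignored and the distance is 1 + 2*9 = 19.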
The result vector w is computed by an iterative procedure.
If the value at input field CTL_in[3]=0 (this is the
default), the starting value is the lattice point of the
best-match reference vector, otherwise the current value at
input field Y_in (input field 1) is used. The user can specify
both an error threshold maxerr (at input field CTL_in[1]) and
a limit maxit (at input field CTL_in[2]) on the number of
performed iterations. The exec-routine will return when either
maxit iterations have been made, or the (root mean square) change
of the result (in normalized map coordinates) per step has
become smaller than maxerr.
Once the manifold has been set up, it still can be adaptively
changed. For details, see section ADAPTATION. The
internal weight values w[a], a=0..num-1, are available at
the packed output field Z_out[iXdim*num] in the
order w[a=0][0..iXdim-1], w[a=1][0..iXdim-1],..w[a=num-1][0..iXdim-1].
BASIS FUNCTIONS:
To define the map, the unit uses a set of num internally
generated basis functions B(a,s) (each such basis function
can be imagined as specifying the shape of an 'activity bubble'
centered at location s_a in the map).
Polynomial Basis Functions:
These are the default; they obey the 'orthogonality' property
B(a,s_a) = 1, B(a,s_b) = 0 if a<>b
The multivariate basis functions B(a,.) are
derived from products of single-variable basis functions
f[i,j](.): [0,1]->R, i=0..iMdim-1, j=0..knots[i]-1 by forming
the products
B(a,s) = prod_i f[i,a_i](s_i)
Here, the product runs over all map axes i=0..iMdim-1, a_i
are the components of the lattice tuple associated with
a=0..num-1, and s_i is the i-th component of the normalized
map coordinate vector s.
The functions f[i,j] obey f[i,j](knot_val[i][j])=1 and
vanish at all knot values knot_val[i][k] for which k<>j. They
are defined by
f[i,j](x) = product_(k=0..knots[i]-1, k<>j) (x-t_ik)/(t_ij-t_ik)
t_ik = knot_val[i][k]
d_ij = t_i1                       for j = 0
     = 1 - t_(i,j-1)              for j = knots[i]-1
     = (t_(i,j+1) - t_(i,j-1))/2  else.
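The single-variable functions f[i,j] and their product B(a,s) can be sketched as follows. This is an illustration of the formulas above with hypothetical function names, not the implementation in nst_psom2.c.

```c
#include <assert.h>

/* f[i,j](x) for one axis: Lagrange polynomial over the knot values
 * t[0..n-1]; equals 1 at t[j] and 0 at every other knot. */
static double lagrange_basis(const float *t, int n, int j, double x)
{
    double f = 1.0;
    int k;
    assert(j >= 0 && j < n);
    for (k = 0; k < n; k++)
        if (k != j)
            f *= (x - t[k]) / ((double)t[j] - t[k]);
    return f;
}

/* B(a,s) = prod_i f[i,a_i](s_i) for the lattice tuple a[0..m-1]. */
static double psom_basis(float *const knot_val[], const int *knots,
                         const int *a, const double *s, int m)
{
    double b = 1.0;
    int i;
    for (i = 0; i < m; i++)
        b *= lagrange_basis(knot_val[i], knots[i], a[i], s[i]);
    return b;
}
```

Evaluated at the knot point belonging to the tuple a, psom_basis() yields 1; at any other knot point it yields 0.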
Gaussian Basis Functions:
To choose gaussian basis functions of width sigma, specify
the option -gauss sigma in tOptions. Then,
the B(a,s) will be derived (see below) from the
gaussian basis functions
G(a,s) = exp(-(s-s_a)^2/(2*sigma^2))
As for the polynomial basis functions, these
are derived from products of 1d-gaussians along each
axis. However, the G(a,s) are not orthogonal; therefore,
internally the psom computes from the G(a,s) functions
a dual function set B(a,s) that obeys the same orthogonality
property as in the polynomial case. The B(a,s) are
a linear combination of the G(a,s):
B(a,s) = sum_b M_ab G(b,s)
where matrix M obeys sum_b M_ab G(b,s_c) = 1
if a=c and zero otherwise (M is the inverse of G
when restricted to the lattice).
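The matrix M can be obtained by inverting the lattice matrix with entries G(b,s_c). The sketch below assumes these entries are supplied by the caller and uses Gauss-Jordan elimination; the function name, the size limit MAX_KNOTS and the choice of inversion method are illustrative and say nothing about how the unit computes M internally.

```c
#include <assert.h>

#define MAX_KNOTS 8   /* illustrative size limit of this sketch */

/* Invert the n x n matrix g (row-major, g[b*n+c] = G(b,s_c)) by
 * Gauss-Jordan elimination with partial pivoting. On success the
 * inverse M is written to m_out and 1 is returned; 0 if g is singular. */
static int invert_gram(const double *g, int n, double *m_out)
{
    double aug[MAX_KNOTS][2 * MAX_KNOTS];
    int r, c, p;
    assert(n > 0 && n <= MAX_KNOTS);
    for (r = 0; r < n; r++)
        for (c = 0; c < n; c++) {
            aug[r][c] = g[r * n + c];
            aug[r][n + c] = (r == c) ? 1.0 : 0.0;
        }
    for (p = 0; p < n; p++) {
        int best = p;
        double piv, tmp;
        for (r = p + 1; r < n; r++) {   /* pick the largest pivot */
            double v = aug[r][p] < 0.0 ? -aug[r][p] : aug[r][p];
            double b = aug[best][p] < 0.0 ? -aug[best][p] : aug[best][p];
            if (v > b)
                best = r;
        }
        if (aug[best][p] == 0.0)
            return 0;
        for (c = 0; c < 2 * n; c++) {   /* swap rows p and best */
            tmp = aug[p][c];
            aug[p][c] = aug[best][c];
            aug[best][c] = tmp;
        }
        piv = aug[p][p];
        for (c = 0; c < 2 * n; c++)
            aug[p][c] /= piv;
        for (r = 0; r < n; r++) {       /* clear column p elsewhere */
            double factor = aug[r][p];
            if (r == p)
                continue;
            for (c = 0; c < 2 * n; c++)
                aug[r][c] -= factor * aug[p][c];
        }
    }
    for (r = 0; r < n; r++)
        for (c = 0; c < n; c++)
            m_out[r * n + c] = aug[r][n + c];
    return 1;
}
```

For two knots with G(0,s_1)=G(1,s_0)=0.5 the result is M = (1/0.75) * [[1,-0.5],[-0.5,1]], which is symmetric like G itself.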
In the current implementation, the transformation to the
B(a,s) functions is actually carried out by equivalently
applying the transposed transformation (which is again given by
M, since M is symmetric) to the weights w. However,
this transformation step is only carried out after the
weights have been set with AUTOCOUNT-LOADING Mode.
As a result, the constructed manifold will then pass
through the originally (during autocount loading mode)
given data points w. All other adaptation modes operate
on the transformed weight values and thus act as if
orthogonal basis functions B had been used from the
outset.
NOTE: the use of gaussian basis functions is currently
not compatible with HYPO PSOM mode (cf. below).
OPERATOR MODE:
The PSOM unit can be made to call operands on each iteration step.
To turn this feature on, use the -operands and -operMode tokens in
the option text input (also tCtrlPsom), see below. The -operMode code
recognizes the following bits:
- 0x1:
- complete the full X_out (otherwise only s=Y_out is valid)
- 0x2:
- call twice on back steps
(otherwise successful steps look the same as unsuccessful ones),
- 0x4:
- do not issue NST_W_PENUP, NST_W_PENDOWN to operands.
I.e., by default the operands are signaled with the
NST_W_PENDOWN code on the first iteration and with NST_W_PENUP
on the last iteration.
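The -operMode code is a bit mask and can be tested bit by bit. The macro and function names below are hypothetical and are not defined in nst_psom.h; only the bit values are taken from the list above.

```c
/* Hypothetical names for the -operMode bits (values from the list above). */
#define OPER_FULL_XOUT   0x1   /* complete the full X_out            */
#define OPER_TWICE_BACK  0x2   /* call twice on back steps           */
#define OPER_NO_PEN      0x4   /* suppress NST_W_PENUP/NST_W_PENDOWN */

/* Returns 1 if the operands receive the pen-down/pen-up signals. */
static int oper_signals_pen(int code)
{
    return (code & OPER_NO_PEN) == 0;
}

/* Returns 1 if the full X_out is valid inside the operands. */
static int oper_full_xout(int code)
{
    return (code & OPER_FULL_XOUT) != 0;
}
```

A code of 3 thus requests the full X_out and the double call on back steps while keeping the pen signals.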
CONTROL MODES:
NST_I_USER: compute the embedding map location X_out[] based on the
given position in the mapping manifold s=Y_in[].
NST_INIT: switch to autoload adaptation mode, see above.
NST_I_RND or NST_RND: turn autoload adaptation mode off, see above.
HYPO PSOM (Preliminary):
The number of respected knots per axis i can be restricted
to piKnotMaxUsed[i]. This is ignored if
piKnotMaxUsed==NULL. If one component is 0, this and all
following are replaced by the previous max number. See also
psomSetHypo().
DEFAULTS:
all mask values Z_in[i]=1. No autoscaling mode
(THIS IS A CHANGE!).
PIN MACROS:
The following macros are defined in nst_psom.h
fmPSOM2_Lambda0(u)       X((u),3,0) /* 1st Levenberg-Marquardt <1.0> */
fmPSOM_Eps2_in(u)        X((u),3,1) /* eps2 <1> */
                         X((u),4,0) /* n.u. <default value> */
fmPSOM_ErrLim(u)         X((u),4,1) /* min step limit <1.0e-5> */
fmPSOM_MaxIter(u)        X((u),4,2) /* max num of iterations <50> */
fmPSOM2_ExecMode(u)      X((u),4,3) /* exec mode <0=best match> */
fmPSOM2_AdjLambda(u)     X((u),4,4) /* auto-adjust Lambda [0.05..0.5] <0.2> */
fmPSOM2_LambdaOut(u)     Y((u),3,1) /* last Lambda used */
fmPSOM_Iter(u)           Y((u),3,0) /* num of iterations */
fmPSOM_xyDist(u)         Y((u),3,2) /* dist in input subspace */
fmPSOM_DeltaSabs(u)      Y((u),3,3) /* last step in map coord */
fmPSOM_KnotsXvolDiag(u)  Y((u),3,4) /* hyper box diagonal of vectors <recompute if negative> */
If fmPSOM_KnotsXvolDiag(u) is set negative, the stored
reference vectors will be scanned and field out_4 will be
recomputed. After each successful iteration the
Levenberg-Marquardt parameter Lambda will be decreased by the factor
fmPSOM2_AdjLambda(u); otherwise it will be increased by the factor
1.0/fmPSOM2_AdjLambda(u)+0.123.
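The Lambda update can be sketched as follows. The sketch assumes that "decreased by the factor" means multiplication by fmPSOM2_AdjLambda(u), that the increase applies after an unsuccessful step, and that the increase factor reads (1.0/adj)+0.123; the function name is illustrative.

```c
#include <assert.h>

/* Update the Levenberg-Marquardt parameter after one iteration.
 * adj     : auto-adjust factor, documented range [0.05..0.5]
 * success : nonzero if the last step was successful (assumed meaning) */
static double adjust_lambda(double lambda, double adj, int success)
{
    assert(adj > 0.0 && adj < 1.0);
    if (success)
        return lambda * adj;               /* decrease */
    return lambda * (1.0 / adj + 0.123);   /* increase */
}
```

With the default adj=0.2, Lambda shrinks to one fifth after a success and grows by a factor of about 5.123 otherwise.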
An additional field out_4[iXdim] contains information on the
hyperbox volume spanned by the set of reference vectors:
(max-min)**-2 of the knot values along each of the
x-directions 0..iXdim-1.
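The per-dimension scaling (max-min)**-2 can be sketched as a scan over the packed reference vectors. The function name and the handling of a dimension with zero extent (scale set to 0) are assumptions of this sketch.

```c
#include <assert.h>

/* scale[j] = (max_a w[a][j] - min_a w[a][j])^-2 for j = 0..xdim-1,
 * with w packed as w[a*xdim + j]. A dimension with zero extent gets
 * scale 0 here (assumed convention, to avoid a division by zero). */
static void knots_scaling(const float *w, int num, int xdim, float *scale)
{
    int a, j;
    assert(num > 0 && xdim > 0);
    for (j = 0; j < xdim; j++) {
        float lo = w[j], hi = w[j], range;
        for (a = 1; a < num; a++) {
            float v = w[a * xdim + j];
            if (v < lo) lo = v;
            if (v > hi) hi = v;
        }
        range = hi - lo;
        scale[j] = (range > 0.0f) ? 1.0f / (range * range) : 0.0f;
    }
}
```

For the two reference vectors (0,10) and (2,20) the scaling is (1/4, 1/100), so both dimensions contribute equally to a weighted distance.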
FURTHER OPTIONS:
Note: only a subset of the options below will work
for gaussian basis functions!
- -gauss <sigma>:
- use gaussian basis functions exp(-x^2/(2*sigma^2))
- -limit <value>:
- set extrapolation limit value (relative to s-range)
- -abslimit <value>:
- set extrapolation limit value absolute [-value...+value]
- -dumpPsom:
- dump to trace stderr
- -iMode <mode>:
- switch adapt mode
- -defaultSpacing:
- switch to default spacing
- -linearMapSpacing:
- switch to linear spacing
- -chebyshevSpacing:
- do chebyshev spacing
- -hypoAll <useMaxNum>:
- use useMaxNum for all axes
- -BehaveOff_HypoOddCentered:
- set global behave
- -BehaveNocare_HypoOddCentered:
- set global behave
- -BehaveWantUpper_HypoOddCentered:
- set global behave
- -BehaveWantLower_HypoOddCentered:
- set global behave
- -maxShiftHypoPsom <maxNum>:
- set unit centered behave
- -knotvalExp <axisNum> <val>.. <val>:
- set knotval explicit (axis<0=all)
- -metricEuclidian:
- set Euclidean distance metric
- -metricEquiStrong:
- set distance metric to equivalent strong axis (estimated from spanned interval)
- -retractFromBorder <factor>:
- best match search will not start at marginal nodes, retract within the node volume instead
- -inputDef <--ii-->:
- Z_in def string 'iIoOaA-'
- -loadData <fileName>:
- load data from FILE
- -saveDesc <fileName>:
- save PSOM description data for unit creation
- -operands <num>:
- number of operands called during iteration
- -operMode <code>:
- operator mode; bits: 0x1: comp full Xout, 0x2: call twice on back steps, 0x4: do not issue NST_W_PENUP, NST_W_PENDOWN to operands
- -setTrace <fileName>:
- open and start writing to tracefile ('-'=stdout, close on 'NULL')
- -exec1:
- exec one iteration
- -completeS:
- complete s (X_out[] by Y_in[], same as NST_I_USER)
- -help:
- help text
EXTRA TRACEFEATURES:
Each psom can print additional trace information to a given
file pointer (e.g. stdout); the default is quiet operation.
setTracePsomFP(FILE * fp, unitptr uPsom)
will activate tracing and direct the trace info into
stream fp. (fp=NULL will turn it off).
FILE* getTracePsomFP(unitptr uPsom)
will return the set trace info file pointer.
CHANGES:
Default for the distance computation is now
-metricEuclidean (i.e., the plain euclidean distance
is used). To get back the former autoscaling default,
specify option -metricEquiStrong.
The option -limit value confines the
best match point to an s-parameter box obtained by
scaling the s-range limits by the factor value
along all axis dimensions. Example: if the s-range
is a cube [-1,+1]^d, then -limit 1.2 will
confine the best match point to the cube
[-1.2,1.2]^d. In particular, value=1 confines
the best match point precisely to the specified
s-range (a zero or negative value will turn this feature off).
The -abslimit option sets the s-limit absolutely to plus/minus value.
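Both limits can be sketched for a single map coordinate. The sketch assumes that -limit scales the s-range about its center, which the example above confirms only for a symmetric range; the function names are illustrative.

```c
#include <assert.h>

/* Confine s to the s-range [lo,hi] scaled by the factor value about
 * its center (assumed reading of -limit). value <= 0 turns it off. */
static double clamp_limit(double s, double lo, double hi, double value)
{
    double center, half;
    assert(hi > lo);
    if (value <= 0.0)
        return s;
    center = 0.5 * (lo + hi);
    half = 0.5 * (hi - lo) * value;
    if (s < center - half)
        return center - half;
    if (s > center + half)
        return center + half;
    return s;
}

/* Confine s to [-value,+value] as described for -abslimit. */
static double clamp_abslimit(double s, double value)
{
    assert(value >= 0.0);
    if (s < -value)
        return -value;
    if (s > value)
        return value;
    return s;
}
```

With the s-range [-1,+1], a coordinate of 1.5 is pulled back to 1.2 for -limit 1.2, to 1 for -limit 1 and left untouched for -limit 0.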
SEE ALSO:
iXdimGetCurrPsom loadPsomBlock savePsomBlock uPsomFromBlock
image_psom
FILE
/local/homes/rhaschke/nst7/man/../o.linx86//../foldersrc/nst_psom2.c