USAGE:

within prog_unit

    #loadlib "spectral.so"

Contents:

Routines for FFT Routines for DWT (discrete wavelet transform) Routines for Filtering Routines for audio Miscellaneous ____________

Routines for FFT:

rfft(pfC,pfX)
-- real-complex FFT. |pfX| is best prod of 2,3,5,7,11,13. |pfC|/2=|pfX|/2+1, complex F coeef. as real/im pairs.
irfft(pfX,pfC)
-- inverse real-complex FFT. |pfX| is best prod of 2,3,5,7,11,13. |pfC|/2=|pfX|/2+1, complex F coeef. as real/im pairs. normalize such that irfft(rfft(x)==x;
rPfft(pfS,pfC,pfX)
-- return real-complex FFT in pfC[] and power spectrum in pfS[]. |pfX| is best prod of 2,3,5,7,11,13. |pfS|=|pfC|/2=|pfX|/2+1, complex F coeef. as real/im pairs. pfS is normalized by 1/|pfX|^2. Return total power sum(pfS)
window(pfY,pfX,window)
-- window vector pfX[] in pfY[]. Requires date length |pfX|>= window length |pfY|. Return mean window value. Use one of the following windows types defined by |window|:

     window=0: square window (default)
     window=1: triangle window ("Parzen window")
     window=2: square window (1-x^2, "Welch window")
     window=3: cosine window (1-cos, "Hanning window")
     window=4: sinus window
     |window|>=5: trapezoidal window with slope length = |window|
     positive windows types are normalised to mean value=1, negative ]0..1].

rWPfft(pfS,pfC,pfX,window)
-- windowed real-complex FFT in .pfC[] and power spectrum in pfS[]. Windows of length s: |pfS|*2=|pfC|=s+2, complex F coeef. as real/im pairs. |pfX|>=s. pfS is normalized by 1/|pfX|^2. Return total power sum(pfS). Window slection same as window() above.
rfftnd(pfC,pfX,piDim)
-- multi-dimensional real-complex FFT; like rfft() but in |piDim| dimensions. piDim[] describes the length of each dimension which is best prod of 2,3,5,7,11,13. Requires |pfX|=prod_i(piDim[i]), |pfC|=prod_i-1(piDim[i])*(piDim[|piDim|-1]/2+1) complex F coeef. as real/im pairs.
irfftnd(pfX,pfC,pfDim)
-- multi-dimensional inverse real-complex FFT like irfft() but in |piDim| dimensions. piDim[] describes the length of each dimension which is best prod of 2,3,5,7,11,13. Requires |pfX|=prod_i(piDim[i]), |pfC|=prod_i-1(piDim[i])*(piDim[|piDim|-1]/2+1) complex F coeef. as real/im pairs. normalize such that irfft(rfft(x)==x;
Performance:
The 1D-real-complex fft routines are factor 5-6 faster than the fft1d unit of the math folder, and about factor 3-4 faster than the fft2d routines. 1D rfft case needs 2-3us per float (on Pentium 75+ with 39.8 bogomips), 1D irfft case needs 3-4us per float, 2D rfftnd/irfftnd calls take a little less.
____________

Routines for Discrete Wavelet Transform:

dwt(pfY,pfX,iMode)
-- compute Discrete Wavelet Transform. abs value of iMode selects one of the availabele wavelet kernels below. If (iMode<0) perform the inverse transform. Requires: |pfX|=|pfY|= power of 2; pfX=pfY allowed. Return the number of pyramidal transforms operations and the result in pfY[];

      WT_DAUB4 : Daubechies 4-coefficient filter
      WT_DAUB12 : Daubechies 12-coefficient filter
      WT_DAUB20 : Daubechies 20-coefficient filter

dwtn(pfY,pfX,piDim,iMode)
-- n-dimensional discrete wavelet transform. Analog to 1-dim dwt() but in |piDim| dimensions. piDim[i] describes the length of each dimension i. Each must be a power of 2. Requires: |pfX|=|pfY|. pfX=pfY is allowed. Return the number of pyramidal transforms operations and the result in pfY[];
____________

Routines for Filtering:

filtzero(pfY,pfX,piRadius,iMode)
-- elementwise zero radius filter. Cancels (absolute) small elements in vector (e.g. spectrum). iMode is one of FILTER_CUT, FILTER_RED, or FILTER_EXP. Requires: |pfX|=|pfY|; pfX=pfY allowed. Return the number of affected elements.
FILTER_CUT = 0:
cut off ( |x|<radius set to 0; not continuous);
FILTER_RED = 1:
cut off and reduce elements by piRadius (continuous);
FILTER_EXP = 2:
contract locally: y=(1-exp(-x*x))*x (continuous, smooth);
normalize_get(pfP,pfX,mode,yMax)
-- compute parameters .pfP[] for normalizing a given array of values. pfX[], is filled, according to the selected mode and the given data vector pfX[]. The normalization is done by calling normalize_apply() afterwards, or by the compound function normalize() (see below). Requires |pfP|>=4. Currently the following modes are implemented.

    Linear Modes 0,1,2,3,4,5 Y[i]=c * X[i] + d:
       xMean = sum_i(X[i])/dim(X)
       xMax = max(X[i])
       xMin = min(X[i])

    0: (max-strech)
      The mean (or dc-bias) is not removed the vector is streched by
            c = min(abs(yMax/xMax), abs(yMax/xMin))
            d = 0

    1: (bias-max-strech)
      after removing xMean, the vector is streched as in mode 0.
      The same is done in one step with
                          | yMax | | yMax |
                 c = min( |----------|, |----------|)
                          |xMax-xMean| |xMin-xMean|
   
                 d = -xMean * c
   
    2: (strech_fit)
      after transformation, max(Y[i]) is yMax and min(Y[i]) is -yMax, so
                 c = (2*yMax)/(xMax-xMin)
                 d = yMax - c * xMax;

    3: (mean_std)
      after the linear transformation, Y has mean(Y)=0 and std(Y)=yMax.
      
    4: (mean_adev)
      after transformation, Y has zero mean and average deviation
      from mean will be yMax.
      
    5: (positive_stretch)
      Linear Transformation: (xMin, xMax) -> (0, yMax), so
                   c = yMax/(xMax-xMin)
                   d = -c * xMin
    
    90: dB relative to peak signal (Y=10*log10(X/xMax))
    91: dB relative to 1 (Y=10*log10(X))
    95: dB of square signal, relative to peak signal (Y=20*log10(X/xMax))
    96: dB of square signal, relative to 1 (Y=20*log10(X))
    
    100..200: (LOG)
        logarithmic scaling Y[i]= c * log10(abs(X[i])) + d;
        (or -999.99 if X[i]==0.)
        c and d are set such, that dec decades are covered:
           [xMax/10^dec,xMax] --> [0,yMax]
           with dec = mode-100

    -200..-100: (LOG_polarity)
        logarithmic scaling, same are LOG-mode, but the polarity
        is preserved in the following manner:
           [ - xMax , - xMax / 10^dec] --> [0,yMax/2]
           [-xMax/10^dec,+xMax/10^dec] --> yMax/2
           [ xMax / 10 ^ dec , xMax ] --> [yMax/2,yMax]
           with dec = -mode+100

    The Data format of .pfP[] is
       P[0] = mode (copy of mode)
       P[1] = yMax (copy of yMax)
       P[2] = c (parameter),
       P[3] = d (parameter)

normalize_apply(pfY,pfX,pfP, iClipMode)
-- apply linear or nonlinear normalization to vector pfX, as previously computed by normalize_get(). If the clip bit (iClipMode&1) is set, appropriate clipping is performed, see below. Requires |pfY|=|pfX|. Return number of clipped values .

       mode=pfP[0]=
       0,1,2 -- clip to [-yMax,yMax];
       3,4 -- no-clipping;
       5,LOG,LOG_polarity -- clip to [0,yMax])

normalize(pfY,pfX,mode,yMax)
-- normalize vector pfX. Same as calling normalize_get() and normalize_apply(). Linear and non-linear transforms according to the selected mode, see normalize_get() above.

         
           


iirfilter(pfR,pfV,pfA,pfB,pfBX,pfBY)
-- apply IIR-Filter on Vector V

   Calculate for all k

      pfR[k] = sum_(n=0..Lb){b_n * x_(k-n)} - sum_(n=1..La){a_n * x_(k-n)}

   The last values from inputs, resp. outputs are buffered in pfBX, resp. pfBY.
   To start properly, pfBX, pfBY must be initialized with zeroes.
   pfR and pfV must be different!
   
   A FIR-Filter can be realized by the convolve_e (with right alignment)
   So, given {a_n} in A with dim. N you get the result by
     reverse(A, A);
     convolve_e(Res, X, A, N);
   alternatively, you take a 0-Vector in pfB.

iirgain(pfR,pfA,pfB,freq)
-- calculate amplitude gain of IIR-Filter

   An IIR-Filter given by Filter-Coefficients pfA, pfB has an
   transfer-function:

               b_0 + b_1 z^(-1) + b_2 z^(-2) + ... + b_L z^(-L)
        H(z) = ------------------------------------------------
               1 + a_1 z^(-1) + ... + a_L z^(-L)
    
   The complex amplitude gain for the Freq. freq is given by H(exp(i omega))
   with omega=2*pi*freq.
   freq is the normalized frequency = real_frequency / Sampling_Rate.
   The real gain |H(exp(i omega))| = |g| is calculated and returned,
   the phase-shift is given by atan(Im(g)/Re(g)).
   pfR must be at least 2-dim. Result will be
         pfR[0] = Re(g);
         pfR[1] = Im(g);
         pfR[2] = |g|;
         pfR[3] = phase-shift (in rad);

iircoeffs(pfA,pfB,pfP)
-- get IIR-Filter-coefficients for special filters

   The filter coefficients pfA (denominator), pfB (nominator)
   will be calculated according to the content of the parameter vector pfP
   pfP has several elements which have the following meaning:

         0: type of filter 1 = Chebyshev Type I (not yet impl.)
                           2 = Chebychev Type II (not yet impl.)
                           3 = Butterworth (only 2nd-order yet)
         1: type of characteristic: 1 - Lowpass, 2 - Hipass,
                                    3 - Bandpass,4 - Bandstop
         2: Number of filter sections (only 1 up to now)
         3: unused
       4-7: cutoff and stopband frequencies (f1, f2, f3, f4)
              for Low-Pass: pfP[4] = cutoff-freq, pfP[5]=stopband-freq
              for Band-Pass: pfP[5] = cut-on, pfP[6] = cut-off
              for Stop-Band: pfP[5] = cut-off,pfP[6] = cut-on
              for Hi-Pass: pfP[4] = stopband-freq, pfP[5] = cut-on,
         8: stopband rejection (must be > 3.0 dB);

   The coefficients are determined via the bilinear transformation.
   
pareqcoeffs(pfA,pfB,freq,gain,qfactor)
-- get coefficients for Bell-Shaped Parametric Equilization

     freq: The centre frequency/Sampling-Rate of the bandpass filter
           (0.5 = f_nyquist).
        g: The gain of the bandpass filter at the centre
           frequency in decibels (dB).
           Range is about -15dB to +15dB.
        q: The Q-Factor (or steepness) of the bandpass filter.
           The smaller the value, the steeper the filter
           (and the more unstable the filter).
           Practical values range from about 1 to about 0.01.
   
    Originally the Filter is a cascade of in_i -> x_i -> out_i with
         x[i] = in[i] - b1*x[i-1] - b2*x[i-2];
       out[i] = a0*x[i] + a1*x[i-1] + a2*x[i-2];

    Note that this filter will become unstable
    if desgined badly, for example if the Q-Factor is too low,
    or the gain too high, or the centre frequency too low.

fftfilter(pfR,pfV,pfH,mode)
-- Frequency Filter via FFT.

   Result is F^{-1}(F(V)(f)*H').
   H' is the resampled version of H, so that dim(H')=dim(V)+2;
   Action will be according to mode:
             0: The whole signal buffer will be FFT'ed in one step.

mk_fftfilter(pfR,pfA,mode)
-- Compute Filter Kernel for fftfilter.

Operation is depending on mode:


            1: Lowpass -Filter A[0]= 95%-Freq. A[1] = 5%-Freq

            2: HiPass -Filter A[0]= 5%-Freq. A[1] = 95%-Freq

            3: Bandpass-Filter A[0]= 5%-Freq. A[1] = 95%-Freq
                               A[2]= 95%-Freq. A[3] = 5%-Freq

            4: Bandstop-Filter A[0]= 95%-Freq. A[1] = 5%-Freq
                               A[2]= 5%-Freq. A[3] = 95%-Freq

   Array R will be filled with Fermi-Fkt or product of 2 Fermis, so that
   each Fermi fullfills the filter-condition.
   Using negative modes, the arguments will be interpreted differently:
      A[0] = Center-Freq, A[1] = Frequency-Width of transition, etc.
   






____________

Routines for Audio Data Transformation:

db2rt(fv)
-- transform decibel to ratio = db2rt(fv) = 10**(fv/10) = exp(ln(10)*fv/10); fv should have dimension 1
rt2db(fv)
-- transform ratio to dB-value = rt2db(fv) = 10 * ln(fv)/ln(10) fv should have dimension 1
rmDCBias(pfR,pfV)
-- remove DC-Bias from Data-Vector pfV, DC-Bias is the 0. Term of FFT = Projection on 1-Fkt = Sum(V[i])/NV. Result will be in pfR and may be the same as pfV. Return: Bias-Value
compression(pfR,pfV,max,level)
-- apply dynamic compression to vector pfV. result is in pfR. Compression means applying the logarithmic transfer-function

          R[i]=max*sign(V[i])*(fabs(V[i])/max)^(1-level/10).
          level is in [0..10].

   Dynamic Expansion can be done by using level from [-0..-10[,

          R[i]=max*sign(V[i])*(fabs(V[i])/max)^(10/(10-|level|)).

   Applying C_-l( C_l(V) ) is the Identity.
   Note: Use noise gate before compression, for not to amplify noise !

noise_gate(pfR,pfV,gatelevel,holdsmp,relsmp,attsmp)
-- noise gate the data-vector. The input signal is passed through if the signal level is above a certain threshold, else there is no output. This algorithm is improved by the following points:

     - to avoid distortion (due to clipping the zero-Passings of a
       waveform) the signal is only clipped if more than holdtime
       samples in series are smaller than the gatelevel.

     - reverbs may stop to apprupt, so after the release-time,
       the signal will be released linearly for relsmp samples.

     - new signals that interrupt the gating will lead to clicks...
       that is avoided by a fade-in for attsmp Samples.

delay(pfR,pfV,delay,decay)
-- recursive delay, put "echo" onto audio-vector. The output (in pFR) is a sum of the input and a delayed sample. The delayed sample is attenuated by a decay factor. This system is called a comb filter. It is not ideal because it causes "colourations" of the input signal, since the transmission function has minima and maxima. The quality of the comb filter can be improved by subtracting the fraction 1/(1-decay)^2 of the input from the output. This results in what is called the all-pass reverberator. This algorithm implements just the comb filter.

   The delayed samples die out according to the decay variable
   which should lie between 0 (no echo) and 1 (echoes never die out).
   Note that when decay is close to one,
   the samples may begin clipping and the output may saturate!

   The delay parameter is the number of samples to delay by.
   The delay time is thus given by delay time = delay/samplingrate

   For example, at 44.1kHz sampling rate,
   delay = 1000 implies a delay time of 22.7 ms.

reverb(pfR,pfV,SR,decaytime,rvbsend,mix)
-- put reverb on audio-data reverb will put "Hall" onto the audio-track pfV. The result vector is stored in pfR. SR is the Sampling-Rate decaytime is the time, the Reverb-Signal takes to fade to -60 dB. rvbsend is the fraction [0..1] of the vector pfV which is sent to the effect-unit. Effect and V are mixed together, so that: mix = 0 -> dry, no effect, mix = 1 -> pure effect-signal.

   The digital reverberator is implemented by using
   a number of comb filters (delays) in parallel
   with varying delay times (between 0ms and 50ms).
   The outputs of the comb filters are summed together to produce the output.
   
resample(pfR,pfV,rate,mode)
-- resample the data of the audio-track. Length of resampled sequence is dim[pfV]*rate, dim[pfR] may be smaller than dim[pfV]*rate. if dim[ofR] is larger, the rest is filled with zeroes. Operation is depending on mode:

   0: Hold_Sample-Mode
   1: Linear Interpolation

   - Interpolation always distorts the spectrum, the only correct way
     to resample in a higher rate is to put zeros in between each sample,
     than lowpass the new signal with 1/2 of the new nyquist frequency.
     On the other case, when a signal is downsampled, a lowpass filter has
     to be applied with the new nyquist frequency as cut-off to avoid
     aliasing... :::mode 2 will take this effects into account but is
     not yet implemented.

spectrogram(pfR,pfV,FFTSize,frame-offset,windowtype,mode)
-- calculate spectrogram, FFTSize must be even, mode can be

               0: Power-Spectrogram (linear)
               1: Power-Spectrogram (in dB)
               2: Re/Im-Pairs (Result-Vector-Size will be doubled)
               3: |z|, phase-Pairs (Result-Vector-Size will be doubled)

   Last frame is skipped, if not enough samples are defined. What is done is:

    - Calculate FFT( Window(WindowType)*pfV(k*Frame-Offset..(k+1)*Frame-Offset-1) )
      k = 0..floor(dim(pfV)/ FFT-Size).
    - Put Values of Power-Spectrum in pfR.
   
specsynth(pfR,pfV,FFTSize,frame-offset,mode)
-- calculate waveform from spectrogram this is the inverse operation to building a spectrogram. FFTSize must be even, mode can by

               0: pfV contains Power-Spectrogram (linear)
               1: pfV contains Power-Spectrogram (in dB)
               2: pfV contains Re/Im-Pairs (phase correct reconstruction)
               3: pfV contains value/Phase - Pairs
                     ...
   Last frame is skipped, if not enough samples are defined. What is done is:

    - Calculate inverse FFT for each Time-Slice.
    - Put Values in Wave-Vector pfR.

____________ Miscellaneous:]
colorize(pfR,pfG,pfB, pfX, min,max, mode)
-- colorize data-vector by mapping of the elements vector pfX[] to RGB triples in pfR[],pfG[],pfB[]. The values pfX[i] are linearly interpolated from the range [min,max] onto a polygon in the RGB color model. Polygon vertices are selected by |mode|, see below. If mode is negative, the used min and max are computed from the data set min=min(pfF), max=max(pfF). Currently valid modes are

      1: CMAP_TEMP: temperature-map (black-blue-pink-red-yellow-white)
      2: CMAP_HUE1: (rainbow + black to min and white to max)
      3: CMAP_GREY: grey scale from black (min) to white (max)
      4: CMAP_SIGN: blue-black-red (may be used for signed values)
       5..10: default set to CMAP_GREY
      11..17: CMAP_USER .. CMAP_USER+10 user definable modes, they
              can be set by mk_colormap, by default they point to the CMAP_GREY
   Usage of the names (instead of numerical modes) is recommended because
   the numbers may change for future extensions.
      
   Non-linear modes:
   All modes+p: with remainder p in [-0.5,0.5[,
   same as mode'=int(mode+0.5) but previously to color indexing the
   power law x'=x^(1/(0.5-p)-1) [0,1]->[0,1] is applied.
   This results for negative .p in a root shaped function
   (with emphasis on the area around min),
   for p=0 to the linear function, and for positive .p
   to a square type function with emphasis on the area around max.
   Return value is the exponent;

mk_colormap(.M, target,mode,arg)
-- make/modify/delete colormap for colorize. The colormap is given by a vector M of RGB triples, that is e.g. M = [R1,G1,B1, R2,G2,B2, .., Rn,Gn,Bn] for a RGB-Polygon with n vertices. The first nonvalid color ([R|G|B] >255 || [R|G|B]<0 || end_of_vector) determines the length of the color map. The target is the number/name of the color_map to modify. Only the target = [CMAP_USER..CMAP_USER+10] can be modified. The constants CMAP_SET (:= 0) and CMAP_POINT2 ( := 1) may be used for mode

   Examples:
   A call to .mk_colormap(M,CMAP_USER+4,CMAP_SET,0) sets the color table given
   in M for the 4th CMAP_USER mode.

   A call .mk_colormap(M,CMAP_USER,CMAP_POINT2,CMAP_TEMP) deletes the colormap
   for mode CMAP_USER (if it exists). The color map used furtheron
   for this mode is the one given in the .arg (here: CMAP_TEMP)

   more doku: foldersrc/nst_dsp.c -> th_mk_colorize
      
colorlegend(pfX,.mode)
-- generate a legend to the last applied colormapping. X contains quadruples (value, r, g, b). |pfX|/4 determines the number of steps. Return value is 0 if nothing changed inbetween this and the last call, 1 if the colorlegend changed.

   Operation is depending on mode:
     0: generate a linear segmentation in the value range.
     1: generate a linear segmentation in the color range,
   -> a tcl_prog_unit to visualize the legend can be found in
   ~thermann/neo/th2/circuits/color_legend.NST

See also:

prog_unit, vector.so, and a few examples in

      neo -i +/homes/walter/p/nst5/jwexpl

Id: nst_spectral.c,v 1.13 2000/08/13 09:18:30 rhaschke Exp

FILE

/homes/jontrup/nst5/man/../o.sol2//../foldersrc/nst_spectral.c