# Kalman Filtering and Neural Networks - Haykin S.

ISBNs: 0-471-36998-5


since $C_k = \sum_{l=1}^{N_o} (\xi_k^l)^2 = \sum_{l=1}^{N_o} [(\epsilon_k^l)^{1/2}]^2 = \sum_{l=1}^{N_o} \epsilon_k^l = \epsilon_k$.

The remainder of the derivation is straightforward. The EKF recursion in the case of the entropic cost function requires that derivatives of $\hat{z}_k^l = (\epsilon_k^l)^{1/2}$ be computed for all $N_o$ outputs and all weight parameters, which are subsequently stored in the matrices $H_k$. Applying the chain rule, these derivatives are expressed as a function of the derivatives of network outputs with respect to weight parameters:

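$$\frac{\partial \hat{z}_k^l}{\partial w_k^{i,j}} = \frac{\partial (\epsilon_k^l)^{1/2}}{\partial w_k^{i,j}} = -\frac{1}{2}\,\frac{1}{(\epsilon_k^l)^{1/2}\,(y_k^l + \hat{y}_k^l)}\,\frac{\partial \hat{y}_k^l}{\partial w_k^{i,j}}. \tag{2.64}$$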

Note that the effect of the relative entropy cost function on the calculation of derivatives is handled entirely in the initialization of the backpropagation process, where the term $-1/[2(\epsilon_k^l)^{1/2}(y_k^l + \hat{y}_k^l)]$ is used for each of the $N_o$ output nodes to start the backpropagation process, rather than starting with a value of unity for each output node as in the nominal formulation.

In general, the EKF procedure can be modified in the manner just described for a wide range of cost functions, provided that they meet at least three simple requirements. First, the cost function must be a differentiable function of network outputs. Second, the cost function should be expressed as a sum of contributions, where there is a separate target value for each individual component. Third, each component of the cost function must be non-negative.
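As a rough numerical check of the initialization term used for the entropic cost, the following sketch compares the analytic backpropagation seed with a finite-difference derivative of $(\epsilon_k^l)^{1/2}$. It assumes targets of $\pm 1$ and a per-output cost of the form $\epsilon_k^l = \log[2/(1 + y_k^l \hat{y}_k^l)]$; that form, the function names, and the sample values are assumptions made for illustration.

```python
import math

def entropic_component(y, y_hat):
    """Per-output entropic cost for a target y in {-1, +1} (assumed form)."""
    return math.log(2.0 / (1.0 + y * y_hat))

def transformed_output(y, y_hat):
    """z_hat = sqrt(epsilon): treated as the network output, with target 0."""
    return math.sqrt(entropic_component(y, y_hat))

def backprop_seed(y, y_hat):
    """Analytic d z_hat / d y_hat, used instead of unity to start backpropagation."""
    eps = entropic_component(y, y_hat)
    return -1.0 / (2.0 * math.sqrt(eps) * (y + y_hat))

def finite_difference_seed(y, y_hat, h=1e-6):
    """Central-difference estimate of the same derivative."""
    return (transformed_output(y, y_hat + h)
            - transformed_output(y, y_hat - h)) / (2.0 * h)

# Illustrative targets and outputs; outputs must lie strictly inside (-1, 1).
targets = [1.0, -1.0, 1.0]
outputs = [0.3, -0.8, -0.5]

for y, y_hat in zip(targets, outputs):
    eps = entropic_component(y, y_hat)
    xi = 0.0 - transformed_output(y, y_hat)  # error signal with target zero
    print(f"eps={eps:.4f}  xi={xi:.4f}  "
          f"seed={backprop_seed(y, y_hat):.4f}  "
          f"numeric={finite_difference_seed(y, y_hat):.4f}")
```

Each component in this sketch is differentiable, tied to a single target, and non-negative, so the assumed cost meets the three requirements listed above. The seed grows without bound as $\hat{y}_k^l$ approaches $y_k^l$, so a practical implementation may need to guard against that case.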

Footnote 3: The idea of using a modified target value of zero with the actual targets appearing in expressions for system outputs can be applied to the EKF formulation of Eqs. (2.3)-(2.6) without any change in its underlying behavior.

## 2.7.3 EKF Training with Scalar Errors

When applied to a multiple-output training problem, the EKF formulation in Eqs. (2.3)-(2.6) requires a separate backpropagation for each output and a matrix inversion. In this section, we describe an approximation to the EKF neural network training procedure that allows us to treat such problems with single-output training complexity. In this approximation, we require only the computation of derivatives of a scalar quantity with respect to trainable weights, thereby reducing the backpropagation computation and eliminating the need for a matrix inversion in the multiple-output EKF recursion.

For the sake of simplicity, we consider here the prototypical network training problem for which network outputs directly encode signals for which targets are defined. The square root of the contribution to the total cost function at time step k is given by

$$\tilde{y}_k = C_k^{1/2} = \left( \sum_{l=1}^{N_o} |y_k^l - \hat{y}_k^l|^2 \right)^{1/2}, \tag{2.65}$$

where we are again treating the simple case of uniform scaling of network errors (i.e., $S_k = I$). The goal here is to train a network so that the sum of squares of this scalar error measure is minimized over time. As in the case of the entropic cost function, we consider the target for training to be zero for all training instances, and the scalar error signal used in the Kalman recursion to be given by $\xi_k = 0 - \tilde{y}_k$. The EKF recursion requires that the derivatives of the scalar observation $\tilde{y}_k$ be computed with respect to all weight parameters. The derivative of the scalar error with respect to the $j$th weight of the $i$th node is given by

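$$\frac{\partial \tilde{y}_k}{\partial w_k^{i,j}} = \sum_{l=1}^{N_o} \frac{\partial \tilde{y}_k}{\partial \hat{y}_k^l}\,\frac{\partial \hat{y}_k^l}{\partial w_k^{i,j}} = \sum_{l=1}^{N_o} \frac{y_k^l - \hat{y}_k^l}{\xi_k}\,\frac{\partial \hat{y}_k^l}{\partial w_k^{i,j}}. \tag{2.66}$$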
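The chain-rule expression in Eq. (2.66) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it uses a hypothetical two-output linear network (so that the output derivatives are simply the inputs), made-up weights and targets, and plain finite differences as the reference. None of these choices come from the text.

```python
import math

def forward(weights, x):
    """Toy linear network (hypothetical): one weight row per output node."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]

def scalar_observation(targets, outputs):
    """y_tilde: square root of the summed squared output errors."""
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(targets, outputs)))

def analytic_derivatives(weights, x, targets):
    """d y_tilde / d w[l][j] via the chain rule with factors (y - y_hat) / xi."""
    outputs = forward(weights, x)
    xi = 0.0 - scalar_observation(targets, outputs)  # assumes a nonzero error
    seeds = [(y - y_hat) / xi for y, y_hat in zip(targets, outputs)]
    # For this linear network, d y_hat^l / d w[l][j] = x[j]; other rows give zero.
    return [[seed * xj for xj in x] for seed in seeds]

def numeric_derivatives(weights, x, targets, h=1e-6):
    """Central finite differences of y_tilde with respect to every weight."""
    grads = []
    for l, row in enumerate(weights):
        grad_row = []
        for j in range(len(row)):
            plus = [r[:] for r in weights]
            minus = [r[:] for r in weights]
            plus[l][j] += h
            minus[l][j] -= h
            up = scalar_observation(targets, forward(plus, x))
            down = scalar_observation(targets, forward(minus, x))
            grad_row.append((up - down) / (2.0 * h))
        grads.append(grad_row)
    return grads

# Illustrative input, targets, and weights.
x = [0.5, -1.0, 2.0]
targets = [1.0, -1.0]
weights = [[0.2, 0.1, -0.3], [0.4, -0.2, 0.1]]

H_analytic = analytic_derivatives(weights, x, targets)
H_numeric = numeric_derivatives(weights, x, targets)
print(H_analytic)
print(H_numeric)
```

If every output matches its target exactly, $\xi_k$ is zero and the factors $(y_k^l - \hat{y}_k^l)/\xi_k$ are undefined; the sketch does not handle that case.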

In this scalar formulation, the derivative calculations via backpropagation are initialized with the terms $(y_k^l - \hat{y}_k^l)/\xi_k$ for all $N_o$ network output nodes (as opposed to initializing the backpropagation calculations with values of unity for the nominal EKF recursion of Eqs. (2.3)-(2.6)). Furthermore, only one quantity is backpropagated, rather than $N_o$ quantities for the nominal formulation. Note that this scalar approximation reduces exactly to the nominal EKF algorithm in the limit of a single-output problem:

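$$\tilde{y}_k = \left( |y_k^1 - \hat{y}_k^1|^2 \right)^{1/2} = |y_k^1 - \hat{y}_k^1|, \tag{2.67}$$

$$\xi_k = 0 - |y_k^1 - \hat{y}_k^1|, \tag{2.68}$$

$$H_k^{i,j} = \frac{\partial \tilde{y}_k}{\partial w_k^{i,j}} = -\operatorname{sgn}(y_k^1 - \hat{y}_k^1)\,\frac{\partial \hat{y}_k^1}{\partial w_k^{i,j}}. \tag{2.69}$$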

Consider the case $y_k^1 < \hat{y}_k^1$. Then, the error signal is given by $\xi_k = y_k^1 - \hat{y}_k^1$. Similarly, $\partial \tilde{y}_k/\partial w = \partial \hat{y}_k^1/\partial w$, since $\partial \tilde{y}_k/\partial \hat{y}_k^1 = 1$. Otherwise, when $y_k^1 > \hat{y}_k^1$, the error signal is given by $\xi_k = -(y_k^1 - \hat{y}_k^1)$, and $\partial \tilde{y}_k/\partial w = -\partial \hat{y}_k^1/\partial w$, since $\partial \tilde{y}_k/\partial \hat{y}_k^1 = -1$. Since both the error and the derivatives are the negatives of what the nominal EKF recursion provides, the effects of negation cancel one another. Thus, in either case, the scalar formulation for a single-output problem is exactly equivalent to that of the EKF procedure of Eqs. (2.3)-(2.6).
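The cancellation argument above can be illustrated with a small numerical sketch. It assumes the usual single-observation form of the EKF weight update (gain $K = P H / (R + H^T P H)$, weight update $w + K \xi$, covariance update $P - K H^T P$, process noise omitted); the function name, the value of `R`, and all numbers are illustrative assumptions, not values from the text.

```python
import math

def ekf_scalar_update(w, P, H, xi, R=1.0):
    """One EKF weight update for a scalar observation (assumed standard form).

    A = 1 / (R + H^T P H), K = P H A, w_new = w + K xi, P_new = P - K H^T P.
    Process noise is omitted for brevity.
    """
    n = len(w)
    PH = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
    A = 1.0 / (R + sum(H[i] * PH[i] for i in range(n)))
    K = [PH[i] * A for i in range(n)]
    w_new = [w[i] + K[i] * xi for i in range(n)]
    # P is symmetric, so the row vector H^T P has the same entries as P H.
    P_new = [[P[i][j] - K[i] * PH[j] for j in range(n)] for i in range(n)]
    return w_new, P_new

w = [0.1, -0.2, 0.3]
P = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]  # symmetric, illustrative
dyhat_dw = [0.5, -1.0, 0.25]  # hypothetical derivatives of the single output

results = []
for y, y_hat in [(0.7, 0.2), (0.2, 0.7)]:  # target above, then below, the output
    # Nominal formulation: error y - y_hat, derivatives of the network output.
    nominal = ekf_scalar_update(w, P, dyhat_dw, y - y_hat)
    # Scalar formulation: observation |y - y_hat| with a target of zero.
    xi = 0.0 - abs(y - y_hat)
    sign = math.copysign(1.0, y - y_hat)
    H_scalar = [-sign * d for d in dyhat_dw]
    scalar = ekf_scalar_update(w, P, H_scalar, xi)
    results.append((nominal, scalar))
    print(nominal[0], scalar[0])
```

In both cases the two formulations produce the same updated weights and covariance, because the sign of the derivative vector enters the weight update only through its product with the error and enters the covariance update only quadratically.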

Because the procedure described here is an approximation to the base procedure, we suspect that classes of problems exist for which it is not as effective; further work will be required to clarify this question. In this regard, we note that once criteria are available to guide the decision of whether to scalarize or not, one may also consider a hybrid approach to problems with many outputs. In this approach, selected outputs would be combined as described above to produce scalar error variables; the latter would then be treated with the original procedure.
