DP-001.htm_DYNAMIC PROGRAMMING

DYNAMIC PROGRAMMING

1. Introduction

Let us discuss the following problem to obtain the shortest route from A to K.

Problem 1. Obtain the shortest route from A to K, in which four stages (I, II, III, IV), three states as like B, C, D and routes shown by arrows with required time are included.

　

　

Fig. 1 Farm work series

　

1-1. Trial 1. Unsystematic method

Standing at point A, we have a information that route to B is shortest, and select B.

At next step, standing at B, we may select to E, and standing at E, we can select to I.

We may select finally A-B-E-I-K route which is 15 as shortest path.

This method will sometimes give optimal answer, but not always correct answer.

　

1-2. Trial 2. Brute method

There are 27 possible decision paths to determine all possible solutions. Thus, a total of 108 computations is required to determine that path A-C-G-H-K returns the optimum time of 14.

If number of stages increase as n, then possible paths increase 3**(n-1).

Example: n = 10, 3 ** 9 = 19,683 and a total of 196,830 computations is required.

Suppose a computation requires a second, 196,830 second (=3280 min =54.7 h =2.3 d) will be necessary.

　

1-3. Trial 3. Dynamic Programming method

This farm work process problem is a decision problem of multi-stage really, and is not decision problem of a stage but of multi-stage, that is total stage.

Trial 3-1

From point A, there is three paths as A-B, A-C, A-D

To point E, there is three paths as ABE, ACE, ADE and their time is 6,7,7 and ABE is shortest(6).

To point F, there is three paths as ABF, ACF, ADF and their time is 7,7,6 and ADF is shortest(6).

To point G, there is three paths as ABG, ACG, ADG and their time is 8,5,8 and ACG is shortest(5).

In next stage, there is three paths as EH, FH, GH to point H, and we know the optimal time ABE, ADF, ACG is 6,6,5, therefore ABEH, ADFH, ACGH is 11,10,8 and ACGH is shortest(8).

In same way, there is three paths as EI, FI, GI to point I, and we know the optimal time ABE, ADF, ACG is 6,6,5, therefore ABEI, ADFI, ACGI is 10,11,11 and ABEI is shortest(10).

In same way, there is three paths as EJ, FJ, GJ to point J, and we know the optimal time ABE, ADF, ACG is 6,6,5, therefore ABEJ, ADFJ, ACGJ is 12,13,11 and ACGJ is shortest(11).

Finally to point K, there is three paths as HK, IK, JK and we know the optimal time A-H, A-I, A-J is 8, 10, 11 , therefore ACGHK, ABEIK, ACGJK is 14,15, 15 and ACGHK is shortest(14).

This procedure is the method of Dynamic Programming, and the procedure is shown in figure 2. (24 computations)

Fig. 2 Procedure of calculation

　

See DP-01.xls

　

1-4. Forward method

T(B) = 2 Eq. 1

T(C) = 3 Eq. 2

T(D) = 4 Eq. 3

T(E)=min[T(B)+4, T(C)+4, T(D)+3]

=min[2+4, 3+4, 4+3]=6 Eq. 4

T(F)=min[T(B)+5, T(C)+4, T(D)+2]

=min[2+5, 3+4, 4+2]=6 Eq. 5

T(G)=min[T(B)+6, T(C)+2, T(D)+4]

=min[2+6, 3+2, 4+4]=5 Eq. 6

T(H)=min[T(E)+5, T(F)+4, T(G)+3]

=min[6+5, 6+4, 5+3]=8 Eq. 7

T(I)=min[T(E)+4, T(F)+5, T(G)+6]

=min[6+4, 6+5, 5+6]=10 Eq. 8

T(J)=min[T(E)+6, T(F)+7, T(G)+6]

=min[6+6, 6+7, 5+6]=11 Eq. 9

T(K)=min[T(H)+6, T(I)+5, T(J)+4]

=min[8+6, 10+5, 11+4]=14 Eq. 10

　

1-5. Backward method

In same way, calculation from point K to point A will be called as backward method.

　

1-6. General Expression of DP

F1(X) = f(x1) Eq. 11

where, F1(X) is optimal value in stage 1

f(x1) is time required of farm work x1.

F2(X) = Min{F1(X) + f(x2)} Eq. 12

Fn(X) = Min{Fn-1(X) + f(xn)} Eq. 13

where, Fn(X) is optimal value in stage n

f(xn) is time required of farm work xn.

Fn(X) is a gradual equation, which is called Bellman's principle of optimality.

　

2. Exercise

2-1. Exercise 1

Optimize the allocation of 5 work crews to 4 field harvest jobs. The payoff matrix in thousands of dollars for the number of crews assigned to a specific job is as shown below:

　

WORK JOB

crew

J1

J2

J3

J4

1

20

10

30

20

2 40 20 40 30

3 60 30 50 40

4 60 50 70 50

5 50 70 80 60

　

Find the optimal assignment by calculating the return for all 56 possible decision paths which utilize all 5 crews. [121 paths of 0-5 crews for 4 jobs] (This procedure is aptly called the "brute force" method.)

Maximum profit is 110$ and optimal alloction is [3, 0, 1, 1]. See DP-02.xls

　

2-2. DP by backward method by using equations

In following equation, RJij is profit of job i and allocated crews j.

RJ41=20

RJ42=30

RJ43=40

RJ44=50

RJ45=60

　

RJ31=max(J31, RJ41)

=max(30, 20)=30

RJ32=max(J32, J31+RJ41, J30+RJ42)

=max(40, 30+20, 0+30)=50

RJ33=max(J33, J32+RJ41, J31+RJ42, J30+RJ43)

=max(50, 40+20, 30+30, 0+40)=60

RJ34=max(J34, J33+RJ41, J32+RJ42, J31+RJ43, J30+RJ44)

=max(70, 50+20, 40+30, 30+40, 0+50)=70

RJ35=max(J35, J34+RJ41, J33+RJ42, J32+RJ43, J31+RJ44, J30+RJ45)

=max(80, 70+20, 50+30, 40+40, 30+50, 0+60)=90

　

RJ21=max(J21, RJ31)

=max(10, 30)=30

RJ22=max(J22, J21+RJ31, J20+RJ32)

=max(20,10+30,0+50)=50

RJ23=max(J23, J22+RJ31, J21+RJ32, J20+RJ33

=max(30, 20+30, 10+50, 0+RJ33)=60

RJ24=max(J24, J23+RJ31, J22+RJ32, J21+RJ33, J20+RJ34)

=max(50, 30+30, 20+50, 10+60, 0+70)=70

RJ25=max(J25, J24+RJ31, J23+RJ32, J22+RJ33, J21+RJ34 ,J20+RJ35)

=max(70, 50+30, 30+50, 20+60, 10+70, 0+90)=90

　

RJ11=max(J11, RJ21)

=max(20, 30)=30

RJ12=max(J12, J11+RJ21, J10+RJ22)

=max(40, 20+30, 0+50)=50

RJ13=max(J13, J12+RJ21, J11+RJ22, J10+RJ23)

=max(60, 40+30, 20+50, 0+60)=70

RJ14=max(J14, J13+RJ21, J12+RJ22, J11+RJ23, J10+RJ24)

=max(60, 60+30, 40+50, 20+60, 0+70)=90

RJ15=max(J15, J14+RJ21, J13+RJ22, J12+RJ23, J11+RJ24, J10+RJ25)

=max(50,60+30,60+50,40+60,20+70,0+90)=110

　

J(3,0,1,1)=110

　

2-3. DP by forward method

see in appendix: DP-apdx.doc

　

2-4. Exercise 2: Distribution of irrigation water

see SA-water.xls

　return

2004/8/19

WORK		JOB
crew	J1	J2	J3	J4
1	20	10	30	20
2	40	20	40	30
3	60	30	50	40
4	60	50	70	50
5	50	70	80	60