深入理解动态规划的一系列问题(11)
动态规划本质上的问题基本就如之前所述,之前的问题也都比较简单。复杂的动态规划问题有几种表现,其中之一就是成本函数计算复杂,状态转移过程需要考虑的细节比较多。今天讲的这个问题就满足这个条件:DPP问题(Discounted Profits Problem)
问题是这样的:假设有一个湖,湖里有一些鱼,在第一年有b1条鱼,之后在第t年开始有bt条鱼。渔民靠捕鱼卖鱼为生,在第t年卖掉xt条鱼的话,可以有r(xt)的收益。当然捕获这些鱼也会造成c(xt,bt)的成本。鱼群每年会自然繁殖,假定繁殖规律是固定的,每年以s的比例自然增长。同样渔民手里的钱存在银行每年也会有固定利息y。请问从第一年开始到第T年,每年应该怎么捕鱼才能使利益最大化,不能竭泽而渔,当然也不能等着饿死。
乍一看这个描述,目标很明确——最大化利益,但是条件N多,在DP求解时,要仔细考虑清楚状态转移方程中核心的优化条件。首先我们定义状态,年份t是个天然的stage维度,那么要知道利益,已知条件有收益函数、成本函数、增长比例和固定利息,共同的需求就是鱼的数量b,于是状态就是(t,b),因为t在1~T之间(这本身也是递归收敛条件),所以DPFE方程为分段方程:f(t,b)=0当且仅当t>T,当t≤T时,f(t,b)=max xt∈{0,…,b} {r(xt)-c(xt,b)+1/(1+y) * f(t+1,s(b-xt)} 。具体解释一下,就是我们算t年b条鱼时的利益时,假设捕xt条鱼,那么利益等价于卖鱼收益r(xt)减去捕鱼成本c(xt,b),再加上明年收益及利息。
看起来蛮复杂的问题,其实核心的状态定义清楚后,逻辑理清,条件虽然多,但是复杂度只增加在了目标计算上,这样的问题对于DP里来说个人认为算是不太复杂的。
实际例子,我们假设T=2,y=0.05,s=2,初始b1=10(单位是千条),收益函数也线性化表示为3xt,成本函数简化为2xt,这样收益我们实际是在求f(1,10),答案输出是19.05,其最优化决策是x1=0,x2=20,即第一年不捕鱼,第二年捕完所有鱼~~(这个变态的决策,有点。。。。。。)
源码:
1: /*
2: * Copyright (C) 2013 changedi
3: *
4: * Licensed under the Apache License, Version 2.0 (the "License");
5: * you may not use this file except in compliance with the License.
6: * You may obtain a copy of the License at
7: *
8: * http://www.apache.org/licenses/LICENSE-2.0
9: *
10: * Unless required by applicable law or agreed to in writing, software
11: * distributed under the License is distributed on an "AS IS" BASIS,
12: * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13: * See the License for the specific language governing permissions and
14: * limitations under the License.
15: */
16: package com.jybat.dp;
17:
18: //Discounted Profits Problem
19: //A discounted DP problem from Winston/Venkataramanan
20: //pp.779--780
21: //Used a reproductionFactor of 2 instead of 1.2 so number of
22: //fish takes on integral values, without the need to use a
23: //round or a floor function.
24: public class DPP {
25: private static int T = 2; // planning horizon T=2 years
26: private static double interestRate = 0.05; // assume 5% interest rate
27: private static int initialFishAmount = 10; // initially 10,000 fish in lake
28: private static int reproductionFactor = 2; // in 1 year 100% more fish
29:
30: private static double revenue(int xt) {
31: // simply assume that the sale price is $3 per fish, no matter what
32: return 3.0 * xt;
33: }
34:
35: private static double cost(int xt, int b) {
36: // simply assume that it costs $2 to catch (and process) a fish, no
37: // matter what
38: return 2.0 * xt;
39: }
40:
41: // t is stage number, each stage represents one year
42: // b is current number of fish in lake (scaled to thousands)
43: public static double f(int t, int b) {
44: if (t == T + 1)// T+1 is outside planning horizon
45: return 0.0;
46: // xt is the number of fish to catch and sell
47: // during year t
48: int xt;
49: double max = Double.MIN_VALUE;
50: for (xt = 0; xt <= b; xt++) {
51: double earn = revenue(xt) - cost(xt, b) + 1 / (1 + interestRate)
52: * f(t + 1, reproductionFactor * (b - xt));
53: if (earn > max)
54: max = earn;
55: }
56: return max;
57: }
58:
59: /**
60: * @param args
61: */
62: public static void main(String[] args) {
63: System.out.println(f(1, initialFishAmount));
64: }
65:
66: }