深入理解动态规划的一系列问题(11)

深入理解动态规划的一系列问题(11)

动态规划本质上的问题基本就如之前所述,之前的问题也都比较简单。复杂的动态规划问题有几种表现,其中之一就是成本函数计算复杂,状态转移过程需要考虑的细节比较多。今天讲的这个问题就满足这个条件:DPP问题(Discounted Profits Problem)

问题是这样的:假设有一个湖,湖里有一些鱼,在第一年有b1条鱼,之后在第t年开始有bt条鱼。渔民靠捕鱼卖鱼为生,在第t年卖掉xt条鱼的话,可以有r(xt)的收益。当然捕获这些鱼也会造成c(xt,bt)的成本。鱼群每年会自然繁殖,假定繁殖规律是固定的,每年以s的比例自然增长。同样渔民手里的钱存在银行每年也会有固定利息y。请问从第一年开始到第T年,每年应该怎么捕鱼才能使利益最大化,不能竭泽而渔,当然也不能等着饿死。

乍一看这个描述,目标很明确——最大化利益,但是条件N多,在DP求解时,要仔细考虑清楚状态转移方程中核心的优化条件。首先我们定义状态,年份t是个天然的stage维度,那么要知道利益,已知条件有收益函数、成本函数、增长比例和固定利息,共同的需求就是鱼的数量b,于是状态就是(t,b),因为t在1~T之间(这本身也是递归收敛条件),所以DPFE方程为分段方程:f(t,b)=0当且仅当t>T,当t≤T时,f(t,b)=max xt∈{0,…,b} {r(xt)-c(xt,b)+1/(1+y) * f(t+1,s(b-xt)} 。具体解释一下,就是我们算t年b条鱼时的利益时,假设捕xt条鱼,那么利益等价于卖鱼收益r(xt)减去捕鱼成本c(xt,b),再加上明年收益及利息。

看起来蛮复杂的问题,其实核心的状态定义清楚后,逻辑理清,条件虽然多,但是复杂度只增加在了目标计算上,这样的问题对于DP里来说个人认为算是不太复杂的。

实际例子,我们假设T=2,y=0.05,s=2,初始b1=10(单位是千条),收益函数也线性化表示为3xt,成本函数简化为2xt,这样收益我们实际是在求f(1,10),答案输出是19.05,其最优化决策是x1=0,x2=20,即第一年不捕鱼,第二年捕完所有鱼~~(这个变态的决策,有点。。。。。。)

源码:

   1: /*
   2:  * Copyright (C) 2013 changedi
   3:  *
   4:  * Licensed under the Apache License, Version 2.0 (the "License");
   5:  * you may not use this file except in compliance with the License.
   6:  * You may obtain a copy of the License at
   7:  *
   8:  * http://www.apache.org/licenses/LICENSE-2.0
   9:  *
  10:  * Unless required by applicable law or agreed to in writing, software
  11:  * distributed under the License is distributed on an "AS IS" BASIS,
  12:  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  13:  * See the License for the specific language governing permissions and
  14:  * limitations under the License.
  15:  */
  16: package com.jybat.dp;
  17:  
  18: //Discounted Profits Problem
  19: //A discounted DP problem from Winston/Venkataramanan
  20: //pp.779--780
  21: //Used a reproductionFactor of 2 instead of 1.2 so number of
  22: //fish takes on integral values, without the need to use a
  23: //round or a floor function.
  24: public class DPP {
  25:     private static int T = 2; // planning horizon T=2 years
  26:     private static double interestRate = 0.05; // assume 5% interest rate
  27:     private static int initialFishAmount = 10; // initially 10,000 fish in lake
  28:     private static int reproductionFactor = 2; // in 1 year 100% more fish
  29:  
  30:     private static double revenue(int xt) {
  31:         // simply assume that the sale price is $3 per fish, no matter what
  32:         return 3.0 * xt;
  33:     }
  34:  
  35:     private static double cost(int xt, int b) {
  36:         // simply assume that it costs $2 to catch (and process) a fish, no
  37:         // matter what
  38:         return 2.0 * xt;
  39:     }
  40:  
  41:     // t is stage number, each stage represents one year
  42:     // b is current number of fish in lake (scaled to thousands)
  43:     public static double f(int t, int b) {
  44:         if (t == T + 1)// T+1 is outside planning horizon
  45:             return 0.0;
  46:         // xt is the number of fish to catch and sell
  47:         // during year t
  48:         int xt;
  49:         double max = Double.MIN_VALUE;
  50:         for (xt = 0; xt <= b; xt++) {
  51:             double earn = revenue(xt) - cost(xt, b) + 1 / (1 + interestRate)
  52:                     * f(t + 1, reproductionFactor * (b - xt));
  53:             if (earn > max)
  54:                 max = earn;
  55:         }
  56:         return max;
  57:     }
  58:  
  59:     /**
  60:      * @param args
  61:      */
  62:     public static void main(String[] args) {
  63:         System.out.println(f(1, initialFishAmount));
  64:     }
  65:  
  66: }


你可能感兴趣的:(深入理解动态规划的一系列问题(11))