python数据分析个人学习读书笔记-目录索引
第11章涵盖了代理人基的模拟;我们模拟的场景有:加油站的加油过程,电动车耗尽电量以及狼——羊的掠食。
本章中,会学习以下技巧:
·使用SimPy模拟加油站的加油过程
·模拟电动车耗尽电量的场景
·判断羊群面对群狼时是否有团灭的风险
11.1导论
获取数据总是一件麻烦事:收集数据就很累,然后要提取需要的特征,还得费很多功夫。而且,收集过来的数据容易导致短视:你只看到眼前几片树叶,忽略了整个森林。
不过,某些情况下你可以进行模拟。当不可能观察每个局部时,或者想在多种情况下测试模型时,或者要验证假设时,模拟就很有用了。
很多书介绍了如何进行金融数据的模拟。本书中不打算这么做。相反,我们将集中精力关注代理人基模型。这种模拟创建了一个虚拟世界(或环境),我们可以将代理人放置其中。代理人几乎可以是你考虑的任何物体:在我们的模拟中,代理人会是加油站、车、充电站、羊或狼。在模拟中,代理人与代理人、代理人与环境都互相影响。
不过,非常有效的代理人基模拟有些局限。最主要的是代理人之间互动的局限;错失一个重要的行为可能就会导致错误的结论。比如,路上的司机总是遵守章法,这么一个假设就会导致一个结论,路上只需要一条车道,而在现实中,人们的行为是多种多样的,只有一条车道就会引发长长的拥堵。如果想进一步了解代理人基模型及其限制,我推荐这篇文章:http://ijcsi.org/papers/IJCSI-9-1-3-115-119.pdf。
本章将说一说如何用SimPy设计并运行模拟。
原作者强烈推荐SimPy官网上的建模介绍:https://simpy.readthedocs.org/en/latest/topical_guides/index.html。
你可以从http://pypi.python.org/pypi/SimPy/下载软件包。
如果你用的是Anaconda,可以使用下面的命令:
/* pip install -U simpy */
本章我们会接触Python更高深的部分:编写和继承类,以及使用异常中断模拟。别紧张,我们会详细讲解的。
/* pip install simpy Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting simpy Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5a/64/8f0fc71400d41b6c2c6443d333a1cade458d23d4945ccda700c810ff6aae/simpy-3.0.11-py2.py3-none-any.whl Installing collected packages: simpy Successfully installed simpy-3.0.11 FINISHED */
11.2使用SimPy模拟加油站的加油过程
你如果有车,那去加油站就是常规活动了:取下泵,刷卡,加油,离开。
不过,如果你要建一个加油站,那就要提前考虑一系列问题了:
·车辆加油的频率是多少?
·这对每个司机的等待时长有什么样的影响?(我们可不希望有人因为等太久而去了竞争者那里)
·能不能满足需求?
·多久需要让供应商装满油箱?
·能产生多少利润?
这些问题都得问问自己。
还有那些如果。如果加两个加油机呢?如果用更大的油箱呢,这样不用经常叫供应商过来了?如果车辆来加油站的频率不是30秒一辆,二是45秒一辆呢?这会如何影响我的利润?
在你投入数百万之前,试试模拟,测试各种场景,这是一个便宜但宝贵的工具。
注意下,这个章节的流程和本书之前的不同。
准备:需要装好SimPy和NumPy。
代码实现:模拟的代码很长,但既然你书都看到这里了,对你来说,代码的逻辑应该是相当直接的。GitHub上的源代码——sim_gasStation.py文件——可能有些地方不清晰。我们下一节会一步一步地过一遍:
1 import numpy as np 2 import simpy 3 import itertools 4 if __name__ == '__main__': 5 # what is the simulation horizon (in seconds) 6 #模拟时限,以秒计 7 SIM_TIME = 10 * 60 * 60 # 10 hours 8 9 # create the environment创建环境 10 env = simpy.Environment() 11 12 # create the gas station 13 gasStation = GasStation(env) 14 15 # print the header 16 print('\t\t\t\t\t\t Left') 17 header = 'CarID\tArrive\tStart\tFinish\tGal' 18 header += '\tType\tPetrol\tDiesel' 19 print(header) 20 print('-' * 62) 21 22 # create the process of generating cars 生成车 23 env.process(Car.generate(env, gasStation)) 24 25 # and run the simulation运行模拟 26 env.run(until = SIM_TIME)
原理:照例先引入需要的模块。
我们的代码以if__name__=='__main__'(这是Python的约定)。先决定模拟程序要跑多久。本例中,我们将一轮当成1秒。然后创建环境(.Environment(...))。这是模拟的基础。环境封装了时间,并处理模拟中代理人之间的互动。
下面,创建第一个代理人,GasStation:
1 class GasStation(object): 2 ''' 3 Gas station class. 4 ''' 5 def __init__(self, env): 6 ''' 7 The main constructor of the class 8 9 @param env -- environment object 10 ''' 11 # keep a pointer to the environment in the class 保存一个指向环境的指针 12 self.env = env 13 14 # fuel capacity (gallons) and reservoirs 15 # to track level 16 #油箱(加仑)和贮油器 17 self.CAPACITY = {'PETROL': 8000, 'DIESEL': 3000} 18 self.RESERVOIRS = {} 19 self.generateReservoirs(self.env, self.CAPACITY) 20 21 # number of pumps for each fuel type 每种染料的泵数 22 self.PUMPS_COUNT = {'PETROL': 3, 'DIESEL': 1} 23 self.PUMPS = {} 24 self.generatePumps(self.env, self.CAPACITY, 25 self.PUMPS_COUNT) 26 27 # how quickly they pump the fuel 抽出的速度 28 self.SPEED_REFUEL = 0.3 # 0.3 gal/s 29 30 # set the minimum amount of fuel left before 31 # replenishing 32 #设置在补足前的最少油量 33 self.MINIMUM_FUEL = {'PETROL': 300, 'DIESEL': 100} 34 35 # how long does it take for the track to get 36 # to the station after placing the call 37 self.TRUCK_TIME = 200 38 self.SPEED_REPLENISH = 5 39 40 # add the process to control the levels of fuel 41 # available for sale 42 self.control = self.env.process(self.controlLevels()) 43 44 print('Gas station generated...')
Python中的类都有__init__(self,...)方法。创建GasStation对象时会调用这个方法。
你可以创建一个只有静态方法的类,那就不需要__init__(self,...)了。静态方法的执行不需要对象。也就是说,调用这个方法不依赖于任何对象的性质,但这个方法依然与类的主题有关。
比如,你有一个Triangle类,创建了使用勾股定理计算弦长的方法length。你也可以用这个方法计算平面上两点之间的距离(笛卡尔坐标系)。这时,你就可以将这个方法声明为静态的,并在需要时调用,以实现代码的重用。
你可能已经注意到了,self(几乎)出现在所有方法里。self参数是指向实例对象自身的引用(也由此得名)。
在__init__(self,...)方法内,你应该列出所有实例对象都会拥有的内部属性。我们的加油站会提供两种燃料:汽油和柴油。油箱分别可以容纳8000加仑和3000加仑。所以self.RESERVOIRS属性有两种.Container(...):对应这两种燃料。.Container(...)对象其实就是许多处理过程可以共享的资源;每个过程都可以访问这个公用的源,直到结束,或.Container(...)中用光了。.generateReservoirs(...)方法实现了这些逻辑:
1 def generateReservoirs(self, env, levels): 2 ''' 3 Helper method to generate reservoirs 4 ''' 5 for fuel in levels: 6 self.RESERVOIRS[fuel] = simpy.Container( 7 env, levels[fuel], init=levels[fuel])
.Container(...)方法以指向.Environment(...)对象的指针作为第一个参数,贮油器的容量作为第二个参数,初始量作为第三个参数。
用generatePumps(...)方法创建FuelPump对象:
1 def generatePumps(self, env, fuelTypes, noOfPumps): 2 ''' 3 Helper method to generate pumps 生成泵的辅助函数 4 ''' 5 for fuelType in fuelTypes: 6 self.PUMPS[fuelType] = FuelPump( 7 env, noOfPumps[fuelType])
FuelPump对象唯一的属性self.resource放的是SimPy的.Resource(...)对象。可以将.Resource(...)对象看成某种资源的守门员。可用来限制并行访问某种.Container(...)的过程的数目。
在我看来,区分.Container(...)和.Resource(...)的最简单办法是想想加油站的运转方式。同一个(地下)油箱连有多个贮油器。任何时候,一辆车可使用一个贮油器;所以加油站的吞吐量由贮油器的数目限定。这就是我们将FuelPump作为.Resource(...)的原因。另外,所有的贮油器都连接油箱,我们使用.Container(...)对象为这种行为建模。来了一辆车,我们使用.Resource(...)的.request(...)方法访问一个贮油器;如果所有的贮油器都被占用了,就得等待。
然后指定输油的速度——本例中是0.3 gal/s——以及重新装满前的最小油量;对于汽油,我们看到少于300加仑时就要叫油车来,对于柴油,这条线在100加仑。计算中也要考虑油车在路上花的时间(假设是300秒),以及5加仑每秒的速度下,重新充满需要的时间。
__init__(self,...)方法的最后一个指令创建环境中第一个过程;将.controlLevels()方法放到环境中:
1 def controlLevels(self): 2 ''' 3 A method to check the levels of fuel (every 5s) 4 and replenish when necessary 5 ''' 6 while True: 7 # loop through all the reservoirs 循环所有油箱 8 for fuelType in self.RESERVOIRS: 9 10 # and if the level is below the minimum 11 #如果油量不是最小值 12 if self.RESERVOIRS[fuelType].level \ 13 < self.MINIMUM_FUEL[fuelType]: 14 15 # replenishes补满 16 yield env.process( 17 self.replenish(fuelType)) 18 # wait 5s before checking again--5秒后检查 19 yield env.timeout(5)
这个方法在整个模拟期间持续循环。循环内使用yield env.timeout(5),每5秒检查油箱的量。
整个模拟期间,各种代理人都会创建与挂起过程。yield命令挂起过程,并在之后由环境或另一个过程唤起。我们的例子中,yield env.timeout(5)将当前过程挂起5秒;5秒后,环境会触发另一次遍历。
循环中,我们检查所有油箱是否不足最小量。如果不足,我们挂起过程,叫来油车,即replenish(...)方法。当前遍历只有在完成补充工作后才会恢复:
1 def replenish(self, fuelType): 2 ''' 3 A method to replenish the fuel 4 ''' 5 # print nicely so we can distinguish when the truck 6 # was called 7 print('-' * 62) 8 print('CALLING TRUCK AT {0:4.0f}s.' \ 9 .format(self.env.now)) 10 print('-' * 62) 11 12 # waiting for the truck to come (lead time)等油车来 13 yield self.env.timeout(self.TRUCK_TIME) 14 15 # let us know when the truck arrived 16 print('-' * 62) 17 print('TRUCK ARRIVING AT {0:4.0f}s' \ 18 .format(self.env.now)) 19 20 # how much we need to replenish 要补多少 21 toReplenish = self.RESERVOIRS[fuelType].capacity - \ 22 self.RESERVOIRS[fuelType].level 23 24 print('TO REPLENISH {0:4.0f} GALLONS OF {1}' \ 25 .format(toReplenish, fuelType)) 26 print('-' * 62) 27 28 # wait for the truck to dump the fuel into 29 # the reservoirs 等油车补满 30 yield self.env.timeout(toReplenish \ 31 / self.SPEED_REPLENISH) 32 33 # and then add the fuel to the available one 34 yield self.RESERVOIRS[fuelType].put(toReplenish) 35 36 print('-' * 62) 37 print('FINISHED REPLENISHING AT {0:4.0f}s.' \ 38 .format(self.env.now)) 39 print('-' * 62)
注意有些情况下,用\分隔的多行命令会报错。你可以将多行命令放进括号(...)来避免错误。
replenish(...)方法中,我们先挂起过程,等油车到达。等过程恢复时,我们看看要补充多少。.Container(...)对象有两个我们要调用的属性:.capacity和.level。.capacity属性返回的是油箱的容量,而.level属性返回的是当前的存量。差值就是要补充的量。知道了这个,又知道了输油的速度,我们就能算出要花多少时间。完成后,我们将量加到对应的种类上,并返回到调用函数。
最后,定义Car类:
1 class Car(object): 2 ''' 3 Car class. 4 ''' 5 def __init__(self, i, env, gasStation): 6 ''' 7 The main constructor of the class 8 9 @param i -- consecutive car id 10 @param env -- environment object 11 @param gasStation -- gasStation object 12 ''' 13 # pointers to the environment and gasStation objects 14 #指向环境和加油站对象的指针 15 self.env = env 16 self.gasStation = gasStation 17 18 # fuel type required by the car 需要的燃油种类 19 self.FUEL_TYPE = np.random.choice( 20 ['PETROL', 'DIESEL'], p=[0.7, 0.3]) 21 22 # details about the car 车的明细 23 self.TANK_CAPACITY = np.random.randint(12, 23) # gal 24 25 # how much fuel left 剩余油量 26 self.FUEL_LEFT = self.TANK_CAPACITY \ 27 * np.random.randint(10, 40) / 100 28 29 # car id 车ID 30 self.CAR_ID = i 31 32 # start the refueling process 开始加油过程 33 self.action = env.process(self.refuel())
我们的车需要特定的油,以及预先定义好的油箱容量。创建一个car对象时,会随机分配一个值作为剩余的油量——在油箱容量的10%到40%之间。
加油过程开始:
1 def refuel(self): 2 ''' 3 Refueling method 4 ''' 5 # what's the fuel type so we request the right pump 6 #燃油种类,以便我们请求正确的泵 7 fuelType = self.FUEL_TYPE 8 9 # let's get the pumps object对应的泵对象 10 pump = gasStation.getPump(fuelType) 11 12 # and request a free pump 请求空闲的泵 13 with pump.request() as req: 14 # time of arrival at the gas station到达加油站的时间 15 arrive = self.env.now 16 17 # wait for the pump等待泵 18 yield req 19 20 # how much fuel does the car need 需要多少燃油 21 required = self.TANK_CAPACITY - self.FUEL_LEFT 22 23 # time of starting refueling 开始加油的时间 24 start = self.env.now 25 yield self.gasStation.getReservoir(fuelType)\ 26 .get(required) 27 28 # record the fuel levels记录油量 29 petrolLevel = self.gasStation\ 30 .getReservoir('PETROL').level 31 dieselLevel = self.gasStation\ 32 .getReservoir('DIESEL').level 33 34 # and wait for it to finish等待结束 35 yield env.timeout(required / gasStation \ 36 .getRefuelSpeed()) 37 38 # finally, print the details to the screen 39 refuellingDetails = '{car}\t{tm}\t{start}' 40 refuellingDetails += '\t{fin}\t{gal:2.2f}\t{fuel}' 41 refuellingDetails += '\t{petrol}\t{diesel}' 42 43 print(refuellingDetails \ 44 .format( 45 car=self.CAR_ID, 46 tm=arrive, 47 start=int(start), 48 fin=int(self.env.now), 49 gal=required, fuel=fuelType, 50 petrol=int(petrolLevel), 51 diesel=int(dieselLevel) 52 ) 53 )
我们先从GasStation对象调用特定燃油的泵。燃油种类是我们之前创建的.Resource(...):这个对象对每种燃油,都有对应的可用的泵的数目——汽油三个,柴油一个。每来一辆车,都用.request(...)对象访问泵。然后就等待请求完成。
一旦泵可用,就恢复执行了。我们先计算要加多少油,并开始加油过程。这个过程会挂起补足油所花的时间。
这样就是整个加油过程啦。
generate(...)静态方法在模拟过程中随机生成车(每5到45秒一辆):
1 @staticmethod 2 def generate(env, gasStation): 3 ''' 4 A static method to generate cars 生成车的静态方法 5 ''' 6 # generate as many cars as possible during the 7 # simulation run在模拟过程中随机生成车 8 for i in itertools.count(): 9 # simulate that a new car arrives between 5s 10 # and 45s 每5-45秒一辆 11 yield env.timeout(np.random.randint(5, 45)) 12 13 # create a new car 14 Car(i, env, gasStation)
我们用env.process(Car.generate(env,gasStation))命令将生成车的过程加到环境中。
如你所见,静态方法同对象方法不同,不需要self关键字。
最后调用.run(...)方法。until参数指定了模拟要跑多久。
一旦执行程序,你会看到类似这样的界面:
整个模拟中,你会看到油车何时被召唤,以及整个过程结束的时间:
更多:那利润呢?这里我们假设加油站,每加仑汽油是1.95美元买入、2.45美元卖出,柴油是1.67美元买入、2.23美元卖出。我们也加上了充满油箱的初始花费(sim_gasStation_alternative.py文件):
1 # cash registry 增加记账 2 self.sellPrice = {'PETROL': 2.45, 'DIESEL': 2.23} 3 self.buyPrice = {'PETROL': 1.95, 'DIESEL': 1.67} 4 self.cashIn = 0 5 self.cashOut = np.sum( 6 [ self.CAPACITY[ft] \ 7 * self.buyPrice[ft] 8 for ft in self.CAPACITY]) 9 self.cashLost = 0
在replenish(...)方法中,我们也加上了一行付费行为:
1 # and pay for the delivery 增加付费行为 2 self.pay(toReplenish * self.buyPrice[fuelType])
.pay(...)方法将金额加到(支付给供应商的)self.cashOut变量上。
我们也假设有些消费者,在等了一段时间后,会不耐烦走掉(假设是5分钟——参看下面代码中的waitedTooLong布尔变量)。最后,Car的refuel(...)方法就变成了这样:
1 def refuel(self): 2 ''' 3 Refueling method 4 ''' 5 # what's the fuel type so we request the right pump 6 fuelType = self.FUEL_TYPE 7 8 # let's get the pumps object 9 pump = gasStation.getPump(fuelType) 10 11 # and request a free pump 12 with pump.request() as req: 13 # time of arrival at the gas station 14 arrive = self.env.now 15 16 # wait for the pump 17 yield req 18 19 # how much fuel does the car need 20 required = self.TANK_CAPACITY - self.FUEL_LEFT 21 22 # how long have been waiting for 23 waitedTooLong = self.env.now - arrive > 5 * 60 24 25 if waitedTooLong: 26 # leave一怒而去 27 print('-' * 70) 28 print('CAR {0} IS LEAVING -- WAIT TOO LONG'\ 29 .format(self.CAR_ID) 30 ) 31 print('-' * 70) 32 gasStation.lost(required * self.gasStation\ 33 .getFuelPrice(fuelType)) 34 else: 35 # time of starting refueling 36 start = self.env.now 37 yield self.gasStation.getReservoir(fuelType)\ 38 .get(required) 39 40 # record the fuel levels 41 petrolLevel = self.gasStation\ 42 .getReservoir('PETROL').level 43 dieselLevel = self.gasStation\ 44 .getReservoir('DIESEL').level 45 46 # and wait for it to finish 47 yield env.timeout(required / gasStation \ 48 .getRefuelSpeed()) 49 50 # time finished refueling 51 fin = self.env.now 52 53 # pay 付费 54 toPay = required * self.gasStation\ 55 .getFuelPrice(fuelType) 56 self.gasStation.sell(toPay) 57 58 yield env.timeout(np.random.randint(15, 90)) 59 60 # finally, print the details to the screen 61 refuellingDetails = '{car}\t{tm}\t{start}' 62 refuellingDetails += '\t{fin}' 63 refuellingDetails += '\t{gal:2.2f}\t{fuel}' 64 refuellingDetails += '\t{petrol}\t{diesel}' 65 refuellingDetails += '\t${pay:3.2f}' 66 67 print(refuellingDetails \ 68 .format(car=self.CAR_ID, tm=arrive, 69 start=int(start), 70 fin=int(self.env.now), 71 gal=required, fuel=fuelType, 72 petrol=int(petrolLevel), 73 diesel=int(dieselLevel), 74 pay=toPay 75 ) 76 )
车辆等得太久,就会开走:
/* ---------------------------------------------------------------------- CAR 1776 IS LEAVING -- WAIT TOO LONG ---------------------------------------------------------------------- ---------------------------------------------------------------------- CAR 1778 IS LEAVING -- WAIT TOO LONG ---------------------------------------------------------------------- CAR 1780 IS LEAVING -- WAIT TOO LONG ---------------------------------------------------------------------- ...... */
你也会看到我们往GasStation类中加了一些新方法,.getFuelPrice(...)和.sell(...)。这些是用于计算并记录燃油销售额的,.sell(...)将金额加到self.cashIn变量上。
在整个模拟结束后,我们要打印结果:
1 # 打印记账结果 2 def printCashRegister(self): 3 print('\nTotal cash in: ${0:8.2f}'\ 4 .format(self.cashIn)) 5 print('Total cash out: ${0:8.2f}'\ 6 .format(self.cashOut)) 7 print('Total cash lost: ${0:8.2f}'\ 8 .format(self.cashLost)) 9 print('\nProfit: ${0:8.2f}'\ 10 .format(self.cashIn - self.cashOut)) 11 print('Profit (if no loss of customers): ${0:8.2f}'\ 12 .format(self.cashIn - self.cashOut \ 13 + self.cashLost))
前面的方法产生如下输出:
可见,如果加油站有更多的贮油器,挣到的就能超过8500美元。根据这些模拟参数,头10个小时,利润只有154.74美元。模拟20个小时,利润涨到1943.06美元,但是损失的机会成本也上升到8508.01美元。看起来加油站需要更多的贮油器。
11.3模拟电动车耗尽电量的场景
电动车如今越来越流行。不过尽管廉价,这种车在长途旅行中使用却有些限制,至少半路充电是个问题。
本技巧中,我们将模拟电动车耗尽电量的场景。我们在路上随机放置充电站。并且,我们允许司机没充满电就开出车。
准备:需要装好SimPy和NumPy。
代码实现:和前一技巧类似,我们先定义环境和所有代理人(sim_recharge.py文件):
1 import numpy as np 2 import simpy 3 4 5 if __name__ == '__main__': 6 # what is the simulation horizon (in minutes) 7 SIM_TIME = 10 * 60 * 60 # 10 hours 8 9 # create the environment 10 env = simpy.Environment() 11 12 # create recharge stations 13 rechargeStations = RechargeStation \ 14 .generateRechargeStations(SIM_TIME) 15 16 # create the driver and the car 17 driver = Driver(env) 18 car = Car(env, driver, rechargeStations) 19 20 # print the header 21 print() 22 print('-' * 30) 23 print('Time\tBatt.\tDist.') 24 print('-' * 30) 25 26 # and run the simulation 27 env.run(until = SIM_TIME)
原理:我们指定模拟的参数:最多10小时,只模拟一辆车。
首先,创建环境和充电站:
1 @staticmethod 2 def generateRechargeStations(simTime): 3 ''' 4 A static method to create gas stations along 5 the route. Gas stations are located every 6 80 - 140 miles 7 8 @param simTime -- time of the simulation 9 ''' 10 # we assume an average speed of 35MPH to calculate 11 # a maximum distance that might be covered during 12 # the simulation timespan 13 #假设速度为35MPH,计算模拟过程 中,最多开多远 14 maxDistance = simTime / 60 * 35 * 2 15 16 # generate the recharge stations 生成充电 站 17 distCovered = 0 18 rechargeStations = [RechargeStation(env, 0)] 19 20 while(distCovered < maxDistance): 21 nextStation = np.random.randint(80, 140) 22 23 distCovered += nextStation 24 rechargeStations.append( 25 RechargeStation(env, distCovered)) 26 27 return rechargeStations
.generateRechargeStations(...)方法设置模拟的时间,并在路线上随机放置充电站。这个方法假设均速是35mph,计算模拟时间内最大的行驶距离。充电站之间的距离在80英里到140英里之间。
在这个模拟中,我们加入司机的概念。司机经过充电站时,如果无法保证能到下一个充电站,就会停下来。司机也有权力中断充电过程,直接开出。
drive(...)方法引入一个新概念——中断:
1 def drive(self, car, timeToFullRecharge): 2 ''' 3 Method to control how long the recharging 4 is allow to last 5 6 @param car -- pointer to the car 7 @timeToFullRecharge -- minutes needed to full 8 recharge 9 ''' 10 # decide how long to allow the car to recharge 11 #允许充电站充多久 12 interruptTime = np.random.randint(50, 120) 13 14 # if more than the time needed to full recharge 15 # wait till the full recharge, otherwise interrupt 16 # the recharge process earlier 17 #如果足够充满点,就等到充满,否则提前中断 18 yield self.env.timeout(int(np.min( 19 [interruptTime, timeToFullRecharge]))) 20 21 if interruptTime < timeToFullRecharge: 22 car.action.interrupt()
车停靠充电站后,司机可以中断充电,直接开走。这是由.drive(...)方法随机决定的。这个方法以Car对象作为第一个参数,充满电的时间作为第二个参数。如果生成的随机数大于充满需要的时间,那就等到充满。否则,充电过程就停止,继续旅途。
在SimPy中中断进程很直接——调用.interrupt()方法即可。这会抛出一个异常。
模拟的大部分代码和处理都在Car类中。我们先定义car参数。在两种容量的电池中选择,70kWh或85kWh。电池的起始电量在80%到100%之间。.AVG_SPEED决定了能行驶多远,范围在36.4mph到49.2mph之间。消耗是以每英里kWh描述的;每100英里,34kWh到38kWh,换算下来大约是100mpg-e到89mpg-e(miles per gallon equivalent,每加仑等价英里数)。旅途之初,我们在位置0。
模拟从driving(...)方法内的过程开始:
1 def driving(self): 2 # updates every 15 minutes 每15分钟更新 3 interval = 15 4 5 # assuming constant speed -- how far the car travels 6 # in each 15 minutes 假设匀速 7 distanceTraveled = self.AVG_SPEED / 60 * interval 8 9 # how much battery used to travel that distance共耗多少电量 10 batteryUsed = distanceTraveled * self.AVG_ECONOMY
首先,我们看看电量能支持多远;如果电量不足0.0%,我们中断模拟,并给一个提示:
1 while True: 2 # update the location of the car 更新车的位置 3 self.LOCATION += distanceTraveled 4 5 # how much battery power left 剩余 电量 6 batteryLeft = self.BATTERY_LEVEL \ 7 * self.BATTERY_CAPACITY - batteryUsed 8 9 # update the level of the battery 更新电池电量 格数 10 self.BATTERY_LEVEL = batteryLeft \ 11 / self.BATTERY_CAPACITY 12 13 # if we run out of power -- stop 耗尽电量---终止 14 if self.BATTERY_LEVEL <= 0.0: 15 print() 16 print('!~' * 15) 17 print('RUN OUT OF JUICE...') 18 print('!~' * 15) 19 print() 20 break
驾驶过程中,我们要看最近的两个充电站,检查最近的距离:
1 # along the way -- check the distance to 在路上,查看两个最近的充电站 2 # the next two recharge stations 3 nearestRechargeStations = \ 4 [gs for gs in self.rechargeStations 5 if gs.LOCATION > self.LOCATION][0:2] 6 7 distanceToNearest = [rs.LOCATION \ 8 - self.LOCATION 9 for rs in nearestRechargeStations] 10 11 # are we currently passing a recharging station?正经过一个充电站? 12 passingRechargeStation = self.LOCATION \ 13 + distanceTraveled > \ 14 nearestRechargeStations[0].LOCATION 15 16 # will we get to the next one on the charge left?剩余电量能支撑到下一个充电站吗? 17 willGetToNextOne = self.check( 18 batteryLeft, 19 nearestRechargeStations[-1].LOCATION)
如果正经过一个充电站(passingRechargeStation),我们可以选择停车,也可以选择看看剩余电量能不能支撑到下个充电站(willGetToNextOne)。如果不太可能,我们就停下来充电:
1 if passingRechargeStation \ 2 and not willGetToNextOne: 3 4 # the charging can be interrupted by the 5 # driver 充电可由司机中断 6 try: 7 # how long will it take to fully recharge?充满要多久? 8 timeToFullRecharge = \ 9 (1 - self.BATTERY_LEVEL) \ 10 / nearestRechargeStations[0] \ 11 .RECHARGE_SPEED 12 13 # start charging开始充电 14 charging = self.env.process( 15 self.charging(timeToFullRecharge, 16 nearestRechargeStations[0] \ 17 .RECHARGE_SPEED)) 18 19 # and see if the driver will drive off 20 # earlier than the car is fully recharged 司机是否提前开走 21 yield self.env.process(self.driver \ 22 .drive(self, timeToFullRecharge)) 23 24 # if the he/she does -- interrupt charging 如果提前了,-中断充电 25 except simpy.Interrupt: 26 27 print('Charging interrupted at {0}' \ 28 .format(int(self.env.now))) 29 print('-' * 30) 30 31 charging.interrupt() 32 33 # update the progress of the car along the way 34 toPrint = '{time}\t{level:2.2f}\t{loc:4.1f}' 35 print(toPrint.format(time=int(self.env.now), 36 level=self.BATTERY_LEVEL, loc=self.LOCATION)) 37 38 # and wait for the next update 等待下一次更新 39 yield self.env.timeout(interval)
注意我们是如何用try-except包裹代码的;try的部分,我们先决定timeToFullRecharge,并开始充电。同时调用Driver对象(具体是drive(...)方法),开始决定这次充电命运的过程——中不中断?
假设不中断,charging(...)方法中try部分的代码会全部执行。注意我们充电是每秒增量式地更新电量,以应对中断:
1 def charging(self, timeToFullRecharge, rechargeSpeed): 2 ''' 3 Method to recharge the car 4 ''' 5 # we are starting the recharge process 开始充电进程 6 try: 7 # let's print out to the screen when this 8 # happens 9 print('-' * 30) 10 print('Charging at {0}'.format(self.env.now)) 11 12 # and keep charging (every minute) 13 for _ in range(int(timeToFullRecharge)): 14 self.BATTERY_LEVEL += rechargeSpeed 15 yield self.env.timeout(1) 16 17 # if the recharge process is not interrupted 18 print('Fully charged...') 19 print('-' * 30) 20 21 # else -- just finish then 22 except simpy.Interrupt: 23 pass
如果过程被司机中断了,driving(...)方法中的异常会传到charging(...)。然后不做任何处理,再返回driving(...)的执行。
执行代码后,你会看到这样的输出:
如果中断充电,会有一条消息:
如果模拟中,车的电量耗尽了,你会看到这样的输出:
11.4判断羊群面对群狼时是否有团灭的风险
一个著名的代理人基模型就是羊-狼捕食的例子。
模型模拟两群动物:一块区域上共同生活的羊和狼。羊群食草补充能量。狼群捕食羊群得到能量。在区域内移动消耗能量。一旦哪个动物能量低于0,就死了。
本技巧中,我们将构建一个300×300的区域(网格)并(初始)填充6000只羊和200只狼。我们也引入继承的概念:我们将创建一个通用的Animal类,然后派生出Sheep类和Wolf类。背后的思想很简单:动物是有共性的(都要在区域上移动),我们就不用在两个地方重复同样的代码了。
准备:需要装好SimPy和NumPy。
代码实现:
本技巧的代码很长,但跟着逻辑走应该很清楚(sim_sheepWolvesPredation.py文件):
1 import numpy as np 2 import simpy 3 import collections as col 4 5 if __name__ == '__main__': 6 # create the environment 创建环境 7 env = simpy.Environment() 8 9 # create the plane 创建区域 10 plane = Plane(env, LAND, GRASS_COVERAGE, 11 INITIAL_SHEEP, INITIAL_WOLF) 12 13 # print the header 14 print('\tSheep\t\tDied\t\tWolves\t\tDied\t') 15 print('T\tLive\tBorn\tEnergy\tEaten\tLive\tBorn\tEnergy') 16 17 # and run the simulation 运行模拟 18 env.run(until = SIM_TIME)
原理解释:和之前一样,我们先创建.Environment()。
然后创建Plane对象,我们的神奇动物们要在上面散步(下面的代码经过了缩略):
1 class Plane(object): 2 ''' 3 Plane class 4 ''' 5 def __init__(self, env, bounds, grassCoverage, 6 sheep, wolves): 7 ''' 8 Default constructor 9 ''' 10 # pointer to the environment 11 self.env = env 12 13 # bounds of the plane 14 self.bounds = bounds 15 16 # grass 17 self.grassCoverage = grassCoverage 18 self.grass = [ 19 [0 for _ in range(self.bounds[0])] 20 for _ in range(self.bounds[1]) 21 ] 22 23 # we keep track of eaten grass 24 self.grassEatenIndices = col.defaultdict(list) 25 26 # create the animals 27 self.noOfSheep = sheep 28 self.noOfWolves = wolves 29 30 self.sheep = [] 31 self.wolves = [] 32 33 # keep track of counts 34 self.counts = { 35 'sheep': { 36 'count': 0, 37 'died': { 38 'energy': 0, 39 'eaten': 0, 40 'age': 0, 41 }, 42 'born': 0 43 }, 44 'wolves': { 45 'count': 0, 46 'died': { 47 'energy': 0, 48 'age': 0, 49 }, 50 'born': 0 51 } 52 } 53 54 # generate the grass and animals 生成草和动物 55 self.generateGrass() 56 self.generateSheep() 57 self.generateWolves() 58 59 # and start monitoring and simulation processes 开始监控和模拟过程 60 self.monitor = self.env.process( 61 self.monitorPopulation()) 62 self.action = self.env.process(self.run())
Plane类生成区域并先种草;我们使用.generateGrass()方法来实现绿化:
1 def generateGrass(self): 2 ''' 3 Method to populate the plane with grass 4 ''' 5 # number of tiles on the plane 区域内单位面积的块数 6 totalSize = self.bounds[0] * self.bounds[1] 7 8 # how many of them will have grass当中有多少要种草 9 totalGrass = int(totalSize * self.grassCoverage) 10 11 # randomly spread the grass on the plane 在区域上随机种草 12 grassIndices = sorted( 13 choice(totalSize, totalGrass, replace=False)) 14 15 for index in grassIndices: 16 row = int(index / self.bounds[0]) 17 col = index - (self.bounds[1] * row) 18 19 self.grass[row][col] = 1
这个方法,基于模拟的参数,检查区域上有多少单位块要种草。然后使用NumPy的.random.choice(...)方法随机选择单位块。最后把草种上去。
下一步是生成动物:
1 def generateSheep(self): 2 ''' 3 Method to populate the plane with sheep 4 ''' 5 # place the sheep randomly on the plane 随机放一些羊 6 for _ in range(self.noOfSheep): 7 pos_x = rint(0, LAND[0]) 8 pos_y = rint(0, LAND[1]) 9 energy = rint(*ENERGY_AT_BIRTH) 10 11 self.sheep.append( 12 Sheep( 13 self.counts['sheep']['count'], 14 self.env, energy, [pos_x, pos_y], self) 15 ) 16 self.counts['sheep']['count'] += 1
我们先决定区域上Sheep的主要属性:放哪里,初始有多少能量。决定了之后,简单附加到self.sheep列表就可以。
在Python的循环里,如果你用不到i这种循环次数;那就直接改用_。
生成狼群的方式类似:
1 def generateWolves(self): 2 ''' 3 Method to populate the plane with wolves 4 ''' 5 # place the wolves randomly on the plane 随机放一些狼 6 for _ in range(self.noOfWolves): 7 pos_x = rint(0, LAND[0]) 8 pos_y = rint(0, LAND[1]) 9 energy = rint(*ENERGY_AT_BIRTH) 10 11 self.wolves.append( 12 Wolf( 13 self.counts['wolves']['count'], 14 self.env, energy, [pos_x, pos_y], self) 15 ) 16 17 self.counts['wolves']['count'] += 1
Sheep类和Wolf类都派生自Animal类;毕竟都是动物:
1 class Animal(object): 2 ''' 3 Animal class 4 ''' 5 def __init__(self, i, env, energy, pos, plane): 6 ''' 7 Default constructor 8 ''' 9 # attributes属性 10 self.energy = energy 11 self.pos = pos # current position 12 13 # is the animal still alive动物是否还活着 14 self.alive = True 15 self.causeOfDeath = None 16 17 # when did the animal ate the last 上一顿是啥时候 18 self.lastTimeEaten = 0 19 20 # range of movements移动范围 21 self.movements = [i for i in range(-50,51)] 22 23 # pointer to environment and the plane 指向环境和区域的指针 24 self.env = env 25 self.plane = plane 26 self.id = i
Animal类控制所有主要的(Sheep类和Wolf类共有的)性质:移动move(...)(及其位置),死亡die(...),以及检查死因——你可以查看动物是否存活isAlive(...)以及还有多少能量。显然,这些性质是所有动物共有的。Animal类还存储了所有动物都有的属性:位置、能量以及是否活着。
为了移动(move(...))动物,我们随机选择水平轴和垂直轴,并沿着这个方向移动:
1 def move(self): 2 ''' 3 Changing the animal's position 4 ''' 5 # determining the horizontal and vertical moves决定水平轴和垂直轴 6 h = choice(self.movements) 7 v = choice(self.movements) 8 9 # adjusting the position调整位置 10 self.pos[0] += h 11 self.pos[1] += v 12 13 # making sure we do not go outside the predefined land确保没有越界 14 self.pos[0] = np.min( 15 [np.max([0, self.pos[0] - 1]), LAND[0] - 1] 16 ) 17 18 self.pos[1] = np.min( 19 [np.max([0, self.pos[1] - 1]), LAND[1] - 1] 20 ) 21 22 # and subtracting the energy due to move减去移动消耗的能量 23 self.energy -= (h+v) / 4
这个方法确保没有越界,也减去了移动消耗的能量。
Sheep类和Wolf类的不同点在于它们的食物——羊吃草(eatGrass(...))而狼吃羊(eatSheep(...)):
1 def eatGrass(self): 2 ''' 3 Sheep eat grass 4 ''' 5 if self.plane.hasGrass(self.pos): 6 # get the energy from grass 获得能量 7 self.energy += ENERGY_FROM_GRASS 8 self.lastTimeEaten = self.env.now 9 10 # and flag that the grass has been eaten标记草已被吃 11 self.plane.grassEaten(self.pos) 12 13 if self.energy > 200: 14 self.energy = 200
吃草的时候,我们先检查当前位置有没有。有的话,能量增加ENERGY_FROM_GRASS变量指定的数值。我们规定羊的能量不能超过200;毕竟不可能无限增长。我们会更新最近一次进食时间。这个时间会用来决定动物生存的概率;这里的想法就是,长时间不进食,死亡的可能性就上升了。
注意从Animal类派生Sheep(或Wolf)时,我们仍然使用self(比如self.energy),尽管逻辑是在父类(Animal类)中实现的。
而eatSheep(...)方法先获取当前位置所有的羊,并决定狼要吃多少只:
1 def eatSheep(self): 2 ''' 3 Wolves eat sheep 4 ''' 5 # get the sheep at the particular position on the 6 # plane获取当前位置所有的羊 7 sheep = self.plane.getSheep(self.pos) 8 9 # decide how many will be eaten决定狼要吃多少只 10 howMany = np.random.randint(1, 11 np.max([len(sheep), 2])) 12 13 # and feast 盛宴 14 for i, s in enumerate(sheep): 15 # we're checking if the sheep is still alive 16 # as the removal of sheep that died happens later 17 if s.isAlive() and i < howMany: 18 self.energy += s.getEnergy() / 20 19 s.die('eaten') 20 21 if self.energy > 200: 22 self.energy = 200 23 24 # update the time of the last meal更新最近一次进食时间 25 self.lastTimeEaten = self.env.now
遍历Sheep对象,看羊是否还活着,活着的话,吃掉。狼获得了对应的能量,并限制在200个单位以内。
回到Plane对象,生成所有的动物后,我们开始模拟。
第一个过程是monitorPopulation(...):
1 def monitorPopulation(self): 2 ''' 3 Process to monitor the population 4 ''' 5 # the process checks for animals that run out of 6 # energy and removes them from simulation检查并移除所有能量耗尽的动物 7 while True: 8 for s in self.sheep: 9 if s.energy < 0: 10 s.die('energy') 11 12 for w in self.wolves: 13 if w.energy < 0: 14 w.die('energy') 15 16 # clean up method 清理 17 self.removeAnimalsThatDied() 18 19 yield self.env.timeout(1)
这个方法遍历所有动物(包括Sheep和Wolf),检查能量水平;低于0就死了。当所有动物都标记过后,调用removeAnimalsThatDied(...)方法:
1 def removeAnimalsThatDied(self): 2 ''' 3 Clean up method for removing dead animals 清理死亡的动物 4 ''' 5 # get all animals that are still alive and those 6 # that died 列出活着的动物与死去的动物 7 sheepDied = [] 8 wolvesDied = [] 9 10 sheepAlive = [] 11 wolvesAlive = [] 12 13 for s in self.sheep: 14 if s.isAlive(): 15 sheepAlive.append(s) 16 else: 17 sheepDied.append(s) 18 19 for w in self.wolves: 20 if w.isAlive(): 21 wolvesAlive.append(w) 22 else: 23 wolvesDied.append(w) 24 25 # keep only those that are still alive只保留还活着的 26 self.sheep = sheepAlive 27 self.wolves = wolvesAlive 28 29 # while for those that died -- update why they died 30 cod = {'energy': 0, 'eaten': 0, 'age': 0} 31 for s in sheepDied: 32 cod[s.getCauseOfDeath()] += 1 33 34 for cause in cod: 35 self.counts['sheep']['died'][cause] = cod[cause] 36 37 cod = {'energy': 0, 'age': 0} 38 for w in wolvesDied: 39 cod[w.getCauseOfDeath()] += 1 40 41 for cause in cod: 42 self.counts['wolves']['died'][cause] = cod[cause] 43 44 # and finally -- release the memory by deleting the 45 # animal objects 释放内存 46 for s in sheepDied: 47 del s 48 49 for w in wolvesDied: 50 del w
该方法遍历所有动物,区分出是否还活着;活着的参加下一轮循环。最后删除对象是为了释放占据的内存。
另一个过程是主模拟循环,run(...)方法。这个方法控制模拟的流程:
1 def run(self): 2 ''' 3 Main loop of the simulation 4 ''' 5 while True: 6 # first, move the animals on the plane移动区域内的动物 7 self.updatePositions() 8 9 # and let them eat 进食 10 self.eat() 11 12 # then let's see how many of them will reproduce繁殖 13 self.reproduceAnimals() 14 15 # and keep track of the grass regrowth记录草的生长 16 self.env.process(self.regrowGrass()) 17 18 # finally, print the telemetry to the screen 19 toPrint = '{tm}\t{sheep_alive}\t{sheep_born}' 20 toPrint += '\t{sheep_died_energy}' 21 toPrint += '\t{sheep_died_eaten}' 22 toPrint += '\t{wolves_alive}\t{wolves_born}' 23 toPrint += '\t{wolves_died_energy}' 24 25 26 print(toPrint.format( 27 tm=int(self.env.now), 28 sheep_alive=int(len(self.sheep)), 29 sheep_born=self.counts['sheep']['born'], 30 sheep_died_energy= \ 31 self.counts['sheep']['died']['energy'], 32 sheep_died_eaten= \ 33 self.counts['sheep']['died']['eaten'], 34 sheep_died_age= \ 35 self.counts['sheep']['died']['age'], 36 wolves_alive=int(len(self.wolves)), 37 wolves_born=self.counts['wolves']['born'], 38 wolves_died_energy= \ 39 self.counts['wolves']['died']['energy'], 40 wolves_died_age= \ 41 self.counts['wolves']['died']['age']) 42 ) 43 44 # and wait for another iteration等下一个轮回 45 yield self.env.timeout(1)
模拟的每一次循环,所有动物都可以在区域内随意走动(一步)。updatePositions(...)方法遍历所有动物,调用move(...)方法:
1 def updatePositions(self): 2 ''' 3 Method to update the positions of animals 4 ''' 5 for s in self.sheep: 6 s.move() 7 8 for w in self.wolves:
下一步是进食。类似地,这个方法遍历所有动物,对羊调用eatGrass(...)方法,对狼调用eatSheep(...)方法:
1 def eat(self): 2 ''' 3 Method to feed animals 4 ''' 5 for s in self.sheep: 6 s.eatGrass() 7 8 for w in self.wolves: 9 w.eatSheep()
吃饱之后,模拟会进行繁殖(reproduceAnimals(...)):
1 def reproduceAnimals(self): 2 ''' 3 Method to reproduce animals 4 ''' 5 # counting the number of births 6 births = {'sheep': 0, 'wolves': 0} 7 8 # reproduce sheep 9 for s in self.sheep: 10 # determine if the animal will reproduce 是否愿意繁殖 11 willReproduce = np.random.rand() < \ 12 (SHEEP_REPRODUCE * 3 / \ 13 (self.env.now - s.lastTimeEaten + 1)) 14 15 # if will reproduce and is still alive -- 16 # give birth at the same position as the mother 17 #如果愿意且活着,原地生一个 18 if willReproduce and s.isAlive(): 19 energy = rint(*ENERGY_AT_BIRTH) 20 self.sheep.append( 21 Sheep(self.counts['sheep']['count'], 22 self.env, energy, 23 s.getPosition(), self)) 24 25 # increase the overall count of sheep 增加羊的数量 26 self.counts['sheep']['count'] += 1 27 28 # and the birth counter 29 births['sheep'] += 1 30 31 # reproduce wolves 32 for w in self.wolves: 33 # determine if the animal will reproduce 是否愿意繁殖 34 willReproduce = np.random.rand() < \ 35 ( WOLF_REPRODUCE / \ 36 (self.env.now - w.lastTimeEaten + 1)) 37 # if will reproduce and is still alive -- 38 # give birth at the same position as the mother 39 #如果愿意且活着,原地生一个 40 if willReproduce and w.isAlive(): 41 energy = rint(*ENERGY_AT_BIRTH) 42 self.wolves.append( 43 Wolf(self.counts['wolves']['count'], 44 self.env, energy, 45 w.getPosition(), self)) 46 47 # increase the overall count of wolves增加狼的数量 48 self.counts['wolves']['count'] += 1 49 50 # and the birth counter 51 births['wolves'] += 1 52 # update the counts variable 53 for animal in births: 54 self.counts[animal]['born'] = births[animal]
这方法遍历所有动物,并随机决定要不要生。繁殖后代的概率依赖于最近一次进食的时间:越久,可能性越低。如果愿意繁殖且活着,一个新的Sheep或Wolf会在原来的位置创建。
增加数量后,我们重新种草:
1 def regrowGrass(self): 2 ''' 3 Regrow the grass 4 ''' 5 # time to regrow the grass长草的时间 6 regrowTime = 2 7 yield self.env.timeout(regrowTime) 8 9 # then we make the grass available at the position标记这个位置有草 10 for pos in self.grassEatenIndices[ 11 self.env.now - regrowTime]: 12 self.grass[pos[0]][pos[1]] = 1
regrowGrass(...)方法遍历吃过的所有位置,并重新长草。Sheep吃草后,.grass变量设为0,表示草被吃掉了,这个位置也加到.grassEatenIndices defaultdict对象里,这个对象记录每个位置草被吃掉的时间。这样,你可以调整regrowTime,看看对羊群数量有什么影响。
执行代码后,你会看到类似的输出:
T是模拟的时间,Live是存活的动物数目,Born是出生数,Died记录多少动物已死亡——也许是由于能量不足,(就Sheep而言)也许是由于被吃了。
第11章完。整书完。
略微休息,计划读下一本书《python数据分析(第2版)-阿曼多.凡丹戈》。
该书是一本介绍如何用Python进行数据分析的学习指南。全书共12章,从Python程序库入门、NumPy数组和Pandas入门开始,陆续介绍了数据的检索、数据加工与存储、数据可视化等内容。同时,本书还介绍了信号处理与时间序列、应用数据库、分析文本数据与社交媒体、预测性分析与机器学习、Python生态系统的外部环境和云计算、性能优化及分析、并发性等内容。在本书的最后,还采用3个附录的形式为读者补充了一些重要概念、常用函数以及在线资源等重要内容。
python数据分析个人学习读书笔记-目录索引
随书源码官方下载:
http://www.hzcourse.com/web/refbook/detail/7821/92