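Last time we left one question open: when an IRP has no CancelRoutine set, the CancelIo operation fails, and the system can be left holding an IRP that will never be completed. In the section on threaded and non-threaded IRPs we saw that IRPs come in two kinds, thread-associated and not. If the IRP that never completes is non-threaded, things are slightly better: at worst the system leaks one resource. If it is a threaded IRP, the trouble is serious. A threaded IRP is created by the I/O manager and kept on the thread's IRP queue. The driver that handles it does not reclaim the IRP when the lower driver completes it; it keeps completing the IRP upward to the I/O manager, which frees it and removes it from the thread's IRP list.

Before a thread exits, it walks every IRP on its pending queue and waits until all of them have been completed. If one of them never completes, the thread never exits: ExitThread, _endthreadex, or any brutal forced-kill trick, none of them helps. While the thread cannot exit, the process cannot be destroyed either (an aside: reclaiming a process's resources is kicked off after its last thread exits; "killing a process" really means using APCs to make every thread start exiting). Worse, OS shutdown gets blocked too, and there is no way to recover short of cutting the power, which is frankly worse than a BSOD. I/O issued from user mode always ends up as threaded IRPs, which is why I went on at such length about user-mode threads in 7.1: user-mode programs fall into this trap too. Bad dog!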
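The fix is simple and the goal is clear: never let a "never-completing IRP" come into existence. The usual approach is to add a thread or a timer with a timeout and cancel the IRP when the time is up. If the IRP was issued by a user-mode program, call CancelIo; if it was issued by a driver, call IoCancelIrp. All of this works only on one precondition: your IRP has a CancelRoutine. Without one, everything is in vain. So here is a piece of experience I want to share: always set a CancelRoutine on your IRP, and complete the IRP inside that CancelRoutine! For convenience we take a non-threaded IRP as the example, with all the code in kernel mode, so that readers do not have to context-switch while reading sample code. Here is the code: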
Sending thread:
    IoSetCancelRoutine(Irp, MyCancelRoutine);
    devext->SentIrp = Irp;
Canceling thread:
    if (devext->SentIrp != NULL) {
        IoCancelIrp(devext->SentIrp);
    }
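To make the precondition concrete, here is a small user-mode toy model of what the cancel call depends on. It is only a sketch: FakeIrp, fake_cancel_irp and my_cancel_routine are hypothetical names invented for illustration, not kernel APIs, and the model ignores locking entirely. It only shows that a cancel request on an IRP without a cancel routine has nobody to complete it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real IRP layout. */
typedef struct FakeIrp FakeIrp;
typedef void (*FakeCancelRoutine)(FakeIrp *irp);

struct FakeIrp {
    bool cancel;                       /* plays the role of Irp->Cancel */
    bool completed;                    /* set once the request is completed */
    FakeCancelRoutine cancel_routine;  /* NULL means "not cancelable" */
};

/* Toy cancel routine: completing the request is its whole job. */
static void my_cancel_routine(FakeIrp *irp)
{
    irp->completed = true;  /* stands in for completing with STATUS_CANCELLED */
}

/*
 * Toy version of the cancel call: mark the request canceled, take the
 * cancel routine out of the request, and call it if there was one.
 * Returns false when no routine was set, so nobody completes the request.
 */
static bool fake_cancel_irp(FakeIrp *irp)
{
    FakeCancelRoutine routine;

    assert(irp != NULL);
    routine = irp->cancel_routine;
    irp->cancel = true;
    irp->cancel_routine = NULL;
    if (routine == NULL) {
        return false;
    }
    routine(irp);
    return true;
}
```

In this model, a FakeIrp whose cancel_routine is NULL comes back from fake_cancel_irp with cancel set but completed still false, which is exactly the "never-completing IRP" situation; the same call on a FakeIrp that has my_cancel_routine installed ends with completed set.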
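The contents of the cancel routine are all standard steps, so I won't go over them. Everything looked flawless, but as soon as the test team ran it, it blue-screened, with the system complaining that an IRP had been freed twice. Something had clearly been overlooked. Right: we handled the exceptional case nicely and missed the ordinary one. An IRP can also complete normally! Suppose our completion routine looks like this: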
Completion routine:
    PIRP irp;
    irp = devext->SentIrp;
    devext->SentIrp = NULL;
    IoFreeIrp(irp);
With a spin lock guarding devext->SentIrp, the three pieces become:
Sending thread:
    KeAcquireSpinLock(&devext->SentIrpLock, ...);
    devext->SentIrp = Irp;
    KeReleaseSpinLock(&devext->SentIrpLock, ...);
Canceling thread:
    KeAcquireSpinLock(&devext->SentIrpLock, ...);
    if (devext->SentIrp != NULL) {
        IoCancelIrp(devext->SentIrp);
    }
    KeReleaseSpinLock(&devext->SentIrpLock, ...);
Completion routine:
    PIRP irp;
    KeAcquireSpinLock(&devext->SentIrpLock, ...);
    irp = devext->SentIrp;
    devext->SentIrp = NULL;
    KeReleaseSpinLock(&devext->SentIrpLock, ...);
    IoFreeIrp(irp);
    return STATUS_MORE_PROCESSING_REQUIRED;
//
// No cancellation:
//     Cancelable --> Completed
//
// Cancellation, IoCancelIrp returns before completion:
//     Cancelable --> CancelStarted --> CancelCompleted --> Completed
//
// Canceled after completion:
//     Cancelable --> Completed --> CancelStarted
//
// Cancellation, IRP completed during call to IoCancelIrp():
//     Cancelable --> CancelStarted --> Completed --> CancelCompleted
//
Canceling thread:
    if (InterlockedExchange((PVOID)&touched, IRPLOCK_CANCEL_STARTED) == IRPLOCK_CANCELABLE) {
        //
        // You got to the IRP before it was completed. You can cancel
        // the IRP without fear of losing it, because the completion routine
        // does not let go of the IRP until you allow it.
        //
        IoCancelIrp(irp);
        //
        // Release the completion routine. If it already got there,
        // it left the IRP to you, and since this driver allocated the
        // IRP you need to free it yourself. Otherwise, you got
        // through IoCancelIrp before the IRP completed entirely.
        //
        if (InterlockedExchange((PVOID)&touched, IRPLOCK_CANCEL_COMPLETE) == IRPLOCK_COMPLETED) {
            IoFreeIrp(irp);
        }
    }
Completion routine:
    if (InterlockedExchange((PVOID)&touched, IRPLOCK_COMPLETED) == IRPLOCK_CANCEL_STARTED) {
        //
        // Main line code has got the control of the IRP. It will
        // now take the responsibility of freeing the IRP, so it
        // must not be freed here.
        // Therefore...
        return STATUS_MORE_PROCESSING_REQUIRED;
    }
    // Nobody is in the middle of canceling: the IRP is ours to free.
    IoFreeIrp(Irp);
    return STATUS_MORE_PROCESSING_REQUIRED;
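The four orderings listed in the comment block can be replayed in user mode to check the handshake. What follows is only a sketch under stated assumptions: it uses C11 atomic_exchange in place of InterlockedExchange, the numeric values of the IRPLOCK_* states are made up, FakeIrp and simulate are hypothetical names, and it assumes the rule that whichever side sees the other has already finished is the one that frees the IRP. It replays each ordering on a single thread rather than racing real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* State names follow the text; the numeric values are an assumption. */
enum {
    IRPLOCK_CANCELABLE,
    IRPLOCK_CANCEL_STARTED,
    IRPLOCK_CANCEL_COMPLETE,
    IRPLOCK_COMPLETED
};

/* Hypothetical stand-in for an IRP together with its lock word. */
typedef struct {
    atomic_int touched;  /* the lock word from the snippets above */
    int free_count;      /* how many times the fake IRP was freed */
} FakeIrp;

/* Canceler, step 1: returns 1 if it won the right to call cancel. */
static int cancel_start(FakeIrp *irp)
{
    return atomic_exchange(&irp->touched, IRPLOCK_CANCEL_STARTED)
           == IRPLOCK_CANCELABLE;
}

/* Canceler, step 2: frees only if completion ran during the cancel call. */
static void cancel_end(FakeIrp *irp)
{
    if (atomic_exchange(&irp->touched, IRPLOCK_CANCEL_COMPLETE)
        == IRPLOCK_COMPLETED) {
        irp->free_count++;
    }
}

/* Completion side: frees unless the canceler is in the middle of its call. */
static void on_complete(FakeIrp *irp)
{
    if (atomic_exchange(&irp->touched, IRPLOCK_COMPLETED)
        == IRPLOCK_CANCEL_STARTED) {
        return;  /* the canceler will free in cancel_end() */
    }
    irp->free_count++;
}

/*
 * Replays one ordering: 'S' = cancel start, 'E' = cancel end,
 * 'C' = completion. Returns how many times the fake IRP was freed.
 */
static int simulate(const char *order)
{
    FakeIrp irp;
    int cancel_owned = 0;

    assert(order != NULL);
    atomic_init(&irp.touched, IRPLOCK_CANCELABLE);
    irp.free_count = 0;

    for (; *order != '\0'; order++) {
        if (*order == 'S') {
            cancel_owned = cancel_start(&irp);
        } else if (*order == 'E' && cancel_owned) {
            cancel_end(&irp);
        } else if (*order == 'C') {
            on_complete(&irp);
        }
    }
    return irp.free_count;
}
```

The strings "C", "SEC", "CS" and "SCE" correspond, in order, to the four rows of the state list. In each case simulate reports exactly one free, which is the property the lock word is meant to guarantee.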