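GPGPU stands for general-purpose computing on graphics processing units. The technique uses the graphics card in a personal computer as a general-purpose processor, much like a CPU, so that the card's computing power is not limited to graphics. Since 2009, computing on graphics cards has gradually become mainstream.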
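To improve efficiency, the host usually tells the GPU to start executing through a doorbell mechanism: the host CPU writes the doorbell register, located in PCIe BAR space, that corresponds to the command queue. A driver enqueues several requests into the ring buffer and then rings the doorbell once. The name comes from the doorbell of a house.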
In a push button analogy applied to computer systems, the term doorbell or doorbell interrupt is often used to describe a mechanism whereby a software system can signal or notify a computer hardware device that there is some work to be done. Typically, the software system will place data in some well-known and mutually agreed upon memory locations, and “ring the doorbell” by writing to a different memory location. This different memory location is often called the doorbell region, and there may even be multiple doorbells serving different purposes in this region. It is this act of writing to the doorbell region of memory that “rings the bell” and notifies the hardware device that the data are ready and waiting. The hardware device would now know that the data are valid and can be acted upon. It would typically write the data to a hard disk drive, or send them over a network, or encrypt them, etc. For GPGPU, ringing the doorbell tells the GPU to start executing the queued compute work.
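The enqueue-then-ring pattern can be sketched in plain C. This is a minimal software model, not driver code: `fake_device`, `host_queue`, `ring_enqueue`, `ring_doorbell` and `device_service` are hypothetical names, the doorbell is an ordinary struct field standing in for a register in a PCIe BAR, and the assumption that the doorbell carries the new write index is only one common convention.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u  /* number of slots; a power of two so wrap-around is a mask */

/* Hypothetical device model. On real hardware the ring would live in
 * DMA-visible memory and the doorbell would be a register in a PCIe BAR. */
struct fake_device {
    uint32_t ring[RING_SIZE];    /* command ring shared by host and device */
    volatile uint32_t doorbell;  /* host publishes its write index here */
    uint32_t rptr;               /* device-side read index */
    uint32_t processed;          /* commands consumed so far */
    uint32_t doorbell_writes;    /* how many times the doorbell was rung */
};

struct host_queue {
    uint32_t wptr;               /* host-side write index, never masked */
};

/* Enqueue one command without notifying the device.
 * Returns 0 on success, -1 if the ring is full. */
int ring_enqueue(struct fake_device *dev, struct host_queue *q, uint32_t cmd)
{
    assert(dev && q);
    if (q->wptr - dev->rptr >= RING_SIZE)
        return -1;
    dev->ring[q->wptr & (RING_SIZE - 1)] = cmd;
    q->wptr++;
    return 0;
}

/* Ring the doorbell once, publishing everything enqueued so far. */
void ring_doorbell(struct fake_device *dev, const struct host_queue *q)
{
    /* A real driver would need a write barrier before this store so the
     * device never sees the new index ahead of the ring contents. */
    dev->doorbell = q->wptr;
    dev->doorbell_writes++;
}

/* Device side: consume commands up to the index published by the doorbell.
 * Returns the sum of the consumed command words, as a stand-in for "work". */
uint32_t device_service(struct fake_device *dev)
{
    uint32_t sum = 0;
    while (dev->rptr != dev->doorbell) {
        sum += dev->ring[dev->rptr & (RING_SIZE - 1)];
        dev->rptr++;
        dev->processed++;
    }
    return sum;
}
```

In this model, three calls to `ring_enqueue` followed by a single `ring_doorbell` make all three commands visible to `device_service` at once, while the device sees nothing before the doorbell write. Batching this way costs one register write for several commands.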
The term doorbell interrupt is usually a misnomer. It is similar to an interrupt, because it causes some work to be done by the device; however, the doorbell region is sometimes implemented as a polled region, sometimes the doorbell region writes through to physical device registers, and sometimes the doorbell region is hardwired directly to physical device registers. When either writing through or directly to physical device registers, this may cause a real interrupt to occur at the device’s central processor unit (CPU), if it has one.
Doorbell interrupts can be compared to Message Signaled Interrupts, as they have some similarities.
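With a doorbell, the bell rings inside the house as soon as the button is pressed. Applied to hardware: the host writes to an agreed location (the button), and the client side, i.e. the device, reacts immediately (the bell rings).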
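1. For example, the AMDGPU ring buffer synchronization mechanism originally used MMIO accesses to shared registers and later switched to doorbells, which saves power and responds faster.
2. In addition, the doorbells are exposed through their own separate BAR that many IP blocks can use, which increases the number of doorbells available.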
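A region holding several doorbells, combined with the polled variant mentioned earlier, can be modeled as follows. This is only a sketch under stated assumptions: the slot count, the 32-bit slot width, and the names `doorbell_region`, `doorbell_offset`, `doorbell_write` and `doorbell_poll` are hypothetical and do not reflect the AMDGPU layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DOORBELLS 4u  /* assumed slot count, for illustration only */

/* Stand-in for a mapped doorbell BAR: one 32-bit slot per queue. */
struct doorbell_region {
    volatile uint32_t slot[NUM_DOORBELLS];
};

/* Byte offset of a doorbell slot inside the region (assumes 32-bit slots). */
size_t doorbell_offset(unsigned int index)
{
    assert(index < NUM_DOORBELLS);
    return (size_t)index * sizeof(uint32_t);
}

/* Host side: ring one queue's doorbell by writing its new write index. */
void doorbell_write(struct doorbell_region *db, unsigned int index, uint32_t wptr)
{
    assert(index < NUM_DOORBELLS);
    db->slot[index] = wptr;
}

/* Device side, polled variant: compare every slot with the last value seen
 * and return a bitmask of the queues whose doorbell has changed. */
uint32_t doorbell_poll(const struct doorbell_region *db,
                       uint32_t last_seen[NUM_DOORBELLS])
{
    uint32_t pending = 0;
    for (unsigned int i = 0; i < NUM_DOORBELLS; i++) {
        uint32_t now = db->slot[i];
        if (now != last_seen[i]) {
            pending |= 1u << i;
            last_seen[i] = now;
        }
    }
    return pending;
}
```

Because each queue owns its own slot, different engines can ring independently without sharing a register. The returned bitmask tells the polling side which queues to service.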
intel_gvt_init_workload_scheduler->kthread_run(workload_thread, engine, ...);
workload_thread->complete_current_workload->update_guest_context:
    vgpu_vreg_t(vgpu, RING_TAIL(ring_base)) = tail;
    vgpu_vreg_t(vgpu, RING_HEAD(ring_base)) = head;