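1. Non-blocking communication
(1) Non-blocking send
(a) The call returns almost immediately, so the process can move on to other work. It returns regardless of whether the data has been copied from the application buffer to the system buffer, or has reached the receiver's application buffer or system buffer.
(b) The MPI library chooses a suitable time to actually carry out the send.
(c) The sender should not modify the memory it has just sent from (the send buffer) right away, because that is unsafe. It is free to work on other memory.
(d) Before touching the send buffer again, the sender must confirm that the send has completed by calling a *wait* function such as MPI_Wait.
(e) A non-blocking send makes it possible to overlap computation with communication and exploit possible performance gains.
(2) Non-blocking receive (works on the same principle as the non-blocking send)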
2. Functions
MPI_Isend (&buf,count,datatype,dest,tag,comm,&request)
Identifies an area in memory to serve as a send buffer. Processing continues immediately without waiting for the message to be copied out from the application buffer. A communication request handle is returned for handling the pending message status. The program should not modify the application buffer until subsequent calls to MPI_Wait or MPI_Test indicate that the non-blocking send has completed.
MPI_Irecv (&buf,count,datatype,source,tag,comm,&request)
Identifies an area in memory to serve as a receive buffer. Processing continues immediately without actually waiting for the message to be received and copied into the application buffer. A communication request handle is returned for handling the pending message status. The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking receive operation completes and the requested message is available in the application buffer.
3. Example
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]){
    int totalNumTasks, rankID;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &totalNumTasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rankID);

    //get the host where this process is running
    int nameLength;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    MPI_Get_processor_name(processor_name, &nameLength);

    //neighbours in a ring: the first and last ranks wrap around
    int prevRankID = rankID - 1;
    int nextRankID = rankID + 1;
    if(rankID == 0)
        prevRankID = totalNumTasks - 1;
    if(rankID == (totalNumTasks - 1))
        nextRankID = 0;

    int count = 1;
    MPI_Request request[4];

    //pass 'R' to the right: receive from the previous rank, send to the next rank
    char recvBuf1;
    char sendBuf1 = 'R';
    int tag1 = 1;
    MPI_Irecv(&recvBuf1, count, MPI_CHAR, prevRankID, tag1, MPI_COMM_WORLD, &request[0]);
    MPI_Isend(&sendBuf1, count, MPI_CHAR, nextRankID, tag1, MPI_COMM_WORLD, &request[1]);

    //pass 'L' to the left: receive from the next rank, send to the previous rank
    char recvBuf2;
    char sendBuf2 = 'L';
    int tag2 = 2;
    MPI_Irecv(&recvBuf2, count, MPI_CHAR, nextRankID, tag2, MPI_COMM_WORLD, &request[2]);
    MPI_Isend(&sendBuf2, count, MPI_CHAR, prevRankID, tag2, MPI_COMM_WORLD, &request[3]);

    //After the non-blocking sends and receives are posted, the process can do other work,
    //as long as it does not modify the application buffers
    //(here: recvBuf1, sendBuf1, recvBuf2, sendBuf2).
    //This is where communication and computation can overlap.
    //Any other memory area is free to use.
    printf("My rankID = %d on Processor = %s, I can do something here.....\n", rankID, processor_name);

    //Once MPI_Waitall returns, the application buffers are safe to reuse
    MPI_Status status[4];
    MPI_Waitall(4, request, status);

    printf("My rankID = %d, recvBuf1 = %c && source = %d && tag = %d\n", rankID, recvBuf1, status[0].MPI_SOURCE, status[0].MPI_TAG);
    printf("My rankID = %d, recvBuf2 = %c && source = %d && tag = %d\n", rankID, recvBuf2, status[2].MPI_SOURCE, status[2].MPI_TAG);
    printf("My rankID = %d, Now, my application buffer is safe to reuse.\n", rankID);

    MPI_Finalize();
    return 0;
}
4. Compile and run
[amao@amao991 mpi-study]$ mpicc p2pNonBlockingOnWhichProcessor.c
[amao@amao991 mpi-study]$ mpiexec -n 3 -f machinefile ./a.out
My rankID = 0 on Processor = amao991, I can do something here.....
My rankID = 2 on Processor = amao992, I can do something here.....
My rankID = 1 on Processor = amao991, I can do something here.....
My rankID = 0, recvBuf1 = R && source = 2 && tag = 1
My rankID = 0, recvBuf2 = L && source = 1 && tag = 2
My rankID = 0, Now, my application buffer is safe to reuse.
My rankID = 1, recvBuf1 = R && source = 0 && tag = 1
My rankID = 1, recvBuf2 = L && source = 2 && tag = 2
My rankID = 1, Now, my application buffer is safe to reuse.
My rankID = 2, recvBuf1 = R && source = 1 && tag = 1
My rankID = 2, recvBuf2 = L && source = 0 && tag = 2
My rankID = 2, Now, my application buffer is safe to reuse.
5. Summary