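#include <cmath>    // std::sqrt
#include <cstring>  // std::memset

// Marks composite[i], composite[i+stride], ... up to and including limit.
// Returns the first index past limit, which has not been marked.
inline long Strike(bool composite[], long i, long stride, long limit) {
    for (; i <= limit; i += stride)
        composite[i] = true;
    return i;
}

long CacheUnfriendlySieve(long n) {
    long count = 0;
    long m = (long)std::sqrt((double)n);
    bool* composite = new bool[n + 1];
    std::memset(composite, 0, (n + 1) * sizeof(bool));  // clear all n+1 entries
    for (long i = 2; i <= m; ++i) {
        if (!composite[i]) {
            ++count;
            // Strike walks the whole array of size n here.
            Strike(composite, 2 * i, i, n);
        }
    }
    for (long i = m + 1; i <= n; ++i) {
        if (!composite[i])
            ++count;
    }
    delete[] composite;
    return count;
}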
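The outer loop finds the primes, and the inner loop inside Strike strikes out their multiples, which are the composites. Every call to Strike walks the whole array up to n, so for large n the array is pulled through the cache once per prime.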
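To make the access pattern concrete, here is a minimal sketch that records which entries a single striking pass touches. StrikeTraced and TouchedBy are hypothetical helpers introduced only for this illustration, and the inclusive bound (i <= limit) is an assumption about the intended behavior of Strike.

```cpp
#include <cassert>
#include <vector>

// Same loop as Strike, but it also records every index it marks.
// Hypothetical helper, for illustration only.
long StrikeTraced(std::vector<bool>& composite, long i, long stride,
                  long limit, std::vector<long>& touched) {
    assert(stride > 0);
    for (; i <= limit; i += stride) {
        composite[i] = true;
        touched.push_back(i);
    }
    return i;  // first multiple past limit
}

// Returns the indices marked while striking the multiples of p in [2p, n].
std::vector<long> TouchedBy(long p, long n) {
    std::vector<bool> composite(n + 1, false);
    std::vector<long> touched;
    StrikeTraced(composite, 2 * p, p, n, touched);
    return touched;
}
```

For n = 20, TouchedBy(3, 20) yields 6, 9, 12, 15, 18: the pass for a single prime already spans the whole array, and every later prime makes the same trip again.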
After restructuring the program:
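#include <algorithm>  // std::min

// Uses Strike from the first listing.
long CacheFriendlySieve(long n) {
    long count = 0;
    long m = (long)std::sqrt((double)n);
    bool* composite = new bool[n + 1];
    std::memset(composite, 0, (n + 1) * sizeof(bool));
    long* factor = new long[m];    // primes <= m
    long* striker = new long[m];   // next multiple of factor[k] not yet struck
    long n_factor = 0;
    for (long i = 2; i <= m; ++i) {
        if (!composite[i]) {
            ++count;
            striker[n_factor] = Strike(composite, 2 * i, i, m);
            factor[n_factor++] = i;
        }
    }
    // Chop the rest of the sieve into windows of about sqrt(n) entries.
    for (long window = m + 1; window <= n; window += m) {
        long limit = std::min(window + m - 1, n);
        for (long k = 0; k < n_factor; ++k) {
            // Strike only walks the current window here.
            striker[k] = Strike(composite, striker[k], factor[k], limit);
        }
        // Primes found inside a window are counted but not added to factor:
        // every composite <= n has a prime factor <= m, and factor and
        // striker have room for only m entries.
        for (long i = window; i <= limit; ++i) {
            if (!composite[i])
                ++count;
        }
    }
    delete[] striker;
    delete[] factor;
    delete[] composite;
    return count;
}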
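In the code above, the factor array holds the primes up to m = sqrt(n). The striker array holds, for each of those primes, the next multiple that has not been struck yet. The range above m is divided into windows of about sqrt(n) entries. In each window, every factor resumes striking from its saved striker position, and the position it stops at is saved for the next window. Primes discovered inside a window do not need to be appended to factor: every composite up to n has a prime factor no larger than sqrt(n), and appending would overrun the arrays, which hold only m entries. Because there are few primes up to sqrt(n) and each window is small, the current window and the front of the factor array are reused heavily while they are still in cache, so cache efficiency is high.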
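The way a striker position carries over from one window to the next can be sketched by following a single factor through the windows. TraceStriker is a hypothetical helper, not part of the original program. It assumes the same window layout as CacheFriendlySieve (first window starts at m + 1, each window spans m entries) and only advances the position without marking anything.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: follows one factor p (a prime <= sqrt(n)) through
// the windows and records the value that would be saved in its striker
// slot after the initial pass and after each window.
std::vector<long> TraceStriker(long n, long p) {
    long m = (long)std::sqrt((double)n);
    assert(p >= 2 && p <= m);
    std::vector<long> saved;
    // Initial pass over [2, m]: step past the multiples of p up to m.
    long striker = 2 * p;
    while (striker <= m) striker += p;
    saved.push_back(striker);
    // One entry per window of about sqrt(n) numbers.
    for (long window = m + 1; window <= n; window += m) {
        long limit = std::min(window + m - 1, n);
        while (striker <= limit) striker += p;  // multiples inside this window
        saved.push_back(striker);               // first multiple past the window
    }
    return saved;
}
```

For n = 100 and p = 3, the saved values are 12, 21, 33, 42, 51, 63, 72, 81, 93, 102. Each one is the first multiple of 3 past the end of the range just processed, which is exactly where the next window picks up.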