faiss-index

basic-index

faiss库包含线性检索方法（BLAS库优化）、哈希方法的实现（LSH）及矢量量化方法的实现（PQ、IVFPQ）。faiss库的一大优势是支持索引的动态增删。
这里有一个关于大数据量下计算最近邻的工具调研： http://wiki.baidu.com/pages/viewpage.action?pageId=479989012
所有的索引都可以通过index_factory统一api通过传字符串参数来构建

Method	Class name	index_factory	Main parameters	Bytes/vector	Exhaustive	Comments
Exact Search for L2	IndexFlatL2	"Flat"	d	4*d	yes	brute-force
Exact Search for Inner Product	IndexFlatIP	"Flat"	d	4*d	yes	also for cosine (normalize vectors beforehand)
Hierarchical Navigable Small World graph exploration	IndexHNSWFlat	'HNSWx,Flat`	d, M	4d + 8 M	no
Inverted file with exact post-verification	IndexIVFFlat	"IVFx,Flat"	quantizer, d, nlists, metric	4*d	no	Take another index to assign vectors to inverted lists
Locality-Sensitive Hashing (binary flat index)	IndexLSH	-	d, nbits	nbits/8	yes	optimized by using random rotation instead of random projections
Scalar quantizer (SQ) in flat mode	IndexScalarQuantizer	"SQ8"	d	d	yes	4 bit per component is also implemented, but the impact on accuracy may be inacceptable
Product quantizer (PQ) in flat mode	IndexPQ	"PQx"	d, M, nbits	M (if nbits=8)	yes
IVF and scalar quantizer	IndexIVFScalarQuantizer	"IVFx,SQ4" "IVFx,SQ8"	quantizer, d, nlists, qtype	d or d/2	no	there are 2 encodings: 4 bit per dimension and 8 bit per dimension
IVFADC (coarse quantizer+PQ on residuals)	IndexIVFPQ	"IVFx,PQy"	quantizer, d, nlists, M, nbits	M+4 or M+8	no	the memory cost depends on the data type used to represent ids (int or long), currently supports only nbits <= 8
IVFADC+R (same as IVFADC with re-ranking based on codes)	IndexIVFPQR	"IVFx,PQy+z"	quantizer, d, nlists, M, nbits, M_refine, nbits_refine	M+M_refine+4 or M+M_refine+8	no

composite-index

如上面的例子，构建索引可以先做L2或者其它量化相似度的索引方式，然后PQ或者Flat

coarse_quantizer = faiss.IndexFlatL2 (d)
index = faiss.IndexIVFPQ (coarse_quantizer, d,
                          ncentroids, code_size, 8)
index.nprobe = 5

nbits_mi = 12  # c
M_mi = 2       # m
coarse_quantizer_mi = faiss.MultiIndexQuantizer(d, M_mi, nbits_mi)
ncentroids_mi = 2 ** (M_mi * nbits_mi)

index = faiss.IndexIVFFlat(coarse_quantizer_mi, d, ncentroids_mi)
index.nprobe = 2048
index.quantizer_trains_alone = True

MultiIndexQuantizer 相比 IndexFlat fast/low-precision

faiss-index

basic-index

composite-index

你可能感兴趣的:(faiss-index)