Getting a document's terms and term frequencies with the Xapian extension in PHP


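// Collect the stemmed terms of $doc (a XapianDocument) with their frequencies.
$terms = array();
$prefix = 'Z'; // Xapian's TermGenerator prefixes stemmed terms with 'Z'
for ($termi = $doc->termlist_begin(); !$termi->equals($doc->termlist_end()); $termi->next()) {
    $term = array(
        'wdf'  => $termi->get_wdf(),      // within-document frequency
        'freq' => $termi->get_termfreq(), // number of documents indexed by the term
        'name' => $termi->get_term(),
    );
    // Keep only prefixed terms and strip the prefix from the name.
    if ($term['name'][0] === $prefix) {
        $term['name'] = substr($term['name'], 1);
        $terms[] = $term;
    }
}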
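
Once the loop has filled `$terms`, a common next step is to rank the entries by how often they occur in the document. The following is a minimal sketch in plain PHP that does not call Xapian at all. The helper name `top_terms_by_wdf` and the sample data are made up for illustration; only the array shape (`wdf`, `freq`, `name`) comes from the loop above.

```php
<?php
// Hypothetical helper (not part of Xapian): return the $n entries
// with the highest within-document frequency.
function top_terms_by_wdf(array $terms, int $n): array
{
    // usort() sorts the local copy in place; swapping the operands of
    // the spaceship operator yields descending order.
    usort($terms, function (array $a, array $b): int {
        return $b['wdf'] <=> $a['wdf'];
    });
    return array_slice($terms, 0, $n);
}

// Made-up sample data in the same shape as the $terms array.
$sample = array(
    array('wdf' => 2, 'freq' => 40, 'name' => 'index'),
    array('wdf' => 5, 'freq' => 12, 'name' => 'search'),
    array('wdf' => 1, 'freq' => 7,  'name' => 'xapian'),
);

// Print name, wdf and freq for the two most frequent terms.
foreach (top_terms_by_wdf($sample, 2) as $t) {
    echo $t['name'], "\t", $t['wdf'], "\t", $t['freq'], "\n";
}
```

In real use you would pass the `$terms` array in place of `$sample`. Sorting by `freq` instead would rank the terms by how many documents in the database contain them.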
