点云滤波--LoOP Local Outlier Probabilities一种基于概率的异常值检测方法

LoOP Local Outlier Probabilities
此方法为lof的改进版本:
lof有一个显著的缺点:即最终的得分不能判断异常值的程度,loOp将异常值表示在[0,1]的范围内,可以根据概率对异常值判断。

原理

相比于欧氏距离,引入了概率距离这一概念,相对于点 o o o而言,设其邻域集合为 S S S P [ d ( o , s ) \mathcal{P}[d(o, s) P[d(o,s)为点 o o o到近邻点的距离,概率距离 pdist ⁡ ( o , S ) \operatorname{pdist}(o, S) pdist(o,S)定义如下:
∀ s ∈ S : P [ d ( o , s ) ≤ pdist ⁡ ( o , S ) ] ≥ φ (1) \forall s \in S: \quad \mathcal{P}[d(o, s) \leq \operatorname{pdist}(o, S)] \geq \varphi\tag{1} sS:P[d(o,s)pdist(o,S)]φ(1)
简单地解释,只有大多数近邻点的距离在概率距离范围之内,也就是说不在概率距离范围之内的点就是异常值。
如果使用 λ = 2 ⋅ erf ⁡ − 1 ( φ ) \lambda=\sqrt{2} \cdot \operatorname{erf}^{-1}(\varphi) λ=2 erf1(φ)代替 λ \lambda λ,erf为高斯误差函数。将离群值定义为偏离平均值超过给定 λ 乘以标准差 σ 的对象(类似sor滤波)。根据3sigma法则:λ = 1 ⇔ φ ≈ 68%, λ = 2 ⇔ φ ≈ 95%, and λ = 3 ⇔ φ ≈ 99.7%。
假设o是S的中心,并且s ∈ S到o的距离集合近似为半高斯分布,则可以计算出S中物体到o的标准距离,其类似于标准差:
σ ( o , S ) = ∑ s ∈ S d ( o , s ) 2 ∣ S ∣ (2) \sigma(o, S)=\sqrt{\frac{\sum_{s \in S} d(o, s)^2}{|S|}}\tag{2} σ(o,S)=SsSd(o,s)2 (2)
注意这里的标准距离和标准差不太一样(没有使用距离减去距离的均值)。我们不考虑在全局范围内距离呈正态分布,而是假设在集合 S S S呈正态分布。因此,S 以 o 为中心的假设通常是合理的。基于这些考虑,我们将 o 到 S 的显着性 λ 的概率集合距离定义为:
pdist ⁡ ( λ , o , S ) : = λ ⋅ σ ( o , S ) (3) \operatorname{pdist}(\lambda, o, S):=\lambda \cdot \sigma(o, S)\tag{3} pdist(λ,o,S):=λσ(o,S)(3)
异常值的局部概率建模基于以下假设 (i.)以近邻集合 S S S为基础进行的(ii)距离符合正半部分的正态分布(x>0)。对于假设合理性的说明省略(见原文)。
Probabilistic Local Outlier Factor (PLOF)可以定义如下:
PLOF ⁡ λ , S ( o ) : = pdist ⁡ ( λ , o , S ( o ) ) E s ∈ S ( o ) [ pdist ⁡ ( λ , s , S ( s ) ) ] − 1 (4) \operatorname{PLOF}_{\lambda, S}(o):=\frac{\operatorname{pdist}(\lambda, o, S(o))}{E_{s \in S(o)}[\operatorname{pdist}(\lambda, s, S(s))]}-1\tag{4} PLOFλ,S(o):=EsS(o)[pdist(λ,s,S(s))]pdist(λ,o,S(o))1(4)
E ( ) E() E()表示期望。
对上式进行归一化:
n P L O F : = λ ⋅ E [ ( P L O F ) 2 ] (5) \mathrm{nPLOF}:=\lambda \cdot \sqrt{E\left[(\mathrm{PLOF})^2\right]}\tag{5} nPLOF:=λE[(PLOF)2] (5)
该值可以看作是 PLOF 值的一种标准差,即假设平均值为 0 的 λ·Stddev(PLOF)。
为了将尚未归一化的 PLOF 值转换为概率值,我们假设这些值在 1 附近呈正态分布,标准差为 nPLOF。然后,我们应用高斯误差函数来获得概率值,即局部离群点概率 (LoOP),表示点 o ∈ D 是离群点的概率:
LoOP ⁡ S ( o ) : = max ⁡ { 0 , erf ⁡ ( PLOF ⁡ λ , S ( o ) nPLOF ⁡ ⋅ 2 ) } (6) \operatorname{LoOP}_S(o):=\max \left\{0, \operatorname{erf}\left(\frac{\operatorname{PLOF}_{\lambda, S}(o)}{\operatorname{nPLOF} \cdot \sqrt{2}}\right)\right\}\tag{6} LoOPS(o):=max{0,erf(nPLOF2 PLOFλ,S(o))}(6)

codes



#include 
#include
#include
#include
#include
int main()
{
	using pointcloud = pcl::PointCloud<pcl::PointXYZ>;
	//params,values are from paper
	const double lambda = 3.0;
	const int k = 20;
	pointcloud cloud_in{};

	pcl::io::loadPCDFile("D:/scaled_cloud.pcd", cloud_in);
	std::cout << "raw point size:" << cloud_in.size() << std::endl;
	pcl::KdTreeFLANN<pcl::PointXYZ>kdtree{};
	kdtree.setInputCloud(cloud_in.makeShared());
	//Local Outlier Probabilities
	std::vector<double>loop(cloud_in.size(), 0.0);
	//1 compute the square dist between query point and nearest point
	std::vector<float>nearest_square_dist(cloud_in.size(), 0.0);
	//the k neighbor about query point
	std::vector<std::vector<int>> point_neighbor_indices(cloud_in.size(), std::vector<int>(k,0));
	for (size_t i = 0; i < cloud_in.size(); i++)
	{
		std::vector<int>indices{};
		std::vector<float>dist{};
		if (kdtree.nearestKSearch(cloud_in.points[i], k, indices, dist)>1)
		{
			//indices[0] is the query point index
			nearest_square_dist[i] = dist[1];
			point_neighbor_indices[i] = indices;
		}
	}
	//2 compute the standard dist about every point
	std::vector<double>std_dist(cloud_in.size(), 0.0);
	//3probabilistic  set dist
	std::vector<double>p_dist(cloud_in.size(), 0.0);
	for (size_t i = 0; i < cloud_in.size(); i++)
	{
		float sqr_sum = 0;
		//sum of d(o,s)^2
		for (size_t j = 0; j < point_neighbor_indices[i].size(); j++)
		{
			sqr_sum += nearest_square_dist[point_neighbor_indices[i][j]];
		}
		
		std_dist[i] = std::sqrt(sqr_sum / float(k));
		p_dist[i] = lambda * std_dist[i];
	}
	//4probabilistic lof
	std::vector<float>plof(cloud_in.size(), 0.0);
	for (size_t i = 0; i < cloud_in.size(); i++)
	{
		float sqr_sum = 0;
		//expected value of pdist
		for (size_t j = 0; j < point_neighbor_indices[i].size(); j++)
		{
			sqr_sum += p_dist[point_neighbor_indices[i][j]];
		}
		float ev_pdist = sqr_sum / float(k);
		plof[i] = (p_dist[i] / ev_pdist) - 1.0;

	
	}
	//4normalization plof
	//
	std::vector<float>nplof(cloud_in.size(), 0.0);
	for (size_t i = 0; i < cloud_in.size(); i++)
	{
		float sqr_sum = 0;
	
		for (size_t j = 0; j < point_neighbor_indices[i].size(); j++)
		{
			sqr_sum +=std::pow( plof[point_neighbor_indices[i][j]],2);
		}
		//expected value of square plof
		float ev_plof_2 = sqr_sum / float(k);
		nplof[i] = lambda * std::sqrt(ev_plof_2);


	}
	//5loOp
	std::vector<double>loOp(cloud_in.size(), 0.0);
	for (size_t i = 0; i < cloud_in.size(); i++)
	{
		double loOp1 = std::erf(plof[i] / (nplof[i] * std::sqrt(2)));
		loOp[i] = loOp1 > 0.0 ? loOp1 : 0.0;
	}
	pointcloud cloud_out{};
	std::ofstream out("./noise_cloud.xyz");

	for (int i = 0; i < loOp.size(); ++i)
	{
		if (loOp[i]<0.5)
		{
			cloud_out.push_back(cloud_in.points[i]);

		}
		else
		{
			out << loOp[i] << " " << cloud_in.points[i].x << " " << cloud_in.points[i].y << " " << cloud_in.points[i].z << "\n";

		}
	}
	out.close();

	std::cout << "filtered point size:" << cloud_out.size() << std::endl;

	cloud_out.width = cloud_out.size();
	cloud_out.height = 1;
	cloud_out.resize(cloud_out.width*cloud_out.height);
	pcl::io::savePCDFileBinary("filtered_cloud.pcd", cloud_out);

	
    std::cout << "Hello World!\n"; 
}



原始点云:
点云滤波--LoOP Local Outlier Probabilities一种基于概率的异常值检测方法_第1张图片
过滤后的点云:
点云滤波--LoOP Local Outlier Probabilities一种基于概率的异常值检测方法_第2张图片

你可能感兴趣的:(点云,点云滤波,机器学习,人工智能,算法)