Junk-Mail Filter (union-find)

Recognizing junk mails is a tough task. The method used here consists of two steps:

  1. Extract the common characteristics from the incoming email.
  2. Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Input
There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.
Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0
Sample Output
Case #1: 3
Case #2: 2

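This problem involves deleting nodes from a union-find structure. In short: given n points, M merges the two points that follow it into one class, and S splits the point that follows it off on its own. You cannot simply modify the one point being deleted. If 1, 2 and 3 are in one class and 2 and 3 both point to 1, then deleting 1 by changing only 1 is not enough, because 2 and 3 would have to be changed as well. This is where union-find deletion with virtual nodes comes in: start with 1, 2 and 3 all pointing to 4; to delete 1, point 1 at 5 while 2 and 3 keep pointing to 4, so the relationships among the remaining points stay unchanged.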
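To make the failure concrete, here is a minimal sketch of a plain union-find in which 2 and 3 point to 1. The names `NaiveDsu` and `naiveRemove` are hypothetical and only illustrate the naive idea; they are not taken from the accepted code.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Plain union-find without virtual nodes: elements point at each other.
struct NaiveDsu {
    std::vector<int> parent;

    explicit NaiveDsu(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);  // every element starts as its own root
    }

    int find(int x) {
        assert(0 <= x && x < static_cast<int>(parent.size()));
        return parent[x] == x ? x : parent[x] = find(parent[x]);
    }

    void unite(int x, int y) {
        int fx = find(x), fy = find(y);
        if (fx != fy) parent[fy] = fx;  // same direction as merge() in the post
    }

    // Hypothetical naive deletion: only x's own pointer is touched.
    void naiveRemove(int x) { parent[x] = x; }
};
```

After `unite(1, 2)` and `unite(1, 3)`, calling `naiveRemove(1)` changes nothing useful: 2 and 3 still resolve to 1, so 1 is not isolated.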

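In an ordinary union-find the elements point back and forth at each other, and that traditional layout cannot handle deletion. The virtual-node method used here makes the elements point uniformly at designated (virtual) elements instead, so all numbers that point to the same element belong to one set. To delete a node, only that node's pre value (its parent pointer) has to be split off on its own, which removes the node without affecting the relationships among the other elements.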
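The virtual-node idea can be sketched as follows. This is an illustration under assumptions, not the accepted code: the struct name `DeletableDsu` and its members `unite`, `detach` and `countSets` are made up here, real element i is assumed to start under virtual node i + n, and `find` uses iterative path halving rather than the recursive version in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct DeletableDsu {
    int n;                    // number of real elements, ids 0..n-1
    std::vector<int> parent;  // parent pointers for real and virtual nodes

    explicit DeletableDsu(int n_) : n(n_), parent(2 * n_) {
        for (int i = 0; i < n; ++i) {
            parent[i] = i + n;      // real element i hangs under virtual node i + n
            parent[i + n] = i + n;  // every virtual node starts as a root
        }
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // Only virtual roots are linked, so real elements always stay leaves.
    void unite(int x, int y) {
        int fx = find(x), fy = find(y);
        if (fx != fy) parent[fy] = fx;
    }

    // Deleting x: re-point x at a brand-new virtual root; nobody points at x.
    void detach(int x) {
        assert(0 <= x && x < n);
        int fresh = static_cast<int>(parent.size());
        parent.push_back(fresh);
        parent[x] = fresh;
    }

    // Number of distinct sets among the real elements.
    int countSets() {
        std::vector<int> roots(n);
        for (int i = 0; i < n; ++i) roots[i] = find(i);
        std::sort(roots.begin(), roots.end());
        return static_cast<int>(std::unique(roots.begin(), roots.end()) - roots.begin());
    }
};
```

Because no node ever points at a real element, `detach` never has to fix up anyone else's pointer, which is the whole point of the extra layer.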

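One thing worth mentioning: a mysterious problem came up while solving this. With arrays of size 1000050, the memset only works at the second commented position: placed at the upper position the submission gets WA, placed at the lower one it gets AC. With arrays of size 2000050, both positions get AC. I spent a long time on this and never figured out why.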
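One plausible explanation, offered as an assumption because the rest of the listing is not visible: the visible code starts the fresh-node counter at cnt = n + n and del() uses up one id per S operation, so ids can reach 2N + M - 1. The small helper below (the name `requiredSlots` is made up for this sketch) works out that bound for the stated limits.

```cpp
#include <cassert>

// Slots needed if fresh ids start at 2 * n and every one of the m
// operations may be an "S" that consumes a new id (assumed layout).
constexpr long long requiredSlots(long long n, long long m) {
    return 2 * n + m;
}

// Worst case from the problem statement: N = 10^5, M = 10^6.
static_assert(requiredSlots(100000, 1000000) == 1200000,
              "worst case needs 1,200,000 slots");
static_assert(requiredSlots(100000, 1000000) > 1000050,
              "an array of 1000050 ints is too small");
static_assert(requiredSlots(100000, 1000000) <= 2000050,
              "an array of 2000050 ints is large enough");
```

Under that assumption an array of 1000050 entries can be indexed out of bounds, which is undefined behavior. Writes past the end of num may land in whatever is stored next to it, so results that change with the position of the memset would not be surprising. An array of 2000050 entries stays within bounds.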

Accepted (AC) code

#include <cstdio>   // scanf
#include <cstring>  // memset
using namespace std;
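int k=1,cnt;       // cnt: id of the next fresh node handed out by del()
int n,m;
int num[2000050];  // num[x]: parent pointer of node x
int vis[2000050];
// Return the root of x, compressing the path along the way.
int find(int x)
{
	if(num[x]==x)
		return x;
	else
		return num[x]=find(num[x]);
}
// Put x and y into the same set by linking their roots.
void merge(int x,int y)
{
	int fx=find(x);
	int fy=find(y);
	if(fx!=fy)
		num[fy]=fx;
}
// Isolate x by pointing it at a fresh node; other nodes are untouched.
void del(int x)
{
	num[x]=cnt;
	cnt++;
}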
int main()
{
	while(scanf("%d %d",&n,&m)!=EOF&&(m||n))
	{
		
		cnt=n+n; 
		memset(vis,0,sizeof(vis));
		// With arrays of size 1000050, memset placed here is not accepted, but placed below it is; with 2000050 both positions are accepted
		for(int i=0;i
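The listing breaks off at this point in the source. The following is not the author's remaining code, only a minimal sketch of how a complete solution along the same lines might look. The function name `solveAll`, the 2N + M sizing, and the choice to have real element i start under virtual node i + n are all assumptions; the function reads from a stream and returns the output as a string so it can be checked against the sample.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads test cases until "0 0" and returns the "Case #k: answer" lines.
std::string solveAll(std::istream& in) {
    std::ostringstream out;
    int n, m;
    for (int caseNo = 1; (in >> n >> m) && (n || m); ++caseNo) {
        // Assumed layout: 0..n-1 real, n..2n-1 initial virtual roots,
        // 2n.. fresh roots, at most one per "S" operation.
        std::vector<int> parent(2 * n + m);
        for (int i = 0; i < n; ++i) parent[i] = i + n;
        for (int i = n; i < 2 * n + m; ++i) parent[i] = i;
        int next = 2 * n;  // id of the next unused virtual root

        auto find = [&](int x) {
            while (parent[x] != x) {
                parent[x] = parent[parent[x]];  // path halving
                x = parent[x];
            }
            return x;
        };

        for (int i = 0; i < m; ++i) {
            char op;
            int x, y;
            in >> op >> x;
            assert(0 <= x && x < n);
            if (op == 'M') {
                in >> y;
                assert(0 <= y && y < n);
                int fx = find(x), fy = find(y);
                if (fx != fy) parent[fy] = fx;
            } else {  // 'S': isolate x under a fresh root
                parent[x] = next++;
            }
        }

        // Count distinct roots among the real elements.
        std::vector<int> roots(n);
        for (int i = 0; i < n; ++i) roots[i] = find(i);
        std::sort(roots.begin(), roots.end());
        long long groups = std::unique(roots.begin(), roots.end()) - roots.begin();
        out << "Case #" << caseNo << ": " << groups << "\n";
    }
    return out.str();
}
```

Feeding it the sample input through a `std::istringstream` gives the two lines of the sample output. A per-case `std::vector` is used here instead of global arrays plus memset, which keeps indexing within bounds for the stated limits.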
