1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.
We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:
a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.
b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.
Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each in one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
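In short: there are n points, numbered 0 to n-1, to be grouped into classes, and two operations are provided:
1. M X Y: put X and Y into the same class.
2. S X: remove X from its current class so that it forms a class of its own.
The question is how many classes remain after a sequence of such operations.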
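To make the semantics of the two operations concrete, here is a brute-force reference sketch. It is not the intended solution (each merge costs O(N)), and the names NaiveClasses, merge, separate and count are illustrative assumptions, not part of the original code.

```cpp
#include <set>
#include <vector>

// Brute-force reference: label[i] is the class label of email i.
// All names here are hypothetical and exist only for illustration.
struct NaiveClasses {
    std::vector<int> label;
    int next_label;  // next label that no email has used yet

    explicit NaiveClasses(int n) : label(n), next_label(n) {
        for (int i = 0; i < n; ++i) label[i] = i;  // every email starts alone
    }

    // "M x y": relabel everything in x's class with y's label, O(N).
    void merge(int x, int y) {
        int from = label[x], to = label[y];
        if (from == to) return;
        for (int &l : label)
            if (l == from) l = to;
    }

    // "S x": give x a fresh label that nobody else carries.
    void separate(int x) { label[x] = next_label++; }

    // Number of distinct classes currently present.
    int count() const {
        return (int)std::set<int>(label.begin(), label.end()).size();
    }
};
```

A structure like this is handy for cross-checking a faster solution on small random inputs.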
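A standard union-find (disjoint-set) structure supports only merging, not deletion, but this problem requires removing a point. A point cannot simply be deleted: if it happens to be the root of a set that contains more than one element, what happens to its children once the root is gone?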
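The difficulty can be seen in a small sketch of a plain union-find with a tempting but incorrect removal that just resets father[x]=x. The struct PlainDSU and the method naiveRemove are hypothetical names used only for this illustration.

```cpp
#include <numeric>
#include <vector>

// Plain union-find without real deletion support (illustrative names).
struct PlainDSU {
    std::vector<int> father;

    explicit PlainDSU(int n) : father(n) {
        std::iota(father.begin(), father.end(), 0);  // father[i] = i
    }

    // Root of x, with path compression.
    int find(int x) { return father[x] == x ? x : father[x] = find(father[x]); }

    // Attach the root of x's set under the root of y's set.
    void unite(int x, int y) { father[find(x)] = find(y); }

    // Tempting but wrong: detaches x from its parent only.
    // Nodes that point at x still reach it, so they are not handled correctly.
    void naiveRemove(int x) { father[x] = x; }
};
```

If 4 is attached under root 3 and 3 is then "removed" this way, find(4) still returns 3, so the two remain in one set. If instead a middle node is removed, every node hanging below it is dragged out of the set along with it.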
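The trick is not to delete the point at all. Suppose node 3 is to be removed: allocate a fresh slot with index n++ and let it stand for node 3 from now on. Node 3 has in effect moved to a new position under a new index; the old slot 3 still exists but no longer represents anything and acts only as a virtual placeholder. An array id[i] records the slot currently used by node i. An ordinary union-find does not need this array, because node i always lives in slot i. To delete a point, give it a new slot with id[i]=n++ and set father[id[i]]=id[i]. The point is now a class of its own, and the structure of the set it used to belong to is unchanged.
To count the classes at the end, an ordinary union-find checks whether father[i]==i. Here every node must instead be resolved with find(id[i]); testing father[id[i]]==id[i] is not enough. For example, let 4 and 3 be in one set with 3 as the root, and then separate 3, so that id[3]=n and father[n]=n. With the test father[id[i]]==id[i], at i=4 we have id[4]=4 and father[4]==3!=4, so 4 is assumed to belong to a class that is counted elsewhere and is skipped. It is true that slot 4 still hangs under slot 3, but slot 3 is now virtual because node 3 has a new id, so nobody ever counts that class and the result is too small.
With find(id[i]), that is find(4), the root found is 3, and father[3]==3 is still intact, so 4 is counted as a class of its own, represented by slot 3. For node 3, id[3]=n and find(n) returns n because father[n]==n, so it is counted as a separate class as well.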
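The example with nodes 3 and 4 can be reproduced with a minimal sketch that offers both ways of counting. It uses std::vector and std::set instead of the fixed arrays of the solution, and the names DeletableDSU, countByRootCheck and countByFind are assumptions made for this illustration.

```cpp
#include <set>
#include <vector>

// Union-find with "virtual node" deletion (sketch, illustrative names).
struct DeletableDSU {
    int n;                    // number of real elements
    std::vector<int> id;      // id[i]: slot currently representing element i
    std::vector<int> father;  // parent pointers over slots

    explicit DeletableDSU(int n_) : n(n_), id(n_), father(n_) {
        for (int i = 0; i < n; ++i) id[i] = father[i] = i;
    }

    // Root of slot x, with path compression.
    int find(int x) { return father[x] == x ? x : father[x] = find(father[x]); }

    // Merge the classes of elements x and y.
    void unite(int x, int y) { father[find(id[x])] = find(id[y]); }

    // Move element x to a brand-new slot; the old slot stays as a placeholder.
    void separate(int x) {
        id[x] = (int)father.size();
        father.push_back(id[x]);
    }

    // Incorrect count: only checks whether an element's own slot is a root.
    int countByRootCheck() const {
        int c = 0;
        for (int i = 0; i < n; ++i)
            if (father[id[i]] == id[i]) ++c;
        return c;
    }

    // Correct count: distinct roots reached from the live slots.
    int countByFind() {
        std::set<int> roots;
        for (int i = 0; i < n; ++i) roots.insert(find(id[i]));
        return (int)roots.size();
    }
};
```

With five elements, after unite(4, 3) followed by separate(3), countByRootCheck() reports 4 classes while countByFind() reports the expected 5.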
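Because every deletion allocates a new slot, father[] needs room for N plus the number of S operations, which is at most N + M = 1.1 × 10^6 entries; twice the maximum N would not be enough when there are more than N deletions. The code below allocates 2000010 entries for both father[] and id[], although id[] is only ever indexed by the original node numbers.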
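#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

int N,M,ID;                       // ID: next unused slot for a separated node
int id[2000010],father[2000010];  // id[i]: slot currently representing node i
bool vis[2000010];                // vis[r]: root r has already been counted

// Return the root of slot x, compressing the path on the way.
int find(int x)
{
    if(x==father[x])
        return x;
    else
        return father[x]=find(father[x]);
}

// Merge the sets containing slots x and y.
void Union(int x,int y)
{
    x=find(x);
    y=find(y);
    if(x==y)return;
    else father[x]=y;
}

// Separate node x: move it to a fresh slot that is its own root.
// The old slot stays in place as a virtual node.
void del(int x)
{
    id[x]=ID;
    father[id[x]]=ID++;
}

int main()
{
    int cases=0;
    while(scanf("%d%d",&N,&M)!=EOF&&(N||M))
    {
        int cnt=0,x,y;
        ID=N;
        char cmd;
        for(int i=0;i<N;i++){
            id[i]=i;father[i]=i;
        }
        while(M--)
        {
            cin>>cmd;
            if(cmd=='M')
            {
                cin>>x>>y;
                Union(id[x],id[y]);
            }
            else
            {
                cin>>x;
                del(x);
            }
        }
        // Count the distinct roots reachable from the live slots.
        memset(vis,0,ID*sizeof(bool));
        for(int i=0;i<N;i++){
            x=find(id[i]);
            if(!vis[x])
            {
                cnt++;
                vis[x]=1;
            }
        }
        cout<<"Case #"<<++cases<<": "<<cnt<<endl;
    }
    return 0;
}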