HDU - 2473 - Junk-Mail Filter (Union-Find)

Recognizing junk mails is a tough task. The method used here consists of two steps:

  1. Extract the common characteristics from the incoming email.
  2. Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Input
There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
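One way to read this format is sketched below. It is only an illustration: the names Op and read_case are made up here, and it uses C++ streams rather than the scanf/getchar calls in the solution further down. Because operator>> skips whitespace, the blank line between test cases needs no special handling. For inputs near the 10^6 limit, scanf is likely to be faster.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// One operation line. y stays -1 for "S X", which has a single operand.
// (Op and read_case are illustrative names, not part of the problem.)
struct Op { char kind; int x; int y; };

// Reads one test case from `in`.
// Returns false on the terminating "0 0" line or when the input ends.
bool read_case(std::istream& in, int& n, std::vector<Op>& ops) {
    int m;
    if (!(in >> n >> m) || (n == 0 && m == 0)) return false;
    ops.clear();
    ops.reserve(m);
    for (int i = 0; i < m; ++i) {
        Op op{0, -1, -1};
        in >> op.kind >> op.x;            // >> skips spaces and blank lines
        if (op.kind == 'M') in >> op.y;   // only "M X Y" has a second operand
        assert(in && (op.kind == 'M' || op.kind == 'S'));
        ops.push_back(op);
    }
    return true;
}
```

Calling read_case repeatedly on the sample input yields a case with N = 5 and six operations, then a case with N = 3 and one operation, and then returns false at the "0 0" line.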
Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.
Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0
Sample Output
Case #1: 3
Case #2: 2

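Junk mail shows up constantly in everyday life, so we need a way to identify it. We extract common characteristics from n emails, and if two emails share a characteristic we merge them into one class. Sometimes a judgment turns out to be wrong, and then that email has to be separated from its class and made an isolated node. So we apply m operations to the n emails, of two kinds: M a b puts emails a and b into the same class, and S a takes a out of its class and makes it an isolated node.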

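This is clearly a union-find problem. Because it involves deleting a node from a union-find structure, we keep an extra array that records each node's current real label. The union-find forest can contain genuine trees (a node is attached directly to the root only when Find() runs with path compression on a query that passes through it), so we cannot simply delete the node from the tree: that might cut off a whole branch hanging beneath it. Instead we create a virtual node, and the original node's current real position becomes that virtual node. When merging, we merge the real (current) nodes. This may still sound abstract; tracing through the code once makes it clear.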
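The idea can be sketched independently of the solution below. This is a minimal illustration, not the post's code: the struct and member names (DeletableDSU, id, next_free) are invented here, and it assumes the caller knows an upper bound on the number of removals so that enough spare slots can be allocated up front.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Union-find where "delete" is done by re-labelling instead of unlinking.
// All names here are illustrative assumptions, not taken from the post.
struct DeletableDSU {
    std::vector<int> parent;  // parent[v]: parent of slot v in the forest
    std::vector<int> id;      // id[x]: slot that currently represents element x
    int n;                    // number of real elements
    int next_free;            // first slot that has never been used

    // Capacity: one slot per element plus one fresh slot per possible removal.
    DeletableDSU(int elements, int max_removals)
        : parent(elements + max_removals), id(elements),
          n(elements), next_free(elements) {
        std::iota(parent.begin(), parent.end(), 0);
        std::iota(id.begin(), id.end(), 0);
    }

    int find(int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];  // path halving
            v = parent[v];
        }
        return v;
    }

    // Merge the sets of elements x and y via their current slots.
    void merge(int x, int y) { parent[find(id[y])] = find(id[x]); }

    // Isolate x. Its old slot stays in the tree as a dead placeholder, so any
    // slots hanging below it remain connected; x moves to a fresh singleton.
    void remove(int x) {
        assert(next_free < (int)parent.size());
        id[x] = next_free++;
    }

    // Number of distinct sets among the n real elements.
    int count_sets() {
        std::vector<char> seen(parent.size(), 0);
        int cnt = 0;
        for (int x = 0; x < n; ++x) {
            int r = find(id[x]);
            if (!seen[r]) { seen[r] = 1; ++cnt; }
        }
        return cnt;
    }
};
```

Replaying the first sample on a DeletableDSU with 5 elements (merge 0 1, merge 1 2, merge 1 3, remove 1, merge 1 2, remove 3) leaves the groups {0, 1, 2}, {3} and {4}, so count_sets() gives 3, matching "Case #1: 3". The design choice is that removal never touches parent at all: only the mapping from element to slot changes, which is why a removed node that happens to be a root or an interior node cannot disconnect the rest of its tree.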

Accepted (AC) code:

#include <cstdio>
#include <cstring>
#define maxn 1100010    // N + M: every S operation uses up one fresh virtual node
using namespace std;

int f[maxn], r[maxn], flag[maxn];    // f[]: ancestor (parent) of each node; r[]: each email's current real label
int n, m, bb;
void init()
{
    for(int i = 0; i < maxn; i++)
        f[i] = r[i] = i;
    return ;
}

int getf(int x)
{
    return (f[x] == x) ? x : (f[x] = getf(f[x]));
}

void unit(int a, int b)
{
    int x = getf(a), y = getf(b);
    f[y] = x;
}

void del(int x)
{
    r[x] = bb;
    f[bb] = r[bb];
    bb++;
}

int main()
{
    int num = 0;
    while(scanf("%d%d", &n, &m), n + m)
    {
        init();
        bb = n;
        for(int i = 0; i < m; i++)
        {
            char c;
            getchar();
            c = getchar();
            if(c == 'M')
            {
                int a, b;
                scanf("%d%d", &a, &b);
                unit(r[a], r[b]);
            }
            else if(c == 'S')
            {
                int a;
                scanf("%d", &a);
                del(a);
            }
            //for(int i = 0; i < n; i++)
                //printf("f[%d]: %d, r[%d]: %d\n", i, f[i], i, r[i]);
        }

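        int cnt = 0;
        memset(flag, 0, sizeof(flag));
        // The original listing is cut off inside this loop header; the rest is a
        // reconstruction inferred from the declarations above and the sample output.
        for(int i = 0; i < n; i++)
        {
            int root = getf(r[i]);    // root of the set that email i currently belongs to
            if(!flag[root])
            {
                flag[root] = 1;
                cnt++;
            }
        }
        printf("Case #%d: %d\n", ++num, cnt);
    }
    return 0;
}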
