LeetCode-Repeated DNA Sequences (using a bitmap to reduce memory)

Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",



Return:

["AAAAACCCCC", "CCCCCAAAAA"].

 

 
Using a bitmap reduces memory usage. The code is as follows:
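#include <stdlib.h>
#include <string.h>

/* One bit per possible 10-letter sequence: 4^10 = 2^20 keys, i.e. 128 KB per bitmap. */
unsigned int map_exist[1024 * 1024 / 32];    /* sequences already added to the result */
unsigned int map_pattern[1024 * 1024 / 32];  /* sequences seen at least once */

/* Set / test bit x. 1u avoids shifting into the sign bit when (x & 0x1F) == 31. */
#define set(map, x) \
    ((map)[(x) >> 5] |= (1u << ((x) & 0x1F)))

#define test(map, x) \
    ((map)[(x) >> 5] & (1u << ((x) & 0x1F)))

int dnamap[26];  /* letter - 'A' -> 2-bit code */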

char** findRepeatedDnaSequences(char* s, int* returnSize) {
    *returnSize = 0;
    if (s == NULL) return NULL;
    int len = strlen(s);
    if (len <= 10) return NULL;  /* fewer than two windows, so nothing can repeat */

    memset(map_exist, 0, sizeof(map_exist));
    memset(map_pattern, 0, sizeof(map_pattern));

    dnamap['A' - 'A'] = 0;  dnamap['C' - 'A'] = 1;
    dnamap['G' - 'A'] = 2;  dnamap['T' - 'A'] = 3;

    char **ret = malloc(sizeof(char*));
    int curr = 0;  /* number of results stored */
    int size = 1;  /* capacity of ret */
    int key = 0;   /* 20-bit rolling key; must start at 0 */
    int i = 0;

    /* Preload the first 9 letters (18 bits). */
    while (i < 9)
        key = (key << 2) | dnamap[s[i++] - 'A'];

    /* Each new letter completes the 10-letter window ending at s[i-1]. */
    while (i < len) {
        key = ((key << 2) & 0xFFFFF) | dnamap[s[i++] - 'A'];
        if (test(map_pattern, key)) {
            /* Seen before: report it once. */
            if (!test(map_exist, key)) {
                set(map_exist, key);
                if (curr == size) {
                    size *= 2;
                    ret = realloc(ret, sizeof(char*) * size);
                }
                ret[curr] = malloc(sizeof(char) * 11);
                memcpy(ret[curr], &s[i - 10], 10);
                ret[curr][10] = '\0';
                ++curr;
            }
        }
        else {
            set(map_pattern, key);
        }
    }

    /* Avoid realloc(ret, 0), whose result is implementation-defined. */
    if (curr == 0) {
        free(ret);
        return NULL;
    }
    ret = realloc(ret, sizeof(char*) * curr);
    *returnSize = curr;
    return ret;
}

This solution runs in about 6 ms, which is very fast.
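
To make the memory argument concrete: each letter needs 2 bits, so a 10-letter window fits in a 20-bit key, and there are only 4^10 = 2^20 possible keys. One bit per key gives 2^20 / 8 = 131072 bytes (128 KB) per bitmap. The following is a minimal sketch of that encoding in isolation, not the code above. The names `base_code`, `encode10`, `roll`, `bitmap_set`, `bitmap_test`, and `count_repeated` are hypothetical helpers introduced for illustration, and the input is assumed to contain only A, C, G, and T.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_BITS 20u                          /* 10 letters x 2 bits each */
#define KEY_MASK ((1u << KEY_BITS) - 1u)      /* 0xFFFFF */
#define BITMAP_WORDS ((1u << KEY_BITS) / 32u) /* 32768 words = 128 KB */

/* Hypothetical helper: map a nucleotide to its 2-bit code. */
static uint32_t base_code(char c) {
    switch (c) {
        case 'A': return 0u;
        case 'C': return 1u;
        case 'G': return 2u;
        default:
            assert(c == 'T');  /* assumption: input is valid DNA */
            return 3u;
    }
}

/* Hypothetical helper: pack the 10 letters starting at s into a 20-bit key. */
static uint32_t encode10(const char *s) {
    uint32_t key = 0;
    for (int i = 0; i < 10; ++i)
        key = (key << 2) | base_code(s[i]);
    return key;
}

/* Slide the window by one letter: drop the oldest 2 bits, append the new code. */
static uint32_t roll(uint32_t key, char next) {
    return ((key << 2) & KEY_MASK) | base_code(next);
}

static void bitmap_set(uint32_t *map, uint32_t x) {
    map[x >> 5] |= 1u << (x & 31u);
}

static int bitmap_test(const uint32_t *map, uint32_t x) {
    return (int)((map[x >> 5] >> (x & 31u)) & 1u);
}

/* Count the distinct 10-letter sequences that occur more than once. */
static int count_repeated(const char *s) {
    static uint32_t seen[BITMAP_WORDS];
    static uint32_t reported[BITMAP_WORDS];
    size_t len = strlen(s);
    if (len <= 10) return 0;  /* fewer than two windows */

    memset(seen, 0, sizeof seen);
    memset(reported, 0, sizeof reported);

    uint32_t key = encode10(s);
    bitmap_set(seen, key);
    int count = 0;
    for (size_t i = 10; i < len; ++i) {
        key = roll(key, s[i]);
        if (!bitmap_test(seen, key)) {
            bitmap_set(seen, key);
        } else if (!bitmap_test(reported, key)) {
            bitmap_set(reported, key);
            ++count;
        }
    }
    return count;
}
```

Calling `count_repeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")` on the example string from the problem statement returns 2, matching the two sequences in the expected output. Using `uint32_t` and `1u` keeps every shift unsigned, which sidesteps the signed-overflow case at bit 31.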

 
