Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",



Return:

["AAAAACCCCC", "CCCCCAAAAA"].


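This problem is actually fairly simple:
1. For "AAAAAAAAAAAA" the result includes "AAAAAAAAAA"; the substrings are allowed to overlap.
2. The real question is how to compute a hash code for a string of length 10. (If the HashMap uses the String itself as the key, the solution runs out of memory.)
Likewise, calling substring each time just to compute the hash code also runs out of memory, so the code below updates an integer key as the window slides instead.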
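A minimal sketch of the integer key described in note 2, assuming the input contains only A, C, G, and T. Each letter takes 2 bits, so a 10-letter window fits in 20 bits of an int. The class name `DnaWindowEncoder` and the helpers `code`, `encodeFirstWindow`, and `slide` are hypothetical names used only for illustration.

```java
class DnaWindowEncoder {
    // Hypothetical helper: maps a nucleotide to a 2-bit code (A=0, C=1, G=2, T=3).
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // assumed to be 'T'
        }
    }

    // Packs the first 10 characters of s into a 20-bit integer.
    static int encodeFirstWindow(String s) {
        int val = 0;
        for (int i = 0; i < 10; i++) {
            val = (val << 2) | code(s.charAt(i));
        }
        return val;
    }

    // Slides the window by one character without building a substring:
    // keep the low 18 bits (the last 9 letters), shift, append the new letter.
    static int slide(int val, char next) {
        return ((val & 0x3FFFF) << 2) | code(next);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCA";
        int first = encodeFirstWindow(s);        // key for "AAAAACCCCC"
        int second = slide(first, s.charAt(10)); // key for "AAAACCCCCA"
        System.out.println(first + " " + second); // prints "341 1364"
    }
}
```

Because the key is updated in constant time per character, no temporary String is allocated while scanning.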

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<String>();
        if (s == null || s.length() < 10) return result;
        // Key: 20-bit encoding of a 10-letter window; value: number of occurrences.
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        int val = 0;
        // Encode the first window, 2 bits per letter.
        for (int i = 0; i < 10; i++) {
            val = (val << 2) | toInt(s.charAt(i));
        }
        map.put(val, 1);
        // Slide the window: keep the low 18 bits (9 letters), shift, append the new letter.
        for (int i = 10; i < s.length(); i++) {
            val = ((val & 0x3ffff) << 2) | toInt(s.charAt(i));
            map.put(val, map.getOrDefault(val, 0) + 1);
        }
        // Decode every window that was seen more than once.
        for (Map.Entry<Integer, Integer> e : map.entrySet()) {
            if (e.getValue() > 1) result.add(toDNA(e.getKey()));
        }
        return result;
    }

    // Maps a nucleotide to its 2-bit code.
    private int toInt(char c) {
        if (c == 'A') return 0;
        else if (c == 'C') return 1;
        else if (c == 'G') return 2;
        else return 3; // T
    }

    // Decodes a 20-bit key back into its 10-letter sequence.
    private String toDNA(int key) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < 10; j++) {
            int tmp = key % 4;
            key = key / 4;
            char c = 'T';
            if (tmp == 0) c = 'A';
            else if (tmp == 1) c = 'C';
            else if (tmp == 2) c = 'G';
            sb.insert(0, c); // the lowest 2 bits hold the last letter
        }
        return sb.toString();
    }
}
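As a quick check of the overlap case from note 1, here is a self-contained variant sketch. It is an alternative to the counting map in the solution, not the solution itself: it tracks two sets of integer keys (`seen` and `reported`) and reports each repeated window in the order its second occurrence appears. The class name `RepeatedDnaSketch` and the method names are hypothetical, and the input is assumed to contain only A, C, G, and T.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RepeatedDnaSketch {
    // Hypothetical helper: 2-bit code per nucleotide (A=0, C=1, G=2, T=3).
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // assumed to be 'T'
        }
    }

    static List<String> findRepeated(String s) {
        List<String> result = new ArrayList<>();
        if (s == null || s.length() < 10) return result;
        Set<Integer> seen = new HashSet<>();     // keys seen at least once
        Set<Integer> reported = new HashSet<>(); // keys already added to result
        int val = 0;
        for (int i = 0; i < s.length(); i++) {
            // Keep the last 9 letters (18 bits), then append the current one.
            val = ((val & 0x3FFFF) << 2) | code(s.charAt(i));
            if (i < 9) continue; // window not full yet
            // add() returns false if the key was already present.
            if (!seen.add(val) && reported.add(val)) {
                // substring is only called once per repeated sequence.
                result.add(s.substring(i - 9, i + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Input with six C's in the second run of C's.
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
        // prints [AAAAACCCCC, CCCCCAAAAA]
        System.out.println(findRepeated("AAAAAAAAAAAA"));
        // prints [AAAAAAAAAA]: the three overlapping windows are identical
    }
}
```

Unlike iterating over a HashMap, this variant gives a deterministic result order, which can make the output easier to compare against an expected list.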

 


