All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example, given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return ["AAAAACCCCC", "CCCCCAAAAA"].
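To make the statement concrete, here is a minimal baseline sketch that compares the 10-letter windows directly as strings. It is not the solution given in this post; the class name NaiveRepeatedDna and the method name findRepeated are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline for illustration, not the solution from this post.
class NaiveRepeatedDna {
    static List<String> findRepeated(String s) {
        List<String> res = new ArrayList<>();
        if (s == null) {
            return res;
        }
        Set<String> seen = new HashSet<>();      // every window met so far
        Set<String> reported = new HashSet<>();  // windows already added to res
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already present.
            if (!seen.add(window) && reported.add(window)) {
                res.add(window);
            }
        }
        return res;
    }
}
```

Each repeated sequence is reported once, in the order in which its second occurrence is found. Storing every window as a String is simple but uses more memory than a compact integer key.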
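import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        // A string of length 10 or less holds at most one window, so nothing can repeat.
        if (s == null || s.length() <= 10) {
            return res;
        }
        Set<Integer> seen = new HashSet<>();      // encodings of all windows seen so far
        Set<Integer> reported = new HashSet<>();  // encodings already added to res
        for (int i = 0; i <= s.length() - 10; i++) {
            // Pack the 10 letters into one int, 2 bits per letter (20 bits in total).
            int v = 0;
            for (int j = i; j < i + 10; j++) {
                char c = s.charAt(j);
                v <<= 2;  // make room for the next 2-bit code
                if (c == 'A') {
                    v |= 0;
                } else if (c == 'T') {
                    v |= 1;
                } else if (c == 'C') {
                    v |= 2;
                } else {
                    v |= 3;  // 'G'
                }
            }
            // seen.add fails when the window occurred before;
            // reported.add succeeds only the first time it is reported.
            if (!seen.add(v) && reported.add(v)) {
                res.add(s.substring(i, i + 10));
            }
        }
        return res;
    }
}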
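The solution above re-encodes every window from scratch, which costs 10 steps per window. A possible variant, sketched here as an assumption and not as part of the original post, keeps a rolling 2-bit encoding: shift in the new letter and mask the value to 20 bits so the oldest letter falls off. The names RollingDna, code, findRepeated, WINDOW and MASK are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical rolling-encoding variant, for illustration only.
class RollingDna {
    static final int WINDOW = 10;
    // Low 20 bits set: keeps exactly WINDOW two-bit codes.
    static final int MASK = (1 << (2 * WINDOW)) - 1;

    // Same mapping as the solution above: A=0, T=1, C=2, anything else (G)=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'T': return 1;
            case 'C': return 2;
            default:  return 3;
        }
    }

    static List<String> findRepeated(String s) {
        List<String> res = new ArrayList<>();
        if (s == null || s.length() <= WINDOW) {
            return res;
        }
        Set<Integer> seen = new HashSet<>();      // encodings of all windows so far
        Set<Integer> reported = new HashSet<>();  // encodings already added to res
        int v = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new letter; the mask drops the letter that left the window.
            v = ((v << 2) | code(s.charAt(i))) & MASK;
            // The first full window ends at index WINDOW - 1.
            if (i >= WINDOW - 1 && !seen.add(v) && reported.add(v)) {
                res.add(s.substring(i - WINDOW + 1, i + 1));
            }
        }
        return res;
    }
}
```

With this update each position costs a constant number of operations, and the 20-bit key still fits comfortably in an int.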