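Problem: Given an English product description consisting of M English words, separated by spaces and with no other punctuation, and N English keywords, explain your approach and implement the method String extractSummary(String description, String[] Keywords). The goal is to find the shortest substring of the description that contains all N keywords (each keyword at least once) and output it as the product summary. Any programming language may be used.
Approach: Brute force over every (begin, end) range, where length is counted in words. The end of the range starts at the tail of content and moves back one position at a time; for each end position, begin is scanned from the head of content toward end. Whenever a range contains every keyword and is shorter than the current result, it is assigned to result. This continues until every range has been checked.
Code: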
package test;

import java.util.ArrayList;
import java.util.List;

/**
 * @author hy
 * 2011/6/13
 */
public class FindAbstract {
    static String content[] = { "a", "c", "d", "a", "c", "b", "d", "e", "a", "a", "b" };
    static String keyword[] = { "b", "c", "d" };
    static List<String> contentList = new ArrayList<String>();

    public static void main(String args[]) {
        List<String> result = new ArrayList<String>();
        int begin = 0;
        int end = content.length;
        // Copy content from the array into a List
        for (int i = 0; i < end; i++)
            contentList.add(i, content[i]);
        // Print the given content and keyword
        System.out.print("[content]: ");
        for (int i = 0; i < content.length; i++)
            System.out.print(content[i] + " ");
        System.out.println();
        System.out.print("[keyword]: ");
        for (int i = 0; i < keyword.length; i++)
            System.out.print(keyword[i] + " ");
        System.out.println();
        // Search for and print the shortest summary
        result = contentList;
        System.out.println("[AllMatch]:");
        // A range shorter than keyword.length cannot contain every keyword
        for (end = content.length; end - begin >= keyword.length; end--) {
            for (begin = 0; end - begin >= keyword.length; begin++) {
                if (isAllHave(contentList.subList(begin, end), keyword)
                        && result.size() > contentList.subList(begin, end).size()) {
                    result = contentList.subList(begin, end);
                    System.out.println(" " + result);
                }
            }
            begin = 0;
        }
        System.out.println("[ShortestMatch]: " + result);
    }

    // Whether arr contains every keyword
    static boolean isAllHave(List<String> arr, String key[]) {
        boolean is = false;
        int temp = 0;
        for (int i = 0; i < key.length; i++)
            if (isKeywordIn(arr, key[i]))
                temp++;
        if (temp == key.length)
            is = true;
        return is;
    }

    // Whether arr contains a single keyword
    static boolean isKeywordIn(List<String> arr, String key) {
        int i;
        for (i = 0; i < arr.size(); i++)
            // Use equals(): == compares references and only works for interned literals
            if (arr.get(i).equals(key))
                return true;
        return false;
    }
}
Result:
[content]: a c d a c b d e a a b
[keyword]: b c d
[AllMatch]:
[c, d, a, c, b, d, e, a, a, b]
[d, a, c, b, d, e, a, a, b]
[a, c, b, d, e, a, a, b]
[c, b, d, e, a, a, b]
[c, b, d, e, a, a]
[c, b, d, e, a]
[c, b, d, e]
[c, b, d]
[ShortestMatch]: [c, b, d]
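The brute-force search above rechecks every range from scratch, and it works on hard-coded arrays rather than the extractSummary signature the problem asks for. As an alternative, here is a minimal sliding-window sketch with that signature. It is not the original author's method. It assumes that length is counted in words (as in the code above), that ties go to the earliest window, that duplicate keywords count as one requirement, and that an empty string is returned when no window exists. The class name ShortestSummary and the names counts and missing are made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ShortestSummary {

    // Returns the shortest run of consecutive words in description that
    // contains every keyword at least once, or "" if there is none.
    // Assumption: "shortest" means fewest words, as in the brute-force version.
    static String extractSummary(String description, String[] keywords) {
        String[] words = description.trim().split("\\s+");

        // Occurrences of each distinct keyword inside the current window.
        Map<String, Integer> counts = new HashMap<>();
        for (String k : keywords) {
            counts.put(k, 0);
        }
        // Number of distinct keywords not yet present in the window.
        int missing = counts.size();
        if (missing == 0) {
            return "";
        }

        int bestStart = -1;
        int bestLen = Integer.MAX_VALUE;
        int left = 0;
        for (int right = 0; right < words.length; right++) {
            // Grow the window to the right.
            Integer c = counts.get(words[right]);
            if (c != null) {
                if (c == 0) {
                    missing--;
                }
                counts.put(words[right], c + 1);
            }
            // While every keyword is covered, record the window and shrink it from the left.
            while (missing == 0) {
                if (right - left + 1 < bestLen) {
                    bestLen = right - left + 1;
                    bestStart = left;
                }
                Integer lc = counts.get(words[left]);
                if (lc != null) {
                    counts.put(words[left], lc - 1);
                    if (lc == 1) {
                        missing++;
                    }
                }
                left++;
            }
        }

        if (bestStart < 0) {
            return "";
        }
        return String.join(" ", Arrays.copyOfRange(words, bestStart, bestStart + bestLen));
    }

    public static void main(String[] args) {
        String description = "a c d a c b d e a a b";
        String[] keywords = { "b", "c", "d" };
        System.out.println("[ShortestMatch]: " + extractSummary(description, keywords));
    }
}
```

Each word enters and leaves the window at most once, so the scan is linear in M after splitting, compared with checking every (begin, end) range in the brute-force version. On the sample input it selects the same words, c b d, as the run shown above.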