[LeetCode] 692. Top K Frequent Words (similar to NC97)

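I. Problem

Given a non-empty list of words, return the k most frequent words.

The answer should be sorted by frequency from highest to lowest. If different words have the same frequency, sort them in alphabetical order.

Example 1:

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words, each appearing 2 times.
    Note that "i" comes before "love" in alphabetical order.

Example 2:

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
    with 4, 3, 2 and 1 occurrences respectively.

Notes:

  1. You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
  2. The input words consist only of lowercase letters.

Follow-up:

  1. Try to solve it in O(n log k) time and O(n) space.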

II. Solutions

1. Quicksort / library sort

Approach: omitted.
Code: omitted.
Time complexity: O(n log n)
Space complexity: O(n)
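
The code for the sorting approach is omitted above. One possible sketch, not necessarily what the author had in mind, counts the words in a hash map and then sorts the distinct words with the library sort. The class name SortSolution is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this sketch.
class SortSolution {
    static List<String> topKFrequent(String[] words, int k) {
        // Count how often each word occurs.
        Map<String, Integer> count = new HashMap<>();
        for (String w : words) {
            count.merge(w, 1, Integer::sum);
        }

        // Sort the distinct words: higher frequency first, ties in alphabetical order.
        List<String> unique = new ArrayList<>(count.keySet());
        unique.sort((a, b) -> {
            int byFreq = Integer.compare(count.get(b), count.get(a));
            return byFreq != 0 ? byFreq : a.compareTo(b);
        });

        // The problem guarantees k is at most the number of distinct words.
        return new ArrayList<>(unique.subList(0, k));
    }
}
```

Sorting all distinct words dominates the running time, which is where the O(n log n) bound comes from.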

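2. Hash map + priority queue

Approach: count each word's frequency in a hash map, push every entry into a priority queue ordered by frequency (descending) and then by word (alphabetical), and poll the queue k times.
Code: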

import java.util.*;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        // Frequency map: word -> number of occurrences.
        HashMap<String, Integer> map = new HashMap<>();
        for (String s : words) map.put(s, map.getOrDefault(s, 0) + 1);

        // Higher frequency first; on equal frequency, alphabetical order.
        // The counts are boxed Integers, so compare them with equals(): == would compare references.
        PriorityQueue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(k, (a, b) ->
            a.getValue().equals(b.getValue()) ? a.getKey().compareTo(b.getKey()) : b.getValue() - a.getValue());

        for (Map.Entry<String, Integer> entry : map.entrySet()) maxHeap.add(entry);

        // Poll the top k entries.
        List<String> res = new ArrayList<>();
        while (res.size() < k) res.add(maxHeap.poll().getKey());
        return res;
    }
}

Time complexity: O(n log n)
Space complexity: O(n)
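
The heap above holds every distinct word, which gives the O(n log n) bound. The follow-up in the problem statement asks for O(n log k). A minimal sketch of one common way to get there is to keep a min-heap capped at k entries, so the weakest candidate is always on top and can be evicted. The class name MinHeapSolution is made up for this illustration.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class name, used only for this sketch.
class MinHeapSolution {
    static List<String> topKFrequent(String[] words, int k) {
        // Count how often each word occurs.
        Map<String, Integer> count = new HashMap<>();
        for (String w : words) {
            count.merge(w, 1, Integer::sum);
        }

        // Min-heap whose top is the weakest candidate: lowest frequency,
        // and among equal frequencies the alphabetically later word.
        PriorityQueue<Map.Entry<String, Integer>> minHeap = new PriorityQueue<>((a, b) -> {
            int byFreq = Integer.compare(a.getValue(), b.getValue());
            return byFreq != 0 ? byFreq : b.getKey().compareTo(a.getKey());
        });

        for (Map.Entry<String, Integer> e : count.entrySet()) {
            minHeap.offer(e);
            if (minHeap.size() > k) {
                minHeap.poll(); // evict the weakest so the heap never exceeds k entries
            }
        }

        // Entries come out weakest first, so build the answer back to front.
        LinkedList<String> result = new LinkedList<>();
        while (!minHeap.isEmpty()) {
            result.addFirst(minHeap.poll().getKey());
        }
        return result;
    }
}
```

Each heap operation costs O(log k) because the heap never grows beyond k + 1 entries; the hash map still needs O(n) space.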

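  • Supplement: ways to declare the ordering for the priority queue
Method 1: pass an anonymous Comparator to the PriorityQueue constructor. Note that this ordering is the reverse of the max-heap above (lowest frequency first; on ties, the alphabetically later word first), which suits a min-heap capped at k entries.
PriorityQueue<Map.Entry<String, Integer>> pq = new PriorityQueue<Map.Entry<String, Integer>>(new Comparator<Map.Entry<String, Integer>>() {
    public int compare(Map.Entry<String, Integer> entry1, Map.Entry<String, Integer> entry2) {
        // equals() compares the boxed counts by value; == would compare Integer references.
        return entry1.getValue().equals(entry2.getValue()) ? entry2.getKey().compareTo(entry1.getKey()) : entry1.getValue() - entry2.getValue();
    }
});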

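Method 2: copy the entries into a list and sort it with a named Comparator (MyComparator, defined in Method 3).
List<Map.Entry<String, Integer>> l = new LinkedList<>();
for (Map.Entry<String, Integer> e : map.entrySet()) {
    l.add(e);
}
Collections.sort(l, new MyComparator()); // sort with our own Comparator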

Method 3: define the Comparator as a named class (highest frequency first; on ties, alphabetical order).
class MyComparator implements Comparator<Map.Entry<String, Integer>> {
    public int compare(Map.Entry<String, Integer> e1, Map.Entry<String, Integer> e2){
        String word1 = e1.getKey();
        int freq1 = e1.getValue();
        String word2 = e2.getKey();
        int freq2 = e2.getValue();
        if(freq1!=freq2){
            return freq2-freq1;
        }
        else {
            return word1.compareTo(word2);
        }
    }
}
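
As a further option beyond the three methods above, the same "frequency descending, then alphabetical" ordering can be assembled from the Comparator factory methods on Map.Entry (Java 8 and later). This is a sketch; the class name ComparatorChainDemo, the constant name, and the helper sortedWords are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this sketch.
class ComparatorChainDemo {
    // Higher count first; equal counts fall back to alphabetical order of the word.
    static final Comparator<Map.Entry<String, Integer>> BY_FREQ_DESC_THEN_WORD =
        Map.Entry.<String, Integer>comparingByValue().reversed()
            .thenComparing(Map.Entry.<String, Integer>comparingByKey());

    // Hypothetical helper: returns the words of a frequency map in that order.
    static List<String> sortedWords(Map<String, Integer> freq) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort(BY_FREQ_DESC_THEN_WORD);
        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            words.add(e.getKey());
        }
        return words;
    }
}
```

The same comparator object can be passed to a PriorityQueue constructor or to Collections.sort, and it sidesteps both the boxed == comparison and the subtraction of counts.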

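3. Trie

Approach: bucket the words by frequency, storing each bucket as a trie; then walk the buckets from the highest frequency down, reading each trie by DFS to get its words in lexicographic order, until k words are collected.
Code: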

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> map = new HashMap<>();
        for (String word:words) {
            map.put(word, map.getOrDefault(word, 0)+1);
        }
        
        // Frequencies range from 1 to words.length, so the array needs words.length + 1 slots.
        Trie[] buckets = new Trie[words.length + 1];
        for (Map.Entry<String, Integer> e:map.entrySet()) {
            //for each word, add it into trie at its bucket
            String word = e.getKey();
            int freq = e.getValue();
            if(buckets[freq] == null){
                buckets[freq] = new Trie();
            }
            buckets[freq].addWord(word);
        }
        
        List<String> ans = new LinkedList<>();
        for(int i = buckets.length-1; i >= 0; i--){
            // For the trie in each bucket, collect all words with that frequency in lexicographic order, then take as many as k still allows.
            if (buckets[i] != null) {
                List<String> l = new LinkedList<>();                               
                buckets[i].getWords(buckets[i].root, l);
                if(l.size() < k){
                    ans.addAll(l);
                    k = k - l.size(); 
                } else {
                    for(int j = 0; j <= k-1; j++){
                        ans.add(l.get(j));
                    }
                    break;
                }
            }
        }
        return ans;
    }
}

class TrieNode {
    TrieNode[] children = new TrieNode[26];
    String word = null;
}

class Trie {
    TrieNode root = new TrieNode();
    public void addWord(String word) {
        TrieNode cur = root;
        for (char c:word.toCharArray()) {
            if(cur.children[c-'a'] == null){
                cur.children[c-'a'] = new TrieNode();
            }
            cur = cur.children[c-'a'];
        }
        cur.word = word;
    }
    
    public void getWords(TrieNode node, List<String> ans){
        // Use DFS to get all the words with the same frequency in lexicographic order.
        if (node == null) {
            return;
        }
        if (node.word != null) {
            ans.add(node.word);
        }
        for (int i = 0; i <= 25; i++) {
            if (node.children[i] != null) {
                getWords(node.children[i], ans);
            }
        }
    }
}

Time complexity: O(n), treating the word length as bounded by a constant
Space complexity: O(n)
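
The ordering guarantee this solution relies on is that a DFS over a trie, visiting children from 'a' to 'z' and emitting a node before its children, yields the stored words in lexicographic order. A small standalone sketch of just that property follows; MiniTrie and its method names are made up for this illustration, and it rebuilds each word from the path instead of storing it in the node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to illustrate the DFS ordering.
class MiniTrie {
    private static class Node {
        Node[] children = new Node[26]; // one slot per lowercase letter
        boolean isWord;
    }

    private final Node root = new Node();

    void add(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (cur.children[idx] == null) {
                cur.children[idx] = new Node();
            }
            cur = cur.children[idx];
        }
        cur.isWord = true;
    }

    List<String> wordsInOrder() {
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);
        return out;
    }

    private void collect(Node node, StringBuilder prefix, List<String> out) {
        // Emit the node first, so a word precedes every longer word that extends it.
        if (node.isWord) {
            out.add(prefix.toString());
        }
        // Visit children from 'a' to 'z', which gives lexicographic order.
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                prefix.append((char) ('a' + i));
                collect(node.children[i], prefix, out);
                prefix.deleteCharAt(prefix.length() - 1);
            }
        }
    }
}
```

For example, adding "love", "i", "is" and "leetcode" in that order and calling wordsInOrder() gives "i", "is", "leetcode", "love".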

III. NC97

Version 1

import java.util.*;


public class Solution {
    /**
     * return topK string
     * @param strings one-dimensional String array of input strings
     * @param k int, the k
     * @return two-dimensional String array; each row is {word, count}
     */
    public String[][] topKstrings (String[] strings, int k) {
        Map<String, Integer> cnt = new HashMap<String, Integer>();
        for (String word : strings) {
            cnt.put(word, cnt.getOrDefault(word, 0) + 1);
        }
        List<String> rec = new ArrayList<String>();
        for (Map.Entry<String, Integer> entry : cnt.entrySet()) {
            rec.add(entry.getKey());
        }
        Collections.sort(rec, new Comparator<String>() {
            public int compare(String word1, String word2) {
                // equals() compares the boxed counts by value; == would compare Integer references.
                return cnt.get(word1).equals(cnt.get(word2)) ? word1.compareTo(word2) : cnt.get(word2) - cnt.get(word1);
            }
        });
        
        String[][] res = new String[k][2];
        for (int i = 0; i < k; i++) {
            String[] tmp = new String[2];
            tmp[0] = rec.get(i);
            tmp[1] = cnt.get(tmp[0]).toString();
            res[i] = tmp;
        }
        return res;
    }
}

Version 2

import java.util.*;

public class Solution {
    /**
     * return topK string
     * @param strings one-dimensional String array of input strings
     * @param k int, the k
     * @return two-dimensional String array; each row is {word, count}
     */
    public String[][] topKstrings (String[] strings, int k) {
        HashMap<String, Integer> map = new HashMap<>();
        for (String s : strings) map.put(s, map.getOrDefault(s, 0) + 1);  // frequency map

        // Higher frequency first; on equal frequency, alphabetical order.
        // equals() compares the boxed counts by value; == would compare Integer references.
        PriorityQueue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(k, (a, b) ->
            a.getValue().equals(b.getValue()) ? a.getKey().compareTo(b.getKey()) : b.getValue() - a.getValue());
        
        for (Map.Entry<String,Integer> entry : map.entrySet() ) maxHeap.add(entry);

        String[][] res = new String[k][2];
        for (int i = 0; i < k; i++) {
            Map.Entry<String, Integer> curr = maxHeap.poll();
            res[i] = new String[]{curr.getKey(), curr.getValue().toString()};
        }
        return res;
    }
}

