力扣 1519——子树中标签相同的节点数

本题主要在于对树这种数据结构的考察，以及深度优先遍历的使用，优化时可以采取空间换时间的策略。

原题

给你一棵树（即，一个连通的无环无向图），这棵树由编号从 0 到 n - 1 的 n 个节点组成，且恰好有 n - 1 条 edges 。树的根节点为节点 0 ，树上的每一个节点都有一个标签，也就是字符串 labels 中的一个小写字符（编号为 i 的节点的标签就是 labels[i] ）

边数组 edges 以 edges[i] = [ai, bi] 的形式给出，该格式表示节点 ai 和 bi 之间存在一条边。

返回一个大小为 n 的数组，其中 ans[i] 表示第 i 个节点的子树中与节点 i 标签相同的节点数。

树 T 中的子树是由 T 中的某个节点及其所有后代节点组成的树。

示例 1：

输入：n = 7, edges = [[0,1],[0,2],[1,4],[1,5],[2,3],[2,6]], labels = "abaedcd"
输出：[2,1,1,1,1,1,1]
解释：节点 0 的标签为 'a' ，以 'a' 为根节点的子树中，节点 2 的标签也是 'a' ，因此答案为 2 。注意树中的每个节点都是这棵子树的一部分。
节点 1 的标签为 'b' ，节点 1 的子树包含节点 1、4 和 5，但是节点 4、5 的标签与节点 1 不同，故而答案为 1（即，该节点本身）。

示例 2：

输入：n = 4, edges = [[0,1],[1,2],[0,3]], labels = "bbbb"
输出：[4,2,1,1]
解释：节点 2 的子树中只有节点 2 ，所以答案为 1 。
节点 3 的子树中只有节点 3 ，所以答案为 1 。
节点 1 的子树中包含节点 1 和 2 ，标签都是 'b' ，因此答案为 2 。
节点 0 的子树中包含节点 0、1、2 和 3，标签都是 'b'，因此答案为 4 。

示例 3 ：

输入：n = 5, edges = [[0,1],[0,2],[1,3],[0,4]], labels = "aabab"
输出：[3,2,1,1,1]

示例 4：

输入：n = 6, edges = [[0,1],[0,2],[1,3],[3,4],[4,5]], labels = "cbabaa"
输出：[1,2,1,1,2,1]

示例 5：

输入：n = 7, edges = [[0,1],[1,2],[2,3],[3,4],[4,5],[5,6]], labels = "aaabaaa"
输出：[6,5,4,1,3,2,1]
 ```

提示：

* 1 <= n <= 10^5
* edges.length == n - 1
* edges[i].length == 2
* 0 <= ai, bi < n
* ai != bi
* labels.length == n
* labels 仅由小写英文字母组成

原题 url：https://leetcode-cn.com/problems/number-of-nodes-in-the-sub-tree-with-the-same-label

## 解题

### 首次尝试

这道题是要让我们计算：在子树中，和当前节点字符相同的节点个数。

那么我们就必然需要构建树中各个节点的关系，那么就需要记录父子节点的关系，因为是普通的树，一个节点的子节点可能有多个，因此我用`LinkedList[] tree`这样一个数组进行存储，其中`tree[i]`代表节点 i 的所有子节点。

至于求相同节点的个数，我想着可以从根节点 0 开始逐个遍历，先获取其第一层子节点，再根据第一层子节点逐个获取，可以采用广度优先遍历的形式。

让我们看看代码：

class Solution {

public int[] countSubTrees(int n, int[][] edges, String labels) {
    // 构造树
    LinkedList[] tree = new LinkedList[n];
    for (int[] edge : edges) {
        // edge[0]的子节点
        LinkedList child = tree[edge[0]];
        if (child == null) {
            child = new LinkedList<>();
            tree[edge[0]] = child;
        }
        // 增加子节点
        child.add(edge[1]);
    }

    // 结果
    int[] result = new int[n];
    // 遍历并计算
    for (int i = 0; i < n; i++) {
        // 需要遍历的字符
        char cur = labels.charAt(i);
        // 该节点的子树中与该字符相同的节点数
        int curCount = 0;
        // 广度优先遍历
        LinkedList searchList = new LinkedList<>();
        searchList.add(i);
        while(!searchList.isEmpty()) {
            int index = searchList.removeFirst();
            if (cur == labels.charAt(index)) {
                curCount++;
            }
            // 找出该节点的子树
            if (tree[index] == null) {
                continue;
            }
            searchList.addAll(tree[index]);
        }

        result[i] = curCount;
    }

    return result;
}

}

提交之后，发现有`错误`。错误的情况是：

输入：
4
[[0,2],[0,3],[1,2]]
"aeed"

输出：
[1,2,1,1]

预期：
[1,1,2,1]


根据这样输入，我构造出的树是：

1 0
/ \
2 3


但根据预期结果反推出来的树是：

/ \
2 3
/
1


那么输入中最后给出的`[1,2]`就不是从`父节点`指向`子节点`，也就是输入中给出的边关联的节点顺序，是任意的。

那我们的树究竟该如何构造呢？

### 双向记录构造树

既然我们在构造树的时候，无法直接得出父子关系，那么就将对应两个节点同时记录另一个节点。

根据题目中给出的条件：`树的根节点为节点 0`。这样我们在遍历的时候，就从 0 开始，只要 0 关联的节点，一定是 0 的子节点。将这些节点进行标记，这样再递归访问接下来的节点时，如果是标记过的，则说明是父节点，这样就可以明确父子节点关系了。

至于遍历的时候，因为这次我们是不知道父子节点关系的，所以无法直接采用广度优先遍历，换成`深度优先遍历`。

让我们看看代码：

class Solution {

// 总节点数
int n;
// 树
Map> tree;
// 字符串
String labels;
// 最终结果
int[] result;

public int[] countSubTrees(int n, int[][] edges, String labels) {
    this.n = n;
    this.labels = labels;
    result = new int[n];

    LinkedList list;
    // 双向构造树的关系
    tree = new HashMap<>(n / 4 * 3 + 1);
    for (int[] edge : edges) {
        // 添加映射关系
        list = tree.computeIfAbsent(edge[0], k -> new LinkedList<>());
        list.add(edge[1]);
        list = tree.computeIfAbsent(edge[1], k -> new LinkedList<>());
        list.add(edge[0]);
    }

    
    // 深度优先搜索
    dfs(0);

    return result;
}

public int[] dfs(int index) {
    // 当前子树中，所有字符的个数
    int[] charArray = new int[26];

    // 开始计算，标志该节点已经计算过
    result[index] = 1;
    // 获得其关联的节点
    List nodes = tree.get(index);
    // 遍历
    for (int node : nodes) {
        // 如果该节点已经访问过
        if (result[node] > 0) {
            continue;
        }

        // 递归遍历子节点
        int[] array = dfs(node);
        for (int i = 0; i < 26; i++) {
            charArray[i] += array[i];
        }
    }
    // 将当前节点的值计算一下
    charArray[labels.charAt(index) - 'a'] += 1;
    result[index] = charArray[labels.charAt(index) - 'a'];
    return charArray;
}

}

提交OK，执行用时`136ms`，超过`36.71%`，内存消耗`104.5MB`，超过`91.38%`。

时间复杂度上，应该是要研究`dfs`方法中的两个`for`循环，外层肯定是每个节点都遍历一遍，内层还需要遍历`26`个英文字母，也就是`O(n)`。

空间复杂度上，最大的应该就是存储节点映射关系的`tree`了，里面实际上就是 2n 个节点（因为每条边对应的两个节点都会互相存一次对方），因此也就是`O(n)`。

虽然过了，但执行速度很慢，可以进一步优化。

### 用空间换时间

针对我上面的解法，其中`tree`我是用的`Map`，虽然其`get`方法理论上是`O(n)`，但毕竟涉及 hash，可以优化成数组。

至于每次取节点对应的字符所用的`charAt`方法，具体其实是：

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

每次都会检查一次 index，其实这完全是可以省略的，因此可以提前构造好每个位置对应的值，也用一个数组存储。

让我们看看新的代码：

class Solution {

// 总节点数
int n;
// 树
LinkedList[] tree;
// 每个节点的值（用数字表示）
int[] nodeValueArray;
// 最终结果
int[] result;

public int[] countSubTrees(int n, int[][] edges, String labels) {
    this.n = n;
    nodeValueArray = new int[n];
    result = new int[n];

    // 双向构造树的关系
    tree = new LinkedList[n];
    for (int i = 0; i < n; i++) {
        tree[i] = new LinkedList<>();
    }
    for (int[] edge : edges) {
        // 添加映射关系
        tree[edge[0]].add(edge[1]);
        tree[edge[1]].add(edge[0]);
    }

    // 生成节点的值
    for (int i = 0; i < n; i++) {
        nodeValueArray[i] = labels.charAt(i) - 'a';
    }

    // 深度优先搜索
    dfs(0);

    return result;
}

public int[] dfs(int index) {
    // 当前子树中，所有字符的个数
    int[] charArray = new int[26];

    // 开始计算，标志该节点已经计算过
    result[index] = 1;
    // 获得其关联的节点
    List nodes = tree[index];
    // 遍历
    for (int node : nodes) {
        // 如果该节点已经访问过
        if (result[node] > 0) {
            continue;
        }

        // 递归遍历子节点
        int[] array = dfs(node);
        for (int i = 0; i < 26; i++) {
            charArray[i] += array[i];
        }
    }
    // 将当前节点的值计算一下
    charArray[nodeValueArray[index]] += 1;
    result[index] = charArray[nodeValueArray[index]];
    return charArray;
}

}

提交之后，执行用时是`96ms`，内存消耗是`402.2MB`。看来优化的效果并不明显。

### 研究一下目前最优解法

这个解法真的是巧妙，执行用时`20ms`，超过了`100%`，内存消耗`76.3MB`，超过了`100%`。

我在代码中增加了注释，方便大家理解。但这样的写法，研究一下是能够看懂，但让我想估计是永远不可能想出来，可以让大家也一起学习和借鉴：

public class Solution {

static class Next {

Next next;
Node node;

Next(Next next, Node node) {
  this.next = next;
  this.node = node;
}

}

static class Node {

/**
 * 当前节点的index
 */
final int index;

/**
 * 当前节点对应的字符值（减去'a'）
 */
final int ci;

/**
 * 所有关联的节点
 */
Next children;

/**
 * 该节点的父节点
 */
Node parent;

/**
 * 子树中和该节点含有相同字符的节点总个数
 */
int result;

/**
 * 是否还在队列中，可以理解为是否已访问过
 */
boolean inQueue;

public Node(int index, int ci) {
  this.index = index;
  this.ci = ci;
  this.result = 1;
}

/**
 * 从后往前，找到当前节点没有访问过的第一个子节点
 */
Node popChild() {
  for (; ; ) {
    // 当前节点的所有关联节点
    Next n = this.children;
    // 如果没有，说明子节点都遍历完了
    if (n == null) {
      return null;
    }
    // 从后往前移除关联节点
    this.children = n.next;
    // 返回第一个没有访问过的节点
    if (!n.node.inQueue) {
      return n.node;
    }
  }
}

/**
 * 访问了该节点
 */
Node enqueue(Node[] cnodes) {
  // 该节点标记为访问过
  this.inQueue = true;
  // 记录该节点的父节点
  this.parent = cnodes[ci];
  // 那么现在该字符值对应的最高节点，就是当前节点。
  // 这样如果之后也遇到相同字符的子节点，就可以为子节点赋值其父节点，也就是上面一行是有效的
  cnodes[ci] = this;
  return this;
}

/**
 * 退出该节点
 */
void dequeue(Node[] cnodes, int[] res) {
  // 之后会访问该节点的兄弟节点，因此父节点需要重新设置
  cnodes[ci] = this.parent;
  // 设置当前节点的值
  res[index] = this.result;
  // 父节点也可以进行累加
  if (this.parent != null) {
    this.parent.result += this.result;
  }
}

void link(Node x) {
  // this节点和x节点，互相绑定
  this.children = new Next(this.children, x);
  x.children = new Next(x.children, this);
}

}

public int[] countSubTrees(int n, int[][] edges, String labels) {

// 构造树
Node[] nodes = new Node[n];
// 每个节点对应的字符
for (int i = 0; i < n; i++) {
  nodes[i] = new Node(i, labels.charAt(i) - 'a');
}
// 通过边的关系，将节点互相绑定
for (int[] es : edges) {
  nodes[es[0]].link(nodes[es[1]]);
}

// 最终的结果
int[] res = new int[n];
// 当前访问的节点下标
int sz = 0;
// 26个小写英文字母对应的节点数组
Node[] cnodes = new Node[26];
// 下面三行可以合并成这一行：
// Node node = nodes[sz++] = nodes[0].enqueue(cnodes);
nodes[sz] = nodes[0].enqueue(cnodes);
// 当前访问的节点
Node node = nodes[sz];
// 因为当前节点已经访问过，自然下标需要+1
sz++;

for (; ; ) {
  // 从后往前，找到当前节点没有访问过的第一个子节点
  Node child = node.popChild();
  // 如果已经全部访问过了
  if (child == null) {
    // 开始计算
    node.dequeue(cnodes, res);
    if (--sz == 0) {
      break;
    }
    // 回溯到父节点
    node = nodes[sz - 1];
  } else {
    // 保证了相邻节点一定是父子节点
    node = nodes[sz++] = child.enqueue(cnodes);
  }
}
return res;

}
}


## 总结
   
以上就是这道题目我的解答过程了，不知道大家是否理解了。本题主要在于对树这种数据结构的考察，以及深度优先遍历的使用，优化时可以采取空间换时间的策略。

有兴趣的话可以访问我的博客或者关注我的公众号、头条号，说不定会有意外的惊喜。

[https://death00.github.io/](https://death00.github.io/)

公众号：健程之道

![](https://static01.imgkr.com/temp/dcfad764025e4ea9ab8d73ef0934efdd.png)

![](https://static01.imgkr.com/temp/003e8c4d5f5544b0bd70dfe7aa63bf60.png)

力扣 1519——子树中标签相同的节点数

原题

你可能感兴趣的:(leetcode,dfs,算法)