Figuring out what the math formulas in a paper are actually doing: ChatPDF

Asking ChatPDF for help

ChatPDF can be an "artificial idiot" at times, but used well it is a very good assistant.

  • Spatial Consistency Loss for Training Multi-Label Classifiers from Single-Label Annotations
    [Image 1]
    I could not understand what $H^t_n$ means, so I asked it like this:
    [Image 2]
    I refer to it as "big H".

  • Learning Deep Features for Discriminative Localization
    (1) I asked it: explain the big F and big H, and what is the score heatmap?
    It answered:
    In the context of the article “Learning Deep Features for Discriminative Localization,” the terms “big F” and “big H” are used to represent important concepts related to class activation mapping and discriminative localization.

  1. Big F:

    • Big F refers to the result of performing global average pooling on the feature maps of the last convolutional layer. It is represented as $F_k = \sum_{x,y} f_k(x, y)$, where $f_k(x, y)$ represents the activation of unit k in the last convolutional layer at spatial location (x, y).
    • The big F values are then used to compute the class activation maps, which indicate the importance of the activation at spatial locations for the classification of an image to a specific class.
  2. Big H (Class Activation Map):

    • Big H, also known as the class activation map (CAM), is generated using the weighted sum of the feature maps of the last convolutional layer. It is represented as
      $M_c(x, y) = \sum_k w^c_k f_k(x, y)$, where $w^c_k$ is the weight corresponding to class c for unit k.
    • The class activation map (big H) highlights the discriminative regions of the image for a particular class, providing a visual representation of the areas that contribute most to the classification decision.
  3. Score Heatmap:

    • The score heatmap is a visual representation of the class activation map (big H) superimposed on the input image. It highlights the regions of the input image that are most relevant to the particular category or class, based on the learned discriminative features of the convolutional neural network.
    • By upsampling the class activation map to the size of the input image, the score heatmap can be generated, allowing for the identification of the image regions most relevant to a specific category.

In summary, big F represents the result of global average pooling on the feature maps, big H (class activation map) highlights discriminative regions, and the score heatmap visually represents the activated regions on the input image for a specific class. These concepts are fundamental to the proposed approach for discriminative localization using convolutional neural networks with global average pooling.
(I touched up the formulas above a little; its original output did not render this nicely.)
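The three quantities in the answer can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's code: random arrays stand in for a real network's activations and classifier weights, the sizes (2048 units, 7x7 maps, 10 classes, 224x224 input) are made up, and nearest-neighbour upsampling is just one simple choice. The class score `S` follows the CAM paper's formulation $S_c = \sum_k w^c_k F_k$, which the answer above does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes, for illustration only: K units, H x W maps, C classes.
K, H, W, C = 2048, 7, 7, 10

f = rng.random((K, H, W))          # small f: f[k, x, y] = f_k(x, y)
w = rng.standard_normal((C, K))    # w[c, k] = w_k^c (random stand-in weights)

# Big F: F_k = sum over (x, y) of f_k(x, y), one number per unit.
F = f.sum(axis=(1, 2))             # shape (K,)

# Class scores: S_c = sum over k of w_k^c * F_k.
S = w @ F                          # shape (C,)
c = int(S.argmax())                # pick the top-scoring class

# Class activation map: M_c(x, y) = sum over k of w_k^c * f_k(x, y).
M_c = np.tensordot(w[c], f, axes=1)   # shape (H, W)

# Summing the map over all locations recovers the class score.
assert np.isclose(M_c.sum(), S[c])

# Score heatmap: upsample the map to the (assumed) 224 x 224 input size
# by repeating each cell, i.e. nearest-neighbour upsampling.
scale = 224 // H
heatmap = np.kron(M_c, np.ones((scale, scale)))   # shape (224, 224)

print(F.shape, M_c.shape, heatmap.shape)
```

The assertion in the middle is the useful part: it shows that the map and the score are the same weighted sum, just with the spatial sum taken last instead of first.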

Excellent, that solved my problem. It mostly comes down to how you phrase the question.
I then asked it:
(2) what the shape of small f, tell me something like (1, 2048, 7, 7), please
The shape of the small f, representing the activation of a unit in the last convolutional layer at a spatial location, is (k, 7, 7), where k is the number of units in the last convolutional layer. This indicates that for each unit, the activation is represented by a 7x7 feature map.

(3) why the k of shape (k, 7, 7) is the number of units in the last convolutional layer ?
The shape of small f, representing the activation of a unit in the last convolutional layer at a spatial location, is (k, 7, 7), where k is the number of units in the last convolutional layer.

The reason for this is that the last convolutional layer of a convolutional neural network typically consists of multiple units, each of which is responsible for detecting a specific feature or pattern in the input image. These units are arranged in a 3D grid, with each unit having a 2D feature map representing its activations at different spatial locations in the input image.

In the case of the proposed approach for discriminative localization using global average pooling, the activations of these units are used to compute the class activation maps, which indicate the importance of the activation at spatial locations for the classification of an image to a specific class. Therefore, the number of units in the last convolutional layer determines the number of feature maps in the activation tensor, resulting in the shape of (k, 7, 7) for small f.
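To connect the (k, 7, 7) in the answer with the (1, 2048, 7, 7) in my question, here is a small shape check. It assumes a batch-first layout (N, K, H, W) and uses a zero array as a placeholder for real activations; the indices 5, 3, 4 are arbitrary.

```python
import numpy as np

# Placeholder activations in an assumed (N, K, H, W) layout:
# 1 image, 2048 units, 7 x 7 spatial grid.
batch = np.zeros((1, 2048, 7, 7))

f = batch[0]            # drop the batch axis: shape (2048, 7, 7), i.e. (k, 7, 7)
num_units = f.shape[0]  # k = number of units (channels) in the last conv layer

f_k = f[5]              # the 7 x 7 feature map of one unit
value = f[5, 3, 4]      # f_k(x, y): a single activation of that unit

print(f.shape, num_units, f_k.shape, value.shape)
```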

Check the supplementary material

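I mean files like Spatial_Consistency_Loss_WACV_2023_supplemental.pdf; "supplemental" is sometimes abbreviated in the filename. Sometimes the appendix is placed right after the main text. Sometimes the version you downloaded has no appendix even though the paper clearly refers to one, and then you have to track it down yourself.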
