Google Cloud Storage Study Notes - 2023-05-30

2023 Summer Study

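  • How to work with data in Storage on GCP
    • What is GS
    • Accessing and manipulating bucket resources from a local machine
    • How the company bucket is used in practice
      • Keywords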
        • Optimizer
        • Model -- Input Part
        • Model -- Core Model -- Transformer

How to work with data in Storage on GCP

Link:
https://www.jianshu.com/p/137a7fcc0563

What is GS

Cloud Storage is a managed service for storing unstructured data. Store any amount of data and retrieve it as often as you like.
All data in Storage is stored in buckets.
Each bucket can hold files of any kind: images, documents, audio, and so on.

Accessing and manipulating bucket resources from a local machine

Copy a file from the bucket to the local machine

gsutil cp gs://bucket-name/folder/filename .

Delete a file from the bucket (this removes the object, not the bucket itself)

gsutil rm gs://bucket-name/folder/filename

List what is in the bucket

gsutil ls gs://bucket-name/folder/filename 
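The three commands above can also be driven from a script. The following is a minimal sketch, assuming gsutil is installed and authenticated on the local machine; the helper names gsutil_command and run_gsutil are hypothetical and only wrap the same cp, rm, and ls calls shown above.

```python
import shlex
import subprocess

# Only the three gsutil actions used in these notes are allowed here.
ALLOWED_ACTIONS = {"cp", "rm", "ls"}


def gsutil_command(action, *args):
    """Build the argument list for a gsutil call without running it."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported gsutil action: {action}")
    return ["gsutil", action, *args]


def run_gsutil(action, *args):
    """Run gsutil and return its stdout (needs gsutil installed and authenticated)."""
    result = subprocess.run(
        gsutil_command(action, *args),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Build (but do not run) the copy command from the notes.
copy_cmd = gsutil_command("cp", "gs://bucket-name/folder/filename", ".")
print(shlex.join(copy_cmd))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact command before anything is deleted with rm.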

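How the company bucket is used in practice

Path: gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00
The bucket itself is tf-model-recommendation-dev; this prefix holds everything stored on 2023-05-30.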

gsutil ls gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00

Output:

gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_postln_exp_jiana/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_deep_model/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_exp_jiana/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_two_layers/

These are the four models I ran on 05-30.
Pick one model to look at more closely,
and open one of that model's config files.
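To pick a model programmatically, the listing can be split into bucket, object path, and model name. This is a rough sketch using only the standard library; the LISTING text is copied from the gsutil ls output above, and the helper names split_gs_uri and model_names are hypothetical.

```python
from urllib.parse import urlparse

# Output of the gsutil ls call, copied from the notes.
LISTING = """\
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_postln_exp_jiana/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_deep_model/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_exp_jiana/
gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_two_layers/
"""


def split_gs_uri(uri):
    """Split a gs:// URI into (bucket name, object path without outer slashes)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs":
        raise ValueError(f"not a gs:// URI: {uri}")
    return parsed.netloc, parsed.path.strip("/")


def model_names(listing):
    """Return the last path component of every non-empty listing line."""
    names = []
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        _, object_path = split_gs_uri(line)
        names.append(object_path.rsplit("/", 1)[-1])
    return names


for name in model_names(LISTING):
    print(name)
```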

Keywords

Optimizer

Learning rate is 0.03
gradient_clipping_by_norm: 10.0
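As a rough illustration of what these two settings do together, here is a sketch of one update step with clipping by norm. The config does not say which optimizer is used or whether the norm is taken per tensor or globally, so plain gradient descent on a single flat gradient vector is an assumption, and the function names are hypothetical.

```python
import math

LEARNING_RATE = 0.03  # from the config above
CLIP_NORM = 10.0      # gradient_clipping_by_norm from the config above


def clip_by_norm(gradient, clip_norm):
    """Rescale the gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= clip_norm:
        return list(gradient)
    scale = clip_norm / norm
    return [g * scale for g in gradient]


def sgd_step(weights, gradient, learning_rate=LEARNING_RATE, clip_norm=CLIP_NORM):
    """One plain gradient-descent step on the clipped gradient (assumed optimizer)."""
    clipped = clip_by_norm(gradient, clip_norm)
    return [w - learning_rate * g for w, g in zip(weights, clipped)]


# A gradient with norm 50 is scaled down to norm 10 before the update.
clipped = clip_by_norm([30.0, 40.0], CLIP_NORM)
new_weights = sgd_step([1.0, 1.0], [30.0, 40.0])
print(clipped, new_weights)
```

Clipping keeps the direction of the gradient and only shrinks its length, so a single very large gradient cannot move the weights by more than learning rate times clip norm in one step.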

Difference between cold-start and warm-start

Model – Input Part

Input layer

  1. feature name --> event type
    list: "ADD_TO_CART"
    list: "ADD_TO_WISHLIST"
    list: "HOME"
    list: "ITEM_PAGE_VIEW"
    list: "LAND"
    list: "PAGE_VIEW"
    list: "PURCHASE"
    list: "SEARCH"
  2. feature name --> event time (event_secs_since_offset)
  3. feature name --> item ("item_id")
    Each item has the following keys:
    key: "categories"
    key: "discount_rate"
    key: "image_embedding_vec"
    key: "item_id"
    key: "price.amount"
    key: "sale_price.amount"
  4. feature name --> event context
    key: "local_day_of_month"
    key: "local_day_of_week"
    key: "local_hour_of_day"
Label

Impression

  1. input_type: SCORING
  2. impression_item_key: "impression_item_id"
  3. impression_label_key: "impression_label"
  4. impression_context:
    impression_local_day_of_week
    impression_local_hour_of_day
    inventory_id

NLP_config
key: "search_query"
key: "title"

impression_interaction_config
key: "impression/count"
key: "impression/most_recent_secs_ago"
key: "item_page_view/count"
key: "item_page_view/most_recent_secs_ago"
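Judging by the key names, each interaction feature pairs a count of past events with the time since the latest one. Here is a minimal sketch of that reading; the function name interaction_features and the choice to omit most_recent_secs_ago when there is no past event are assumptions, not something the config states.

```python
def interaction_features(event_name, event_times, now):
    """Count past events and measure seconds since the most recent one.

    event_times and now are timestamps in seconds on the same clock.
    """
    past = [t for t in event_times if t <= now]
    features = {f"{event_name}/count": len(past)}
    if past:
        features[f"{event_name}/most_recent_secs_ago"] = now - max(past)
    return features


# Three impressions of an item, the latest one 60 seconds before scoring time.
print(interaction_features("impression", [100, 250, 400], now=460))
```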

Model – Core Model – Transformer

  num_layers: 2
  tde_boundaries: "5s"
  tde_boundaries: "30s"
  tde_boundaries: "1m"
  tde_boundaries: "5m"
  tde_boundaries: "15m"
  tde_boundaries: "30m"
  tde_boundaries: "1h"
  tde_boundaries: "3h"
  tde_boundaries: "6h"
  tde_boundaries: "12h"
  tde_boundaries: "1d"
  tde_boundaries: "2d"
  tde_boundaries: "3d"
  tde_boundaries: "4d"
  tde_boundaries: "5d"
  tde_boundaries: "6d"
  tde_boundaries: "7d"
  tde_boundaries: "10d"
  tde_boundaries: "14d"
  tde_boundaries: "21d"
  tde_boundaries: "28d"
  drop_rate: 0.1
  epsilon: 1e-06
  d_hidden: 160
  attention_config {
    multi_head_attention {
      num_heads: 4
    }
  }
}
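The tde_boundaries values read like bucket edges for the time gap between events. As a sketch under that assumption, the code below converts the 21 boundary strings to seconds and maps a time delta to one of 22 bucket indices; whether a value equal to a boundary falls into the upper bucket is a guess, and the helper names are hypothetical.

```python
import bisect

# Boundary strings copied from the config above.
TDE_BOUNDARIES = [
    "5s", "30s", "1m", "5m", "15m", "30m", "1h", "3h", "6h", "12h",
    "1d", "2d", "3d", "4d", "5d", "6d", "7d", "10d", "14d", "21d", "28d",
]
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(text):
    """Convert a string such as '15m' into a number of seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


BOUNDARY_SECONDS = [to_seconds(b) for b in TDE_BOUNDARIES]


def bucket_index(delta_secs):
    """Return the bucket (0..21) that a time delta in seconds falls into."""
    return bisect.bisect_right(BOUNDARY_SECONDS, delta_secs)


# A gap of 90 seconds lies between the "1m" and "5m" boundaries.
print(bucket_index(90))
```

The boundaries are dense for short gaps and sparse for long ones, so recent events are told apart at a finer resolution than events from weeks ago.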

top_k: 200
model_dir: "gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_two_layers"
meta_data_dir: "gs://tf-model-recommendation-dev/bucketplace/2023-05-30/00/v15_1_s9_preln_two_layers/metadata"

Model parameters for impressions:

    impression_config {
      hidden_layer_size: 96
      top_hidden_layer: true
      loss_multiplier: 0.01
      query_context_activation: "relu"
      placement_bias {
        feature_config {
          key: "inventory_id_x_index"
          value {
            embedding_column {
              dimension: 32
              sequence {
                hash_bucket {
                  num_buckets: 100000
                }
              }
            }
          }
        }
      }
    }
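The placement_bias feature hashes inventory_id_x_index into 100000 buckets and looks up a 32-dimensional embedding for each bucket. The sketch below illustrates the hashing step only; MD5 is a stand-in chosen for reproducibility, because the config does not say which hash function the real pipeline uses, and the example key string is made up.

```python
import hashlib

NUM_BUCKETS = 100_000  # num_buckets from the config above
EMBEDDING_DIM = 32     # dimension from the config above


def hash_bucket(value, num_buckets=NUM_BUCKETS):
    """Map a string to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Size of the embedding table implied by the two config values.
table_parameters = NUM_BUCKETS * EMBEDDING_DIM

# "home_feed_x_3" is a hypothetical inventory_id_x_index value.
print(hash_bucket("home_feed_x_3"), table_parameters)
```

Hashing avoids keeping a vocabulary file, at the cost of occasional collisions where two different keys share one embedding row.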

Parameters for the CVR and CTR models:

    conversion_config {
      hidden_layer_size: 64
      loss_multiplier: 0.1
      hidden_layer_activation: "relu"
    }
    click_config {
      hidden_layer_size: 32
      top_hidden_layer: true
      loss_multiplier: 0.01
      query_context_activation: "relu"
    }
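Each of the three heads carries a loss_multiplier, which suggests that the per-task losses are scaled before being added to the training objective. The sketch below shows that weighted sum; how the real model combines these terms, and whether other loss terms exist, is not visible in the config, so treat this as an assumption.

```python
import math

# loss_multiplier values copied from impression_config, conversion_config, click_config.
LOSS_MULTIPLIERS = {"impression": 0.01, "conversion": 0.1, "click": 0.01}


def combined_loss(task_losses, multipliers=LOSS_MULTIPLIERS):
    """Weighted sum of per-task losses (assumed combination rule)."""
    return sum(multipliers[name] * loss for name, loss in task_losses.items())


# Hypothetical per-task loss values, only to show the arithmetic.
total = combined_loss({"impression": 2.0, "conversion": 1.0, "click": 3.0})
print(total)
```

Under this reading the conversion loss is weighted ten times more heavily than the impression and click losses.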
