LLM-Intro to Large Language Models

LLM

Some LLMs do not release their model architecture or weights to users.

What is an LLM?

Llama 2 70B model

  • 2 files

    • parameters file
      • the parameters (weights) of the neural network
      • each parameter is stored as a 2-byte float, so 70B parameters take about 140 GB
    • code that runs the parameters (inference)
      • can be written in C, Python, etc.
      • in C, about 500 lines of code with no dependencies
      • a self-contained package (no network connection needed)
  • how are the parameters obtained?

    • by lossy compression of a large chunk of text (about 10 TB), using about 6,000 GPUs for 12 days (cost about $2M), into a file of about 140 GB, similar to a zip file (a gestalt of the text, stored in the weights)
  • what the neural network does is try to predict the next word in a sequence. The parameters are dispersed throughout the network; the neurons are connected to each other and fire in a certain way

  • prediction is closely related to compression

  • an LLM produces text in the correct form and fills it in with its knowledge. It does not reproduce a copy of the text it was trained on.

  • how does it work?

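The next-word prediction idea in the notes above can be illustrated with a toy sketch. This is not how an LLM is implemented: it is a simple bigram counter, assumed here only to show what "predict the next word in a sequence" means. The function names and the tiny corpus are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, length):
    """Repeatedly feed the last predicted word back in to extend the sequence."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Hypothetical toy corpus; a real model is trained on terabytes of text.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram(corpus)
print(predict_next(model, "the"))   # cat
print(generate(model, "on", 2))     # on the cat
```

The counts table plays the role of the "parameters" here: it is a lossy summary of the training text, and generation only ever consults that summary, never the text itself.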

training stage

  • pre-training

    • expensive
    • produces a base model, which is a document generator
    • it’s about knowledge
    • trained on internet documents
  • fine tuning

    • cheaper
    • produces an assistant model
    • it’s about alignment
    • trained on Q&A documents
    • train on high-quality conversations (questions and answers). Write labeling instructions that specify how the assistant should behave
    • focus on quality, not quantity
  • stage 3(optional)

    • uses comparison labels
    • reinforcement learning from human feedback (RLHF)

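The notes do not say how comparison labels are turned into a training signal. One common formulation, assumed here for illustration, scores each answer with a reward and penalizes the model when the human-preferred answer does not score higher. The function name and the example reward values are hypothetical.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    It is small when the preferred answer gets the higher reward
    and large when the ranking is the wrong way round.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) is equal to log(1 + exp(-d))
    return math.log1p(math.exp(-diff))

# Hypothetical rewards for two answers that a human labeler compared.
print(preference_loss(2.0, 0.0))  # ranking agrees with the label: low loss
print(preference_loss(0.0, 2.0))  # ranking disagrees with the label: high loss
```

Comparing two answers is usually easier for a labeler than writing a good answer from scratch, which is why this stage uses comparisons.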

  • labeling is a human-machine collaboration


  • LLM rankings


LLM scaling laws:

  • performance improves predictably with more parameters (N) and more training text (D)

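The scaling-law claim can be made concrete with a small sketch. The notes give no formula, so the power-law form below is an assumption, and the constants are illustrative placeholders rather than fitted values. N is the number of parameters and D is the number of training tokens.

```python
def predicted_loss(n_params, n_tokens,
                   e=1.7, a=400.0, b=400.0, alpha=0.3, beta=0.3):
    """Assumed form: loss = e + a / N**alpha + b / D**beta.

    e is a floor that no amount of scale removes; the other two terms
    shrink as parameters (N) and training tokens (D) grow.
    All constants are placeholders chosen for illustration.
    """
    return e + a / n_params ** alpha + b / n_tokens ** beta

# Growing either N or D lowers the predicted loss.
for n, d in [(7e9, 1e12), (70e9, 1e12), (70e9, 2e12)]:
    print(f"N={n:.0e}  D={d:.0e}  loss={predicted_loss(n, d):.3f}")
```

The practical point is that the curve is smooth, so the effect of a bigger model or more data can be estimated before paying for the training run.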

  • tool use and multimodality. Some LLMs, such as ChatGPT, can now use tools to help answer questions: a browser, a calculator, a Python interpreter.

  • future directions of LLM development

give LLMs System 2 ability


  • LLMs currently have only System 1 (instinctive) thinking
  • goal: convert time into accuracy, so that a model given more time to think produces a more accurate answer
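One simple way to picture "convert time into accuracy" is to sample several answers and take a majority vote. This is an illustrative assumption, not the method described in the talk. The `noisy_solver` function below is a hypothetical stand-in for a model that is right only 60% of the time.

```python
import random
from collections import Counter

def noisy_solver(rng, correct="42", p_correct=0.6):
    """Hypothetical stand-in for one fast, instinctive model answer."""
    if rng.random() < p_correct:
        return correct
    return rng.choice(["41", "43", "24"])  # assorted wrong answers

def majority_vote(answers):
    """Return the most common answer among the samples."""
    return Counter(answers).most_common(1)[0][0]

def accuracy(n_samples, trials=500, seed=0):
    """Fraction of trials in which voting over n_samples answers is correct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        answers = [noisy_solver(rng) for _ in range(n_samples)]
        hits += majority_vote(answers) == "42"
    return hits / trials

# More samples means more compute time and a more reliable final answer.
for n in (1, 5, 15):
    print(n, accuracy(n))
```

Each extra sample costs more inference time, so the number of samples is a direct knob for trading time against accuracy.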

self-improvement


  • in narrow domains, self-improvement is possible

customization

LLMs customized as experts in specific domains

future of LLM
