To install the LLaMA large language model on a MacBook, I mainly followed Andrew's blog post at https://agi-sphere.com/install-llama-mac/#Step_1_Install_Homebrew, which has excellent illustrations. You need Python 3.10.x and PyTorch for Mac installed beforehand: install Python 3.10.x via pyenv, and install PyTorch for Mac by grabbing the latest release from the official site. With that in place, following the blog's steps you will hit two snags: 1) the model download links are dead, and 2) one Python extension library has to be recompiled for Apple Silicon.
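Before following the blog, it is worth sanity-checking the toolchain it assumes. A minimal sketch (the pyenv version number is an assumption; any Python 3.10.x should do):

```shell
# Confirm the interpreter and CPU architecture before starting.
python3 --version                                        # want Python 3.10.x
python3 -c 'import platform; print(platform.machine())'  # want arm64 on Apple Silicon
# If either is wrong, pyenv can supply a native 3.10, e.g.:
#   brew install pyenv && pyenv install 3.10.11 && pyenv global 3.10.11
# PyTorch for Apple Silicon installs from the official wheels:
#   pip install torch
```

An x86_64 result from the second line means the shell is running under Rosetta, which is exactly what triggers the architecture error later in the install.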
You can apply to Meta for access, but approval currently seems very slow. Fortunately the community has mirrors, so the weights can be downloaded from third parties. Four sizes are available: 7B, 13B, 30B, and 65B parameters. The download commands are listed below:
wget https://agi.gpt4.org/llama/LLaMA/tokenizer.model -O ./tokenizer.model
wget https://agi.gpt4.org/llama/LLaMA/tokenizer_checklist.chk -O ./tokenizer_checklist.chk
wget https://agi.gpt4.org/llama/LLaMA/7B/consolidated.00.pth -O ./7B/consolidated.00.pth
wget https://agi.gpt4.org/llama/LLaMA/7B/params.json -O ./7B/params.json
wget https://agi.gpt4.org/llama/LLaMA/7B/checklist.chk -O ./7B/checklist.chk
wget https://agi.gpt4.org/llama/LLaMA/13B/consolidated.00.pth -O ./13B/consolidated.00.pth
wget https://agi.gpt4.org/llama/LLaMA/13B/consolidated.01.pth -O ./13B/consolidated.01.pth
wget https://agi.gpt4.org/llama/LLaMA/13B/params.json -O ./13B/params.json
wget https://agi.gpt4.org/llama/LLaMA/13B/checklist.chk -O ./13B/checklist.chk
wget https://agi.gpt4.org/llama/LLaMA/30B/consolidated.00.pth -O ./30B/consolidated.00.pth
wget https://agi.gpt4.org/llama/LLaMA/30B/consolidated.01.pth -O ./30B/consolidated.01.pth
wget https://agi.gpt4.org/llama/LLaMA/30B/consolidated.02.pth -O ./30B/consolidated.02.pth
wget https://agi.gpt4.org/llama/LLaMA/30B/consolidated.03.pth -O ./30B/consolidated.03.pth
wget https://agi.gpt4.org/llama/LLaMA/30B/params.json -O ./30B/params.json
wget https://agi.gpt4.org/llama/LLaMA/30B/checklist.chk -O ./30B/checklist.chk
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.00.pth -O ./65B/consolidated.00.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.01.pth -O ./65B/consolidated.01.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.02.pth -O ./65B/consolidated.02.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.03.pth -O ./65B/consolidated.03.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.04.pth -O ./65B/consolidated.04.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.05.pth -O ./65B/consolidated.05.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.06.pth -O ./65B/consolidated.06.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/consolidated.07.pth -O ./65B/consolidated.07.pth
wget https://agi.gpt4.org/llama/LLaMA/65B/params.json -O ./65B/params.json
wget https://agi.gpt4.org/llama/LLaMA/65B/checklist.chk -O ./65B/checklist.chk
(Download list via https://blog.csdn.net/u014297502/article/details/129829677)
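Since these are third-party mirrors, it is worth verifying each shard against its checklist.chk before use. A minimal sketch, assuming the checklist lines have the `<md5>  <filename>` layout used in Meta's release:

```shell
# Verify downloaded shards against a checklist.chk file.
md5_of() {
  # macOS ships `md5`, Linux ships `md5sum`; use whichever exists.
  if command -v md5sum >/dev/null 2>&1; then md5sum "$1" | awk '{print $1}'
  else md5 -q "$1"
  fi
}

verify_checklist() {
  # $1 = path to a checklist.chk; file names inside are relative to its folder.
  dir=$(dirname "$1")
  while read -r want name; do
    [ -z "$name" ] && continue
    if [ "$(md5_of "$dir/$name")" = "$want" ]; then
      echo "OK $name"
    else
      echo "FAIL $name"
    fi
  done < "$1"
}
```

For example, `verify_checklist ./7B/checklist.chk` after the 7B downloads finish; any FAIL line means that shard should be re-downloaded.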
With the weights in place, the install will still complain about a few missing packages; just install whatever is missing. One error, however, needs more than a `pip install`: `cannot import name '_itree' from partially initialized module 'itree' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e)))` — a prebuilt x86_64 binary on an arm64 machine. The full traceback:
python3 -m llama.download
Traceback (most recent call last):
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/itree/__init__.py", line 5, in <module>
from . import _itree
ImportError: cannot import name '_itree' from partially initialized module 'itree' (most likely due to a circular import) (/Users/paulo/Library/Python/3.9/lib/python/site-packages/itree/__init__.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 188, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 111, in _get_module_details
__import__(pkg_name)
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/llama/__init__.py", line 4, in <module>
from .model_single import ModelArgs, Transformer
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/llama/model_single.py", line 8, in <module>
import hiq
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/hiq/__init__.py", line 57, in <module>
from .tree import (
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/hiq/tree.py", line 9, in <module>
import itree
File "/Users/paulo/Library/Python/3.9/lib/python/site-packages/itree/__init__.py", line 7, in <module>
import _itree
ImportError: dlopen(/Users/paulo/Library/Python/3.9/lib/python/site-packages/_itree.cpython-39-darwin.so, 0x0002): tried: '/Users/paulo/Library/Python/3.9/lib/python/site-packages/_itree.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e)))
The fix is to uninstall the prebuilt x86_64 wheel and reinstall itree from source so it compiles natively for arm64:
pip uninstall py-itree
pip install https://github.com/juncongmoo/itree/archive/refs/tags/v0.0.18.tar.gz
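After the reinstall, you can confirm that the compiled extension Python actually loads now matches your CPU. A small helper sketch (the `file` output format varies by platform; adjust as needed):

```shell
# Print the on-disk binary type of a compiled Python extension module.
check_ext_arch() {
  # Resolve the module's installed path via importlib, then inspect it.
  so=$(python3 -c 'import importlib.util, sys; print(importlib.util.find_spec(sys.argv[1]).origin)' "$1")
  file "$so"
}
```

On Apple Silicon, `check_ext_arch _itree` should now report arm64 rather than x86_64.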
Bingo: a rather dim-witted 7B-parameter LLaMA model now runs on the MacBook Pro. For noticeably smarter output you would need the 65B model, at the cost of far more memory. A sample interactive session:
conda_llm@Mac% ./examples/chat.sh
main: build = 801 (3e08ae9)
main: seed = 1688828046
llama.cpp: loading model from ./pyllama_data/7B/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: mem required = 5439.94 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size = 256.00 MB
system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
Reverse prompt: 'User:'
sampling: repeat_last_n = 64, repeat_penalty = 1.000000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 48
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: Who is the president of France?
Bob: Nicolas Sarkozy.
User: Who am I?
Bob: You are a User.
User: Where am I?
Bob: You are in your home in Los Angeles, California.
User: Where are you?
Bob: I am in my home in Seattle, Washington.
User: How you doing?
Bob: I am doing fine, thank you.