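Usability is poor, so I did not use this approach for development.
On Ubuntu:
sudo apt-get install filezilla
On Windows, download the installer from the official website.
Methods 1 and 2 could not upload the image, and the environment created from the image uploaded with method 3 could not be entered. I suspect the image itself is the cause. You can skip this section and go straight to the next one, which creates the environment from one of the platform's public images.
The relevant image can be found by searching Docker Hub, but unfortunately the web page offers no download option.
The 4 KB Dockerfile stayed in the "uploading" state for a long time; perhaps the platform already starts building the image during this step?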
❯ sudo cat /etc/group | grep docker
docker:x:998:
❯ sudo usermod -aG docker wj
❯ sudo cat /etc/group | grep docker
❯ sudo chmod a+rw /var/run/docker.sock
❯ sudo systemctl restart docker
❯ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fcd0ba6349f8 hello-world "/hello" About an hour ago Exited (0) About an hour ago blissful_leavitt
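One thing worth knowing here: a group added with usermod -aG only takes effect in new login sessions. The following is a minimal sketch (the function name session_has_group is my own, not part of Docker) for checking whether the current session already carries the docker group:

```shell
# Return success if the CURRENT session's group list contains GROUP (default: docker).
# `id -nG` without a user argument reports the groups of the running process,
# which is what matters right after `usermod -aG`.
session_has_group() {
    group="${1:-docker}"
    id -nG | tr ' ' '\n' | grep -qx "$group"
}

if session_has_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```

If the check fails right after usermod, starting a new login session (or running newgrp docker) should be enough; no reboot is needed.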
To load a local image:
docker load < dockerimages.tar
To pull a third-party image from Docker Hub (the image page on Docker Hub shows the command):
docker pull dromni/sdfstudio:0.2.1
❯ sudo docker pull dromni/sdfstudio:0.2.1
0.2.1: Pulling from dromni/sdfstudio
677076032cca: Pulling fs layer
bc572704fd22: Pulling fs layer
82ca2dd0fe9d: Pulling fs layer
335006729f70: Pulling fs layer
1b9f8e302abf: Pulling fs layer
120deaf0783e: Pulling fs layer
f7b8d7bf559f: Pull complete
e62d0dcce85d: Pull complete
dd4b12c0cbdb: Pull complete
96670d94e1e8: Pull complete
bb10049f791d: Pull complete
9e965195e9d1: Pull complete
f1484bec286b: Pull complete
f1196e20290a: Pull complete
c541d97ea6d8: Pull complete
7f511c789668: Pull complete
737bd131d2c1: Pull complete
270a40ad75d6: Pull complete
f0c0226e364b: Pull complete
6f9fdc754fdc: Pull complete
4f4fb700ef54: Pull complete
30485e8f47b6: Pull complete
d1cb36d9c606: Pull complete
db7430713eb7: Pull complete
19a01bfd85d1: Pull complete
63a1d18dba4d: Pull complete
132d02095598: Pull complete
9bc9681eb426: Pull complete
94c3a9acdb3e: Pull complete
Digest: sha256:1823de016219880ac14dae0bb2d3ba71636802683c24fc60f94bb08b484423e9
Status: Downloaded newer image for dromni/sdfstudio:0.2.1
docker.io/dromni/sdfstudio:0.2.1
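The download succeeded; the image can be listed with docker images.
Edit the /etc/hosts file on the local machine and add an entry for registry.cluster.local whose IP is the address of the AI Max head node, for example: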
……
……
192.168.124.95 registry.cluster.local
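The hosts edit can also be scripted so that running it twice does not duplicate the line. This is only a sketch; ensure_hosts_entry is a hypothetical helper name, and the IP is the example address from this post:

```shell
# Append "IP HOST" to FILE unless a non-comment line already names HOST.
ensure_hosts_entry() {
    file="$1"
    ip="$2"
    host="$3"
    if awk -v h="$host" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    ' "$file" 2>/dev/null; then
        return 0    # entry already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$host" >> "$file"
}
```

From a root shell this would be called as: ensure_hosts_entry /etc/hosts 192.168.124.95 registry.cluster.local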
sudo mkdir -p /etc/docker/certs.d/registry.cluster.local
sudo wget -O /etc/docker/certs.d/registry.cluster.local/ca.crt http://192.168.124.95:5680/ca.crt
sudo docker tag myimage:v1.0 registry.cluster.local/user_username/myimage:v1.0
In user_username, replace only username with the user name you use to log in to the AI Max UI platform; user_ is a prefix and must not be removed.
For example:
sudo docker tag dromni/sdfstudio:0.2.1 registry.cluster.local/user_xxxx/sdfstudio:1.0
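Since the user_ prefix is easy to forget, the target reference can be assembled by a small helper. A sketch only; aimax_ref is a name I made up, and xxxx is the same placeholder user name as above:

```shell
# Build the private-registry reference: <registry>/user_<username>/<name:tag>.
aimax_ref() {
    username="$1"
    name_tag="$2"
    registry="${3:-registry.cluster.local}"
    printf '%s/user_%s/%s\n' "$registry" "$username" "$name_tag"
}

# Example with the placeholder user name from this post.
aimax_ref xxxx sdfstudio:1.0
```

This prints registry.cluster.local/user_xxxx/sdfstudio:1.0, so the tag step becomes: sudo docker tag dromni/sdfstudio:0.2.1 "$(aimax_ref xxxx sdfstudio:1.0)"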
On the "Private Images" page you can click to download the Docker registry credentials file.
The file contents are as follows:
registry.cluster.local
The user name and password for logging in are the ones in the pushImagesDoc.txt file above.
sudo docker login registry.cluster.local
Username: xxxx
Password: xxxxxxxx
The internet is a wonderful thing: I found the solution in another post.
Add "insecure-registries": ["https://registry.cluster.local"] and "default-runtime": "nvidia" to /etc/docker/daemon.json, so that the final daemon.json becomes:
{
"insecure-registries": ["https://registry.cluster.local"],
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
If daemon.json does not exist in that directory, create it yourself with the contents above.
Then run:
sudo systemctl daemon-reload
sudo systemctl restart docker
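A malformed daemon.json keeps the Docker daemon from starting, so it can be worth validating the file before the restart. A minimal sketch, assuming python3 is available on the machine; valid_json and the DAEMON_JSON variable are names I introduced for illustration:

```shell
# Return success if FILE parses as JSON (uses Python's standard json.tool module).
valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# DAEMON_JSON is a hypothetical override; the default is the path used in this post.
config="${DAEMON_JSON:-/etc/docker/daemon.json}"
if valid_json "$config"; then
    echo "$config: valid JSON"
else
    echo "$config: missing or not valid JSON, fix it before restarting docker"
fi
```

This only checks syntax (a trailing comma, for example); it does not check that the keys are ones Docker understands.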
$ sudo docker push registry.cluster.local/user_username/myimage:v1.0
sudo docker push registry.cluster.local/user_xxxx/sdfstudio:1.0
The push refers to repository [registry.cluster.local/user_xxxx/sdfstudio]
5f70bf18a086: Preparing
75c9930f04e3: Preparing
d607f5331dd0: Preparing
775fd1ca67da: Preparing
5c85fd87e7d2: Preparing
c2cc2815d350: Preparing
c261386e14e8: Waiting
8f5a7461deb4: Waiting
5668a06c4f00: Waiting
6f9a406a17ed: Waiting
32a3407bed0d: Waiting
dc1bbc4db2ec: Waiting
77b9d6e4b433: Waiting
6be54aac0530: Waiting
77f74632d268: Waiting
4d6a42904634: Waiting
d04569d95086: Waiting
c2ecd79d5a18: Waiting
bd889e83e652: Waiting
d2e28f4121e3: Waiting
3a12ac953428: Waiting
11df89f48870: Waiting
2106d7cd1026: Waiting
f403f5c5948a: Waiting
8f6106a133b8: Waiting
af561c199f2f: Waiting
ea83d1f80fca: Waiting
65abf0edb23d: Waiting
c5ff2d88f679: Waiting
denied: requested access to the resource is denied
It failed!
Let me try the hello-world image instead.
❯ sudo docker push registry.cluster.local/user_wuji/hello:v1.0
The push refers to repository [registry.cluster.local/user_wuji/hello]
01bb4fce3eb1: Preparing
denied: requested access to the resource is denied
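It still failed, which rules out the image as the cause, since the sdfstudio image is 22 GB while the hello-world image is only a few KB.
After thinking it over I found the cause: I had logged in without sudo. I logged in again, this time with sudo.
Pushing again, the transfer went through normally.
It took 18 minutes to finish, and the image now shows up under Private Images.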
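The likely explanation for the sudo issue: docker login stores its credentials in the config.json of whoever ran it (normally $HOME/.docker/config.json), so a login as my own user is invisible to a push run as root. Below is a rough sketch to see which config file has an entry for the registry; has_registry_login is a hypothetical helper, and it only greps for the registry name rather than parsing the JSON:

```shell
# Return success if CONFIG exists and mentions REGISTRY as a quoted key.
has_registry_login() {
    config="$1"
    registry="$2"
    [ -f "$config" ] && grep -q "\"$registry\"" "$config"
}

# Compare the current user's config with root's (the latter needs root to read).
for cfg in "$HOME/.docker/config.json" /root/.docker/config.json; do
    if has_registry_login "$cfg" registry.cluster.local; then
        echo "$cfg: has an entry for registry.cluster.local"
    else
        echo "$cfg: no entry (or not readable)"
    fi
done
```

The rule of thumb I take from this: run docker login and docker push as the same user, either both with sudo or both without.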
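Because the image is large, this preparation step also takes a long time.
Success. Barring surprises, it should be usable now.
Download and install MobaXterm, the free Portable edition (cracked versions circulate online, but they are unnecessary and a security risk).
You can either open a new Session, choose SSH and fill in the connection details, or type the ssh command directly in the terminal.
I tried one of AI Max's built-in images: no problem, it connects normally.
Not ready to give up, I tried another image from Docker Hub, and it still did not work. Most likely I will have to start from the platform's built-in images and build up from there step by step.
Under /opt/data/private you can see the private data you uploaded.
I am setting up the sdfstudio environment here, following the official tutorial directly; I also have notes from a previous deployment.
Below I only record the errors I ran into.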
Running conda activate sdfstudio produces the following error:
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run
$ conda init <SHELL_NAME>
Currently supported shells are:
- bash
- fish
- tcsh
- xonsh
- zsh
- powershell
See 'conda init --help' for more information and options.
IMPORTANT: You may need to close and restart your shell after running 'conda init'.
Run:
conda init bash
After pip install -e ., running ns-install-cli reports errors:
(sdfstudio) root@sdfstudio:/opt/data/private/sdfstudio# ns-install-cli
[17:52:23] .zshrc not found, skipping. install.py:212
Found .bashrc! install.py:214
[17:52:24] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-install-cli. install.py:124
❌ Completion script generation failed: ['ns-render-mesh', '--tyro-print-completion', 'bash'] install.py:109
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-render-mesh", line 5, in <module>
from scripts.render_mesh import entrypoint
File "/opt/data/private/sdfstudio/scripts/render_mesh.py", line 12, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
❌ Completion script generation failed: ['ns-eval', '--tyro-print-completion', 'bash'] install.py:109
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-eval", line 5, in <module>
from scripts.eval import entrypoint
File "/opt/data/private/sdfstudio/scripts/eval.py", line 11, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
✔ Updated completion at /opt/data/private/sdfstudio/scripts/completions/bash/_ns-dev-test! install.py:122
[17:52:25] ✔ Updated completion at /opt/data/private/sdfstudio/scripts/completions/bash/_ns-process-data! install.py:122
[17:52:26] ❌ Completion script generation failed: ['ns-train', '--tyro-print-completion', 'bash'] install.py:109
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-train", line 5, in <module>
from scripts.train import entrypoint
File "/opt/data/private/sdfstudio/scripts/train.py", line 48, in <module>
from nerfstudio.configs import base_config as cfg
File "/opt/data/private/sdfstudio/nerfstudio/configs/base_config.py", line 197, in <module>
from nerfstudio.pipelines.base_pipeline import VanillaPipelineConfig
File "/opt/data/private/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 41, in
<module>
from nerfstudio.data.datamanagers.base_datamanager import (
File "/opt/data/private/sdfstudio/nerfstudio/data/datamanagers/base_datamanager.py", line
35, in <module>
from nerfstudio.cameras.cameras import CameraType
File "/opt/data/private/sdfstudio/nerfstudio/cameras/cameras.py", line 24, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
❌ Completion script generation failed: ['ns-download-data', '--tyro-print-completion', install.py:109
'bash']
❌ Completion script generation failed: ['ns-extract-mesh', '--tyro-print-completion', 'bash'] install.py:109
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-download-data", line 5, in <module>
from scripts.downloads.download_data import entrypoint
File "/opt/data/private/sdfstudio/scripts/downloads/download_data.py", line 17, in <module>
from nerfstudio.configs.base_config import PrintableConfig
File "/opt/data/private/sdfstudio/nerfstudio/configs/base_config.py", line 197, in <module>
from nerfstudio.pipelines.base_pipeline import VanillaPipelineConfig
File "/opt/data/private/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 41, in
<module>
from nerfstudio.data.datamanagers.base_datamanager import (
File "/opt/data/private/sdfstudio/nerfstudio/data/datamanagers/base_datamanager.py", line
35, in <module>
from nerfstudio.cameras.cameras import CameraType
File "/opt/data/private/sdfstudio/nerfstudio/cameras/cameras.py", line 24, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-extract-mesh", line 5, in <module>
from scripts.extract_mesh import entrypoint
File "/opt/data/private/sdfstudio/scripts/extract_mesh.py", line 16, in <module>
from nerfstudio.utils.eval_utils import eval_setup
File "/opt/data/private/sdfstudio/nerfstudio/utils/eval_utils.py", line 30, in <module>
from nerfstudio.configs import base_config as cfg
File "/opt/data/private/sdfstudio/nerfstudio/configs/base_config.py", line 197, in <module>
from nerfstudio.pipelines.base_pipeline import VanillaPipelineConfig
File "/opt/data/private/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 41, in
<module>
from nerfstudio.data.datamanagers.base_datamanager import (
File "/opt/data/private/sdfstudio/nerfstudio/data/datamanagers/base_datamanager.py", line
35, in <module>
from nerfstudio.cameras.cameras import CameraType
File "/opt/data/private/sdfstudio/nerfstudio/cameras/cameras.py", line 24, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/opt/conda/envs/sdfstudio/bin/ns-install-cli", line 8, in <module>
sys.exit(entrypoint())
File "/opt/data/private/sdfstudio/scripts/completions/install.py", line 284, in entrypoint
tyro.cli(main, description=__doc__)
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/tyro/_cli.py", line 177, in cli
output = _cli_impl(
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/tyro/_cli.py", line 430, in _cli_impl
out, consumed_keywords = _calling.call_from_args(
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/tyro/_calling.py", line 204, in call_from_args
return unwrapped_f(*positional_args, **kwargs), consumed_keywords # type: ignore
File "/opt/data/private/sdfstudio/scripts/completions/install.py", line 253, in main
completion_paths = list(
File "/opt/conda/envs/sdfstudio/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/opt/conda/envs/sdfstudio/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/opt/conda/envs/sdfstudio/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/opt/conda/envs/sdfstudio/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/data/private/sdfstudio/scripts/completions/install.py", line 255, in <lambda>
lambda path_or_entrypoint_and_shell: _generate_completion(
File "/opt/data/private/sdfstudio/scripts/completions/install.py", line 114, in _generate_completion
raise e
File "/opt/data/private/sdfstudio/scripts/completions/install.py", line 101, in _generate_completion
new = subprocess.run(
File "/opt/conda/envs/sdfstudio/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ns-download-data', '--tyro-print-completion', 'bash']' returned non-zero exit status 1.
❌ Completion script generation failed: ['ns-render', '--tyro-print-completion', 'bash'] install.py:109
Traceback (most recent call last): install.py:113
File "/opt/conda/envs/sdfstudio/bin/ns-render", line 5, in <module>
from scripts.render import entrypoint
File "/opt/data/private/sdfstudio/scripts/render.py", line 27, in <module>
from nerfstudio.cameras.camera_paths import get_path_from_json, get_spiral_path
File "/opt/data/private/sdfstudio/nerfstudio/cameras/camera_paths.py", line 27, in <module>
from nerfstudio.cameras.cameras import Cameras
File "/opt/data/private/sdfstudio/nerfstudio/cameras/cameras.py", line 24, in <module>
import cv2
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 181, in
<module>
bootstrap()
File "/opt/conda/envs/sdfstudio/lib/python3.8/site-packages/cv2/__init__.py", line 153, in
bootstrap
native_module = importlib.import_module("cv2")
File "/opt/conda/envs/sdfstudio/lib/python3.8/importlib/__init__.py", line 127, in
import_module
return _bootstrap._gcd_import(name, package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Solution:
apt-get update && apt-get install libgl1
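To confirm that the missing library is really the problem (and that the install fixed it), the dynamic linker cache can be queried. A small sketch; has_shared_lib is a helper name of my own, and it assumes ldconfig is on the PATH, which is normally the case for root inside the container:

```shell
# Return success if the linker cache lists a library whose entry contains NAME.
# If ldconfig is not on PATH this reports "missing" even when the library exists.
has_shared_lib() {
    ldconfig -p 2>/dev/null | grep -qF "$1"
}

if has_shared_lib libGL.so.1; then
    echo "libGL.so.1 found"
else
    echo "libGL.so.1 missing: apt-get update && apt-get install libgl1"
fi
```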
icecream and cryptography also need to be installed:
pip install icecream
pip install cryptography
Running ns-install-cli again now succeeds.
(sdfstudio) root@sdfstudio:/opt/data/private/sdfstudio# ns-install-cli
[19:32:24] .zshrc not found, skipping. install.py:212
Found .bashrc! install.py:214
[19:32:25] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-dev-test. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-install-cli. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-process-data. install.py:124
[19:32:36] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-eval. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-download-data. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-extract-mesh. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-render-mesh. install.py:124
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-render. install.py:124
[19:32:38] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-train. install.py:124
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-eval. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-install-cli. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-train. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-extract-mesh. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-process-data. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-render. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-download-data. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-render-mesh. install.py:270
Deleted /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-dev-test. install.py:270
Completions installed to /root/.bashrc. Exciting! Open a new shell to try them out. install.py:186
All done!
A quick test first.
(sdfstudio) root@sdfstudio:/opt/data/private/sdfstudio# ns-train -h
usage: ns-train [-h]
{testsdf,bakedangelo,neuralangelo,bakedsdf,bakedsdf-mlp,neus-facto-angelo,neus-facto,neus-facto-bigmlp,geo-volsdf,monosdf,volsdf,geo-neus,mono-neus,neus,unisurf,mono-unisurf,geo-unisurf,dto,neusW,neus-acc,nerfacto,instant-ngp,mipnerf,semantic-nerfw,vanilla-nerf,tensorf,dnerf,phototourism}
Train a radiance field with nerfstudio. For real captures, we recommend using the nerfacto model.
Nerfstudio allows for customizing your training and eval configs from the CLI in a powerful way, but there
are some things to understand.
The most demonstrative and helpful example of the CLI structure is the difference in output between the
following commands:
ns-train -h
ns-train nerfacto -h nerfstudio-data
ns-train nerfacto nerfstudio-data -h
In each of these examples, the -h applies to the previous subcommand (ns-train, nerfacto, and
nerfstudio-data).
In the first example, we get the help menu for the ns-train script. In the second example, we get the help
menu for the nerfacto model. In the third example, we get the help menu for the nerfstudio-data dataparser.
With our scripts, your arguments will apply to the preceding subcommand in your command, and thus where you
put your arguments matters! Any optional arguments you discover from running
ns-train nerfacto -h nerfstudio-data
need to come directly after the nerfacto subcommand, since these optional arguments only belong to the
nerfacto subcommand:
ns-train nerfacto {nerfacto optional args} nerfstudio-data
╭─ arguments ───────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help show this help message and exit │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ subcommands ─────────────────────────────────────────────────────────────────────────────────────────────╮
│ {testsdf,bakedangelo,neuralangelo,bakedsdf,bakedsdf-mlp,neus-facto-angelo,neus-facto,neus-facto-bigmlp,g… │
│ testsdf Implementation of TestSDF │
│ bakedangelo Implementation of Neuralangelo with BakedSDF │
│ neuralangelo Implementation of Neuralangelo │
│ bakedsdf Implementation of BackedSDF with multi-res hash grids │
│ bakedsdf-mlp Implementation of BackedSDF with large MLPs │
│ neus-facto-angelo Implementation of Neuralangelo with neus-facto │
│ neus-facto Implementation of NeuS similar to nerfacto where proposal sampler is used. │
│ neus-facto-bigmlp NeuS-facto with big MLP, it is used in training heritage data with 8 gpus │
│ geo-volsdf Implementation of patch warping from GeoNeuS with VolSDF. │
│ monosdf Implementation of MonoSDF. │
│ volsdf Implementation of VolSDF. │
│ geo-neus Implementation of patch warping from GeoNeuS with NeuS. │
│ mono-neus Implementation of MonoSDF with NeuS rendering formulation. │
│ neus Implementation of NeuS. │
│ unisurf Implementation of UniSurf. │
│ mono-unisurf Implementation of MonoSDF with unisurf rendering formulation. │
│ geo-unisurf Implementation of patch warping from GeoNeuS with UniSurf. │
│ dto Occupancy field with density guided sampling │
│ neusW Implementation of Neural Reconstruction in the wild │
│ neus-acc Implementation of NeuS with empty space skipping. │
│ nerfacto Recommended real-time model tuned for real captures. This model will be continually │
│ updated. │
│ instant-ngp Implementation of Instant-NGP. Recommended real-time model for bounded synthetic │
│ data. │
│ mipnerf High quality model for bounded scenes. (slow) │
│ semantic-nerfw Predicts semantic segmentations and filters out transient objects. │
│ vanilla-nerf Original NeRF model. (slow) │
│ tensorf tensorf │
│ dnerf Dynamic-NeRF model. (slow) │
│ phototourism Uses the Phototourism data. │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯
No problems!
Hmm, bash does not show how long each command took, so after some thought I decided to install zsh after all.
apt install zsh
mv .oh-my-zsh ~/.oh-my-zsh
cp ~/.oh-my-zsh/templates/zshrc.zsh-template ~/.zshrc
chsh -s /bin/zsh
mv powerlevel10k ~/.oh-my-zsh/custom/themes
mkdir -p ~/.fonts
mv MesloLGS* ~/.fonts/
Open ~/.bashrc and look at the conda-related configuration:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/conda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/conda/etc/profile.d/conda.sh" ]; then
. "/opt/conda/etc/profile.d/conda.sh"
else
export PATH="/opt/conda/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
Copy it into ~/.zshrc:
vim ~/.zshrc
...
ZSH_THEME="powerlevel10k/powerlevel10k"
..
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/conda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/conda/etc/profile.d/conda.sh" ]; then
. "/opt/conda/etc/profile.d/conda.sh"
else
export PATH="/opt/conda/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
Reload ~/.zshrc:
source ~/.zshrc
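Instead of copying the conda block by hand in an editor, it can be cut out of ~/.bashrc with sed. This is just a sketch of what I did manually; extract_conda_block is a name I made up:

```shell
# Print the block that `conda init` manages in an rc file, markers included.
extract_conda_block() {
    sed -n '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</p' "$1"
}

# Usage (appends the block to ~/.zshrc; run it only once):
#   extract_conda_block ~/.bashrc >> ~/.zshrc
```

Note that the copied block still calls the 'shell.bash' hook. It worked for me under zsh, but since the conda error message earlier lists zsh as a supported shell, running conda init zsh is probably the cleaner route.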
After installing zsh, ns-install-cli has to be run once more:
/opt/da/p/sdfstudio ❯ ns-install-cli sdfstudio root@sdfstudio 21:51:55
[21:51:58] Found .zshrc! install.py:214
Found .bashrc! install.py:214
[21:51:59] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-install-cli. install.py:124
✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-install-cli! install.py:119
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-dev-test. install.py:124
✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-dev-test! install.py:119
[21:52:00] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-process-data. install.py:124
✔ Wrote new completion to install.py:119
/opt/data/private/sdfstudio/scripts/completions/zsh/_ns-process-data!
[21:52:55] ✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-download-data. install.py:124
✔ Wrote new completion to install.py:119
/opt/data/private/sdfstudio/scripts/completions/zsh/_ns-download-data!
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-eval. install.py:124
✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-eval! install.py:119
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-render. install.py:124
✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-render! install.py:119
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-extract-mesh. install.py:124
✔ Wrote new completion to install.py:119
/opt/data/private/sdfstudio/scripts/completions/zsh/_ns-extract-mesh!
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-render-mesh. install.py:124
✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-render-mesh! install.py:119
[21:52:57] ✔ Wrote new completion to /opt/data/private/sdfstudio/scripts/completions/zsh/_ns-train! install.py:119
✔ Nothing to do for /opt/data/private/sdfstudio/scripts/completions/bash/_ns-train. install.py:124
Completions installed to /root/.zshrc. Exciting! Open a new shell to try them out. install.py:186
Existing completions uninstalled from /root/.bashrc. install.py:180
Completions installed to /root/.bashrc. Ok! Open a new shell to try them out. install.py:186
All done!
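Done. Now the cluster hardware can be used for training.
The image then appears under Private Images, and new environments can be created from it.
With larger datasets the job gets killed (raising the memory limit did not fix this; could GPU memory or the parameters be the cause?).
After lowering the resolution, training runs normally.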
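A back-of-the-envelope estimate shows why resolution matters so much. This is only a sketch under assumptions of my own: that all images are held in memory as 3-channel float32 arrays (I have not verified that sdfstudio does this), and the image count and sizes below are made-up numbers:

```shell
# Rough size in MiB of N images of W x H pixels, 3 channels, 4 bytes per value.
image_mem_mib() {
    echo $(( $1 * $2 * $3 * 3 * 4 / 1024 / 1024 ))
}

image_mem_mib 100 1920 1080   # hypothetical: 100 images at full resolution
image_mem_mib 100 960 540     # the same images at half resolution per axis
```

This prints 2373 and 593: halving the resolution on each axis cuts the footprint to a quarter, which would fit with the job surviving at lower resolution.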
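I noticed that dropping the SSH connection terminates the job. Later I plan to train through the platform's job-training mode; interactive development probably requires staying connected the whole time (another computer could be left connected for this).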
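Something I have not tried on this platform yet is detaching the command from the terminal with nohup, the usual way to keep a process alive after an SSH session closes. A minimal sketch; run_detached is a hypothetical helper, and whether AI Max keeps the environment itself alive after a disconnect is an open question:

```shell
# Start a command immune to hangup, send its output to LOG, print its PID.
run_detached() {
    log="$1"
    shift
    nohup "$@" > "$log" 2>&1 &
    echo $!
}

# Usage (replace the command with the real training command):
#   run_detached train.log ns-train -h
#   tail -f train.log
```

tmux or screen would be an alternative if they are installed in the image.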