Installing and configuring conda, JupyterLab, and their plugins

Jupyter environment installation and configuration

    • 1. conda installation
    • 2. Create a virtual environment
    • 3. JupyterLab configuration
    • 4. JupyterLab plugin configuration
      • 1. Scala kernel
      • 2. Spark kernel
    • 5. Launch


1. conda installation

Reference: https://blog.csdn.net/LJX_ahut/article/details/114282900
conda download mirror in China: https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/

bash Anaconda3-2021.05-Linux-x86_64.sh -p /opt/module/anaconda_202105_install_env
source ~/.bashrc

Add a domestic mirror channel:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
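
A quick way to confirm the install and the channel change took effect (standard conda commands, nothing beyond what was installed above):

conda --version
conda config --show channels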

2. Create a virtual environment
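
One way to do this step, sketched roughly (the environment name jupyter-env and the Python 3.8 pin are placeholders, adjust to taste):

conda create -n jupyter-env python=3.8
conda activate jupyter-env
# register the environment as a kernel so JupyterLab can see it
conda install -y ipykernel
python -m ipykernel install --user --name jupyter-env --display-name "Python (jupyter-env)"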

3. JupyterLab configuration

Reference: https://www.cnblogs.com/liuxiaomo/p/13164530.html
conda already ships with JupyterLab.
Generate the config file: jupyter lab --generate-config
Generate the password hash:

from jupyter_server.auth import passwd
passwd()
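
The same hash can also be produced non-interactively with a one-liner (replace your-password with your own; this assumes the conda environment that ships JupyterLab is active):

python -c "from jupyter_server.auth import passwd; print(passwd('your-password'))"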

Edit the config file: vim /root/.jupyter/jupyter_lab_config.py

c.ServerApp.ip = '*'
c.ServerApp.password = '<hash generated above>'
c.ExtensionApp.open_browser = False
c.ServerApp.port = 8889
c.ServerApp.allow_remote_access = True
c.ServerApp.root_dir = '/home/jupyter-work-dir'

4. JupyterLab plugin configuration

Reference: https://blog.csdn.net/moledyzhang/article/details/78850820

1. Scala kernel

Download the jupyter-scala-cli 2.11 kernel package: https://oss.sonatype.org/content/repositories/snapshots/com/github/alexarchambault/jupyter/jupyter-scala-cli_2.11.6/0.2.0-SNAPSHOT/
Install it:

tar xvf jupyter-scala_2.11.6-0.2.0-SNAPSHOT.tar.xz -C /opt/module/
bash /opt/module/jupyter-scala_2.11.6-0.2.0-SNAPSHOT/bin/jupyter-scala

2. Spark kernel

sbt and Docker have to be installed first (quite a lot of prep work before the main act).

git clone https://github.com/apache/incubator-toree.git
cd incubator-toree/

Edit the Makefile and set the Spark version to:

APACHE_SPARK_VERSION?=2.4.5

make build  # point sbt at a domestic mirror first, otherwise this is very slow (see the mirror sketch below)
make dist   # point Docker at a domestic registry mirror first, otherwise this is very slow
cd dist/toree/bin/
ls
pwd  # note this path, it goes into kernel.json below
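
For the mirrors mentioned in the comments above, one possible setup is sketched here; the Aliyun and Docker mirror URLs are only examples, not from the original, so substitute whichever mirrors you prefer:

cat > ~/.sbt/repositories <<'EOF'
[repositories]
  local
  aliyun: https://maven.aliyun.com/repository/public
  maven-central
EOF

cat > /etc/docker/daemon.json <<'EOF'
{ "registry-mirrors": ["https://registry.docker-cn.com"] }
EOF
systemctl restart docker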

Create a directory named spark under /root/.ipython/kernels and add a file kernel.json with the following content (adjust the run.sh path to the pwd output above, tune the options in SPARK_OPTS as needed, and point SPARK_HOME and PYTHONPATH at your own Spark install):

{
  "display_name": "Spark 2.4.5 (Scala 2.12.12)",
  "language_info": {"name": "scala"},
  "argv": [
    "/opt/module/incubator-toree-master/dist/toree/bin/run.sh",
    "--profile",
    "{connection_file}"
  ],
  "codemirror_mode": "scala",
  "env": {
    "SPARK_OPTS": "--master=local[2] --conf spark.sql.catalogImplementation=hive --driver-java-options=-Xms1024M --driver-java-options=-Xmx4096M --driver-java-options=-Dlog4j.logLevel=info",
    "MAX_INTERPRETER_THREADS": "16",
    "CAPTURE_STANDARD_OUT": "true",
    "CAPTURE_STANDARD_ERR": "true",
    "SEND_EMPTY_OUTPUT": "false",
    "SPARK_HOME": "/opt/module/spark-2.4.5",
    "PYTHONPATH": "/opt/module/spark-2.4.5/python:/opt/module/spark-2.4.5/python/lib/py4j-0.10.7-src.zip"
  }
}
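
In shell form, that step amounts to roughly:

mkdir -p /root/.ipython/kernels/spark
vim /root/.ipython/kernels/spark/kernel.json   # paste the JSON above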

List the installed kernels:

jupyter kernelspec list

You can follow job progress in the Spark web UI (the same one spark-shell exposes): http://server:4040/jobs/

5. Launch

jupyter lab --allow-root
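
To keep it running after the terminal closes, a common variant is to background it and log to a file (the log path here is arbitrary):

nohup jupyter lab --allow-root > /root/jupyter-lab.log 2>&1 &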
