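Flink command-line submission parameters:
1 Required parameter:
-n,--container Number of YARN containers to allocate (= number of TaskManagers)
2 Optional parameters:
-D Dynamic properties
-d,--detached Run in detached mode
-jm,--jobManagerMemory Memory for the JobManager [in MB]
-nm,--name Set a custom name for the application on YARN
-q,--query Display the resources available in YARN (memory, CPU cores)
-qu,--queue Specify the YARN queue
-s,--slots Number of slots per TaskManager
-tm,--taskManagerMemory Memory per TaskManager [in MB]
-z,--zookeeperNamespace Namespace to create in ZooKeeper for HA mode
-id,--applicationId Application ID on the YARN cluster; attaches to a YARN session that is already running in the background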
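As a rough illustration of how the options above combine, the sketch below assembles a session start command and prints it instead of running it. It assumes these options belong to the ./bin/yarn-session.sh script; the memory sizes, slot count and the application name demo-session are made-up placeholder values, not recommendations.

```shell
# Dry run: build a yarn-session.sh command line and print it for review.
# Every value here is an illustrative placeholder.
containers=2            # -n  : number of YARN containers (= TaskManagers)
jm_memory_mb=1024       # -jm : JobManager memory in MB
tm_memory_mb=2048       # -tm : memory per TaskManager in MB
slots_per_tm=4          # -s  : slots per TaskManager
app_name="demo-session" # -nm : hypothetical application name

# -d at the end requests detached mode.
session_cmd="./bin/yarn-session.sh -n $containers -jm $jm_memory_mb -tm $tm_memory_mb -s $slots_per_tm -nm $app_name -d"

# Print only; run the printed line by hand once it looks right.
echo "$session_cmd"
```

Printing the command first makes it easy to check the values before anything is allocated on the cluster.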
3 run [OPTIONS]
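Options for the run action:
-c,--class Entry-point class; needed only if the JAR does not specify one in its manifest
-m,--jobmanager Address of the JobManager (master) to connect to; use it to target a JobManager other than the one in the configuration file
-p,--parallelism Parallelism of the program; overrides the default value from the configuration file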
4 When flink run starts a new yarn-session, the YARN options all carry a y or yarn prefix
For example: ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
Connect to a JobManager at a specific host and port:
./bin/flink run -m SparkMaster:1234 ./examples/batch/WordCount.jar -input hdfs://hostname:port/hello.txt -output hdfs://hostname:port/result1
Start a new yarn-session:
./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar -input hdfs://hostname:port/hello.txt -output hdfs://hostname:port/result1
5 Note: the command-line options can also be printed by the ./bin/flink tool itself.
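When sizing a yarn-cluster submission, the number of slots available to the job is the TaskManager count (-yn) times the slots per TaskManager (-ys). The sketch below works that out and compares it with the requested parallelism (-p) before printing the command. The values are illustrative, and the check is only a rule of thumb that assumes default slot sharing, where a job needs about as many slots as its highest operator parallelism.

```shell
# Illustrative values only.
yarn_containers=2   # -yn : number of TaskManagers
slots_per_tm=2      # -ys : slots per TaskManager
parallelism=4       # -p  : requested job parallelism

# Total slots = TaskManagers x slots per TaskManager.
total_slots=$((yarn_containers * slots_per_tm))

if [ "$parallelism" -le "$total_slots" ]; then
  slot_check="ok"
else
  slot_check="too-high"
fi
echo "total slots: $total_slots, requested parallelism: $parallelism ($slot_check)"

# Dry run: print the submission command instead of executing it.
run_cmd="./bin/flink run -m yarn-cluster -yn $yarn_containers -ys $slots_per_tm -p $parallelism ./examples/batch/WordCount.jar"
echo "$run_cmd"
```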
6 Action "run" compiles and runs a program.
Syntax: run [OPTIONS]
"run" action options:
-c,--class Class with the program entry point ("main" method or "getPlan()" method). Only needed if the JAR file does not specify the class in its manifest.
-C,--classpath Adds a URL to each user code classloader on all nodes in the cluster. The paths must specify a protocol (e.g. file://) and be accessible on all nodes (e.g. by means of a NFS share). You can use this option multiple times for specifying more than one URL. The protocol must be supported by java.net.URLClassLoader.
-d,--detached If present, runs the job in detached mode
-n,--allowNonRestoredState Allow to skip savepoint state that cannot be restored. You need to allow this if you removed an operator from your program that was part of the program when the savepoint was triggered.
-p,--parallelism The parallelism with which to run the program. Optional flag to override the default value specified in the configuration.
-q,--sysoutLogging If present, suppress logging output to standard out.
-s,--fromSavepoint Path to a savepoint to restore the job from (for example hdfs:///flink/savepoint-1537).
7 Options for yarn-cluster mode:
-d,--detached If present, runs the job in detached mode
-m,--jobmanager Address of the JobManager (master) to which to connect. Use this flag to connect to a different JobManager than the one specified in the configuration.
-yD use value for given property
-yd,--yarndetached If present, runs the job in detached mode (deprecated; use non-YARN specific option instead)
-yh,--yarnhelp Help for the Yarn session CLI.
-yid,--yarnapplicationId Attach to running YARN session
-yj,--yarnjar Path to Flink jar file
-yjm,--yarnjobManagerMemory Memory for JobManager Container with optional unit (default: MB)
-yn,--yarncontainer Number of YARN container to allocate (=Number of Task Managers)
-ynl,--yarnnodeLabel Specify YARN node label for the YARN application
-ynm,--yarnname Set a custom name for the application on YARN
-yq,--yarnquery Display available YARN resources (memory, cores)
-yqu,--yarnqueue Specify YARN queue.
-ys,--yarnslots Number of slots per TaskManager
-yst,--yarnstreaming Start Flink in streaming mode
-yt,--yarnship Ship files in the specified directory (t for transfer)
-ytm,--yarntaskManagerMemory Memory per TaskManager Container with optional unit (default: MB)
-yz,--yarnzookeeperNamespace Namespace to create the Zookeeper sub-paths for high availability mode
-z,--zookeeperNamespace Namespace to create the Zookeeper sub-paths for high availability mode
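Template for the flink run command: flink run [option]
-c,--class : class containing the main method to run
-C,--classpath : adds a URL to each user code classloader; it is loaded through URLClassLoader, and the URL must specify a scheme such as file://
-d,--detached : run in the background (detached)
-p,--parallelism : parallelism of the job's execution environment (env); it is usually worth setting explicitly
-q,--sysoutLogging : suppress logging output to standard out
-s,--fromSavepoint : restore the job from the given savepoint path
-sae,--shutdownOnAttachedExit : when the job is submitted in attached (foreground) mode, the job running on the cluster is also shut down if the client is interrupted
-m,--jobmanager : JobManager address; yarn-cluster targets a YARN cluster
-yd,--yarndetached : run in the background (detached)
-yjm,--yarnjobManagerMemory : JobManager memory
-ytm,--yarntaskManagerMemory : TaskManager memory
-yn,--yarncontainer : number of TaskManagers
-yid,--yarnapplicationId : ID of the YARN application the job attaches to
-ynm,--yarnname : name of the application
-ys,--yarnslots : number of slots per TaskManager
e.g.: flink run -m yarn-cluster -yd -yjm 1024m -ytm 1024m -ynm -ys 1 (the application name after -ynm and the JAR path are left out here)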
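The template example needs an application name after -ynm and a JAR path at the end before it can be submitted. A minimal sketch of filling those two slots is shown here; build_run_cmd is a hypothetical helper, and wordcount-demo is a made-up application name. It only prints the command.

```shell
# build_run_cmd APP_NAME JAR_PATH
# Hypothetical helper: prints a detached yarn-cluster submission using the
# memory and slot settings from the template example. Nothing is executed.
build_run_cmd() {
  app_name=$1
  jar_path=$2
  if [ -z "$app_name" ] || [ -z "$jar_path" ]; then
    echo "usage: build_run_cmd APP_NAME JAR_PATH" >&2
    return 1
  fi
  echo "flink run -m yarn-cluster -yd -yjm 1024m -ytm 1024m -ynm $app_name -ys 1 $jar_path"
}

# Example call with placeholder values.
build_run_cmd "wordcount-demo" "./examples/batch/WordCount.jar"
```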
flink list: lists Flink jobs.
flink list -r/--running : list running jobs
flink list -s/--scheduled : list scheduled jobs
flink cancel [options]
flink cancel -s/--withSavepoint
Use -m to specify the host and port of the JobManager that runs the job to cancel
e.g.: bin/flink cancel -m 127.0.0.1:8081 5e20cb6b0f357591171dfcca2eea09de
flink stop [options]
flink stop
Use -m to specify the host and port of the JobManager that runs the job to stop
e.g.: bin/flink stop -m 127.0.0.1:8081 d67420e52bd051fae2fddbaa79e046bb
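Both cancel and stop take a JobManager address and a job ID, so a mistyped ID is an easy error to make. The sketch below checks the ID before printing the command. It assumes job IDs are 32 lowercase hexadecimal characters, as in the two examples above; is_job_id and job_cmd are hypothetical helper names.

```shell
# is_job_id ID: succeeds when ID is exactly 32 lowercase hex characters.
is_job_id() {
  case $1 in
    *[!0-9a-f]*) return 1 ;;   # contains a character that is not lowercase hex
  esac
  [ "${#1}" -eq 32 ]
}

# job_cmd ACTION JOBMANAGER JOB_ID
# Prints (does not run) a cancel or stop command after validating the input.
job_cmd() {
  action=$1
  jobmanager=$2
  job_id=$3
  case $action in
    cancel|stop) ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
  if ! is_job_id "$job_id"; then
    echo "not a job ID: $job_id" >&2
    return 1
  fi
  echo "bin/flink $action -m $jobmanager $job_id"
}

job_cmd cancel 127.0.0.1:8081 5e20cb6b0f357591171dfcca2eea09de
job_cmd stop 127.0.0.1:8081 d67420e52bd051fae2fddbaa79e046bb
```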
The differences between cancelling and stopping (a streaming job) are as follows:
flink modify
flink modify
e.g.: flink modify -p parallelism
flink savepoint [options]
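e.g.: # trigger a savepoint
flink savepoint
Trigger a savepoint on YARN
flink savepoint
Cancel a job with a savepoint
flink cancel -s
Restore from a savepoint
flink run -s
If the logic of the restored program has changed, for example because an operator was removed, pass the allowNonRestoredState option (-n) when restoring.
flink run -s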
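To make the two restore variants concrete, the sketch below prints a restore command with and without -n (--allowNonRestoredState). restore_cmd is a hypothetical helper; the savepoint path is the example from the help text and the JAR is the WordCount example, both standing in for real values.

```shell
# restore_cmd SAVEPOINT_PATH JAR_PATH [yes|no]
# Hypothetical helper: prints a "flink run -s" command. The optional third
# argument adds -n so that state of removed operators may be skipped.
restore_cmd() {
  savepoint_path=$1
  jar_path=$2
  allow_non_restored=${3:-no}
  if [ "$allow_non_restored" = "yes" ]; then
    echo "flink run -s $savepoint_path -n $jar_path"
  else
    echo "flink run -s $savepoint_path $jar_path"
  fi
}

# Plain restore.
restore_cmd hdfs:///flink/savepoint-1537 ./examples/batch/WordCount.jar
# Restore after an operator was removed from the program.
restore_cmd hdfs:///flink/savepoint-1537 ./examples/batch/WordCount.jar yes
```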
Differences between savepoint and checkpoint