https://kserve.github.io/website/0.10/modelserving/v1beta1/serving_runtime/
KServe provides a simple Kubernetes CRD that deploys single or multiple trained models onto model serving runtimes such as TFServing, TorchServe, and Triton Inference Server. In addition, ModelServer is the Python model serving runtime implemented in KServe itself with the prediction v1 protocol, and MLServer implements the prediction v2 protocol with both REST and gRPC. These model serving runtimes provide out-of-the-box model serving, but you can also choose to build your own model server for more complex use cases. KServe provides basic API primitives that let you easily build a custom model serving runtime, and you can use other tools such as BentoML to build your custom model serving image.
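As a rough illustration of those API primitives, the v1 prediction contract a custom model server implements boils down to a function that maps `{"instances": [...]}` to `{"predictions": [...]}`. The sketch below has no KServe dependency, and the echo "model" is purely illustrative:

```python
# Minimal sketch (no KServe dependency) of the v1 predict contract a custom
# model server implements: {"instances": [...]} in, {"predictions": [...]} out.
def predict(payload: dict) -> dict:
    instances = payload["instances"]
    # Stand-in "model": report the length of each instance.
    return {"predictions": [len(x) for x in instances]}

print(predict({"instances": [[1.0, 2.0, 3.0], [4.0, 5.0]]}))  # {'predictions': [3, 2]}
```

In a real custom runtime, the same function shape is wrapped behind the HTTP predict endpoint.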
After deploying your model with an InferenceService, you get all the following serverless features provided by KServe.
The table below lists each model serving runtime supported by KServe. The HTTP and gRPC columns indicate the prediction protocol version the serving runtime supports. The KServe prediction protocol is noted as either "v1" or "v2". Some serving runtimes also support their own prediction protocol; these are marked with an *. The Default Serving Runtime Version column defines the source and version of the serving runtime: MLServer, KServe, or its own. These versions can also be found in the runtime kustomization YAML. All KServe-native model serving runtimes use the current KServe release version (v0.10). The Supported Framework Version(s) column lists the major versions of the supported models; these can also be found in the supportedModelFormats field of the corresponding runtime YAML. For model frameworks using the KServe serving runtime, the specific default versions can be found in kserve/python. In a given serving runtime directory, the setup.py file contains the exact model framework version used. For example, in kserve/python/lgbserver, the setup.py file pins the model framework version to 3.3.2: lightgbm==3.3.2.
Model Serving Runtime | Exported model | HTTP | gRPC | Default Serving Runtime Version | Supported Framework (Major) Version(s) | Examples |
---|---|---|---|---|---|---|
Custom ModelServer | – | v1, v2 | v2 | – | – | Custom Model |
LightGBM MLServer | Saved LightGBM Model | v2 | v2 | v1.0.0 (MLServer) | 3 | LightGBM Iris V2 |
LightGBM ModelServer | Saved LightGBM Model | v1 | – | v0.10 (KServe) | 3 | LightGBM Iris |
MLFlow ModelServer | Saved MLFlow Model | v2 | v2 | v1.0.0 (MLServer) | 1 | MLFlow wine-classifier |
PMML ModelServer | PMML | v1 | – | v0.10 (KServe) | 3, 4 (PMML4.4.1) | SKLearn PMML |
SKLearn MLServer | Pickled Model | v2 | v2 | v1.0.0 (MLServer) | 1 | SKLearn Iris V2 |
SKLearn ModelServer | Pickled Model | v1 | – | v0.10 (KServe) | 1 | SKLearn Iris |
TFServing | TensorFlow SavedModel | v1 | *tensorflow | 2.6.2 (TFServing Versions) | 2 | TensorFlow flower |
TorchServe | Eager Model/TorchScript | v1, v2, *torchserve | *torchserve | 0.7.0 (TorchServe) | 1 | TorchServe mnist |
Triton Inference Server | TensorFlow, TorchScript, ONNX | v2 | v2 | 21.09-py3 (Triton) | 8 (TensorRT), 1, 2 (TensorFlow), 1 (PyTorch), 2 (Triton) Compatibility Matrix | Torchscript cifar |
XGBoost MLServer | Saved Model | v2 | v2 | v1.0.0 (MLServer) | 1 | XGBoost Iris V2 |
XGBoost ModelServer | Saved Model | v1 | – | v0.10 (KServe) | 1 | XGBoost Iris |
*tensorflow - In addition to the KServe prediction protocol, TensorFlow Serving also implements its own prediction protocol. See: Tensorflow Serving Prediction API documentation
*torchserve - In addition to the KServe prediction protocols, TorchServe also implements its own prediction protocol. See: TorchServe gRPC API documentation
Note
The serving runtime version can be overridden with the runtimeVersion field on the InferenceService yaml, and we highly recommend setting this field for production services.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchscript-cifar"
spec:
  predictor:
    triton:
      storageUri: "gs://kfserving-examples/models/torchscript"
      runtimeVersion: 21.08-py3
Create an InferenceService yaml that specifies the framework tensorflow and a storageUri that points to a saved tensorflow model, and name it tensorflow.yaml.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
kubectl
kubectl apply -f tensorflow.yaml
Expected Output
$ inferenceservice.serving.kserve.io/flower-sample created
Wait for the InferenceService to be in ready state
kubectl get isvc flower-sample
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
flower-sample http://flower-sample.default.example.com True 100 flower-sample-predictor-default-n9zs6 7m15s
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT. The inference request input file can be downloaded here.
MODEL_NAME=flower-sample
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
Expected Output
* Connected to localhost (::1) port 8080 (#0)
> POST /v1/models/tensorflow-sample:predict HTTP/1.1
> Host: tensorflow-sample.default.example.com
> User-Agent: curl/7.73.0
> Accept: */*
> Content-Length: 16201
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 16201 out of 16201 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 222
< content-type: application/json
< date: Sun, 31 Jan 2021 01:01:50 GMT
< x-envoy-upstream-service-time: 280
< server: istio-envoy
<
{
    "predictions": [
        {
            "scores": [0.999114931, 9.20987877e-05, 0.000136786213, 0.000337257545, 0.000300532585, 1.84813616e-05],
            "prediction": 0,
            "key": " 1"
        }
    ]
}
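The response above can also be consumed programmatically. The following sketch parses a hypothetical body shaped like the v1 predict response shown above (scores copied from the example) and checks that the reported prediction is the argmax of the scores:

```python
import json

# Hypothetical body shaped like the v1 predict response above.
body = ('{"predictions": [{"scores": [0.999114931, 9.20987877e-05, 0.000136786213, '
        '0.000337257545, 0.000300532585, 1.84813616e-05], "prediction": 0, "key": " 1"}]}')

result = json.loads(body)
for pred in result["predictions"]:
    # The reported class index should be the argmax of the scores.
    top = max(range(len(pred["scores"])), key=pred["scores"].__getitem__)
    assert top == pred["prediction"]
    print(f'key={pred["key"]!r} class={top} score={pred["scores"][top]:.6f}')
```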
Canary rollout is a good way to control the risk of rolling out a new model by first moving a small percentage of traffic to it and then gradually increasing the percentage. To run a canary rollout, you can apply a canary.yaml with the canaryTrafficPercent field specified.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    canaryTrafficPercent: 20
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers-2"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    canaryTrafficPercent: 20
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers-2"
Apply the canary.yaml to create the canary InferenceService.
kubectl
kubectl apply -f canary.yaml
To verify whether the traffic split percentage is applied correctly, you can run the following command:
kubectl get isvc flower-sample
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
flower-sample http://flower-sample.default.example.com True 80 20 flower-sample-predictor-default-n9zs6 flower-sample-predictor-default-2kwtr 7m15s
As you can see, the traffic is split between the last rolled-out revision and the current latest ready revision. KServe automatically tracks the last rolled-out (stable) revision for you, so you do not need to maintain both a default and a canary on the InferenceService as in v1alpha2.
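To build intuition for what canaryTrafficPercent: 20 means in practice, here is a hedged simulation (not KServe code) of how a 20% weight on the canary revision plays out over many requests:

```python
import random

def route(canary_percent: int) -> str:
    # A request lands on the canary with probability canary_percent / 100,
    # mirroring an 80/20 revision split.
    return "canary" if random.random() * 100 < canary_percent else "stable"

random.seed(42)
counts = {"stable": 0, "canary": 0}
for _ in range(10_000):
    counts[route(20)] += 1
print(counts)  # roughly {'stable': 8000, 'canary': 2000}
```

Raising canaryTrafficPercent step by step (20, 50, 100) shifts this distribution until the new revision receives all traffic.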
Create the InferenceService, which exposes the gRPC port; by default it listens on port 9000.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-grpc"
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
      ports:
        - containerPort: 9000
          name: h2c
          protocol: TCP
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-grpc"
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
      ports:
        - containerPort: 9000
          name: h2c
          protocol: TCP
Apply grpc.yaml to create the gRPC InferenceService.
kubectl
kubectl apply -f grpc.yaml
Expected Output
$ inferenceservice.serving.kserve.io/flower-grpc created
We use a Python gRPC client for the prediction, so you need to create a Python virtual environment and install the tensorflow-serving-api.
# The prediction script is written in TensorFlow 1.x
pip install 'tensorflow-serving-api>=1.14.0,<2.0.0'
Run the gRPC prediction script.
MODEL_NAME=flower-grpc
INPUT_PATH=./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
python grpc_client.py --host $INGRESS_HOST --port $INGRESS_PORT --model $MODEL_NAME --hostname $SERVICE_HOSTNAME --input_path $INPUT_PATH
Expected Output
outputs {
  key: "key"
  value {
    dtype: DT_STRING
    tensor_shape {
      dim {
        size: 1
      }
    }
    string_val: " 1"
  }
}
outputs {
  key: "prediction"
  value {
    dtype: DT_INT64
    tensor_shape {
      dim {
        size: 1
      }
    }
    int64_val: 0
  }
}
outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 6
      }
    }
    float_val: 0.9991149306297302
    float_val: 9.209887502947822e-05
    float_val: 0.00013678647519554943
    float_val: 0.0003372581850271672
    float_val: 0.0003005331673193723
    float_val: 1.848137799242977e-05
  }
}
model_spec {
  name: "flowers-sample"
  version {
    value: 1
  }
  signature_name: "serving_default"
}
In this example we deploy a trained PyTorch MNIST model to predict handwritten digits by running an InferenceService with the TorchServe runtime, which is the default installed serving runtime for PyTorch models. Model interpretability is also an important aspect that helps to understand which of the input features were important for a particular classification. Captum is a model interpretability library. In this example the TorchServe explain endpoint is implemented with Captum's state-of-the-art algorithms, including Integrated Gradients, to provide users with an easy way to understand which features are contributing to the model output. You can refer to the Captum tutorial for more examples.
The KServe/TorchServe integration expects the following model store layout.
├── config
│ ├── config.properties
├── model-store
│ ├── densenet_161.mar
│ ├── mnist.mar
TorchServe provides a utility to package all the model artifacts into a single TorchServe Model Archive File (MAR). After packaging the model artifacts into a MAR file, you upload it to the model-store under the model storage path.
You can store your model and dependent files on remote storage or on a local persistent volume. The MNIST model and dependent files can be obtained from here.
Note
For remote storage, you can choose to start the example using the prebuilt MNIST MAR file stored in the KServe example GCS bucket gs://kfserving-examples/models/torchserve/image_classifier, or generate the MAR file with the torch-model-archiver and create the model store on remote storage according to the above layout.
torch-model-archiver --model-name mnist --version 1.0 \
--model-file model-archiver/model-store/mnist/mnist.py \
--serialized-file model-archiver/model-store/mnist/mnist_cnn.pt \
--handler model-archiver/model-store/mnist/mnist_handler.py
For PVC users, refer to the model archive file generation for auto-generating the MAR file from the model and dependent files.
TorchServe uses a config.properties file to store its configuration. See here for more details on the properties supported in the configuration file. The following is a sample file for KServe:
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
metrics_address=http://0.0.0.0:8082
grpc_inference_port=7070
grpc_management_port=7071
enable_metrics_api=true
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
enable_envvars_config=true
install_py_dep_per_model=true
model_store=/mnt/models/model-store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"mnist":{"1.0":{"defaultVersion":true,"marName":"mnist.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}
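Note that the model_snapshot value is itself JSON embedded in the properties file. As a quick sanity check, the snapshot shown above can be parsed and validated (the string below is copied from the sample config):

```python
import json

# The model_snapshot property value from the sample config.properties above.
model_snapshot = json.loads(
    '{"name":"startup.cfg","modelCount":1,"models":{"mnist":{"1.0":'
    '{"defaultVersion":true,"marName":"mnist.mar","minWorkers":1,"maxWorkers":5,'
    '"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}'
)
# modelCount should agree with the number of registered models, and marName
# should match the MAR file placed in the model-store directory.
assert model_snapshot["modelCount"] == len(model_snapshot["models"])
print(model_snapshot["models"]["mnist"]["1.0"]["marName"])  # mnist.mar
```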
The KServe/TorchServe integration supports the KServe v1/v2 REST protocol. In config.properties, we need to turn on the flag enable_envvars_config so that the KServe envelope can be set with an environment variable.
Warning
The previous service_envelope property is deprecated; the flag enable_envvars_config=true is set in the config.properties file to enable setting the service envelope at runtime. Requests are converted from the KServe inference request format to the TorchServe request format and sent to the inference address configured over the local socket.
KServe by default selects the TorchServe runtime when you specify the model format pytorch on the new model spec.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    pytorch:
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
To deploy the model on CPU, apply the following torchserve.yaml to create the InferenceService.
kubectl
kubectl apply -f torchserve.yaml
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    pytorch:
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
      resources:
        limits:
          memory: 4Gi
          nvidia.com/gpu: "1"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
      resources:
        limits:
          memory: 4Gi
          nvidia.com/gpu: "1"
To deploy the model on GPU, apply the gpu.yaml to create the GPU InferenceService.
kubectl
kubectl apply -f gpu.yaml
Expected Output
$ inferenceservice.serving.kserve.io/torchserve created
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
You can use the image converter to convert the image to a base64 encoded array; for other models please refer to the input request.
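That conversion step can be sketched as follows; the {"data": ...} instance shape matches the v1 request bodies used in this example, while the helper name and the placeholder bytes are illustrative, not taken from the actual image converter:

```python
import base64
import json

def build_mnist_request(image_bytes: bytes) -> str:
    # Encode the raw image bytes and wrap them in a KServe v1 request body.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"instances": [{"data": encoded}]})

# Placeholder bytes, not a real digit image.
body = build_mnist_request(b"placeholder-image-bytes")
print(json.loads(body)["instances"][0]["data"])
```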
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
Expected Output
* Trying 52.89.19.61...
* Connected to a881f5a8c676a41edbccdb0a394a80d6-2069247558.us-west-2.elb.amazonaws.com (52.89.19.61) port 80 (#0)
> PUT /v1/models/mnist HTTP/1.1
> Host: torchserve.kserve-test.example.com
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Length: 167
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< cache-control: no-cache; no-store, must-revalidate, private
< content-length: 1
< date: Tue, 27 Oct 2020 08:26:19 GMT
< expires: Thu, 01 Jan 1970 00:00:00 UTC
< pragma: no-cache
< x-request-id: b10cfc9f-cd0f-4cda-9c6c-194c2cdaa517
< x-envoy-upstream-service-time: 6
< server: istio-envoy
<
* Connection #0 to host a881f5a8c676a41edbccdb0a394a80d6-2069247558.us-west-2.elb.amazonaws.com left intact
{"predictions": ["2"]}
To get model explanations:
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/mnist:explain -d @./mnist.json
Expected Output
{"explanations": [[[[0.0005394675730469475, -0.0022280013123036043, -0.003416480100841055, -0.0051329881112415965, -0.009973864160829985, -0.004112560908882716, -0.009223458030656112, -0.0006676354577291628, -0.005249806664413386, -0.0009790519227372953, -0.0026914653993121195, -0.0069470097151383995, -0.00693530415962956, -0.005973878697847718, -0.00425042437288857, 0.0032867281838150977, -0.004297780258633562, -0.005643196661192014, -0.00653025019738562, -0.0047062916121001185, -0.0018656628277792628, -0.0016757477204072532, -0.0010410417081844845, -0.0019093520822156726, -0.004451403461006374, -0.0008552767257773671, -0.0027638888169885267, -0.0], [0.006971297052106784, 0.007316855222185687, 0.012144494329150574, 0.011477799383288441, 0.006846725347670252, 0.01149386176451476, 0.0045351987881190655, 0.007038361889638708, 0.0035855377023272157, 0.003031419502053957, -0.0008611575226775316, -0.0011085224745969223, -0.0050840743637658534, 0.009855491784340777, 0.007220680811043034, 0.011374285598070253, 0.007147725481709019, 0.0037114580912849457, 0.00030763245479291384, 0.0018305492665953394, 0.010106224395114147, 0.012932881164284687, 0.008862892007714321, 0.0070960526615982435, -0.0015931137903787505, 0.0036495747329455906, 0.0002593849391051298, -0.0], [0.006467265785857396, -0.00041793201228071674, 0.004900316089756856, 0.002308395474823997, 0.007859295399592283, 0.003916404948969494, 0.005630750246437249, 0.0043712538044184375, 0.006128530599133763, -0.009446321309831246, -0.014173645867037036, -0.0062988650915794565, -0.011473838941118539, -0.009049151947644047, -0.0007625645864610934, -0.013721416630061238, -0.0005580156670410108, 0.0033404383756480784, -0.006693278798487951, -0.003705084551144756, 0.005100375089529131, 5.5276874714401074e-05, 0.007221745280359063, -0.00573598303916232, -0.006836169033785967, 0.0025401608627538936, 9.303533912921196e-05, -0.0], [0.005914399808621816, 0.00452643561023696, 0.003968242261515448, 0.010422786058967673, 
0.007728358107899074, 0.01147115923288383, 0.005683869479056691, 0.011150670502307374, 0.008742555292485278, 0.0032882897575743754, 0.014841138421861584, 0.011741228362482451, 0.0004296862879259221, -0.0035118140680654854, -0.006152254410078331, -0.004925121936901983, -2.3611205202801947e-06, 0.029347073037039074, 0.02901626308947743, 0.023379353021343398, 0.004027157620197582, -0.01677662249919171, -0.013497255736128979, 0.006957482854214602, 0.0018321766800746145, 0.008277034396684563, 0.002733405455464871, -0.0], [0.0049579739156640065, -0.002168016158233997, 0.0020644317321723642, 0.0020912464240293825, 0.004719691119907336, 0.007879231202446626, 0.010594445898145937, 0.006533067778982801, 0.002290214592708113, -0.0036651114968251986, 0.010753227423379443, 0.006402706020466243, -0.047075193909339695, -0.08108259303568185, -0.07646875196692542, -0.1681834845371156, -0.1610307396135756, -0.12010309927453829, -0.016148831320070896, -0.009541525999486027, 0.04575604594761406, 0.031470966329886635, 0.02452149438024385, 0.016594078577569567, 0.012213591301610382, -0.002230875840404426, 0.0036704051254298374, -0.0], [0.006410107592414739, 0.005578283890924384, 0.001977103461731095, 0.008935476507124939, 0.0011305055729953436, 0.0004946313900665659, -0.0040266029554395935, -0.004270765544167256, -0.010832150944943138, -0.01653511868336456, -0.011121302103373972, -0.42038514526905024, -0.22874576003118394, -0.16752936178907055, -0.17021699697722079, -0.09998584936787697, -0.09041117495322142, -0.10230248444795721, -0.15260897522094888, 0.07770835838531896, -0.0813761125123066, 0.027556910053932963, 0.036305965104261866, 0.03407793793894619, 0.01212761779302579, 0.006695133380685627, 0.005331392748588556, -0.0], [0.008342680065996267, -0.00029249776150416367, 0.002782130291086583, 0.0027793744856745373, 0.0020525102690845407, 0.003679269934110004, 0.009373846012918791, -0.0031751745946300403, -0.009042846256743316, 0.0074141593032070775, -0.02796812516561052, 
-0.593171583786029, -0.4830164472795136, -0.353860128479443, -0.256482708704862, 0.11515586314578445, 0.12700563162828346, 0.0022342450630152204, -0.24673707669992118, -0.012878340813781437, 0.16866821780196756, 0.009739033161051434, -0.000827843726513152, -0.0002137320694585577, -0.004179480126338929, 0.008454049232317358, -0.002767934266266998, -0.0], [0.007070382982749552, 0.005342127805750565, -0.000983984198542354, 0.007910101170274493, 0.001266267696096404, 0.0038575136843053844, 0.006941130321773131, -0.015195182020687892, -0.016954974010578504, -0.031186444096787943, -0.031754626467747966, 0.038918845112017694, 0.06248943950328597, 0.07703301092601872, 0.0438493628024275, -0.0482404449771698, -0.08718650815999045, -0.0014764704694506415, -0.07426336448916614, -0.10378029666564882, 0.008572087846793842, -0.00017173413848283343, 0.010058893270893113, 0.0028410498666004377, 0.002008290211806285, 0.011905375389931099, 0.006071375802943992, -0.0], [0.0076080165949142685, -0.0017127333725310495, 0.00153128150106188, 0.0033391793764531563, 0.005373442509691564, 0.007207746020295443, 0.007422946703693544, -0.00699779191449194, 0.002395328253696969, -0.011682618874195954, -0.012737004464649057, -0.05379966383523857, -0.07174960461749053, -0.03027341304050314, 0.0019411862216381327, -0.0205575129473766, -0.04617091711614171, -0.017655308106959804, -0.009297162816368814, -0.03358572117988279, -0.1626068444778013, -0.015874364762085157, -0.0013736074085577258, -0.014763439328689378, 0.00631805792697278, 0.0021769414283267273, 0.0023061635006792498, -0.0], [0.005569931813561535, 0.004363218328087518, 0.00025609463218383973, 0.009577483244680675, 0.007257755916229399, 0.00976284778532342, -0.006388840235419147, -0.009017880790555707, -0.015308709334434867, -0.016743935775597355, -0.04372596546189275, -0.03523469356755156, -0.017257810114846107, 0.011960489902313411, 0.01529079831828911, -0.020076559119468443, -0.042792547669901516, -0.0029492027218867116, 
-0.011109560582516062, -0.12985858077848939, -0.2262858575494602, -0.003391725540087574, -0.03063368684328981, -0.01353486587575121, 0.0011140822443932317, 0.006583451102528798, 0.005667533945285076, -0.0], [0.004056272267155598, -0.0006394041203204911, 0.004664893926197093, 0.010593032387298614, 0.014750931538689989, 0.015428721146282149, 0.012167820222401367, 0.017604752451202518, 0.01038886849969188, 0.020544326931163263, -0.0004206566917812794, -0.0037463581359232674, -0.0024656693040735075, 0.0026061897697624353, -0.05186055271869177, -0.09158655048397382, 0.022976389912563913, -0.19851635458461808, -0.11801281807622972, -0.29127727790584423, -0.017138655663803876, -0.04395515676468641, -0.019241432506341576, 0.0011342298743447392, 0.0030625771422964584, -0.0002867924892991192, -0.0017908808807543712, -0.0], [0.0030114260660488892, 0.0020246448273580006, -0.003293361220376816, 0.0036965043883218584, 0.00013185761728146236, -0.004355610866966878, -0.006432601921104354, -0.004148701459814858, 0.005974553907915845, -0.0001399233607281906, 0.010392944122965082, 0.015693249298693028, 0.0459528427528407, -0.013921539948093455, -0.06615556518538708, 0.02921438991320325, -0.16345220625101778, -0.002130491295590408, -0.11449749664916867, -0.030980255589300607, -0.04804122537359171, -0.05144994776295644, 0.005122827412776085, 0.006464862173908011, 0.008624278272940246, 0.0037316228508156427, 0.0036947794337026706, -0.0], [0.0038173843228389405, -0.0017091931226819494, -0.0030871869816778068, 0.002115642501535999, -0.006926441921580917, -0.003023077828426468, -0.014451359520861637, -0.0020793048380231397, -0.010948003939342523, -0.0014460716966395166, -0.01656990336897737, 0.003052317148320358, -0.0026729564809943513, -0.06360067057346147, 0.07780985635080599, -0.1436689936630281, -0.040817177623437874, -0.04373367754296477, -0.18337299150349698, 0.025295182977407064, -0.03874921104331938, -0.002353901742617205, 0.011772560401335033, 0.012480994515707569, 
0.006498422579824301, 0.00632320984076023, 0.003407169765754805, -0.0], [0.00944355257990139, 0.009242583578688485, 0.005069860444386138, 0.012666191449103024, 0.00941789912565746, 0.004720427012836104, 0.007597687789204113, 0.008679266528089945, 0.00889322771021875, -0.0008577904940828809, 0.0022973860384607604, 0.025328230809207493, -0.09908781123080951, -0.07836626399832172, -0.1546141264726177, -0.2582207272050766, -0.2297524599578219, -0.29561835103416967, 0.12048787956671528, -0.06279365699861471, -0.03832012404275233, 0.022910264999199934, 0.005803508497672737, -0.003858461926053348, 0.0039451232171312765, 0.003858476747495933, 0.0013034515558609956, -0.0], [0.009725756015628606, -0.0004001101998876524, 0.006490722835571152, 0.00800808023631959, 0.0065880711806331265, -0.0010264326176194034, -0.0018914305972878344, -0.008822522194658438, -0.016650520788128117, -0.03254382594389507, -0.014795713101569494, -0.05826499837818885, -0.05165369567511702, -0.13384277337594377, -0.22572641373340493, -0.21584739544668635, -0.2366836351939208, 0.14937824076489659, -0.08127414932170171, -0.06720440139736879, -0.0038552732903526744, 0.0107597891707803, -5.67453590118174e-05, 0.0020161340511396244, -0.000783322694907436, -0.0006397207517995289, -0.005291639205010064, -0.0], [0.008627543242777584, 0.007700097300051849, 0.0020430960246806138, 0.012949015733198586, 0.008428709579953574, 0.001358177022953576, 0.00421863939925833, 0.002657580000868709, -0.007339431957237175, 0.02008439775442315, -0.0033717631758033114, -0.05176633249899187, -0.013790328758662772, -0.39102366157050594, -0.167341447585844, -0.04813367828213947, 0.1367781582239039, -0.04672809260566293, -0.03237784669978756, 0.03218068777925178, 0.02415063765016493, -0.017849899351200002, -0.002975675228088795, -0.004819438014786686, 0.005106898651831245, 0.0024278620704227456, 6.784303333368138e-05, -0.0], [0.009644258527009343, -0.001331907219439711, -0.0014639718434477777, 0.008481926798958248, 
0.010278031715467508, 0.003625808326891529, -0.01121188617599796, -0.0010634587872994379, -0.0002603820881968461, -0.017985648016990465, -0.06446652745470374, 0.07726063173046191, -0.24739929795334742, -0.2701855018480216, -0.08888614776216278, 0.1373325760136816, -0.02316068912438066, -0.042164834956711514, 0.0009266091344106458, 0.03141872420427644, 0.011587728430225652, 0.0004755143243520787, 0.005860642609620605, 0.008979633931394438, 0.005061734169974005, 0.003932710387086098, 0.0015489986106803626, -0.0], [0.010998736164377534, 0.009378969800902604, 0.00030577045264713074, 0.0159329353530375, 0.014849508018911006, -0.0026513365659554225, 0.002923303082126996, 0.01917908707828847, -0.02338288107991566, -0.05706674679291175, 0.009526265752669624, -0.19945255386401284, -0.10725519695909647, -0.3222906835083537, -0.03857038318412844, -0.013279804965996065, -0.046626023244262085, -0.029299060237210447, -0.043269580558906555, -0.03768510002290657, -0.02255977771908117, -0.02632588166863199, -0.014417349488098566, -0.003077271951572957, -0.0004973277708010661, 0.0003475839139671271, -0.0014522783025903258, -0.0], [0.012215315671616316, -0.001693194176229889, 0.011365785434529038, 0.0036964574178487792, -0.010126738168635003, -0.025554378647710443, 0.006538003839811914, -0.03181759044467965, -0.016424751042854728, 0.06177539736110035, -0.43801735323216856, -0.29991040815937386, -0.2516019795363623, 0.037789523540809, -0.010948746374759491, -0.0633901687126727, -0.005976006160777705, 0.006035133605976937, -0.04961632526071937, -0.04142116972831476, -0.07558952727782252, -0.04165176179187153, -0.02021603856619006, -0.0027365663096057032, -0.011145473712733575, 0.0003566937349350848, -0.00546472985268321, -0.0], [0.008009386447317503, 0.006831207743885825, 0.0051306149795546365, 0.016239014770865052, 0.020925441734273218, 0.028344800173195076, -0.004805080609285047, -0.01880521614501033, -0.1272329010865855, -0.39835936819190537, -0.09113694760349819, 
-0.04061591094832608, -0.12677021961235907, 0.015567707226741051, -0.005615051546243333, -0.06454044862001587, 0.0195457674752272, -0.04219686517155871, -0.08060569979524296, 0.027234494361702787, -0.009152881336047056, -0.030865118003992217, -0.005770311060090559, 0.002905833371986098, 5.606663556872091e-05, 0.003209538083839772, -0.0018588810743365345, -0.0], [0.007587008852984699, -0.0021213639853557625, 0.0007709558092903736, 0.013883256128746423, 0.017328713012428214, 0.03645357525636198, -0.04043993335238427, 0.05730125171252314, -0.2563293727512057, -0.11438826083879326, 0.02662382809034687, 0.03525271352483709, 0.04745678120172762, 0.0336360484090392, -0.002916635707204059, -0.17950855098650784, -0.44161773297052964, -0.4512180227831197, -0.4940283106297913, -0.1970108671285798, 0.04344323143078066, -0.012005120444897523, 0.00987576109166055, -0.0018336757466252476, 0.0004913959502151706, -0.0005409724034216215, -0.005039223900868212, -0.0], [0.00637876531169957, 0.005189469227685454, 0.0007676355246000376, 0.018378100865097655, 0.015739815031394887, -0.035524983116512455, 0.03781006978038308, 0.28859052096740495, 0.0726464110153121, -0.026768468497420147, 0.06278766200288134, 0.17897045813699355, -0.13780371920803108, -0.14176458123649577, -0.1733103177731656, -0.3106508869296763, 0.04788355140275794, 0.04235327890285105, -0.031266625292514394, -0.016263819217960652, -0.031388328800811355, -0.01791363975905968, -0.012025067979443894, 0.008335083985905805, -0.0014386677797296231, 0.0055376544652972854, 0.002241522815466253, -0.0], [0.007455256326741617, -0.0009475207572210404, 0.0020288385162615286, 0.015399640135796092, 0.021133843188103074, -0.019846405097622234, -0.003162485751163173, -0.14199005055318842, -0.044200898667146035, -0.013395459413208084, 0.11019680479230103, -0.014057216041764874, -0.12553853334447865, -0.05992513534766256, 0.06467942189539834, 0.08866056095907732, -0.1451321508061849, -0.07382491447758655, -0.046961739981080476, 
0.0008943713493160624, 0.03231044103656507, 0.00036034241706501196, -0.011387669277619417, -0.00014602449257226195, -0.0021863729003374116, 0.0018817840156005856, 0.0037909804578166286, -0.0], [0.006511855618626698, 0.006236866054439829, -0.001440571166157676, 0.012795776609942026, 0.011530545030403624, 0.03495489377257363, 0.04792403136095304, 0.049378583599065225, 0.03296101702085617, -0.0005351385876652296, 0.017744115897640366, 0.0011656622496764954, 0.0232845869823761, -0.0561191397060232, -0.02854070511118366, -0.028614174047247348, -0.007763531086362863, 0.01823079560098924, 0.021961392405283622, -0.009666681805706179, 0.009547046884328725, -0.008729943263791338, 0.006408909680578429, 0.009794327096359952, -0.0025825219195515304, 0.007063559189211571, 0.007867244119267047, -0.0], [0.007936663546039311, -0.00010710180170593153, 0.002716512705673228, 0.0038633557307721487, -0.0014877316616940372, -0.0004788143065635909, 0.012508842248031202, 0.0045381104608414645, -0.010650910516128294, -0.013785341529644855, -0.034287643221318206, -0.022152707546335495, -0.047056481347685974, -0.032166744564720455, -0.021551611335278546, -0.002174962503376043, 0.024344287130424306, 0.015579272560525105, 0.010958169741952194, -0.010607232913436921, -0.005548369726118836, -0.0014630046444242706, 0.013144180105016433, 0.0031349366359021916, 0.0010984887428255974, 0.005426941473328394, 0.006566511860044785, -0.0], [0.0005529184874606495, 0.00026139355020588705, -0.002887623443531047, 0.0013988462990850632, 0.00203365139495493, -0.007276926701775218, -0.004010419939595932, 0.017521952161185662, 0.0006996977433557911, 0.02083134683611201, 0.013690533534289498, -0.005466724359976675, -0.008857712321334327, 0.017408578822635818, 0.0076439343049154425, 0.0017861314923539985, 0.007465865707523924, 0.008034420825988495, 0.003976298558337994, 0.00411970637898539, -0.004572592545819698, 0.0029563907011979935, -0.0006382227820088148, 0.0015153753877889707, -0.0052626601797995595, 
0.0025664706985019416, 0.005161751034260073, -0.0], [0.0009424280561998445, -0.0012942360298110595, 0.0011900868416523343, 0.000984424113178899, 0.0020988269382781564, -0.005870080062890889, -0.004950484744457169, 0.003117643454332697, -0.002509563565777083, 0.005831604884101081, 0.009531085216183116, 0.010030206821909806, 0.005858190171099734, 4.9344529936340524e-05, -0.004027895832421331, 0.0025436439920587606, 0.00531153867563076, 0.00495942692369508, 0.009215148318606382, 0.00010011928317543458, 0.0060051362999805355, -0.0008195376963202741, 0.0041728603512658224, -0.0017597169567888774, -0.0010577007775543158, 0.00046033327178068433, -0.0007674196306044449, -0.0], [-0.0, -0.0, 0.0013386963856532302, 0.00035183178922260837, 0.0030610334903526204, 8.951834979315781e-05, 0.0023676793550483524, -0.0002900551076915047, -0.00207019445286608, -7.61697478482574e-05, 0.0012150086715244216, 0.009831239281792168, 0.003479667642621962, 0.0070584324334114525, 0.004161851261339585, 0.0026146296354490665, -9.194746959222099e-05, 0.0013583866966571571, 0.0016821551239318913, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0]]]]}
KServe by default selects the TorchServe runtime when you specify the model format pytorch on the new model spec, and enables the KServe v1 inference protocol. To enable the v2 inference protocol, specify the protocolVersion field with the value v2.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve-mnist-v2"
spec:
  predictor:
    pytorch:
      protocolVersion: v2
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v2
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve-mnist-v2"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      protocolVersion: v2
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v2
To deploy the model on CPU, apply the mnist_v2.yaml to create the InferenceService.
kubectl
kubectl apply -f mnist_v2.yaml
Expected Output
$ inferenceservice.serving.kserve.io/torchserve-mnist-v2 created
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve-mnist-v2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)
You can send both byte array and tensor inputs with the v2 protocol. For a byte array, use the image converter to convert the image to a byte array input. Here we use the mnist_v2_bytes.json file to run a sample inference.
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2_bytes.json
Expected Output
{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [], "datatype": "INT64", "data": [1]}]}
For tensor input, use the tensor image converter to convert the image to a tensor input; here we use the mnist_v2.json file to run a sample inference.
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2.json
Expected Output
{"id": "2266ec1e-f600-40af-97b5-7429b8195a80", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [], "datatype": "INT64", "data": [1]}]}
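For reference, a mnist_v2.json-style tensor request can be assembled as below. This is a sketch of the v2 inference request shape only; the input name and the all-zero 28x28 payload are illustrative, not taken from the actual file:

```python
import json

# Sketch of a KServe v2 infer request body for the mnist model.
request = {
    "inputs": [
        {
            "name": "input-0",          # illustrative input name
            "shape": [1, 28, 28],
            "datatype": "FP32",
            "data": [0.0] * (28 * 28),  # placeholder pixels, not a real digit
        }
    ]
}
# A body shaped like this is what /v2/models/mnist/infer accepts.
assert len(request["inputs"][0]["data"]) == 28 * 28
print(json.dumps(request)[:80])
```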
To get model explanations with the v2 explain endpoint:
MODEL_NAME=mnist
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mnist/explain -d @./mnist_v2.json
Expected Output
{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "explain", "shape": [1, 28, 28], "datatype": "FP64", "data": [-0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0040547528781588035, -0.00022612877200043775, -0.0001273413606783097, 0.005648369508785856, 0.008904784451506994, 0.0026385365879584796, 0.0026802458602499875, -0.002657801604900743, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00024465772895309256, 0.0008218449738666515, 0.015285917610467934, 0.007512832227517626, 0.007094984753782517, 0.003405668751094489, -0.0020919252360163056, -0.00078002938659872, 0.02299587777864007, 0.01900432942654754, -0.001252955497754338, -0.0014666116894338772, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.005298396384926053, -0.0007901605067151054, 0.0039060659788228954, 0.02317408211645009, 0.017237917554858186, 0.010867034286601965, 0.003001563092717309, 0.00622421762838887, 0.006120712336480808, 0.016736329175541464, 0.005674718838256385, 0.004344134814439431, -0.001232842177319105, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.0006867353660007012, 0.00977289933298656, -0.003875493166540815, 0.0017986937404117591, 0.0013075440157543057, 
-0.0024510980461748236, -0.0008806773426546923, -0.0, -0.0, -0.00014277890422995419, -0.009322313284511257, 0.020608317953885236, 0.004351394739722548, -0.0007875565409186222, -0.0009075897751127677, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00022247237111456804, -0.0007829031603535926, 0.002666369539125161, 0.000973336852105775, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.000432321003928822, 0.023657172129172684, 0.010694844898905204, -0.002375952975746018, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0020747972047037, -0.002320101258915877, -0.0012899205783904548, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.007629679655402933, 0.01044862724376463, 0.00025032878924736025, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.00037708370104137974, -0.005156369275302328, 0.0012477582442296628, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -4.442516083381132e-05, 0.01024804634283815, 0.0009971135240970147, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 0.0004501048968956462, -0.0019630535686311007, -0.0006664793297549408, 0.0020157403539278907, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0022144569383238466, 0.008361583574785395, 0.00314019428604999, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0028943544591141838, -0.0031301383432286406, 0.002113252872926688, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0010321050605717045, 0.008905753926369048, 0.002846438277738756, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.005305288883499087, -0.00192711009725932, 0.0012090042768467344, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0011945156500241256, 0.005654442715832439, 0.0020132075345016807, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0014689356969061985, 0.0010743412638183228, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0017047980586912376, 
0.00290660517425009, -0.0007805869640505143, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 5.541725422148614e-05, 0.0014516114512869852, 0.0002827701966546988, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0014401407633627265, 0.0023812497776698745, 0.002146825301700187, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0011500529125940918, 0.0002865015572973405, 0.0029798151042282686, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0017750295500283872, 0.0008339859126060243, -0.00377073933577687, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0006093176894575109, -0.00046905787892409935, 0.0034053218511795034, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0007450011768391558, 0.001298767372877851, -0.008499247640112315, -6.145166131400234e-05, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0011809726042792137, -0.001838476328106708, 0.00541110661116898, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.002139234224224006, 0.0003259163407641124, -0.005276118873855287, -0.001950984007438105, -9.545670742026532e-07, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0007772404228681039, -0.0001517956264720738, 0.0064814848131711815, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 8.098064985902114e-05, -0.00249042660692983, -0.0020718619200672302, -5.341117902942147e-05, -0.00045564724429915073, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0022750983476959733, 0.0017164060958460778, 0.0003221344707738082, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0015560282678744543, 9.107238495871273e-05, 0.0008772841497928399, 0.0006502978626355868, -0.004128780767525651, 0.0006030386900152659, 0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.001395995791096219, 0.0026791526689584344, 0.0023995008266391488, -0.0004496096312746451, 0.003101832450753724, 0.007494536066960778, 0.0028641187148287965, -0.0030525907182629075, 0.003420222396518567, 0.0014924018363498125, -0.0009357388301326025, 0.0007856228933169799,
-0.0018433973914981437, 1.6031856831240914e-05, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0006999018502034005, 0.004382250870697946, -0.0035419313267119365, -0.0028896748092595375, -0.00048734542493666705, -0.0060873452419295, 0.000388224990424471, 0.002533641537585585, -0.004352836563597573, -0.0006079418766875505, -0.0038101334053377753, -0.000828441340357984, 0.0, -0.0, 0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0010901530866342661, -0.013135008038845744, 0.0004734518707654666, 0.002050423283568135, -0.006609451922460863, 0.0023647861820124366, 0.0046789204256194, -0.0018122527412311837, 0.002137538353955849, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}]}
One of the main serverless inference features is autoscaling the replicas of an InferenceService to match the incoming workload. KServe enables the Knative Pod Autoscaler by default, which watches traffic and scales up and down based on the configured metrics.
KServe supports both the Knative Pod Autoscaler (KPA) and the Kubernetes Horizontal Pod Autoscaler (HPA). The features and limitations of each autoscaler are listed below.
Note
If you want to use the Kubernetes Horizontal Pod Autoscaler (HPA), you must install the HPA extension.
Knative Pod Autoscaler (KPA)
Horizontal Pod Autoscaler (HPA)
Hard/soft autoscaling limits
You can configure the InferenceService with the annotation autoscaling.knative.dev/target to set a soft limit. The soft limit is a targeted limit rather than a strictly enforced bound; particularly when there is a sudden burst of requests, this value can be exceeded.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
annotations:
autoscaling.knative.dev/target: "10"
spec:
predictor:
pytorch:
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
annotations:
autoscaling.knative.dev/target: "10"
spec:
predictor:
model:
modelFormat:
name: pytorch
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"
You can also configure the InferenceService with the field containerConcurrency to set a hard limit. The hard limit is an enforced upper bound: if concurrency reaches the hard limit, surplus requests are buffered and must wait until enough capacity frees up to serve them.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
spec:
predictor:
containerConcurrency: 10
pytorch:
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
spec:
predictor:
containerConcurrency: 10
model:
modelFormat:
name: pytorch
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"
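The hard-limit semantics can be illustrated with a toy in-process limiter — this is only an analogy: in KServe the actual buffering happens in Knative's queue-proxy sidecar, not inside your model server.

```python
import threading
import time

class ConcurrencyLimiter:
    """Sketch of hard-limit behavior: at most `limit` requests execute at
    once; the rest block ("are buffered") until capacity frees up."""
    def __init__(self, limit):
        self._sem = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def handle(self, work):
        with self._sem:  # blocks when the hard limit is reached
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return work()
            finally:
                with self._lock:
                    self.in_flight -= 1

limiter = ConcurrencyLimiter(limit=10)
threads = [threading.Thread(target=limiter.handle, args=(lambda: time.sleep(0.01),))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(limiter.peak)  # never exceeds the limit of 10
```

Even with 50 requests arriving at once, the observed concurrency never exceeds the configured limit; the remaining requests simply wait their turn.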
After specifying a soft limit or a hard limit as the scaling target, you can now deploy the InferenceService with autoscaling.yaml.
kubectl
kubectl apply -f autoscaling.yaml
Expected output
$ inferenceservice.serving.kserve.io/torchserve created
The first step is to install the hey load generator, then send concurrent requests to the InferenceService.
go get -u github.com/rakyll/hey
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
hey -m POST -z 30s -D ./mnist.json -host ${SERVICE_HOSTNAME} http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict
hey generates 50 concurrent requests by default, so with the container concurrency target set to 10 you can see the InferenceService scale out to 5 pods.
Expected output
NAME READY STATUS RESTARTS AGE
torchserve-predictor-default-cj2d8-deployment-69444c9c74-67qwb 2/2 Terminating 0 103s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-nnxk8 2/2 Terminating 0 95s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-rq8jq 2/2 Running 0 50m
torchserve-predictor-default-cj2d8-deployment-69444c9c74-tsrwr 2/2 Running 0 113s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-vvpjl 2/2 Running 0 109s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-xvn7t 2/2 Terminating 0 103s
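The 5-pod figure follows from simple arithmetic: a concurrency-based autoscaler needs enough replicas for each pod to stay at its per-pod target. A minimal sketch of that core calculation (the real KPA additionally averages over time windows and has a panic mode):

```python
import math

def desired_replicas(observed_concurrency: float, target: float) -> int:
    """Enough pods so that each one stays at or below the per-pod
    concurrency target, with a floor of one replica."""
    return max(1, math.ceil(observed_concurrency / target))

# hey keeps 50 requests in flight; the concurrency target above is 10.
print(desired_replicas(50, 10))  # -> 5
```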
Canary rollout is a deployment strategy in which you release a new version of a model to a small portion of the production traffic.
After the experiment above, let's now look at how to roll out a new model without moving the full traffic to the new model by default.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
annotations:
serving.kserve.io/enable-tag-routing: "true"
spec:
predictor:
canaryTrafficPercent: 20
pytorch:
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v2"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "torchserve"
annotations:
serving.kserve.io/enable-tag-routing: "true"
spec:
predictor:
canaryTrafficPercent: 20
model:
modelFormat:
name: pytorch
storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v2"
In this example we change storageUri to the v2 version and add the canaryTrafficPercent field, then apply canary.yaml.
kubectl
kubectl apply -f canary.yaml
Expected output
kubectl get revisions -l serving.kserve.io/inferenceservice=torchserve
NAME CONFIG NAME K8S SERVICE NAME GENERATION READY REASON ACTUAL REPLICAS DESIRED REPLICAS
torchserve-predictor-default-00001 torchserve-predictor-default 1 True 1 1
torchserve-predictor-default-00002 torchserve-predictor-default 2 True 1 1
kubectl get pods -l serving.kserve.io/inferenceservice=torchserve
NAME READY STATUS RESTARTS AGE
torchserve-predictor-default-00001-deployment-7d99979c99-p49gk 2/2 Running 0 28m
torchserve-predictor-default-00002-deployment-c6fcc65dd-rjknq 2/2 Running 0 3m37s
After the canary model is rolled out, the traffic is split between the canary revision and the "stable" revision that was previously rolled out with 100% of the traffic. Now check the traffic split from the InferenceService traffic status:
kubectl get isvc torchserve -ojsonpath='{.status.components}'
Expected output
{
"predictor": {
"address": {
"url": "http://torchserve-predictor-default.default.svc.cluster.local"
},
"latestCreatedRevision": "torchserve-predictor-default-00002",
"latestReadyRevision": "torchserve-predictor-default-00002",
"latestRolledoutRevision": "torchserve-predictor-default-00001",
"traffic": [
{
"latestRevision": true,
"percent": 20,
"revisionName": "torchserve-predictor-default-00002",
"tag": "latest",
"url": "http://latest-torchserve-predictor-default.default.example.com"
},
{
"latestRevision": false,
"percent": 80,
"revisionName": "torchserve-predictor-default-00001",
"tag": "prev",
"url": "http://prev-torchserve-predictor-default.default.example.com"
}
],
"url": "http://torchserve-predictor-default.default.example.com"
}
}
Run the following curl request against the InferenceService a few times; you can see that requests are sent to the two revisions with a 20/80 split.
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
for i in {1..10}; do curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json; done
Expected output
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
500: Internal Server Error</title> 500: Internal Server Error</body></html>Handling connection for 8080
500: Internal Server Error</title> 500: Internal Server Error</body></html>Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
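The split shown above is per-request random routing, so individual runs vary around the configured percentage. A toy simulation of a percentage-based split (illustrative only — not KServe/Knative's actual routing code):

```python
import random

def route_requests(canary_percent: int, n_requests: int, seed: int = 0) -> dict:
    """Simulate per-request random routing between a canary revision and
    the previous (stable) revision."""
    rng = random.Random(seed)
    counts = {"canary": 0, "stable": 0}
    for _ in range(n_requests):
        if rng.random() < canary_percent / 100:
            counts["canary"] += 1
        else:
            counts["stable"] += 1
    return counts

# With canaryTrafficPercent: 20, roughly one in five requests hits the canary.
print(route_requests(canary_percent=20, n_requests=1000))
```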
You may notice that requests fail when they hit the canary revision; this is because the new revision expects the v2 inference input mnist_v2.json, which is a breaking change. Also note that traffic is split randomly between the two revisions according to the specified traffic percentage. In cases like this you should roll out the canary model with canaryTrafficPercent set to 0 and test it through the latest-tagged URL before moving the full traffic over to the new model.
kubectl
kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 0}]'
curl -v -H "Host: latest-torchserve-predictor-default.default.example.com" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
Expected output
{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [1], "datatype": "INT64", "data": [1]}]}
After testing and validating the new model, you can now bump canaryTrafficPercent to 100 to shift the traffic fully to the new revision; latestRolledoutRevision now becomes torchserve-predictor-default-00002 and previousRolledoutRevision becomes torchserve-predictor-default-00001.
kubectl
kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 100}]'
Check the traffic status:
kubectl get isvc torchserve -ojsonpath='{.status.components}'
Expected output
{
"predictor": {
"address": {
"url": "http://torchserve-predictor-default.default.svc.cluster.local"
},
"latestCreatedRevision": "torchserve-predictor-default-00002",
"latestReadyRevision": "torchserve-predictor-default-00002",
"latestRolledoutRevision": "torchserve-predictor-default-00002",
"previousRolledoutRevision": "torchserve-predictor-default-00001",
"traffic": [
{
"latestRevision": true,
"percent": 100,
"revisionName": "torchserve-predictor-default-00002",
"tag": "latest",
"url": "http://latest-torchserve-predictor-default.default.example.com"
        }
],
"url": "http://torchserve-predictor-default.default.example.com"
}
}
If the new model revision turns out not to work after traffic has moved to it, you can still patch canaryTrafficPercent back to 0 and move the traffic back to the previously rolled-out model, torchserve-predictor-default-00001.
kubectl
kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 0}]'
Check the traffic status:
kubectl get isvc torchserve -ojsonpath='{.status.components}'
Expected output
{
"predictor": {
"address": {
"url": "http://torchserve-predictor-default.default.svc.cluster.local"
},
"latestCreatedRevision": "torchserve-predictor-default-00002",
"latestReadyRevision": "torchserve-predictor-default-00002",
"latestRolledoutRevision": "torchserve-predictor-default-00001",
"previousRolledoutRevision": "torchserve-predictor-default-00001",
"traffic": [
{
"latestRevision": true,
"percent": 0,
"revisionName": "torchserve-predictor-default-00002",
"tag": "latest",
"url": "http://latest-torchserve-predictor-default.default.example.com"
},
{
"latestRevision": false,
"percent": 100,
"revisionName": "torchserve-predictor-default-00001",
"tag": "prev",
"url": "http://prev-torchserve-predictor-default.default.example.com"
}
],
"url": "http://torchserve-predictor-default.default.example.com"
}
}
Metrics Exposure and Grafana Dashboard Setup
This example walks you through how to deploy a scikit-learn model leveraging the v1beta1 version of the InferenceService CRD. Note that by default the v1beta1 version exposes your model through an API compatible with the existing V1 Dataplane. This example, however, shows how to serve the model through an API compatible with the new V2 Dataplane.
The first step is to train a sample scikit-learn model. Note that this model will be saved as model.joblib.
from sklearn import svm
from sklearn import datasets
from joblib import dump
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC(gamma='scale')
clf.fit(X, y)
dump(clf, 'model.joblib')
Once you have your model serialized as model.joblib, you can use MLServer to spin up a local server. For more details on MLServer, feel free to check the SKLearn example doc.
Note
This step is optional and only intended for testing; you can jump straight to deploying with an InferenceService.
To use MLServer locally, you first need to install the mlserver package in your local environment, as well as the SKLearn runtime.
pip install mlserver mlserver-sklearn
The next step is to provide some model settings so that MLServer knows the inference runtime that should serve the model (here, mlserver_sklearn.SKLearnModel) as well as the model's name and version.
These can be specified through environment variables or by creating a local model-settings.json file:
{
"name": "sklearn-iris",
"version": "v1.0.0",
"implementation": "mlserver_sklearn.SKLearnModel"
}
Note that when you deploy your model, KServe already injects some sensible defaults so that it works out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.
With the mlserver package installed locally and a local model-settings.json file in place, you should now be able to start the server as:
mlserver start .
Lastly, you will use KServe to deploy the trained model. For that, you just need to use the v1beta1 version of the InferenceService CRD and set the protocolVersion field to v2.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "sklearn-irisv2"
spec:
predictor:
sklearn:
protocolVersion: "v2"
storageUri: "gs://seldon-models/sklearn/mms/lr_model"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "sklearn-irisv2"
spec:
predictor:
model:
modelFormat:
name: sklearn
runtime: kserve-mlserver
storageUri: "gs://seldon-models/sklearn/mms/lr_model"
Note that this assumes your model weights (the model.joblib file) have already been uploaded to a model repository and are accessible at the storageUri above, and that a cluster with KServe installed is reachable through kubectl.
kubectl
kubectl apply -f ./sklearn.yaml
You can now test the deployed model by sending a sample request.
Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below:
{
"inputs": [
{
"name": "input-0",
"shape": [2, 4],
"datatype": "FP32",
"data": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
]
}
Now, assuming your ingress is accessible at ${INGRESS_HOST}:${INGRESS_PORT} (you can follow these instructions to find out the ingress IP and port), you can use curl to send the inference request as:
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-irisv2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v \
-H "Host: ${SERVICE_HOSTNAME}" \
-H "Content-Type: application/json" \
-d @./iris-input.json \
http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/sklearn-irisv2/infer
Expected output
{
"id": "823248cc-d770-4a51-9606-16803395569c",
"model_name": "sklearn-irisv2",
"model_version": "v1.0.0",
"outputs": [
{
"data": [1, 1],
"datatype": "INT64",
"name": "predict",
"parameters": null,
"shape": [2]
}
]
}
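The request payload above follows a regular structure, so it can be generated programmatically. A small sketch of a builder (the tensor name `input-0` is arbitrary, and the nested data layout simply mirrors the example payload above):

```python
def build_v2_request(rows, datatype="FP32", name="input-0"):
    """Build a V2-protocol request body for a batch of equal-length
    feature rows, deriving shape from the batch."""
    n_features = len(rows[0])
    assert all(len(r) == n_features for r in rows), "rows must be equal length"
    return {
        "inputs": [
            {
                "name": name,
                "shape": [len(rows), n_features],
                "datatype": datatype,
                "data": [list(r) for r in rows],
            }
        ]
    }

payload = build_v2_request([[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]])
```

Posting `payload` as JSON to the `/v2/models/<name>/infer` endpoint is equivalent to the curl command shown earlier.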
This example walks you through how to deploy an xgboost model leveraging the v1beta1 version of the InferenceService CRD. Note that by default the v1beta1 version exposes your model through an API compatible with the existing V1 Dataplane. This example, however, shows how to serve the model through an API compatible with the new V2 Dataplane.
The first step is to train a sample xgboost model. We will save this model as model.bst.
import xgboost as xgb
from sklearn.datasets import load_iris
import os
model_dir = "."
BST_FILE = "model.bst"
iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = xgb.DMatrix(X, label=y)
param = {'max_depth': 6,
'eta': 0.1,
'silent': 1,
'nthread': 4,
'num_class': 10,
'objective': 'multi:softmax'
}
xgb_model = xgb.train(params=param, dtrain=dtrain)
model_file = os.path.join((model_dir), BST_FILE)
xgb_model.save_model(model_file)
Once we have the model serialized as model.bst, we can use MLServer to spin up a local server. For more details on MLServer, feel free to check the XGBoost example in their docs.
Note that this step is optional and only intended for testing; you can jump straight to deploying the trained model.
To use MLServer locally, you first need to install the mlserver package in your local environment, as well as the XGBoost runtime.
pip install mlserver mlserver-xgboost
The next step is to provide some model settings so that MLServer knows the inference runtime that should serve the model (here, mlserver_xgboost.XGBoostModel) as well as the model's name and version.
These can be specified through environment variables or by creating a local model-settings.json file:
{
"name": "xgboost-iris",
"version": "v1.0.0",
"implementation": "mlserver_xgboost.XGBoostModel"
}
Note that when we deploy the model, KServe already injects some sensible defaults so that it works out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.
With the mlserver package installed locally and a local model-settings.json file in place, we should now be able to start the server as:
mlserver start .
Lastly, we will use KServe to deploy our trained model. For that, we just need to use the v1beta1 version of the InferenceService CRD and set the protocolVersion field to v2.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "xgboost-iris"
spec:
predictor:
xgboost:
protocolVersion: "v2"
storageUri: "gs://kfserving-examples/models/xgboost/iris"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "xgboost-iris"
spec:
predictor:
model:
modelFormat:
name: xgboost
runtime: kserve-mlserver
storageUri: "gs://kfserving-examples/models/xgboost/iris"
Note that this assumes the model weights have already been uploaded to a model repository and are accessible at the storageUri above.
Assuming that we have KServe installed and a cluster accessible through kubectl, we can deploy the model as:
kubectl apply -f xgboost.yaml
We can now test the deployed model by sending a sample request.
Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below:
{
"inputs": [
{
"name": "input-0",
"shape": [2, 4],
"datatype": "FP32",
"data": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
]
}
Now, assuming that our ingress can be accessed at ${INGRESS_HOST}:${INGRESS_PORT}, we can use curl to send the inference request as follows.
You can follow these instructions to find out the ingress IP and port.
SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v \
-H "Host: ${SERVICE_HOSTNAME}" \
-H "Content-Type: application/json" \
-d @./iris-input.json \
http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/xgboost-iris/infer
The output will be similar to:
Expected output
{
"id": "4e546709-0887-490a-abd6-00cbc4c26cf4",
"model_name": "xgboost-iris",
"model_version": "v1.0.0",
"outputs": [
{
"data": [1.0, 1.0],
"datatype": "FP32",
"name": "predict",
"parameters": null,
"shape": [2]
}
]
}
PMML, or Predictive Model Markup Language, is an XML format for describing data mining and statistical models, including the model inputs, the transformations used to prepare data for data mining, and the parameters that define the model itself. In this example we show how to serve a PMML-format model with an InferenceService.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "pmml-demo"
spec:
predictor:
pmml:
storageUri: gs://kfserving-examples/models/pmml
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "pmml-demo"
spec:
predictor:
model:
modelFormat:
name: pmml
storageUri: "gs://kfserving-examples/models/pmml"
Create the InferenceService with the above yaml:
kubectl apply -f pmml.yaml
Expected output
$ inferenceservice.serving.kserve.io/pmml-demo created
Warning
The pmmlserver is based on Py4J and does not support multi-process mode, so we can't set spec.predictor.containerConcurrency. If you want to scale the pmmlserver to improve prediction performance, you should set the InferenceService's resources.limits.cpu to 1 and scale its replica size.
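A sketch of what that looks like in the InferenceService spec (the replica counts here are illustrative values, not from the official docs):

```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "pmml-demo"
spec:
  predictor:
    minReplicas: 2   # scale out with replicas instead of per-pod concurrency
    maxReplicas: 4
    model:
      modelFormat:
        name: pmml
      storageUri: "gs://kfserving-examples/models/pmml"
      resources:
        limits:
          cpu: "1"   # the Py4J-based pmmlserver cannot use more than one core
```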
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.
MODEL_NAME=pmml-demo
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
Expected output
* TCP_NODELAY set
* Connected to localhost (::1) port 8081 (#0)
> POST /v1/models/pmml-demo:predict HTTP/1.1
> Host: pmml-demo.default.example.com
> User-Agent: curl/7.64.1
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 45 out of 45 bytes
< HTTP/1.1 200 OK
< content-length: 39
< content-type: application/json; charset=UTF-8
< date: Sun, 18 Oct 2020 15:50:02 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 12
<
* Connection #0 to host localhost left intact
{"predictions": [{'Species': 'setosa', 'Probability_setosa': 1.0, 'Probability_versicolor': 0.0, 'Probability_virginica': 0.0, 'Node_Id': '2'}]}* Closing connection 0
1. Install pyspark 3.0.x and pyspark2pmml:
pip install pyspark~=3.0.0
pip install pyspark2pmml
2. Get the JPMML-SparkML.jar
Launch pyspark with the --jars option to specify the location of the JPMML-SparkML uber-JAR:
pyspark --jars ./jpmml-sparkml-executable-1.6.3.jar
Fit a Spark ML pipeline:
from pyspark.ml import Pipeline
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.feature import RFormula
df = spark.read.csv("Iris.csv", header = True, inferSchema = True)
formula = RFormula(formula = "Species ~ .")
classifier = DecisionTreeClassifier()
pipeline = Pipeline(stages = [formula, classifier])
pipelineModel = pipeline.fit(df)
from pyspark2pmml import PMMLBuilder
pmmlBuilder = PMMLBuilder(sc, df, pipelineModel)
pmmlBuilder.buildFile("DecisionTreeIris.pmml")
Upload DecisionTreeIris.pmml to a GCS bucket; note that the PMMLServer expects the model file to be named model.pmml.
gsutil cp ./DecisionTreeIris.pmml gs://$BUCKET_NAME/sparkpmml/model.pmml
Create the InferenceService with the pmml predictor and specify the storageUri with the bucket location you uploaded to.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "spark-pmml"
spec:
predictor:
pmml:
storageUri: gs://kfserving-examples/models/sparkpmml
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "spark-pmml"
spec:
predictor:
model:
modelFormat:
name: pmml
storageUri: gs://kfserving-examples/models/sparkpmml
Apply the InferenceService custom resource:
kubectl apply -f spark_pmml.yaml
Expected output
$ inferenceservice.serving.kserve.io/spark-pmml created
Wait for the InferenceService to be ready:
kubectl wait --for=condition=Ready inferenceservice spark-pmml
inferenceservice.serving.kserve.io/spark-pmml condition met
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.
MODEL_NAME=spark-pmml
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice spark-pmml -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
Expected output
* Connected to spark-pmml.default.35.237.217.209.xip.io (35.237.217.209) port 80 (#0)
> POST /v1/models/spark-pmml:predict HTTP/1.1
> Host: spark-pmml.default.35.237.217.209.xip.io
> User-Agent: curl/7.73.0
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 45 out of 45 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 39
< content-type: application/json; charset=UTF-8
< date: Sun, 07 Mar 2021 19:32:50 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 14
<
* Connection #0 to host spark-pmml.default.35.237.217.209.xip.io left intact
{"predictions": [[1.0, 0.0, 1.0, 0.0]]}
To test the LightGBM Server, first train a simple LightGBM model with the following python code.
import lightgbm as lgb
from sklearn.datasets import load_iris
import os
model_dir = "."
BST_FILE = "model.bst"
iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = lgb.Dataset(X, label=y, feature_names=iris['feature_names'])
params = {
'objective':'multiclass',
'metric':'softmax',
'num_class': 3
}
lgb_model = lgb.train(params=params, train_set=dtrain)
model_file = os.path.join(model_dir, BST_FILE)
lgb_model.save_model(model_file)
Install and run the LightGBM Server locally using the trained model and test the prediction.
python -m lgbserver --model_dir /path/to/model_dir --model_name lgb
Once the LightGBM Server is up locally, we can test the model by sending an inference request.
import requests
request = {'sepal_width_(cm)': {0: 3.5}, 'petal_length_(cm)': {0: 1.4}, 'petal_width_(cm)': {0: 0.2},'sepal_length_(cm)': {0: 5.1} }
formData = {
'inputs': [request]
}
res = requests.post('http://localhost:8080/v1/models/lgb:predict', json=formData)
print(res)
print(res.text)
To deploy the model on Kubernetes, you can create an InferenceService by specifying the modelFormat as lightgbm along with the storageUri.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-iris"
spec:
predictor:
lightgbm:
storageUri: "gs://kfserving-examples/models/lightgbm/iris"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-iris"
spec:
predictor:
model:
modelFormat:
name: lightgbm
storageUri: "gs://kfserving-examples/models/lightgbm/iris"
Apply the above yaml to create the InferenceService:
kubectl apply -f lightgbm.yaml
Expected output
$ inferenceservice.serving.kserve.io/lightgbm-iris created
To test the deployed model, the first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT, then run the following curl command to send an inference request to the InferenceService.
MODEL_NAME=lightgbm-iris
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
Expected output
* Trying 169.63.251.68...
* TCP_NODELAY set
* Connected to 169.63.251.68 (169.63.251.68) port 80 (#0)
> POST /models/lightgbm-iris:predict HTTP/1.1
> Host: lightgbm-iris.default.svc.cluster.local
> User-Agent: curl/7.60.0
> Accept: */*
> Content-Length: 76
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 76 out of 76 bytes
< HTTP/1.1 200 OK
< content-length: 27
< content-type: application/json; charset=UTF-8
< date: Tue, 21 May 2019 22:40:09 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 13032
<
* Connection #0 to host 169.63.251.68 left intact
{"predictions": [[0.9, 0.05, 0.05]]}
Once you have your model serialized as model.bst, you can use MLServer, which implements the KServe V2 inference protocol, to spin up a local server. For more details on MLServer, check the LightGBM example doc.
To run MLServer locally, first install the mlserver package in your local environment, as well as the LightGBM runtime.
pip install mlserver mlserver-lightgbm
The next step is to provide the model settings so that MLServer knows the inference runtime that should serve the model (here, mlserver_lightgbm.LightGBMModel) as well as the model's name and version.
These can be specified through environment variables or by creating a local model-settings.json file:
{
"name": "lightgbm-iris",
"version": "v1.0.0",
"implementation": "mlserver_lightgbm.LightGBMModel"
}
With the mlserver package installed locally and a local model-settings.json file in place, you should now be able to start the server as:
mlserver start .
When you deploy the model with an InferenceService, KServe injects sensible defaults so that it works out of the box without any further configuration. You can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.
To deploy the LightGBM model with the V2 inference protocol, you need to set the protocolVersion field to v2.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-v2-iris"
spec:
predictor:
lightgbm:
protocolVersion: v2
storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-v2-iris"
spec:
predictor:
model:
modelFormat:
name: lightgbm
protocolVersion: v2
storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
Apply the InferenceService yaml to get the REST endpoint:
kubectl
kubectl apply -f lightgbm-v2.yaml
Expected output
$ inferenceservice.serving.kserve.io/lightgbm-v2-iris created
You can now test the deployed model by sending a sample request.
Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below:
{
"inputs": [
{
"name": "input-0",
"shape": [2, 4],
"datatype": "FP32",
"data": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
]
}
Now, assuming your ingress is accessible at ${INGRESS_HOST}:${INGRESS_PORT} (you can follow these instructions to find out the ingress IP and port), you can use curl to send the inference request as:
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v \
-H "Host: ${SERVICE_HOSTNAME}" \
-H "Content-Type: application/json" \
-d @./iris-input-v2.json \
http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/lightgbm-v2-iris/infer
Expected output
{
"model_name":"lightgbm-v2-iris",
"model_version":null,
"id":"96253e27-83cf-4262-b279-1bd4b18d7922",
"parameters":null,
"outputs":[
{
"name":"predict",
"shape":[2,3],
"datatype":"FP64",
"parameters":null,
"data":
[8.796664107010673e-06,0.9992300031041593,0.0007612002317336916,4.974786820804187e-06,0.9999919650711493,3.0601420299625077e-06]
}
]
}
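The flat `data` array is laid out row-major according to the declared `shape`, so each consecutive group of three values is one sample's class probabilities. A short sketch of decoding such an output tensor into per-row predictions (the helper name is illustrative):

```python
def decode_v2_output(output):
    """Reshape a flat V2 output tensor (row-major) into rows and take the
    argmax of each row as the predicted class index."""
    n_rows, n_cols = output["shape"]
    data = output["data"]
    rows = [data[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]
    return [max(range(n_cols), key=row.__getitem__) for row in rows]

# The output tensor from the expected response above.
output = {
    "name": "predict",
    "shape": [2, 3],
    "datatype": "FP64",
    "data": [8.796664107010673e-06, 0.9992300031041593, 0.0007612002317336916,
             4.974786820804187e-06, 0.9999919650711493, 3.0601420299625077e-06],
}
print(decode_v2_output(output))  # -> [1, 1] (both samples are class 1)
```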
Create the InferenceService yaml and expose the gRPC port. Currently only one port can be exposed, either HTTP or gRPC, and the HTTP port is exposed by default.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-v2-iris"
spec:
predictor:
lightgbm:
protocolVersion: v2
storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
ports:
- name: h2c
protocol: TCP
containerPort: 9000
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "lightgbm-v2-iris"
spec:
predictor:
model:
modelFormat:
name: lightgbm
protocolVersion: v2
storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
ports:
- name: h2c
protocol: TCP
containerPort: 9000
Apply the InferenceService yaml to get the gRPC endpoint:
kubectl
kubectl apply -f lightgbm-v2-grpc.yaml
After the gRPC InferenceService becomes ready, grpcurl can be used to send gRPC requests to the InferenceService.
# download the proto file
curl -O https://raw.githubusercontent.com/kserve/kserve/master/docs/predict-api/v2/grpc_predict_v2.proto
INPUT_PATH=iris-input-v2-grpc.json
PROTO_FILE=grpc_predict_v2.proto
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
The gRPC APIs follow the KServe prediction V2 protocol.
For example, the ServerReady API can be used to check whether the server is ready:
grpcurl \
-plaintext \
-proto ${PROTO_FILE} \
  -authority ${SERVICE_HOSTNAME} \
${INGRESS_HOST}:${INGRESS_PORT} \
inference.GRPCInferenceService.ServerReady
Expected output
{
"ready": true
}
The ModelInfer API takes input following the ModelInferRequest schema defined in the grpc_predict_v2.proto file. Note that the input file differs from the one used in the previous curl example.
grpcurl \
-vv \
-plaintext \
-proto ${PROTO_FILE} \
-authority ${SERVICE_HOSTNAME} \
-d @ \
${INGRESS_HOST}:${INGRESS_PORT} \
inference.GRPCInferenceService.ModelInfer \
<<< $(cat "$INPUT_PATH")
Expected output
Resolved method descriptor:
// The ModelInfer API performs inference using the specified model. Errors are
// indicated by the google.rpc.Status returned for the request. The OK code
// indicates success and other codes indicate failure.
rpc ModelInfer ( .inference.ModelInferRequest ) returns ( .inference.ModelInferResponse );
Request metadata to send:
(empty)
Response headers received:
accept-encoding: identity,gzip
content-type: application/grpc
date: Sun, 25 Sep 2022 10:25:05 GMT
grpc-accept-encoding: identity,deflate,gzip
server: istio-envoy
x-envoy-upstream-service-time: 99
Estimated response size: 91 bytes
Response contents:
{
"modelName": "lightgbm-v2-iris",
"outputs": [
{
"name": "predict",
"datatype": "FP64",
"shape": [
"2",
"3"
],
"contents": {
"fp64Contents": [
8.796664107010673e-06,
0.9992300031041593,
0.0007612002317336916,
4.974786820804187e-06,
0.9999919650711493,
3.0601420299625077e-06
]
}
}
]
}
In this example we use a trained paddle resnet50 model to classify images by running an InferenceService with the paddle predictor.
Old Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "paddle-resnet50"
spec:
predictor:
paddle:
storageUri: "https://zhouti-mcp-edge.cdn.bcebos.com/resnet50.tar.gz"
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "paddle-resnet50"
spec:
predictor:
model:
modelFormat:
name: paddle
storageUri: "https://zhouti-mcp-edge.cdn.bcebos.com/resnet50.tar.gz"
Apply the above yaml to create the InferenceService:
kubectl apply -f paddle.yaml
Expected output
$ inferenceservice.serving.kserve.io/paddle-resnet50 created
The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.
MODEL_NAME=paddle-resnet50
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./jay.json
Expected output
* Trying 127.0.0.1:80...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 80 (#0)
> POST /v1/models/paddle-resnet50:predict HTTP/1.1
> Host: paddle-resnet50.default.example.com
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 3010209
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 23399
< content-type: application/json; charset=UTF-8
< date: Mon, 17 May 2021 03:34:58 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 511
<
{"predictions": [[6.736678770380422e-09, 1.1535990829258935e-08, 5.142250714129659e-08, 6.647170636142619e-08, 4.094492567219277e-08, 1.3402451770616608e-07, 9.355561303436843e-08, 2.8935891904779965e-08, 6.845367295227334e-08, 7.680615965455218e-08, 2.0334689452283783e-06, 1.1085678579547675e-06, 2.3477592492326949e-07, 6.582037030966603e-07, 0.00012373103527352214, 4.2878804151769145e-07, 6.419959845516132e-06, 0.9993496537208557, 7.372002437477931e-05, 3.101135735050775e-05, 5.6028093240456656e-06, 2.1862508674530545e-06, 1.9544044604913324e-08, 3.728893887000595e-07, 4.2903633357127546e-07, 1.8251179767503345e-07, 7.159925985433802e-08, 9.231618136595898e-09, 6.469241498052725e-07, 7.031690341108288e-09, 4.451231561120039e-08, 1.2455971898361895e-07, 9.44632745358831e-08, 4.347704418705689e-08, 4.658220120745682e-07, 6.797721141538204e-08, 2.1060276367279585e-07, 2.2605123106700376e-08, 1.4311490303953178e-07, 7.951298641728499e-08, 1.2341783417468832e-07, 1.0921713737843675e-06, 1.5243892448779661e-05, 3.1173343018053856e-07, 2.4152058131221565e-07, 6.863762536113427e-08, 8.467682022228473e-08, 9.4246772164297e-08, 1.0219210366813058e-08, 3.3770753304906975e-08, 3.6928835100979995e-08, 1.3694031508748594e-07, 1.0674284567357972e-07, 2.599483650556067e-07, 3.4866405940192635e-07, 3.132053549848024e-08, 3.574873232992104e-07, 6.64843895492595e-08, 3.1638955988455564e-07, 1.2095878219042788e-06, 8.66409024524728e-08, 4.0144172430700564e-08, 1.2544761318622477e-07, 3.3201178695208e-08, 1.9731444922399533e-07, 3.806405572959193e-07, 1.3827865075199952e-07, 2.300225965257141e-08, 7.14422512260171e-08, 2.851114544455413e-08, 2.982567437470607e-08, 8.936032713791064e-08, 6.22388370175031e-07, 6.478838798784636e-08, 1.3663023423760023e-07, 9.973181391842445e-08, 2.5761554667269593e-08, 4.130220077058766e-08, 3.9384463690339544e-08, 1.2158079698565416e-07, 4.302821707824478e-06, 1.8179063090428826e-06, 1.8520155435908237e-06, 1.6246107179540559e-06, 
1.6448313544970006e-05, 1.0544916221988387e-05, 3.993061909568496e-06, 2.646479799750523e-07, 1.9193475964129902e-05, 4.803242745765601e-07, 1.696285067964709e-07, 4.550505764200352e-06, 4.235929372953251e-05, 4.443338639248395e-06, 5.104009687784128e-06, 1.3506396498996764e-05, 4.1758724478313525e-07, 4.494491463447048e-07, 3.156698369366495e-07, 1.0557599807725637e-06, 1.336463917311903e-08, 1.3893659556174498e-08, 6.770379457066156e-08, 1.4129696523923485e-07, 7.170518756538513e-08, 7.934466594861078e-08, 2.639154317307657e-08, 2.6134321373660896e-08, 7.196725881897237e-09, 2.1752363466021052e-08, 6.684639686227456e-08, 3.417795824134373e-08, 1.6228275967478112e-07, 4.107114648377319e-07, 6.472135396506928e-07, 2.951379372007068e-07, 5.653474133282543e-09, 4.830144462175667e-08, 8.887481861563629e-09, 3.7306168820805397e-08, 1.7784264727538357e-08, 4.641905082536368e-09, 3.413118676576232e-08, 1.937393818707278e-07, 1.2980176506971475e-06, 3.5641004814124244e-08, 2.149332445355867e-08, 3.055293689158134e-07, 1.5532516783878236e-07, 1.4520978766086046e-06, 3.488464628276233e-08, 3.825438398052938e-05, 4.5088432898410247e-07, 4.1766969616219285e-07, 6.770622462681786e-07, 1.4142248971893423e-07, 1.4235997696232516e-05, 6.293820433711517e-07, 4.762866865348769e-06, 9.024900577969674e-07, 9.058987870957935e-07, 1.5713684433649178e-06, 1.5720647184025438e-07, 1.818536503606083e-07, 7.193188622522939e-08, 1.1952824934269302e-06, 8.874837362782273e-07, 2.0870831463071227e-07, 9.906239029078279e-08, 7.793621747964607e-09, 1.0058498389753368e-07, 4.2059440374941914e-07, 1.843624630737395e-07, 1.6437947181202617e-07, 7.025352743994517e-08, 2.570448600636155e-07, 7.586877615040066e-08, 7.841313731660193e-07, 2.495309274763713e-07, 5.157681925993529e-08, 4.0674127177453556e-08, 7.531796519799627e-09, 4.797485431140558e-08, 1.7419973019627832e-08, 1.7958679165985814e-07, 1.2566392371127222e-08, 8.975440124459055e-08, 3.26965476915575e-08, 1.1208359751435637e-07, 
3.906746215420753e-08, 4.6769045525252295e-08, 1.8523553535487736e-07, 1.4833052830454108e-07, 1.2279349448363064e-07, 1.0729105497375713e-06, 3.6538490011395197e-09, 1.6198403329781286e-07, 1.6190719875908144e-08, 1.2004933580556099e-07, 1.4800277448046018e-08, 4.02294837442696e-08, 2.15060893538066e-07, 1.1925696696835075e-07, 4.8982514044837444e-08, 7.608920071788816e-08, 2.3137479487900237e-08, 8.521050176568679e-08, 9.586213423062873e-08, 1.3351650807180704e-07, 3.021699157557123e-08, 4.423876376336011e-08, 2.610667060309879e-08, 2.3977091245797055e-07, 1.3192564551900432e-07, 1.6734931662654162e-08, 1.588336999702733e-07, 4.0643516285854275e-07, 8.753454494581092e-08, 8.366999395548191e-07, 3.437598650180007e-08, 7.847892646850596e-08, 8.526394701391382e-09, 9.601382799928615e-08, 5.258924034023948e-07, 1.3557448141909845e-07, 1.0307226716577134e-07, 1.0429813457335513e-08, 5.187714435805901e-08, 2.187001335585137e-08, 1.1791439824548888e-08, 2.98065643278278e-08, 4.338393466696289e-08, 2.9991046091026874e-08, 2.8507610494443725e-08, 3.058665143385042e-08, 6.441099031917474e-08, 1.5364101102477434e-08, 1.5973883549236234e-08, 2.5736850872704053e-08, 1.0903765712555469e-07, 3.2118737891551064e-08, 6.819742992547617e-09, 1.9251311300649832e-07, 5.8258109447706374e-08, 1.8765761922168167e-07, 4.0070790419122204e-07, 1.5791577823165426e-08, 1.950158434738114e-07, 1.0142063189277906e-08, 2.744815041921811e-08, 1.2843531571604672e-08, 3.7297493094001766e-08, 7.407496838141014e-08, 4.20607833007125e-08, 1.6924804668860816e-08, 1.459203531339881e-07, 4.344977000414474e-08, 1.7191403856031684e-07, 3.5817443233554513e-08, 8.440249388286247e-09, 4.194829728021432e-08, 2.514032360068086e-08, 2.8340199520471288e-08, 8.747196034164517e-08, 8.277125651545703e-09, 1.1676293709683705e-08, 1.4548514570833504e-07, 7.200282148289716e-09, 2.623600948936655e-06, 5.675736929333652e-07, 1.9483527466945816e-06, 6.752595282932816e-08, 8.168475318370838e-08, 1.0933046468153407e-07, 
1.670913718498923e-07, 3.1387276777650186e-08, 2.973524537708272e-08, 5.752163900751839e-08, 5.850877471402782e-08, 3.2544622285968217e-07, 3.330221431951941e-08, 4.186786668469722e-07, 1.5085906568401697e-07, 2.3346819943981245e-07, 2.86402780602657e-07, 2.2940319865938363e-07, 1.8537603807544656e-07, 3.151798182443599e-07, 1.1075967449869495e-06, 1.5369782602192572e-07, 1.9237509718550427e-07, 1.64044664074936e-07, 2.900835340824415e-07, 1.246654903752642e-07, 5.802622027317739e-08, 5.186220519703966e-08, 6.0094205167615655e-09, 1.2333241272699524e-07, 1.3798474185477971e-07, 1.7370231830682314e-07, 5.617761189569137e-07, 5.1604470030497396e-08, 4.813277598714194e-08, 8.032698417537176e-08, 2.0645263703045202e-06, 5.638597713186755e-07, 8.794199857220519e-07, 3.4785980460583232e-06, 2.972389268052211e-07, 3.3904532870110415e-07, 9.469074058188198e-08, 3.754845678827223e-08, 1.5679037801419327e-07, 8.203105039683578e-08, 6.847962641387539e-09, 1.8251624211984563e-08, 6.050240841659615e-08, 3.956342808919544e-08, 1.0699947949888156e-07, 3.2566634899922065e-07, 3.5369430406717584e-07, 7.326295303755614e-08, 4.85765610847011e-07, 7.717713401689252e-07, 3.4567779749750116e-08, 3.246204585138912e-07, 3.1608601602783892e-06, 5.33099466792919e-08, 3.645687343123427e-07, 5.48158936908294e-07, 4.62306957160763e-08, 1.3466177506415988e-07, 4.3529482240955986e-08, 1.6404105451783835e-07, 2.463695381038633e-08, 5.958712634424046e-08, 9.493651020875404e-08, 5.523462576206839e-08, 5.7412357534758485e-08, 1.1850350347231142e-05, 5.8263944993086625e-06, 7.4208674050169066e-06, 9.127966222877149e-07, 2.0019581370434025e-06, 1.033498961078294e-06, 3.5146850763112525e-08, 2.058995278275688e-06, 3.5655509122989315e-07, 6.873234070781109e-08, 2.1935298022413008e-09, 5.560363547374436e-08, 3.3266996979364194e-07, 1.307369217329324e-07, 2.718762992515167e-08, 1.0462929189714032e-08, 7.466680358447775e-07, 6.923166040451179e-08, 1.6145664361033596e-08, 8.568521003837759e-09, 
4.76221018175238e-09, 1.233977116044116e-07, 8.340628632197422e-09, 3.2649041248333788e-09, 5.0632489312363305e-09, 4.0704994930251814e-09, 1.2043538610839732e-08, 5.105608380517879e-09, 7.267142887457112e-09, 1.184516307262129e-07, 7.53557927168913e-08, 6.386964201965384e-08, 1.6212936770898523e-08, 2.610429419291904e-07, 6.979425393183192e-07, 6.647513117741255e-08, 7.717492849224072e-07, 6.651206945207377e-07, 3.324495310152997e-07, 3.707282019149716e-07, 3.99564243025452e-07, 6.411632114122767e-08, 7.107352217872176e-08, 1.6380016631956096e-07, 6.876800995314625e-08, 3.462474467141874e-07, 2.0256503319160402e-07, 6.19610148078209e-07, 2.6841073363925716e-08, 6.720335363752383e-07, 1.1348340649419697e-06, 1.8397931853542104e-06, 6.397251581802266e-07, 7.257533241045167e-08, 4.2213909523525217e-07, 3.9657925299252383e-07, 1.4037439655112394e-07, 3.249856774800719e-07, 1.5857655455420172e-07, 1.1122217102865761e-07, 7.391420808744442e-08, 3.42322238111592e-07, 5.39796154441774e-08, 8.517296379295658e-08, 4.061009803990601e-06, 1.4478755474556237e-05, 7.317032757470088e-09, 6.9484960008026064e-09, 4.468917325084476e-08, 9.23141172393116e-08, 5.411982328951126e-08, 2.2242811326123046e-07, 1.7609554703312824e-08, 2.0906279374344194e-08, 3.6797682678724186e-09, 6.177919686933819e-08, 1.7920288541972695e-07, 2.6279179721200308e-08, 2.6988200119149042e-08, 1.6432807115052128e-07, 1.2827612749788386e-07, 4.468908798571647e-08, 6.316552969565237e-08, 1.9461760203398626e-08, 2.087125849925542e-08, 2.2414580413965268e-08, 2.4765244077684656e-08, 6.785398465325443e-09, 2.4248794971981624e-08, 4.554979504689527e-09, 2.8977037658250993e-08, 2.0402325162649504e-08, 1.600950270130852e-07, 2.0199709638291097e-07, 1.611188515937556e-08, 5.964113825029926e-08, 4.098318573397819e-09, 3.9080127578472457e-08, 7.511338218080255e-09, 5.965624154669058e-07, 1.6478223585636442e-07, 1.4106989354445432e-08, 3.2855584919389e-08, 3.3387166364917675e-09, 1.220043444050134e-08, 
4.624639160510924e-08, 6.842309385746148e-09, 1.74262879681919e-08, 4.6611329906909305e-08, 9.331947836699328e-08, 1.2306078644996887e-07, 1.2359445022980253e-08, 1.1173199254699284e-08, 2.7724862405875683e-08, 2.419210147763806e-07, 3.451186785241589e-07, 2.593766978975509e-08, 9.964568192799561e-08, 9.797809674694236e-09, 1.9085564417764544e-07, 3.972706252852731e-08, 2.6639204619982593e-08, 6.874148805735558e-09, 3.146993776681484e-08, 2.4086594407890516e-07, 1.3126927456141857e-07, 2.1254339799270383e-07, 2.050203384840188e-08, 3.694976058454813e-08, 6.563175816154398e-07, 2.560050127442537e-08, 2.6882981174480847e-08, 6.880636078676616e-07, 2.0092733166166e-07, 2.788039665801989e-08, 2.628409134786125e-08, 5.1678345158734373e-08, 1.8935413947929192e-07, 4.61852835087484e-07, 1.1086777718105623e-08, 1.4542604276357451e-07, 2.8737009216683873e-08, 6.105167926762078e-07, 1.2016463379893594e-08, 1.3944705301582871e-07, 2.093712758721722e-08, 4.3801410498645055e-08, 1.966320795077081e-08, 6.654448991838535e-09, 1.1149590584125235e-08, 6.424939158478082e-08, 6.971554888934861e-09, 3.260019587614238e-09, 1.4260189473702667e-08, 2.7895078247297533e-08, 8.11578289017234e-08, 2.5995715802196173e-08, 2.2855578762914774e-08, 1.055962854934478e-07, 8.145542551574181e-08, 3.7793686402665116e-08, 4.881891513264236e-08, 2.342062366267328e-08, 1.059935517133681e-08, 3.604105103249822e-08, 5.062430830093945e-08, 3.6804440384230475e-08, 1.501580193519203e-09, 1.4475033367489232e-06, 1.076210423889279e-06, 1.304991315009829e-07, 3.073601462233455e-08, 1.7184021317007137e-08, 2.0421090596300928e-08, 7.904992216367646e-09, 1.6902052379919041e-07, 1.2416506933732308e-08, 5.4758292122869534e-08, 2.6250422280327257e-08, 1.3261367115546818e-08, 6.29807459517906e-08, 1.270998595259698e-08, 2.0171681569536304e-07, 4.386637186826192e-08, 6.962349630157405e-08, 2.9565120485131047e-07, 7.925131626507209e-07, 2.0868920103112032e-07, 1.7341794489311724e-07, 4.2942417621816276e-08, 
4.213406956665722e-09, 8.824785169281313e-08, 1.7341569957807224e-08, 7.321587247588468e-08, 1.7941774288487977e-08, 1.1245148101579616e-07, 4.242405395871174e-07, 8.259573469615589e-09, 1.1336403105133286e-07, 8.268798978861014e-08, 2.2186977588489754e-08, 1.9539720952366224e-08, 1.0675703876472653e-08, 3.288517547161973e-08, 2.4340963022950746e-08, 6.639137239972115e-08, 5.604687380866835e-09, 1.386604697728444e-08, 6.675873720496384e-08, 1.1355886009312144e-08, 3.132159633878473e-07, 3.12451788886392e-08, 1.502181845580708e-07, 1.3461754377885882e-08, 1.8882955998833495e-07, 4.645742279762999e-08, 4.6453880742092224e-08, 7.714453964524637e-09, 3.5857155467056145e-08, 7.60832108426257e-09, 4.221501370693659e-08, 4.3407251126836854e-09, 1.340157496088068e-08, 8.565600495558101e-08, 1.7045413969185574e-08, 5.4221903411644234e-08, 3.021912675649219e-08, 6.153376119755194e-08, 3.938857240370908e-09, 4.135628017820636e-08, 1.781920389021252e-08, 4.3105885083605244e-08, 3.903354972578654e-09, 7.663085455078544e-08, 1.1890405993142394e-08, 9.304217840622186e-09, 1.0968062014171664e-09, 1.0536767902635802e-08, 1.1516804221400889e-07, 8.134522886393825e-07, 5.952623993721318e-08, 2.806350174466843e-08, 1.2833099027886874e-08, 1.0605690192733164e-07, 7.872949936427176e-07, 2.7501393162765453e-08, 3.936289072470345e-09, 2.0519442145428002e-08, 7.394815870753746e-09, 3.598397313453461e-08, 2.5378517065632877e-08, 4.698972233541099e-08, 7.54952989012736e-09, 6.322805461422831e-07, 5.582006412652163e-09, 1.29640980617296e-07, 1.5874988434916304e-08, 3.3837810775594335e-08, 6.474512037613067e-09, 9.121148281110436e-08, 1.3918511676536127e-08, 8.230025549949005e-09, 2.7061290097663004e-08, 2.6095918315149902e-08, 5.722363471960534e-09, 6.963475698285038e-07, 4.685091781198025e-08, 9.590579885809802e-09, 2.099205858030473e-07, 3.082160660028421e-08, 3.563162565001221e-08, 7.326312925215461e-07, 2.1759731225756695e-06, 2.407518309155421e-07, 2.974515780351794e-07, 
2.529018416908002e-08, 7.667950718825978e-09, 2.663289251358947e-07, 3.4358880185436647e-08, 2.3130198201215535e-08, 3.1239693498719134e-08, 2.8691621878351725e-07, 3.895845068768722e-08, 2.4184130253956937e-08, 1.1582445225144511e-08, 5.1545349322168477e-08, 2.034345492063494e-08, 8.201963197507212e-08, 1.164153573540716e-08, 5.496356720868789e-07, 1.1682151246361627e-08, 4.7576914852243135e-08, 1.6349824605299546e-08, 4.090862759653646e-08, 2.1271189609706198e-07, 1.6697286753242224e-07, 3.989708119433999e-08, 2.852450279533514e-06, 1.2500372292834072e-07, 2.4846613655427063e-07, 1.245429093188477e-08, 2.9700272463628608e-08, 4.250991558762962e-09, 1.61443480806156e-07, 2.6386018703306036e-07, 7.638056409575711e-09, 3.4455793773702226e-09, 7.273289526210647e-08, 1.7631434090503717e-08, 7.58661311550668e-09, 2.1547013062672704e-08, 1.2675349125856883e-07, 2.5637149292379036e-08, 3.500976220038865e-08, 6.472243541111311e-08, 8.387915251262257e-09, 3.069512288789156e-08, 7.520387867998579e-08, 1.5724964441687916e-07, 1.9634005354873807e-07, 1.2290831818972947e-07, 1.112118730439704e-09, 1.546895944670723e-08, 9.91701032404535e-09, 6.882473257974198e-07, 8.267616635748709e-08, 4.469531234008173e-08, 2.075201344098332e-08, 8.649378457903367e-08, 5.202766573120243e-08, 4.5564942041664835e-08, 2.0319955496006514e-08, 8.705182352741758e-09, 6.452066969586667e-08, 2.1777438519166026e-08, 1.030954166481024e-08, 3.211904342492744e-08, 2.3336936294526822e-07, 8.054096056753224e-09, 1.9623354319264763e-07, 1.2888089884199871e-07, 1.5392496166555247e-08, 1.401903038100727e-09, 5.696818305978013e-08, 6.080025372057207e-09, 1.0782793324892737e-08, 2.4260730313585555e-08, 1.9388659566743627e-08, 2.2970310453729326e-07, 1.9971754028347277e-08, 2.8477993296860404e-08, 5.2273552597625894e-08, 2.7392806600801123e-07, 9.857291161097237e-08, 3.12910977129377e-08, 4.151442212219081e-08, 5.251196366629074e-09, 1.580681100676884e-06, 8.547603442821128e-07, 1.068913135782168e-08, 
1.0621830597301596e-06, 7.737313012512459e-08, 6.394216711669287e-08, 1.1698345758759388e-07, 1.0486609625104393e-07, 2.1161000063329993e-07, 1.53396815250062e-08, 5.094453570109181e-08, 1.4005379966874898e-08, 2.6282036102998063e-08, 8.778433624456738e-08, 7.772066545896905e-09, 4.228875383205377e-08, 3.3243779284930497e-07, 7.729244799747903e-08, 7.636901111496286e-10, 5.989500806435899e-08, 1.326090597331131e-07, 1.2853634245857393e-07, 8.844242671557367e-09, 1.0194374766570036e-07, 2.493779334145074e-07, 1.6547971881664125e-07, 1.1762754326127833e-08, 1.1496195639892903e-07, 2.9342709240154363e-07, 1.326124099421122e-08, 8.630262726683213e-08, 5.7394842656322e-08, 1.1094081031615133e-07, 2.2933713239581266e-07, 3.4706170026765903e-07, 1.4751107357824367e-07, 1.502495017291494e-08, 6.454319390059027e-08, 5.164533689594464e-08, 6.23741556182722e-08, 1.293601457064142e-07, 1.4052071506398534e-08, 5.386946000385251e-08, 2.0827554791935654e-08, 1.3040637902861363e-08, 1.0578981601838677e-07, 1.5079727688771527e-08, 8.92632726845477e-07, 4.6374381668101705e-08, 7.481006036869076e-07, 5.883147302654379e-09, 2.8707685117979054e-09, 8.381598490814213e-07, 7.341958596640552e-09, 1.4245998158912698e-08, 1.0926417104428765e-07, 1.1308178216040687e-07, 2.52339901862797e-07, 1.1782835684925885e-07, 4.6678056975224536e-08, 2.7959197179683315e-09, 3.4363861090014325e-08, 1.4674496640054713e-07, 3.5396915620822256e-08, 2.0581127557761647e-07, 7.18387909159901e-08, 2.7693943138729082e-08, 4.5493386835460115e-08, 1.9559182717898693e-08, 1.5359708172013598e-08, 1.2336623278486059e-08, 2.9570605519779747e-08, 2.877552560676122e-07, 9.051845495378075e-07, 2.3732602016934834e-07, 1.6521676471370483e-08, 1.5478875070584763e-08, 3.526786329643983e-08, 3.616410637619083e-08, 1.61590953950963e-08, 7.65007328595857e-08, 1.9661483108279754e-08, 4.917534823789538e-08, 1.1712612746350715e-07, 1.0889253054813253e-08, 1.494120169809321e-06, 1.018585660261806e-08, 3.7575969003000864e-08, 
2.097097784314883e-08, 3.368558054717141e-08, 4.845588819080149e-09, 6.039624622644624e-07, 1.037331109898787e-08, 2.841650257323636e-07, 4.4990630954089283e-07, 3.463186004637464e-08, 7.720684180867465e-08, 1.471122175189521e-07, 1.1601575522490748e-07, 4.007488030310924e-07, 3.025649775167949e-08, 6.706784461130155e-08, 2.0128741340386114e-08, 1.5987744461654074e-09, 4.1919822280078733e-08, 1.3167154477855547e-08, 3.231814815762846e-08, 9.247659704669786e-08, 1.3075300842047e-07, 1.0574301256838226e-07, 3.762165334819656e-08, 1.0942246575496029e-07, 7.001474955359299e-08, 2.742706151082075e-08, 2.0766625752344225e-08, 4.5403403703403455e-08, 3.39040298058535e-08, 1.0469661759771043e-07, 2.8271578855765256e-08, 3.406226767310727e-07, 5.146206945028098e-07, 6.740708613506285e-07, 6.382248063374618e-09, 3.63878704945364e-08, 3.626059807970705e-08, 1.6065602892467723e-07, 3.639055989879125e-07, 6.232691696084203e-09, 4.805490050330263e-08, 3.372633727849461e-08, 6.328880317596486e-07, 6.480631498106959e-08, 2.1165197949812864e-07, 8.38779143919055e-08, 1.7589144363228115e-08, 2.729027670511641e-09, 2.144795097080987e-08, 7.861271456022223e-08, 2.0118186228046397e-08, 2.8407685093156942e-08, 2.4922530883486615e-07, 2.0156670998972004e-08, 2.6551649767725394e-08, 2.7848242822869906e-08, 6.907123761834555e-09, 1.880543720744754e-08, 1.3006903998302732e-08, 3.685918272822164e-07, 3.967941211158177e-07, 2.7592133022835696e-08, 2.5228947819755376e-08, 1.547002881352455e-07, 3.689306637966183e-08, 1.440177199718562e-09, 2.1504929392790473e-08, 5.068111263994979e-08, 5.081711407228795e-08, 1.171875219085905e-08, 5.409278358570191e-08, 7.138276600926474e-07, 2.5237213208129106e-07, 7.072044638789521e-08, 7.199763984999663e-08, 1.2525473103153217e-08, 3.4803417747752974e-07, 1.9591827538079087e-07, 1.2404700555634918e-07, 1.234617457157583e-07, 1.9201337408958352e-08, 1.9895249181445251e-07, 3.7876677794201896e-08, 1.0629785052174157e-08, 1.2437127772102485e-08, 
2.1861892207653e-07, 2.6181456291851646e-07, 1.112900775979142e-07, 1.0776630432474121e-07, 6.380325157095967e-09, 3.895085143312826e-09, 1.5762756788717525e-07, 2.909027019271093e-09, 1.0381050685737137e-08, 2.8135211493918177e-08, 1.0778002490496874e-08, 1.3605974125141529e-08, 2.9236465692861202e-08, 1.9189795352758665e-07, 2.199506354827463e-07, 1.326399790002597e-08, 4.9004846403022384e-08, 2.980837132682268e-09, 8.926045680368588e-09, 1.0996975774446582e-08, 7.71560149104289e-09, 7.454491246505768e-09, 5.086162246925596e-08, 1.5129764108223753e-07, 1.1960075596562092e-08, 1.1323334270230134e-08, 9.391332156383214e-09, 9.585701832293125e-08, 1.905532798218701e-08, 1.8105303922766325e-08, 6.179227796110354e-08, 6.389401363549041e-08, 1.1853179771037503e-08, 9.37277544466042e-09, 1.2332148457971925e-07, 1.6522022860954166e-08, 1.246116454467483e-07, 4.196171854431441e-09, 3.996593278543514e-08, 1.2554556505506298e-08, 1.4302138140465104e-08, 6.631793780798034e-09, 5.964224669696705e-09, 5.556936244488497e-09, 1.4192455921602232e-07, 1.7613080771639034e-08, 3.380189639301534e-07, 7.85651934620546e-08, 2.966783085867064e-08, 2.8992105853831163e-06, 1.3787366697215475e-06, 5.313622430946907e-09, 2.512852859126724e-08, 8.406627216572815e-08, 4.492839167369311e-08, 5.408793057881667e-08, 2.4239175999696272e-08, 4.016805235096399e-07, 4.1083545454512205e-08, 5.4153481698904216e-08, 8.640767212853007e-09, 5.773256717134245e-08, 2.6443152023603034e-07, 8.953217047746875e-07, 2.7994001783326894e-08, 5.889480014786841e-09, 4.1788819515886644e-08, 2.8880645430717777e-08, 2.135752907861388e-08, 2.3024175277441827e-07, 8.786625471657317e-08, 2.0697297209437693e-09, 2.236410523437371e-08, 3.203276310870251e-09, 1.176874686592555e-08, 6.963571053120177e-08, 2.271932153519174e-08, 7.360382525689602e-09, 6.922528772435044e-09, 3.213871480056696e-08, 1.370577820125618e-07, 1.9815049157045905e-08, 1.0578956377571558e-08, 2.7049420481262132e-08, 2.9755937713815683e-09, 
2.1773699288019088e-08, 1.09755387001087e-08, 1.991872444762066e-08, 2.3882098076910552e-08, 2.1357365653784655e-08, 6.109098560358461e-09, 1.1890497475519624e-08, 1.1459891702259029e-08, 3.73173456580389e-08, 1.572620256240498e-08, 3.404023374287135e-08, 3.6921580459647885e-08, 9.281765045443535e-08, 1.2323201303843234e-07, 4.2347593876002065e-08, 1.7423728237986325e-08, 5.8113389656000436e-08, 3.931436154402945e-08, 2.3690461148362374e-08, 1.792850135018398e-08, 1.440664210150544e-08, 7.019830494670032e-09, 6.041522482291839e-08, 4.867479930226182e-08, 1.0685319296044327e-08, 1.0051243393149889e-08, 4.2426261614991745e-08, 2.607815297039906e-08,
5.136670200300841e-09, 1.69729952315123e-09, 1.9131586981302462e-08, 2.111743526711507e-07, 1.337269672774255e-08, 2.0002481448955223e-08, 1.0454256482717028e-07, 2.8144228281234973e-08, 2.1344791889532644e-07, 2.1046110632028103e-08, 1.9114453664315079e-07, 3.957693550660224e-08, 2.931631826186276e-08, 1.105203111251285e-07, 4.84007678380749e-08, 5.583606110803885e-08, 1.2130111315400427e-07, 1.77621615193857e-08, 2.5610853882085394e-08, 1.203865309662433e-07, 4.674859610531712e-09, 1.5916098661250544e-08, 3.147594185293201e-08, 6.147686093527227e-08, 2.204641802450169e-08, 3.257763410147163e-07, 1.198914532096751e-07, 2.3818989802748547e-07, 1.4909986134625797e-08, 5.10168831624469e-08, 5.5142201915714395e-08, 2.288550327023131e-08, 5.714110073995471e-08, 5.185095801607531e-07, 4.977285783525076e-08, 1.1049896109227575e-08, 1.264099296349741e-07, 8.174881571676451e-08]]}
* Connection #0 to host localhost left intact
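The `predictions` field above is a nested list with one row of class probabilities per input image; the predicted label is simply the index of the largest value (here 0.99934..., at index 17 of the full row). A minimal sketch of extracting it, using a short trimmed stand-in for the full response body:

```python
import json

# Trimmed stand-in for the response body above (the real row has one
# probability per class; in the full output 0.9993... sits at index 17).
body = '{"predictions": [[6.7e-09, 1.1e-08, 0.9993496537208557, 7.3e-05]]}'

response = json.loads(body)
for row in response["predictions"]:
    # Index of the most likely class and its score.
    top = max(range(len(row)), key=row.__getitem__)
    print(top, row[top])  # → 2 0.9993496537208557
```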
This example walks you through how to deploy an mlflow model with the KServe InferenceService CRD and how to send inference requests using the V2 dataplane.
The first step is to train a sample sklearn model and save it in MLflow model format by calling the mlflow log_model API.
# Original source code and more details can be found in:
# https://www.mlflow.org/docs/latest/tutorials-and-examples/tutorial.html
# The data set used in this example is from
# http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties.
# In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
import warnings
import sys
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
from mlflow.models.signature import infer_signature
import logging
logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)
def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml"
        "/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, "
            "check your internet connection. Error: %s",
            e,
        )

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
        model_signature = infer_signature(train_x, train_y)

        # Model registry does not work with file store
        if tracking_url_type_store != "file":
            # Register the model
            # There are other ways to use the Model Registry,
            # which depends on the use case,
            # please refer to the doc for more information:
            # https://mlflow.org/docs/latest/model-registry.html#api-workflow
            mlflow.sklearn.log_model(
                lr,
                "model",
                registered_model_name="ElasticnetWineModel",
                signature=model_signature,
            )
        else:
            mlflow.sklearn.log_model(lr, "model", signature=model_signature)
The training script serializes our trained model using the MLflow Model format:
model/
├── MLmodel
├── model.pkl
├── conda.yaml
└── requirements.txt
Once you have your model serialized as model.pkl, we can then use MLServer to spin up a local server. For more details on MLServer, feel free to check the MLflow example doc.
Note
This step is optional and only meant for local testing; you can jump straight to deployment with InferenceService.
To use MLServer locally, you will first need to install the mlserver package in your local environment, as well as the MLflow runtime:
pip install mlserver mlserver-mlflow
The next step is to provide some model settings so that MLServer knows:
- the name and version of our model
- the implementation class to be used to serve our model
These can be specified through environment variables or by creating a local model-settings.json file:
{
"name": "mlflow-wine-classifier",
"version": "v1.0.0",
"implementation": "mlserver_mlflow.MLflowRuntime"
}
With the mlserver package installed locally and a local model-settings.json file in place, you should now be able to start the server:
mlserver start .
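With the server running, you can already exercise the model locally over the V2 protocol. A minimal sketch of the local inference endpoint, assuming MLServer's default HTTP port (8080) and the model name from the model-settings.json above:

```python
# The model name must match "name" in model-settings.json; 8080 is
# MLServer's default HTTP port (configurable via its settings).
model_name = "mlflow-wine-classifier"
infer_url = f"http://localhost:8080/v2/models/{model_name}/infer"
print(infer_url)  # → http://localhost:8080/v2/models/mlflow-wine-classifier/infer
```

A POST of a V2 payload (such as the wine-classifier example later on this page) against this URL should return the model's prediction.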
When you deploy your model with InferenceService, KServe injects sensible defaults so that it runs out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.
To run inference against the deployed model using the v2 protocol, set the protocolVersion field to v2. In this example, the model artifact has already been uploaded to a GCS model repository and is accessible as gs://kfserving-examples/models/mlflow/wine.
New Schema
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "mlflow-v2-wine-classifier"
spec:
predictor:
model:
modelFormat:
name: mlflow
protocolVersion: v2
storageUri: "gs://kfserving-examples/models/mlflow/wine"
kubectl
kubectl apply -f mlflow-new.yaml
You can now test your deployed model by sending a sample request.
Note that this request needs to follow the V2 dataplane protocol. You can see an example payload below:
{
"parameters": {
"content_type": "pd"
},
"inputs": [
{
"name": "fixed acidity",
"shape": [1],
"datatype": "FP32",
"data": [7.4]
},
{
"name": "volatile acidity",
"shape": [1],
"datatype": "FP32",
"data": [0.7000]
},
{
"name": "citric acid",
"shape": [1],
"datatype": "FP32",
"data": [0]
},
{
"name": "residual sugar",
"shape": [1],
"datatype": "FP32",
"data": [1.9]
},
{
"name": "chlorides",
"shape": [1],
"datatype": "FP32",
"data": [0.076]
},
{
"name": "free sulfur dioxide",
"shape": [1],
"datatype": "FP32",
"data": [11]
},
{
"name": "total sulfur dioxide",
"shape": [1],
"datatype": "FP32",
"data": [34]
},
{
"name": "density",
"shape": [1],
"datatype": "FP32",
"data": [0.9978]
},
{
"name": "pH",
"shape": [1],
"datatype": "FP32",
"data": [3.51]
},
{
"name": "sulphates",
"shape": [1],
"datatype": "FP32",
"data": [0.56]
},
{
"name": "alcohol",
"shape": [1],
"datatype": "FP32",
"data": [9.4]
}
]
}
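Typing out one input block per feature is tedious; the same payload can be generated from a plain feature dict. A minimal sketch (feature names and values match the example payload above; the `content_type: "pd"` parameter asks the MLflow runtime to decode the inputs as a pandas DataFrame):

```python
# Wine features from the example payload above, one scalar per column.
features = {
    "fixed acidity": 7.4, "volatile acidity": 0.7, "citric acid": 0.0,
    "residual sugar": 1.9, "chlorides": 0.076, "free sulfur dioxide": 11.0,
    "total sulfur dioxide": 34.0, "density": 0.9978, "pH": 3.51,
    "sulphates": 0.56, "alcohol": 9.4,
}

# Each feature becomes one V2 input tensor of shape [1].
payload = {
    "parameters": {"content_type": "pd"},
    "inputs": [
        {"name": name, "shape": [1], "datatype": "FP32", "data": [value]}
        for name, value in features.items()
    ],
}
print(len(payload["inputs"]), payload["inputs"][0]["name"])  # → 11 fixed acidity
```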
Now, assuming that your ingress can be accessed at ${INGRESS_HOST}:${INGRESS_PORT}, you can follow these instructions to find out the ingress IP and port.
You can use curl to send the inference request as:
SERVICE_HOSTNAME=$(kubectl get inferenceservice mlflow-v2-wine-classifier -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v \
-H "Host: ${SERVICE_HOSTNAME}" \
-H "Content-Type: application/json" \
-d @./mlflow-input.json \
http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mlflow-v2-wine-classifier/infer
Expected Output
{
"model_name":"mlflow-v2-wine-classifier",
"model_version":null,
"id":"699cf11c-e843-444e-9dc3-000d991052cc",
"parameters":null,
"outputs":[
{
"name":"predict",
"shape":[1],
"datatype":"FP64",
"parameters":null,
"data":[5.576883936610762]
}
]
}