You are viewing the RapidMiner Scoring-agent documentation for version 9.7.

Kubernetes

The provided Docker Images are ready to deploy to any Kubernetes cluster.

Please review the configuration below according to your environment and requirements.

The following guide requires a running Kubernetes cluster.

Our example configuration was tested in the following Kubernetes services:

Deployment architecture and definition

This tutorial covers a multi-container-based deployment on Kubernetes with the following components:

  • RapidMiner Real-Time Scoring Agent,
  • RapidMiner Real-Time Scoring Web UI,
  • a frontend proxy, and
  • a cron container.

To deploy RapidMiner Real-Time Scoring on Kubernetes, you need to define the services, volumes and deployments.

Volumes

The Volumes configuration uses four persistent volumes, similar to those described in the Docker-based deployment section:

  1. A volume for uploaded files storage
  2. A volume for cron log files storage
  3. A volume for license data storage
  4. A volume for the deployments of the RapidMiner Real-Time Scoring

To define the volumes, you can apply the following Kubernetes Object Configuration YAML file.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rapidminer-uploaded-pvc
  labels:
    app: rapidminer-webui
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rapidminer-cron-log-pvc
  labels:
    app: rapidminer-cron
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-licenses-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-deployments-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Services

To deploy the example configuration, three Kubernetes Service endpoints are defined:

  1. The public proxy service endpoint represents the public web interface of the proxy container (port: 443).
  2. The private RapidMiner WebUI service endpoint represents the private web interface of the WebUI container (port: 81, as set in the service definition in this section).
  3. The private RapidMiner Real-Time Scoring service endpoint represents the private web interface of the Real-Time Scoring container (port: 8090).

Note:

  • The public endpoint definition may differ between Kubernetes clusters. Public cloud providers support the LoadBalancer type, but the MicroK8S implementation requires an Ingress to be set up to enable public access.
  • When testing in MiniKube, the annotation block and the type: LoadBalancer line can be ignored. Please read the Notices about minikube.
  • It is strongly recommended to use a valid certificate. The sample service definition contains recommended settings to set up an AWS load balancer for HTTPS offloading with AWS Certificate Manager. For usage in a protected network, or for testing (e.g. MiniKube or MicroK8S), the annotation block can be omitted and nodePort can be used for all the services.

To define the service endpoints, you can apply the following Kubernetes Object Configuration YAML file:

kind: Service
apiVersion: v1
metadata:
  name: rapidminer-proxy
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:XX-XXXX-X:XXXXXXXXXXXX:certificate/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-2-2017-01"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "false"
    service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "Name=rapidminer-rts-elb"
  labels:
    app: real-time-scoring-webui
    role: webui
spec:
  type: LoadBalancer
  ports:
    - name: rts-proxyhttp
      port: 443
      protocol: TCP
      targetPort: rts-proxy-http
  selector:
    app: real-time-scoring-webui
    role: webui
---
kind: Service
apiVersion: v1
metadata:
  name: real-time-scoring-webui
  labels:
    app: real-time-scoring-webui
    role: webui
spec:
  ports:
    - name: rts-webuiport
      port: 81
      protocol: TCP
      targetPort: rts-webuiport
  selector:
    app: real-time-scoring-webui
    role: webui
---
kind: Service
apiVersion: v1
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  ports:
    - name: rts-scoreport
      port: 8090
      protocol: TCP
      targetPort: rts-scoreport
  selector:
    app: real-time-scoring-agent
    role: real-time-scoring

Deployments (Pods, Containers)

The containers are deployed using the Kubernetes Deployment object type, which provides replication. In this example it starts one replica of each type.

The environment variables are defined based on the Docker Image documentation.

The example configuration defines the following 2 deployments:

  • The RapidMiner Real-Time Scoring Agent pod is defined with the following configuration. The rts-deployments-pvc is used to provide persistence for the scoring deployments.

Because sharing volumes between Kubernetes pods can be difficult to set up and maintain, the example configuration below is prepared to download the licensing information from the WebUI at container startup.

Please review the resource limitations to fit with your hardware capabilities.

To influence which worker node Kubernetes starts the pod on, first add a label to a worker node of the cluster, then reference that label with the nodeSelector property in the deployment.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: real-time-scoring-agent
  template:
    metadata:
      labels:
        app: real-time-scoring-agent
        role: real-time-scoring
    spec:
      containers:
        - name: real-time-scoring-agent
          image: rapidminer/rapidminer-execution-scoring:latest
          ports:
            - name: rts-scoreport
              containerPort: 8090
          env:
            - name: WAIT_FOR_LICENSES
              value: "1"
            - name: MANAGEMENT_API_ENDPOINT
              value: "http://real-time-scoring-webui:81/"
          resources:
            requests:
              memory: "2G"
              cpu: "1"
            limits:
              memory: "32G"
              cpu: "1"
          volumeMounts:
            - name: rts-deployments-pv
              mountPath: /rapidminer-scoring-agent/home/deployments
      volumes:
        - name: rts-deployments-pv
          persistentVolumeClaim:
            claimName: rts-deployments-pvc
      # nodeSelector:
      #   node-label-name: label-value-of-worker-node-where-rts-may-started
  • The RapidMiner Real-Time Scoring WebUI pod is defined with the following configuration. The rapidminer-uploaded-pvc, rapidminer-cron-log-pvc and rts-licenses-pvc are used to provide persistence for the uploaded files, logs, and licenses.

Because sharing volumes between Kubernetes pods can be difficult to set up and maintain, the example pod configuration below contains three containers. Kubernetes therefore always deploys them on the same worker node, where they can share volumes.

The resource limitations are included for reference; these containers are not resource intensive.

To influence which worker node Kubernetes starts the pod on, first add a label to a worker node of the cluster, then reference that label with the nodeSelector property in the deployment.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: real-time-scoring-webui
  labels:
    app: real-time-scoring-webui
    role: webui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: real-time-scoring-webui
  template:
    metadata:
      labels:
        app: real-time-scoring-webui
        role: webui
    spec:
      containers:
        - name: rapidminer-cron
          image: rapidminer/rapidminer-real-time-scoring-cron:latest
          resources:
            requests:
              memory: "100M"
              cpu: "0.5"
            limits:
              memory: "200M"
              cpu: "0.5"
          volumeMounts:
            - name: rapidminer-uploaded-pv
              mountPath: /rapidminer/uploaded/
            - name: rapidminer-cron-log-pv
              mountPath: /var/log/
            - name: rts-licenses-pv
              mountPath: /rapidminer/rts_home/licenses/
        - name: real-time-scoring-webui
          image: rapidminer/rapidminer-real-time-scoring-webui:latest
          ports:
            - name: rts-webuiport
              containerPort: 81
          resources:
            requests:
              memory: "200M"
              cpu: "0.5"
            limits:
              memory: "500M"
              cpu: "0.5"
          volumeMounts:
            - name: rapidminer-uploaded-pv
              mountPath: /var/www/html/uploaded
        - name: rapidminer-proxy
          image: rapidminer/rapidminer-real-time-scoring-proxy:latest
          ports:
            - name: rts-proxy-http
              containerPort: 80
          resources:
            requests:
              memory: "200M"
              cpu: "1"
            limits:
              memory: "200M"
              cpu: "1"
          volumeMounts:
            - name: rapidminer-uploaded-pv
              mountPath: /rapidminer/uploaded
              readOnly: true
      volumes:
        - name: rapidminer-uploaded-pv
          persistentVolumeClaim:
            claimName: rapidminer-uploaded-pvc
        - name: rapidminer-cron-log-pv
          persistentVolumeClaim:
            claimName: rapidminer-cron-log-pvc
        - name: rts-licenses-pv
          persistentVolumeClaim:
            claimName: rts-licenses-pvc
      # nodeSelector:
      #   node-label-name: label-value-of-worker-node-where-rts-may-started
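
As a minimal sketch of the nodeSelector setup described for both deployments, the fragment below shows where the property sits inside a Deployment. The label key rts-node and its value "true" are hypothetical names chosen for illustration; use whatever label you attach to your own worker node.

```yaml
# Assumed one-time step, run before applying the deployment
# (the label key and value are hypothetical):
#   kubectl label nodes <node-name> rts-node=true
#
# Fragment to merge into either Deployment definition:
spec:
  template:
    spec:
      nodeSelector:
        rts-node: "true"   # label values are strings, so keep the quotes
```

With this in place, Kubernetes should only schedule the pod on nodes that carry the matching label. If no node has the label, the pod stays in the Pending state.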

Deployment process

Based on the object definitions shown above, RapidMiner Real-Time Scoring can be deployed on a Kubernetes cluster with all of its components:

  • Make sure that the connection to your Kubernetes Cluster is working
    $ kubectl version
    Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:13:54Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:15:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Create and check the volumes
    $ kubectl apply -f volumes.yml
    persistentvolumeclaim/rapidminer-uploaded-pvc created
    persistentvolumeclaim/rapidminer-cron-log-pvc created
    persistentvolumeclaim/rts-licenses-pvc created
    persistentvolumeclaim/rts-deployments-pvc created
    $ kubectl get pv,pvc
  • Create and check services
    $ kubectl apply -f services.yml
    service/rapidminer-proxy created
    service/real-time-scoring-webui created
    service/real-time-scoring-agent created
    $ kubectl get svc
    NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    rapidminer-proxy          ClusterIP   10.103.149.61    <none>        443/TCP,80/TCP   115s
    real-time-scoring-agent   ClusterIP   10.104.163.156   <none>        8090/TCP         115s
    real-time-scoring-webui   ClusterIP   10.98.219.140    <none>        80/TCP           115s
  • Create Deployments
    $ kubectl apply -f real-time-scoring-agent.yml
    deployment.apps/real-time-scoring-agent created
    $ kubectl apply -f real-time-scoring-webui.yml
    deployment.apps/real-time-scoring-webui created
  • Check the running deployments
    $ kubectl get pod
    NAME                                       READY   STATUS    RESTARTS   AGE
    real-time-scoring-agent-85c57b9675-6l2fv   1/1     Running   0          6m6s
    real-time-scoring-webui-66799d6b74-7c8j9   3/3     Running   0          6m6s
  • Check the logs of a running Real-Time Scoring Agent container/pod (replace the pod names with those shown in the output of your get pod command above)
    $ kubectl logs -f real-time-scoring-agent-85c57b9675-6l2fv
    ...
    [INFO] Waiting for license synchronization.... Please upload your licenses on the Web UI
    [INFO] Waiting for license synchronization.... Please upload your licenses on the Web UI
    ...

The real-time-scoring-webui pod is slightly different: because it contains three containers, the container must also be specified in the command:

$ kubectl logs -f real-time-scoring-webui-66799d6b74-7c8j9 -c rapidminer-proxy
...
[entrypoint.sh] Mandatory file missing, waiting...
[entrypoint.sh] Starting nginx...
2019/09/02 15:07:46 [warn] 18#18: "ssl_stapling" ignored, issuer certificate not found for certificate "/rapidminer/uploaded/certs/validated_cert.crt"
nginx: [warn] "ssl_stapling" ignored, issuer certificate not found for certificate "/rapidminer/uploaded/certs/validated_cert.crt"

$ kubectl logs -f real-time-scoring-webui-66799d6b74-7c8j9 -c real-time-scoring-webui
...
[Mon Sep 02 15:07:11.778314 2019] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'

$ kubectl logs -f real-time-scoring-webui-66799d6b74-7c8j9 -c rapidminer-cron
...
[entrypoint.sh] Starting cron...

From here the way you can connect to the Web UI depends on your installation:

  • When deploying to a cloud provider's Kubernetes cluster, you will see a new LoadBalancer in your resources list,
  • With MicroK8S you have to define an Ingress,
  • With MiniKube please look at the end of the Notices about minikube section.
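
For the MicroK8S case, an Ingress could look roughly like the sketch below. It is not part of the tested example configuration. It assumes a cluster that serves the networking.k8s.io/v1 API (Kubernetes 1.19 or later; older clusters such as the 1.15 version shown in this guide use a beta API with a different backend syntax) and an ingress controller that is already enabled. The object name rapidminer-proxy-ingress is hypothetical, and some controllers additionally require an ingressClassName.

```yaml
# Sketch only: routes all HTTP paths to the rapidminer-proxy service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rapidminer-proxy-ingress   # hypothetical name
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: rapidminer-proxy   # service defined in services.yml
                port:
                  number: 443            # service port, forwarded to rts-proxy-http
```

TLS settings are deliberately left out of this sketch; how certificates are handled depends on the ingress controller in use.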

Please note that by default the proxy container on port 443 works with a self-signed certificate, so you will see a warning about it when opening the Web UI for the first time. You can bypass the warning, and the communication between your browser and the proxy will still be encrypted, but it is strongly recommended to replace this certificate with a trusted one.

Limitations

  • Setting replicas to more than 1 is currently not supported.
  • After a new certificate is deployed, the reverse proxy should be reloaded with the following command:
    kubectl exec -it `kubectl get pods | grep webui | awk '{print $1}'` -c rapidminer-proxy -- /etc/init.d/nginx reload

Notices about minikube

In the default MiniKube installation, the cluster resources are limited to a very low level. The following commands raise these limits permanently:

minikube config set memory 16384
minikube config set cpus 8
minikube config set disk-size 200000MB

If you are using a Linux workstation and have Docker installed, you can start MiniKube with the vm-driver none option. In that case all the cluster services and deployed objects run on your existing Docker engine. To set this permanently, use the following command:

minikube config set vm-driver none

When using the vm-driver none option, the minikube API server can be bound to your host:

minikube start --apiserver-ips 127.0.0.1 --apiserver-name localhost

The configuration above takes effect after running the minikube delete and minikube start commands.

Minikube has no support for the LoadBalancer service type, so please modify the services.yml file:

  • remove the complete "annotations:" block
  • remove the "type: LoadBalancer" lines
  • add a line "type: NodePort" right after the line "spec:" in every service definition
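
Applying the three modifications above to the rapidminer-proxy service from services.yml gives a definition along these lines. This is a sketch derived from the sample service definition, not a separately tested file; the other two services only need the added type line.

```yaml
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-proxy
  # the "annotations:" block has been removed
  labels:
    app: real-time-scoring-webui
    role: webui
spec:
  type: NodePort          # replaces "type: LoadBalancer"
  ports:
    - name: rts-proxyhttp
      port: 443
      protocol: TCP
      targetPort: rts-proxy-http
  selector:
    app: real-time-scoring-webui
    role: webui
```

Because no nodePort value is set, Kubernetes picks a free port from its node port range.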

Find the exposed ports by running the following commands after the deployment process is done:

minikube service list
|-------------|-------------------------|----------------------------|
|  NAMESPACE  |          NAME           |            URL             |
|-------------|-------------------------|----------------------------|
| default     | rapidminer-proxy        | http://10.103.149.61:31871 |
| default     | real-time-scoring-agent | http://10.103.149.61:31488 |
| default     | real-time-scoring-webui | http://10.103.149.61:30274 |
|-------------|-------------------------|----------------------------|

Alternatively:

$ kubectl get services | grep proxy
rapidminer-proxy   ClusterIP   10.103.149.61   <none>   443/TCP,80/TCP   8m2s

From the output above, you can open the Web UI in your browser on the following URLs:

  • https://10.103.149.61:443/rts-admin/ (using https protocol)
  • http://10.103.149.61:80/rts-admin/ (using http protocol)

(in case you are using the default 80 and 443 ports, they can be omitted from the URLs)