You are viewing the RapidMiner Deployment documentation for version 9.7.

Basic production template

The template defined below is meant for a typical production environment.

Use it to deploy RapidMiner AI Hub on Kubernetes, with the following components:

  • 1 RapidMiner AI Hub instance
  • 3 RapidMiner Job Agents
  • Postgres database
  • Platform Admin
  • 1 JupyterHub instance
  • 1 Dashboards instance
  • 1 KeyCloak instance

For a detailed description of every Docker image, see the image reference.

System requirements

Minimum recommended hardware configuration

The amount of memory needed depends heavily on the amount of data that will be processed by RapidMiner AI Hub. By themselves, the RapidMiner services can run with as little as 8GB. However, in production environments, we recommend 32GB or more depending on user data, in order to provide users with enough capacity to analyze data from realistic use cases.

Each virtual or physical machine should have at least:

  • Quad core
  • 32GB RAM
  • >20GB free disk space

Instructions

The provided Docker images are ready to deploy to any Kubernetes cluster.

Please review the configuration below according to your environment and requirements.

The following guide requires a running Kubernetes cluster.

RapidMiner Platform is supported on the following Kubernetes services:

Volumes

Volumes provide the persistent block storage that the RapidMiner Platform components (PostgreSQL database, Python Environment Manager, RapidMiner Server, Real-Time Scoring) use to keep their data across container restarts.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-postgresql-pvc
  labels:
    app: rm-postgresql-svc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pem-uploaded-pvc
  labels:
    app: pem-uploaded-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-server-home-pvc
  labels:
    app: rm-server-svc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rapidminer-uploaded-pvc
  labels:
    app: rapidminer-uploaded-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-uploaded-pvc
  labels:
    app: rts-webui
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-licenses-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-deployments-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-grafana-home-pvc
  labels:
    app: rm-grafana-home-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500M
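The claims above do not name a storage class, so they rely on the cluster's default StorageClass. Where no default exists, or a specific class is preferred, a claim can set `storageClassName`. The following is a minimal sketch for the database claim only, assuming a class called `standard` exists in your cluster; that class name is a placeholder and not part of the original template.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-postgresql-pvc
  labels:
    app: rm-postgresql-svc
spec:
  # Assumption: "standard" is a StorageClass available in your cluster.
  # The available classes can be listed with: kubectl get storageclass
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

The same field could be added to the other claims; which class suits each volume depends on your provider.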

Services

Services are an essential part of the RapidMiner Platform. The containers/pods use them to reach each other.

kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-amq-svc
  labels:
    app: rapidminer-server-amq-svc
    role: server
spec:
  ports:
    - port: 5672
      targetPort: amq
  selector:
    app: rm-server-svc
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: rm-proxy-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  ports:
    - name: proxyhttp
      protocol: TCP
      port: 80
      targetPort: proxyhttp
    - name: proxyhttps
      protocol: TCP
      port: 443
      targetPort: proxyhttps
  selector:
    app: rm-proxy-svc
    role: proxy
  type: LoadBalancer
---
kind: Service
apiVersion: v1
metadata:
  name: postgres-svc
  labels:
    app: rm-postgresql-svc
spec:
  ports:
    - port: 5432
      targetPort: postgresport
  selector:
    app: rm-postgresql-svc
---
kind: Service
apiVersion: v1
metadata:
  name: rm-server-svc
  labels:
    app: rm-server-svc
    role: server
spec:
  ports:
    - port: 8080
      targetPort: rmswebui
  selector:
    app: rm-server-svc
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: pem-webui-svc
  labels:
    app: pem-webui-cron
    role: pem
spec:
  ports:
    - name: pem-webuiport
      port: 82
      protocol: TCP
      targetPort: pem-webuiport
  selector:
    app: rm-proxy-svc
    role: proxy
---
kind: Service
apiVersion: v1
metadata:
  name: rm-grafana-svc
  labels:
    app: rm-grafana-svc
    role: grafana
spec:
  ports:
    - name: grafanaport
      port: 3000
      protocol: TCP
      targetPort: grafanaport
  selector:
    app: rm-grafana-svc
    role: grafana
---
kind: Service
apiVersion: v1
metadata:
  name: rts-webui-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  ports:
    - name: rts-webuiport
      port: 81
      protocol: TCP
      targetPort: rts-webuiport
  selector:
    app: rm-proxy-svc
    role: proxy
---
kind: Service
apiVersion: v1
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  ports:
    - name: rts-scoreport
      port: 8090
      protocol: TCP
      targetPort: rts-scoreport
  selector:
    app: real-time-scoring-agent
    role: real-time-scoring
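Of the Services in this section, only `rm-proxy-svc` has the type `LoadBalancer`, so it is the one exposed outside the cluster. If access should be limited to known networks, Kubernetes offers the `loadBalancerSourceRanges` field, although whether it is enforced depends on the cloud provider. The following is a minimal sketch written as a patch file; the file name and the range `203.0.113.0/24` (a documentation-only address range) are placeholders, not values from the original template.

```yaml
# lb-source-ranges-patch.yaml (hypothetical file name)
# Restricts which client address ranges may reach the load balancer.
spec:
  loadBalancerSourceRanges:
    - 203.0.113.0/24   # placeholder; replace with your own network range
```

With a recent kubectl, such a patch could be applied with `kubectl patch service rm-proxy-svc --patch-file lb-source-ranges-patch.yaml`.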

Database

The database is used by RapidMiner Server.

kind: Pod
apiVersion: v1
metadata:
  name: rm-postgresql-svc
  labels:
    app: rm-postgresql-svc
spec:
  containers:
    - name: rm-postgresql-svc
      image: postgres:9.6
      ports:
        - name: postgresport
          containerPort: 5432
      env:
        - name: POSTGRES_DB
          value: rmsdb
        - name: POSTGRES_USER
          value: rmsdbuser
        - name: POSTGRES_PASSWORD
          value: rmsdbpassword
      volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
          subPath: postgres
  volumes:
    - name: pgvolume
      persistentVolumeClaim:
        claimName: rm-postgresql-pvc
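The database Pod carries its password as a plain-text environment value. One common alternative is to store the password in a Kubernetes Secret and reference it with `secretKeyRef`. The sketch below shows that variant under the assumption of a Secret named `rm-postgresql-secret` with a key `password`; both names are hypothetical and do not appear in the original template.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rm-postgresql-secret      # hypothetical name
type: Opaque
stringData:
  password: rmsdbpassword         # replace with your own password
---
kind: Pod
apiVersion: v1
metadata:
  name: rm-postgresql-svc
  labels:
    app: rm-postgresql-svc
spec:
  containers:
    - name: rm-postgresql-svc
      image: postgres:9.6
      ports:
        - name: postgresport
          containerPort: 5432
      env:
        - name: POSTGRES_DB
          value: rmsdb
        - name: POSTGRES_USER
          value: rmsdbuser
        - name: POSTGRES_PASSWORD
          # Read the password from the Secret instead of a literal value.
          valueFrom:
            secretKeyRef:
              name: rm-postgresql-secret
              key: password
      volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
          subPath: postgres
  volumes:
    - name: pgvolume
      persistentVolumeClaim:
        claimName: rm-postgresql-pvc
```

If this approach is used, the `DBPASS` variable of RapidMiner Server would need to reference the same Secret so that both sides stay in sync.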

RapidMiner Server

The main component of the RapidMiner Platform.

kind: Pod
apiVersion: v1
metadata:
  name: rm-server-svc
  labels:
    app: rm-server-svc
    role: server
spec:
  hostname: rm-server-svc
  containers:
    - name: rapidminer-server
      image: rapidminer/rapidminer-server:9.6.0
      ports:
        - name: rmswebui
          containerPort: 8080
        - name: amq
          containerPort: 5672
      env:
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME
          value: amq-user
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD
          value: amq-pass
        - name: JOBSERVICE_AUTH_SECRET
          value: c29tZS1hdXRoLXNlY3JldAo=
        - name: DBHOST
          value: postgres-svc
        - name: DBSCHEMA
          value: rmsdb
        - name: DBUSER
          value: rmsdbuser
        - name: DBPASS
          value: rmsdbpassword
        - name: JUPYTER_URL_SUFFIX
          value: /jupyterhub
        - name: GRAFANA_URL_SUFFIX
          value: /grafana
      volumeMounts:
        - name: rm-server-home-pvc
          mountPath: /persistent-rapidminer-home
          subPath: rapidminer-home
  volumes:
    - name: rm-server-home-pvc
      persistentVolumeClaim:
        claimName: rm-server-home-pvc

Job-Agent

The workers that perform the computation tasks.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-server-job-agent-svc
  labels:
    app: rm-server-job-agent-svc
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rm-server-job-agent-svc
  template:
    metadata:
      labels:
        app: rm-server-job-agent-svc
        role: execution
    spec:
      containers:
        - name: rm-server-job-agent-svc
          image: rapidminer/rapidminer-execution-jobagent:9.6.0
          env:
            # Must match the name of the RapidMiner Server Service (rm-server-svc)
            # defined in the Services section.
            - name: RAPIDMINER_SERVER_HOST
              value: rm-server-svc
            - name: RAPIDMINER_SERVER_PORT
              value: '8080'
            - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
              value: failover:(tcp://rapidminer-server-amq-svc:5672)
            - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
              value: amq-user
            - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
              value: amq-pass
            - name: JOBAGENT_AUTH_SECRET
              value: c29tZS1hdXRoLXNlY3JldAo=
            - name: RAPIDMINER_JOBAGENT_OPTS
              value: "-Djobagent.python.registryBaseUrl=http://pem-webui-svc:82/"
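The Job Agent Deployment declares no resource requests or limits, although the system requirements note that memory needs depend heavily on the data being processed. The following is a minimal sketch of a patch file that adds them to the Job Agent container. The file name and all CPU and memory figures are illustrative assumptions, not recommendations from the original template; size them for your own workloads.

```yaml
# job-agent-resources-patch.yaml (hypothetical file name)
# Strategic merge patch: the container is matched by its name.
spec:
  template:
    spec:
      containers:
        - name: rm-server-job-agent-svc
          resources:
            requests:
              memory: "4G"   # assumed value
              cpu: "1"       # assumed value
            limits:
              memory: "8G"   # assumed value
              cpu: "2"       # assumed value
```

With a recent kubectl, such a patch could be applied with `kubectl patch deployment rm-server-job-agent-svc --patch-file job-agent-resources-patch.yaml`. Because the Deployment runs 3 replicas, the cluster needs room for three times the requested amounts.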

RapidMiner Proxy & Python Environment Manager

The proxy component handles the incoming HTTP(S) traffic for the entire platform. The Python Environment Manager (PEM) component controls the Python packages for the Job Agents. Real-Time Scoring (RTS) was designed for fast scoring use cases via web services. These three components must run in a single Pod in Kubernetes, because the proxy must read the certificates that are generated by the pem-cron and rts-cron containers.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-proxy-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rm-proxy-svc
  template:
    metadata:
      labels:
        app: rm-proxy-svc
        role: proxy
    spec:
      containers:
        - name: rm-proxy-svc
          image: rapidminer/rapidminer-proxy:9.6.0
          imagePullPolicy: Always
          env:
            - name: RMSERVER_BACKEND
              value: "http://rm-server-svc:8080"
            - name: GRAFANA_BACKEND
              value: "http://rm-grafana-svc:3000"
            - name: GRAFANA_URL_SUFFIX
              value: "/grafana"
            - name: PEM_BACKEND
              value: "http://pem-webui-svc:82/"
            - name: PEM_URL_SUFFIX
              value: "/pem"
            - name: RTS_WEBUI_BACKEND
              value: "http://rts-webui-svc:81/"
            - name: RTS_WEBUI_URL_SUFFIX
              value: "/rts-admin"
            # Must match the name of the scoring agent Service
            # (real-time-scoring-agent) defined in the Services section.
            - name: RTS_SCORING_BACKEND
              value: "http://real-time-scoring-agent:8090/"
            - name: RTS_SCORING_URL_SUFFIX
              value: "/rts"
            - name: HTTPS_CRT_PATH
              value: "/rapidminer/uploaded/certs/validated_cert.crt"
            - name: HTTPS_KEY_PATH
              value: "/rapidminer/uploaded/certs/validated_cert.key"
            - name: HTTPS_DH_PATH
              value: "/rapidminer/uploaded/certs/dhparam.pem"
            - name: DEBUG_CONF_INIT
              value: "true"
          ports:
            - name: proxyhttp
              containerPort: 80
            - name: proxyhttps
              containerPort: 443
          volumeMounts:
            - name: pem-uploaded-pvc
              mountPath: /rapidminer/pem/uploaded/
            - name: rts-uploaded-pvc
              mountPath: /rapidminer/rts/uploaded/
        - name: pem-webui
          image: rapidminer/python-environment-manager-webui:9.6.0
          imagePullPolicy: Always
          ports:
            - name: pem-webuiport
              containerPort: 82
          volumeMounts:
            - name: pem-uploaded-pvc
              mountPath: /var/www/html/uploaded
        - name: pem-cron
          image: rapidminer/python-environment-manager-cron:9.6.0
          imagePullPolicy: Always
          volumeMounts:
            - name: pem-uploaded-pvc
              mountPath: /rapidminer/uploaded
        - name: rts-cron
          image: rapidminer/rapidminer-real-time-scoring-cron:9.6.0
          resources:
            requests:
              memory: "100M"
              cpu: "0.5"
            limits:
              memory: "200M"
              cpu: "0.5"
          volumeMounts:
            - name: rts-uploaded-pvc
              mountPath: /rapidminer/uploaded/
            - name: rts-licenses-pvc
              mountPath: /rapidminer/rts_home/licenses/
        - name: real-time-scoring-webui
          image: rapidminer/rapidminer-real-time-scoring-webui:9.6.0
          ports:
            - name: rts-webuiport
              containerPort: 81
          resources:
            requests:
              memory: "200M"
              cpu: "0.5"
            limits:
              memory: "500M"
              cpu: "0.5"
          volumeMounts:
            - name: rts-uploaded-pvc
              mountPath: /var/www/html/uploaded
            - name: rts-licenses-pvc
              mountPath:   # path missing in the source template; set it before applying
      volumes:
        - name: pem-uploaded-pvc
          persistentVolumeClaim:
            claimName: pem-uploaded-pvc
        - name: rts-uploaded-pvc
          persistentVolumeClaim:
            claimName: rts-uploaded-pvc
        - name: rts-licenses-pvc
          persistentVolumeClaim:
            claimName: rts-licenses-pvc

Dashboards

Monitoring and metric analytics & dashboards.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-grafana-svc
  labels:
    app: rm-grafana-svc
    role: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rm-grafana-svc
  template:
    metadata:
      labels:
        app: rm-grafana-svc
        role: grafana
    spec:
      containers:
        - name: rm-grafana-proxy-svc
          image: rapidminer/rapidminer-grafana-proxy:9.6.0
          imagePullPolicy: Always
          env:
            - name: RAPIDMINER_URL
              value: http://rm-server-svc:8080
          ports:
            - name: grafanaport
              containerPort: 3000
        - name: rm-grafana-svc
          image: rapidminer/rapidminer-grafana:9.6.0
          imagePullPolicy: Always
          env:
            - name: GF_SERVER_ROOT_URL
              value: '%(protocol)s://%(domain)s:%(http_port)s/grafana/'
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: grafanaadminpass
          volumeMounts:
            - name: rm-grafana-home-pvc
              mountPath: /var/lib/grafana
      volumes:
        - name: rm-grafana-home-pvc
          persistentVolumeClaim:
            claimName: rm-grafana-home-pvc

Real-Time Scoring

This is an add-on product to RapidMiner Server designed for fast scoring use cases via web services.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: real-time-scoring-agent
  template:
    metadata:
      labels:
        app: real-time-scoring-agent
        role: real-time-scoring
    spec:
      containers:
        - name: real-time-scoring-agent
          image: rapidminer/rapidminer-execution-scoring:latest
          ports:
            - name: rts-scoreport
              containerPort: 8090
          env:
            - name: WAIT_FOR_LICENSES
              value: "1"
          resources:
            requests:
              memory: "2G"
              cpu: "1"
            limits:
              memory: "32G"
              cpu: "1"
          volumeMounts:
            - name: rts-deployments-pvc
              mountPath: /rapidminer-scoring-agent/home/deployments
            - name: rts-licenses-pvc
              mountPath: /rapidminer-scoring-agent/home/resources/licenses/rapidminer-scoring-agent/
      volumes:
        - name: rts-deployments-pvc
          persistentVolumeClaim:
            claimName: rts-deployments-pvc
        - name: rts-licenses-pvc
          persistentVolumeClaim:
            claimName: rts-licenses-pvc
      # nodeSelector:
      #   node-label-name: label-value-of-worker-node-where-rts-may-started
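The template ends with a commented-out `nodeSelector`, which can pin the scoring agent to particular worker nodes. The following is a minimal sketch of enabling it through a patch file, assuming the target nodes carry the label `rts-node=true`; the file name, label key, and label value are all hypothetical.

```yaml
# rts-node-selector-patch.yaml (hypothetical file name)
# Schedules the scoring agent only on nodes that carry the given label.
spec:
  template:
    spec:
      nodeSelector:
        rts-node: "true"   # hypothetical label; must exist on at least one node
```

A node could be labeled with `kubectl label node <node-name> rts-node=true`, and the patch applied with `kubectl patch deployment real-time-scoring-agent --patch-file rts-node-selector-patch.yaml`. If no node carries the label, the Pod stays in the Pending state.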