Serverless workloads on Kubernetes: Knative Serving, Eventing, and OpenShift Serverless
Tags: serverless, knative, serving, eventing, scale-to-zero, openshift, faas
What is Knative?
Knative extends Kubernetes to run serverless workloads:
- Serving: Request-driven auto-scaling (including scale-to-zero)
- Eventing: Event-driven architecture with sources and brokers
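On OpenShift, the same components ship as OpenShift Serverless, Red Hat's supported Knative distribution, installed through an operator. A minimal install sketch (channel and catalog names below are the usual defaults — verify against your cluster's OperatorHub; the OperatorGroup is omitted for brevity):

# Subscribe to the OpenShift Serverless operator
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: serverless-operator
  namespace: openshift-serverless
spec:
  channel: stable
  name: serverless-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
---
# Then enable Serving through the operator's KnativeServing CR
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving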
Knative Serving
Basic Knative Service
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: ml-inference
  namespace: ml-serving
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
        - image: my-registry/ml-model:v1
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: MODEL_PATH
              value: "/models/v1"
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
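If the model server should only be reachable from inside the cluster, Knative supports a cluster-local visibility label that hides the service from the external ingress. A sketch (the service name here is a placeholder):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: ml-inference-internal
  labels:
    # Removes the service from the external ingress; it gets a
    # svc.cluster.local URL instead of a public one
    networking.knative.dev/visibility: cluster-local
spec:
  template:
    spec:
      containers:
        - image: my-registry/ml-model:v1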
Auto-Scaling Configuration
metadata:
  annotations:
    # Scale to zero when there is no traffic
    autoscaling.knative.dev/minScale: "0"
    # Maximum replicas
    autoscaling.knative.dev/maxScale: "20"
    # Target concurrent requests per pod
    autoscaling.knative.dev/target: "50"
    # Metric type: concurrency or rps
    autoscaling.knative.dev/metric: "concurrency"
    # Delay before scaling in after traffic drops
    autoscaling.knative.dev/scale-down-delay: "30s"
    # Keep the last pod this long before scaling to zero
    # (the grace period itself is cluster-wide; see below)
    autoscaling.knative.dev/scale-to-zero-pod-retention-period: "60s"
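Note that the scale-to-zero grace period is not a per-revision annotation: it lives in the cluster-wide config-autoscaler ConfigMap, along with the global scale-to-zero switch and concurrency defaults. A sketch of the relevant keys:

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Global switch for scale-to-zero
  enable-scale-to-zero: "true"
  # How long the autoscaler waits after traffic reaches zero
  # before removing the last pod
  scale-to-zero-grace-period: "60s"
  # Default target concurrency when no annotation is set
  container-concurrency-target-default: "100"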
Traffic Splitting (Canary / Blue-Green)
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: ml-inference
spec:
  template:
    metadata:
      name: ml-inference-v2
    spec:
      containers:
        - image: my-registry/ml-model:v2
  traffic:
    - revisionName: ml-inference-v1
      percent: 80
    - revisionName: ml-inference-v2
      percent: 20
# Canary → Full rollout
kn service update ml-inference --traffic ml-inference-v2=100
# Rollback
kn service update ml-inference --traffic ml-inference-v1=100
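Traffic entries can also carry a tag, which gives each revision a dedicated URL so the canary can be tested directly before any traffic is shifted. A sketch (with the default tag template, the tagged URL is tag-name prefixed onto the route's hostname):

traffic:
  - revisionName: ml-inference-v1
    percent: 80
    tag: stable   # reachable at stable-ml-inference.<namespace>.<domain>
  - revisionName: ml-inference-v2
    percent: 20
    tag: canary   # reachable at canary-ml-inference.<namespace>.<domain>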
kn CLI
# Create service
kn service create my-app --image my-registry/app:v1 --port 8080
# List services
kn service list
# Update service
kn service update my-app --image my-registry/app:v2
# Describe service
kn service describe my-app
# Delete service
kn service delete my-app
# List revisions
kn revision list
# Get service URL
kn service describe my-app -o url
Knative Eventing
Event Source → Service
apiVersion: sources.knative.dev/v1
kind: ApiServerSource
metadata:
  name: pod-events
spec:
  serviceAccountName: event-watcher
  mode: Resource
  resources:
    - apiVersion: v1
      kind: Pod
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-processor
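For scheduled rather than API-driven events (e.g., nightly batch scoring), a PingSource delivers a CloudEvent to its sink on a cron schedule. A sketch — the names and payload here are illustrative:

apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: nightly-scoring
spec:
  # Standard cron syntax: every day at 02:00
  schedule: "0 2 * * *"
  contentType: "application/json"
  data: '{"job": "batch-score"}'
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: batch-scorer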
Broker & Trigger Pattern
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: ml-events
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: ml-training-trigger
  namespace: ml-events   # must live in the broker's namespace
spec:
  broker: default
  filter:
    attributes:
      type: ml.data.updated
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: training-pipeline
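A Trigger's filter matches CloudEvent attributes exactly, and listing several attributes ANDs them together, so events can be routed on both type and source. A sketch (the source value here is illustrative):

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: ml-data-from-storage
  namespace: ml-events
spec:
  broker: default
  filter:
    attributes:
      # Both attributes must match (exact-match AND)
      type: ml.data.updated
      source: storage.uploads
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: training-pipeline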
Serverless vs Deployment
| Feature | Deployment | Knative Serverless |
|---|---|---|
| Minimum pods | 1+ always running | 0 (scale-to-zero) |
| Scaling | HPA (manual config) | Automatic (request-based) |
| Cost | Pays for idle replicas | No pods consuming resources when idle |
| Cold start | None | Possible (scale from 0) |
| Traffic split | Manual (multiple deployments) | Built-in |
| Use case | Steady traffic | Bursty/intermittent traffic |
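If cold starts are unacceptable for a latency-sensitive service, keeping minScale at 1 trades the scale-to-zero savings for a warm pod. A sketch:

spec:
  template:
    metadata:
      annotations:
        # Keep one warm replica: eliminates cold starts
        # at the cost of one always-running pod
        autoscaling.knative.dev/minScale: "1"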
When to Use Serverless
- ML inference APIs with variable/bursty traffic
- Data preprocessing triggered by events
- Webhook handlers and event processors
- Dev/staging environments (cost savings with scale-to-zero)
- Batch scoring triggered on demand