KubernetesGhaRunnerScaleSet - Technical Documentation

Overview

The KubernetesGhaRunnerScaleSet deployment component declaratively deploys GitHub Actions self-hosted runners on Kubernetes clusters. It uses the official Actions Runner Controller (ARC) Helm chart to create AutoscalingRunnerSet resources, which scale the number of runners up and down with workflow demand.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        GitHub                                    │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              Repository/Organization                      │   │
│  │                                                          │   │
│  │  Workflow Job → Queue → Webhook → Controller → Runner   │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                            │
│                                                                  │
│  ┌──────────────────────┐    ┌─────────────────────────────┐   │
│  │  Controller (ARC)     │    │  Runner Scale Set           │   │
│  │  ─────────────────    │    │  ───────────────            │   │
│  │  Watches for jobs     │───▶│  AutoscalingRunnerSet       │   │
│  │  Creates runner pods  │    │  EphemeralRunner pods       │   │
│  │  Manages lifecycle    │    │  PVCs for caching           │   │
│  └──────────────────────┘    └─────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

How It Works

Registration Flow

1. The Helm chart creates an AutoscalingRunnerSet custom resource.
2. The controller registers the scale set with GitHub using the config URL.
3. GitHub associates the runners with the specified repository, organization, or enterprise.
4. The runners appear in GitHub under Settings → Actions → Runners.
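
As a rough illustration of the first step above, the custom resource rendered by the chart might look like the sketch below. The resource name, namespace, config URL, secret name, and runner image tag are placeholders chosen for illustration, not values taken from this component.

```yaml
# Sketch of an AutoscalingRunnerSet; all names and the URL are placeholders.
apiVersion: actions.github.com/v1alpha1
kind: AutoscalingRunnerSet
metadata:
  name: my-scale-set          # hypothetical scale set name
  namespace: arc-runners      # hypothetical namespace
spec:
  # Repository, organization, or enterprise URL the scale set registers with.
  githubConfigUrl: https://github.com/my-org/my-repo
  # Name of the Kubernetes Secret holding the PAT or GitHub App credentials.
  githubConfigSecret: my-github-secret
  template:
    spec:
      containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:latest
          command: ["/home/runner/run.sh"]
```

The config URL determines the registration level: a repository URL registers repository runners, an organization URL registers organization runners.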

Job Execution Flow

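1. A workflow job is triggered with runs-on: scale-set-name. ARC runner scale sets are selected by their installation name, not by a label list such as [self-hosted, scale-set-name].
2. GitHub queues the job and notifies the controller.
3. The controller creates an EphemeralRunner pod.
4. The runner pod registers with GitHub and picks up the job.
5. The job executes in the runner pod.
6. The runner pod terminates after the job completes.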
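
A minimal workflow that would enter this flow might look like the sketch below. The workflow file name and the scale set name my-scale-set are hypothetical; the runs-on value has to match the name the scale set was installed under.

```yaml
# .github/workflows/example.yml (hypothetical workflow file)
name: example
on: push
jobs:
  hello:
    # Must match the scale set name; "my-scale-set" is a placeholder.
    runs-on: my-scale-set
    steps:
      - run: echo "Running on an ephemeral ARC runner"
```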

Scaling Behavior

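| Scenario      | Behavior                                              |
|---------------|-------------------------------------------------------|
| Jobs queued   | Scale up to handle the queue (up to `maxRunners`)     |
| Jobs complete | Scale down to `minRunners`                            |
| No jobs       | Maintain `minRunners` (can be 0 for cost savings)     |
| Surge in jobs | Run jobs in parallel on up to `maxRunners` runners    |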
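
The two bounds might be set as in the sketch below. The field names come from the table above; where they sit in this component's spec is an assumption, and the values are illustrative only.

```yaml
# Assumed placement of the scaling bounds; values are illustrative.
minRunners: 0    # no idle runners are kept when there are no jobs
maxRunners: 10   # at most 10 runner pods run at once, however long the queue
```

With these values, a surge of 25 queued jobs would be served by at most 10 runners at a time, with the remaining jobs waiting until a runner slot frees up.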

Container Modes

DIND (Docker-in-Docker)

containerMode:
  type: DIND

Runner pod includes a privileged DinD sidecar
Supports docker build, docker run, etc.
Required for workflows that build/push Docker images
Requires: Privileged container support in cluster

KUBERNETES

containerMode:
  type: KUBERNETES
  workVolumeClaim:
    storageClass: fast-ssd
    size: "50Gi"

Job containers and container-based steps run in separate Kubernetes pods created by the runner
Native Kubernetes container execution
Uses container hooks for orchestration
Requires: Ephemeral volume support, ServiceAccount permissions
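
In this mode a job typically declares a job container, which the container hooks start as its own pod. The sketch below assumes a scale set named my-scale-set and uses node:20 purely as an example image; neither comes from this component.

```yaml
# Hypothetical workflow for a scale set running in KUBERNETES mode.
jobs:
  test:
    runs-on: my-scale-set      # placeholder scale set name
    container:
      image: node:20           # example job container image
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
```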

KUBERNETES_NO_VOLUME

containerMode:
  type: KUBERNETES_NO_VOLUME

Same as KUBERNETES but without ephemeral volumes
For clusters that don't support ephemeral volume claims
Workspace is not persisted between steps

DEFAULT

containerMode:
  type: DEFAULT

Steps execute directly in the runner pod
No container isolation for steps
Suited to simple workflows that don't need Docker

Persistent Volumes

Purpose

PVCs persist data across runner pod restarts, enabling:

Dependency caching: npm, maven, gradle, pip packages
Docker layer caching: Faster image builds
Build artifacts: Share between jobs

Implementation

persistentVolumes:
  - name: npm-cache
    size: "20Gi"
    storageClass: standard
    mountPath: /home/runner/.npm

Creates:

A PersistentVolumeClaim named {release-name}-npm-cache
Volume mount in the runner container spec
Volume reference in the pod spec
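
To make the three generated pieces concrete, the sketch below shows what they might look like for the npm-cache entry, assuming a release named my-release. The access mode is an assumption, since the source does not state which one the module uses.

```yaml
# Sketch of the PVC derived from the npm-cache entry above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-release-npm-cache   # {release-name}-npm-cache with release "my-release"
spec:
  accessModes:
    - ReadWriteOnce            # assumption; the source does not state the access mode
  storageClassName: standard
  resources:
    requests:
      storage: 20Gi
---
# Fragment of the runner pod spec (not a complete manifest) showing the mount.
spec:
  containers:
    - name: runner
      volumeMounts:
        - name: npm-cache
          mountPath: /home/runner/.npm
  volumes:
    - name: npm-cache
      persistentVolumeClaim:
        claimName: my-release-npm-cache
```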

Cache Effectiveness

For optimal caching:

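Use minRunners >= 1 to keep at least one runner warm
PVCs are per-scale-set, not per-runner (shared cache), so the volume must be mountable by every runner pod that uses it (for example, ReadWriteMany access when runners land on different nodes)
Consider a storage class with high IOPS for large caches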

Authentication

Personal Access Token (PAT)

Authenticates with a classic personal access token.

Scopes needed, by registration level:

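| Scope                       | Repository | Organization | Enterprise |
|-----------------------------|------------|--------------|------------|
| `repo`                      | Required   | -            | -          |
| `admin:org`                 | -          | Required     | -          |
| `manage_runners:enterprise` | -          | -            | Required   |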

Secret structure:

github_token: ghp_xxxxxxxxxxxx

GitHub App

Recommended for organizations:

Permissions needed:

Repository: actions:read, metadata:read
Organization: self_hosted_runners:read/write

Secret structure:

github_app_id: "123456"
github_app_installation_id: "654321"
github_app_private_key: |
  -----BEGIN RSA PRIVATE KEY-----
  ...

Existing Secret

For secrets provisioned outside this component:

github:
  existingSecretName: my-github-secret

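The secret must contain either the PAT field (github_token) or the GitHub App fields (github_app_id, github_app_installation_id, github_app_private_key).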
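
An externally provisioned secret could be created from a manifest like the sketch below. The namespace is hypothetical, and the token value is the same placeholder used earlier in this section.

```yaml
# Sketch of an externally managed Secret using the PAT field.
apiVersion: v1
kind: Secret
metadata:
  name: my-github-secret     # must match github.existingSecretName
  namespace: arc-runners     # hypothetical; use the scale set's namespace
type: Opaque
stringData:
  github_token: ghp_xxxxxxxxxxxx   # placeholder, never commit a real token
```

For GitHub App authentication, the github_token key would be replaced by the three github_app_* keys shown in the GitHub App section.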

IaC Implementations

Pulumi Module

Location: iac/pulumi/module/

Key files:

main.go: Entry point, orchestrates deployment
locals.go: Configuration parsing, defaults, exports
runner.go: Helm release and PVC creation
vars.go: Constants (chart name, repo, version)

Terraform Module

Location: iac/tf/