
Developers guide

This guide explains how to test a virtual kubelet implementation against the main pod use cases we aim to support.

The CI pipeline uses a K3s-based test environment that you can reproduce locally without any special cloud account or token. The same three scripts used by the GitHub Actions workflow (k3s-test-setup.sh, k3s-test-run.sh, k3s-test-cleanup.sh) work on any Linux machine with Docker and Go installed.

Requirements

  • Linux host (bare metal or VM) with sudo access
  • Docker engine
  • Go 1.26+ (go.mod specifies the minimum version)
  • curl, git (standard tools)

The setup script installs K3s automatically; you do not need kubectl or k3s pre-installed.

Run the full e2e suite locally

# Clone the repo and enter it
git clone https://github.com/interlink-hq/interLink.git
cd interLink

# (Optional) enable unprivileged user namespaces if your kernel restricts them
sudo sysctl -w kernel.unprivileged_userns_clone=1

# 1. Build images, start K3s, deploy interLink and the SLURM plugin
bash scripts/k3s-test-setup.sh

# 2. Run the pytest test suite against the live cluster
bash scripts/k3s-test-run.sh

# 3. Collect logs and tear down all resources
bash scripts/k3s-test-cleanup.sh

A convenience Makefile target runs all three steps in sequence:

make test-k3s

Individual targets are also available:

make test-k3s-setup    # setup only
make test-k3s-run      # run only (requires setup to have completed)
make test-k3s-cleanup  # cleanup only

What the scripts do

| Script | Purpose |
| --- | --- |
| k3s-test-setup.sh | Installs K3s; builds interlink-api and interlink-plugin (SLURM) images; installs Traefik v3 CRDs and cert-manager; creates a CA ClusterIssuer; starts all containers; starts virtual-kubelet as a host process; approves kubelet CSRs; runs a Slurm/Apptainer smoke test; starts the rathole Docker Compose tunnel environment |
| k3s-test-run.sh | (1) Runs the pytest vk-test-set suite (pod lifecycle + port-forwarding infra) against the live cluster; (2) runs the standalone Docker tunnel tests in test/portforward/ |
| k3s-test-cleanup.sh | Captures TLS resource state and rathole container logs; tears down Docker Compose; stops containers; kills virtual-kubelet; uninstalls K3s |

Artefacts and logs

All logs and job output are written to a temporary directory created by the setup script (printed as TEST_DIR at startup):

/tmp/interlink-test-XXXXXX/
  k3s-install.log – K3s install output
  vk.log – virtual-kubelet stdout/stderr
  interlink-api.log – interlink-api container logs (live-streamed)
  interlink-plugin.log – interlink-plugin container logs (live-streamed)
  test-results.log – pytest vk-test-set output
  portforward-test-results.log – pytest Docker tunnel test output
  rathole-server-tcp.log – rathole TCP server container logs
  rathole-server-ws.log – rathole WebSocket server container logs
  plugin-jobs/ – job directories from inside the plugin container
  slurm-logs/ – Slurm daemon logs from inside the plugin container

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| TEST_DIR | auto-created under /tmp | Override the test directory path |
| K3S_VERSION | v1.31.4+k3s1 | K3s version to install |
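
The way these two variables resolve can be modelled in a few lines. This is a minimal Python sketch of the documented defaults, not the actual logic of k3s-test-setup.sh; the function name resolve_settings is hypothetical, and the directory prefix is taken from the example path shown in the artefacts listing.

```python
import os
import tempfile


def resolve_settings(env=None):
    """Return the effective TEST_DIR and K3S_VERSION for a test run.

    Hypothetical helper: it mirrors the defaults documented in the table,
    not the real implementation in the setup script.
    """
    env = os.environ if env is None else env
    test_dir = env.get("TEST_DIR")
    if not test_dir:
        # Default: a fresh directory such as /tmp/interlink-test-XXXXXX
        test_dir = tempfile.mkdtemp(prefix="interlink-test-", dir="/tmp")
    return {
        "TEST_DIR": test_dir,
        "K3S_VERSION": env.get("K3S_VERSION", "v1.31.4+k3s1"),
    }
```

Calling resolve_settings() with no arguments reads the current process environment, so exporting TEST_DIR before the call keeps all artefacts in a directory of your choice.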

Running individual test suites manually

Once k3s-test-setup.sh has completed (or you have an equivalent environment running), you can iterate on individual test suites without re-running the full setup. All commands below assume you are at the repository root and the cluster is reachable via /etc/rancher/k3s/k3s.yaml.

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
TEST_DIR=$(cat /tmp/interlink-test-dir.txt) # set by setup script
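
To confirm that a run produced the log files listed under "Artefacts and logs", a small check like the following can help. It is a sketch only: it assumes /tmp/interlink-test-dir.txt contains the test directory path (as the command above implies), and the helper names are hypothetical.

```python
from pathlib import Path

# Log files named in the "Artefacts and logs" listing
EXPECTED_LOGS = [
    "k3s-install.log",
    "vk.log",
    "interlink-api.log",
    "interlink-plugin.log",
    "test-results.log",
    "portforward-test-results.log",
    "rathole-server-tcp.log",
    "rathole-server-ws.log",
]


def read_test_dir(pointer="/tmp/interlink-test-dir.txt"):
    """Read the test directory path written by the setup script."""
    return Path(pointer).read_text().strip()


def missing_artefacts(test_dir, expected=EXPECTED_LOGS):
    """Return the expected log files that are absent from test_dir."""
    root = Path(test_dir)
    return [name for name in expected if not (root / name).is_file()]
```

missing_artefacts(read_test_dir()) returns an empty list when every listed log file is present.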

1. VK pod integration tests (test/vk-test-set/)

These tests submit real pods to the virtual-kubelet node and validate pod lifecycle, logs, volumes, probes, init containers, and projected volumes.

cd test/vk-test-set

# Create an isolated Python venv (avoids oauthlib conflicts on RHEL/Rocky 9)
python3 -m venv .venv
source .venv/bin/activate
pip install -q pytest kubernetes jinja2 pydantic requests pytest-timeout

# Run the full suite (excluding slow / optional test classes)
pytest -v -k "not rclone and not limits and not stress and not multi-init and not fail"

# Run a single template test
pytest -v "vktestset/basic_test.py::test_manifest[virtual-kubelet-000-hello-world.yaml]"

# Run the port-forwarding infrastructure tests (separate file)
pytest -v vktestset/port_forward_test.py

# Run one specific port-forwarding test
pytest -v vktestset/port_forward_test.py::test_rathole_annotation_set

deactivate
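
The -k filter used for the full suite is a plain pytest keyword expression. As an illustration, it can be assembled from a list of excluded keywords; the helper name below is hypothetical and not part of the test suite.

```python
def exclude_expression(keywords):
    """Build a pytest -k expression that deselects every given keyword."""
    return " and ".join(f"not {kw}" for kw in keywords)


SLOW_OR_OPTIONAL = ["rclone", "limits", "stress", "multi-init", "fail"]
print(exclude_expression(SLOW_OR_OPTIONAL))
# prints: not rclone and not limits and not stress and not multi-init and not fail
```

Dropping a keyword from the list re-enables the matching test class, which is handy when iterating on a single slow case.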

What port_forward_test.py checks: when a pod with a containerPort is scheduled on the virtual-kubelet node, the VK must provision the full rathole tunnel infrastructure. The 10 tests verify:

| Test | Assertion |
| --- | --- |
| test_port_forward_pod_scheduled | Pod reaches Running/Succeeded |
| test_rathole_annotation_set | interlink.eu/rathole-client-commands annotation is set and non-empty |
| test_shadow_namespace_exists | <namespace>-wstunnel namespace is created |
| test_rathole_deployment_exists | Rathole server Deployment exists in the shadow namespace |
| test_rathole_service_exists | Rathole server Service exists in the shadow namespace |
| test_rathole_server_certificate_ready | cert-manager Certificate (server TLS) reaches Ready=True |
| test_rathole_client_certificate_ready | cert-manager Certificate (client TLS) reaches Ready=True |
| test_tls_secrets_issued | Both TLS Secrets contain ca.crt, tls.crt, tls.key |
| test_traefik_ingressroutetcp_exists | Traefik IngressRouteTCP is created for SNI routing |
| test_annotation_contains_valid_toml | Annotation encodes valid TOML: [client], port 443, TLS transport, p8080 service |

Troubleshooting

# Check VK is running and Ready
kubectl get node virtual-kubelet

# Tail VK logs in real time
tail -f "${TEST_DIR}/vk.log"

# Check if interlink API can reach the plugin
curl http://localhost:3000/pinglink

# Inspect shadow namespace resources after a port-forward pod is created
kubectl get all,certificates -n interlink-wstunnel

Rocky/CentOS 9 note: The system oauthlib package is missing SIGNATURE_RSA, which breaks the kubernetes Python client. Always use a fresh venv (as shown above) rather than the system Python.


2. Docker tunnel unit tests (test/portforward/)

These tests are independent of Kubernetes. They run against a local Docker Compose topology that mimics the rathole tunnel between a remote HPC node and the cluster side. They can be run on any machine with Docker; K3s is not required.

cd test/portforward

# Start the two-network Docker Compose environment
docker compose up -d

# Wait for tunnels to be ready (rathole clients retry automatically)
# You can watch tunnel handshake with:
docker compose logs -f rathole-client-tcp

# Install test dependencies
python3 -m venv .venv-pf
source .venv-pf/bin/activate
pip install pytest requests pytest-timeout "kubernetes>=28.0"

# Run all 37 tests
pytest -v

# Run only the TCP tunnel tests
pytest -v test_tunnel.py::TestTCPTunnel

# Run only the WebSocket tunnel tests
pytest -v test_tunnel.py::TestWebSocketTunnel

# Run the Kubernetes TLS resource tests (requires KUBECONFIG to be set)
# These verify cert-manager Certificates and Traefik IngressRouteTCP
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
pytest -v test_tls_k8s.py

deactivate

# Tear down when done
docker compose down -v

Network topology

[remote network – isolated "HPC" side]
  backend (nginx:alpine)
    port 80 → "Hello from remote backend"
    port 9090 → "Metrics from remote backend"

[cluster network – "Kubernetes" side]
  rathole-server-tcp (TCP) → host:18080, host:19090
  rathole-server-ws (WebSocket) → host:18082

[bridge: remote ↔ cluster]
  rathole-client-tcp forwards backend:80 and backend:9090 through TCP tunnel
  rathole-client-ws forwards backend:80 through WebSocket tunnel

The backend has no ports exposed to the host, so the only way to reach it is through a rathole tunnel. This matches real interLink deployments, where the remote service sits inside the HPC network.

Environment variables for the Docker tests

| Variable | Default | Description |
| --- | --- | --- |
| TCP_HTTP_URL | http://localhost:18080 | TCP tunnel primary endpoint |
| TCP_METRICS_URL | http://localhost:19090 | TCP tunnel secondary port |
| WS_HTTP_URL | http://localhost:18082 | WebSocket tunnel endpoint |
| TUNNEL_WAIT_TIMEOUT | 60 | Seconds to wait for tunnel readiness |
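
The role of TUNNEL_WAIT_TIMEOUT can be sketched as a polling loop. The code below is an illustrative assumption about how a readiness wait might work, not the fixture used in test/portforward/; wait_for_tunnel, http_get and their parameters are hypothetical names.

```python
import time
import urllib.request


def http_get(url):
    """Fetch url and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode()


def wait_for_tunnel(url, expected, timeout, interval=1.0, fetch=http_get):
    """Poll url until its body contains expected or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if expected in fetch(url):
                return True
        except OSError:
            pass  # tunnel not up yet: connection refused, reset, or URL error
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the defaults from the table, a call could look like wait_for_tunnel("http://localhost:18080", "Hello from remote backend", 60). This assumes that host port 18080 maps to backend port 80, as the topology suggests. The fetch parameter exists so the loop can be exercised without a running tunnel.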

3. Quick health checks after setup

# Cluster and node status
kubectl get nodes
kubectl get pods -A

# VK connectivity to interlink API
curl http://localhost:3000/pinglink

# Shadow namespace should exist if a port-forwarding pod was scheduled before
kubectl get ns | grep wstunnel

# Check all cert-manager Certificates in shadow namespaces
kubectl get certificates -A | grep wstunnel

# Inspect a specific rathole tunnel resource set (replace with real pod name/ns)
kubectl get deployment,service,certificate,ingressroutetcp \
  -n interlink-wstunnel -l app=rathole
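
When many certificates are present, filtering by readiness is easier in code than by eye. The sketch below works on the parsed JSON printed by kubectl get certificates -A -o json. It assumes cert-manager's usual status.conditions layout with a Ready condition; the function name is hypothetical.

```python
def not_ready_certificates(cert_list, namespace_suffix="-wstunnel"):
    """Return (namespace, name) pairs for shadow-namespace Certificates
    whose Ready condition is not True."""
    pending = []
    for item in cert_list.get("items", []):
        meta = item.get("metadata", {})
        namespace = meta.get("namespace", "")
        if not namespace.endswith(namespace_suffix):
            continue  # only shadow namespaces are of interest here
        conditions = (item.get("status") or {}).get("conditions") or []
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            pending.append((namespace, meta.get("name")))
    return pending
```

Feed it the result of json.load() on the kubectl output. An empty list means every certificate in a shadow namespace has been issued.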

Dagger-based tests (legacy)

Note: The Dagger-based CI pipeline is disabled. The K3s-based workflow above is the authoritative test environment for all PRs.

The legacy Dagger setup is documented below for reference only.

Requirements (Dagger)

What's in the Dagger module (legacy)

  • E2e integration tests: a reproducible test environment, self-contained in the Dagger runtime. It runs the very same tests executed by GitHub Actions to validate any PR.
  • A development setup tool: optionally you can use your k8s cluster of choice to run and install different interlink components via this module.

⚠️ By default, the Docker plugin is the one tested, and it should be the first point of reference for any change.

Usage (Dagger, legacy)

The whole test suite is based on applying k8s manifests from a folder that must be passed at runtime. In ./ci/manifests of this repo you can find the ones executed by default by GitHub Actions.

That means you can test your code before any commit, discovering in advance if anything is breaking.

Run e2e tests

The easiest way is to run make test from the root folder of the interLink repository. If you need to debug or better understand the test utility or a plugin, follow the instructions below.

Edit manifests with your images

  • service-account.yaml is the default set of permissions needed by the virtual kubelet. Do not touch it unless you know what you are doing.
  • virtual-kubelet-config.yaml is the configuration mounted into the virtual kubelet component to determine its behaviour.
  • virtual-kubelet.yaml is the one to edit if you are pointing to different interLink endpoints or want to change the virtual kubelet image under test.
  • interlink-config.yaml is the configuration mounted into the interLink API component to determine its behaviour.
  • interlink.yaml is the one to edit if you are pointing to different plugin endpoints or want to change the interLink API image under test.
  • plugin-config.yaml is the configuration for the interLink plugin component, which you MUST START MANUALLY on your host.
    • A way to start it inside the Dagger environment exists, but it is not documented yet.

Start the local docker plugin service

For a simple demonstration, you can use the plugin that we actually use in our GitHub Actions:

wget https://github.com/interlink-hq/interlink-docker-plugin/releases/download/0.0.24-no-gpu/docker-plugin_Linux_x86_64 -O docker-plugin \
&& chmod +x docker-plugin \
&& docker ps \
&& export INTERLINKCONFIGPATH=$PWD/ci/manifests/plugin-config.yaml \
&& ./docker-plugin

Run the tests

Then, in another terminal session, you are ready to execute the e2e tests with Dagger.

The pytest configuration file is ci/manifests/vktest_config.yaml. See the test documentation to learn how to tweak it.

The following instructions build Docker images of the virtual-kubelet and interlink API server components at runtime and publish them to the virtual-kubelet-ref and interlink-ref repositories (in this example, the Docker Hub repositories of the dciangot user). The pipeline is a chain of Dagger tasks: building the core images (build-images), creating the Kubernetes environment configured with the core components (new-interlink), installing the plugin of choice indicated in the manifest folder (load-plugin), and finally executing the tests (test).

To run the default tests, move to the ci folder and execute the Dagger pipeline with:

dagger call \
--name my-tests \
build-images \
new-interlink \
--plugin-endpoint tcp://localhost:4000 \
test stdout

In case of success the output should print something like the following:

cachedir: .pytest_cache
rootdir: /opt/vk-test-set
configfile: pyproject.toml
collecting ... collected 12 items / 1 deselected / 11 selected

vktestset/basic_test.py::test_namespace_exists[default] PASSED [ 9%]
vktestset/basic_test.py::test_namespace_exists[kube-system] PASSED [ 18%]
vktestset/basic_test.py::test_namespace_exists[interlink] PASSED [ 27%]
vktestset/basic_test.py::test_node_exists[virtual-kubelet] PASSED [ 36%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-000-hello-world.yaml] PASSED [ 45%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-010-simple-python.yaml] PASSED [ 54%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-020-python-env.yaml] PASSED [ 63%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-030-simple-shared-volume.yaml] PASSED [ 72%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-040-config-volumes.yaml] PASSED [ 81%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-050-limits.yaml] PASSED [ 90%]
vktestset/basic_test.py::test_manifest[virtual-kubelet-060-init-container.yaml] PASSED [100%]

====================== 11 passed, 1 deselected in 41.71s =======================

Debug with interactive session

If something goes wrong, you can spawn a session inside the final step of the pipeline to debug it:

dagger call \
--name my-tests \
build-images \
new-interlink \
--plugin-endpoint tcp://localhost:4000 \
run terminal

After a few minutes, this command should give you access to a bash session where you can run the following commands:

bash
source .venv/bin/activate
export KUBECONFIG=/.kube/config

## check connectivity with k8s cluster
kubectl get pod -A

## re-run the tests
pytest -vk 'not rclone'

Debug from kubectl on your host

You can get the Kubernetes service running with:

dagger call \
--name my-tests \
build-images \
new-interlink \
--plugin-endpoint tcp://localhost:4000 \
kube up

and then from another session, you can get the kubeconfig with:

dagger call \
--name my-tests \
config export --path ./kubeconfig.yaml

Deploy on existing K8s cluster

TBD

Develop Virtual Kubelet code

⚠️ Coming soon

Develop your plugin

⚠️ Coming soon

SSL Certificate Management

CSR Integration for Virtual Kubelet

As of this version, Virtual Kubelet supports proper SSL certificate management using Kubernetes Certificate Signing Requests (CSRs) instead of self-signed certificates. This resolves compatibility issues with kubectl logs and other Kubernetes clients.

Key Changes

  • CSR-based certificates: Virtual Kubelet now requests certificates from the Kubernetes cluster CA using the standard kubernetes.io/kubelet-serving signer
  • Automatic fallback: If CSR creation fails, the system falls back to self-signed certificates with a warning
  • Improved compatibility: The --insecure-skip-tls-verify-backend flag is no longer required for kubectl logs

Technical Details

The implementation uses:

  • Signer: kubernetes.io/kubelet-serving (standard kubelet serving certificate signer)
  • Certificate store: /tmp/certs directory with virtual-kubelet prefix
  • Subject: system:node:<node-name> with system:nodes organization
  • IP SANs: Node IP address for proper certificate validation

Testing Certificate Integration

To verify CSR-based certificate functionality:

  1. Check CSR creation:

    kubectl get csr
  2. Test kubectl logs without insecure flag:

    kubectl logs <pod-name-on-virtual-kubelet-node>
  3. Monitor Virtual Kubelet logs for certificate retrieval messages:

    kubectl logs -n interlink virtual-kubelet-<node-name>

⚠️ IMPORTANT: CSR Manual Approval Required

CRITICAL: Unless your cluster approves them automatically, CSRs (Certificate Signing Requests) must be approved manually by a cluster administrator. Without CSR approval, kubectl logs and other log-related operations will fail.

Required steps for enabling log functionality:

  1. Check for pending CSRs:

    kubectl get csr
  2. Approve the CSR (replace csr-xxxxx with the actual CSR name):

    kubectl certificate approve csr-xxxxx
  3. Verify logs are accessible:

    kubectl logs <pod-name-on-virtual-kubelet-node>
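
If several CSRs are waiting, the pending ones can be picked out of the parsed output of kubectl get csr -o json before approving them. This is a sketch under the assumption that a CSR is pending when it carries neither an Approved nor a Denied condition, as in the certificates.k8s.io/v1 API; the helper names are hypothetical. Review each request before approving it.

```python
KUBELET_SERVING = "kubernetes.io/kubelet-serving"


def pending_serving_csrs(csr_list):
    """Return names of kubelet-serving CSRs that are neither approved nor denied."""
    names = []
    for item in csr_list.get("items", []):
        if item.get("spec", {}).get("signerName") != KUBELET_SERVING:
            continue  # ignore client certificates and other signers
        conditions = (item.get("status") or {}).get("conditions") or []
        decided = any(c.get("type") in ("Approved", "Denied") for c in conditions)
        if not decided:
            names.append(item["metadata"]["name"])
    return names


def approve_commands(names):
    """Render one kubectl command per pending CSR for manual review."""
    return [f"kubectl certificate approve {name}" for name in names]
```

Printing the commands instead of executing them keeps the approval step a deliberate, manual action.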

Troubleshooting

  • CSR approval: Ensure your cluster has automatic CSR approval configured or manually approve CSRs
  • RBAC permissions: Virtual Kubelet needs permissions to create CSRs in the certificates.k8s.io API group
  • Fallback behavior: Check logs for warnings about falling back to self-signed certificates

For clusters without proper CSR support, the system maintains backward compatibility by automatically using self-signed certificates with appropriate warnings.