8.4. Prometheus Node Exporter
The Prometheus Node Exporter is a key component used for collecting operating system metrics from Linux and Windows systems. It exposes a wide range of system-level metrics that Prometheus can scrape, making it useful for monitoring the health and performance of physical and virtual machines.
Some of the key metrics collected by Node Exporter include:
- CPU usage: Tracks how much CPU time is being used by user and system processes
- Memory usage: Monitors free and used memory, swap space, and buffer/cache utilization
- Disk I/O: Provides insights into disk read/write operations and storage usage
- Network statistics: Captures metrics on data sent and received over network interfaces
- File system usage: Monitors available and used space on file systems
Node Exporter runs as a lightweight daemon on each node and is easy to install and configure. It works out of the box, exposing most common system metrics through the /metrics endpoint, but can also be extended with additional collectors to gather more specialized data. These metrics can be visualized through tools like Grafana, helping administrators monitor and troubleshoot infrastructure performance.
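To give an idea of what this looks like in practice, the categories above correspond to metric families such as the following (names as exposed by current Node Exporter releases; the exact set and labels depend on the enabled collectors):

node_cpu_seconds_total{cpu="0",mode="user"}        # CPU time per core and mode
node_memory_MemAvailable_bytes                     # memory currently available
node_disk_read_bytes_total{device="vda"}           # bytes read per block device
node_network_receive_bytes_total{device="eth0"}    # bytes received per interface
node_filesystem_avail_bytes{mountpoint="/"}        # free space per file system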
Together with the other monitoring capabilities covered so far, the inside view of operating-system-level metrics that the Node Exporter provides completes the overall monitoring picture.
The goals of this lab are:
- Installing the Node Exporter binary in a VM
- Exposing those Node Exporter metrics on the pod network
- Providing capabilities to integrate those white-box metrics into an existing Prometheus stack
Task 8.4.1: Install the Node Exporter using cloud-init
First, we are going to define our cloud-init configuration. Create a file called cloudinit-node-exporter.yaml in the folder labs/lab08 with the following content:
#cloud-config-archive
- type: "text/cloud-config"
  content: |
    password: kubevirt
    chpasswd: { expire: False }
    users:
      - default
      - name: node_exporter
        gecos: Node Exporter User
        primary_group: node_exporter
        groups: node_exporter
        shell: /bin/nologin
        system: true
    write_files:
      - content: |
          [Unit]
          Description=Node Exporter
          After=network.target
          [Service]
          User=node_exporter
          Group=node_exporter
          # Fallback when environment file does not exist
          Environment=OPTIONS=
          EnvironmentFile=-/etc/sysconfig/node_exporter
          ExecStart=/usr/local/bin/node_exporter $OPTIONS
          [Install]
          WantedBy=multi-user.target
        path: /etc/systemd/system/node_exporter.service
- type: "text/x-shellscript"
  content: |
    #!/bin/sh
    # install node_exporter
    curl -fsSL https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz | sudo tar -zxvf - -C /usr/local/bin --strip-components=1 node_exporter-1.8.2.linux-amd64/node_exporter && sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
    sudo systemctl enable --now node_exporter
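Note the EnvironmentFile=-/etc/sysconfig/node_exporter line in the unit: the leading - makes the file optional, so the service also starts when it does not exist. Should you later want to pass additional startup flags to the Node Exporter, a minimal sketch of such a file could look like this (the flags are merely examples of existing Node Exporter options and are not required for this lab):

# /etc/sysconfig/node_exporter (optional, example flags only)
OPTIONS="--web.listen-address=:9100 --collector.textfile.directory=/var/lib/node_exporter/textfile"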
Create a secret from the above file’s content with:
kubectl create secret generic lab08-cloudinit-node-exporter --from-file=userdata=labs/lab08/cloudinit-node-exporter.yaml --namespace lab-<username>
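If you want to verify that the cloud-init configuration really ended up in the secret, you can decode it again, for example with:

kubectl get secret lab08-cloudinit-node-exporter --namespace lab-<username> -o jsonpath='{.data.userdata}' | base64 -d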
Create a virtual machine that references the configuration above: create a new file vm_lab08-node-exporter.yaml in the folder labs/lab08/ with the following content:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: lab08-node-exporter
spec:
  runStrategy: Halted
  template:
    metadata:
      labels:
        kubevirt.io/domain: lab08-node-exporter
    spec:
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
            - name: default
              masquerade: {}
        resources:
          requests:
            memory: 1024M
      networks:
        - name: default
          pod: {}
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:43
        - name: cloudinitdisk
          cloudInitNoCloud:
            secretRef:
              name: lab08-cloudinit-node-exporter
Create your VM with:
kubectl apply -f labs/lab08/vm_lab08-node-exporter.yaml --namespace lab-<username>
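Because of runStrategy: Halted the VM is created but not started yet. You can check this with:

kubectl get vm lab08-node-exporter --namespace lab-<username>

The STATUS column should show the VM as stopped.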
Start it with:
virtctl start lab08-node-exporter --namespace lab-<username>
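On first boot, cloud-init needs a moment to download and start the Node Exporter. If you want to follow the installation, you can attach to the serial console (log in with the default user of the Fedora cloud image, typically fedora, and the password from the cloud-init configuration) and check the service from inside the VM:

virtctl console lab08-node-exporter --namespace lab-<username>
systemctl status node_exporter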
Task 8.4.2: Exposing the Node Exporter
We have started a virtual machine that uses cloud-init to install the Node Exporter, which exposes node metrics on port 9100. To make those metrics reachable inside the cluster and test them, we first expose the VM through a Kubernetes Service.
Create the following Service in a file called service-node-exporter.yaml in the folder labs/lab08:
apiVersion: v1
kind: Service
metadata:
  name: lab08-node-exporter
  labels:
    node-exporter: "true"
spec:
  ports:
    - name: metrics
      port: 9100
      protocol: TCP
      targetPort: 9100
  selector:
    kubevirt.io/domain: lab08-node-exporter
  type: ClusterIP
And create it with:
kubectl apply -f labs/lab08/service-node-exporter.yaml --namespace lab-<username>
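The Service selects the VM's virt-launcher pod through the kubevirt.io/domain label. Before testing, you can optionally check that the Service actually picked up an endpoint:

kubectl get endpoints lab08-node-exporter --namespace lab-<username>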
Test the Node Exporter’s metrics endpoint from your webshell:
curl -s http://lab08-node-exporter.lab-<username>.svc.cluster.local:9100/metrics
The expected output looks similar to this:
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 7
[...]
Those metrics can now easily be integrated into an existing Prometheus stack by deploying a ServiceMonitor resource, which would look similar to this:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: node-exporter-servicemonitor
  namespace: lab-<username>
spec:
  endpoints:
    - honorLabels: true
      port: metrics
      scheme: http
  selector:
    matchLabels:
      node-exporter: "true"
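Whether such a ServiceMonitor is actually picked up depends on the Prometheus stack running in your cluster: the Prometheus Operator must be installed and its Prometheus instance must be configured to select ServiceMonitors in your namespace. If that is the case, you could save the manifest, for example as labs/lab08/servicemonitor-node-exporter.yaml (the file name is only a suggestion), and apply it the same way as the other resources:

kubectl apply -f labs/lab08/servicemonitor-node-exporter.yaml --namespace lab-<username>

Once Prometheus scrapes the target, the VM's metrics can be queried like any other node metrics, for example with a PromQL expression such as rate(node_cpu_seconds_total{mode!="idle"}[5m]) for CPU usage.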
End of lab
Cleanup resources
You have reached the end of this lab. Please stop your running virtual machines to save resources on the cluster.
Stop the VirtualMachineInstance:
virtctl stop lab08-node-exporter --namespace lab-<username>
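If you also want to remove the other objects created in this lab (optional, only stopping the VM is required), a possible cleanup could look like this:

kubectl delete -f labs/lab08/vm_lab08-node-exporter.yaml --namespace lab-<username>
kubectl delete -f labs/lab08/service-node-exporter.yaml --namespace lab-<username>
kubectl delete secret lab08-cloudinit-node-exporter --namespace lab-<username>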