Log Parser for Karpenter is a Golang-based command-line tool that parses the output of Karpenter controller logs. It has been tested against logs from the following Karpenter versions:
- v0.37.7 *
- v1.0.x *
- v1.1.x
- v1.2.x
- v1.3.x
- v1.4.x
* Note: "messageKind":"spot_interrupted"
is first supported with Karpenter version v1.1.x, so LogParserForKarpenter (lp4k) does not provide interruptiontime and interruptionkind in earlier versions
It accepts either STDIN (for example, for piping live Karpenter controller logs) or one or more Karpenter log files as input and prints CSV-formatted nodeclaim data, ordered by createdtime, to STDOUT, so the output can easily be redirected into a file and analysed with tools like Amazon QuickSight or Microsoft Excel.
If neither STDIN nor log files are used as input, lp4k attaches to a running K8s/EKS cluster, parses streamed Karpenter logs (similar to kubectl logs -f, using LP4K_KARPENTER_NAMESPACE and LP4K_KARPENTER_LABEL) and creates a ConfigMap lp4k-cm-<date> in the same namespace, which is updated every LP4K_CM_UPDATE_FREQ.
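How the cluster-attach mode works can be illustrated with a minimal client-go sketch: resolve the Karpenter controller pods by label, then follow their log streams. This is an illustration under the defaults documented below, not lp4k's actual implementation, and it only follows the first matching pod.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// getenv returns an environment variable or a default when unset.
func getenv(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	// Build a client from the current KUBECONFIG context, as lp4k does when
	// neither STDIN nor log files are supplied.
	cfg, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
		clientcmd.NewDefaultClientConfigLoadingRules(),
		&clientcmd.ConfigOverrides{}).ClientConfig()
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ns := getenv("LP4K_KARPENTER_NAMESPACE", "karpenter")
	label := getenv("LP4K_KARPENTER_LABEL", "app.kubernetes.io/name=karpenter")

	// Resolve the Karpenter controller pods by label.
	pods, err := clientset.CoreV1().Pods(ns).List(context.TODO(),
		metav1.ListOptions{LabelSelector: label})
	if err != nil || len(pods.Items) == 0 {
		panic(fmt.Sprintf("no Karpenter pods found in namespace %q: %v", ns, err))
	}

	// Follow the log stream of the first matching pod; a full implementation
	// would follow every matching pod.
	req := clientset.CoreV1().Pods(ns).GetLogs(pods.Items[0].Name,
		&corev1.PodLogOptions{Follow: true})
	stream, err := req.Stream(context.TODO())
	if err != nil {
		panic(err)
	}
	defer stream.Close()

	scanner := bufio.NewScanner(stream)
	for scanner.Scan() {
		// Each line is one JSON Karpenter log entry; this is where the parsing
		// and the periodic ConfigMap update would hook in.
		fmt.Println(scanner.Text())
	}
}
```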
K8s handling can be configured using the following OS environment variables:
Environment variable | Default value | Description |
---|---|---|
LP4K_KARPENTER_NAMESPACE | "karpenter" | K8s namespace where Karpenter controller is running |
LP4K_KARPENTER_LABEL | "app.kubernetes.io/name=karpenter" | Karpenter controller K8s pod labels |
LP4K_CM_UPDATE_FREQ | "30s" | update frequency of ConfigMap and STDOUT if enabled (default), must be valid Go time.Duration string like "30s" or 2m30s" |
LP4K_CM_PREFIX | "lp4k-cm" | nodeclaim ConfigMap prefix, if KARPENTER_LP4K_CM_OVERRIDE=false or ConfigMap name, if KARPENTER_LP4K_CM_OVERRIDE=true |
LP4K_CM_OVERRIDE | "false" | determines, if ConfigMap will just use prefix and will be overriden upon every start of lp4k |
LP4K_NODECLAIM_PRINT | "true" | print nodeclaim information every KARPENTER_CM_UPDATE_FREQ to STDOUT |
* Note: With LP4K_CM_OVERRIDE=true, lp4k will read existing nodeclaim data from the ConfigMap specified by LP4K_CM_PREFIX.
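As an illustration of the table above (not lp4k's actual code), here is a short Go sketch of how such variables are typically consumed: each value falls back to its default when unset, and LP4K_CM_UPDATE_FREQ has to satisfy Go's time.ParseDuration.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
	"time"
)

// lookup returns the value of an environment variable or its default.
func lookup(key, def string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return def
}

func main() {
	namespace := lookup("LP4K_KARPENTER_NAMESPACE", "karpenter")
	label := lookup("LP4K_KARPENTER_LABEL", "app.kubernetes.io/name=karpenter")
	cmPrefix := lookup("LP4K_CM_PREFIX", "lp4k-cm")

	// LP4K_CM_UPDATE_FREQ must be a valid Go time.Duration string such as
	// "30s" or "2m30s"; anything else is rejected here.
	updateFreq, err := time.ParseDuration(lookup("LP4K_CM_UPDATE_FREQ", "30s"))
	if err != nil {
		log.Fatalf("invalid LP4K_CM_UPDATE_FREQ: %v", err)
	}

	// One way to read the boolean flags (the exact accepted spellings are
	// lp4k-internal; strconv.ParseBool accepts "true"/"false", "1"/"0", ...).
	cmOverride, _ := strconv.ParseBool(lookup("LP4K_CM_OVERRIDE", "false"))
	printNodeclaims, _ := strconv.ParseBool(lookup("LP4K_NODECLAIM_PRINT", "true"))

	fmt.Println(namespace, label, cmPrefix, updateFreq, cmOverride, printNodeclaims)
}
```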
Use:
LP4K_CM_UPDATE_FREQ=10s ./bin/lp4k
or, to set it persistently for the shell session:
export LP4K_CM_UPDATE_FREQ=10s
./bin/lp4k
Just run:
make
A binary lp4k for your OS and platform is built in the directory bin.
Then run it like:
./bin/lp4k sample-input.txt
or
./bin/lp4k <Karpenter log output file 1> [... <Karpenter log output file n>]
or
kubectl logs -n karpenter deployments/karpenter --all-pods=true [-f] | ./bin/lp4k
or, for attaching to a K8s/EKS cluster in the current KUBECONFIG context:
./bin/lp4k
The sample output file sample-multi-file-klp-output.csv shows all exposed nodeclaim information and can be used as a starting point for building analyses on top of it.
- Note: lp4k only recognises a new nodeclaim and populates its internal structures once the Karpenter controller logs contain a log line with "message":"created nodeclaim". That means that after a Karpenter controller restart and the subsequent (and required) restart of lp4k, lp4k will not recognise already existing nodeclaims and shows No results - empty "nodeclaim" map.
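A rough sketch of the detection described in the note (not lp4k's actual parser): read log lines from STDIN, keep only those whose message is "created nodeclaim" and decode the nodeclaim name. The nested NodeClaim.name field is an assumption based on recent Karpenter JSON log output and may need adjusting for your version.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// karpenterLogLine models only the fields this sketch needs. The "message"
// key is shown in the note above; the nested NodeClaim.name field is an
// assumption based on recent Karpenter JSON log output.
type karpenterLogLine struct {
	Message   string `json:"message"`
	NodeClaim struct {
		Name string `json:"name"`
	} `json:"NodeClaim"`
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // allow long JSON lines
	for scanner.Scan() {
		line := scanner.Text()
		// Cheap pre-filter so not every line has to be unmarshalled.
		if !strings.Contains(line, `"message":"created nodeclaim"`) {
			continue
		}
		var entry karpenterLogLine
		if err := json.Unmarshal([]byte(line), &entry); err != nil {
			continue // not a JSON log line
		}
		// This is the point where a parser like lp4k would create a new entry
		// in its internal nodeclaim map, keyed by the nodeclaim name.
		fmt.Printf("new nodeclaim: %s\n", entry.NodeClaim.Name)
	}
}
```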
lp4kcm is a helper tool to display lp4k ConfigMap data in the same CSV format.
Just run:
make tools
A binary lp4kcm for your OS and platform is built in the directory bin.
Then run it like:
./bin/lp4kcm <lp4k ConfigMap name 1> [... <lp4k ConfigMap name n>]
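Conceptually, lp4kcm fetches the named ConfigMaps from the Karpenter namespace and renders their data; the minimal client-go sketch below only dumps every data entry, since the exact key/value layout inside the ConfigMap is lp4k-internal.

```go
package main

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the current KUBECONFIG context.
	cfg, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
		clientcmd.NewDefaultClientConfigLoadingRules(),
		&clientcmd.ConfigOverrides{}).ClientConfig()
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ns := os.Getenv("LP4K_KARPENTER_NAMESPACE")
	if ns == "" {
		ns = "karpenter"
	}

	// Each command line argument is treated as an lp4k ConfigMap name.
	for _, name := range os.Args[1:] {
		cm, err := clientset.CoreV1().ConfigMaps(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			fmt.Fprintf(os.Stderr, "cannot read ConfigMap %s/%s: %v\n", ns, name, err)
			continue
		}
		// Print every data entry; lp4kcm renders this as the same CSV layout
		// that lp4k prints to STDOUT.
		for _, v := range cm.Data {
			fmt.Println(v)
		}
	}
}
```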
The simplest way to analyse the output is to parse it with standard Linux utilities like awk, cut and grep.
# indexed header
$ head -1 sample-multi-file-klp-output.csv
nodeclaim[1],createdtime[2],nodepool[3],instancetypes[4],launchedtime[5],providerid[6],instancetype[7],zone[8],capacitytype[9],registeredtime[10],k8snodename[11],initializedtime[12],nodereadytime[13],nodereadytimesec[14],disruptiontime[15],disruptionreason[16],disruptiondecision[17],disruptednodecount[18],replacementnodecount[19],disruptedpodcount[20],annotationtime[21],annotation[22],tainttime[23],taint[24],interruptiontime[25],interruptionkind[26],deletedtime[27],nodeterminationtime[28],nodeterminationtimesec[29],nodelifecycletime[30],nodelifecycletimesec[31],initialized[32],deleted[33]
# print nodeclaim[index/column=1], nodereadytime[13],nodereadytimesec[14]
$ cat sample-multi-file-klp-output.csv | awk -F ',' '{print $1,$13,$14 }' | more
nodeclaim[1] nodereadytime[13] nodereadytimesec[14]
spot-844xp 1m18.591s 78.6
default-brbk4 0s 0.0
default-lpc62 50.935s 50.9
default-j4lj7 43.617s 43.6
default-mpz2w 46.277s 46.3
default-8sxj9 36.714s 36.7
default-zgb22 35.008s 35.0
local-storage-raid-al2023-kdsvk 41.839s 41.8
local-storage-raid-al2023-tq7v5 43.922s 43.9
local-storage-raid-al2023-9kx8z 48.781s 48.8
...
# search for specific values of a specific nodeclaim
$ cat sample-multi-file-klp-output.csv | awk -F ',' '/local-storage-raid-al2023-nccxt/ { print $1,$13,$14 }'
local-storage-raid-al2023-nccxt 1m8.447s 68.4
# combine directly with lp4k and raw Karpenter controller logs
$ ./bin/lp4k karpenter-log-0.37.7.txt | awk -F ',' '{ print $1,$13,$14 }'
nodeclaim(1) nodereadytime(13) nodereadytimesec(14)
default-brbk4 0s 0.0
default-lpc62 50.935s 50.9
default-j4lj7 43.617s 43.6
default-mpz2w 46.277s 46.3
Amazon QuickSight or Microsoft Excel are possible choices for advanced analysis of the CSV output, for example to create graphs and/or pivot tables.
We welcome contributions to lp4k! Please see CONTRIBUTING.md for more information on how to report bugs or submit pull requests.
This project has adopted the Amazon Open Source Code of Conduct. See CODE_OF_CONDUCT.md for more details.
See CONTRIBUTING for more information.
This project is licensed under the Apache 2.0 License.