
Centralized Logging with Elasticsearch, Fluentd and Kibana (EFK)

It’s difficult to work with Pod logs at scale - Kubectl doesn’t let you search or filter log entries. The production-ready option is to run a central logging subsystem, which collects all Pod logs and stores them in a central database. EFK is the usual system for doing that in Kubernetes.

Reference

Fluent Bit configuration

Fluent Bit is a streamlined log collector which evolved from Fluentd. It will run as a Pod on every node, collecting that node's container logs. Fluent Bit uses a pipeline to process logs. This input block reads container log files from the node:

[INPUT]
  Name              tail
  Tag               kube.<namespace_name>.<container_name>.<pod_name>.<container_id>-
  Tag_Regex         (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<container_id>[a-z0-9]{64})\.log$
  Path              /var/log/containers/*.log
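
The Tag_Regex pulls the Pod name, namespace, container name and container ID out of the log file name, and the Tag template turns them into a dotted tag. As a rough example - the Pod name here is made up and the 64-character container ID is abbreviated to a placeholder - a file on the node like:

  /var/log/containers/fulfilment-api-6d8f_default_api-<container-id>.log

would get the tag:

  kube.default.api.fulfilment-api-6d8f.<container-id>-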

This output block saves each log line as a document in Elasticsearch:

[OUTPUT]
  Name            es
  Match           kube.default.*
  Host            elasticsearch
  Index           app-logs
  Generate_ID     On

Finding Pod logs

We’ll start by seeing how Kubernetes stores container logs on the node where the Pod is running.

The fulfilment API is a simple REST API which writes log entries - there's nothing special in the manifest:

📋 Deploy the app and check the logs it prints at startup

kubectl apply -f labs/logging/specs/fulfilment-api

kubectl logs -l app=fulfilment,component=api

This is a Java Spring Boot app - you’ll see a set of startup logs.

The log entries are stored in the filesystem of the node that runs the Pod. You might not have direct access to the node's filesystem, but you can surface it inside a Pod with a volume mount:
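
This is a minimal sketch of what such a Pod can look like - the image and the mount details are assumptions, the actual manifest is in labs/logging/specs/jumpbox:

    apiVersion: v1
    kind: Pod
    metadata:
      name: jumpbox
    spec:
      containers:
        - name: jumpbox
          image: alpine:3.14              # assumed image - the lab spec may use a different one
          command: ["sleep", "86400"]     # keep the Pod running so we can exec into it
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:                       # surfaces the node's log directory inside the Pod
            path: /var/log
            # the files in /var/log/containers are symlinks; on some clusters the
            # targets (e.g. /var/lib/docker/containers) need to be mounted too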

Run the Pod and use it to examine the log files on the node:

kubectl apply -f labs/logging/specs/jumpbox

kubectl exec -it jumpbox -- sh

ls /var/log/containers

# use cat to read the contents of the API log

exit

Each container on the node has a .log file.

Files are named with a pattern: <pod-name>_<namespace>_<container-name>-<container-id>.log

That’s the pattern Fluent Bit will use to add metadata to each log entry.

Run the EFK stack

Logging is a cluster-wide concern, so we'll run the whole EFK stack in a separate namespace.
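
The Fluent Bit collector needs an instance on every node, which is the job of a DaemonSet. This is a minimal sketch of the pattern, not the lab manifest - the names, image tag and mount paths are assumptions, and the real definitions live in labs/logging/specs/logging:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluent-bit
      namespace: logging
    spec:
      selector:
        matchLabels:
          app: fluent-bit
      template:
        metadata:
          labels:
            app: fluent-bit
        spec:
          containers:
            - name: fluent-bit
              image: fluent/fluent-bit:1.8       # assumed image tag
              volumeMounts:
                - name: config                   # pipeline config from the ConfigMap
                  mountPath: /fluent-bit/etc/
                - name: varlog                   # the node's log files
                  mountPath: /var/log
                  readOnly: true
          volumes:
            - name: config
              configMap:
                name: fluent-bit-config          # assumed ConfigMap name
            - name: varlog
              hostPath:
                path: /var/log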

📋 Which namespaces will have logs collected, and which indices will the log documents be stored in?

There are two output blocks in the ConfigMap:

    [OUTPUT]
        Name            es
        Match           kube.default.*
        Host            elasticsearch
        Index           app-logs
        Generate_ID     On

    [OUTPUT]
        Name            es
        Match           kube.kube-system.*
        Host            elasticsearch
        Index           sys-logs
        Generate_ID     On

The Match uses tag metadata which includes the namespace. Logs from the default namespace will be stored in the app-logs index and logs from kube-system will be stored in the sys-logs index.
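
Any namespace you want to collect just needs an output block whose Match pattern covers its tags. As a hypothetical example - this is not part of the lab ConfigMap - a block like this would send logs from the logging namespace itself to their own index:

    [OUTPUT]
        Name            es
        Match           kube.logging.*
        Host            elasticsearch
        Index           efk-logs
        Generate_ID     On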

Deploy the EFK stack and wait for the Pods to be ready:

kubectl apply -f labs/logging/specs/logging

kubectl get all -n logging

Elasticsearch uses a REST API on port 9200 to insert & query data. We can use it from the jumpbox Pod.

We’ll use curl to make HTTP requests - if you’re using Windows, run this script to use the correct curl version:

# only for Windows - enable scripts:
Set-ExecutionPolicy Bypass -Scope Process -Force

# then run:
. ./scripts/windows-tools.ps1

Generate some application logs by making a call to the fulfilment REST API:

# you can use the NodePort address:
curl http://localhost:30018/documents

# or the LoadBalancer
curl http://localhost:8011/documents

📋 Connect to the jumpbox Pod and make an HTTP request with curl, to the /_cat/indices path on the Elasticsearch Pod.

First exec into a shell session on the Pod:

kubectl exec -it jumpbox -- sh

The container image has curl installed - you need to use the fully-qualified domain name for the Elasticsearch Service, and the port:

curl http://elasticsearch.logging.svc.cluster.local:9200/_cat/indices

exit

The output shows a list of indices, which includes where logs are stored:

yellow open app-logs  85auSZIAQ2SYpflMN7NYGQ 5 1 12984 0   3.4mb   3.4mb
yellow open sys-logs  aKQAl5XvQWaC30upiwo71Q 5 1   106 0 658.9kb 658.9kb
green  open .kibana_1 bEezIodMQ_6FoJ8cQ5mP5A 1 0     0 0    230b    230b

You can do everything with the REST API, but the Kibana UI is much easier to use.
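
If you do want to query from the command line, the search endpoint accepts Lucene query strings, so you can filter logs by their Kubernetes metadata straight from the jumpbox. This is a sketch, assuming the Fluent Bit Kubernetes metadata fields (kubernetes.labels.*) are present in the documents:

    # search the app-logs index for entries from the API component:
    curl "http://elasticsearch.logging.svc.cluster.local:9200/app-logs/_search?q=kubernetes.labels.component:api&size=5&pretty"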

View application logs in Kibana

Browse to Kibana on http://localhost:5601 or http://localhost:30016

From the left menu, open Management > Index Patterns and create an index pattern for the app-logs index.

Now from the left menu, open the Discover tab.

You can see all the container logs, plus metadata (namespace, pod, image etc.)

Make another call to the Spring Boot API:

curl localhost:30018/documents

Click Refresh in Kibana and you’ll see a log entry recording the HTTP request you just made.

Add system logs to Kibana

Index patterns in Kibana are used to query data in Elasticsearch. System component logs are being stored in a different index, so you need a new index pattern.

From the left menu, open Management > Index Patterns and create a new index pattern for the sys-logs index.

Switch to the Discover tab and choose the new index pattern. Kibana is pretty user friendly and this is a good place to explore your logs.

📋 Filter the entries to show logs from the Kubernetes API server.

Click on the field kubernetes.labels.component, and you’ll see all the values.

Click the + next to kube-apiserver to see the API logs

You’ll see log entries about core system processes.

Lab

This is a generic log collection system which will fetch logs from every Pod - but not all Pods generate logs:

Start by running the app:

kubectl apply -f labs/logging/specs/fulfilment-processor

Check the logs with Kubectl and Kibana - there are none for this new component:

kubectl logs -l app=fulfilment,component=processor

The application does write logs, to a file in the container filesystem:

kubectl exec deploy/fulfilment-processor -- cat /app/logs/fulfilment-processor.log

Your task is to extend the Pod spec so those logs are pulled out and published as Pod logs.

Stuck? Try hints or check the solution.


Cleanup

kubectl delete ns,deploy,svc,po -l kubernetes.courselabs.co=logging