Kubernetes Persistent Volumes

I have been really busy this month, so not much has materialized in terms of blog posts. Anyway, I took a short break, and in that time I experimented further with persistent volumes, using my previous blog post as a base.

The idea is to add a persistent volume to a Pod using a clustered filesystem so that I can access the data the Pod produces from anywhere. As a test, I want to use a GlusterFS volume that is also mounted on my local host OS.

K3s Caveats

It is important to note that K3s is a lightweight Kubernetes distribution and excludes certain in-tree volume plugins, like GlusterFS.

If you follow a typical Kubernetes GlusterFS example, you will notice the following output when you run something like kubectl describe pod/glusterfs:

Name:         glusterfs
Namespace:    pvtest

... some data omitted ...

Events:
  Type     Reason       Age                From               Message
  ----     ------       ----               ----               -------
  Normal   Scheduled    92s                default-scheduler  Successfully assigned pvtest/glusterfs to node2
  Warning  FailedMount  34s (x2 over 49s)  kubelet            Unable to attach or mount volumes: unmounted volumes=[glusterfsvol], unattached volumes=[kube-api-access-mkjv6 glusterfsvol]: failed to get Plugin from volumeSpec for volume "glusterfsvol" err=no volume plugin matched
  Warning  FailedMount  3s (x5 over 92s)   kubelet            Unable to attach or mount volumes: unmounted volumes=[glusterfsvol], unattached volumes=[glusterfsvol kube-api-access-mkjv6]: failed to get Plugin from volumeSpec for volume "glusterfsvol" err=no volume plugin matched

Since the plugin is not available, the Pod will never start or reach a ready state. I still plan to test the example in another environment at some point - perhaps that is material for another blog post... Time will tell.

According to the K3s documentation there are two options when using persistent volumes:

- Setting up the Local Storage Provider (the local path provisioner)
- Setting up Longhorn

There is a nice Longhorn on K3s for Raspberry Pi guide that would be great to follow, but I decided to first try the local storage provider, as I thought that I could expose GlusterFS volumes on each Node, which in turn can then be used by the local storage provider.

Lab Setup

The lab implementation is based on K3s exposing persistent volumes through the local storage provider, which in turn mounts a GlusterFS volume.

Below is a simplified diagram of what I'm trying to accomplish. The diagram itself is based on the C4 model deployment diagram example.

[Diagram: lab setup]

Preparing the cluster for the local storage provider

Before deploying a pod, some preparation work is required. The next couple of steps cover the preparation of the K3s cluster in detail.

Preparing Nodes

As a prerequisite, I have my GlusterFS setup, as described in my previous blog post, up and running.

The biggest task really was to prepare the GlusterFS setup on each node. I have blogged previously about setting up k3s using multipass. I am using that as a base, and I have created an additional script (available here) to easily install the required packages and mount the GlusterFS volume.

To set up and mount the volume on the k3s nodes, simply run bash install_glusterfs_clinet_on_k3s_nodes.sh.
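The script is just a convenience wrapper; in essence it does roughly the following on each node (a hedged sketch, assuming Ubuntu-based multipass nodes and the GlusterFS server and volume names, glusterfs1 and volume1, that show up in the df output later in this post):

# Install the GlusterFS client packages
sudo apt-get update && sudo apt-get install -y glusterfs-client

# Create the mount point that the local path provisioner will use
sudo mkdir -p /opt/local-path-provisioner

# Mount the GlusterFS volume on that mount point
sudo mount -t glusterfs glusterfs1:/volume1 /opt/local-path-provisioner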

Note: In the next step it will become clear why the script creates the mount point on /opt/local-path-provisioner.

Installing the local storage provider

To set up the local storage provider, I followed this README and used the stable build, which was version 0.0.22 at the time of experimenting.

The installation command was run exactly as given in the README:

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.22/deploy/local-path-storage.yaml

Verify installation with the command kubectl get all -n local-path-storage:

NAME                                          READY   STATUS    RESTARTS   AGE
pod/local-path-provisioner-7c795b5576-nklqr   1/1     Running   0          54s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/local-path-provisioner   1/1     1            1           54s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/local-path-provisioner-7c795b5576   1         1         1       54s

The configuration is defined in a ConfigMap which expects the local storage to be at /opt/local-path-provisioner. This is why the previous step mounted the GlusterFS volume on this mount point.
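If you want to confirm (or change) that path, the provisioner's configuration can be inspected as shown below. The ConfigMap name local-path-config comes from the deployment manifest, and the nodePathMap snippet is roughly what the stock configuration should contain:

kubectl get configmap local-path-config -n local-path-storage -o yaml

# The config.json key should contain something like:
#   "nodePathMap":[
#     {
#       "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
#       "paths":["/opt/local-path-provisioner"]
#     }
#   ]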

Testing

For the test I used a simple Pod manifest as can be seen below:

apiVersion: v1
kind: Pod
metadata:
  name: blox
spec:
  containers:
  - name: flask-demo-app
    image: nicc777/demo-flask-app:0.0.2
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: local-path-pvc

Save this to a file, for example blox-pod.yaml, and apply it with the following commands:

# Create a namespace
kubectl create namespace pvtest

# Create a persistent volume claim
kubectl create -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/examples/pvc/pvc.yaml -n pvtest

# Create a Pod, using the persistent volume claim
kubectl apply -f blox-pod.yaml -n pvtest
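For reference, the example pvc.yaml used above defines a claim that should look roughly like this (reconstructed from the claim details shown later in this post, so treat it as a sketch rather than a verbatim copy of the upstream file):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 128Mi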

The pod should be running after a minute or so (depending on Internet bandwidth). A kubectl get all -n pvtest should show the following:

NAME       READY   STATUS    RESTARTS   AGE
pod/blox   1/1     Running   0          2m15s
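Note that kubectl get all does not list persistent volume claims, so it is also worth checking that the claim bound successfully (the exact volume name will differ in your environment):

kubectl get pvc -n pvtest

# The claim should show STATUS Bound, VOLUME pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d,
# CAPACITY 128Mi, ACCESS MODES RWO and STORAGECLASS local-path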

Discover details about the persistent volume claim

First, let's describe the Pod with kubectl describe pod/blox -n pvtest. The output (shortened) may look something like this:

Name:         blox
Namespace:    pvtest
Priority:     0
Node:         node2/10.0.50.222

... some data omitted ...

Volumes:
  volv:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  local-path-pvc
    ReadOnly:   false

... some data omitted ...

Now, let's have a look at the persistent volume claim with kubectl describe PersistentVolumeClaim local-path-pvc -n pvtest:

Name:          local-path-pvc
Namespace:     pvtest
StorageClass:  local-path
Status:        Bound
Volume:        pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
               volume.kubernetes.io/selected-node: node2
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      128Mi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       blox
Events:
  Type    Reason                 Age                From                                                                                                Message
  ----    ------                 ----               ----                                                                                                -------
  Normal  WaitForFirstConsumer   14m (x2 over 14m)  persistentvolume-controller                                                                         waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   14m (x2 over 14m)  persistentvolume-controller                                                                         waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
  Normal  Provisioning           14m                rancher.io/local-path_local-path-provisioner-7c795b5576-xkpk4_e577fe62-821d-4a69-ac5a-42d01042716a  External provisioner is provisioning volume for claim "pvtest/local-path-pvc"
  Normal  ProvisioningSucceeded  14m                rancher.io/local-path_local-path-provisioner-7c795b5576-xkpk4_e577fe62-821d-4a69-ac5a-42d01042716a  Successfully provisioned volume pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d

Important things to note:

- The StorageClass is local-path, which is the class created when the local path provisioner was installed earlier.
- The claim is Bound to the volume pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d.
- The provisioner waited for the first consumer (the blox Pod) before binding and provisioning the volume (the WaitForFirstConsumer event).
- The volume.kubernetes.io/selected-node annotation shows the volume was provisioned on node2, which is also the node the Pod was scheduled on.

Use the Persistent Volume from a pod

For this part of the experiment, the commands will be run in a shell on the deployed Pod.

Now, let's see if we have a mounted volume:

# Get a bash session
kubectl exec pod/blox -n pvtest -it -- bash

# Once in the Pod, check out the mounts and verify /data is mounted
root@blox:/usr/src/app# df -h
Filesystem          Size  Used Avail Use% Mounted on
overlay              12G  5.0G  6.5G  44% /
tmpfs                64M     0   64M   0% /dev
tmpfs               3.9G     0  3.9G   0% /sys/fs/cgroup
glusterfs1:volume1   12G  2.2G  9.3G  20% /data
/dev/sda1            12G  5.0G  6.5G  44% /etc/hosts
shm                  64M     0   64M   0% /dev/shm
tmpfs               7.8G   12K  7.8G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs               3.9G     0  3.9G   0% /proc/acpi
tmpfs               3.9G     0  3.9G   0% /proc/scsi
tmpfs               3.9G     0  3.9G   0% /sys/firmware

# Check if we can store data:
echo TEST123 > /data/blox-test.txt

# Check the data locally:
cat /data/blox-test.txt

Verify the data from outside the Pod

Let's have a look at what happened in the actual volume from another system.
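Mounting the volume on another machine looks roughly like this (a hedged sketch, reusing the server and volume names from the df output above and the /glusterfs-data mount point used below; the glusterfs-client package needs to be installed first):

sudo mkdir -p /glusterfs-data
sudo mount -t glusterfs glusterfs1:/volume1 /glusterfs-data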

On my local development system, I have mounted the GlusterFS volume on /glusterfs-data. If I do a directory listing with ls -lahrt /glusterfs-data I will see the following:

drwxr-xr-x 23 root    root    4.0K May  7 15:40 ..
drwxrwxrwx  5 root    root    4.0K May  7 15:40 .
drwxr-xr-x  2 nicc777 nicc777 4.0K May  7 15:41 nicc777
-rw-rw-r--  1 nicc777 nicc777  537 May 25 10:17 testfile
drwxrwxrwx  2 root    root    4.0K May 25 10:59 pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d_pvtest_local-path-pvc

And let's see if I can see the test data file I created from the pod with the command: sudo cat /glusterfs-data/pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d_pvtest_local-path-pvc/blox-test.txt:

TEST123

There are some other directories and files not related to this test. However, the persistent volume claim has created a sub-directory called pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d_pvtest_local-path-pvc. You should now be able to see how the directory name is derived (volume name, namespace and claim name), based on the previous observations of the persistent volume claim and the Pod manifest.
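If you would rather not guess, the backing path can also be read from the persistent volume object itself (a hedged example; the volume source should point at a sub-directory of /opt/local-path-provisioner on the node):

# Find the volume name from the claim, then describe the volume
kubectl get pvc local-path-pvc -n pvtest -o jsonpath='{.spec.volumeName}'
kubectl describe pv pvc-d8bfe66a-1954-4bc2-a7c0-fa8753b2e86d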

I also noted that it takes a while for the directory to become visible on my local system (more than a minute in my case). This is just the time it takes for GlusterFS to ensure it synchronizes everything properly before making it visible on a local mount point. Various factors may influence the speed at which this happens, including disk I/O speed, network speed, etc.

If you know where to look, the data created in the Pod is now visible anywhere the GlusterFS volume is mounted. This is exactly what I wanted to accomplish!

Conclusion: what can we learn from this simple experiment?

There are so many details that can be explored, but I will try to keep it simple by summarizing the following important observations:

- K3s does not include the in-tree GlusterFS volume plugin, so a typical Kubernetes GlusterFS example fails with a "no volume plugin matched" error.
- The local storage provider can still give you clustered storage indirectly: by mounting a GlusterFS volume on /opt/local-path-provisioner on every node, the locally provisioned volumes end up on the clustered filesystem.
- Data a Pod writes to such a persistent volume is visible on any system that mounts the GlusterFS volume, although it may take a minute or so for GlusterFS to synchronize and expose new directories.

I hope this blog post gave you something interesting to think about in your own research and experimentation with persistent volumes. I hope I can expand on this topic in other scenarios in the near future.

Tags

file-systems, kubernetes, persistent volumes, pv