
How can I best deliver read only data assets to a Kubernetes application?

Submitted by: @import:stackexchange-devops

Problem

I'm migrating an HPC app to Kubernetes, and am trying to determine the best way to provide read-only data assets as a configuration-managed snapshot.

Previously, my team had delivered our application as a set of RPMs, but as we move to Kubernetes, we're delivering Docker images. This works fine for our application binary, as instead of a pile of RPMs that all have to agree, we can just deliver a known working image.

The problem, however, comes with our read-only data assets (similar to a game's asset files). Several different Docker images might rely on a single set of data assets, and so we'd prefer not to bake them into the actual Docker image itself (plus we want the ability to change the assets without having to recompile the application images).

We're unsure as to the best approach for this. The first idea would be to create a "data container" that simply runs NFS and serves out the data. This successfully isolates the data from the application and allows us to collapse a set of data RPMs into a single tagged docker image, but to me it seems like it might be overkill.

I know that we're essentially looking for a persistent volume for Kubernetes, but the problem for us is bundling all of the data into a single package that has the same delivery convenience as a Docker image.

Is there a better way to provide this read-only data as a version-controlled snapshot?

Solution

> we'd prefer not to bake them into the actual Docker image itself (plus we want the ability to change the assets without having to recompile the application images).

These are indeed good properties to have. You are on the right track.


> The problem, however, comes with our read-only data assets (similar to a game's asset files).

This can be solved in several ways, and it's hard to tell which is best for you. Here are some alternatives.

Persistent Volume - ReadOnlyMany


> I know that we're essentially looking for a persistent volume for Kubernetes, but the problem for us is bundling all of the data into a single package that has the same delivery convenience as a Docker image.

Yes, you could create a Persistent Volume with access modes:

accessModes:
- ReadWriteOnce
- ReadOnlyMany


Then, using an automated process:

  • Load your data onto the volume with a Job that mounts it using access mode ReadWriteOnce

  • Mount the volume into your application pods using access mode ReadOnlyMany

Keep the volume immutable: when the content needs to change, create a new volume rather than modifying the existing one. You can also use multiple volumes this way.
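As a rough illustration of the read-only side of this workflow, the manifests below sketch a pre-populated volume, a claim that requests it with ReadOnlyMany, and a pod that mounts the claim read-only. All names, the size, the image, and the CSI driver are hypothetical placeholders, and the storage backend is assumed to support both access modes. The Job that loads the data is omitted.

```yaml
# Sketch only: every name, size, and the CSI driver below is a placeholder.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: assets-v1-pv              # hypothetical; one volume per asset snapshot
spec:
  capacity:
    storage: 10Gi                 # placeholder size
  accessModes:
    - ReadWriteOnce               # used once by the loader Job
    - ReadOnlyMany                # used by all consumers afterwards
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""            # static provisioning, no storage class
  csi:
    driver: csi.example.com       # hypothetical driver; substitute your backend
    volumeHandle: assets-v1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: assets-v1
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""            # must match the volume above
  volumeName: assets-v1-pv        # bind explicitly to the populated volume
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder application image
      volumeMounts:
        - name: assets
          mountPath: /data/assets
          readOnly: true
  volumes:
    - name: assets
      persistentVolumeClaim:
        claimName: assets-v1
        readOnly: true
```

With this layout, rolling out a new asset version would mean creating a new volume and claim (for example `assets-v2`) and pointing `claimName` at it, which keeps each snapshot immutable.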

CDN


> our read-only data assets (similar to a game's asset files).

This also sounds like the kind of content typically served by Content Delivery Networks (CDNs). Whether that is a good solution depends on your use case.

Buckets

Data served through a CDN is usually stored in "buckets" at a cloud provider. Buckets may also be an option on their own, in which case your applications typically load the data over HTTP. MinIO may be an alternative if you want to host the buckets inside your cluster.
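One possible way to wire this up, sketched under assumptions: an init container downloads a versioned asset archive over HTTP into an emptyDir volume, and the application container mounts that volume read-only. The URL, archive name, and both images are hypothetical, and the fetch image is assumed to ship `sh`, `curl`, and `tar`.

```yaml
# Sketch only: URL, archive name, and images are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-bucket-assets
spec:
  initContainers:
    - name: fetch-assets
      image: registry.example.com/tools/fetch:1.0   # assumed to contain sh, curl, tar
      env:
        - name: ASSETS_URL
          value: https://assets.example.com/assets-v1.tar.gz   # hypothetical bucket object
      command: ["sh", "-c"]
      args:
        - |
          set -e
          # Download the archive, unpack it into the shared volume, then clean up.
          curl -fsSL "$ASSETS_URL" -o /assets/assets.tar.gz
          tar -xzf /assets/assets.tar.gz -C /assets
          rm /assets/assets.tar.gz
      volumeMounts:
        - name: assets
          mountPath: /assets
  containers:
    - name: app
      image: registry.example.com/app:1.0           # placeholder application image
      volumeMounts:
        - name: assets
          mountPath: /data/assets
          readOnly: true                            # the application never writes to the assets
  volumes:
    - name: assets
      emptyDir: {}
```

The asset version is then selected by the object name in `ASSETS_URL`, independently of the application image tag. The trade-off is that every pod downloads its own copy at startup, which may matter for large asset sets.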


Context

StackExchange DevOps Q#11482, answer score: 1
