Can a Kubernetes pod be forced to stay alive after its main command fails?

Submitted by: @import:stackexchange-devops

Problem

After starting a long-running Kubernetes job, we found that it will fail to upload its final result to the destination. We would like to force the pod to stay alive after the main process fails so that we can exec in and process the final results manually. If the main process fails and the pod exits before the result is uploaded, we will lose a large amount of time re-running the job.

Is there a way to keep the pod alive so that we can intervene manually?

Solution

Here is a quick and dirty solution. I often add sleep 600 to the end of a command that is failing unexpectedly so that I can exec in and poke around to see what happened. YAML example (note: the >- after the list dash is a folded block scalar that joins the following lines into a single string; it is just a convenience):

command: ["bash", "-c"]
args:
  - >-
    python myscript.py; sleep 600;


With that said, Karthick's answer is better; this is just a simple solution that doesn't follow best practices. I encourage you not to rely on sleep, and especially not on sleep infinity, because it is very easy to forget that your pod is still running and holding onto resources indefinitely.
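
One possible refinement of the command above, offered as a sketch rather than a recommended practice: sleep only when the script fails, and then exit with the script's original exit code so the Job is still reported as failed. The Job name, image, and script path below are hypothetical placeholders, and the manifest assumes the image provides both bash and python.

```yaml
# Hypothetical Job manifest; name, image, and script path are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: long-running-job
spec:
  backoffLimit: 0            # do not retry the pod after a failure
  template:
    spec:
      restartPolicy: Never   # do not restart the container in place
      containers:
        - name: main
          image: python:3.12   # assumed image that ships bash and python
          command: ["bash", "-c"]
          args:
            - |
              python myscript.py
              rc=$?
              if [ "$rc" -ne 0 ]; then
                # Hold the container open for 10 minutes, on failure only.
                echo "myscript.py exited with $rc; sleeping 600s for debugging" >&2
                sleep 600
              fi
              # Propagate the script's exit code so the failure stays visible.
              exit "$rc"
```

During the sleep window, the pod can be located with kubectl get pods --selector=job-name=long-running-job and entered with kubectl exec -it <pod-name> -- bash. Because the sleep is bounded and only runs on failure, a forgotten pod releases its resources after ten minutes.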

Context

StackExchange DevOps Q#11732, answer score: 4
