Understanding Pod Termination Signals in Kubernetes

This article explores Kubernetes' signal mechanisms during Pod deletion, detailing the differences between SIGTERM/SIGKILL, container graceful/forced termination workflows, and Pod lifecycle management strategies.

This article was published 726 days ago, some content may be outdated. If you have any questions, please leave a comment.

Kubernetes Pod Deletion Lifecycle

Deleting a Pod in Kubernetes involves several steps and concepts. Containers are not deleted on their own: their lifecycle is managed by the Pod that contains them, and Pods are in turn usually managed by controllers (such as Deployments or ReplicaSets). You delete a Pod with the kubectl delete command, and what happens next depends on whether a controller owns that Pod.

Process of Deleting a Container

Assume you execute the following command to delete a Pod (each container in the Pod runs one or more processes):

kubectl delete pod <pod-name>  

When this command is executed, the following process occurs:

  • Replica Management: Controllers in Kubernetes (e.g., Deployment or ReplicaSet) monitor the number of Pod replicas and ensure the running Pod count matches the desired state. If the deleted Pod is owned by such a controller, the controller automatically creates a new Pod to replace it, whether the desired replica count is 1 or 3.

    If no controller manages the Pod (a standalone Pod created directly from a manifest), it is not recreated after deletion.

  • Container Termination: When you delete a Pod, the kubelet runs any configured PreStop hooks and then has the container runtime send a termination signal (SIGTERM by default) to the main process of each container in the Pod. This gives the containers a grace period (default: 30 seconds) to perform cleanup operations and release resources.

  • Forced Termination: If a container has not stopped when the grace period expires, its remaining processes are forcibly terminated (via SIGKILL).

  • Pod Deletion: Once the containers terminate, the Pod itself is removed from the cluster, and associated resources (e.g., volumes) are cleaned up or reattached, depending on the volume type.

  • Resource Cleanup: The terminated containers release their allocated resources (CPU, memory, network connections, etc.). Ephemeral volumes such as emptyDir are deleted together with the Pod. Data on a Persistent Volume survives Pod deletion; the volume’s reclaim policy (e.g., Retain or Delete) only takes effect when the PersistentVolumeClaim bound to it is deleted.
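
The SIGTERM, grace period, SIGKILL sequence described above can be reproduced on a single machine with ordinary processes. The sketch below is only an analogy: the kubelet does this through the container runtime, not through Python. The helper names (stop_with_grace, start_child) and the two child programs are hypothetical and exist only for illustration.

```python
import signal
import subprocess
import sys


def stop_with_grace(proc, grace_seconds=30.0):
    """Send SIGTERM, wait up to grace_seconds, then escalate to SIGKILL."""
    proc.send_signal(signal.SIGTERM)          # polite request, the process may handle it
    try:
        proc.wait(timeout=grace_seconds)      # the grace period
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()                           # SIGKILL on POSIX, cannot be handled
        proc.wait()                           # reap the process
        return "forced"


def start_child(code):
    """Start a Python child and block until it prints its first line."""
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()                    # the child prints 'ready' once set up
    return proc


# Keeps the default SIGTERM behavior, so it exits as soon as it is asked to.
COOPERATIVE = "import time; print('ready', flush=True); time.sleep(60)"

# Explicitly ignores SIGTERM, so only SIGKILL can stop it.
STUBBORN = ("import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); "
            "print('ready', flush=True); time.sleep(60)")

if __name__ == "__main__":
    print(stop_with_grace(start_child(COOPERATIVE), grace_seconds=2))
    print(stop_with_grace(start_child(STUBBORN), grace_seconds=2))
```

The cooperative child should be reported as graceful and the stubborn one as forced, which mirrors what happens to a container that ignores SIGTERM until its grace period runs out.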

Container Lifecycle Management

In Kubernetes, a container’s lifecycle is generally controlled by a Pod, while the Pod lifecycle itself is a process consisting of creation, execution, and termination phases.

Pod Lifecycle

  • Pending: The Pod has been accepted by the cluster, but one or more of its containers are not yet running. This covers both the time spent waiting to be scheduled onto a node and the time spent pulling container images.
  • Running: The Pod is bound to a node and all of its containers have been created. At least one container is running, starting, or restarting.
  • Succeeded: All containers have terminated successfully and will not be restarted, marking the end of the Pod’s lifecycle.
  • Failed: All containers have terminated, and at least one of them terminated in failure (e.g., by returning status code 1 or another non-zero exit status, or by being killed by the system) and will not be restarted.
  • Unknown: The Pod’s status cannot be retrieved, usually due to node failures or network issues between the control plane and the node.
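
To make the phase rules concrete, here is a deliberately simplified model that derives a phase from the states of a Pod's containers. It is a sketch, not how the kubelet computes the phase: it ignores init containers and the Unknown phase, and the ContainerState class and its will_restart flag are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerState:
    state: str                        # "waiting", "running", or "terminated"
    exit_code: Optional[int] = None   # only meaningful when terminated
    will_restart: bool = False        # hypothetical: restart policy will bring it back


def pod_phase(containers: List[ContainerState]) -> str:
    """Simplified phase derivation for a Pod that is already on a node."""
    # A running container, or one that is about to be restarted, keeps the Pod Running.
    if any(c.state == "running" or (c.state == "terminated" and c.will_restart)
           for c in containers):
        return "Running"
    # Everything has stopped for good: the exit codes decide between the two end phases.
    if all(c.state == "terminated" for c in containers):
        return "Succeeded" if all(c.exit_code == 0 for c in containers) else "Failed"
    # Otherwise some container has not started yet.
    return "Pending"


if __name__ == "__main__":
    print(pod_phase([ContainerState("waiting")]))
    print(pod_phase([ContainerState("terminated", 0), ContainerState("terminated", 1)]))
```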

Container Lifecycle

A container’s lifecycle is managed by its Pod, but you can customize lifecycle behaviors in Pod configurations using lifecycle hooks, including:

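  • PreStop: Invoked immediately before a container is terminated, and before SIGTERM is sent, to perform cleanup tasks such as closing network connections or saving temporary data. The time the hook takes counts against the grace period.
  • PostStart: Invoked immediately after the container is created to execute initialization tasks. It is not guaranteed to run before the container’s entrypoint.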

Lifecycle Example

  1. Pod Creation:
    • The Pod is created and scheduled onto a node.
    • The kubelet pulls the container images and starts the containers.
    • Right after a container is created, the PostStart hook (if configured) is executed.
  2. Pod Execution:
    • Containers actively provide services during runtime.
    • Resources (e.g., containers, volumes) remain active.
  3. Pod Deletion:
    • When deleting the Pod, Kubernetes runs the PreStop hook (if configured) and then sends a SIGTERM signal to the containers to request graceful termination.
    • If a container fails to exit within the defined grace period, Kubernetes forces termination via SIGKILL.
    • Post-deletion, Pod-associated resources (e.g., network, volumes) are cleaned up, and the containers/Pod enter a terminated state.
  4. Container Exit:
    • When a container exits, its exit status code is recorded. A status code of 0 indicates success; otherwise, it is marked as a failure.
    • Based on the Pod’s restart policy (e.g., Always), Kubernetes may restart a container that exits on its own. Containers of a Pod that is being deleted are not restarted.

Restart Policy and Replica Count

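  • Always (Default): Whenever a container exits, whether it crashed or completed normally, Kubernetes restarts it. This policy does not bring back a deleted Pod; replacing deleted Pods is the controller’s job.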
  • OnFailure: Containers are only restarted if they exit abnormally (return a non-zero exit code).
  • Never: Containers will not restart automatically and will remain permanently stopped after exiting.
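
The three policies boil down to a small decision on the container's exit code. The function below is a minimal sketch of that decision only (the name should_restart is made up for this example); it leaves out the back-off delay that Kubernetes applies between restarts.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether an exited container is restarted under a restartPolicy."""
    if policy == "Always":
        return True                    # restart regardless of how it exited
    if policy == "OnFailure":
        return exit_code != 0          # restart only after an abnormal exit
    if policy == "Never":
        return False                   # leave it stopped
    raise ValueError(f"unknown restartPolicy: {policy!r}")


if __name__ == "__main__":
    for policy in ("Always", "OnFailure", "Never"):
        print(policy, should_restart(policy, 0), should_restart(policy, 1))
```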

Summary

When you delete a Pod, Kubernetes manages the Pod and container lifecycle through the following steps:

  1. Graceful Container Shutdown: Sends a SIGTERM signal to allow graceful termination (default 30 seconds).
  2. Forced Termination: If the container does not exit within the specified time, a SIGKILL signal is sent to forcefully terminate it.
  3. Pod Deletion: After the container stops, the Pod resource is cleaned up.
  4. Controller Behavior: If the Pod is managed by a Deployment or ReplicaSet, the controller ensures the Pod count matches the desired state, creating new Pods as needed to replace deleted ones.

The controller does this for any desired replica count, including 1. Only a standalone Pod that no controller manages stays deleted until you recreate it.

Graceful Termination vs. Forced Termination of Containers

In Kubernetes, container termination has two main phases: Graceful Termination and Forced Termination. These phases differ in how Kubernetes handles the exit process and whether the container has an opportunity to perform cleanup operations.

Container Graceful Termination

When you delete a Pod, Kubernetes first attempts to terminate its containers gracefully, which is known as Graceful Termination.

  • Sending SIGTERM Signal: When Kubernetes requests container termination, it sends a SIGTERM (termination signal). The container can catch this signal and begin normal shutdown operations.

    Upon receiving SIGTERM, the container can perform actions such as:

    • Closing network connections
    • Cleaning up occupied resources (e.g., file handles, temporary data)
    • Executing custom shutdown logic (e.g., database commits, logging)
  • Grace Period: The container is granted a “Grace Period” to complete cleanup tasks. The countdown starts when termination begins, so time spent in a PreStop hook is included. By default, this period is 30 seconds, which can be adjusted through the terminationGracePeriodSeconds field in the Pod manifest.

  • Graceful Exit: If the container exits within the grace period (completes cleanup and exits), Kubernetes records the container state as Terminated together with the exit code the process returned. The Pod is then removed, and a controller that owns it creates a replacement if needed.

Forced Termination of Containers

If containers fail to exit normally within the grace period, Kubernetes initiates forced termination. This occurs when containers do not complete cleanup operations or do not respond to termination requests, prompting Kubernetes to take stricter measures to terminate them.

  • Sending SIGKILL Signal: If a container does not terminate gracefully within the time specified by terminationGracePeriodSeconds, Kubernetes sends a SIGKILL (force-kill signal) to stop the container immediately. Unlike SIGTERM, the SIGKILL signal cannot be captured or handled by the container, preventing any cleanup operations from being executed.
  • Immediate Termination: Upon receiving SIGKILL, the container is halted abruptly, and all running processes are killed. The kernel still reclaims memory, file descriptors, and sockets, but the processes have no chance to run their own shutdown logic.
  • Inability to Clean Up Application State: Forcibly terminated containers cannot perform any cleanup operations, which can leave behind temporary files, unflushed buffers, stale lock files, or half-finished database transactions.

Differences Between Graceful Termination and Forced Termination of Containers

| Feature | Graceful Termination | Forced Termination |
| --- | --- | --- |
| Signal | SIGTERM | SIGKILL |
| Cleanup Time | Allows cleanup (default: 30 seconds, configurable) | No cleanup time, immediate kill |
| Termination Process | Container can catch SIGTERM for cleanup | Container cannot catch SIGKILL |
| Resource Release | Application releases its resources itself (file handles, DB connections) | Kernel reclaims memory and descriptors; application-level state may be left behind |
| Container State | Terminated, with the exit code the process chose | Terminated, typically with exit code 137 (128 + 9) |
| Trigger Scenario | Normal shutdown (e.g., deletion request) | Unresponsive or grace period expired |
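
After the fact, the exit code is often the quickest hint as to which path a container took. By shell convention, a process killed by signal N is reported with exit code 128 + N, so 137 usually points to SIGKILL and 143 to an unhandled SIGTERM. The helper below is a small sketch of that convention (describe_exit_code is a name invented here); an application that handles SIGTERM is free to exit with any code it likes, so treat the result as a hint rather than proof.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the 128 + signal-number convention."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        try:
            name = signal.Signals(code - 128).name   # e.g. 9 -> SIGKILL
        except ValueError:
            return f"exit code {code}"               # not a known signal number
        return f"terminated by {name}"
    return f"exited with error code {code}"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", describe_exit_code(code))
```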

Configuring Container Termination Behavior in Pod Specifications

To control container termination behavior, configure the terminationGracePeriodSeconds field in the Pod specification. If the field is omitted, the default of 30 seconds applies.

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  terminationGracePeriodSeconds: 60  # Set grace period to 60 seconds
  containers:
    - name: mycontainer
      image: myimage

Container Lifecycle Hooks

Kubernetes also provides lifecycle hooks that allow you to insert custom behaviors during a container’s startup and shutdown processes, particularly executing additional operations when the container terminates:

  • PreStop: Executes before SIGTERM is sent to the container. You can use this hook to perform cleanup tasks, send notifications, or other operations, for example deregistering from external services or closing database connections. The hook runs inside the grace period, so it has to finish well before the period ends.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: mycontainer
      image: myimage
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "echo 'Container is stopping' > /tmp/shutdown.log"]
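
Because a PreStop hook runs inside the grace period rather than in addition to it, a slow hook leaves less time for the application's own SIGTERM handling. A quick back-of-the-envelope check like the following can help when choosing terminationGracePeriodSeconds. The function names and the example durations are assumptions made for illustration, not values Kubernetes reports.

```python
def remaining_after_prestop(grace_period_seconds: float, pre_stop_seconds: float) -> float:
    """Time left for SIGTERM handling once the PreStop hook has finished."""
    return max(0.0, grace_period_seconds - pre_stop_seconds)


def fits_in_grace_period(grace_period_seconds: float,
                         pre_stop_seconds: float,
                         shutdown_seconds: float) -> bool:
    """True if hook time plus application shutdown time fit before SIGKILL."""
    return pre_stop_seconds + shutdown_seconds <= grace_period_seconds


if __name__ == "__main__":
    # Hypothetical workload: a 10 s hook and a 15 s application shutdown.
    print(remaining_after_prestop(30, 10))
    print(fits_in_grace_period(30, 10, 15))
    print(fits_in_grace_period(30, 20, 15))
```

If the check fails for realistic numbers, either shorten the hook or raise the grace period in the Pod specification.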

Summary

  • Graceful Container Termination: Kubernetes sends a SIGTERM signal to request the container to exit gracefully. The container has an opportunity to perform cleanup operations and release resources, typically completing within the grace period.
  • Forced Container Termination: If the container fails to exit normally within the grace period, Kubernetes sends a SIGKILL signal to forcibly terminate the container. The container cannot perform any cleanup and is immediately stopped, freeing up resources.

By appropriately configuring a Pod’s grace period and lifecycle hooks, you can control container termination behaviors, ensuring minimized resource leaks and service disruptions during container exits.

SIGTERM and SIGKILL

SIGTERM and SIGKILL are two commonly used Unix/Linux signals. Both are used to send termination requests to processes, but they differ in their purpose, behavior, and consequences. Below is a detailed explanation of these two signals:

SIGTERM (Signal Terminate)

  • Signal Number: 15

  • Action: SIGTERM is a signal requesting a process to terminate. It notifies the process to exit gracefully, allowing time to complete cleanup operations, save state, and release resources. It is the signal that the kill command sends by default.

  • Sending Method: You can send SIGTERM using the kill command or programmatically via kill(pid, SIGTERM).

    Example:

  kill <pid>  # Sends SIGTERM by default
  • Characteristics:

    • SIGTERM enables graceful exit and can be caught and handled by the process.
    • Upon receiving SIGTERM, a process may perform cleanup tasks such as closing database connections, saving temporary data, or releasing file handles.
    • If the process handles SIGTERM, it can execute cleanup logic before terminating.
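    • If the process does not respond to SIGTERM, the operating system does not escalate on its own. Escalation to SIGKILL is done by whatever supervises the process, such as the kubelet (after the grace period, 30 seconds by default) or docker stop (after 10 seconds by default).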
  • Handling Behavior:

    • A process may choose to catch and handle SIGTERM, for example, by registering a signal handler function.
    • If the process ignores SIGTERM and fails to exit within the supervisor’s timeout, the supervisor follows up with SIGKILL.
  • Use Cases:

    • Used when allowing cleanup operations, e.g., closing network connections, saving data, or logging shutdown events.
    • In Kubernetes or Docker, SIGTERM is sent first when deleting a container to request a graceful shutdown.

SIGKILL (Signal Kill) - Forced Termination Signal

  • Signal Number: 9

  • Function: SIGKILL is a signal that forces immediate termination of a process. It instructs the operating system to stop the target process without allowing any cleanup operations. This signal cannot be caught, ignored, or handled by the process.

    Example:

  kill -9 <pid>  # Send SIGKILL signal
  • Characteristics:

    • SIGKILL is a mandatory termination signal; the targeted process cannot perform any cleanup operations.
    • The operating system terminates the process immediately and releases its occupied resources, regardless of the process’s state.
    • Processes cannot intercept or ignore SIGKILL; it removes the process from memory instantly.
    • Upon receiving SIGKILL, the process loses the opportunity to flush buffers, write final log entries, or run other cleanup code; the kernel closes its files and frees its memory on its behalf.
  • Use Cases:

    • Use SIGKILL to forcibly terminate a process when it does not respond to SIGTERM or is hung (e.g., frozen or stuck in an infinite loop).
    • In operating systems, SIGKILL is employed to terminate unresponsive processes. It has no effect on zombie processes, which have already exited and only disappear once their parent reaps them.

Comparison: SIGTERM vs. SIGKILL

| Characteristic | SIGTERM (15) | SIGKILL (9) |
| --- | --- | --- |
| Function | Requests graceful termination, allowing the process to perform cleanup tasks | Forces immediate process termination; cannot be caught or handled |
| Catchable | Can be caught by the process; supports custom termination logic | Cannot be caught or handled by the process |
| Graceful Exit | Permits resource cleanup (e.g., closing files, saving data) | Instantly terminates the process without cleanup |
| Ignorable | Can be ignored if the process explicitly chooses to ignore it | Cannot be ignored; enforced termination |
| Default Behavior | Terminates the process when no handler is installed | Immediately kills the process without delay |
| Use Cases | Safe process shutdown requiring resource release or state preservation | Terminating unresponsive processes or following up on an ignored SIGTERM |
| OS Behavior | OS delivers the signal and leaves the response to the process; any timeout is enforced by a supervisor | OS forcibly terminates the process and releases all resources |
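
The "catchable" row of the comparison is easy to verify from Python on Linux: installing a handler for SIGTERM succeeds, while the kernel rejects any attempt to do the same for SIGKILL. The probe function below is a sketch written for this article (can_install_handler is not a standard API) and has to run in the main thread, where Python allows signal handlers to be set.

```python
import signal


def can_install_handler(signum) -> bool:
    """Try to install a no-op handler for signum and report whether it worked."""
    try:
        previous = signal.signal(signum, lambda s, frame: None)
    except OSError:
        return False                      # the kernel refuses: signal is uncatchable
    if previous is not None:
        signal.signal(signum, previous)   # put the original handler back
    return True


if __name__ == "__main__":
    for sig in (signal.SIGTERM, signal.SIGKILL):
        print(sig.name, "catchable:", can_install_handler(sig))
```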

SIGTERM and SIGKILL in Kubernetes

In Kubernetes, when you delete a Pod, Kubernetes first sends a SIGTERM signal to each container, requesting its main process to exit gracefully. The process has a specific period (30 seconds by default) to respond to SIGTERM and perform cleanup operations. If the container has not exited when this grace period ends, Kubernetes sends SIGKILL to forcibly terminate it. The aim is a graceful shutdown where possible, with forced termination as the fallback for unresponsive processes.

SIGTERM Signal

  • Sent to: The main process within the container.

  • Behavior: When the container runs, there is a main process (typically the process defined by CMD or ENTRYPOINT at container startup). When Kubernetes requests to stop the container, it sends a SIGTERM signal to the container’s main process. SIGTERM is a graceful termination signal, meaning the process should handle it by performing cleanup tasks and exiting properly.

  • Behavior inside the container: The main process has the opportunity to capture the SIGTERM signal and execute cleanup operations within the allotted time, such as:

    • Closing database connections.
    • Saving log files or persistent data.
    • Cleaning up temporary files or caches.

    The process may catch the signal and execute custom termination logic. A PreStop lifecycle hook, if configured, is not triggered by SIGTERM: it runs first, and SIGTERM is sent only after the hook completes.

  • Grace Period: Kubernetes provides a default of 30 seconds (configurable) for the container’s main process to shut down gracefully. If the process does not exit within this timeframe, Kubernetes will send a SIGKILL signal to forcibly terminate the container.

SIGKILL Signal

  • Target: Processes within the container (main process or any other processes)
  • Behavior: SIGKILL is an uncatchable and unignorable signal. It directly terminates processes in the container without allowing any cleanup operations. When a container fails to exit gracefully during the grace period after receiving SIGTERM, Kubernetes sends SIGKILL to forcefully terminate container processes.
    • Process Behavior: Upon receiving SIGKILL, container processes terminate immediately without performing any cleanup. All file handles, network connections, database transactions may be abruptly interrupted, potentially leaving resources not properly released.
  • Container Termination: With processes forcibly terminated, the container ends its lifecycle. If the Pod is owned by a controller (e.g., Deployment or ReplicaSet), the controller then creates a replacement Pod to restore the desired replica count.

Relationship Between Containers and Processes

  • In Kubernetes, a container essentially serves as a runtime environment encapsulating one or multiple processes (typically a single main process). Discussions about container startup, shutdown, or restart fundamentally refer to managing the lifecycle of processes within the container.
  • When Kubernetes sends SIGTERM or SIGKILL, these signals are directed to processes within the container rather than the container itself. A container is not a process that can receive signals; it is an isolated runtime environment, and the processes running inside it are the entities that handle these signals.
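
Since the signal goes to the container's main process, a wrapper that launches the real workload as a child has to pass the signal on, or the workload never hears about the shutdown. The following is a minimal sketch of such a forwarding wrapper, assuming a Linux environment; run_and_forward is a hypothetical helper, not part of Kubernetes or any container runtime.

```python
import os
import signal
import subprocess
import sys


def run_and_forward(cmd):
    """Run cmd as a child process and relay SIGTERM/SIGINT to it."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        try:
            os.kill(child.pid, signum)    # hand the signal to the real workload
        except ProcessLookupError:
            pass                          # the child has already exited

    watched = (signal.SIGTERM, signal.SIGINT)
    previous = {sig: signal.signal(sig, forward) for sig in watched}
    try:
        return child.wait()               # negative value means killed by that signal
    finally:
        for sig, handler in previous.items():
            # Restore whatever was installed before we started.
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

A wrapper script could call run_and_forward with the workload's command line and exit once it returns. In practice, many images avoid the problem by running the workload directly as the main process instead.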

How Processes in Containers Respond to SIGTERM and SIGKILL

  • Response to SIGTERM: Processes within a container can capture the SIGTERM signal. The container’s main process may implement code logic to handle this signal for cleanup operations or resource release.

    For example, if the container runs an HTTP server process, it might stop accepting new requests and gracefully complete ongoing requests upon receiving SIGTERM, then exit.

    Example code (Python):

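  import signal
  import sys
  import time
  
  def graceful_shutdown(signum, frame):
      print("Received SIGTERM, shutting down gracefully...")
      # Perform cleanup operations, such as closing database connections, flushing caches, etc.
      time.sleep(2)  # Simulate cleanup tasks
      sys.exit(0)  # Exit with status 0; sys.exit works even where the interactive exit() helper is unavailable
  
  # Register the handler so SIGTERM runs graceful_shutdown instead of killing the process outright
  signal.signal(signal.SIGTERM, graceful_shutdown)
  
  print("Running... Send SIGTERM (kill <pid>) to shut down gracefully.")
  while True:
      time.sleep(1)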
  • Response to SIGKILL: Processes in containers cannot capture or handle the SIGKILL signal. The SIGKILL signal terminates the process directly, causing the process to exit immediately without any opportunity for cleanup.

    When SIGKILL is received, the process is terminated abruptly by the system, leaving no time to save state or release resources. Recovery from this termination is not possible.
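
For the HTTP-server style of shutdown mentioned above (stop taking new work, finish what is in flight, then exit), a common alternative to doing the cleanup inside the handler is to let the handler only set a flag and have the main loop check it between units of work. The sketch below illustrates that pattern with a plain job list standing in for incoming requests; request_stop, serve, and the job list are all made-up names for this example.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: only record that shutdown was requested."""
    stop_requested.set()


def serve(jobs, handle):
    """Process jobs until none are left or a stop is requested.

    The job being handled is always finished; the unprocessed rest is returned.
    """
    pending = list(jobs)
    while pending and not stop_requested.is_set():
        handle(pending.pop(0))            # finish the in-flight job first
    return pending


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, request_stop)
    leftover = serve(range(5), print)
    print("unprocessed jobs:", leftover)
```

Keeping the handler this small avoids running long cleanup code at an arbitrary point in the program, at the cost of reacting only when the loop next checks the flag.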

Summary

  • SIGTERM: A graceful termination request allowing processes to clean up resources and exit properly. If the process does not terminate within the specified time limit, the supervisor that sent it (for example, the kubelet) follows up with SIGKILL.
  • SIGKILL: A forced termination signal that cannot be intercepted or handled by the process. The process is immediately terminated without any cleanup.

These two signals are commonly used in Unix/Linux system management. Understanding their roles and behaviors helps in effectively managing process lifecycles, particularly in containerized environments (e.g., Kubernetes) and automated operations.

SIGTERM and SIGKILL signals in Kubernetes are ultimately sent to the processes running inside the container, not directly to the container itself. A container is a runtime environment containing one or more processes. Therefore, these signals target the actual processes within the container.

  • SIGTERM is sent to processes inside the container to allow graceful exit and cleanup. If processes fail to terminate within the grace period (default 30 seconds), Kubernetes follows up with SIGKILL.
  • SIGKILL (sent to container processes) cannot be caught or ignored. Processes are forcibly terminated immediately, bypassing cleanup routines.

In summary, a container is the runtime environment for its processes, and how a container shuts down depends on how those processes respond to these termination signals.
