Questions: What process-control capabilities does Kubernetes have here, and how do they work? Does it offer things like the "restart n times then alert" in Alex's list? What status reporting facilities does it have?

Pod Phases:
Pending - The Pod has been accepted by the Kubernetes cluster, but one or more of the containers has not been set up and made ready to run. This includes time a Pod spends waiting to be scheduled as well as the time spent downloading container images over the network.
Running - The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting.
Succeeded - All containers in the Pod have terminated in success, and will not be restarted.
Failed - All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system.
Unknown - For some reason the state of the Pod could not be obtained. This phase typically occurs due to an error in communicating with the node where the Pod should be running.

Restarts:
restartPolicy is a field of the PodSpec (found in the .yaml). It can take the values Always (the default), OnFailure and Never. When containers in a Pod exit, the kubelet restarts them with an exponentially increasing delay capped at 300s. The delay resets once the container has run for 10 minutes uninterrupted. Another useful field is backoffLimit, which belongs to the Job spec rather than the PodSpec and takes an integer value. After this many retries, the Job is marked as failed. (Open question: then what?)

Errors:
When a Pod fails, the Job controller creates a new one. Applications should therefore know what to do when they are restarted in a new Pod. A container can write a termination message to a file (by default /dev/termination-log); Kubernetes surfaces this message in the container's status, visible with kubectl describe pod. The developer can customise the message and change the file location with terminationMessagePath. Ordinary container output (stdout and stderr) is what kubectl logs shows.
KUBERNETES DOES NOT DIRECTLY SUPPORT CLUSTER-LEVEL LOGGING. However, there are workarounds.
For example, a node-level logging agent can be run on each node, which exposes or pushes all logs to a backend. It is important to note that applications cannot be

Liveness Probes:
These are used to check whether a container is functioning correctly, or whether it needs to be restarted. They are run by the kubelet. For example, a command-based liveness probe can check whether the file /tmp/healthy exists in the container. If it does not, the kubelet kills the container and restarts it according to the Pod's restartPolicy. Other types of liveness probe use HTTP and TCP. These are for servers running inside the container: the kubelet tries to connect to the server instead of checking for a file.
Readiness probes work similarly, except that they check whether a container is ready to serve traffic, rather than whether it is broken. A Pod whose containers fail their readiness probes will not receive traffic through Services.

Container Probes:
A probe is a diagnostic that the kubelet runs against a container. There are three kinds.
ExecAction - Executes a command inside the container, and is successful if the command exits with status 0.
TCPSocketAction - Performs a TCP check against the Pod's IP address on a given port. It succeeds if the port is open.
HTTPGetAction - Performs an HTTP GET request against the Pod's IP address on a given port and path. It succeeds if the response has 200 <= status < 400.
Each probe returns Success, Failure, or Unknown (Unknown means the diagnostic itself failed).
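
Sketch:
Two of the rules in these notes are simple enough to write down as code: the restart back-off (exponential delay capped at 300s) and the HTTPGetAction success rule (200 <= status < 400). The Python below is only an illustration of those rules, not anything Kubernetes itself runs. The function names are made up for this sketch, and the 10s starting delay is an assumption that does not come from the notes above.

```python
from itertools import islice


def restart_delays(base=10, cap=300):
    """Yield successive restart delays in seconds, doubling up to `cap`.

    `base=10` is an assumed starting delay; `cap=300` is the 300s limit
    mentioned under Restarts.
    """
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)


def http_probe_result(status):
    """Classify an HTTP probe outcome from its status code.

    `status=None` stands in for the case where the diagnostic itself
    failed and no status code was obtained.
    """
    if status is None:
        return "Unknown"
    return "Success" if 200 <= status < 400 else "Failure"


# First seven delays under the assumptions above.
print(list(islice(restart_delays(), 7)))  # [10, 20, 40, 80, 160, 300, 300]
print(http_probe_result(200), http_probe_result(404), http_probe_result(None))
```

Under these assumptions the delay hits the cap on the sixth restart and stays there until the container has run long enough for the delay to reset.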