DevOps, SRE and Platform Engineering

CI/CD

  • 1.What’s CI/CD?
  • 2.Explain the phases of CI/CD
  • 3.Have you ever used GitHub Actions? How does it work?
  • 4.Have you ever used GitLab? How does it differ from GitHub?

Docker and Containerization

  • 1.What’s the difference between a virtual machine and a container runtime?
  • 2.How does containerization work?
  • 3.Is Docker the only way to create a container?
  • 4.What is the purpose and power of an image?
  • 5.What is the registry and how is it used in containerization?
  • 6.What’s a Dockerfile?
  • 7.What is the first line of a Dockerfile?
  • 8.How do you attach a volume in a Docker container?
  • 9.How do you manage or orchestrate multiple containers?
  • 10.What’s the purpose of Docker Compose?
  • 11.What are the best practices for building a good Dockerfile?
  • 12.Why are too many lines bad for a Dockerfile?

Cloud

  • 1.What is cloud computing?
  • 2.Define public, hybrid, and private cloud and give a use case for each
  • 3.What is serverless?
  • 4.If I say edge computing, what do I mean?
  • 5.List some CNCF (Cloud Native Computing Foundation) graduated projects or projects you like

Site Reliability Engineering

  • 1.What is a receiver in Alertmanager?
  • 2.What happens when you execute a `curl` command on a Prometheus exporter endpoint?
  • 3.Give me an example of an OpenMetrics payload
  • 4.What is a Service Level Objective (SLO)?
  • 5.What is a Service Level Indicator (SLI)?
  • 6.What is a Service Level Agreement (SLA)?
  • 7.If I say "Error Budget", what do I mean? If the SLA is 99.9%, what is the error budget?
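
For the error budget arithmetic in question 7, a minimal sketch may help. It assumes the 99.9% target is measured as availability over a 30-day window. The window length and the function name `error_budget_minutes` are illustrative assumptions, not part of the question.

```python
def error_budget_minutes(target_percent: float, window_minutes: float) -> float:
    """Return the downtime (in minutes) allowed by an availability target.

    The error budget is the complement of the target: a 99.9% target
    leaves 0.1% of the window available for failures.
    """
    return (1 - target_percent / 100) * window_minutes


# Assumption: a 30-day measurement window.
MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes

budget = error_budget_minutes(99.9, MINUTES_PER_30_DAYS)
print(f"Allowed downtime: {budget:.1f} minutes")  # Allowed downtime: 43.2 minutes
```

Under these assumptions the budget is 0.1% of the window, roughly 43 minutes per 30 days. A different window or a request-based indicator would give a different absolute number with the same 0.1% ratio.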

Terraform and Infrastructure as Code

  • 1.What is Infrastructure as Code and why is it important in modern cloud development?
  • 2.Explain the following commands: terraform init, terraform plan, terraform apply
  • 3.What is the purpose of the Terraform state file?
  • 4.What is Terraform's remote state?
  • 5.Where should it be stored, for example, in AWS as a cloud provider?
  • 6.What is a Terraform provider and what’s the difference between a provider and a resource?
  • 7.What is the way of reusing Terraform templates?
  • 8.How can you use Terraform variables?
  • 9.Which major cloud providers can Terraform work with?
  • 10.How does Terraform handle updates to existing resources?
  • 11.How does Terraform handle dependencies between resources?

Kubernetes, Helm, and Container Orchestration

  • 1.What is Kubernetes, and what are some of its main features and benefits?
  • 2.What are the components of a Kubernetes cluster, and how do they interact with each other?
  • 3.How can you interact with the Kubernetes API?
  • 4.Explain these commands: k create, k get, k describe, k delete, k apply, k logs, k exec, scale, rollout
  • 5.How do you create a Kubernetes deployment, and what are some best practices to follow when defining deployments?
  • 6.What is an ImagePullPolicy and what values can this parameter hold?
  • 7.What is a Kubernetes port forward? Why do you use that? What’s the command to perform that?
  • 8.NodePort vs LoadBalancer?
  • 9.What is the role of the Scheduler?
  • 10.Explain node affinity, node taints, node selectors, and pod priority.
  • 11.What are Kubernetes pods, and how do they relate to containers?
  • 12.Explain the Pod lifecycle
  • 13.What is the CrashLoopBackOff state?
  • 14.Explain the following command
  • 15.What’s the difference between resource request and resource limit in Kubernetes?
  • 16.When is a Pod evicted?
  • 17.How do Kubernetes health checks work?
  • 18.What is a livenessProbe?
  • 19.What is a readinessProbe?
  • 20.What is a startupProbe?
  • 21.How many kinds of probe types are there?
  • 22.NodeAffinity vs PodAffinity
  • 23.What is a Pod Disruption Budget?
  • 24.What is a Kubernetes volume?
  • 25.How is a Kubernetes volume different from a container's file system?
  • 26.What are some types of Kubernetes volumes?
  • 27.How do you define a Kubernetes volume in a Pod's YAML configuration?
  • 28.How do you mount a Kubernetes volume to a container?
  • 29.Can multiple containers in a Pod share the same volume?
  • 30.What is a Persistent Volume (PV) in Kubernetes?
  • 31.How do you define a Persistent Volume in Kubernetes?
  • 32.How do you claim a Persistent Volume in Kubernetes (PVC)?
  • 33.How do you use a Persistent Volume in a Pod?
  • 34.What is a StorageClass and how does that work?
  • 35.What are the benefits of StorageClass?
  • 36.What is a Kubernetes namespace, and how can it be used to manage resources in a multi-tenant environment?
  • 37.What are Kubernetes Services, and how do they enable application discovery and load balancing?
  • 38.What is an Ingress and how does it work?
  • 39.What is a Kubernetes Controller and how does it work?
  • 40.Explain the difference between ReplicaSet controller, Deployment controller, StatefulSet controller, and DaemonSet controller
  • 41.What is a secret, and how can it be used to manage sensitive information like passwords and API keys?
  • 42.How do you scale a Kubernetes deployment, and what factors should you consider when determining the optimal number of replicas?
  • 43.Which metrics are available natively to trigger the creation of a new pod?
  • 44.What kind of autoscaling is Kubernetes capable of?
  • 45.What is HPA and how does it work?
  • 46.What is Cluster Autoscaling and how does it work?
  • 47.What is a ConfigMap?
  • 48.What’s a Kubernetes Operator?
  • 49.Explain RBAC - Role Based Access Control
  • 50.What is the difference between Role and RoleBinding?
  • 51.Role vs ClusterRole, explain the difference
  • 52.What are some best practices for monitoring and logging Kubernetes clusters and applications running on them?
  • 53.Do you have any experience with Grafana+Prometheus, New Relic, Datadog, Dynatrace, or other similar products?
  • 54.How would you decide to separate Kubernetes clusters in an organisation? In what conditions would you have a single cluster?
  • 55.How do you manage multi-tenancy on Kubernetes in general and in terms of billing?
  • 56.How would you set up a highly available (HA) Kubernetes cluster?
  • 57.How would you manage etcd? Stacked or unstacked?
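
For question 45, the core scaling rule documented for the Horizontal Pod Autoscaler can be sketched in a few lines. This is a simplified illustration, not the controller's implementation: it leaves out the tolerance band, stabilization windows, and handling of pods that are not ready. The function name and the `min_replicas`/`max_replicas` defaults are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # hypothetical default for this sketch
    max_replicas: int = 10,   # hypothetical default for this sketch
) -> int:
    """Apply the documented HPA ratio, then clamp to the configured bounds.

    desired = ceil(current_replicas * current_metric / target_metric)
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods averaging 90% CPU against a 60% target scale out to 6 pods.
print(desired_replicas(4, 90, 60))  # 6
# 4 pods averaging 30% CPU against a 60% target scale in to 2 pods.
print(desired_replicas(4, 30, 60))  # 2
```

The ratio form shows why the replica count grows in proportion to how far the observed metric sits above the target, and why the bounds matter when the metric spikes.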

Helm

  • 1.What is Helm?
  • 2.Explain helm create, helm package, and helm install commands
  • 3.What does this `replicas: {{ .Values.replicaCount }}` mean?
  • 4.How does Helm templating work?
  • 5.What is Jinja?
  • 6.How can you see the release history in Helm?
  • 7.Describe how to roll back to the previous version in Helm

Platform Engineering

  • 1.What is Platform Engineering?
  • 2.What is the difference between DevOps, SRE and Platform Engineering?
  • 3.What is the tooling you use for Platform Engineering?
  • 4.Should I adopt Platform Engineering if I have one product and one team?

OpenShift

  • 1.Do you have any experience with OpenShift Container Platform?
  • 2.What is OpenShift?

Ansible

  • 1.What is the difference between Playbook and Role?
  • 2.How do you debug a Playbook?
  • 3.How would you use Ansible to automate the deployment of a web application?

AWS

  • 1.Define these AWS Services: EC2, S3, Route 53, Lambda, IAM
  • 2.What kind of different Load Balancers does AWS offer?
  • 3.Define these networking resources in AWS: VPC, ACL, NACL
  • 4.What’s the difference between EKS, ECS, and ECR?
  • 5.What is CloudFormation and how does it work?

Azure

  • 1.How would you deploy a web application to Azure App Service?
  • 2.Can you explain the difference between Azure Virtual Machines and Azure Kubernetes Service (AKS), and when you might choose one over the other?
  • 3.How do you configure Azure Active Directory for use in a single-sign-on (SSO) scenario?