14 Essential DevOps Interview Questions *

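The essential questions that the best DevOps engineers can answer. Driven by our community, we encourage experts to submit questions and offer feedback.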

Interview Questions

1.

What are some challenges when creating DevOps pipelines?

Database migrations and new features are common challenges increasing the complexity of DevOps pipelines.

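Feature flags are a common way of handling incremental product releases in CI environments.

If a database migration fails but was run as a scheduled job, the system may now be in an unusable state. There are several ways to prevent and mitigate such problems:

  1. The deployment is triggered in multiple steps. The first step in the pipeline starts the build process of the application. The migrations run in the application context. If the migrations are successful, they trigger the deployment pipeline; otherwise the application is not deployed.
  2. Define a convention that all migrations must be backward compatible. In this case, all features are implemented using feature flags. Application rollbacks are therefore independent of the database.
  3. Create a Docker-based application that creates an isolated production mirror from scratch on every deployment. Integration tests run on this production mirror without the risk of breaking any critical infrastructure.

It is always recommended to use database migration tools that support rollbacks.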
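
The first approach, where a deployment only happens after the migrations succeed, can be sketched in a few lines of Python. This is a minimal illustration rather than a real pipeline: `migrate_then_deploy` and the `(up, down)` callables are hypothetical names standing in for whatever migration tool and deploy step a team actually uses.

```python
from typing import Callable, List, Tuple

# A migration is a pair of callables: one applies it, one undoes it.
Migration = Tuple[Callable[[], None], Callable[[], None]]


def migrate_then_deploy(migrations: List[Migration],
                        deploy: Callable[[], None]) -> bool:
    """Run migrations first; deploy only if every migration succeeds."""
    applied: List[Migration] = []
    for up, down in migrations:
        try:
            up()
        except Exception:
            # Undo the migrations that did succeed, newest first,
            # and skip the deployment entirely.
            for _, undo in reversed(applied):
                undo()
            return False
        applied.append((up, down))
    deploy()
    return True
```

Because every migration carries its own undo step, a failure leaves the database where it started, which is the property a migration tool with rollback support is meant to provide.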

2.

How do Kubernetes containers communicate?

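A Pod groups one or more containers in Kubernetes. Containers in the same Pod share a network namespace, so they can reach each other over localhost. Pods have a flat network hierarchy inside an overlay network and communicate with each other in a flat fashion, meaning that in theory any Pod inside that overlay network can talk to any other Pod.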

3.

How do you restrict the communication between Kubernetes Pods?

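If the CNI network plugin you use supports the Kubernetes network policy API, Kubernetes allows you to specify network policies that restrict network access.

Policies can restrict traffic based on IP addresses, ports, and/or selectors. (Selectors are a Kubernetes-specific feature that allow connecting and associating rules or components with each other. For example, you can connect specific volumes to specific Pods based on labels by leveraging selectors.)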
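
As a rough illustration, the sketch below builds a `NetworkPolicy` manifest as a Python dictionary and shows how a `matchLabels` selector decides which Pods a rule applies to. The policy name, labels, and port in the usage note are made up for the example, and `selector_matches` only mimics the simple equality form of label selection.

```python
import json
from typing import Dict


def build_network_policy(name: str, target: Dict[str, str],
                         allowed_from: Dict[str, str], port: int) -> dict:
    """Build a NetworkPolicy manifest that only admits labelled Pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # The Pods this policy applies to.
            "podSelector": {"matchLabels": target},
            "policyTypes": ["Ingress"],
            "ingress": [{
                # Only Pods carrying these labels may connect ...
                "from": [{"podSelector": {"matchLabels": allowed_from}}],
                # ... and only on this TCP port.
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


def selector_matches(match_labels: Dict[str, str],
                     pod_labels: Dict[str, str]) -> bool:
    """A matchLabels selector matches when every listed label is present and equal."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def to_manifest(policy: dict) -> str:
    """Serialize the policy as JSON."""
    return json.dumps(policy, indent=2)
```

For example, `build_network_policy("db-allow-api", {"app": "db"}, {"app": "api"}, 5432)` describes a policy under which only Pods labelled `app=api` may reach the `app=db` Pods on port 5432. The policy has an effect only when the cluster's CNI plugin enforces network policies.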

4.

What is a Virtual Private Cloud or VNet?

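Cloud providers allow fine-grained control over the network plane for isolation of components and resources. In general, the usage concepts of the cloud providers have a lot in common. But as you go into the details, there are some fundamental differences in how the various cloud providers handle this segregation.

In Azure this is called a Virtual Network (VNet), while AWS and Google Cloud call it a Virtual Private Cloud (VPC).

These technologies segregate networks into subnets and use IP addresses that are not globally routable.

Routing differs between these technologies. While customers have to specify route tables themselves in AWS, all resources in Azure VNets allow the flow of traffic using system routes.

Security policies also differ notably between the cloud providers.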
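
To make "not globally routable" concrete, the short sketch below uses Python's standard `ipaddress` module to check whether a CIDR block lies in address space reserved for private use. The helper name `is_private_cidr` is an assumption made for this example.

```python
import ipaddress


def is_private_cidr(cidr: str) -> bool:
    """Return True when the whole block lies in address space reserved for private use."""
    return ipaddress.ip_network(cidr).is_private


if __name__ == "__main__":
    for block in ("10.0.0.0/16", "172.16.0.0/12", "192.168.1.0/24", "8.8.8.0/24"):
        print(block, is_private_cidr(block))
```

Blocks such as `10.0.0.0/16` are typical choices for a VPC or VNet, since addresses in them are never routed on the public internet.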

5.

How do you build a hybrid cloud?

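There are multiple ways to build a hybrid cloud. A common way is to create a VPN tunnel between the on-premise network and the cloud VPC/VNet.

AWS Direct Connect and Azure ExpressRoute bypass the public internet and establish a dedicated, private connection between a private data center and the VPC/VNet. This is the preferred method for large production deployments.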

6.

What is CNI, how does it work, and how is it used in Kubernetes?

The Container Network Interface (CNI) is an API specification that is focused around the creation and connection of container workloads.

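CNI has two main commands: add and delete. Configuration is passed in as JSON data.

When the CNI plugin's add command is invoked, a virtual ethernet device pair is created and then connected between the Pod network namespace and the host network namespace. Once IPs and routes are created and assigned, the information is returned to Kubernetes.

An important feature that was added in later versions is the ability to chain CNI plugins.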
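
The add/delete flow can be mimicked with a toy handler. This is only a sketch of the request and response shape, not a working plugin: it does no real networking, `subnet_prefix` is a hypothetical configuration key invented for the example, and the `allocations` dictionary stands in for a real IP address management store.

```python
import json
from typing import Dict


def handle_cni(command: str, container_id: str, config_json: str,
               allocations: Dict[str, str]) -> dict:
    """Handle a CNI-style ADD or DEL request; the config arrives as JSON."""
    config = json.loads(config_json)
    prefix = config["subnet_prefix"]  # hypothetical key, e.g. "10.22.0"
    if command == "ADD":
        # A real plugin would create a veth pair here and move one end
        # into the Pod's network namespace; this sketch only picks an IP.
        used = set(allocations.values())
        host = next(n for n in range(2, 255)
                    if f"{prefix}.{n}/24" not in used)
        address = f"{prefix}.{host}/24"
        allocations[container_id] = address
        return {
            "cniVersion": config["cniVersion"],
            "ips": [{"address": address, "gateway": f"{prefix}.1"}],
        }
    if command == "DEL":
        # DEL releases whatever ADD set up; deleting twice is harmless.
        allocations.pop(container_id, None)
        return {}
    raise ValueError(f"unsupported command: {command}")
```

In the actual specification, the command is passed to the plugin through an environment variable, the JSON configuration arrives on standard input, and the result is written to standard output as JSON.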

7.

How does Kubernetes orchestrate containers?

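Kubernetes containers are scheduled to run based on their scheduling policy and the available resources.

Every Pod that needs to run is added to a queue; the scheduler takes it off the queue and schedules it. If scheduling fails, the error handler adds the Pod back to the queue for later scheduling.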
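
The queue-and-retry loop described above can be sketched as follows. This is a deliberately simplified model, not the Kubernetes scheduler: it only considers CPU, the "most free CPU" scoring rule is an assumption chosen for brevity, and `max_attempts` is a made-up limit.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def schedule(pods: List[Tuple[str, int]], free_cpu: Dict[str, int],
             max_attempts: int = 3) -> Dict[str, str]:
    """Place (pod name, CPU request) pairs onto nodes with enough free CPU."""
    queue: Deque[Tuple[str, int, int]] = deque((n, c, 1) for n, c in pods)
    placements: Dict[str, str] = {}
    while queue:
        name, cpu, attempt = queue.popleft()
        # Filter: keep only nodes that can fit the request.
        candidates = [node for node, free in free_cpu.items() if free >= cpu]
        if candidates:
            # Score: prefer the node with the most free CPU.
            node = max(candidates, key=lambda n: free_cpu[n])
            free_cpu[node] -= cpu
            placements[name] = node
        elif attempt < max_attempts:
            # Scheduling failed: put the Pod back for a later attempt.
            queue.append((name, cpu, attempt + 1))
    return placements
```

In a real cluster, resources are freed between attempts as other Pods finish, which is why re-queueing a Pod that could not be placed is worthwhile.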

8.

What is the difference between orchestration and classic automation? What are some common orchestration solutions?

Classic automation covers the automation of software installation and system configuration such as user creation, permissions, and security baselining, while orchestration is more focused on the connection and interaction of existing and provided services. (Configuration management covers both classic automation and orchestration.)

Most cloud providers have components for application servers, caching servers, block storage, message queueing, databases, etc. They can usually be configured for automated backups and logging. Because all these components are provided by the cloud provider, it becomes a matter of orchestrating these components to create an infrastructure solution.

The amount of classic automation necessary in cloud environments depends on the number of components available to be used. The more existing components there are, the less classic automation is necessary.

In local or on-premise environments you first have to automate the creation of these components before you can orchestrate them.

A common solution for AWS is CloudFormation, with many different types of wrappers around it. Azure uses deployments, and Google Cloud has the Google Deployment Manager.

A common orchestration solution that is cloud-provider agnostic is Terraform. While it is closely tied to each cloud, it provides a common state definition language that defines resources (such as virtual machines, networks, and subnets) and data (which references existing state on the cloud).

Nowadays most configuration management tools also provide components to manage the orchestration solutions or APIs provided by the cloud providers.
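
The idea behind declarative orchestration tools, comparing a desired state with the actual state and deriving the actions needed, can be illustrated with a small sketch. This is a toy model and not how Terraform or CloudFormation is implemented; the `plan` function and the resource names in the usage note are invented for the example.

```python
from typing import Dict, List, Tuple


def plan(desired: Dict[str, dict],
         actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Compare desired and actual resources and list the actions to take."""
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
        # Resources that already match need no action.
    return actions
```

Given a desired state with a small `vm` and a new `subnet`, and an actual state with a large `vm` and a leftover `old-disk`, the plan is to delete `old-disk`, create `subnet`, and update `vm`.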

9.

What is the difference between CI and CD?

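CI stands for “continuous integration” and CD stands for “continuous delivery” or “continuous deployment.” CI is the foundation of both continuous delivery and continuous deployment. Continuous delivery and continuous deployment automate releases, whereas CI only automates the build.

While continuous delivery aims at producing software that can be released at any time, releases to production are still triggered manually, at someone's decision. Continuous deployment goes one step further and actually releases these components to production systems.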
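
The difference between the three terms comes down to where the automation stops, which the following sketch tries to capture. The stage names and the `pipeline_stages` function are illustrative assumptions, not part of any particular CI/CD product.

```python
from typing import List

MODES = ("ci", "continuous delivery", "continuous deployment")


def pipeline_stages(mode: str, release_approved: bool = False) -> List[str]:
    """Return the stages that run for a commit under the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    # Continuous integration: every commit is built and tested.
    stages = ["build", "test"]
    if mode == "ci":
        return stages
    # Delivery and deployment both produce a releasable artifact.
    stages.append("package release candidate")
    # Deployment releases automatically; delivery waits for a human decision.
    if mode == "continuous deployment" or release_approved:
        stages.append("deploy to production")
    return stages
```

With `"continuous delivery"` the production step only appears once `release_approved` is set, while `"continuous deployment"` always includes it.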

10.

Describe some deployment patterns.

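Blue-green deployments and canary releases are common deployment patterns.

In blue-green deployments you have two identical environments. The “green” environment hosts the current production system. Deployment happens in the “blue” environment.

The “blue” environment is monitored for faults, and if everything is working well, load balancing and other components are switched from the “green” environment to the “blue” one.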

Canary releases are releases that roll out specific features to a subset of users to reduce the risk involved in releasing new features.
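
Both patterns can be sketched in a few lines. The functions below are illustrative assumptions rather than a real load balancer: `in_canary` assigns users to a canary group by hashing their ID, so that the same user always gets the same answer, and `cut_over` models the blue-green switch.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place roughly `percent` percent of users in the canary group."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return bucket < percent


def cut_over(live: str, idle_healthy: bool) -> str:
    """Blue-green: point traffic at the idle environment only if it is healthy."""
    idle = "blue" if live == "green" else "green"
    return idle if idle_healthy else live
```

Because the bucket depends only on the user ID, raising `percent` from 10 to 50 keeps every user who was already in the canary group and adds new ones, which makes a gradual rollout predictable.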

11.

How do you set up a Virtual Private Cloud (VPC)?

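VPCs on AWS generally consist of a CIDR with multiple subnets. AWS allows one internet gateway (IG) per VPC, which is used to route traffic to and from the internet. A subnet with a route to the IG is considered public, and all other subnets are considered private.

The components needed to create a VPC on AWS are as follows:

  • An empty VPC resource with an associated CIDR.
  • A public subnet in which components are accessible from the internet. This subnet requires an associated IG.
  • A private subnet that can access the internet through a NAT gateway. The NAT gateway is positioned inside the public subnet.
  • A route table for each subnet.
  • Two routes: one routing through the IG and one routing through the NAT gateway, assigned to their respective route tables.
  • The route tables are then attached to their respective subnets.
  • Security groups then control which inbound and outbound traffic is allowed.

This approach is conceptually similar to physical infrastructure.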
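
The layout described by these components can be modelled with Python's standard `ipaddress` module. The sketch below only computes a plan as plain data and creates nothing on AWS; the `/24` subnet size and the target names `internet-gateway` and `nat-gateway` are placeholders chosen for the example.

```python
import ipaddress


def vpc_layout(cidr: str) -> dict:
    """Carve a public and a private /24 out of the VPC CIDR and attach routes."""
    vpc = ipaddress.ip_network(cidr)
    subnets = vpc.subnets(new_prefix=24)
    public, private = next(subnets), next(subnets)
    # Traffic inside the VPC stays local in both route tables.
    local_route = {"destination": str(vpc), "target": "local"}
    return {
        "vpc": str(vpc),
        "subnets": {"public": str(public), "private": str(private)},
        "route_tables": {
            # The public subnet sends internet traffic through the IG.
            "public": [local_route,
                       {"destination": "0.0.0.0/0", "target": "internet-gateway"}],
            # The private subnet reaches the internet through the NAT gateway.
            "private": [local_route,
                        {"destination": "0.0.0.0/0", "target": "nat-gateway"}],
        },
    }
```

For `vpc_layout("10.0.0.0/16")` this yields `10.0.0.0/24` as the public subnet and `10.0.1.0/24` as the private one, each with its own route table.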

12.

Describe IaC and configuration management.

Infrastructure as Code (IaC) is a paradigm that manages and tracks infrastructure configuration in files rather than manually or through graphical user interfaces. This allows for more scalable infrastructure configuration and, more importantly, allows for transparent tracking of changes, usually through a version control system.

Configuration management systems are software systems that allow managing an environment in a consistent, reliable, and secure way.

By using an optimized domain-specific language (DSL) to define the state and configuration of system components, multiple people can work on and store the system configuration of thousands of servers in a single place.

CFEngine was among the first generation of modern enterprise solutions for configuration management.

Its goal was to have a reproducible environment by automating things such as installing software and creating and configuring users, groups, and responsibilities.

Second-generation systems brought configuration management to the masses. While able to run in standalone mode, Puppet and Chef are generally configured in master/agent mode, where the master distributes configuration to the agents.

Ansible is newer than the aforementioned solutions and is popular because of its simplicity. The configuration is stored in YAML and there is no central server. The state configuration is transferred to the servers through SSH (or WinRM, on Windows) and then executed. The downside of this procedure is that it can become slow when managing thousands of machines.
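
A property these tools share is that a configuration step describes a desired state and can be applied repeatedly without side effects. The sketch below illustrates that idea on a list of `key=value` configuration lines; `ensure_line` is a hypothetical helper and does not correspond to any specific tool's module.

```python
from typing import List


def ensure_line(lines: List[str], key: str, value: str) -> bool:
    """Make sure `key=value` is present; return True only if something changed."""
    wanted = f"{key}={value}"
    for i, line in enumerate(lines):
        if line.split("=", 1)[0] == key:
            if line == wanted:
                return False  # already in the desired state
            lines[i] = wanted  # present but wrong: correct it in place
            return True
    lines.append(wanted)  # missing entirely: add it
    return True
```

Running the same step twice changes the configuration at most once. The second run reports no change, which is what makes it safe to re-apply a configuration across thousands of servers.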

13.

How do you design a self-healing distributed service?

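Any system that is supposed to be capable of healing itself needs to be able to handle faults and partitioning (i.e. when part of the system cannot access the rest of the system) to a certain extent.

For databases, a common way to deal with partition tolerance is to use a quorum for writes. This means that every time something is written, a minimum number of nodes must confirm the write.

The minimum number of nodes required to gracefully recover from a single-node fault is three. That way, the two healthy nodes can confirm the state of the system.

For cloud applications, it is common to distribute these three nodes across three availability zones.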
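
The quorum rule can be written down directly. The sketch below assumes a simple majority quorum and models replicas as in-memory dictionaries; `quorum_write` and its arguments are names invented for the example.

```python
from typing import Dict, Set


def quorum(node_count: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return node_count // 2 + 1


def quorum_write(replicas: Dict[str, dict], reachable: Set[str],
                 key: str, value: str) -> bool:
    """Apply the write only if a majority of replicas can confirm it."""
    targets = [name for name in replicas if name in reachable]
    if len(targets) < quorum(len(replicas)):
        return False  # too few nodes reachable: reject the write
    for name in targets:
        replicas[name][key] = value
    return True
```

With three nodes the quorum is two, so the cluster keeps accepting writes when one node fails. With only two nodes the quorum would also be two, and a single failure would block all writes.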

14.

Describe a centralized logging solution.

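Logging solutions are used for monitoring system health. Both events and metrics are generally logged and may then be processed by alerting systems. Metrics could be storage space, memory, load, or any other kind of continuous data that is constantly being monitored. This allows detecting events that diverge from a baseline.

In contrast, event-based logging might cover events such as application exceptions, which are sent to a central location for further processing, analysis, or bug-fixing.

A commonly used open-source logging solution is the Elasticsearch-Kibana-Logstash (ELK) stack. Stacks like this generally consist of three components:

  1. A storage component, e.g. Elasticsearch.
  2. A log or metric ingestion daemon such as Logstash or Fluentd. It is responsible for ingesting large amounts of data and adding or processing metadata while doing so. For example, it might add geolocation information for IP addresses.
  3. A visualization solution such as Kibana to show important visual representations of system state at any given time.

Most cloud solutions either have their own centralized logging solutions that contain one or more of the aforementioned products or tie them into their existing infrastructure. AWS CloudWatch, for example, contains all the parts described above and is heavily integrated into every component of AWS, while also allowing parallel exports of data to AWS S3 for cheap long-term storage.

Another popular commercial solution for centralized logging and analysis, both on premise and in the cloud, is Splunk. Splunk is considered to be very scalable, is also commonly used as a Security Information and Event Management (SIEM) system, and has advanced table and data model support.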
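
The metadata-enrichment step performed by an ingestion daemon can be sketched as follows. This is an illustration only: the `GEO_BY_NETWORK` table and its region names are made up, whereas a real daemon would query a GeoIP database, and the event format (JSON with a `client_ip` field) is an assumption.

```python
import ipaddress
import json
from typing import Dict

# Hypothetical lookup table; a real daemon would query a GeoIP database.
GEO_BY_NETWORK: Dict[str, str] = {
    "203.0.113.0/24": "example-region-a",
    "198.51.100.0/24": "example-region-b",
}


def enrich(raw_line: str) -> dict:
    """Parse one JSON log line and add a `geo` field based on the client IP."""
    event = json.loads(raw_line)
    address = ipaddress.ip_address(event["client_ip"])
    event["geo"] = next(
        (region for net, region in GEO_BY_NETWORK.items()
         if address in ipaddress.ip_network(net)),
        "unknown",  # no matching network in the table
    )
    return event
```

The enriched event would then be forwarded to the storage component, where the added field can be used for filtering and visualization.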

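* There is more to interviewing than tricky technical questions, so these are intended merely as a guide. Not every “A” candidate worth hiring will be able to answer them all, nor does answering them all guarantee an “A” candidate. At the end of the day, hiring remains an art, a science, and a lot of work.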
