Introduction
Terraform is an open-source Infrastructure as Code (IaC) tool developed by HashiCorp. It lets you define and provision complete infrastructure using an easy-to-learn declarative language, so your cloud infrastructure setup is stored as code. It is similar to tools such as CloudFormation, which automates AWS infrastructure but works only on AWS, whereas Terraform can be used on other cloud platforms as well. Below are some of the benefits of using Terraform.
- Does orchestration, not just configuration management
- Supports multiple providers such as AWS, Azure, GCP, DigitalOcean and many more
- Provides immutable infrastructure, where configuration changes are applied smoothly
- Uses an easy-to-understand language, HCL (HashiCorp Configuration Language)
- Easily portable to any other provider
- Supports a client-only architecture, so there is no need for additional configuration management on a server
Installation of Terraform
Terraform Core Concepts
Terraform core uses two input sources to do its job. The first is the Terraform configuration that you, as a user, write; here you define what needs to be created or provisioned. The second is the state, where Terraform keeps an up-to-date record of what the current infrastructure looks like. Terraform core takes both inputs and works out a plan: it compares the current state with the desired end result described in the configuration, and determines what needs to be created, updated, or deleted to reach that desired state.
The second component of the architecture is the set of providers for
specific technologies. These can be cloud providers like AWS, Azure, or
GCP, or other infrastructure-as-a-service platforms. There are also
providers for higher-level components like Kubernetes or other
platform-as-a-service tools, and even for some software-as-a-service
tools. This gives you the possibility to create infrastructure on
different levels.
For example, you can create an AWS infrastructure, then deploy
Kubernetes on top of it, and then create services/components inside
that Kubernetes cluster.
Terraform has over a hundred providers for different technologies, and
each provider gives the Terraform user access to its resources. Through
the AWS provider, for example, you have access to hundreds of AWS
resources like EC2 instances, AWS users, etc. With the Kubernetes
provider, you get access to Kubernetes resources like services,
deployments, namespaces, etc.
Below are the core concepts/terminologies used in Terraform:
- Provider: A plugin that interacts with the APIs of a service and gives access to its related resources.
- Resources: Blocks of one or more infrastructure objects (compute instances, virtual networks, etc.), which are used in configuring and managing the infrastructure.
- count and for_each Meta Arguments: They allow us to create multiple instances of any resource.
- Data Source: Implemented by providers to return information on external objects to Terraform.
- State: Cached information about the infrastructure managed by Terraform and the related configurations.
- Module: A folder with Terraform templates where all the configurations are defined.
- Input Variables: Key-value pairs used by Terraform modules to allow customization.
- Local Variables: They work like standard variables, but their scope is limited to the module where they’re declared.
- Output Values: Return values of a Terraform module that can be used by other configurations.
Provider
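As a concrete reference point for this section, here is a minimal sketch of a provider declaration with a pinned version. It assumes the AWS provider; the version constraint and region are illustrative values, not taken from the original project. The sketch uses a required_providers block, which is where current Terraform versions expect version constraints to live.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # illustrative constraint (assumption)
    }
  }
}

# Provider-level settings such as the target region go here.
provider "aws" {
  region = "us-east-1" # illustrative region (assumption)
}
```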
A provider works pretty much like an operating system’s device driver. It exposes a set of resource types through a common abstraction, making the details of how to create, modify, and destroy a resource pretty much transparent to users. Terraform downloads providers automatically from its public registry as needed, based on the resources of a given project. It can also use custom plugins, which must be manually installed by the user. Finally, some built-in providers are part of the main binary and are always available. Although not strictly necessary, it’s considered good practice to explicitly declare which provider we’ll use in our Terraform project and specify its version. Older configurations do this with the version attribute available to any provider declaration; since Terraform 0.13, the recommended place for the version constraint is a required_providers block.
Resources
In Terraform, a resource is anything that can be a target for CRUD operations in the context of a given provider. Some examples are an EC2 instance, an Azure MariaDB, or a DNS entry. Within a resource definition, arguments can reference attributes of other resources directly as expressions. This replaces the older "${expression}" interpolation syntax, which is still available but considered legacy.
Such references also show one of Terraform’s strengths: regardless of the order in which we declare resources in our project, it will figure out the correct order in which it must create or update them based on a dependency graph it builds when parsing them.
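A simple resource definition along the lines discussed above might look like the following sketch. The resource names, AMI id, and port are placeholders chosen for illustration. The instance is deliberately declared before the security group it depends on, to show that declaration order does not matter.

```hcl
# Declared first, but created second: the reference to
# aws_security_group.web.id makes this instance depend on the group.
resource "aws_instance" "web" {
  ami                    = "ami-0123456789abcdef0" # placeholder AMI id (assumption)
  instance_type          = "t3.micro"
  vpc_security_group_ids = [aws_security_group.web.id] # direct expression, no "${...}" wrapper
}

# Declared second, but created first.
resource "aws_security_group" "web" {
  name = "web-sg" # hypothetical name

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```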
count and for_each Meta Arguments
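A minimal sketch of both meta-arguments, assuming the AWS provider. The AMI id, tag values, and bucket names are hypothetical.

```hcl
# count: three identical instances, addressed as aws_instance.worker[0]
# through aws_instance.worker[2].
resource "aws_instance" "worker" {
  count         = 3
  ami           = "ami-0123456789abcdef0" # placeholder AMI id (assumption)
  instance_type = "t3.micro"

  tags = {
    Name = "worker-${count.index}"
  }
}

# for_each: one bucket per element of the set, addressed by key,
# for example aws_s3_bucket.logs["dev"].
resource "aws_s3_bucket" "logs" {
  for_each = toset(["dev", "qa", "prod"])
  bucket   = "example-logs-${each.key}" # hypothetical bucket names
}
```

Instances created with count are identified by position, so removing an item from the middle shifts the rest; for_each identifies each instance by its key instead.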
These meta-arguments allow us to create multiple instances of any resource. The main difference between them is that count expects a non-negative number, whereas for_each accepts a map or a set of strings. For instance, count can create several EC2 instances on AWS from a single resource block.
Data Sources
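A minimal sketch of the pattern this section covers: the aws_ami data source looks up an existing image, and a resource then consumes its id. The owner id and name filter are assumptions chosen for illustration (they follow the commonly cited Ubuntu lookup), and should be adapted to the image you actually need.

```hcl
# Read-only lookup: nothing is created or changed by this block.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's account id (assumption for this sketch)

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# The looked-up AMI id feeds the creation of another resource.
resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}
```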
Data sources work pretty much as “read-only” resources, in the sense that we can get information about existing ones but can’t create or change them. They are usually used to fetch parameters needed to create other resources. A typical example is the aws_ami data source available in the AWS provider, which we use to retrieve attributes of an existing AMI.
State
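For orientation, a remote backend declaration might look like the sketch below. It assumes the S3 backend; the bucket name, key, and region are hypothetical and the bucket must already exist.

```hcl
terraform {
  # Store the state file in an S3 bucket instead of the local project folder.
  backend "s3" {
    bucket = "example-terraform-state"   # hypothetical bucket
    key    = "project/terraform.tfstate" # hypothetical object path
    region = "us-east-1"
  }
}
```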
The state of a Terraform project is a file that stores all details about resources that were created in the context of a given project. For instance, if we declare an azurerm_resource_group resource in our project and run Terraform, the state file will store its identifier. The primary purpose of the state file is to provide information about already existing resources, so when we modify our resource definitions, Terraform can figure out what it needs to do. An important point about state files is that they may contain sensitive information. Examples include initial passwords used to create a database, private keys, and so on.
Terraform uses the concept of a backend to store and retrieve state files. The default backend is the local backend, which uses a file in the project’s root folder as its storage location. We can also configure an alternative remote backend by declaring it in a terraform block in one of the project’s .tf files.
Modules
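A minimal sketch of a parent module including a sub-directory as a child module. The path and the cidr_block input are hypothetical; the child module would need to declare a matching variable.

```hcl
# Explicitly include the code in ./modules/network as a child module.
module "network" {
  source = "./modules/network" # hypothetical sub-directory

  # Arguments other than source are passed to the child module's input variables.
  cidr_block = "10.0.0.0/16" # hypothetical input variable
}
```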
Terraform modules are the main feature that allows us to reuse resource definitions across multiple projects or simply have a better organization in a single project. This is much like what we do in standard programming: instead of a single file containing all code, we organize our code across multiple files and packages. A module is just a directory containing one or more resource definition files. In fact, even when we put all our code in a single file/directory, we’re still using modules - in this case, just one. The important point is that sub-directories are not included as part of a module. Instead, the parent module must explicitly include them using the module declaration.
Input Variables
Any module, including the top, or main one, can define several input variables using variable block definitions. Values for these variables can be supplied in several ways:
- -var command-line option
- .tfvars files, using command-line options or scanning for well-known files/locations
- Environment variables starting with TF_VAR_
- The variable’s default value, if present
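The variable block and the value sources listed above can be sketched as follows. The variable name, description, and AMI id are hypothetical.

```hcl
# Declares an input variable with a type and a fallback default.
variable "instance_type" {
  type        = string
  description = "EC2 instance type for the web server"
  default     = "t3.micro"
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI id (assumption)
  instance_type = var.instance_type       # resolved from one of the sources above
}
```

With this in place, the default could be overridden with terraform apply -var="instance_type=t3.small", or by exporting TF_VAR_instance_type=t3.small before running Terraform.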
Output Values
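A minimal sketch of an output block, written as if it lived inside a hypothetical ./modules/network module. The resource and output names are assumptions.

```hcl
# A resource created inside the module; consumers cannot reach it directly.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Exposes one attribute of that resource to the module's consumer.
output "vpc_id" {
  description = "ID of the VPC created by this module"
  value       = aws_vpc.main.id
}
```

A parent that includes this module under the name network would then read the value as module.network.vpc_id.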
By design, a module’s consumer has no access to any resources created within the module. Sometimes, however, we need some of those attributes to use as input for another module or resource. To address those cases, a module can define output blocks that expose a subset of the created resources.
Local Variables
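A minimal sketch of a locals block that removes repetition within a module. All names and tag values are hypothetical.

```hcl
# Values computed once and reused; visible only inside this module.
locals {
  name_prefix = "demo"
  common_tags = {
    Project   = "demo"
    ManagedBy = "terraform"
  }
}

resource "aws_s3_bucket" "assets" {
  bucket = "${local.name_prefix}-assets"
  tags   = local.common_tags
}
```

A local can equally hold a module output, for example a hypothetical module.network.vpc_id, so the longer reference is written only once.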
Local variables work like standard variables, but their scope is limited to the module where they’re declared. The use of local variables tends to reduce code repetition, especially when dealing with output values from modules.
Workspaces
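One common way to vary resources per workspace is to combine terraform.workspace with count. The sketch below assumes a workspace named prod exists; the resource name and AMI id are hypothetical.

```hcl
resource "aws_instance" "bastion" {
  # Create one instance in the "prod" workspace and none in any other.
  count         = terraform.workspace == "prod" ? 1 : 0
  ami           = "ami-0123456789abcdef0" # placeholder AMI id (assumption)
  instance_type = "t3.micro"

  tags = {
    Environment = terraform.workspace
  }
}
```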
Terraform workspaces allow us to keep multiple state files for the same project. When we run Terraform for the first time in a project, the generated state file will go into the default workspace. Later, we can create a new workspace with the terraform workspace new command, optionally supplying an existing state file as a parameter. We can use workspaces pretty much as we’d use branches in a regular VCS. For instance, we can have one workspace for each target environment - DEV, QA, PROD - and, by switching workspaces, we can run terraform apply against each one as we add new resources. Given the way this works, workspaces are an excellent choice to manage multiple versions - or “incarnations” if you like - of the same set of configurations. This is great news for everyone who’s had to deal with the infamous “works in my environment” problem, as it helps us keep all environments looking the same.
In some scenarios, it may be convenient to disable the creation of some resources based on the particular workspace we’re targeting. For those occasions, we can use the terraform.workspace predefined variable. This variable contains the name of the current workspace, and we can use it like any other in expressions.
Terraform Lifecycle
The Terraform lifecycle consists of init, plan, apply, and destroy.
- terraform init initializes the working directory that contains the configuration files.
- terraform plan creates an execution plan to reach the desired state of the infrastructure. It compares the configuration files with the current state and lists the changes needed, without applying them.
- terraform apply then makes the changes in the infrastructure as defined in the plan, bringing the infrastructure to the desired state.
- terraform destroy deletes all the infrastructure resources managed by the configuration.