Containers appeared on the technology market as “the next big step” in the evolution of application servers. However, for a while, doing anything really serious with containers (production environments that needed scaling, and so on) was virtually impossible, even relying on the power of the public cloud. The reason was apparent: at that time, there was no technology associated with them that would let us perform operations already trivial for conventional application servers, such as scaling, load distribution, increasing or decreasing the number of underlying instances, network segmentation, and so forth. In other words, a technology that would allow us to efficiently orchestrate distributed, container-based environments.

Fortunately, it didn’t take long for the market to identify this gap and start doing something about it, and the first initiatives soon appeared. Mesosphere came up with DC/OS (built on Apache Mesos); Docker came along with its Swarm orchestrator; and many others followed. Finally, through an open-source initiative, Google came onto the scene with the platform that, for the time being, has been consolidating itself as the leading one for container orchestration on clusters – Kubernetes (or K8s). This post isn’t about Kubernetes, though. From now on we’re going to explore a bit more of the concepts related to Docker Swarm.

Docker Swarm

Docker Swarm (or simply Swarm) is Docker’s offering for orchestrating containers across clusters of Docker hosts. It works seamlessly with the Docker engine from version 1.12.0 on and doesn’t require any additional installation or configuration on a host that is already running Docker 1.12.0 or higher.

I know what you may be wondering at this point: “Fabricio, if Kubernetes is the solution that the market (Microsoft included) has been adopting, why should we discuss Swarm’s concepts and working model?” Well, there are some reasons for that. I want to highlight the following:

  1. Working with Swarm is indeed simpler. Working with K8s requires fairly advanced knowledge of containers and also some notion of how to cluster them. Since Swarm presents a simpler subset of resources, it is an interesting entry point to this universe for learning purposes.
  2. Docker and Swarm converge on the same technology. This is a positive aspect of Swarm. Although there are other approaches to containers, most of the projects you are going to face at work will undoubtedly be using Docker as the containerization technology. That means that to work properly with containers in real-life scenarios, you have to know Docker very well. It also means that, if you deepen your knowledge of Docker, you can quickly evolve to Swarm, since they share the same technology base.
  3. There are many Swarm-based environments on the market. Swarm arrived on the market before K8s did and, because of this, many projects were structured on Swarm. Solid knowledge of Swarm will therefore be useful to you professionally. Besides that, everything you learn about Swarm will help on an eventual ramp-up with K8s.

Technologically speaking, Swarm has some very interesting (and extremely useful) features by design. The major ones are listed below:

  • Built-in integration with the Docker engine, as I mentioned earlier. No additional software or configuration is required; the only requirement is Docker version 1.12.0 or higher.
  • Autonomous design for cluster nodes. Swarm doesn’t require the hosts added to the cluster to meet specific requirements (same type or size, for instance). In addition, Swarm abstracts the underlying layer (the host) away from deployment time. Every necessary adjustment is performed automatically at runtime; this way, you can create a cluster from a single node, for example, and expand its capacity as needed at run time.
  • Declarative service model. You can compose a complete environment (frontend, backend, database and messaging, for example) in a declarative way, using docker-compose YAML semantics for that purpose.
  • Scalability. Docker Swarm allows you to scale (up and down) the number of containers running your application in a very simplified fashion, as we will see later on.
  • State management. The host identified as “manager” within the cluster has, as one of its duties, to monitor the state of the containers running in the cluster. Imagine, for example, that you have requested five instances (containers) of a given service. If at some point the manager identifies that two of these five instances have for some reason entered the “exited” state, it will automatically spin up two new instances of that service to ensure the reliability of the cluster, according to the previously defined “scalability policy”.
  • Multi-networking. When deploying a new service, you can create network layers over the cluster’s internal network. The “manager” node automatically assigns internal IPs to the new incoming containers at the moment they are started.
  • Automatic DNS resolution. Every time a new service/app is pushed to the Swarm cluster, the manager node automatically assigns a unique DNS name to it. This creates a balancing layer so that any request for the service is automatically resolved and the load distributed among the cluster’s nodes.
  • Load balancing. It is possible to expose a service to the Internet so that external requests are distributed directly among the containers running within the cluster. You can choose the desired load-distribution algorithm.
  • Rolling service updates. It is relatively simple to update a given service/app at execution time. We will do this in profusion in the second article (yet to come); a quick sketch of the commands behind several of these features follows this list.
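To make some of these features more concrete, here is a minimal sketch of the commands behind the declarative model, scaling, and rolling updates. It assumes a Swarm cluster is already initialized (we will build one below); the service name “web”, the stack name “demo” and the nginx image are just illustrative placeholders, not part of the environment we will create later.

# Declare a service with 5 replicas, publishing container port 80 on port 8080 (illustrative values)
docker service create --name web --replicas 5 --publish 8080:80 nginx:1.13

# Scale the number of containers up or down on demand
docker service scale web=10

# Roll out a new image version, updating two containers at a time with a 10-second pause
docker service update --image nginx:1.14 --update-parallelism 2 --update-delay 10s web

# The declarative alternative: describe the service in a compose file and deploy it as a stack
cat > stack.yml <<'EOF'
version: "3"
services:
  web:
    image: nginx:1.13
    ports:
      - "8080:80"
    deploy:
      replicas: 5
EOF
docker stack deploy -c stack.yml demo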

How does it work?

Throughout this and the second post (coming soon), I want to show how to create and manage, at a basic level and from scratch, a cluster of Docker hosts that uses Swarm as its orchestrator and, of course, ultimately orchestrate a simple application on it. First, however, we need to understand how Docker Swarm works. For this, consider Figure 1.

Figure 1. High-level view of a working docker cluster managed by Swarm

As mentioned before, Swarm is nothing but a layer of orchestration software distributed across (physical or virtual) Docker hosts. Before any management can be performed, a Swarm cluster will always need two different types of hosts: “Managers” and “Workers.” Also, keep in mind that a Docker host within a Swarm cluster is formally known as a “Node”. Keeping this taxonomy in mind, let’s look at the role of each one.

  • Managers: Docker hosts in charge of managing the cluster as a whole. A manager monitors all cluster nodes, takes care of the internal routines when a new service is published into the cluster, receives and applies updates issued by the cluster administrator, among many other tasks. If a cluster has only one machine playing the manager role, it will also be the leader (the node through which the system administrator communicates with the other nodes in the cluster). If a Swarm cluster has more than one manager node, the others act as replicas of the leader and can replace it if it unexpectedly goes down. A manager node can also receive containers to run in its worker context.
  • Workers: As the name suggests, these are machines that simply execute work. They receive workloads sent over by the manager leader and process them. They have no administrative responsibilities at all.

As you can see in Figure 1, either the professional responsible for administering the cluster or some automated deployment process communicates directly with the “manager-leader” node. Any request for an administrative operation within the cluster (such as the creation of a new service, a rolling update, and so forth) is performed through the manager-leader machine, which then propagates the result of these operations to the nodes underneath it.
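As a hedged illustration of this division of roles, the commands below (always run from a manager node) are the typical way to inspect nodes and move them between roles once a cluster exists; the node names are placeholders.

# List every node in the cluster along with its status, availability and manager role
docker node ls

# Promote a worker to manager, or demote a manager back to worker
docker node promote node-worker-1
docker node demote node-manager-3

# Drain a node so it stops receiving new tasks (useful for maintenance windows)
docker node update --availability drain node-worker-1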

The scenario presented in Figure 1 describes a cluster intended for development purposes only, meaning a single manager-leader host and three worker nodes. For production environments, however, Docker recommends at least three manager nodes (for redundancy purposes) and at least three worker nodes.

We will see the other Swarm-related terminology later on, as we work with the cluster itself.

Manually creating a Swarm cluster on Azure

Microsoft Azure is undoubtedly the public cloud that has been doing the most to democratize access to container services. If you don’t believe me, do some research; it won’t take long for you to confirm this statement. We could create our cluster on ACS (Azure Container Service), for example, which already ships a ready-made VM Scale Set with a load balancer and pre-configured orchestrators (including Swarm). However, because we want to understand the cluster creation process in depth, we’re going to create the cluster manually and orchestrate a simple application afterward.

For that purpose, we’re going to take advantage of traditional VMs. We’ll create a minimal production environment following Docker’s recommendations, namely:

  • 3 virtual machines running Ubuntu 17.10 as Managers.
  • 3 virtual machines running Ubuntu 17.10 as Workers.
  • We’re going to use the latest version of Docker CE (at the time this text was written, version 18.03).
  • We’re going to create a simple .NET Core app which will be managed and distributed across the cluster.
  • We’re going to have two different versions of the same application to demonstrate the update process later on.
  • The client machine (through which I’m going to communicate with the Swarm cluster and perform the administrative tasks) is Windows. I already have the Docker client for Windows up and running on it. If you don’t have Docker configured yet, please click this link to find the installation process that best suits your environment.
  • Also, I’m going to use PuTTY and PuTTYgen to communicate with the server machines and generate SSH keys. You can also use the Windows Subsystem for Linux if you are on a Windows environment.

Step 1: Creating the base image for the cluster

Let’s start by creating the six virtual machines that will compose the cluster itself. We have two ways to get there: (1) create each machine individually and repeat the same setup six times; or (2) create one machine with the whole set of required settings (basically, Docker’s installation and configuration) and use it as the reference image for the other five. I don’t know about you, but I will pick option 2.

To see a comprehensive tutorial on how to create a virtual machine in Azure, please refer to this link.

Figure 2 shows the virtual machine “docker-image” up and running in my Azure subscription. So, let’s build this Docker environment up.

Figure 2. Template VM up and running on Azure

As I mentioned earlier, because I’m accessing the Linux machine from a Windows client, I have opted to use PuTTY to get there. If you’re new to PuTTY, please see how to use it to access Linux environments through this link.

Also, because Docker’s configuration process is pretty well documented at their official website, I’m not going to go through it here. Please, refer to this link to see the steps I’ve performed.
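For reference only, the steps I followed boil down to something like the commands below. This is a hedged summary; repository URLs, key handling and package names may change over time, so always prefer the official documentation.

# Install prerequisites and add Docker's official APT repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

# Install Docker CE itself
sudo apt-get update
sudo apt-get install -y docker-ce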

I finished this installation/configuration process with my “docker-image” machine ready, meaning that I now have Docker and its dependencies correctly installed and configured, as you can see in Figure 3.

Figure 3. Template virtual machine correctly configured

Now that I have the Docker engine installed (latest version 18.03.1-ce, therefore meeting Swarm’s minimum version requirement), I just ran a simple Docker image called “hello-world” to make sure that everything is working fine. Fortunately, it is!
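If you want to reproduce the same check, these two commands are enough:

# Confirm the engine version (it must be 1.12.0 or higher for Swarm mode)
sudo docker version

# Pull and run the canonical smoke-test image
sudo docker run hello-world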

Before starting the generalization process, please make sure to save the information contained in the file “/etc/resolv.conf”. It should look like the content shown in Listing 1.
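A simple way to keep that content around (the backup path below is an arbitrary choice of mine):

# /etc/resolv.conf is usually a symlink, so dump its content to a regular file
cat /etc/resolv.conf > ~/resolv.conf.backup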

We now have our Linux-based machine ready to be generalized. I won’t describe the generalization process in detail here because it is already very well documented at the Azure portal; to view this tutorial, please follow this link. To perform the procedure described in it, you will need Azure CLI 2.0.
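In essence, and hedging on the exact names (the resource group “swarm-rg” and image name “docker-image-base” below are placeholders I chose), the procedure comes down to deprovisioning the VM from the inside and then deallocating, generalizing and capturing it with Azure CLI 2.0:

# Inside the template VM: remove machine-specific data (and, optionally, the admin user)
sudo waagent -deprovision+user

# From the client machine, logged into Azure CLI 2.0:
az vm deallocate --resource-group swarm-rg --name docker-image
az vm generalize --resource-group swarm-rg --name docker-image
az image create --resource-group swarm-rg --name docker-image-base --source docker-image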

If everything worked out, you should end up with a new asset in the resource listing of your Resource Group (RG), similar to the one shown in Figure 4. It is an image that mirrors the machine we just configured. Now we have what we need to build up our cluster.

Figure 4. Image created as result of the generalization process just performed

Two essential observations at this point:

  • When you generalized the image, the “/etc/resolv.conf” file was removed (the CLI utility made you aware of it during the process). As soon as your new VM (the one created from the generalized image) comes back to life, you must re-create this file with the content presented in Listing 1.
  • If you ran the command “sudo waagent -deprovision+user” to deprovision the template virtual machine, you’ll need to create a new user and SSH key. If you didn’t choose that option (i.e., you just ran “sudo waagent -deprovision”), then the existing user access data should still be valid.

Listing 1. Content of “/etc/resolv.conf”

Step 2: Adding the storage account and virtual network

Before we move forward with the process of creating the VMs, we need to take care of the following items:

  • Create a virtual network (VNet) in Azure through which the cluster’s nodes will be able to see each other using local (internal) IPs. To view the process of creating a new VNet on Azure, please follow this link. In my case, I’ve created a network called “swarm”.
  • Create a storage account that will host both the nodes’ log files and the nodes’ virtual disks. To see a tutorial on how to create a storage account, please click this link. I’ve created a storage account called “swarmstg”. The equivalent Azure CLI calls are sketched right after this list.
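If you prefer the command line over the portal, the two resources can be created roughly as shown below. This is a hedged sketch: the resource group “swarm-rg” and the location are placeholders, and the VNet and subnet are created with the CLI defaults; adjust the address space if you need it to match the 10.0.0.x internal IPs used later in this walkthrough.

# Virtual network for node-to-node communication
az network vnet create --resource-group swarm-rg --name swarm --subnet-name default

# Storage account for the nodes' disks and diagnostics logs
az storage account create --resource-group swarm-rg --name swarmstg --location eastus --sku Standard_LRS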

If everything goes well, you should see these two new resources sitting in your resource group, as presented in Figure 5.

Figure 5. Both storage and virtual network successfully created

Step 3: Creating cluster’s nodes

Now that we have the base image, the communication network, and the storage ready to use, we can go ahead and create our Swarm cluster’s nodes. In my case, I first created the 3 managers (which I’ve called “node-manager-1”, “node-manager-2” and “node-manager-3”), and then the workers (which I named “node-worker-1”, “node-worker-2” and “node-worker-3”). Be sure to join each of these new VMs to the network just created. Likewise, be sure to set the newly created storage account as the default for each of those VMs at creation time.

To see a comprehensive tutorial on how to use custom images for VMs in Azure, please click this link.
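As a hedged example (the names reuse the placeholders from the generalization step, and the admin user is arbitrary), creating one of the nodes from the custom image with Azure CLI 2.0 would look roughly like this; repeat it with the proper name for each of the other five VMs:

# Create a node from the custom image, attached to the "swarm" VNet,
# sending boot diagnostics to the "swarmstg" storage account
az vm create \
  --resource-group swarm-rg \
  --name node-manager-1 \
  --image docker-image-base \
  --vnet-name swarm \
  --subnet default \
  --boot-diagnostics-storage swarmstg \
  --admin-username swarmadmin \
  --generate-ssh-keys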

Done! As Docker was already configured in our base image, we are now ready to start working with the Swarm cluster. Figure 6 shows the cluster’s machines up and running.

Figure 6. Cluster’s VMs up and running

Setting up the Swarm cluster

At this point, we have everything we need to make our Swarm cluster workable. Now we need to configure it. Let’s start with “node-manager-1”. To get into the VM itself, you’re going to need the VM’s credentials (defined when you created it).

Step 1: Initializing the cluster

Once inside the VM, let’s start the configuration by activating the cluster. To do this, we use the following command line.

docker swarm init --advertise-addr 10.0.0.4:2377 --listen-addr 10.0.0.4:2377

Here’s what I’m doing by executing it:

  • Initializing the Docker Engine in Swarm mode;
  • Declaring that the manager (and leader) machine will use the internal IP 10.0.0.4 for cluster communication;
  • Defining that the default port through which communication between hosts will flow is 2377 (Swarm’s default);
  • By design, the machine on which the cluster is initialized becomes the manager and leader, as you can verify by executing the docker info command.

Part of the resulting output is reproduced below: the exact join command that worker nodes must run.

docker swarm join --token \
SWMTKN-1-1ejii7xwf44spkd6sdq6s3lb8etac74d93x077tsiak8vjloe0-514ml7z2p8q46igwbel55tici 10.0.0.4:2377

If everything went well, a message telling you that the cluster is up and running is displayed, together with the worker join command shown above, which must be executed exactly as presented on each node you want to add as a worker. The output also ends with the instruction below, which tells you how to add new managers.

To add a manager to this swarm, run "docker swarm join-token manager" and follow the instructions

Following that instruction and running “docker swarm join-token manager” (since we have two more managers to add to the cluster), we get another join token.

docker swarm join --token \
SWMTKN-1-1ejii7xwf44spkd6sdq6s3lb8etac74d93x077tsiak8vjloe0-f3i98wykkgenpr3xn68eid9z6 10.0.0.4:2377

These lines remind us of two extremely important concepts related to Swarm clusters:

  • When we discussed Swarm’s technical features at the beginning of this post, I mentioned its capability of abstracting the resource-sizing process, making it possible to add new computational resources (nodes) on the fly. Remember? Well, now you know how that is possible: through the join model, accompanied by an authentication token.
  • To make this possible (adding both new manager and worker nodes to the cluster), the Docker engine generates an access token for each node type. These tokens are, of course, different from each other; therefore, when adding a new node to the cluster, be sure to pick the proper one for the type of computational resource you’re adding. The commands for retrieving (and rotating) these tokens are sketched below.
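The commands for retrieving (and, if a token ever leaks, rotating) each join token are always run on a manager node:

# Print the full join command (including the token) for each node type
docker swarm join-token worker
docker swarm join-token manager

# Invalidate the current worker token and generate a new one
docker swarm join-token --rotate worker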

Done! Cluster initialized and machine “node-manager-1” properly set up as manager and leader. Let’s move on and add the other elements to the cluster.

Step 2: Adding new managers into the cluster

Ok. Based upon what we just saw when initializing our Swarm cluster, I’m assuming you already know the steps we need to take to bring the other manager machines into the cluster. Basically, what we need to do is connect to each of the machines we want to add (i.e., “node-manager-2” and “node-manager-3”) and execute the command provided by the Docker engine when we started up our cluster in step 1. The only thing we’re going to do differently is add a reference to the local IP of the server we are adding. In my case, “10.0.0.5 – node-manager-2” and “10.0.0.6 – node-manager-3”. This way, my command was configured as follows:

docker swarm join \
--token SWMTKN-1-1ejii7xwf44spkd6sdq6s3lb8etac74d93x077tsiak8vjloe0-f3i98wykkgenpr3xn68eid9z6 10.0.0.4:2377 \
--advertise-addr {ip-local-manager}:2377 --listen-addr {ip-local-manager}:2377

If all went well, you should see the following message (indicating that the cluster now has more managers to help with the work) when you finish the process of adding each manager.

This node joined a Swarm as a manager.

Step 3: Adding new worker nodes

Now that we have our managers set up, it’s time to add the workers to the cluster. The process is exactly the same as the one described in step 2 above; however, as both the token and the local IPs of each machine to be added are different, we will need to change the command slightly. In my case, it was defined as follows:

docker swarm join \
--token SWMTKN-1-1ejii7xwf44spkd6sdq6s3lb8etac74d93x077tsiak8vjloe0-514ml7z2p8q46igwbel55tici 10.0.0.4:2377 \
--advertise-addr {ip-local-worker}:2377 --listen-addr {ip-local-worker}:2377

For the record, my workers’ IPs are, respectively: “10.0.0.7 – node-worker-1”, “10.0.0.8 – node-worker-2” and “10.0.0.9 – node-worker-3”.

Again, if everything went well, you should be able to see a success message similar to the one below for each of the added worker servers.

This node joined a Swarm as a worker.

Work with the workers is complete, and our cluster seems to be ready for use. One final verification needs to be done, though: let’s check whether, globally, everything is linked and working properly. To get there, I’m going to connect back into “node-manager-1” and run a quick verification from there (sketched below). If you are asking yourself why you can’t do this from a worker machine, the answer is pretty simple: only managers are allowed to access cluster data. We are connecting to the leader machine, but any of our three managers would be able to return this administrative information to us.
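A couple of commands fit the bill here; judging by the verification figure below, docker info (whose Swarm section reports the number of nodes and managers) is most likely what was used, with docker node ls as a more detailed alternative:

# The Swarm section of the output lists how many nodes and managers the cluster has
docker info

# Alternatively, list each node with its hostname, status, availability and manager role
docker node ls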

If all went well, among the information returned by this command you should see a block of data about the Swarm cluster, as shown in Figure 7.

Figure 7. Final verification within the cluster

As you can clearly see, our cluster now has three managers and six workers. Oops! How can there be six workers if we only added three? Another good question. This is because each manager also has a worker context within it and can therefore receive tasks to execute, so managers count as workers as well. Nice, isn’t it?

Done. Our cluster is ready. Since we already have a lot of information to digest, I will make a hard stop here and leave the rest of the deployment process for the next article, in which I will detail the process of distributing a .NET Core application (though it could be any other platform) across this cluster.

Enjoy!

