Building a Solr Cluster with TerraForm – Part 1

So it’s no surprise that I very much have been talking about how amazing TerraForm is, and recently I’ve been doing a lot of investigation into Solr and how to build a scalable Solr Cluster.

So given the kubernetes template I wanted to try my hand at something similar. The goals of this project were the following:

  1. Build a generic template for creating a Solr cloud cluster with distributed shard.
  2. Build out the ability to scale the cluster for now using TerraForm to manually trigger increases to cluster size.
  3. Make the nodes automatically add themselves to the cluster.

And I could do this just using bash scripts and packer. But instead wanted to try my hand at cloud init.

But that’s going to be the end result, I wanted to walkthrough the various steps I go through to get to the end.  The first real step is to get through the installation of Solr on  linux machines to be implemented. 

So let’s start with “What is Solr?”   The answer is that Solr is an open source software solution that provides a means of creating a search engine.  It works in the same vein as ElasticSearch and other technologies.  Solr has been around for quite a while and is used by some of the largest companies that implement search to handle search requests by their customers.  Some of those names are Netflix and CareerBuilder.  See the following links below:

So I’ve decided to try my hand at this and creating my first Solr cluster, and have reviewed the getting started. 

So I ended up looking into it more, and built out the following script to create a “getting started” solr cluster.

sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
sudo apt-get install -y gnupg-curl
sudo wget https://www.apache.org/dist/lucene/solr/8.0.0/solr-8.0.0.zip.asc | sudo apt-key add

sudo apt-get update -y
sudo apt-get install unzip
sudo wget http://mirror.cogentco.com/pub/apache/lucene/solr/8.0.0/solr-8.0.0.zip

sudo unzip -q solr-8.0.0.zipls
sudo mv solr-8.0.0 /usr/local/bin/solr-8.0.0 -f
sudo rm solr-8.0.0.zip -f

sudo apt-get install -y default-jdk

sudo chmod +x /usr/local/bin/solr-8.0.0/bin/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node1/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node2/solr
sudo /usr/local/bin/solr-8.0.0/bin/bin/solr -e cloud -noprompt

The above will configure a “getting started solr cluster” that leverages all the system defaults and is hardly a production implementation. So my next step will be to change this. But for the sake of getting something running, I took the above script and moved it into a packer template using the following json. The above script is the “../scripts/Solr/provision.sh”

{
  "variables": {
    "deployment_code": "",
    "resource_group": "",
    "subscription_id": "",
    "location": "",
    "cloud_environment_name": "Public"
  },
  "builders": [{   
    "type": "azure-arm",
    "cloud_environment_name": "{{user `cloud_environment_name`}}",
    "subscription_id": "{{user `subscription_id`}}",

    "managed_image_resource_group_name": "{{user `resource_group`}}",
    "managed_image_name": "Ubuntu_16.04_{{isotime \"2006_01_02_15_04\"}}",
    "managed_image_storage_account_type": "Premium_LRS",

    "os_type": "Linux",
    "image_publisher": "Canonical",
    "image_offer": "UbuntuServer",
    "image_sku": "16.04-LTS",

    "location": "{{user `location`}}",
    "vm_size": "Standard_F2s"
  }],
  "provisioners": [
    {
      "type": "shell",
      "script": "../scripts/ubuntu/update.sh"
    },
    {
      "type": "shell",
      "script": "../scripts/Solr/provision.sh"
    },
    {
      "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
      "inline": [
        "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
      ],
      "inline_shebang": "/bin/sh -e",
      "type": "shell"
    }]
}

The only other script mentioned is the “update.sh”, which has the following logic in it, to install the cli and update the ubuntu image:

#! /bin/bash

sudo apt-get update -y
sudo apt-get upgrade -y

#Azure-CLI
AZ_REPO=$(sudo lsb_release -cs)
sudo echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
sudo curl -L https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
sudo apt-get install apt-transport-https
sudo apt-get update && sudo apt-get install azure-cli

So the above gets me to a good place for being able to create an image with it configured.

For next steps I will be doing the following:

  • Building a more “production friendly” implementation of Solr into the script.
  • Investigating leveraging cloud init instead of the “golden image” experience with Packer.
  • Building out templates around the use of Zookeeper for managing the nodes.


TerraForm Kubernetes Template

Hello All, I wanted to get a quick blog post out here based on something that I worked on, and finally is seeing the light of day.  I’ve been doing a lot of work with TerraForm, and one of the use cases I found was standing up a Kubernetes cluster.  And specifically I’ve been working with Azure Government, which does not have AKS available.  So how can I build a kubernetes cluster and minimize the lift of creating a cluster and then make it easy to add nodes to the cluster.  So the end result of that goal is here.

Below is a description of the project, and if you’d like to contribute please do, I have some ideas for phase 2 of this that I’m going to build out but I’d love to see what others come up with.

Intention:

The purpose of this template is to provide an easy-to-use approach to using an Infrastructure-as-a-service deployment to deploy a kubernetes cluster on Microsoft Azure. The goal being that you can start fresh with a standardized approach and preconfigured master and worker nodes.

How it works?

This template create a master node, and as many worker nodes as you specify, and during creation will automatically execute the scripts required to join those nodes to the cluster. The benefit of this being that once your cluster is created, all that is required to add additional nodes is to increase the count of the “lkwn” vm type, and reapply the template. This will cause the newe VMs to be created and the cluster will start orchestrating them automatically.

This template can also be built into a CI/CD pipeline to automatically provision the kubernetes cluster prior to pushing pods to it.

This guide is designed to help you navigate the use of this template to standup and manage the infrastructure required by a kubernetes cluster on azure. You will find the following documentation to assist:

  • Configure Terraform Development Environment: This page provides details on how to setup your locale machine to leverage this template and do development using Terraform, Packer, and VSCode.
  • Use this template: This document walks you through how to leverage this template to build out your kubernetes environment.
  • Understanding the template: This page describs how to understand the Terraform Template being used and walks you through its structure.

Key Contributors!

A special thanks to the following people who contributed to this template:
Brandon Rohrer: who introduced me to this template structure and how it works, as well as assisted with optimizing the functionality provided by this template.