Building a Solr Cluster with Terraform – Part 1

So it’s no surprise that I’ve been talking a lot about how much I like Terraform, and recently I’ve been investigating Solr and how to build a scalable Solr cluster.

So, given the Kubernetes template, I wanted to try my hand at something similar. The goals of this project were the following:

  1. Build a generic template for creating a SolrCloud cluster with distributed shards.
  2. Build out the ability to scale the cluster, for now using Terraform to manually trigger increases to the cluster size.
  3. Make the nodes automatically add themselves to the cluster.

I could do this using just bash scripts and Packer, but instead I wanted to try my hand at cloud-init.

But that’s the end result; I want to walk through the various steps I go through to get there. The first real step is to get Solr installed on Linux machines.

So let’s start with “What is Solr?”   Solr is an open source software solution that provides a means of creating a search engine.  It works in the same vein as Elasticsearch and other technologies.  Solr has been around for quite a while and is used by some of the largest companies that implement search to handle requests from their customers, including Netflix and CareerBuilder.

So I decided to try my hand at creating my first Solr cluster, and reviewed the getting-started documentation.

After looking into it more, I built out the following script to create a “getting started” Solr cluster.

sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
sudo apt-get install -y gnupg-curl

# Download the release signature so the archive can be verified with gpg if desired
sudo wget https://www.apache.org/dist/lucene/solr/8.0.0/solr-8.0.0.zip.asc

sudo apt-get update -y
sudo apt-get install -y unzip
sudo wget http://mirror.cogentco.com/pub/apache/lucene/solr/8.0.0/solr-8.0.0.zip

sudo unzip -q solr-8.0.0.zip
sudo mv -f solr-8.0.0 /usr/local/bin/solr-8.0.0
sudo rm -f solr-8.0.0.zip

sudo apt-get install -y default-jdk

sudo chmod +x /usr/local/bin/solr-8.0.0/bin/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node1/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node2/solr
# -force is required because this script runs Solr as root
sudo /usr/local/bin/solr-8.0.0/bin/solr -e cloud -noprompt -force
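
Once the example cluster comes up, a quick sanity check (assuming the defaults that the cloud example uses: two nodes on ports 8983 and 7574 and a collection named “gettingstarted”) is to hit the Collections API and run an empty query:

# Confirm both nodes registered with the cluster, then run a match-all query
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*"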

The script above configures a “getting started” Solr cluster that leverages all the system defaults and is hardly a production implementation, so my next step will be to change that. But for the sake of getting something running, I took the script and referenced it from a Packer template using the following JSON. The script above is the “../scripts/Solr/provision.sh”.

{
  "variables": {
    "deployment_code": "",
    "resource_group": "",
    "subscription_id": "",
    "location": "",
    "cloud_environment_name": "Public"
  },
  "builders": [{   
    "type": "azure-arm",
    "cloud_environment_name": "{{user `cloud_environment_name`}}",
    "subscription_id": "{{user `subscription_id`}}",

    "managed_image_resource_group_name": "{{user `resource_group`}}",
    "managed_image_name": "Ubuntu_16.04_{{isotime \"2006_01_02_15_04\"}}",
    "managed_image_storage_account_type": "Premium_LRS",

    "os_type": "Linux",
    "image_publisher": "Canonical",
    "image_offer": "UbuntuServer",
    "image_sku": "16.04-LTS",

    "location": "{{user `location`}}",
    "vm_size": "Standard_F2s"
  }],
  "provisioners": [
    {
      "type": "shell",
      "script": "../scripts/ubuntu/update.sh"
    },
    {
      "type": "shell",
      "script": "../scripts/Solr/provision.sh"
    },
    {
      "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
      "inline": [
        "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
      ],
      "inline_shebang": "/bin/sh -e",
      "type": "shell"
    }]
}

The only other script referenced is “update.sh”, which has the following logic in it to update the Ubuntu image and install the Azure CLI:

#!/bin/bash

sudo apt-get update -y
sudo apt-get upgrade -y

# Azure CLI
AZ_REPO=$(lsb_release -cs)
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
curl -L https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
sudo apt-get install -y apt-transport-https
sudo apt-get update && sudo apt-get install -y azure-cli

So the above gets me to a good place: I can now create an image with Solr configured on it.
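
To bake that image, the Packer template can be run from the CLI, passing in the variables it defines. The file name and values below are placeholders, and depending on how you authenticate to Azure you may also need to supply service principal credentials:

# Hypothetical file name and values; substitute your own subscription, resource group, and location
packer build \
  -var "subscription_id=<subscription-guid>" \
  -var "resource_group=<image-resource-group>" \
  -var "location=eastus" \
  solr-image.json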

For next steps I will be doing the following:

  • Building a more “production friendly” implementation of Solr into the script.
  • Investigating leveraging cloud-init instead of the “golden image” experience with Packer.
  • Building out templates around the use of ZooKeeper for managing the nodes (see the sketch below).
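
As a preview of where the ZooKeeper piece is headed, the idea is that each node starts Solr in cloud mode and points at an external ensemble instead of running the embedded example. A minimal sketch, assuming a hypothetical ensemble at zk1/zk2/zk3 and the install path used above:

# Rough sketch only; the ZooKeeper hosts and the /solr chroot are placeholders
SOLR_HOME=/usr/local/bin/solr-8.0.0
ZK_HOSTS="zk1:2181,zk2:2181,zk3:2181/solr"

# Start this node in SolrCloud mode against the ensemble; it registers itself with the cluster on startup
$SOLR_HOME/bin/solr start -cloud -z "$ZK_HOSTS" -p 8983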


Running Entity Framework Migrations in VSTS Release with Azure Functions

Hello all, I hope you’re having a good week. I wanted to write up a problem I just solved, because I can’t be the only one who has run into it. I’m a big fan of Entity Framework, and specifically Code First Migrations. Now it’s not perfect for every solution, but I will say that it provides a more elegant method of versioning and controlling database changes.

Now one of the problems I’ve run into before is how to leverage a DevOps pipeline with migrations. The question becomes: how do you trigger the migrations to execute as part of an automated release? This can be a pretty sticky situation if you’ve ever tried to unpack it. The most common answer is this, which is in itself a fine solution.

But one scenario I’ve run into where this doesn’t play out well is that, because the above link executes on App_Start, it can cause a slowdown for the first users, or in a load-balanced scenario it can cause a performance issue when that migration code path is hit.  And from a DevOps perspective, this doesn’t feel very “clean”: I would like to deploy my database changes at the same time as my app, and know that when the release says “Finished” everything is done and successful.  By leveraging this approach you run the risk of a “false positive” that reports success even when it wasn’t, because the migrations will fail for the first user.

So the alternative I wrote was an Azure Function that provides an HTTP endpoint, which allows me to trigger the migrations to execute.  Under this approach I can make an HTTP call from my release pipeline and execute the migrations in a controlled state, and if they are going to fail, it will happen within my deployment rather than after.

Below is the code I leveraged in the Azure Function:

[FunctionName("RunMigration")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)]HttpRequestMessage req, TraceWriter log)
{
    log.Info("Run Migration");

    bool isSuccessful = true;
    string resultMessage = string.Empty;

    try
    {
        //Get security key
        string key = req.GetQueryNameValuePairs()
            .FirstOrDefault(q => string.Compare(q.Key, "key", true) == 0)
            .Value;

        var keyRecord = ConfigurationManager.AppSettings["MigrationKey"];
        if (key != keyRecord)
        {
            throw new ArgumentException("Key Mismatch, disregarding request");
        }

        Database.SetInitializer(new MigrateDatabaseToLatestVersion<ApplicationDataContext, Data.Migrations.Configuration>());

        var dbContext = new ApplicationDataContext();

        dbContext.Database.Initialize(true);

        //var list = dbContext.Settings.ToList();
    }
    catch (Exception ex)
    {
        isSuccessful = false;
        resultMessage = ex.Message;
        log.Info("Error: " + ex.Message);
    }

    return isSuccessful == false
        ? req.CreateResponse(HttpStatusCode.BadRequest, "Error: " + resultMessage)
        : req.CreateResponse(HttpStatusCode.OK, "Migration Completed, Database updated");
}

Now a couple of quick things to note as you look at the code:
Line 12:  I am extracting a query string parameter called “key”, and then on line 16 I am getting the “MigrationKey” value from the Function App’s settings.  The purpose of this is to quickly secure the Azure Function: it looks for a query string value that matches the app setting before allowing the migrations to be triggered, which prevents just anyone from hitting the endpoint.  I can then store this value securely within my release management tool and pass it as part of the HTTP request.
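
For the release pipeline side, the call can be as simple as a command-line task running curl. The URL and keys below are placeholders; note also that because the function uses AuthorizationLevel.Function, the platform-level function key has to be supplied as well (via the standard “code” query string parameter or the x-functions-key header):

# Placeholder URL and keys; substitute the real function URL, function key, and MigrationKey value.
# --fail makes curl exit non-zero on a 400 response, which fails the release step.
curl --fail -X POST "https://my-function-app.azurewebsites.net/api/RunMigration?code=<function-key>&key=<migration-key>"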

Lines 22-26: This is what actually triggers the migration to execute, by setting the database initializer and then creating and initializing a context.

Line 37: Handles the response code that is sent back to the client for the HTTP request.  This allows me to fail the step in my release management tool if necessary.

Azure Compute – Searchable and Filterable

Hello all, a good friend of mine, Brandon Rohrer, and I just finished the first iteration of a project we’ve been working on recently.  Being Cloud Solution Architects, we get a lot of questions about the different compute options that are available in Azure, and it occurred to us that there wasn’t a way to consume this information in a searchable, filterable format that gives customers the information they need.

So we created this site:

https://computeinfo.azurewebsites.net
This site scrapes through the documentation provided by Microsoft, extracts the information about the different types of virtual machines you can create in Azure, and presents it in a way that meets the following criteria:

  • Searchable
  • Filterable
  • Viewable on a mobile device

Hope this helps as you look at your requirements in Azure and build out the appropriate architecture for your solution.