Leveraging Azure Search with Python

So lately I've been working on a side project to showcase some of the capabilities of Azure's PaaS services, and the one I've become the most engaged with is Azure Search.

So let's start with the obvious question: what is Azure Search? Azure Search is a Platform-as-a-Service offering that allows you to implement search as part of your cloud solution in a scalable manner.

Below are some links on the basics of “What is Azure Search?”

The first step is to create a search service, and really I find the easiest way is via the Azure CLI:

az search service create --name {name} --resource-group {group} --location {location}

So after you create an Azure Search service, the next part is to create all the pieces needed. For this, I've been working with the REST API via Python to manage these elements, so that is the code you will see here:

  • Create the data source
  • Create an index
  • Create an indexer
  • Run the Indexer
  • Get the indexer status
  • Run the Search

Project Description:

For this post, I'm building a search index that crawls through data compiled from the Chicago Data Portal, which makes statistics and public information available via their API. This solution pulls data from that API into Cosmos DB to make that information searchable. I am using only publicly consumable information as part of this. The information on the portal can be found here.

Create the Data Source

So the first part of any search discussion is that you need a data source you can search. Can't get far without that. So the question becomes: what do you want to search? Azure Search supports a wide variety of data sources, and for the purposes of this discussion, I am pointing it at Cosmos DB. The intention is to search the contents of a Cosmos DB collection and ensure that I can pull back relevant entries.

Below is the code that I used to create the data source for the search:

import json
import requests
from pprint import pprint

#The url of your search service
url = 'https://[Url of the search service]/datasources?api-version=2017-11-11'
print(url)

#The API Key for your search service
api_key = '[api key for the search service]'


headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

data = {
    'name': 'cosmos-crime',
    'type': 'documentdb',
    'credentials': {'connectionString': '[connection string for cosmos db]'},
    'container': {'name': '[collection name]'}
}

data = json.dumps(data)
print(type(data))

response = requests.post(url, data=data, headers=headers)
pprint(response.status_code)

To get the API key, you need the admin (management) key, which can be found with the following command:

az search admin-key show --service-name [name of the service] -g [name of the resource group]

After running the above you will have created a data source to connect to for searching.
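If you want to double-check that the data source was registered, a GET against the same endpoint will list the data sources the service knows about. Below is a minimal sketch, reusing the same placeholder URL and API key as above:

import requests
from pprint import pprint

#Same placeholders as the create call above
url = 'https://[Url of the search service]/datasources?api-version=2017-11-11'
api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

#List every data source registered with the search service
response = requests.get(url, headers=headers)
pprint(response.json())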

Create an Index

Once you have the above data source, the next step is to create an index. The index is what Azure Search maps your data into, and it is what searches will actually run against. So ultimately, think of this as the shape your data will take once it has been indexed. To create the index, use the following code:

import json
import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexes?api-version=2017-11-11'
print(url)

api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

data = {
     "name": "crimes",  
     "fields": [
       {"name": "id", "type": "Edm.String", "key":"true", "searchable": "false"},
       {"name": "iucr","type": "Edm.String", "searchable":"true", "filterable":"true", "facetable":"true"},
       {"name": "location_description","type":"Edm.String", "searchable":"true", "filterable":"true"},
       {"name": "primary_description","type":"Edm.String","searchable":"true","filterable":"true"},
       {"name": "secondary_description","type":"Edm.String","searchable":"true","filterable":"true"},
       {"name": "arrest","type":"Edm.String","facetable":"true","filterable":"true"},
       {"name": "beat","type":"Edm.Double","filterable":"true","facetable":"true"},
       {"name": "block", "type":"Edm.String","filterable":"true","searchable":"true","facetable":"true"},
       {"name": "case","type":"Edm.String","searchable":"true"},
       {"name": "date_occurrence","type":"Edm.DateTimeOffset","filterable":"true"},
       {"name": "domestic","type":"Edm.String","filterable":"true","facetable":"true"},
       {"name": "fbi_cd", "type":"Edm.String","filterable":"true"},
       {"name": "ward","type":"Edm.Double", "filterable":"true","facetable":"true"},
       {"name": "location","type":"Edm.GeographyPoint"}
      ]
     }

data = json.dumps(data)
print(type(data))

response = requests.post(url, data=data, headers=headers)
pprint(response.status_code)

Using the above code, I've identified the data type of each field in the final index, and these all map to the data types supported by Azure Search. The supported data types can be found here.

It's worth mentioning that there are other key attributes above to consider (an example query that uses them is shown after this list):

  • facetable: This denotes whether the data in this field can be faceted. For example, in Yelp if I bring back a search for cost, all restaurants have a "$" to "$$$$$" rating, and I want to be able to group results based on that facet.
  • filterable: This denotes if the dataset can be filtered based on those values.
  • searchable: This denotes whether or not full-text search is performed against the field, and it is limited in the data types that can be used for searching.
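To make those attributes concrete, here is a hedged sketch of what a query using them could look like once the index is populated later in this post. The facet on "arrest" and the filter value for "domestic" are just example values; the actual values depend on the data loaded:

import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexes/crimes/docs?api-version=2017-11-11'
api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

#Match everything, bucket the results by the facetable "arrest" field,
#and narrow them with a filter on the filterable "domestic" field
params = {
    'search': '*',
    'facet': 'arrest',
    '$filter': "domestic eq 'Y'"
}

response = requests.get(url, headers=headers, params=params)
pprint(response.json())

The response should then include an "@search.facets" section with counts for each arrest value alongside the matching documents.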

Creating an indexer

So the next step is to create the indexer, and this is the piece that does the real work. The indexer is responsible for performing the following operations:

  • Connect to the data source
  • Pull in the data and put it into the appropriate format for the index
  • Perform any data transformations
  • Manage pulling in new data on an ongoing basis

The code below creates the indexer:
import json
import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexers?api-version=2017-11-11'
print(url)

api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

data = {
    "name": "cosmos-crime-indexer",
    "dataSourceName": "cosmos-crime",
    "targetIndexName": "crimes",
    "schedule": {"interval": "PT2H"},
    "fieldMappings": [
        {"sourceFieldName": "iucr", "targetFieldName": "iucr"},
        {"sourceFieldName": "location_description", "targetFieldName": "location_description"},
        {"sourceFieldName": "primary_decsription", "targetFieldName": "primary_description"},
        {"sourceFieldName": "secondary_description", "targetFieldName": "secondary_description"},
        {"sourceFieldName": "arrest", "targetFieldName": "arrest"},
        {"sourceFieldName": "beat", "targetFieldName": "beat"},
        {"sourceFieldName": "block", "targetFieldName": "block"},
        {"sourceFieldName": "casenumber", "targetFieldName": "case"},
        {"sourceFieldName": "date_of_occurrence", "targetFieldName": "date_occurrence"},
        {"sourceFieldName": "domestic", "targetFieldName": "domestic"},
        {"sourceFieldName": "fbi_cd", "targetFieldName": "fbi_cd"},
        {"sourceFieldName": "ward", "targetFieldName": "ward"},
        {"sourceFieldName": "location", "targetFieldName":"location"}
    ]
}

data = json.dumps(data)
print(type(data))

response = requests.post(url, data=data, headers=headers)
pprint(response.status_code)

What you will notice is that for each field, two attributes are assigned:

  • targetFieldName: This is the field in the index that you are targeting.
  • sourceFieldName: This is the field name according to the data source.

Run the indexer

Once you've created the indexer, the next step is to run it. This will cause the indexer to pull data into the index:

import json
import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexers/cosmos-crime-indexer/run/?api-version=2017-11-11'
print(url)

api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

reseturl = 'https://[Url of the search service]/indexers/cosmos-crime-indexer/reset/?api-version=2017-11-11'

resetResponse = requests.post(reseturl, headers=headers)

response = requests.post(url, headers=headers)
pprint(response.status_code)

The reset call clears the indexer's tracking state, and the run call then triggers the indexer, which loads the data into the index.

Getting the indexer status

Now, depending on the size of your data source, this indexing process could take some time, so I wanted to provide a REST call that will let you get the status of the indexer.

import json
import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexers/cosmos-crime-indexer/status/?api-version=2017-11-11'
print(url)

api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

response = requests.get(url, headers=headers)
index_list = response.json()
pprint(index_list)

This will provide you with the status of the indexer, so that you can find out when it completes.
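If you'd rather not re-run that call by hand, you can wrap it in a small polling loop. Below is a rough sketch, assuming the same placeholders as above, that checks the indexer's most recent run until it is no longer in progress:

import time
import requests

url = 'https://[Url of the search service]/indexers/cosmos-crime-indexer/status/?api-version=2017-11-11'
api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

#Check the status of the most recent indexer run every 30 seconds
while True:
    status = requests.get(url, headers=headers).json()
    last_result = status.get('lastResult') or {}
    run_status = last_result.get('status', 'inProgress')
    print('Indexer run status: ' + run_status)
    if run_status != 'inProgress':
        break
    time.sleep(30)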

Run the search

Finally, if you want to confirm the search is working, you can run the following:

import json
import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexes/crimes/docs?api-version=2017-11-11'
print(url)

api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

response = requests.get(url, headers=headers)
index_list = response.json()
pprint(index_list)

This will bring back the results of the search. Since no search term was specified, this is an empty search and it returns everything in the index.
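To pull back something more targeted than everything, you can pass a search term and OData options as query parameters. Below is a hedged example; the search term and ward value are placeholders, and it assumes the index has finished populating:

import requests
from pprint import pprint

url = 'https://[Url of the search service]/indexes/crimes/docs?api-version=2017-11-11'
api_key = '[api key for the search service]'

headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

#Full-text search across the searchable fields, filtered to a single ward,
#returning only the first 10 matches
params = {
    'search': 'theft',
    '$filter': 'ward eq 11',
    '$top': 10
}

response = requests.get(url, headers=headers, params=params)
pprint(response.json())

The $filter uses OData syntax against the filterable fields defined in the index.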

I hope this helps with configuring Azure Search. Happy searching :)!

Getting Started with Azure (developer perspective)

So there's a common question I've been getting a lot lately, and that's "I want to learn Azure, where do I start?" And this is ultimately a very reasonable question, because as much as the cloud has permeated the digital world, there are still some organizations that have only recently started to adopt it.

There are many reasons people choose to adopt the cloud: scalability, cost, flexibility, etc. But for today's post I'm going to focus on the idea that you have already decided to go to the Azure cloud and are looking for resources to ramp up. So I wanted to provide those here:

MS Learn: The site provides videos, reading, and walk-throughs that can assist with learning this type of material:

MS Learn for Specific Services: There are several common services out there that many people think of when they think of the cloud, and I wanted to provide some resources here to help with those:

EDX Courses: EDX is a great site with a lot of well-made courses, and there is a wealth of options for Azure and the cloud. Here are a few I thought relevant, but it is not an exhaustive list.

  • Architecting Distributed Applications: One common mistake that many make with regard to the cloud is thinking of it as "just another data center", and that's just not true. To build effective and scalable applications, they need to be architected to take advantage of distributed compute. This course does a great job of laying out how to make sure you are architected to work in a distributed fashion.
  • Microsoft Azure Storage: A great course on the basics of using Azure Storage.
  • Microsoft Azure Virtual Machines: The virtual machine is the cornerstone of Azure, and it provides many options to build and scale out effectively. This is a good introduction to the most basic service in Azure.
  • Microsoft Azure App Service: The most popular service in Azure, App Service enables developers to deploy and configure apps without worrying about the machine running under-the-covers. A great overview.
  • Microsoft Azure Virtual Networks: As I mentioned above, Software Based Networking is one of the key pieces required for the cloud and this gives a good introduction into how to leverage it.
  • Databases in Azure: Another key component of the cloud is the database, and this course talks about the options for leveraging platform-as-a-service offerings for databases to eliminate your overhead for maintaining the VMs.
  • Azure Security and Compliance: Security is another key component, as digital threats are constantly evolving, and Azure provides a lot of tools to protect your workload. This is an essential piece of every architecture.
  • Building your azure skills toolkit: A good beginner course for how to get your skills up to speed with Azure.

For additional tools and resources, I would recommend the following:

Those are just some of the many resources that can be helpful to starting out with Azure and learning to build applications for the cloud. It is not an exhaustive list, so if you have a resource you’ve found helpful, please post it in the comments below.

Building a Solr Cluster with TerraForm – Part 1

So it's no surprise that I've been talking a lot about how amazing TerraForm is, and recently I've been doing a lot of investigation into Solr and how to build a scalable Solr cluster.

So, given the kubernetes template, I wanted to try my hand at something similar. The goals of this project were the following:

  1. Build a generic template for creating a Solr cloud cluster with distributed shards.
  2. Build out the ability to scale the cluster, for now using TerraForm to manually trigger increases to the cluster size.
  3. Make the nodes automatically add themselves to the cluster.

I could do this just using bash scripts and Packer, but instead I wanted to try my hand at cloud-init.

But that's the end result; I wanted to walk through the various steps I go through to get there. The first real step is to get the installation of Solr working on Linux machines.

So let's start with "What is Solr?" The answer is that Solr is an open source software solution that provides a means of creating a search engine. It works in the same vein as ElasticSearch and other technologies. Solr has been around for quite a while and is used by some of the largest companies that implement search to handle search requests from their customers, names like Netflix and CareerBuilder.

So I've decided to try my hand at this and create my first Solr cluster, and I have reviewed the getting started guide.

So I ended up looking into it more, and built out the following script to create a “getting started” solr cluster.

sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
sudo apt-get install -y gnupg-curl
sudo wget -qO - https://www.apache.org/dist/lucene/solr/8.0.0/solr-8.0.0.zip.asc | sudo apt-key add -

sudo apt-get update -y
sudo apt-get install -y unzip
sudo wget http://mirror.cogentco.com/pub/apache/lucene/solr/8.0.0/solr-8.0.0.zip

sudo unzip -q solr-8.0.0.zip
sudo mv solr-8.0.0 /usr/local/bin/solr-8.0.0 -f
sudo rm solr-8.0.0.zip -f

sudo apt-get install -y default-jdk

sudo chmod +x /usr/local/bin/solr-8.0.0/bin/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node1/solr
sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node2/solr
sudo /usr/local/bin/solr-8.0.0/bin/solr -e cloud -noprompt

The above will configure a "getting started" Solr cluster that leverages all the system defaults and is hardly a production implementation, so my next step will be to change this. But for the sake of getting something running, I took the above script and moved it into a Packer template using the following JSON. The script above is referenced as "../scripts/Solr/provision.sh".

{
  "variables": {
    "deployment_code": "",
    "resource_group": "",
    "subscription_id": "",
    "location": "",
    "cloud_environment_name": "Public"
  },
  "builders": [{   
    "type": "azure-arm",
    "cloud_environment_name": "{{user `cloud_environment_name`}}",
    "subscription_id": "{{user `subscription_id`}}",

    "managed_image_resource_group_name": "{{user `resource_group`}}",
    "managed_image_name": "Ubuntu_16.04_{{isotime \"2006_01_02_15_04\"}}",
    "managed_image_storage_account_type": "Premium_LRS",

    "os_type": "Linux",
    "image_publisher": "Canonical",
    "image_offer": "UbuntuServer",
    "image_sku": "16.04-LTS",

    "location": "{{user `location`}}",
    "vm_size": "Standard_F2s"
  }],
  "provisioners": [
    {
      "type": "shell",
      "script": "../scripts/ubuntu/update.sh"
    },
    {
      "type": "shell",
      "script": "../scripts/Solr/provision.sh"
    },
    {
      "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
      "inline": [
        "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
      ],
      "inline_shebang": "/bin/sh -e",
      "type": "shell"
    }]
}

The only other script mentioned is "update.sh", which has the following logic in it to install the Azure CLI and update the Ubuntu image:

#! /bin/bash

sudo apt-get update -y
sudo apt-get upgrade -y

#Azure-CLI
AZ_REPO=$(sudo lsb_release -cs)
sudo echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
sudo curl -L https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
sudo apt-get install -y apt-transport-https
sudo apt-get update && sudo apt-get install -y azure-cli

So the above gets me to a good place for being able to create an image with it configured.

For next steps I will be doing the following:

  • Building a more “production friendly” implementation of Solr into the script.
  • Investigating leveraging cloud init instead of the “golden image” experience with Packer.
  • Building out templates around the use of Zookeeper for managing the nodes.


Configuring Terraform Development Environment

So I've been doing a lot of work with a set of open source tools lately, specifically TerraForm and Packer. TerraForm at its core is a method of implementing true Infrastructure as Code, and it does so by providing a simple, function-style language where you can describe basic implementations for the cloud and then leverage resource providers to deploy. These resource providers allow you to deploy to a variety of cloud platforms (the full list can be found here). It also provides robust support for debugging and targeting, and supports a desired state configuration approach that makes it much easier to maintain your environments in the cloud.

Now that being said, like most open source tools, it can require some configuration for your local development environment and I wanted to put this post together to describe it. Below are the steps to configuring your environment.

Step 1: Install Windows SubSystem on your Windows 10 Machine

To start with, you will need to be able to leverage bash as part of the Linux Subsystem. You can enable this on a Windows 10 machine, by following the steps outlined in this guide:

https://docs.microsoft.com/en-us/windows/wsl/install-win10

Once you’ve completed this step, you will be able to move forward with VS Code and the other components required.

Step 2: Install VS Code and Terraform Plugins

For this guide we recommend VS Code as your editor; VS Code works on a variety of operating systems and is a very lightweight code editor.

You can download VS Code from this link:

https://code.visualstudio.com/download

Once you've downloaded and installed VS Code, we need to install the VS Code extension for Terraform. Open the Extensions view and search for "Terraform".

Then click "Install" and "Reload" when completed. This will give you IntelliSense and support for the different Terraform file types.

Step 3: Opening Terminal

You can then perform the remaining steps from the VS Code application. Go to the "View" menu and select "Integrated Terminal". You will see the terminal appear at the bottom:

By default, the terminal is set to PowerShell; type "bash" to switch to a Bash shell. You can change your default shell by following this guidance – https://code.visualstudio.com/docs/editor/integrated-terminal#_configuration

Step 4: Install Unzip on Subsystem

Run the following command to install "unzip" on your Linux subsystem; this will be required to unzip both Terraform and Packer.

sudo apt-get install unzip

Step 5: Install TerraForm

You will need to execute the following commands to download and install Terraform. We start by getting the latest version of Terraform.

Go to this link:

https://www.terraform.io/downloads.html

And copy the link for the appropriate version of the binaries for TerraForm.

Go back to VS Code, and enter the following commands:

wget {url for terraform}
unzip {terraform.zip file name}
sudo mv terraform /usr/local/bin/terraform
rm {terraform.zip file name}
terraform --version

Step 6: Install Packer

To start with, we need to get the most recent version of Packer. Go to the following URL and copy the link for the appropriate version.

https://www.packer.io/downloads.html

Go back to VS Code and execute the following commands:

wget {packer url} 
unzip {packer.zip file name} 
sudo mv packer /usr/local/bin/packer
rm {packer.zip file name}

Step 7: Install Azure CLI 2.0

Go back to VS Code again, and download / install the Azure CLI. To do so, execute the steps and commands found here:

https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-apt?view=azure-cli-latest

Step 8: Authenticating against Azure

Once this is done you are in a place where you can run terraform projects, but before you do, you need to authenticate against Azure. This can be done by running the following commands in the bash terminal (see link below):

https://docs.microsoft.com/en-us/azure/azure-government/documentation-government-get-started-connect-with-cli

Once that is completed, you will be authenticated against Azure and will be able to run deployments against the various environments.

NOTE: Your authentication token will expire. Should you get a message about an expired token, enter the following command to refresh it:

az account get-access-token 

Token lifetimes are described here – https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-token-and-claims#access-tokens

After that you are ready to use Terraform on your local machine.

Building out Azure Container Registry in Terraform

So I’ve previously done posts on the TerraForm template that I built to support creating a kubernetes cluster. The intention behind this was to provide a solution for standing up a kubernetes cluster in Azure Government. To see more information on that cluster I have a blog post here.

Now one of the questions I did get with it, is “How do we integrate this with Azure Container Registry?” And for those not familiar, Azure Container Registry is a PaaS offering that Azure provides that allows you to push your container images to a docker registry and not have to manage the underlying VM, patching, updates, and other maintenance. This allows you to just pay for the space to store the container images, which admittedly are very small.

The first part of implementing this logic was to create the Container Registry in TerraForm by using the following.

A key note: the "count" variable is used so that this registry will not be created unless you create an "lkma" VM, which is the VM that operates as the master.

resource "azurerm_container_registry" "container-registry" {
  count = "${lookup(var.instance_counts, "lkma", 0) == 0 ? 0 : 1}"
  name                = "containerRegistry1"
  resource_group_name = "${azurerm_resource_group.management.name}"
  location            = "${var.azure_location}"
  admin_enabled       = true
  sku                 = "Standard"

  depends_on = ["azurerm_role_assignment.kub-ad-sp-ra-kv1"]
}

So honestly, it didn't require that much in the way of work. The next part is literally just adding a few lines of code to enable the connection between the registry and the kubernetes cluster. Those lines are the following:

echo 'Configure ACR registry for Kubernetes cluster'
kubectl create secret docker-registry <SECRET_NAME> --docker-server $5 --docker-email $6 --docker-username=$7 --docker-password $8
echo 'Script Completed'

So really that is about it. I've already made these changes to the GitHub template, so please check it out. The above lines of code take the principal information that I pass to the script and use it to connect the Azure Container Registry to my cluster.

Migrating VMs to Azure DevTest Labs

So here’s something I’ve been running into a lot, and at the heart of it is cost management. No one wants to spend more money than they need to and that is especially true if you are running your solutions in the cloud. And for most organizations, a big cost factor can be running lower environments.

So the situation is this: you have a virtual machine, be it a developer machine or some form of lower environment (dev, test, QA, etc.). And the simple fact is that you have to pay for this environment to be able to run testing and ensure that everything works when you deploy to production. So it's a necessary evil, and extremely important if you are adhering to DevOps principles. Now the unfortunate part of this is that you likely only really use these lower environments during work hours. They probably aren't exposed to the outside world, and they probably aren't being hit by customers in any fashion.

So ultimately, that means you are paying for this environment 24/7 but only using it for 12 hours a day, which means you are basically paying for double what you need.

Enter DevTest Labs, which allows you to create VMs and artifacts to make it easy to spin-up and spin-down environments without any effort on your part so that you can save money with regard to running these lower environments.

Now the biggest problem then becomes, "That's great, but how do I move my existing VMs into the new DevTest Lab?" The answer is that it can be done via script, and I've written a script to do that, shown below.

#The resource group information, the source and destination
resourceGroupName="<Original Resource Group>"
newResourceGroupName="<Destination Resource Group>"

#The name of the VM you wish to migrate
vmName="<VM Name>"
newVmName="<New VM Name>"

#The OS image of the VM
imageName="Windows Server 2016 Datacenter"
osType="Windows"

#The size of the VM
vmsize="<vm size>"
#The location of the VM
location="<location>"
#The admin username for the newly created vm
adminusername="<username>"

#The suffix to add to the end of the VM
osDiskSuffix="_lab.vhd"
#The type of storage for the data disks
storageType="Premium_LRS"

#The VNET information for the VM that is being migrated
vnet="<vnet name>"
subnet="<Subnet name>"

#The name of the lab to be migrated to
labName="<Lab Name>"

#The information about the storage account associated with the lab
newstorageAccountName="<Lab account name>"
storageAccountKey="<Storage account key>"
diskcontainer="uploads"

echo "Set to Government Cloud"
sudo az cloud set --name AzureUSGovernment

#echo "Login to Azure"
#sudo az login

echo "Get updated access token"
sudo az account get-access-token

echo "Create new Resource Group"
az group create -l $location -n $newResourceGroupName

echo "Deallocate current machine"
az vm deallocate --resource-group $resourceGroupName --name $vmName

echo "Create container"
az storage container create -n $diskcontainer --account-name $newstorageAccountName --account-key $storageAccountKey

osDisks=$(az vm show -d -g $resourceGroupName -n $vmName --query "storageProfile.osDisk.name") 
echo ""
echo "Copy OS Disks"
echo "--------------"
echo "Get OS Disk List"

osDisks=$(echo "$osDisks" | tr -d '"')

for osDisk in $(echo $osDisks | tr "[" "\n" | tr "," "\n" | tr "]" "\n" )
do
   echo "Copying OS Disk $osDisk"

   echo "Get url with token"
   sas=$(az disk grant-access --resource-group $resourceGroupName --name $osDisk --duration-in-seconds 3600 --query [accessSas] -o tsv)

   newOsDisk="$osDisk$osDiskSuffix"
   echo "New OS Disk Name = $newOsDisk"

   echo "Start copying $newOsDisk disk to blob storage"
   az storage blob copy start --destination-blob $newOsDisk --destination-container $diskcontainer --account-name $newstorageAccountName --account-key $storageAccountKey --source-uri $sas

   echo "Get $newOsDisk copy status"
   while [ "$status"=="pending" ]
   do
      status=$(az storage blob show --container-name $diskcontainer --name $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.status')
      status=$(echo "$status" | tr -d '"')
      echo "$newOsDisk Disk - Current Status = $status"

      progress=$(az storage blob show --container-name $diskcontainer --name $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.progress')
      echo "$newOsDisk Disk - Current Progress = $progress"
      sleep 10s
      echo ""

      if [ "$status" != "pending" ]; then
      echo "$newOsDisk Disk Copy Complete"
      break
      fi
   done

   echo "Get blob url"
   blobSas=$(az storage blob generate-sas --account-name $newstorageAccountName --account-key $storageAccountKey -c $diskcontainer -n $newOsDisk --permissions r --expiry "2019-02-26" --https-only)
   blobSas=$(echo "$blobSas" | tr -d '"')
   blobUri=$(az storage blob url -c $diskcontainer -n $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey)
   blobUri=$(echo "$blobUri" | tr -d '"')

   echo $blobUri

   blobUrl=$(echo "$blobUri")

   echo "Create image from $newOsDisk vhd in blob storage"
   az group deployment create --name "LabMigrationv1" --resource-group $newResourceGroupName --template-file customImage.json --parameters existingVhdUri=$blobUrl --verbose

   echo "Create Lab VM - $newVmName"
   az lab vm create --lab-name $labName -g $newResourceGroupName --name $newVmName --image "$imageName" --image-type gallery --size $vmsize --admin-username $adminusername --vnet-name $vnet --subnet $subnet
done 

dataDisks=$(az vm show -d -g $resourceGroupName -n $vmName --query "storageProfile.dataDisks[].name") 
echo ""
echo "Copy Data Disks"
echo "--------------"
echo "Get Data Disk List"

dataDisks=$(echo "$dataDisks" | tr -d '"')

for dataDisk in $(echo $dataDisks | tr "[" "\n" | tr "," "\n" | tr "]" "\n" )
do
   echo "Copying Data Disk $dataDisk"

   echo "Get url with token"
   sas=$(az disk grant-access --resource-group $resourceGroupName --name $dataDisk --duration-in-seconds 3600 --query [accessSas] -o tsv)

   newDataDisk="$dataDisk$osDiskSuffix"
   echo "New OS Disk Name = $newDataDisk"

   echo "Start copying disk to blob storage"
   az storage blob copy start --destination-blob $newDataDisk --destination-container $diskcontainer --account-name $newstorageAccountName --account-key $storageAccountKey --source-uri $sas

   echo "Get copy status"
   while [ "$status"=="pending" ]
   do
      status=$(az storage blob show --container-name $diskcontainer --name $newDataDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.status')
      status=$(echo "$status" | tr -d '"')
      echo "Current Status = $status"

      progress=$(az storage blob show --container-name $diskcontainer --name $newDataDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.progress')
      echo "Current Progress = $progress"
      sleep 10s
      echo ""

      if [ "$status" != "pending" ]; then
      echo "Disk Copy Complete"
      break
      fi
   done
done 

echo "Script Completed"

So the above script breaks out into a couple of key pieces. In the first part, the following needs to happen:

  • Create the destination resource group
  • Deallocate the machine
  • Create a container for the disks to be migrated to

Those steps are handled by the following commands:
echo "Create new Resource Group"
az group create -l $location -n $newResourceGroupName

echo "Deallocate current machine"
az vm deallocate --resource-group $resourceGroupName --name $vmName

echo "Create container"
az storage container create -n $diskcontainer --account-name $newstorageAccountName --account-key $storageAccountKey

The next step is to identify the OS disk for the VM and copy it from its current location over to the storage account associated with the DevTest Lab. The last part of the loop creates a custom image based on that copied VHD and then creates a VM based on that image.

osDisks=$(az vm show -d -g $resourceGroupName -n $vmName --query "storageProfile.osDisk.name") 
echo ""
echo "Copy OS Disks"
echo "--------------"
echo "Get OS Disk List"

osDisks=$(echo "$osDisks" | tr -d '"')

for osDisk in $(echo $osDisks | tr "[" "\n" | tr "," "\n" | tr "]" "\n" )
do
   echo "Copying OS Disk $osDisk"

   echo "Get url with token"
   sas=$(az disk grant-access --resource-group $resourceGroupName --name $osDisk --duration-in-seconds 3600 --query [accessSas] -o tsv)

   newOsDisk="$osDisk$osDiskSuffix"
   echo "New OS Disk Name = $newOsDisk"

   echo "Start copying $newOsDisk disk to blob storage"
   az storage blob copy start --destination-blob $newOsDisk --destination-container $diskcontainer --account-name $newstorageAccountName --account-key $storageAccountKey --source-uri $sas

   echo "Get $newOsDisk copy status"
   while [ "$status"=="pending" ]
   do
      status=$(az storage blob show --container-name $diskcontainer --name $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.status')
      status=$(echo "$status" | tr -d '"')
      echo "$newOsDisk Disk - Current Status = $status"

      progress=$(az storage blob show --container-name $diskcontainer --name $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey --output json | jq '.properties.copy.progress')
      echo "$newOsDisk Disk - Current Progress = $progress"
      sleep 10s
      echo ""

      if [ "$status" != "pending" ]; then
      echo "$newOsDisk Disk Copy Complete"
      break
      fi
   done

   echo "Get blob url"
   blobSas=$(az storage blob generate-sas --account-name $newstorageAccountName --account-key $storageAccountKey -c $diskcontainer -n $newOsDisk --permissions r --expiry "2019-02-26" --https-only)
   blobSas=$(echo "$blobSas" | tr -d '"')
   blobUri=$(az storage blob url -c $diskcontainer -n $newOsDisk --account-name $newstorageAccountName --account-key $storageAccountKey)
   blobUri=$(echo "$blobUri" | tr -d '"')

   echo $blobUri

   blobUrl=$(echo "$blobUri")

   echo "Create image from $newOsDisk vhd in blob storage"
   az group deployment create --name "LabMigrationv1" --resource-group $newResourceGroupName --template-file customImage.json --parameters existingVhdUri=$blobUrl --verbose

   echo "Create Lab VM - $newVmName"
   az lab vm create --lab-name $labName -g $newResourceGroupName --name $newVmName --image "$imageName" --image-type gallery --size $vmsize --admin-username $adminusername --vnet-name $vnet --subnet $subnet
done 

Now part of the above is to use an ARM template to create the image. The template is below:

{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "existingLabName": {
      "type": "string",
      "defaultValue":"KevinLab",
      "metadata": {
        "description": "Name of an existing lab where the custom image will be created or updated."
      }
    },
    "existingVhdUri": {
      "type": "string",
      "metadata": {
        "description": "URI of an existing VHD from which the custom image will be created or updated."
      }
    },
    "imageOsType": {
      "type": "string",
      "defaultValue": "windows",
      "metadata": {
        "description": "Specifies the OS type of the VHD. Currently 'Windows' and 'Linux' are the only supported values."
      }
    },
    "isVhdSysPrepped": {
      "type": "bool",
      "defaultValue": true,
      "metadata": {
        "description": "If the existing VHD is a Windows VHD, then specifies whether the VHD is sysprepped (Note: If the existing VHD is NOT a Windows VHD, then please specify 'false')."
      }
    },
    "imageName": {
      "type": "string",
      "defaultValue":"LabVMMigration",
      "metadata": {
        "description": "Name of the custom image being created or updated."
      }
    },
    "imageDescription": {
      "type": "string",
      "defaultValue": "",
      "metadata": {
        "description": "Details about the custom image being created or updated."
      }
    }
  },
  "variables": {
    "resourceName": "[concat(parameters('existingLabName'), '/', parameters('imageName'))]",
    "resourceType": "Microsoft.DevTestLab/labs/customimages"
  },
  "resources": [
    {
      "apiVersion": "2018-10-15-preview",
      "name": "[variables('resourceName')]",
      "type": "Microsoft.DevTestLab/labs/customimages",
      "properties": {
        "vhd": {
          "imageName": "[parameters('existingVhdUri')]",
          "sysPrep": "[parameters('isVhdSysPrepped')]",
          "osType": "[parameters('imageOsType')]"
        },
        "description": "[parameters('imageDescription')]"
      }
    }
  ],
  "outputs": {
    "customImageId": {
      "type": "string",
      "value": "[resourceId(variables('resourceType'), parameters('existingLabName'), parameters('imageName'))]"
    }
  }
}

So if you run the above, it will create the new VM under the DevTest lab.

Creating Terraform Scripts from existing resources in Azure Government

Lately I've been doing a lot of work with TerraForm, and one of the questions I've gotten a lot is about the ability to create terraform scripts based on existing resources.

So the use case is the following: You are working on projects, or part of an organization that has a lot of resources in Azure, and you want to start using terraform for a variety of reasons:

  • Being able to iterate on your infrastructure
  • Consistency of environment management
  • A code history of changes

The good news is there is a tool for that. The tool can be found here on GitHub along with a list of pre-requisites. I've used this tool in Azure Commercial and have been really happy with the results, and I wanted to use it with Azure Government.

NOTE => The pre-reqs are listed on the az2tf repo, but one they didn't list that I needed to install was jq, using "apt-get install jq".

Next we need to configure our environment for running terraform. For me, I ran this using the environment I had already configured for Terraform. In the Git repo, there is a PC Setup document that walks you through how to configure your environment with VS Code and Terraform. I was then able to clone the git repo and execute the az2tf tool using an Ubuntu subsystem on my Windows 10 machine.

Now, the tool, az2tf, was built to work with Azure Commercial, and there is one change that has to be made for it to leverage Azure Government.

Once you have the environment created, and the pre-requisites are all present, you can open a “Terminal” window in vscode, and connect to Azure Government. 

In the ./scripts/resources.sh and ./scripts/resources2.sh files, you will find the following on line 9:

ris=`printf "curl -s -X GET -H \"Authorization: Bearer %s\" -H \"Content-Type: application/json\" https://management.azure.com/subscriptions/%s/resources?api-version=2017-05-10" $bt $sub`

Please change this line to the following:

ris=`printf "curl -s -X GET -H \"Authorization: Bearer %s\" -H \"Content-Type: application/json\" https://management.usgovcloudapi.net/subscriptions/%s/resources?api-version=2017-05-10" $bt $sub`

You can then run the “az2tf” tool by running the following command in the terminal:

./az2tf.sh -s {Subscription ID} -g {Resource Group Name}

This will generate the scripts, and you will see a new folder created in the structure named "tf.{Subscription ID}"; inside it will be the generated terraform configuration for the environment.

TerraForm Kubernetes Template

Hello all, I wanted to get a quick blog post out here based on something that I worked on, and that is finally seeing the light of day. I've been doing a lot of work with TerraForm, and one of the use cases I found was standing up a Kubernetes cluster. Specifically, I've been working with Azure Government, which does not have AKS available. So how can I build a kubernetes cluster, minimize the lift of creating it, and then make it easy to add nodes to the cluster? The end result of that goal is here.

Below is a description of the project, and if you'd like to contribute, please do. I have some ideas for phase 2 of this that I'm going to build out, but I'd love to see what others come up with.

Intention:

The purpose of this template is to provide an easy-to-use approach to using an Infrastructure-as-a-service deployment to deploy a kubernetes cluster on Microsoft Azure. The goal being that you can start fresh with a standardized approach and preconfigured master and worker nodes.

How does it work?

This template creates a master node and as many worker nodes as you specify, and during creation it will automatically execute the scripts required to join those nodes to the cluster. The benefit of this is that once your cluster is created, all that is required to add additional nodes is to increase the count of the "lkwn" vm type and reapply the template. This will cause the new VMs to be created, and the cluster will start orchestrating them automatically.

This template can also be built into a CI/CD pipeline to automatically provision the kubernetes cluster prior to pushing pods to it.

This guide is designed to help you navigate the use of this template to standup and manage the infrastructure required by a kubernetes cluster on azure. You will find the following documentation to assist:

  • Configure Terraform Development Environment: This page provides details on how to set up your local machine to leverage this template and do development using Terraform, Packer, and VS Code.
  • Use this template: This document walks you through how to leverage this template to build out your kubernetes environment.
  • Understanding the template: This page describes how to understand the Terraform template being used and walks you through its structure.

Key Contributors!

A special thanks to the following people who contributed to this template:
Brandon Rohrer: who introduced me to this template structure and how it works, as well as assisted with optimizing the functionality provided by this template.

Running Entity Framework Migrations in VSTS Release with Azure Functions

Hello all, I hope you're having a good week. I wanted to write up a problem I just solved that I can't be the only one running into. For me, I'm a big fan of Entity Framework, and specifically Code First Migrations. Now it's not perfect for every solution, but I will say that it does provide a method of versioning and controlling database changes in a more elegant way.

Now one of the problems I've run into before is how you leverage a DevOps pipeline with migrations. The question becomes how you trigger the migrations to execute as part of an automated release, and this can be a pretty sticky situation if you've ever tried to unpack it. The most common answer is this, which is, in itself, a fine solution.

But one scenario I've run into where this doesn't always play out is that, because the above link executes on App_Start, it can cause a slowdown for the first users, or in a load-balanced scenario it can cause a performance issue when each instance hits that method and runs the migration. And to me, from a DevOps perspective, this doesn't feel as "clean" as I would like; I want to deploy my database changes at the same time as my app, and know that when it says "Finished" everything is done and successful. By leveraging that approach you run the risk of a "false positive" saying the deployment was successful even when it wasn't, as the migrations will fail for the first user.

So the alternative I wrote was an Azure Function that provides an HTTP endpoint, allowing me to trigger the migrations to execute. Under this approach I can make an HTTP call from my release pipeline and execute the migrations in a controlled state, and if it's going to fail, it will happen within my deployment rather than after.

Below is the code I leveraged in the azure function:

[FunctionName("RunMigration")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)]HttpRequestMessage req, TraceWriter log)
{
    log.Info("Run Migration");

    bool isSuccessful = true;
    string resultMessage = string.Empty;

    try
    {
        //Get security key
        string key = req.GetQueryNameValuePairs()
            .FirstOrDefault(q => string.Compare(q.Key, "key", true) == 0)
            .Value;

        var keyRecord = ConfigurationManager.AppSettings["MigrationKey"];
        if (key != keyRecord)
        {
            throw new ArgumentException("Key Mismatch, disregarding request");
        }

        Database.SetInitializer(new MigrateDatabaseToLatestVersion<ApplicationDataContext, Data.Migrations.Configuration>());

        var dbContext = new ApplicationDataContext();

        dbContext.Database.Initialize(true);

        //var list = dbContext.Settings.ToList();
    }
    catch (Exception ex)
    {
        isSuccessful = false;
        resultMessage = ex.Message;
        log.Info("Error: " + ex.Message);
    }

    return isSuccessful == false
        ? req.CreateResponse(HttpStatusCode.BadRequest, "Error: " + resultMessage)
        : req.CreateResponse(HttpStatusCode.OK, "Migration Completed, Database updated");
}

Now a couple of quick things to note as you look at the code:
Line 12: I am extracting a querystring parameter called "key", and then on line 16 I am getting the "MigrationKey" value from the Function App's settings. The purpose of this is to quickly secure the Azure Function: it looks for a querystring value that matches the app settings value before allowing the migrations to be triggered, which prevents just anyone from hitting the endpoint. I can then secure this value within my release management tool and pass it as part of the HTTP request.

Lines 22-26: this is what actually triggers the migration to execute by creating a context and setting the initializer.

Line 37: Handles the response code that is sent back to the client for the HTTP request. This allows me to throw an error within my release management tool if necessary.
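To round this out, the call from the release pipeline is just an HTTP POST with the function key and the migration key. Below is a rough sketch using Python and requests; the function URL, function key, and migration key are placeholders that would normally come from secure release variables:

import sys
import requests

#Placeholder values - in a real pipeline these come from secure release variables
function_url = 'https://[your function app].azurewebsites.net/api/RunMigration'
function_code = '[function key]'                  #required because the trigger uses AuthorizationLevel.Function
migration_key = '[MigrationKey app setting value]'

params = {
    'code': function_code,
    'key': migration_key   #checked inside the function against the MigrationKey app setting
}

response = requests.post(function_url, params=params)
print(response.status_code, response.text)

#Fail the release step if the migration did not return a 200
if response.status_code != 200:
    sys.exit(1)

That way, a failed migration fails the release itself rather than surfacing later.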

Getting Started

So I thought I would start this new direction for the blog with a post about a topic I get asked about a lot.

“I want to get started in programming, how do I do that?”

And this is a great question, and one that makes a lot of sense to me as the lines between technology and business are blurring. More and more people are interacting with developers in their daily life as part of their current jobs, and it's leading to people's eyes being opened to the opportunities in this space.

My next question is normally "Why?" and at first that usually takes people aback, but this is an important thing to ask yourself. I ask this because, to be honest, switching fields and taking on something like becoming a developer isn't an easy journey, and if your motives aren't clear then you're going to set yourself up for failure. I generally think it's smart to ask if this is a worthwhile investment of your time, because as much as I love this industry, it can be quite brutal at times.

Don't get me wrong, I'm not making these statements as some grand arbiter who decides if you are worthy of becoming an almighty developer. I make these statements because the simple truth is that to work as a developer and achieve success, you need to be willing to accept the reality, which is less Minority Report and Steve Jobs, and more the craziness of Silicon Valley.

I would tell you to ask the following questions:

  • Do you like continual education?  Are you willing to read about this stuff in your spare time?
  • Do you like to tinker with things?
  • Do you have “Grit”?

For the final question, specifically I’m referring to the fantastic book by Angela Duckworth, that describes Grit as basically being the intersection of Passion and Perseverance, and that it is the most important part of any equation where someone is hoping for success.  And I would argue, even more so true for developers.

If you look online at the “successful developers” they all have one thing in common…they live for this stuff.  And spend a lot of time doing it, and finding new ways to challenge themselves and push boundaries.  They are constantly looking for ways to change their mindset to find new opportunities and directions.  I don’t claim to be a famous developer, but I can tell you that I’m proud of where I’ve gone in my career and I genuinely love what I do, and much like those “famous developers”, my wife describes me as a “well documented nerd”.

So now the important question is, did I lose you?  If not, I think this is a rewarding career option that can take you in some interesting directions, but you need to know that it will be a slow burn.  This is not something where you will be writing award winning apps by Monday if you start on Friday.

Below are some links to help you out, feel free to reach out with questions, I’ve tried to provide a lot of training material and some notes about each link:

Visual Studio 2017 Community Edition – This is the primary IDE (integrated development environment) for all things in the Microsoft stack. And the community edition is free, which is even better.

https://www.visualstudio.com/downloads/

When you go to install it, it's going to ask you to customize the install by selecting different packages and whatnot.

Great start: C# fundamentals for Absolute Beginners:

https://mva.microsoft.com/en-US/training-courses/16169?l=Lvld4EQIC_2706218949

For training materials I would recommend the following:

Microsoft Virtual Academy – https://mva.microsoft.com/

This is a great site for a lot of training content Microsoft generates to help. I would point you to the absolute beginner classes as well as the learning paths. They also do a good job of categorizing training (100 level, 200 level etc)

C# Courses:

https://mva.microsoft.com/training-topics/c-app-development#!jobf=Developer&lang=1033

Visual Studio Training:

https://mva.microsoft.com/product-training/visual-studio-courses#!jobf=Developer&lang=1033

Getting started with Visual Studio 2017

https://mva.microsoft.com/en-US/training-courses/getting-started-with-visual-studio-2017-17798?l=9oIw0FD6D_3611787171

Learning Paths:

https://mva.microsoft.com/LearningPaths.aspx

Channel 9 – https://channel9.msdn.com/

Great site for general videos, and is updated all the time.

I recommend web development as a good place to start; the .NET web platform is called ASP.NET and uses HTML, C#, and some JavaScript. (C# and JavaScript syntax are pretty close.)

https://www.asp.net/get-started

https://www.asp.net/mvc/overview/getting-started

This is probably a good start.