Copying blobs between storage accounts / regions

A common question I get is how to copy blobs. If you are working with Azure Blob Storage, it’s pretty much inevitable that you will need to do a data copy at some point, whether for a migration, a re-architecture, or any number of other reasons.

I’ve seen all sorts of approaches to doing a data copy. I’m going to talk through the options here, and ultimately how best to execute a copy within Azure Blob Storage.

I want to start with the number one DO NOT DO option: “build a utility to cycle through and copy blobs one by one.” This is the least desirable option for moving data, for a few reasons:

  • Speed – A simple loop like this is a single-threaded, synchronous operation, so blobs are copied one at a time.
  • Complexity – This feels counter-intuitive, but verifying that the data copied correctly, building fault handling, and so on is not easy, and it’s not something you want to take on when you don’t have to.
  • Chances of Failure – Long-running processes are always problematic. They can fail, and when they do they can be difficult to recover from, so you are opening yourself up to potential problems.
  • Cost – At the end of the day, you are creating a long-running process that needs compute running 24/7 for an extended period. Compute in the cloud costs money, so this is an additional cost.

So the question is: if I shouldn’t build my own utility, how do I get this done? There are really two options that I’ve used successfully in the past:

  • AzCopy – This is the tried and true option. This utility provides an easy command line interface for kicking off copy jobs that can be run either synchronously or asynchronously. Even in its synchronous mode, you will see higher throughput than with a one-by-one utility. This removes some of the issues from above, but not all.
  • Copy API – A newer option: the REST API exposes a copy operation. This provides the best possible throughput and saves you from having to create a VM, because the copy runs asynchronously inside Azure. The API is easy to use and documentation can be found here.
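To give a feel for the Copy API option, here is a minimal sketch in Python using only the standard library. It builds, but does not send, a Copy Blob request: a PUT against the destination blob with the source URL in the `x-ms-copy-source` header. The account names, container, blob name, and SAS placeholders are all hypothetical, the `build_copy_request` helper is my own name for illustration, and the `x-ms-version` value is just one example version. It also assumes both URLs carry SAS tokens that authorize the read and the write.

```python
import urllib.request

# Assumption: an example service version; use whichever version you target.
API_VERSION = "2020-10-02"


def build_copy_request(source_url: str, dest_url: str) -> urllib.request.Request:
    """Build (but do not send) a Copy Blob request: a PUT on the destination blob."""
    headers = {
        "x-ms-copy-source": source_url,  # where the service should read from
        "x-ms-version": API_VERSION,
        "Content-Length": "0",  # the request carries no body
    }
    return urllib.request.Request(dest_url, headers=headers, method="PUT")


if __name__ == "__main__":
    # Hypothetical accounts and SAS placeholders; replace with your own values.
    src = "https://sourceacct.blob.core.windows.net/data/report.csv?SOURCE_SAS"
    dst = "https://destacct.blob.core.windows.net/data/report.csv?DEST_SAS"
    req = build_copy_request(src, dst)
    print(req.get_method(), req.full_url)
    # To actually start the copy you would send it:
    # urllib.request.urlopen(req)
```

When the request is accepted, the service answers with 202 Accepted and returns `x-ms-copy-id` and `x-ms-copy-status` headers. Because the copy runs inside Azure, your own process only needs to check the destination blob’s copy status later rather than stay up for the whole transfer.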

Ultimately, there are lots of ways and scenarios in which you can leverage these tools to copy data. The other question that usually comes up is: if I’m migrating a large volume of data, how do I do it while minimizing downtime?

The way I’ve accomplished this is to break the data down by age:

  • Sort the data by age, oldest to newest.
  • Starting with the oldest blobs, break them down into the following chunks:
  • Move the first 50%
  • Move the next 30%
  • Move the next 10-15%
  • Take a downtime window to copy the last 5-10%

By doing so, you minimize your downtime window while maximizing the amount of data copied in the background. Note that this process only works well if your newer data is accessed more often than your older data. When that holds, it is a good option for moving your blobs and minimizing downtime.
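The age-based breakdown above can be sketched roughly as follows. This is only an illustration of the planning step, not a copy tool: `BlobInfo` and `plan_waves` are hypothetical names, the blob list is assumed to come from wherever you list your container, and the 50 / 30 / 15 split (leaving the last 5% for the downtime window) is just one choice within the ranges above.

```python
from datetime import datetime
from typing import List, NamedTuple, Sequence


class BlobInfo(NamedTuple):
    """Hypothetical record for a blob: just a name and a last-modified time."""
    name: str
    last_modified: datetime


def plan_waves(
    blobs: Sequence[BlobInfo],
    percents: Sequence[int] = (50, 30, 15),
) -> List[List[BlobInfo]]:
    """Split blobs, oldest first, into copy waves.

    Each entry in `percents` is one background wave. Whatever is left over
    (the newest blobs) becomes the final wave, copied during downtime.
    """
    ordered = sorted(blobs, key=lambda blob: blob.last_modified)
    waves = []
    start = 0
    cumulative = 0
    for percent in percents:
        cumulative += percent
        # Integer arithmetic keeps the wave boundaries free of float rounding.
        end = len(ordered) * cumulative // 100
        waves.append(ordered[start:end])
        start = end
    waves.append(ordered[start:])  # the downtime-window wave
    return waves
```

Each wave except the last can be handed to AzCopy or the Copy API while the system stays online. Only the final, newest wave needs the downtime window.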

Weekly Links – 4/27

So I know this post got out a little late today, and for that I do apologize. Home schooling has been interesting at our house. It’s definitely been a transition for everyone involved, but overall it’s been going well. My kids have been real troopers. My wife is the most amazing woman on the planet, and has been the expert in all of this.

My daughter has set up a desk in my office, so it’s nice to think that during COVID, my home office has basically become a WeWork.

Down to business…

Fun Stuff:

So I’ve made no secret of my enjoyment of outdoor cooking, and over the past two weeks I’ve done a lot of work with my Weber smoker. This past week turned out great: I went from a small chicken to an 11 lb brisket! Twelve hours later, we had an amazing meal.

Weekly links – 10/14

So this past week, I spent every free moment working on a shed in my backyard, and like any construction project it’s had a slew of delays. But we are powering through:

Down to business…

Development:

Cloud:

Audio / Video:

Fun Stuff:

As always, I’m a big comic fan, and I’ve said before that I’m a fan of the CW Arrowverse. For as much as DC movies are terrible, their TV shows are quite excellent. The standout last year was Supergirl, which really tapped into what makes for the best Superman / Supergirl stories: the best ones are all based around problems that they can’t “super power their way out of”. Last season tackled real topics like trust of the media, xenophobia, racism, and others. This season is already moving towards tackling technology and its ability to change the way that we view reality and connect with each other.

Weekly Links – 9/23

So I know I’m a little late this week, but here they are. I was away at a conference in sunny Las Vegas for the week, and it was quite the week.

But anyway, down to business:

Development:

  • Cascadia Code is live: I normally don’t care about a font, but this one is pretty cool because of its support for ligatures. It makes code much easier to read, which is pretty awesome.
  • .NET Conf: Really cool virtual conference with more materials and announcements. Next week should have a lot of new announcements.

Cloud:

Audio / Video:

Fun Stuff:

As always, to live up to our name, here’s a nerd topic for the links. I’m a big Batman fan, always have been. I’m pretty sure my kids knew who Batman was long before they knew Big Bird. With that, I’ve been enjoying the current comic run with Tom King as the writer. It is coming to an end, and they announced the new writer, James Tynion IV, a great writer who wrote Batman Eternal, so I’m very excited. To read more, look here.
