Eli Weinstock-Herman

Real World Azure: Lease Container bug in Azure Storage API

Original post posted on November 16, 2015 at LessThanDot.com

Recently we’ve been working with the raw Azure Storage API to try to get to a more stable solution than the far more aggressively changing Azure Storage SDK. One of the goals is to be able to work equally well locally, against the emulator, and in production. We’re used to cases where the emulator diverges from production or the documentation, but recently we found a case where the emulator and documentation match and the production service appears to be wrong.

Real World Azure

There are a lot of great resources out there on Azure, from demos to webcasts to white papers filled with architectural diagrams. This is to be expected. Microsoft products tend to focus on the 15-minute demo or the polished architecture diagram in an enterprise white paper, a controlled exposure of only a subset of the functionality you will use in the real world.

I have used Azure daily for years on live business and personal projects, not demos: supporting production systems running hundreds of millions of storage transactions, figuring out why a change to the Azure Management API limits sends certain legacy code into a death spiral, working directly with the APIs in 3-4 different languages, and getting through months where we had 2-4 active support cases at any time. These are examples found in the real, production world.

The prior “real world azure” post (September 2013) covered an Azure Queue API bug that is still present today.

This newer bug is more minor, unless you are relying on the error codes to be correct, in which case it’s kind of painful. It’s also concerning because, while we don’t build Storage APIs and SDKs for a living, we caught this in our integration tests relatively quickly, but it appears to have been missed in Microsoft testing thus far.

What is Azure Blob Storage?

The shortest explanation I can provide for the Azure Blob service is to think of it as an infinitely wide file system. Azure blobs reside in Containers (folders). We can Lease Containers or Blobs (think of leases as similar to file locks that can optionally release automatically at a future time). Once a Container or Blob is Leased, only operations that include the correct lease ID are allowed to operate on it (except in some cases where having no lease is still allowed, like read/download).

Leasing Non-Existent Containers

The Azure REST API documentation outlines all of the errors you can expect to get back, nicely broken down into a common set of errors and service-specific lists (Blob Service Errors).

The two error codes we are looking at are:

ContainerNotFound: Not Found (404) – The specified container does not exist.
BlobNotFound: Not Found (404) – The specified blob does not exist.
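
Both codes share the same 404 status, so the status code alone cannot tell them apart. As a rough sketch, assuming the standard Storage error body (an Error element with Code and Message children), the code can be pulled out of a raw response like this. The StorageErrorParser class is a hypothetical name used only for illustration:

```csharp
using System;
using System.Xml.Linq;

public static class StorageErrorParser
{
    // Hypothetical helper: returns the <Code> value from a Storage error body,
    // or null when the body is empty or has no <Code> element.
    public static string GetErrorCode(string responseBody)
    {
        if (string.IsNullOrWhiteSpace(responseBody))
        {
            return null;
        }

        // A leading byte order mark would make XDocument.Parse fail, so strip it first.
        var xml = responseBody.TrimStart('\uFEFF');
        var code = XDocument.Parse(xml).Root?.Element("Code");
        return code?.Value;
    }
}
```

Comparing the returned string against ContainerNotFound or BlobNotFound is then a plain string comparison, independent of any SDK version.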

A test for the Lease Container Operation can be implemented using the SDK like this:

ContainerNotFoundReturnsWrongError.cs

/// <summary>
/// Emulator: Returns 404 Container Not Found (tested with 3.3 and other versions)
/// Azure API: Returns 404 Blob Not Found (tested with 3.3 and other versions)
/// </summary>
[Test]
public void AcquireLease_NonExistentContainer_ReturnsContainerNotFoundError()
{
    var blobClient = _account.CreateCloudBlobClient();
    var containerReference = blobClient.GetContainerReference("nonexistent-container");

    int statusCode = -1;
    string status = "not defined";
    try
    {
        containerReference.AcquireLease(TimeSpan.FromSeconds(15), Guid.NewGuid().ToString());
    }
    catch (StorageException exc)
    {
        statusCode = exc.RequestInformation.HttpStatusCode;
        status = exc.RequestInformation.HttpStatusMessage;
    }

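    // ErrorCode_NotFound and ErrorStatus_ContainerNotFound are constants defined elsewhere in the test fixture.
    Assert.AreEqual(ErrorCode_NotFound, statusCode);
    Assert.AreEqual(ErrorStatus_ContainerNotFound, status);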
}

(The same test file also contains raw HTTP implementations to verify that this is not an SDK error, which is also why we’ll look at the response at the network level using Fiddler.)
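
For readers who have not used the raw API, here is a minimal sketch of what a Lease Container request looks like at the HTTP level. It is not the code from the test file: the LeaseContainerRequest class and its parameters are hypothetical, the request is built but never sent, and the Authorization header (Shared Key signing) is left out to keep the example short:

```csharp
using System;
using System.Globalization;
using System.Net.Http;

public static class LeaseContainerRequest
{
    // Hypothetical helper: builds (but does not send) an unsigned Lease Container request.
    public static HttpRequestMessage Build(Uri blobEndpoint, string container, string apiVersion)
    {
        // Lease Container is a PUT against the container with comp=lease&restype=container.
        var baseUri = blobEndpoint.ToString().TrimEnd('/');
        var uri = new Uri(baseUri + "/" + container + "?comp=lease&restype=container");
        var request = new HttpRequestMessage(HttpMethod.Put, uri);

        request.Headers.Add("x-ms-version", apiVersion);
        request.Headers.Add("x-ms-date", DateTime.UtcNow.ToString("R", CultureInfo.InvariantCulture));
        request.Headers.Add("x-ms-lease-action", "acquire");
        request.Headers.Add("x-ms-lease-duration", "15"); // seconds, matching the SDK test above
        request.Headers.Add("x-ms-proposed-lease-id", Guid.NewGuid().ToString());

        // Assumption: the caller adds the Authorization header before sending.
        return request;
    }
}
```

Pointing blobEndpoint at the emulator (http://127.0.0.1:10000/devstoreaccount1) or at a production account's blob endpoint is the only difference between the two runs, which is what makes the differing error codes easy to compare.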

On the local emulator, this returns a 404 with the ContainerNotFound error (Fiddler capture):

[Screenshot: LeaseContainer - local Emulator response (Fiddler)]

Against a production API, it returns a 404 with the BlobNotFound error instead (Fiddler capture):

[Screenshot: LeaseContainer - Live Azure Response (Fiddler)]

In this case, the emulator is correct, but the production Storage API returns the wrong error: BlobNotFound instead of ContainerNotFound.

I tested this against multiple versions of the API, locally and in the cloud, and got the same results: the production Storage API returns the wrong error code for LeaseContainer operations.
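
If your code depends on the error code, one possible workaround is to normalize it where container lease calls are made. This is only a sketch under the assumption that a container-level lease request can never legitimately fail because of a missing blob; LeaseErrorNormalizer is a hypothetical name, not part of the SDK or of our tests:

```csharp
using System;

public static class LeaseErrorNormalizer
{
    // Hypothetical workaround: a container-level lease request never targets a blob,
    // so a BlobNotFound code on that operation is treated as ContainerNotFound.
    // Every other code (including null) is passed through unchanged.
    public static string NormalizeContainerLeaseError(string errorCode)
    {
        return string.Equals(errorCode, "BlobNotFound", StringComparison.Ordinal)
            ? "ContainerNotFound"
            : errorCode;
    }
}
```

Apply it only to container lease operations; on blob operations BlobNotFound is a legitimate answer and should be left alone.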

Comments are available on the original post at lessthandot.com