Article: Scaling Azure Table and Blob Storage

This article is excerpted from Chapter 4: Scaling Azure Table and Blob Storage of the Wrox book Cloud Computing with the Windows Azure Platform by Roger Jennings, and is reused by permission of the publisher. This may not be reused without publisher permission.

Scaling Azure Table and Blob Storage

Achieving high scalability requires cloud-based applications and services to be stateless so as not to rely on the data center’s load balancing devices or software to route successive requests from a specific client to a particular logical node. However, most applications and services require access to data persisted in tables that share some characteristics of relational database tables, as well as individual binary large objects (blobs) for storing unstructured data such as images and text documents. Azure Storage Services consist of highly scalable and available persistent storage for the following three types of data:
 

Tables are structured tabular data stored in an Entity-Attribute-Value (EAV) data model; the maximum size of all attribute values of an entity is 1MB. Entities can be grouped into storage partitions, which are maintained in a single location.

Blobs consist of unstructured file-based data stored in an array of bytes; containers store sets of individual blobs up to 50GB in size in hierarchical groups, which emulate a directory structure. Only blob containers and their content are available for public access.

Queues contain an unlimited number of messages stored in tables for processing by global services (often Worker Cloud services); messages have a maximum size of 8KB. Messages usually are deleted after the process that reads them handles them. Queues are the subject of Chapter 8, “Messaging with Azure Queues.”
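As a quick illustration, the size limits quoted above can be captured in a small lookup table and checked before data is sent to storage. This is only a sketch: the names LIMITS and fits are hypothetical, and it assumes binary units (1KB = 1,024 bytes), which the text does not specify.

```python
# Size limits as quoted in the text above; binary units are an assumption.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

LIMITS = {
    "table_entity": 1 * MB,   # combined size of an entity's attribute values
    "blob": 50 * GB,          # a single blob
    "queue_message": 8 * KB,  # a single queue message
}


def fits(kind: str, size_in_bytes: int) -> bool:
    """Return True if a payload of this size is within the quoted limit."""
    return 0 <= size_in_bytes <= LIMITS[kind]
```

For example, fits("queue_message", 9000) returns False because 9,000 bytes exceeds the 8KB message limit under the stated assumption.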

To assure availability and reliability, all stored data consists of a master and two or more replicas stored on different Fault Domains. When Microsoft makes the Azure Services Platform available for general use, data can be replicated to multiple data centers to assure access in the event of a data center’s destruction.

SQL Azure Database (SADB, formerly SQL Data Services, SDS, and SQL Server Data Services, SSDS) is an alternative to Azure Tables that offers many features of relational tables. Microsoft charges a premium for SADB storage and network ingress/egress traffic. SADB is the subject of Chapters 12, “Managing SQL Azure Database Accounts, Databases and DataHubs,” and 13, “Exploiting SQL Azure Database’s Relational Features.”


Creating Storage Accounts

Early Azure CTPs provided testers with two Storage Accounts with each Hosted Service account. A single Storage Account provides a separate URI for tables, blobs, and queues. When you receive a token GUID as the result of a request to sign up for an Azure beta account and create a Hosted Service account as described in Chapter 2’s “The Windows Azure Developer Portal” section, the token enables you to create two Storage Accounts with different prefix names.

Create the First Storage Account with a Hosted Service Token

To create a Storage Account with a token, click the Account tab and its Manage My Tokens link to open the Tokens page, copy and paste the token GUID into the Resource Token ID text box, and click Claim Token to add a Compute Only bucket for the GUID to the Gated Entity’s Storage Accounts group, as shown in Figure 4-1.
 


Figure 4-1: The Azure Developer Portal’s Tokens page with the initial Storage Account for a Hosted Service token.

Clicking the Claim Token and Continue buttons opens the My Projects page. Click the Project Name link to open the project page and click the New Service link to open the Project | Create a new service component page with choices for Storage Account and Hosted Services (see Figure 4-2).
 

 

Figure 4-2: Clicking the Tokens page’s Claim Token button with a token valid for the first Storage Account leads to the Project page.

The Storage Account selection displays the number of Storage Accounts available (project(s) remaining) for the Hosted Service tokens you’ve redeemed. Click the Storage Account icon to open the Create a Project – Project Properties page. Type a unique Project Label and add a Project Description as shown in Figure 4-3.

Click the Next button to open the Create a Project – Storage Account page, add a unique Service Name DNS prefix consisting of lowercase letters and numerals, and click the Check Availability button to ensure the prefix is globally unique for Storage Accounts within all Azure data centers. To keep data in the same data center as the related hosted service, mark the “Yes, this service is related…” and “Create a new Affinity Group” buttons, select from the list of available data centers (regions), and type a name for the region in the text box (see Figure 4-4). When this book was written, only the USA – Northwest (Quincy, WA) and USA – Southwest (San Antonio, TX) data centers were active.
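The character rule for the DNS prefix can be pre-checked locally before you click Check Availability. The following is a minimal sketch, not part of the portal: the function name is_valid_prefix is hypothetical, it enforces only the rule stated above (lowercase letters and numerals), and it does not check global uniqueness or any length limits the portal may apply.

```python
import re

# Only the character rule comes from the text; length limits are not stated
# there and are deliberately not enforced in this sketch.
_PREFIX_PATTERN = re.compile(r"[a-z0-9]+")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if the prefix contains only lowercase letters and numerals."""
    return _PREFIX_PATTERN.fullmatch(prefix) is not None
```

Under this rule, is_valid_prefix("oakleaf2") is accepted, while "OakLeaf" and "oak-leaf" are rejected.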
 


Figure 4-3: Assigning a unique Project Label to a new blob Storage Account in the Create a Project – Project Properties page.

Click Create to generate the new Storage Account and open the Service Name page, which displays an http://dns_prefix.data_type.core.windows.net endpoint for each of the three data types and displays Primary Access Key and Secondary Access Key values for the three endpoints (see Figure 4-5).
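The endpoint pattern above can be expanded for a given prefix as follows. This is a small sketch under the assumption that the data_type segment takes the values blob, queue, and table; the function name storage_endpoints and the prefix used in the example are hypothetical.

```python
# Assumed values for the data_type segment of the endpoint pattern.
STORAGE_TYPES = ("blob", "queue", "table")


def storage_endpoints(dns_prefix: str) -> dict:
    """Build the per-type endpoint URIs for a Storage Account's DNS prefix."""
    return {
        data_type: f"http://{dns_prefix}.{data_type}.core.windows.net"
        for data_type in STORAGE_TYPES
    }
```

Calling storage_endpoints("myaccount") yields one URI per storage type, for example http://myaccount.blob.core.windows.net for blobs.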

Keep the Primary Access Key and Secondary Access Key values confidential because either of the two keys permits access to your storage services when added to the ServiceConfiguration.cscfg file.

Create an Additional Storage Account with a Hosted Service Token

To create the second Storage Account for a Hosted Service, click the New Project link in the Developer Portal’s left panel to open the Project | Create a new service component page, which contains links for new Storage Accounts (refer to Figure 4-2), and proceed with the steps shown in Figures 4-3 through 4-5.
 


Figure 4-4: Assigning a globally unique prefix for a new blob Storage Account in the Create a Project – Storage Account page.

Using or Wrapping the Azure Storage Services’ REST APIs

You don’t need to use a Hosted Service to take advantage of Azure’s scalable storage features. In fact, you can access Storage Accounts and their data with any popular computer language, such as PHP, Python, IronPython, Ruby, IronRuby, Java, C#, or Visual Basic, that’s capable of interacting with web resources by invoking HTTP’s GET, POST, PUT, and other standard methods. Azure Storage Services provides official Representational State Transfer (REST) APIs for the Storage Account and each storage type. REST methods create, retrieve, update, or delete resources that are identified by Uniform Resource Identifiers (URIs). As you’ll see in the sections that follow, using the official REST APIs directly requires passing many HTTP header value strings to System.Net.HttpWebRequest methods and parsing header strings returned in System.Net.HttpWebResponse objects.
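To give a feel for what working with raw header strings looks like outside .NET, here is a hedged Python sketch that only constructs, and does not send, a GET request for listing the blob containers in an account. The function name build_list_containers_request is hypothetical; the comp=list query string and the x-ms-date header are assumed from the Blob service REST conventions; and the Authorization value, whose signing procedure is not covered in this excerpt, is passed in as an opaque placeholder.

```python
from email.utils import formatdate
from urllib.request import Request


def build_list_containers_request(account: str, timestamp: float,
                                  authorization: str) -> Request:
    """Construct (but do not send) a GET request that lists blob containers."""
    # Assumption: the account-level "comp=list" operation lists containers.
    url = f"http://{account}.blob.core.windows.net/?comp=list"
    headers = {
        # RFC 1123 date string, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        "x-ms-date": formatdate(timestamp, usegmt=True),
        # Placeholder: computing a real signature is outside this sketch.
        "Authorization": authorization,
    }
    return Request(url, headers=headers, method="GET")
```

Every value in the headers dictionary is a plain string that the caller must format correctly, which is the “magic strings” burden that a wrapper library is meant to remove.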

Programming with “magic strings” rather than CLR objects is contrary to .NET’s strongly typed object-oriented methodology. Therefore, Azure Tables support a restricted feature set of the .NET Client Library for ADO.NET Data Services (formerly code-named and still called Astoria) to access data in Table Storage with queries composed with the LINQ to REST dialect that return .NET collections. SQL Azure Database (SADB) also supports an Astoria interface. Astoria uses the Atom Syndication Format (www.ietf.org/rfc/rfc4287.txt) for table data retrieval and the Atom Publishing Protocol (AtomPub, www.ietf.org/rfc/rfc5023.txt) for table insertions, updates, and deletions, as mentioned in Chapter 2.
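For readers working without the .NET Client Library, the following Python sketch shows roughly how an entity’s properties might be pulled out of an Atom entry. The sample payload is invented for illustration (the entity and its CompanyName property are hypothetical), the function name entity_properties is hypothetical, and the XML namespaces are assumed to be those used by ADO.NET Data Services; a real response carries additional elements that are omitted here.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for Atom and ADO.NET Data Services payloads.
ATOM = "http://www.w3.org/2005/Atom"
D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

# Invented, trimmed-down sample entry; not taken from a real response.
SAMPLE_ENTRY = f"""<entry xmlns="{ATOM}" xmlns:d="{D}" xmlns:m="{M}">
  <content type="application/xml">
    <m:properties>
      <d:PartitionKey>Customers</d:PartitionKey>
      <d:RowKey>ALFKI</d:RowKey>
      <d:CompanyName>Alfreds Futterkiste</d:CompanyName>
    </m:properties>
  </content>
</entry>"""


def entity_properties(entry_xml: str) -> dict:
    """Map each property element's local name to its text value."""
    root = ET.fromstring(entry_xml)
    props = root.find(f"{{{ATOM}}}content/{{{M}}}properties")
    # Tags look like "{namespace}Name"; keep only the part after "}".
    return {child.tag.split("}", 1)[1]: child.text for child in props}
```

The result is an untyped dictionary of strings, which illustrates why a client library that returns typed .NET collections is attractive.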
 


Figure 4-5: The last (Service Name) page in the process of creating a Storage Account.