Microsoft Azure RBAC

What is Azure RBAC?

RBAC stands for role-based access control. A bit of history to begin with might help you understand this better. Before RBAC there were two basic roles in Azure: 1) service administrator and 2) co-administrator for a subscription (this is of course outside of the EA roles such as Account/Department administrator).

Azure Subscription roles

Members in these roles can do everything in a subscription, right from VNet creation and VM creation to accessing logs and what not. Co-admins are added from the management portal, and you can have more than one co-admin. The service administrator is added by the Account Manager (added/managed from the enterprise portal). Members in both of these roles cannot see billing details (in the case of an Enterprise Account), and for any tickets you open with Microsoft support, the service administrator's email is used for communication.

 The distributed world of application development and infrastructure management requires much more than that. You may want a few members to take care of just VMs, a few others to handle Storage, and so on. You may want to assign different privileges within those areas too: some members could only view resources while others could create them. That, basically, is fine-grained access control, and that's what RBAC is all about. As of this writing (02/01/2015) RBAC is not yet that complete; think of it as v1 on the way toward that goal.

 As of this writing you have three built-in roles (Owner, Contributor and Reader) available for assignment to users, groups and services at three Azure scopes: subscription, resource group and resource. You can manage access using the Azure portal, the command-line tools, or the REST API for bulk operations.
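For bulk operations, a script can talk to the Resource Manager REST API directly. Below is a minimal sketch of constructing such a role-assignment request; note that the subscription ID, principal and role-definition GUIDs, the resource-group name and the api-version string here are placeholders of our own, not values from this post.

```python
import json
import uuid

# Sketch of the request a bulk-management script would issue to create
# a role assignment through the Azure Resource Manager REST API.
# NOTE: the subscription ID, principal ID, role-definition ID and
# api-version used below are placeholders, not real values.
def role_assignment_request(scope, principal_id, role_definition_id, api_version):
    assignment_id = str(uuid.uuid4())  # each new assignment gets a client-chosen GUID
    url = ("https://management.azure.com{0}/providers/"
           "Microsoft.Authorization/roleAssignments/{1}"
           "?api-version={2}").format(scope, assignment_id, api_version)
    body = json.dumps({"properties": {
        "roleDefinitionId": role_definition_id,
        "principalId": principal_id,
    }})
    return "PUT", url, body

method, url, body = role_assignment_request(
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/demo-rg",
    "11111111-1111-1111-1111-111111111111",
    "/subscriptions/00000000-0000-0000-0000-000000000000/providers/"
    "Microsoft.Authorization/roleDefinitions/22222222-2222-2222-2222-222222222222",
    "2014-10-01-preview",
)
```

Looping such a call over a list of users or resource groups is what makes the REST route attractive for bulk assignments compared with clicking through the portal.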

What you can achieve today in RBAC is depicted in the graphic below:



The important architectural element worth noting is that these roles can actually be your IDMS roles (from your on-premises AD, which is federated). So that's all for the introduction to Azure RBAC. I will elaborate further with an example in a separate blog post.



Windows Azure caching 101

In Windows Azure there are three options for caching: the Shared Caching Service, in-role caching, and the Azure Cache Service.

  1. “Shared Caching Service”: this was caching on a shared cluster, accessed using a secret key. It is a multi-tenant offering that enforced throttling behavior, and hence many Windows Azure customers didn’t like it. It is being retired sometime around August 2014. Incidentally, it does not even appear in the current HTML portal, so many people don’t know this mechanism exists.
  2. “In-role cache”

This is an offering where you can specify that a portion of your web role or worker role be used for caching purposes.

web role for caching

worker role for dedicated caching

You can configure this in a cloud service project in Visual Studio:

Visual Studio Settings for caching

In the web role properties, you can select the Caching section and turn on caching by ticking the “Enable Caching” check box.

If you use the “co-located role” model, you can specify what percentage of the web role's memory goes toward the cache. If you select the dedicated role model (second graphic above; please note that dedicated caching is only supported on worker roles and cannot be configured on web roles), the entire worker role is dedicated to caching. Billing for in-role caching is the same as compute billing for web/worker roles, and it is available in Small (1.75 GB), Medium (3.5 GB), Large (7 GB) and Extra Large (14 GB) sizes. Although the compute roles have the stated memory, please note that the OS consumes some of those resources.
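As a back-of-the-envelope illustration of the co-located sizing described above (the role memory figures are from the sizes listed; the helper name is our own, and because of the OS overhead caveat the real usable figure will be somewhat lower):

```python
# Rough sketch of co-located cache sizing for in-role caching.
# Role memory sizes (GB) come from the size list above; the cache
# percentage is what you set in the Visual Studio Caching section.
ROLE_MEMORY_GB = {"Small": 1.75, "Medium": 3.5, "Large": 7.0, "ExtraLarge": 14.0}

def colocated_cache_gb(role_size, cache_percent):
    """Approximate memory reserved for cache in a co-located role.

    This ignores the memory the OS itself consumes, so treat the
    result as an upper bound rather than the usable cache size.
    """
    return ROLE_MEMORY_GB[role_size] * cache_percent / 100.0

print(colocated_cache_gb("Medium", 30))  # 30% of a 3.5 GB Medium role
```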

  3. “Cache Service”

Cache Service is the latest offering, and it brings the best of both worlds. While in-role caching is usable only from within the cloud service, Cache Service exposes cached data on a public endpoint secured by a secret key. It has a few other really wonderful aspects, such as high availability for the cached data: data is cached across a cluster from a failover perspective, so both the service and the data itself are highly available. It is not a shared service like the original Shared Caching Service, so there is no throttling! There are also a few advanced features, such as notifying the client when the cache changes, which make it a really reliable and advanced cache service offering.

Note on memcached:

memcached is a high-performance, distributed memory object caching system, generic in nature, but originally intended for use in speeding up dynamic web applications by alleviating database load. The system uses a client–server architecture. The servers maintain a key–value associative array; the clients populate this array and query it. Keys are up to 250 bytes long and values can be at most 1 megabyte in size. Clients use client-side libraries to contact the servers which, by default, expose their service at port 11211. Each client knows all servers; the servers do not communicate with each other.
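The “each client knows all servers” design can be sketched in a few lines. The following is an illustration of client-side key-to-server routing using a naive modulo scheme; real memcached client libraries typically use consistent hashing so that adding or removing a server remaps as few keys as possible, and the server addresses here are made up.

```python
import hashlib

# Minimal illustration of memcached-style client-side routing: every
# client holds the full server list and maps each key to one server
# locally; the servers never talk to each other.  Real clients use
# consistent hashing rather than this simple modulo scheme.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def server_for(key):
    """Pick the server responsible for `key` (deterministic per key)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

print(server_for("session:42"))  # always routes to the same server
```

Because the mapping is computed on the client, a get or set for a given key involves exactly one server round trip, which is what makes the architecture scale horizontally.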

If you have existing applications that use memcached, you can readily use them in Windows Azure. Windows Azure Caching supports almost every API that other memcached implementations support. As of the writing of this blog post, memcached in Windows Azure works with the in-role caching mechanism; in the future it may be made available with the Cache Service too.


Windows Azure Portal View restructured to filter by directory

Just noticed today (22 Oct 2013) that the Windows Azure portal now filters the view based on Azure Active Directory. Have a look at the graphic below and you will see what I am talking about:

Azure subscription by directory

If you log in to the portal today, you will see the following message: “The subscriptions view has been restructured to filter by directory. Resources shown here belong to subscriptions associated with the ‘Default Directory’ directory only.” You cannot view details for all directories at once.

What does this mean?

Basically this means that when a subscription is created for a user, a Microsoft Azure directory (<<your Microsoft account alias>>) is created, and the current Microsoft account (user) becomes part of that directory. Co-admins are now added as part of this directory. Essentially, a directory container is added to the Azure subscription.

Think about the number of Azure AD tenants being provisioned now: every Azure subscription gets a directory. Phew, that is a lot! And are users getting authenticated against Azure AD/ACS? Yes.

Why is Microsoft doing this? My thinking is that this will help a lot of enterprise customers start federating their on-premises AD to Azure AD and authenticating from there, avoiding the hassle of creating yet more Microsoft accounts. This should also mean rapid adoption and understanding of Windows Azure Active Directory. Sweet!




Our setup of HortonWorks HDP 1.1 on Windows VMs


Many of our customers have started talking about Big Data and how they can best make sense of the data they have collected over many years. Apache Hadoop is one of the tools we suggest to them.

Being Windows guys, with only basic hands-on skills in Linux administration and management, we faced challenges setting up Hadoop on Linux.

HortonWorks changed this and made it much easier to use Hadoop on Windows with their easy-to-use releases.

We started off by playing with the HortonWorks Sandbox and liked what we saw. It is very simple to set up (just install VMware or VirtualBox) and has a web GUI for everything. We sailed through their very comprehensive tutorials on Hive and Pig.

Our next step was to ssh (for us Windows guys, “remote desktop”) into the virtual machine and play around with the CLI. But we did not notice many differences when using the CLI.

But the sandbox limits us to a single node, so we were unable to test the limits of Hadoop or any of its components. (Having admin access on the machine, we did think about modifying the setup to build a cluster, but decided not to.)

We went through an excellent article on setting up the release on Linux machines, but it did not meet our requirement (it should be simple to manage and administer for Windows users). We read about the beta release of HDInsight and signed up for the beta.

Then we focused on the HortonWorks Hadoop release for Windows and decided to set up a cluster. For our hardware, rather than requisitioning our own, we instantiated multiple Windows VMs on Windows Azure. (A later-realized advantage: we captured an image of the Hadoop slave, and now we can instantiate slaves on demand.)

We decided on a two-node setup (master = 1, slave = 1; the master also hosts the secondary name node on the same machine).

The installation steps are clearly documented by HortonWorks.

We tried deviating from the documentation, but realized that we should not have. All in all, following the documented steps is the best way to install it.

The points where we deviated (but we suggest you follow the documentation):

  • We thought that installing the JRE would be good enough. But Oozie depends on the JDK and does not work with just the JRE.
  • After setting up and hitting the problems with Oozie, we uninstalled the JRE and installed the JDK instead. Assuming that correcting the JAVA_HOME environment variable would be enough, we tried, in vain, to restart the Hadoop services. Finally we solved the problem with some further modifications.
  • We had not noticed the fourth column in the configuration table, and had assumed that the third column contained default values, so we ignored many of the variables in our configuration. Point to note: all variables are mandatory.
  • If installation of the MSI fails, the MSI itself reports no failure. So remember to check the installation log file, and check the installation folder, to confirm that it actually installed.
  • If by chance the MSI does not install correctly, remember to go to Control Panel and uninstall “HortonWorks Data Platform 1.1.0 for Windows” before trying the MSI again; otherwise the installation will keep failing (checking the logs, it looks as if the MSI tries to uninstall the product first, but since the product is not installed, the uninstall fails: a chicken-and-egg problem).
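The JDK-vs-JRE pitfall above can be caught up front. Here is a small helper of our own (not part of the HDP installer) that checks whether JAVA_HOME points at a full JDK by looking for the javac compiler, which a JRE does not ship:

```python
import os
import sys

def looks_like_jdk(java_home):
    """Heuristic check that a Java home directory is a full JDK.

    A JRE ships the `java` launcher but not the `javac` compiler, so
    the presence of bin/javac (javac.exe on Windows) is a quick way
    to tell the two apart before installing HDP.
    """
    exe = "javac.exe" if sys.platform.startswith("win") else "javac"
    return os.path.isfile(os.path.join(java_home, "bin", exe))

java_home = os.environ.get("JAVA_HOME", "")
if not java_home or not looks_like_jdk(java_home):
    print("JAVA_HOME is unset or points at a JRE; Oozie needs a JDK.")
```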

Note the above points before installing HDP, and setting it up will be a breeze.

Now that we have our cluster set up, there will be more posts coming soon about our experiences.
