How Policy will drive the Software Defined Data Center
Many companies trying to take advantage of cloud computing are embracing the moniker of the “Software Defined Data Center” as one way to understand and communicate the benefits of moving towards an infrastructure resource utility model. VMware has taken on the term SDDC to mean doing everything in your data center with software and not requiring any custom hardware. Other companies sell “software-defined” products which do require particular hardware for various reasons, but whose functionality can be controlled and requested programmatically, all in software. Whether or not your definition of “software-defined” mandates hardware, the general premise (nothing to do with premises!) is being able to deliver and scale IT resources programmatically.
This is great but I think SDDC is just a stepping stone to what we are really trying to achieve which is the “Policy Defined Data Center”.
Once you can deliver IT resources in software, the next step is ensuring those IT resources follow your business rules and processes, what you would probably call business intelligence policy enforcement. These are the things your business asks of IT, partly for regulatory reasons like data retention and storing credit cards securely, but they also encompass a huge amount of what you do in IT.
Here are a few examples of the kinds of policies you may have:
- Users need to change their passwords every 30 days.
- Local admin access to servers is strictly controlled by AD groups.
- Developers cannot have access to production systems.
- You can only RDP to servers over a management connection.
- Critical services need to be replicated to a DR site, some synchronously, others not.
- Production servers need to get priority over test and development servers.
- Web server connections need to be secured with SSL.
- SQL Server storage needs to have higher priority than, say, print servers.
- Oracle VMs need to run on particular hosts for licensing considerations.
- Load balanced web servers need to sit in different blade chassis in different racks.
- Your trading application needs to have maximum x latency and minimum y IOPS.
- Your widget application needs to be recoverable within an hour and be no more than 2 hours out of date.
- Your credit card database storage needs to be encrypted.
- All production servers need to be backed up, some need to be kept for 7 years.
I’m sure you can think of any number of rules that you already have in your data center to govern who has access to what and how resources are allocated. This is policy. These policies often have to be audited and may be legal requirements with hefty fines if not adhered to.
Notice how none of these policies are tied to how they are actually implemented. To get a VM replicated synchronously to another site, it needs to be placed on a particular LUN/datastore. To encrypt a credit card database file it needs to be stored on a particular super secure SAN. Your backup product will have retention policies which need to be targeted to particular VMs.
Policy is your business requirement. Implementation is how you technically achieve it. The two should be treated as separate things.
Companies deploy Windows in their environments for many reasons but one of the major benefits is Active Directory, and Group Policies (GPOs) in particular. The name says it all: Policy. With GPOs you define a list of business requirements that you encode in a policy. There are GPO settings for pretty much everything you can do with Windows: enforcing password changes, controlling local administrator access by group, etc. You then apply these GPOs to your entire infrastructure or parts of it, either by applying them to an organisational unit (OU) and placing particular AD objects within that OU, or by targeting groups of objects by group membership or even by WMI query results. You can have multiple GPOs applied to an object and they can be ordered by preference in case there are conflicts.
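To make the ordering idea concrete, here is a minimal sketch of how several policies might be merged into one effective set of settings. This is not how Windows actually processes GPOs; the `Policy` class, the `precedence` field and the setting names are all hypothetical, and I'm assuming that a lower precedence number wins.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    precedence: int  # hypothetical convention: 1 is the highest priority
    settings: dict = field(default_factory=dict)


def effective_settings(policies):
    """Merge policy settings; on a conflict the highest-priority policy wins."""
    merged = {}
    # Apply the lowest-priority policies first so higher-priority ones overwrite them.
    for policy in sorted(policies, key=lambda p: p.precedence, reverse=True):
        merged.update(policy.settings)
    return merged


# Hypothetical policies: a broad baseline and a stricter password policy.
baseline = Policy("baseline", 2, {"max_password_age_days": 90, "local_admins": "Server-Admins"})
strict = Policy("strict-passwords", 1, {"max_password_age_days": 30})

print(effective_settings([baseline, strict]))
```

The stricter policy overrides the password age while the baseline still supplies the local administrators group, which is roughly the behaviour you rely on when you stack GPOs.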
Active Directory GPOs are extremely powerful, yet still flexible. Sure, they can sometimes get complicated and unwieldy to manage, but that’s often due to poor implementation rather than an issue with the GPO system itself. Policies are easy to change. You can change the local administrators group on thousands of servers or workstations by editing a single policy, and as soon as the policy refreshes the setting is applied to them all. You are not editing individual objects. If someone changes a workstation setting, the policy can reverse this change automatically. Policies are also very easily auditable. You can show an auditor a group membership, or where a user or server object sits in an OU within AD, show the policy that is applied, and satisfy the auditor that the policies you have defined are implemented. This is compliance.
Policy is not the same as automation. You can automate settings across thousands of servers in any number of ways, but the only way to check compliance is to run more checking scripts. If you change the policy, you need to run more scripts to check compliance. If someone changes a setting, you won’t know until you next run the checking script. With policy you can be assured you are continually in compliance.
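The difference is easier to see as a loop that compares what the policy wants with what is actually configured and puts right anything that has drifted. The sketch below is only an illustration under my own assumptions: `find_drift`, `reconcile` and the setting names are hypothetical, and a real policy engine would run this continually rather than once.

```python
def find_drift(desired, actual):
    """Return {setting: (current_value, wanted_value)} for every non-compliant setting."""
    return {
        key: (actual.get(key), wanted)
        for key, wanted in desired.items()
        if actual.get(key) != wanted
    }


def reconcile(desired, actual):
    """Bring `actual` back in line with `desired` and report what was changed."""
    drift = find_drift(desired, actual)
    for key, (_, wanted) in drift.items():
        actual[key] = wanted
    return drift


# Hypothetical server whose local admin group was changed by hand.
policy = {"local_admins": "Server-Admins"}
server = {"local_admins": "Everyone", "rdp_enabled": True}

print(reconcile(policy, server))  # reports the drifted setting and fixes it
print(reconcile(policy, server))  # nothing left to fix on the second pass
```

An automation script is the first call only. Policy is the guarantee that the loop keeps running, so a manual change is reversed without anyone having to notice it first.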
So, how does policy tie in to the Software Defined Data Center? Well, if all your configuration is delivered in software and you can programmatically control this configuration, you can create policy rules that control this software. Think GPOs for all infrastructure, not just AD objects. VMware is placing far more emphasis on policy. In fact, some of these policies you have been using for years and there are likely plenty more to come.
If you have used resource pools you have used a form of policy. You have a resource pool for PROD servers and one for TEST servers. The policy defines that when resources are constrained, PROD servers are given priority over TEST servers. Change the policy and all VMs in that resource pool have their resources adjusted automatically to meet the updated policy.
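As a rough illustration of the resource pool idea, the sketch below divides constrained capacity in proportion to each pool's shares. It is a simplification and not vSphere's actual scheduler: it ignores reservations and limits, and the `allocate` function and the numbers are my own assumptions.

```python
def allocate(capacity_mhz, pool_shares):
    """Split contended capacity between pools in proportion to their shares."""
    total_shares = sum(pool_shares.values())
    return {
        pool: capacity_mhz * shares / total_shares
        for pool, shares in pool_shares.items()
    }


# Hypothetical share values: PROD gets four times the priority of TEST.
shares = {"PROD": 8000, "TEST": 2000}
print(allocate(10000, shares))

# Change the policy once and every pool's entitlement is recalculated.
shares["TEST"] = 4000
shares["PROD"] = 6000
print(allocate(10000, shares))
```

Nothing is configured per VM here. The entitlement falls out of the policy, which is why changing the shares is enough.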
If you use affinity or anti-affinity rule groups you are already using a form of policy. You are specifying where VMs should or should not be placed. Change the groups and VMs are automatically moved around to satisfy the policy.
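An anti-affinity rule boils down to a check like the one below: find any pair of VMs in the group that share a host. This is a sketch under my own assumptions, with hypothetical VM and host names, and it only detects violations rather than moving anything.

```python
from itertools import combinations


def anti_affinity_violations(placement, group):
    """Return pairs of VMs in `group` that are running on the same host."""
    return [
        (a, b)
        for a, b in combinations(sorted(group), 2)
        if placement[a] == placement[b]
    ]


# Hypothetical placement of three load balanced web servers.
placement = {"web1": "esx01", "web2": "esx01", "web3": "esx02"}
print(anti_affinity_violations(placement, {"web1", "web2", "web3"}))

# Once web2 is moved to another host, the rule is satisfied.
placement["web2"] = "esx03"
print(anti_affinity_violations(placement, {"web1", "web2", "web3"}))
```

The same check would work for blade chassis or racks if the placement map recorded those instead of hosts.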
To see where VMware is heading look no further than VSAN. With VSAN you create a single datastore per cluster. You don’t partition this datastore for PROD/TEST/VDI/whatever VMs. You define policies for what resources you want to give to your VMs and how well the data should be protected. The VSAN policy engine then places VM disk blocks within the VSAN cluster on particular disks to ensure the VMs meet this policy, creating multiple replicas for protection and/or to satisfy IOPS requirements. You don’t go and replicate anything yourself, you define the policy and VSAN implements your rules.
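To show what "define the policy and let the engine place the data" might look like, here is a deliberately simple sketch that picks distinct hosts for a number of replicas. It is not VSAN's placement algorithm. The `place_replicas` function, the "most free space first" rule and the host names are all assumptions for illustration.

```python
def place_replicas(free_gb_by_host, size_gb, copies):
    """Pick `copies` distinct hosts, preferring those with the most free space."""
    candidates = [host for host, free in free_gb_by_host.items() if free >= size_gb]
    if len(candidates) < copies:
        raise ValueError("not enough hosts with capacity to satisfy the policy")
    # Most free space first; the host name breaks ties so the result is stable.
    candidates.sort(key=lambda host: (-free_gb_by_host[host], host))
    return candidates[:copies]


# Hypothetical cluster and a policy asking for two copies of a 100 GB disk.
hosts = {"esx01": 500, "esx02": 800, "esx03": 300, "esx04": 50}
print(place_replicas(hosts, size_gb=100, copies=2))
```

If the policy asks for more copies than the cluster can hold, the sketch refuses rather than quietly placing fewer, which is the behaviour you would want from anything claiming to enforce policy.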
VMware’s future enhancement is Virtual Volumes (VVols), which I initially wrote about at VMworld 2011. Looking at the amount of recent public chatter about it, VVols look to be a big part of the VMworld announcements. VVols take the same storage policy based approach as VSAN and extend it to 3rd party storage. The storage needs to be able to support VVols, but anyone wanting to sell storage that can run VMware VMs is rushing to ensure they can handle it.
With VVols, the storage array tells vCenter what functionality it is able to deliver via VASA. This could be disk encryption, synchronous replication, IOPS reservations/limits, snapshot retention, backups, anything software wise your array may be able to deliver.
You then create VVol storage policies based on these functions. Finally, we will also be moving away from the current silly FC/iSCSI/NFS provisioning which dictated your disk format based on your protocol choice. With VVols there are no traditional LUNs (yay!) or mount points, just a transport layer using FC/iSCSI/NFS connecting to your array running VVol supporting software, which stores a VM’s configuration and disk files all as separate objects. You no longer need to group VMs together for storage; they have their storage needs met individually. This is a simplistic explanation but gets the point across. For more info on VVols, look at Cormac Hogan’s Virtual Volumes Tech Preview post.
You then provision your VMs against the policies you have defined. Notice that, as with VSAN, you don’t provision your VMs against datastores but against a policy. The policy talks to your storage array which, based on the functionality the policy uses, creates a VVol to store your VM disk file and decides where to store it.
As an example, you could have a template for your credit card storage database server which you deploy. The VM may have 3 disk files: one for the OS, one for the page file and one for the database files. You could associate different policies with each disk, and even multiple policies with each disk. The OS disk may have a policy from the storage that specifies a particular disk tier and hourly replication. The array would create a VVol for this disk based on this policy and ensure it is stored on, for example, SAS disks and is replicated hourly. You don’t need the page file disk to be replicated, so the storage array won’t replicate it. For the database files, you specify that the disk needs to be on your highest disk tier and also encrypted, so your array would create a VVol backed by SSD disks, guarantee minimum IOPS and maximum latency, and also encrypt the storage.
Then, at a later stage, you decide that for all your database VMs you do want to replicate the page file. You change the policy once and the storage array will do whatever it needs to do to replicate the disk. You don’t move it yourself to another replicated datastore; you amend the policy and the storage array complies with it. See the difference between placement and policy? If you upgrade the software running on your array and it is now able to do something new, you can incorporate this functionality into your policies and when these are applied, your array goes off and complies. You can make these policies part of a VM template so any new VMs you create are automatically provisioned with what the policies specify.
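The credit card database example can be sketched as data: the array advertises what it can do, each disk points at a named policy, and provisioning simply checks one against the other. This is only an illustration. It does not use any real VASA or vSphere API, and the capability names, policy names and `provision` function are hypothetical.

```python
# Hypothetical capabilities the array has advertised to vCenter.
ARRAY_CAPABILITIES = {"tier", "replication", "encryption"}

# Hypothetical storage policies built from those capabilities.
policies = {
    "os": {"tier": "SAS", "replication": "hourly"},
    "pagefile": {"tier": "SAS"},
    "database": {"tier": "SSD", "encryption": True},
}

# Each disk of the VM template references a policy, not a datastore.
vm_disks = {"disk0": "os", "disk1": "pagefile", "disk2": "database"}


def provision(vm_disks, policies, capabilities):
    """Return the per-disk requirements the array would be asked to meet."""
    plan = {}
    for disk, policy_name in vm_disks.items():
        policy = policies[policy_name]
        missing = sorted(set(policy) - capabilities)
        if missing:
            raise ValueError(f"array cannot satisfy {policy_name}: {missing}")
        plan[disk] = dict(policy)
    return plan


print(provision(vm_disks, policies, ARRAY_CAPABILITIES))

# Later you decide page files should be replicated after all: one policy edit.
policies["pagefile"]["replication"] = "hourly"
print(provision(vm_disks, policies, ARRAY_CAPABILITIES)["disk1"])
```

Note that the VM definition never changes. Only the policy does, and every disk that references it picks up the new requirement.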
So, that’s VSAN and VVols for storage but policy should be able to be applied to anything in your data center.
Take networking for example. Your network can bubble up placement and security functionality from which you define policies. Your web servers need to be behind particular DMZ firewalls and only accessed via port 443, while your app tier needs to be behind other internal firewalls and may need other ports open. If this could all be specified in policy, how powerful would that be? You could deploy a new monitoring application that needs to run over port 8000. You could update the policy to include inbound port 8000 and every VM that has that policy applied would have port 8000 open. Even more interesting is decommissioning an application. Currently, if you decommission a monitoring application, do you ever go and tidy up all the old firewall rules? Unlikely, but with a policy based approach it would be far simpler to make this change. It would be as simple as amending a GPO and the firewall rule would be removed. Even better is to ensure these policies can be stored as part of the VM config itself so the VM could be deployed from a template and have the network policies automatically applied. This would in effect have firewall rules included as part of a VM template.
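Here is a small sketch of the monitoring example, assuming firewall rules are derived from the policies attached to each VM rather than configured per VM. The policy names, VM names and the `open_ports` function are hypothetical, and no real firewall or network virtualisation API is involved.

```python
# Hypothetical network policies: each one is a set of allowed inbound ports.
policies = {"web": {443}, "monitored": set()}

# Hypothetical VMs and the policies attached to them.
vm_policies = {"web01": ["web", "monitored"], "app01": ["monitored"]}


def open_ports(vm, vm_policies, policies):
    """Inbound ports for a VM are the union of its policies' ports."""
    ports = set()
    for policy_name in vm_policies[vm]:
        ports |= policies[policy_name]
    return ports


# Deploy the monitoring application: one policy change opens 8000 everywhere.
policies["monitored"].add(8000)
print(open_ports("web01", vm_policies, policies), open_ports("app01", vm_policies, policies))

# Decommission it: one policy change and no stale rule is left behind.
policies["monitored"].discard(8000)
print(open_ports("web01", vm_policies, policies), open_ports("app01", vm_policies, policies))
```

Because the rules are computed from policy, there is nothing to tidy up when the application goes away.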
The true power of policy is to be able to tie all these policies from network, storage, compute placement, security etc. together. Say you have four ultra critical servers that run your company’s public front-end and need to be placed carefully by policy. One policy would ensure that two are hosted on ESXi hosts in different racks in one data center and the other pair in another data center, also split by rack but perhaps in a particular geographic zone due to data locality laws. As there are also legacy licensing considerations, these VMs all need to be on a particular subset of hosts within your larger cluster. Policy would ensure they are placed on particular networks which are secured and load balanced between all four boxes. Storage wise, one of the disks is encrypted and all disks are snapshotted every minute, again achieved by policy. Who can manage these VMs remotely is also controlled by policy, whether that is firewall access with IPS to ensure only DBAs and IT support can RDP/SSH through the firewall, or VM console access.
If this was all visible and managed within a single place, maybe your future federated, distributed, infinitely scalable and highly available vCenter (hint, hint, VMware!), that would be incredibly powerful. If you are confident your policies accurately reflect your business rules and your VMs are configured with these policies, you can be sure they are placed where they should be and given what resources they need. You will need a mechanism to manage these policies, whether it is groups or tags or an OU-like structure or all three. You also will need a way to create a hierarchy and be able to easily see and resolve conflicting settings.
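Seeing conflicts could start with something as simple as the sketch below, which lists every setting that two or more policies applied to the same VM disagree on. The policy names, setting names and the `find_conflicts` function are hypothetical, and a real system would also need rules for deciding which policy wins.

```python
def find_conflicts(applied_policies):
    """Return {setting: {policy_name: value}} where policies disagree."""
    seen = {}
    for policy_name, settings in applied_policies.items():
        for key, value in settings.items():
            seen.setdefault(key, {})[policy_name] = value
    return {
        key: by_policy
        for key, by_policy in seen.items()
        if len(set(by_policy.values())) > 1
    }


# Hypothetical policies applied to one VM: licensing and DR both want a host group.
applied = {
    "licensing": {"host_group": "oracle-hosts"},
    "dr": {"host_group": "site-b-hosts", "replication": "sync"},
    "storage": {"replication": "sync"},
}

print(find_conflicts(applied))
```

Here the two replication settings agree so they are not reported, while the host group clash is surfaced for someone to resolve.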
If the auditors want you to prove anything to them, it would be a matter of showing them the policy rather than having to run placement and configuration scripts or take screenshots to show where everything is. Policy makes it much easier to achieve compliance, and keeping auditors off your back is always a good thing!
The Software Defined Data Center is a great start but let’s do it properly and head towards a Policy Defined Data Center.