So the question is: what *kind* of security services should we build for the Cloud? What do we want them to do? What are the goals?
Of course, as security professionals we know the goal is Confidentiality, Integrity and Availability. Right?
Wrong. The worst goal you can have when building security into a system is Confidentiality, Integrity and Availability. These are not actionable goals. How are you going to get Confidentiality, Integrity and Availability across Data, App, Services, VM, Server, Storage, Network and Organizational layers? You're not. It's the old military saying - when you defend everywhere, you defend nowhere.
This is a big part of the failing of Infosec, and it's explainable, because Information Security is a 21st Century oxymoron like "jumbo shrimp."
We have "information," this beautiful, messy, unpredictable, fractal-like thing. What is 3M's stock going to do tomorrow? Will it go up or go down? I don't know. I can build a model and use some information to make a guess, but it's very unpredictable.
On the other side we have "security" which is an illusion that we can separate the good stuff from the bad stuff.
And in the middle we have this debate, and our job as security professionals is to make sure it is a constructive debate, not the destructive one it so often becomes.
So the framework that I use has three parts:
* Identity & Access Services: Letting good guys do their work - claims-enabled services. People think of authentication, authorization, and so on as security, but it's really just ticketing that allows law-abiding citizens to follow the rules.
* Defensive Services: Keeping bad guys out - conservative services that deal with threats and vulnerabilities.
* Enablement Services: Making it all work - services managing business-enabling capabilities such as provisioning, federation, identity, and secure integration.
If you study the brief history of access control models [11] you will see that it begins with the Reference Monitor, which had three properties: 1) it's always invoked, 2) it's tamperproof, and 3) it's small enough to be analyzed.
The way that this has rolled out in practice in the enterprise began with RACF, a resource-focused model, meaning that controls are set on the resource or server side, with minimal knowledge of the client-side computing environment, user, or context. That lack of context is great for phisherfolk, because it gives them an ocean of space, but not so great for security [12].
The next step is typically Role Based Access Control (RBAC), where Subjects, Objects, Sessions, and Roles are mapped together. This accomplishes some basic context propagation from the client side to the role, but it is at the same time 1) pretty coarse-grained and 2) dependent on a tremendous amount of a priori knowledge.
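To make the coarse-grained, ahead-of-time nature concrete, here is a minimal RBAC sketch. The role, permission, and user names are illustrative assumptions, not from any product; the point is that the access check consults only a pre-built mapping and nothing about the request context.

```python
# Minimal RBAC sketch: role assignments are fixed ahead of time (the a priori
# knowledge mentioned above) and the check consults only that mapping.

ROLE_PERMISSIONS = {
    "claims_adjuster": {"read_claim", "update_claim"},
    "auditor": {"read_claim", "read_audit_log"},
}

USER_ROLES = {
    "alice": {"claims_adjuster"},
    "bob": {"auditor"},
}

def is_authorized(user: str, permission: str) -> bool:
    """Coarse-grained check: nothing about the request context is consulted."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))

print(is_authorized("alice", "update_claim"))  # True
print(is_authorized("bob", "update_claim"))    # False
```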
A more constructive way to deal with access control in the Cloud is Claims Based Access Control (CBAC), where a Claim represents an assertion of truth that requires evaluation. The Subject making the Claim and the Object evaluating the Claim are executing independent operations. The way the Claim is constructed and how it's evaluated can vary and can be based on dynamic attributes. This model fits the current computing landscape much better.
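A small sketch of that split, assuming made-up issuer URLs and attribute names: the issuer packages an assertion about the subject, and the relying service evaluates it independently against its own trust list and policy.

```python
# Claims sketch: issuance and evaluation are two independent operations.
# Issuer names and attributes here are illustrative, not a standard API.

import time

def issue_claim(issuer: str, subject: str, attributes: dict, ttl: int = 300) -> dict:
    """Issuer-side operation: package an assertion about the subject."""
    return {
        "issuer": issuer,
        "subject": subject,
        "attributes": attributes,
        "expires": time.time() + ttl,
    }

def evaluate_claim(claim: dict, trusted_issuers: set, required: dict) -> bool:
    """Relying-party operation: decide, using local policy, whether to accept."""
    if claim["issuer"] not in trusted_issuers:
        return False
    if claim["expires"] < time.time():
        return False
    return all(claim["attributes"].get(k) == v for k, v in required.items())

claim = issue_claim("https://idp.example.com", "alice",
                    {"department": "claims", "clearance": "standard"})
print(evaluate_claim(claim, {"https://idp.example.com"}, {"department": "claims"}))  # True
```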
There's a tremendous amount of ferment in the identity world; it's one of the most active areas in computing. The weakness that is identity on the Web is being attacked and addressed by various standards like SAML, Information Cards, OAuth, and others [14].
SAML is a good example of how this access control model plays out in the real world. SAML assertions are made by an Identity Provider on behalf of a Subject and evaluated by a Relying Party on behalf of an Object. Again, two independent operations that then provide the glue to communicate security-critical information across the stacks.
SAML assertions can contain authentication statements (how a user was authenticated, for example), attribute statements (name-value pairs), and authorization decision statements. The SAML Producer-Consumer model decouples these concerns, and so by allowing for multiple authorities, SAML gives us a way to navigate the multiple-namespace problem which is at the root of the Cloud.
So in the case of signing onto Google apps by using SAML, Google doesn't have any knowledge of your Identity Provider, and you don't own/control Google's SAML service provider. Protocols like SAML that allow for this are necessary conditions for access control in the Cloud.
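Here is a hedged sketch of the Relying Party side of that flow. It is not real SAML processing (no XML, no signature verification), and the issuer/audience values are made up; it just shows the Producer-Consumer split, where the consumer applies its own trust list to the producer's assertion.

```python
# Sketch of a Relying Party checking a SAML-style assertion before trusting it.

import time

assertion = {
    "issuer": "https://idp.example.com",        # the Identity Provider (producer)
    "subject": "alice@example.com",
    "audience": "https://apps.example.org",     # who the assertion is intended for
    "not_on_or_after": time.time() + 300,
    "authn_statement": {"method": "password-protected-transport"},
    "attribute_statement": {"department": "claims"},            # name-value pairs
    "authz_decision": {"resource": "/reports", "decision": "Permit"},
}

def relying_party_accepts(a: dict, trusted_issuers: set, my_audience: str) -> bool:
    """The consumer evaluates the producer's assertion with its own trust list."""
    return (a["issuer"] in trusted_issuers
            and a["audience"] == my_audience
            and a["not_on_or_after"] > time.time())

print(relying_party_accepts(assertion,
                            {"https://idp.example.com"},
                            "https://apps.example.org"))  # True
```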
But SAML is of course not a silver bullet; it has some nice properties but it doesn't fix everything. We can use the Threat Model + Attack Surface to see where SAML can play a role.
Another Identity & Access Service is Information Cards, whose problem statement I can sum up thusly: "Passwords are Tired."
As Kim Cameron says [13], we've trained an entire generation of users to type in their username and password every time they see a prompt. And then we make fun of them for getting phished.
So Information Cards are one way to combat this. There are various card managers out there, like CardSpace, Azigo and DigitalMe, and basically the way it works is that instead of logging in with a username/password, we click to use an Information Card.
The browser plug-in then selects the appropriate card and schema for the site, but notice none of the values are filled.
Then once the user authenticates to their Identity Provider (IdP), the values are populated, digitally signed, encrypted, and sent to the site (the Relying Party).
This seems like a small thing but the differences are extraordinary. Think of how things typically work today. The Resources are on the server (data, apps and so on), the Subjects are on the server (users, groups), and THE SECRETS are stored on the server (passwords). Does anyone see a problem with this? Shipping dynamite and detonators in the same truck is not a good idea. But that's what we've been doing.
So what kinds of threats can Information Cards address?
OAuth gives us another axis to play on. OAuth is a standard that enables authorization for web services without requiring the dreaded username/password combination. So again we have a way to decouple authentication, authorization, and attribution.
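An illustrative sketch of that delegation idea follows: the user never hands a password to the third-party client; the client holds a scoped token that the service can validate and revoke independently. The token format and store below are assumptions for illustration, not the OAuth wire protocol.

```python
# Delegated-access sketch: scoped tokens instead of shared passwords.

ISSUED_TOKENS = {
    "tok-7f3a": {"subject": "alice", "scopes": {"read_photos"}, "revoked": False},
}

def authorize_request(token: str, required_scope: str) -> bool:
    """Service-side check: no username/password is involved at request time."""
    grant = ISSUED_TOKENS.get(token)
    return bool(grant) and not grant["revoked"] and required_scope in grant["scopes"]

print(authorize_request("tok-7f3a", "read_photos"))    # True
print(authorize_request("tok-7f3a", "delete_photos"))  # False
```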
So there is quite a bit of activity in identity and access, and that is a quick recap of some of the main drivers and themes. But again, Identity and Access is not the sum total of security; it's only about letting the good guys do what they need to do, and it's probably not even the most important part of security. Because the one thing that we infosec folks can really help developers and architects do is design for failure. Developers and architects work to make sure the system does what it's supposed to, but because of minimal focus on security and the fact that software development as a whole is quite immature, the rabbit holes that are not as often explored are the failure modes. This is where infosec can play a key role: if we can make our role a constructive one, we are in a great position to help bring more focus on the design-for-failure issue [15].
Access control can really look like security; there's a real beauty inherent in how, say, authorization is performed, but the reality is that people and systems do unexpected things that designs and implementations didn't account for.
So this is why we need Defensive Services: to hedge our bets when the other security services might fail. To begin, remember the Claims we talked about in CBAC? It would be a lot more reliable to evaluate Claims that have some way to verify integrity and authenticity. So the Claims must be shipped with a security token that we can use to hash, sign, encrypt and otherwise protect the Claim.
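A minimal sketch of attaching integrity protection to a Claim, here with an HMAC over a canonical serialization; real token formats (SAML, and later JWT-style tokens) use XML or JSON signatures, so treat this as the idea rather than the standard. The key and field names are illustrative.

```python
# Integrity-protected Claim sketch: the evaluator can detect tampering.

import hmac, hashlib, json

SHARED_KEY = b"demo-key-do-not-use-in-production"

def protect_claim(claim: dict) -> dict:
    payload = json.dumps(claim, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "mac": tag}

def verify_claim(token: dict) -> bool:
    payload = json.dumps(token["claim"], sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["mac"])

token = protect_claim({"subject": "alice", "department": "claims"})
print(verify_claim(token))                 # True
token["claim"]["department"] = "payroll"   # tamper with the claim
print(verify_claim(token))                 # False
```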
Another way to hedge our bets when designing for failure is audit logging [16], using a checklist of auditable events such as the following (a minimal logging sketch follows the list):
* Authentication, authorization, and access: authentication/authorization decisions, system access, data access
* Changes: system/application changes (especially privilege changes), data changes (including creation and destruction)
* Threats: invalid input (input and schema validation failures)
* Exhausted resources and capacity: limits reached (message throughput, replays), mixed availability issues
* Startups and shutdowns: startups and shutdowns of systems and services
* Faults and errors: errors, faults, backup success/failure, and other application issues
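As promised above, here is a minimal sketch of emitting those auditable events as structured records. The field names and category strings are illustrative assumptions, not a logging standard.

```python
# Structured audit event sketch covering the checklist categories above.

import json, logging, time

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit = logging.getLogger("audit")

def audit_event(category: str, outcome: str, **details):
    """category: e.g. 'authz_decision', 'data_change', 'invalid_input', 'startup'."""
    record = {"ts": time.time(), "category": category, "outcome": outcome, **details}
    audit.info(json.dumps(record, sort_keys=True))

audit_event("authz_decision", "deny", subject="alice", resource="/reports")
audit_event("invalid_input", "reject", reason="schema validation failure")
```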
To review, here are the countermeasures we've briefly described so far in this talk.
The Threat Model + Attack Surface matrix tracks what issues we've tried to deal with and where, and what we've deferred due to cost, complexity, or other reasons.
So Identity and Access Services enable the systems to work, the Defensive Services allow us to deal with inevitable failures. And finally, I will describe the role of Enablement Services.
Security mechanisms must enable, not block, use cases.
We need to know where software and security begin and end.
And making this happen is a Governance problem; it's not just delivering services, but integrating them correctly.
One of the themes in Identity & Access Services is enabling decentralized security services; this is required for our security services to form-fit to something as decentralized as the Cloud. We have a decent amount of experience in the field with this happening with SAML and other protocols. What we don't have yet is a lot of experience with decentralized policy.
XACML provides a way to enable more granular control over authorization decisions by mapping Subjects, Resources, and Actions to a policy target.
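A sketch of that idea, assuming a toy policy structure: a target scopes the policy to a resource/action pair, and rules over subject attributes yield Permit/Deny. This is plain Python evaluation, not the XACML XML schema or a real policy decision point.

```python
# XACML-style target-and-rules sketch: Subject, Resource, Action -> decision.

POLICY = {
    "target": {"resource": "/claims/*", "action": "read"},
    "rules": [
        {"effect": "Permit", "condition": {"subject.department": "claims"}},
    ],
    "default": "Deny",
}

def evaluate(policy: dict, subject: dict, resource: str, action: str) -> str:
    t = policy["target"]
    if not (resource.startswith(t["resource"].rstrip("*")) and action == t["action"]):
        return "NotApplicable"
    for rule in policy["rules"]:
        if all(subject.get(key.split(".", 1)[1]) == value
               for key, value in rule["condition"].items()):
            return rule["effect"]
    return policy["default"]

print(evaluate(POLICY, {"department": "claims"}, "/claims/123", "read"))  # Permit
print(evaluate(POLICY, {"department": "sales"}, "/claims/123", "read"))   # Deny
```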
As Bill Gates said, security should depend on policy, not topology; so what we need is a way to move policy around in the system. Now you might ask yourself - we have made it this far without decentralized policy, why does it matter now all of a sudden?
Think about how many layers require policies, and inside of each of these layers? How will you achieve consistency?
WS-SecurityPolicy takes a slightly different approach than XACML: policies are defined by security policy types, such as channel security (like requiring TLS/SSL) and message security (like requiring SAML tokens), and then attached to the inflow, outflow, or creation of the service.
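A rough sketch of that attach-and-check idea: requirements are declared per policy type and checked against an inbound message before it is processed. The structures are assumptions for illustration and are not the WS-SecurityPolicy XML vocabulary.

```python
# Policy-type sketch: channel vs. message requirements attached to an inbound flow.

INBOUND_POLICY = {
    "channel": {"require_tls": True},
    "message": {"require_token_type": "saml"},
}

def satisfies_policy(policy: dict, channel: dict, message: dict) -> bool:
    if policy["channel"]["require_tls"] and not channel.get("tls"):
        return False
    return message.get("token_type") == policy["message"]["require_token_type"]

print(satisfies_policy(INBOUND_POLICY, {"tls": True}, {"token_type": "saml"}))   # True
print(satisfies_policy(INBOUND_POLICY, {"tls": False}, {"token_type": "saml"}))  # False
```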
So where does policy like XACML and WS-SecurityPolicy fit in our Threat Model?
Actually it doesn't - it's what makes the above work. More specifically, it's what allows you to query the system after you have built it on your assumptions and see if they in fact hold up.
This is in tandem with the way that vulnerability scanning is likely to go in the Cloud [17]. Because scans can cause disruption and be generally unpredictable, and because providers have to support many tenants, it's not as simple as before to load up the vulnerability scanner of choice and fire it off. Policy will play a role in supporting decentralized access management by making it more predictable, and it will likely play a role here too.
The last Enablement Service I would like to address is the one that likely holds the most near- to mid-term promise: the Security Token Service (STS). The STS has three main functions: validating security tokens, issuing security tokens and revoking security tokens.
As with policy, we can see why the STS matters
We have a Costanza wallet full of tokens; what we don't have yet, but need, is a policy-based way to move the Claims around in our systems. Say a user logged on to Active Directory with a Kerberos ticket needs to talk to a Unix web service that only speaks SAML: the client calls the STS with the Kerberos ticket, the STS validates it and issues a SAML assertion, and the Unix system performs authorization against the SAML assertion.
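A sketch of that token-exchange step: the STS validates an inbound token of one type and issues a token of another type for the next hop. The validation and issuance logic is stubbed for illustration; a real STS (for example one built on WS-Trust) does cryptographic validation and signing, and the names below are assumptions.

```python
# STS token-exchange sketch: Kerberos in, SAML-style assertion out.

import time

def sts_exchange(inbound_token: dict, requested_type: str) -> dict:
    """Validate the caller's token, then issue a token the target understands."""
    if inbound_token.get("type") != "kerberos" or inbound_token.get("expired"):
        raise ValueError("inbound token rejected")
    if requested_type == "saml":
        return {
            "type": "saml",
            "subject": inbound_token["principal"],
            "issuer": "https://sts.example.com",
            "not_on_or_after": time.time() + 300,
        }
    raise ValueError("unsupported token type requested")

kerberos_ticket = {"type": "kerberos", "principal": "alice@CORP.EXAMPLE.COM", "expired": False}
saml_assertion = sts_exchange(kerberos_ticket, "saml")
print(saml_assertion["subject"])  # the Unix service authorizes against this assertion
```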
In almost every system there are impersonations, delegations and identity mappings across the board, but the STS gives us a way to do this utilizing security policies and by isolating the weakest link. How many times have you seen a system where someone says "the mainframe only supports six-character passwords," so everything has to be six characters? Instead it's better to isolate the weak link, use the highest security token you can in each case and only transform to the lowest for the last hop.
We have already seen one example of an STS, which is Information Cards.
An STS is also commonly associated with an IdP.
But it can be used for authorization mapping as well, fronting a Relying Party.
Because technology evolutions and revolutions tend to combine the theoretical and the practical (J2EE was a cool idea, but WebLogic was the first big success story), my guess is that a number of the ideas I described here will land on a physical/logical system, and that system will be an STS.
Tim O'Reilly said “Everything we think of as a computer today is really just a device that connects to the big computer that we are all collectively building”, and I would simply add …let’s collectively build security in
Thanks for your time.
References
11. "A Survey of Access Control Models", NIST, http://csrc.nist.gov/news_events/privilege-management-workshop/PvM-Model-Survey-Aug26-2009.pdf
12. "Brian Krebs Notes from a System Melting Down Before Our Eyes", http://1raindrop.typepad.com/1_raindrop/2009/10/brian-krebs-notes-from-a-system-melting-down-before-our-eyes.html
13. "The Laws of Identity", Kim Cameron, http://www.identityblog.com/?p=352
14. "Identity Venn", Eve Maler, http://www.xmlgrrl.com/blog/categories/venn/
15. "A Conversation with Bruce Lindsay", Steve Bourne, http://queue.acm.org/detail.cfm?id=1036486
16. "Logging in the Age of Web Services", Anton Chuvakin & Gunnar Peterson, http://arctecgroup.net/pdf/82-85.pdf
17. "Vulnerability Scanning and Clouds", Craig Balding, http://cloudsecurity.org/2009/06/28/vulnerability-scanning-and-clouds-an-attempt-to-move-the-dialog-on/
Gunnar,
The problem with policy is that it is a set of incomplete, guessed-at rules running on multiple insecure levels, relying on correct implementation on unrelated and probably disparate systems.
I was aware of some of the issues to do with this many years ago, having been involved in the design of secure communications equipment.
Back in the early 90s, when teaching students, I had to point out to them one of the pitfalls of security, which is putting human understanding onto computers that have no understanding.
At the time I used to tell them,
A Computer has
R - rules
I - information
P - processes
A human understands,
I - integrity
C - communications
E - entities
That is, the understanding of human-to-human interaction does not map conveniently, or even inconveniently, onto the basic principles of computer operating systems; it simply does not fit. And the rules (the policy of today) would have to be all-encompassing for all situations.
If you remember back then, the standard security model was based on mainframes and terminals, and PCs did not have meaningful security (16-bit co-operative multitasking Windows was sitting on 16-bit single-user DOS, on top of a BIOS based on a late-1970s 8-bit OS called CP/M, which had no notion of security).
Security then developed down one of two paths.
1. Apps + data on the server.
2. Apps on the PC, data on the server.
And "ne'er the twain met".
From this, various models came forth, such as thin client, middleware and, more recently, Web 2.0.
However the two security models did not change, and even today you can clearly see desktop security and server security mindsets in the minds of application developers.
It is a Castle mentality at the PC, where it is assumed that the only processes running are benign and under the control of a single entity. Thus you defend only against external attackers, not those within.
And a Prison mentality at the Server, where it is assumed that each process runs in a jail where it is held, and access to resources, data and communications is strictly controlled by the OS.
Thus each system is independent of the other, and its OS's view of the processes running and their access to resources is radically different, as are the mindsets of not just the application developers but the OS developers as well (for the more common of the paid-for OSes on i386 and above hardware).
One problem with server-side application development was that input checking/validation and error handling was invariably moved as far to the left (the input side) as possible, and exception handling was minimal (the data in the files is always valid and available ;). The reason for this was simplicity of design and an assumption that the whole application stopped if there was an exception (i.e. it either cored out or gave an error message prior to terminating) and cleanup was automatic...
The advent of the web-based application saw a sea change in the way things worked. Applications were quickly decoupled into "front ends" and "back ends". Unfortunately no real attempt was made to change the design process at the time, and often the server side was an existing app that just had the user front end ripped out and passed on to "web app" developers.
You ended up with the security and error checking in the front end (if at all) and the business logic in the back end. All of a sudden the prison mentality was in conflict with the castle mentality, with the result that things became horribly insecure, due in the main to little or no input error checking in the back end, and no exception handling in the front end, all separated by an unreliable and often insecure communications system.
Worse, the lack of basic controls at either end opened up the assumptions of the two mindsets like a hand grenade in an oil drum.
Part of the ad hoc solution was to put in place middleware servers that took care of input errors and validation for the back end and exception handling for the front end, with more secure communication to the front end. However this has done nothing to solve the issues to do with the differing mindsets, and it also offers more vectors by which the overall system can be assaulted (especially with web servers using shared memory and privilege levels and allocating access to other resources to multiple clients that may have different privileges under different policies).
Worse still, at the client end the chosen application is a web browser that likewise allows multiple applications to run in the same memory space and relies not on the OS to mediate access to resources but on the browser, which coincidentally runs at the same privilege level as the application (often, on clients, at an administrative privilege level).
Thus the prison mentality's application segregation and data control has been effectively obviated by the castle mentality at the client. And the castle mentality security at the client has been obviated by the prison mentality at the server, mish-mashed by the middleware.
It is this argument that has supposedly given rise to Google's Chrome browser and OS etc. The attempt is to provide continuance of the jail mentality into the client.
However, even though you appear to have security parity, and thus peer-to-peer security, you do not.
There is the issue of extending conflicting policies onto the client PC etc.; thus data can leak from one secure application to the other simply due to trying to get "workability" between conflicting policies sent down from separate servers (that is, policies will sink to the point of acceptable business use, not security).
Then on top of this there is the issue of covert channel communication. A savvy developer of a server or middleware application will be able to tell a lot about what is going on on a client system via various techniques using, amongst other things, standard and expected OS/comms calls (i.e. an "enumeration" process).
For instance, a DNS request made from the client at the behest of the middleware/server can reveal whether the client has made a request to a given server within the local cache timeout, simply by the response time (likewise via its local DNS server etc.). Thus the application developer can make a reasonable guess as to what other servers are being accessed by the client (and there are hundreds if not thousands of other tricks that can be used to get information, including encryption keys).
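A hedged illustration of that timing signal: resolve a name twice and compare latencies; a much faster lookup suggests the answer was already cached somewhere on the path. The hostname, thresholds and interpretation are rough assumptions, not a reliable detector.

```python
# DNS cache-timing sketch: a large gap hints the first query went out to the
# network while the second was served from a nearby cache, which is exactly
# the signal an "enumerating" server or middleware developer would look for.

import socket, time

def resolution_time(name: str) -> float:
    start = time.perf_counter()
    try:
        socket.getaddrinfo(name, 80)
    except socket.gaierror:
        pass  # even failures carry timing information
    return time.perf_counter() - start

first = resolution_time("www.example.com")
second = resolution_time("www.example.com")
print(f"first lookup:  {first * 1000:.1f} ms")
print(f"second lookup: {second * 1000:.1f} ms")
```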
If the client does not have an appropriate policy and OS support to stop this, then there is no way the policy of another server can stop it happening, except by exclusivity of client use, which would break most business use models.
This issue then becomes two-way, in that with a cloud-based service it may be in the cloud service owner's interests to "enumerate" clients to see what other services are in use.
To prevent this sort of covert channel attack you need to "clock the inputs and clock the outputs" to remove timing information. This security mechanism, however, only works when the resources are lightly loaded at best.
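A small sketch of "clocking the outputs": every response is padded out to a fixed time slot, so completion time no longer leaks how long the work actually took. The slot length and the stand-in workload are arbitrary assumptions for illustration, and the padding itself is why the technique costs capacity on a loaded system.

```python
# Clocked-output sketch: quantize response times to a fixed slot boundary.

import time

SLOT_SECONDS = 0.050  # respond only on 50 ms boundaries (arbitrary choice)

def clocked(handler, *args):
    start = time.perf_counter()
    result = handler(*args)
    elapsed = time.perf_counter() - start
    slots = int(elapsed // SLOT_SECONDS) + 1
    time.sleep(slots * SLOT_SECONDS - elapsed)  # pad to the next slot boundary
    return result

def lookup(key):  # stand-in for real work whose cost varies with state
    time.sleep(0.003 if key == "cached" else 0.020)
    return key.upper()

for key in ("cached", "uncached"):
    t0 = time.perf_counter()
    clocked(lookup, key)
    print(key, f"{time.perf_counter() - t0:.3f}s")  # both land near 0.050s
```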
Thus "efficient systems" with high utilisation (load) will always have covert channels that can be exploited, and "Cloud computing" is only viable if its resources are used efficiently...
All of which means that use of the cloud will always be insecure to malware unless exclusivity of resources is employed, which breaks basic business models across the board...
Thus the question that should be asked is not how do I make the cloud secure (you cannot, except by exclusivity) but how do I limit the bandwidth of any channels, or limit the information that can leak.
That is, I cannot reliably lock up the "container" (the Cloud), so I have to lock the data for a time period until its value has minimised.
Which brings you around to the question of impact minimisation by other methods.
This is where the notion of parallel processing comes to mind. Data can be split up in many ways across many systems, and the use of even simple encryption can help (obfuscation) if adequate data normalisation has been used.
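A toy illustration of that splitting idea, under the assumption of exactly two providers: XOR a record into two shares so that neither provider alone holds anything meaningful, and both shares are needed to reconstruct. This is a sketch of the principle, not a recommendation for a particular scheme.

```python
# Two-provider split sketch: neither share is useful on its own.

import os

def split(data: bytes):
    pad = os.urandom(len(data))                       # share stored with provider A
    other = bytes(a ^ b for a, b in zip(data, pad))   # share stored with provider B
    return pad, other

def combine(share_a: bytes, share_b: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share_a, share_b))

a, b = split(b"policyholder: alice, claim: 4211")
print(combine(a, b))
```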
A little forethought on such issues would make the task of impact minimisation considerably easier, and it should be a part of the data preparation phase of any "cloud project" to make it workable on at least two separate service providers' systems.
As has often been noted "For a ha'pence of tar the ship was lost" or "P155 Poor Planning leads to P155 Poor Performance"
Posted by: Clive Robinson | November 01, 2009 at 12:31 PM