Ned over at the Barking Seal uses the recent Macondo example to illustrate what Richard Bejtlich calls Building Visibility In:
Steven Newman, the CEO of Transocean, said during a recent Senate hearing, “There is some delay in the replication of our data, so our operational data, our sequence of events ends at 3 o’clock in the afternoon on the 20th. And so the VMS system, along with the logs of the VMS system, would have gone down with the vessel.” The blowout and massive explosion happened at 10, taking eleven lives and seven hours of VMS data to the bottom of the ocean. Representative Bruce Braley from Iowa followed up with “So you have no mirrored backup data device so that that information is recorded at some other location than on the rig itself?”. Newman replied, “We do not have real-time off-rig monitoring of what’s going on on the vessel”.
In the class I teach for developers to build Audit Logging into their applications, we build on the good work that's been driven out of PCI DSS - namely, the spec created demand for best-of-breed audit logging tools and a competitive marketplace. So now there are legions of tools in the space, at a relatively decent cost. People in security love to slag off PCI, but you know what - if you went back 8 years, pre-PCI, you would not have found a market for audit logging tools; it would have been two guys in a basement in West Texas. Now there's a nice niche market, and real tools.
But as always, the tools only go so far, and so it's necessary to build the audit loggers into the code so that you can make the audit log manager useful. It would be nice if you had consistent event models, types and reporting as well. What we spend a lot of time on in training is placement of the audit logger.
Location, Location, Location
Think about a three-tier architecture. Now think about an attack - let's say SQL injection. What does SQL injection look like at the presentation tier? Probably not much: an HTTP request, possibly with some funky characters, but it's likely a "nothing to report" situation. (Yes, I know I am glossing over the possibility of the input validator catching it upstream, but bear with me.) A little lower, in the middle tier, mapping to business objects and applying business rules, it might trip a wire, but even there maybe not. Then at the data tier, formatting the query, the logger may finally catch an exception or see something malformed and be able to 1) identify it as such and 2) report it.
Then we do other exercises trying to audit log CSRF, XSS and other areas. In each case you will likely find that where in the code you choose to locate your audit logger is just as important as the events you are looking to gather.
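To make the SQL injection case concrete, here is a minimal Python sketch of an audit logger placed at the data tier. Everything named in it is an assumption for illustration: the `audit.data_tier` logger name, the `users` table, the `request_id` field and the username rule are all hypothetical, and the event format is not taken from any particular audit logging tool.

```python
import logging
import re
import sqlite3

# Hypothetical audit logger name and event fields, for illustration only.
audit = logging.getLogger("audit.data_tier")

# Assumed rule: usernames are short alphanumeric strings.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")


def find_user(conn, username, request_id):
    """Data-tier lookup that audits malformed input and database errors."""
    if not USERNAME_RE.fullmatch(username):
        # The data tier knows what a valid key looks like, so it can
        # 1) identify the input as malformed and 2) report it.
        audit.warning(
            "event=malformed_query_input request_id=%s field=username value=%r",
            request_id, username,
        )
        return None
    try:
        # Parameterized query: the value is never spliced into the SQL text.
        row = conn.execute(
            "SELECT id, name FROM users WHERE name = ?", (username,)
        ).fetchone()
    except sqlite3.Error as exc:
        audit.error("event=query_failed request_id=%s error=%r", request_id, exc)
        raise
    return row
```

The same check placed at the presentation tier would only see a string with funky characters; here it sits next to the query it protects, so the event can say which field and which request were involved.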
Richard Bejtlich mentioned the topic of monitoring in a post, where he was posed a simple question (always the hardest kind) by a CISO: "Can you tell me when something bad happens to any of my 100 servers?"
It's worth reading Richard's whole post, but the part I want to include here is this part of his answer:
It's a good list, and I would add a few more.
One of the smartest clients I have asked me to put together the Audit Logging training class (rule 1 in consulting - listen to your clients, especially the smart ones). He had been down the PCI road, had tooling and a basic event model, but needed concrete ideas, examples, patterns and practices on how hands-on developers could integrate the audit logging beast into their apps. I put the class together and, even though it seemed an important topic, I was doubtful that anyone else would care, but it turned out to be a popular class.
One thing I have learned from reading Richard Bejtlich and studying real-world security responses (ever lose a credit card?) is that access control can only get you so far. When the stuff hits the fan it's all about monitoring and response. It's hard to get the mindset to build security into systems up front, and harder still to get people to think about building visibility in, but that's the best shot at mapping auditable events to the customers, users, identities, apps and data that you care about. To those about to audit log, we salute you.
OMG ... so they had no forensic data to review because it was lost in the blast. You forget about the possibility of complete data loss at the time of the incident if you leave data at the local collector for any length of time. This is a good example of why you want to get the data off the device ASAP, and consider redundancy in tiered data collection models. Log poisoning, explosion, whatever.
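A minimal Python sketch of that idea, using the standard `logging` module: every audit event goes to all sinks at the moment it is emitted, and one dead sink does not stop the others. The `ReplicatingHandler` class, the `build_audit_logger` helper, the file path and the collector address are all hypothetical names made up for this example.

```python
import logging
import logging.handlers


class ReplicatingHandler(logging.Handler):
    """Send each record to every sink immediately.

    A sink that raises is counted and skipped, so a dead collector does
    not stop the local copy (or the other way around).
    """

    def __init__(self, sinks):
        super().__init__()
        self.sinks = list(sinks)
        self.failures = 0  # number of sink errors seen so far

    def emit(self, record):
        for sink in self.sinks:
            try:
                sink.handle(record)
            except Exception:
                self.failures += 1


def build_audit_logger(local_path, collector_host, collector_port):
    """Hypothetical wiring: one local file plus one off-box syslog collector."""
    logger = logging.getLogger("audit")
    logger.setLevel(logging.INFO)
    local = logging.FileHandler(local_path)
    # SysLogHandler sends over UDP by default, one datagram per event.
    remote = logging.handlers.SysLogHandler(address=(collector_host, collector_port))
    logger.addHandler(ReplicatingHandler([local, remote]))
    return logger
```

This only covers the "no delay before replication" part. UDP syslog can drop events silently, so a real deployment would still want a reliable transport or a second collector tier.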
Posted by: Adrian Lane | June 09, 2010 at 09:14 PM
One thing I noticed years ago about logging and process validation is that developers push error checking to the left and business logic to the right.
That is they try to check everything at the front door whilst doing business out the backdoor with presumed OK input...
The result, if you are lucky, is two (or more) logs that just don't hang together in a useful way for diagnosis or measurement.
Worse, the logged event usually only has mangled data from previous processing and lacks the original input, so little information is available to developers and maintenance staff to trap and walk through an event.
A simple solution is "one step back logging": the previous stage in the business logic's process chain (stage n-1) keeps the original input along with its own last OK change to that input, which is what the current stage (stage n) is actually processing.
Thus if the current processing stage (stage n) either barfs or does not notify the previous stage (n-1) via exception handling that it has processed the input properly, the previous stage can log both the original input and the input that caused the current stage to barf. Thus the fault (if deterministic) can be followed through the system.
But more importantly it also allows you to "roll back" on a fault exception, which is way more important these days than it used to be.
If for instance you have a distributed system (front, middle, back end etc.) and something occurs, like the comms link going down between two parts of the system, then this can be switched to an alternative system etc...
This method of dealing with faults migrates from single systems to fault-tolerant distributed systems relatively painlessly, as the designers and programmers are effectively forced from the outset to do it in a manner that will scale.
As old as the idea is (it goes back to when objects were decidedly odd ;) it still appears new to most designers / programmers, who never seem to consider that they might, on exception, roll the process back and go down a different path.
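A rough Python sketch of one step back logging with rollback, under stated assumptions: the stages are plain callables run in a single process, and the `run_pipeline` function, the `StageFailed` exception and the log field names are hypothetical, invented for this example. The runner plays the part of stage n-1, holding both the original input and the last good value.

```python
import logging

log = logging.getLogger("pipeline")


class StageFailed(Exception):
    """Raised after a stage failure has been logged with its context."""


def run_pipeline(original, stages, fallback=None):
    """Run stages in order, keeping the original input and the last good value.

    `stages` is a list of (name, callable) pairs. `fallback` is an optional
    callable that receives the last good value when a stage fails.
    """
    last_good = original  # output of stage n-1, i.e. the input to stage n
    for name, stage in stages:
        try:
            result = stage(last_good)
        except Exception as exc:
            # One step back: log what came in the front door AND what
            # stage n-1 handed to the failing stage n.
            log.error(
                "stage=%s failed error=%r original_input=%r stage_input=%r",
                name, exc, original, last_good,
            )
            if fallback is not None:
                # Roll back to the last good value and take another path.
                return fallback(last_good)
            raise StageFailed(name) from exc
        last_good = result
    return last_good
```

In a distributed system the `fallback` would be something like retrying against an alternative back end; here it is just a callable so the shape of the idea stays visible.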
Posted by: Clive Robinson | June 11, 2010 at 06:25 AM
As for BP it's a long while since I was rig hopping.
Old habits die hard, one of which is the assumption of "minimal comms"; this is despite the likes of the Alexander Kielland, Piper Alpha and a few other disasters. Jackets are viewed as "vessels" not "plant", even though the "captain" is known as the Offshore Installation Manager (OIM).
I must admit I have a certain degree of sympathy for their position: they are just one of a number of very deep sea drillers and have to be competitive, and they are all broadly the same, thus it could just as easily have been a US or other country's oil company in the same position.
Unfortunately the oil industry is a little like NASA and the space shuttle: they regard each non-accident day as proof they are doing it right, not that the dice have not landed on a one.
What is odd is that it appears (from what little has really been said) that the blowout preventer failed. This is generally a standard bit of kit and is usually designed to work in a fail-safe manner.
The fact that it has failed for currently unknown reasons is quite worrying, because its standard design is used not just for very deep sea drilling but for all offshore drilling.
What has not helped is the political rhetoric of "Drill baby Drill" that pervaded the US Presidential campaign. It is an indicator of the endemic viewpoint of "get it up and out cheap" so the SUVs can keep rolling on a buck a gallon.
God alone knows what will happen when the Antarctic treaty ends and we start drilling in really hostile waters down there.
Posted by: Clive Robinson | June 11, 2010 at 06:53 AM