February 26, 2015

Three Types of Tokens

One of the most annoying administrative issues in Keystone is the token database filling up its MySQL backend. While we have a flush script, it needs to be scheduled via cron. Here is a short overview of the types of tokens, why the backend is necessary, and what is being done to mitigate the problem.

DRAMATIS PERSONAE:

Amanda: The company's OpenStack system admin.

Manny: The IT manager.

ACT 1, SCENE 1: A small conference room. Manny has called a meeting with Amanda.

Manny: Hey Amanda, what are these Keystone tokens and why are they causing so many problems?

Amanda: Keystone tokens are an opaque blob used to allow caching of an authentication event and some subset of the authorization data associated with the user.

Manny: OK…back up. What does that mean?

Amanda: Authentication means that you prove that you are who you claim to be. For most of OpenStack’s history, this has meant handing over a symmetric secret.

Manny: And a symmetric secret is …?

Amanda: A password.

Manny: OK, got it. I hand in my password to prove that I am me. What is the authorization data?

Amanda: In OpenStack, it is the username and the user’s roles.

Manny: All their roles?

Amanda: No, only those within the scope of the token. A token can be scoped to a project, or to a domain, but in our setup, only I ever need a domain-scoped token.

Manny: The domain is how I select between the customer list and our employees out of our LDAP server, right?

Amanda: Yep. There is another domain just for admin tasks, too. It has the service users for Nova and so on.

Manny: OK, so I get a token, and I can see all this stuff?

Amanda: Sort of. For most of the operations we do, you use the “openstack” command. That is the common command line, and it hides the fact that it is getting a token for most operations. But you can actually use a web tool called curl to go directly to the Keystone server and request a token. I do that for debugging sometimes. If you do that, you see the body of the token data in the response. But that is different from being able to read the token itself. The token is actually only 32 characters long. It is what is known as a UUID.

Manny (slowly): UUID? Universally Unique Identifier. Right?

Amanda: Right. It’s based on a long random number generated by the operating system. UUIDs are how most OpenStack services generate identifiers for VMs, images, volumes, and so on.
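
[Aside for the curious: a Keystone UUID token is essentially what Python’s uuid module produces, rendered as 32 hex characters. A minimal sketch:]

import uuid

# uuid4() is built from the operating system's randomness;
# .hex renders it as 32 hexadecimal characters, hyphens stripped.
token = uuid.uuid4().hex
print(token)       # e.g. 'f0b4704ae39d4bd8a27f1a26c1ab19b7'
print(len(token))  # 32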

Manny: Then the token doesn’t really hold all that data?

Amanda: It doesn’t. The token is just a…well, a token.

Manny: Like we used to have for the toll machines on Route 93. Till we all got E-ZPass!

Amanda: Yeah. Those tokens showed that you had paid for the trip. For OpenStack, a token is a remote reference to a subset of your user data. If you pass a token to Nova, it still has to go back to Keystone to validate the token. When it validates the token, it gets the data. However, our OpenStack deployment is so small that Nova and Keystone are on the same machine. Going back to Keystone does not require a “real” network round trip.
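
[Aside: in the v3 API, that validation is a GET to /v3/auth/tokens, with the service’s own token in the X-Auth-Token header and the token being checked in X-Subject-Token. A minimal sketch using the requests library; the endpoint and token values are placeholders:]

import requests

KEYSTONE = "http://keystone.example.com:5000"      # placeholder endpoint
SERVICE_TOKEN = "token-of-the-validating-service"  # placeholder
SUBJECT_TOKEN = "token-being-validated"            # placeholder

resp = requests.get(KEYSTONE + "/v3/auth/tokens",
                    headers={"X-Auth-Token": SERVICE_TOKEN,
                             "X-Subject-Token": SUBJECT_TOKEN})
# 200 returns the token body (user, roles, catalog); 404 means invalid or revoked.
print(resp.status_code)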

Manny: So now that we are planning on going to the multi-host setup, validating a token will require a network round trip?

Amanda: Actually, when we move to the multi-site, we are going to switch over to a different form of token that does not require a network round trip. And that is where the pain starts.

Manny: These are the PKI tokens you were talking about in the meeting?

Amanda: Yeah.

Manny: OK, I remember the term PKI was Public Key…something.

Amanda: The I is for infrastructure, but you remembered the important part.

Manny: Two keys, Public versus private: you encode with one and decode with the other.

Amanda: Yes. In this case, it is the token data that is encoded with the private key and decoded with the public key.

Manny: I thought that made it huge. Do you really encode all the data?

Amanda: No, just a signature of the data: a hash. This is called message signing, and it is used in a lot of places, basically to validate that a message is both unchanged and that it comes from the person you think it comes from.
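
[Aside: message signing in a nutshell: hash the data, sign the hash with the private key, verify with the public key. Keystone’s PKI tokens actually use CMS signing via OpenSSL, so this is only a conceptual sketch, written against recent versions of the Python cryptography package:]

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
token_data = b'{"user": "amanda", "roles": ["admin"]}'

# Sign: only the holder of the private key can produce this signature.
signature = private_key.sign(token_data, padding.PKCS1v15(), hashes.SHA256())

# Verify: anyone with the public key can check it; verify() raises
# InvalidSignature if the data or signature was tampered with.
private_key.public_key().verify(signature, token_data,
                                padding.PKCS1v15(), hashes.SHA256())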

Manny: OK, so…what is the pain?

Amanda: Two things. One, the tokens are bigger, much bigger, than a UUID. They have all of the validation data in them, including the service catalog. And our service catalog is growing on the multi-site deployment, so we’ve been warned that the tokens might get so big that they cause problems.

Manny: Let’s come back to that. What is the other problem?

Amanda: OK…since a token is remotely validated, there is the possibility that something has changed on Keystone, and the token is no longer valid. With our current system, Keystone knows this immediately, and just dumps the token. So when Nova comes to validate it, it’s no longer valid and the user has to get another token. With remote validation, Nova has to periodically request a list of revoked tokens.

Manny: So either way Keystone needs to store data. What is the problem?

Amanda: Well, today we store our tokens in Memcached. It’s a simple key-value store, it’s local to the Keystone instance, and it just dumps old data that hasn’t been used in a while. With revocations, if you dump old data, you might lose the fact that a token was revoked.

Manny: Effectively un-revoking that token?

Amanda: Yep.

Manny: OK…so how do we deal with this?

Amanda: We have to move from storing tokens in Memcached to MySQL. According to the docs and upstream discussions, this can work, but you have to be careful to schedule a job to clean up the old tokens, or you can fill up the token database. Some of the larger sites have to run this job very frequently.

Manny: It’s a major source of pain?

Amanda: It can be. We don’t think we’ll be at that scale at the multisite launch, but it might happen as we grow.

Manny: OK, back to the token size thing, then. How do we deal with that?

Amanda: OK, when we go multi-site, we are going to have one of everything at each site: Keystone, Nova, Neutron, Glance. We have some jobs to synchronize the most essential things, like the Glance images and the customer database, but the rest is going to be kept fairly separate. Each site will be tagged as a region.

Manny: So the service catalog is going to be galactic, but will be sharded out by Keystone server?

Amanda: Sort of. We are going to actually make it possible to have the complete service catalog in each keystone server, but there is an option in Keystone to specify a subset of the catalog for a given project. So when you get a token, the service catalog will be scoped down to the project in question. We’ve done some estimates of size and we’ll be able to squeak by.

Manny: So, what about the multi-site contracts? Where a company can send their VMs to either a local or remote Nova?

Amanda: For now they will be separate projects. But for the future plans, where we are going to need to be able to put them in the same project, we are stuck.

Manny: Ugh. We can’t be the only people with this problem.

Amanda: Some people are moving back to UUID tokens, but there are issues both with replication of the token database and with cross-site network traffic. But there is some upstream work that sounds promising to mitigate that.

Manny: The lightweight thing?

Amanda: Yeah, lightweight tokens. It’s backing off the remotely validated aspect of Keystone tokens, but doesn’t need to store the tokens themselves. They use a scheme called authenticated encryption, which puts a minimal amount of info into the token to be able to recreate the whole authorization data. But only the Keystone server can expand that data. Then, all that needs to be persisted is the revocations.
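
[Aside: this is the upstream work that eventually landed in Keystone as Fernet tokens. Fernet is an authenticated-encryption recipe shipped in the Python cryptography package; a minimal sketch of the idea, with placeholder payload values:]

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in Keystone, all servers share the signing keys
f = Fernet(key)

# The token carries just enough to rebuild the whole authorization data.
token = f.encrypt(b'{"user_id": "u1", "project_id": "p1"}')

# Only a holder of the key can expand (decrypt) the token;
# any tampering raises an InvalidToken exception.
print(f.decrypt(token))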

Manny: Still?

Amanda: Yeah, and the same data-flushing issues are there, but the scale of the data is much smaller. Password changes and removing roles from users are the revocations we expect to see the most. We still need a cron job to flush those.

Manny: No silver bullet, eh? Still how will that work for multisite?

Amanda: Since the token is validated by cryptography, the different sites will need to synchronize the keys. There was a project called Kite that was part of Keystone, and then it wasn’t, and then it was again, but it is actually designed to solve this problem. So all of the Keystone servers will share their keys to validate tokens locally.

Manny: We’ll still need to synchronize the revocation data?

Amanda: No silver bullet.

Manny: Do we really need the revocation data? What if we just … didn’t revoke. Made the tokens short lived.

Amanda: It’s been proposed. The problem is that a lot of the workflows were built around the idea of long-lived tokens. Tokens went from being valid for 24 hours to one hour by default, and that broke some things. Some people have had to crank the time back up again. We think we might be able to get away with shorter tokens, but we need to test and see what it breaks.
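
[Aside: the token lifetime is a single setting in keystone.conf. A hedged example; the option name below is as documented for Juno-era Keystone, in seconds:]

[token]
# Token lifetime in seconds; the upstream default dropped
# from 86400 (24 hours) to 3600 (1 hour).
expiration = 3600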

Manny: Yeah, I could see HA having a problem with that…wait, 24 hours…how does Heat do what it needs to? It can restart a machine a month afterwards. Do we just hand over the passwords to Heat?

Amanda: Heh…it used to. But Heat uses a delegation mechanism called trusts. A user creates a trust, and that effectively says that Heat can do something on the user’s behalf, but Heat has to get its own token first. It first proves that it is Heat, and then it uses the trust to get a token on the user’s behalf.

Manny: So…trusts should be used everywhere?

Amanda: Something like trusts, but more lightweight. Trusts are deliberate delegation mechanisms, and are set up on a per-user basis. To really scale, it would have to be something where the admin sets up the delegation agreement as a template. If that were the case, then these long-lived workflows would not need to use the same token.

Manny: And we could get rid of the revocation events. OK, that is time, and I have a customer meeting. Thanks.

Amanda: No problem.

EXIT

The Sax Doctor
Distinctive ring that holds the bell of a Selmer Paris Saxophone to the main body of the horn.

Ring of a Selmer Paris Saxophone

Dropped my sax off at Emilio Lyons’ house and workshop. My folks bought it for me from him at Rayburn Music in Boston back when I was a high school freshman. I still remember him pointing to the sticker on it that indicated “This is my work.”

As someone who loves both the saxophone and working with my hands, I have to admit I was looking forward to meeting him. I was even a little nervous. He has a great reputation. Was he going to chastise me for the state of my horn? It hadn’t been serviced in…way too long. I was a little worried that neglecting to oil the rods would have worn down some of the metal connections.

I spent the best forty minutes in his workshop as he explained what the horn needed: an overhaul, which meant pulling all the pads off and replacing them. I expected this.

I showed him how the bottom thumb rest didn’t fit my hand right…it was the stock Selmer piece, and it had always cut into my thumb a little. He had another from a different, older horn, that was brass. He shaped it with a hammer and…it felt good. Very good. He gave me that piece and kept mine, in case it would work for someone else.

There was another minor issue with the left thumb catching a rough spot near the thumb rest, and he covered it with some epoxy. Not magic, but a magic touch nonetheless.

To say he told me it needed an overhaul doesn’t do it justice. He explained how he would do it, step by step, especially the task of setting the pads. I know from elsewhere that this is real artistry, and takes years of experience to get right. He talked about taking the springs off and leaving the pads resting on the cups. I asked why he took the springs off.

It’s about touch. You shouldn’t work hard to cover the holes of the saxophone; it should be light. Emilio understands how the top saxophonists in the world interact with their horns. He talked about advising Sonny Rollins to use a light touch, about how he would like playing the horn better. Sonny tried for a week and then called back to apologize: he just couldn’t play that way. We talked about how other people played…Joe Lovano and George Garzone, heavy. David Sanborn, very light.

The corks and felts all need to be replaced. He has a new technique I had heard about, using little black rubber nubs. He showed me how quiet they were. “I never liked the base.” I think he meant the base set of padding that came with the Selmers.

He assured me that the metal was fine. This is a good horn. A good number.

He quoted me the price for the overhaul. It was less than other people had quoted me.

He didn’t take any contact info; he told me to contact him. I get my horn back in two weeks. I’ll make do with playing my beat-up student alto and EWI.

I’m really looking forward to getting back my Selmer Mark VI with the overhaul from Emilio. Will it play like new? I don’t know, the horn was at least 6 years old by the time I was born, and 20 when I first played it.

But I suspect it will play better than new.

February 25, 2015

Common Criteria

What is Common Criteria?

Common Criteria (CC) is an international standard (ISO/IEC 15408) for certifying computer security software. Using Protection Profiles, computer systems can be secured to certain levels that meet requirements laid out by the Common Criteria. Established by governments, the Common Criteria treaty has been signed by 17 countries, and each country recognizes the others’ certifications.

In the U.S., Common Criteria is handled by the National Information Assurance Partnership (NIAP). Other countries have their own CC authorities. Each authority certifies CC labs, which do the actual work of evaluating products. Once certified by the authority, based on the evidence from the lab and the vendor, that certification is recognized globally.

Your certification is given a particular assurance level which, roughly speaking, represents the strength of the certification. Confidence is higher in an EAL4 certification than in an EAL2 one. Attention is usually given to the assurance level rather than to what, specifically, you are being assured of, which is defined by the Protection Profiles.

A CC certification covers a very specific set of software and hardware configurations. Software versions and hardware models and versions are important, as differences will break the certification.

How does the Common Criteria work?

The Common Criteria authority in each country creates a set of expectations for particular kinds of software: operating systems, firewalls, and so on. Those expectations are called Protection Profiles. Vendors, like Red Hat, then work with a third-party lab to document how we meet the Protection Profile. A Target of Evaluation (TOE) is created, which is all the specific hardware and software that is being evaluated. Months are then spent in the lab getting the package ready for submission. This state is known as “in evaluation”.

Once the package is complete, it is submitted to the relevant authority. Once the authority reviews and approves the package, the product becomes “Common Criteria certified” for that target.

Why does Red Hat need or want Common Criteria?

Common Criteria is mandatory for software used within the US Government and other countries’ government systems. Other industries in the United States may also require Common Criteria. Because it is important to our customers, Red Hat spends the time and energy to meet these standards.

What Red Hat products are Common Criteria certified?

Currently, Red Hat Enterprise Linux (RHEL) 5.x and 6.x meet Common Criteria in various versions. Also, Red Hat’s JBoss Enterprise Application Platform 5 is certified in various versions. It should be noted that while Fedora and CentOS operating systems are related to RHEL, they are not CC certified. The Common Criteria Portal provides information on what specific versions of a product are certified and to what level. Red Hat also provides a listing of all certifications and accreditation of our products.

Are minor releases of RHEL certified?

When a minor release, or a bug fix, or a security issue arises, most customers will want to patch their systems to remain secure against the latest threats. Technically, this means falling out of compliance. For most systems, the agency’s Certifying Authority (CA) requires these updates as a matter of basic security measures. It is already understood that this breaks CC.

Connecting a Common Criteria evaluation to specific minor versions is difficult, at best, for a couple of reasons:

First, the certifications will never line up with a particular minor version exactly. A RHEL minor version is, after all, just a convenient waypoint for what is actually a constant stream of updates. The CC target, for example, began with RHEL 6.2, but the evaluated configuration will inevitably end up having packages updated from their 6.2 versions. In the case of FIPS, the certifications aren’t tied to a RHEL version at all, but to the version of the certified package. So OpenSSH server version 5.3p1-70.el6 is certified, no matter which minor RHEL version you happen to be using.

This leads to the second reason. Customers have, in the past, forced programs to stay on hopelessly outdated and unpatched systems only because they want to see /etc/redhat-release match the CC documentation exactly. Policies like this ignore the possibility that a certified package could exist in RHEL 6.2, 6.3, 6.4, etc., and the likelihood that subsequent security patches may have been made to the certified package. So we’re discouraging customers from saying “you must use version X.” After all, that’s not how CC was designed to work. We think CC should be treated as a starting point or baseline on which a program can conduct a sensible patching and errata strategy.

Can I use a product if it’s “in evaluation”?

Under NSTISSP #11, government customers must prefer products that have been certified using a US-approved protection profile. Failing that, you can use something certified under another profile. Failing that, you must ensure that the product is in evaluation.

Red Hat has successfully completed many Common Criteria processes so “in evaluation” is less uncertain than it might sound. When a product is “in evaluation”, the confidence level is high that certification will be awarded. We work with our customers and their CAs and DAAs to help provide information they need to make a decision on C&A packages that are up for review.

I’m worried about the timing of the certification. I need to deploy today!

Red Hat makes it as easy as possible for customers to use the version of Red Hat Enterprise Linux that they’re comfortable with. A subscription lets you use any version of the product as long as you have a current subscription. So you can buy a subscription today, deploy a currently certified version, and move to a more recent version once it’s certified–at no additional cost.

Why can’t I find your certification on the NIAP website?

Red Hat Enterprise Linux 6 was certified by BSI under OS Protection Profile at EAL4+. This is equivalent to certifying under NIAP under the Common Criteria mutual recognition treaties. More information on mutual recognition can be found on the CCRA web site. That site includes a list of the member countries that recognize one another’s evaluations.

How can I keep my CC-configured system patched?

A security plugin for the yum update tool allows customers to only install patches that are security fixes. This allows a system to be updated for security issues while not allowing bug fixes or enhancements to be installed. This makes for a more stable system that also meets security update requirements.

To install the security plugin, from a root-authenticated prompt:

# yum install yum-plugin-security
# yum updateinfo
# yum update --security

Once security updates have been added to the system, the CC-evaluated configuration has changed and the system is no longer certified. Even so, this is the recommended way of building a system: start with CC and then patch in accordance with DISA regulations. Consulting the CA and DAA during the system’s C&A process will help establish guidelines and expectations.

You didn’t answer all my questions. Where do I go for more help?

Red Hat Support is available anytime a customer, or potential customer, has a question about a Red Hat product.


February 24, 2015

What Can We Do About Superfish?

Perhaps the greatest question about Superfish is what we can do about it. The first response is to throw technology at it.

The challenge here is that the technology used by Superfish has legitimate uses:

  • The core Superfish application is interesting – using image analysis to deconstruct a product image and search for similar products is actually quite ingenious! I have no reservations about this if it is an application a user consciously selects and installs and deliberately uses.
  • Changing the html data returned by a web site has many uses – for example, ad blocking and script blocking tools change the web site. Even deleting tracking cookies can be considered changing the web site! Having said that, changing the contents of a web site is a very slippery slope. And I have real problems with inserting ads in a web site or changing the content of the web site without making it extremely clear this is occurring.
  • Reading the data being exchanged with other sites is needed for firewalls and other security products.
  • Creating your own certificates is a part of many applications. However, I can’t think of many cases where it is appropriate to install a root certificate – this is powerful and dangerous.
  • Even decrypting and re-encrypting web traffic has its place in proxies, especially in corporate environments.

The real problem with Superfish is how the combination of things comes together and is used – and the quality of the implementation: many reports indicate poor implementation practices, such as a single insecure password for the entire root certificate infrastructure. It doesn’t matter what encryption algorithm you are using if your master password is the name of your company!

Attempting a straight technology fix will lead to “throwing the baby out with the bath water” for several valuable technologies. And a technical fix for this specific case won’t stop the next one.

The underlying issue is how these technologies are implemented and used. Attempting to fix this through technology is doomed to failure and will likely make things worse.

Yes, there is a place for technology improvements. We should be using DNSSEC to make sure DNS information is valid. Stronger ways of validating certificate authenticity would be valuable – someone suggested DANE in one of the comments. DANE involves including the SSL certificate in the DNS records for a domain. In combination with DNSSEC, it gives you higher confidence that you are talking to the site you think you are, using the right SSL certificate. The issue here is that it requires companies to include this information in their DNS records.
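
To make DANE concrete: a TLSA record is looked up like any other DNS record, at a name derived from the port, protocol, and hostname. Here is a small sketch using the dnspython package; the domain is illustrative, and the lookup only succeeds where the site actually publishes a TLSA record:

import dns.resolver

# TLSA records live at _port._protocol.hostname
answers = dns.resolver.query("_443._tcp.www.example.com", "TLSA")
for rdata in answers:
    # Each record pins certificate data: a usage, a selector,
    # a matching type, and the certificate digest itself.
    print(rdata)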

The underlying questions involve trust and law as well as technology. To function, you need to be able to trust people – in this case Lenovo – to do the right thing. It is clear that many people feel that Lenovo has violated their trust. It is appropriate to hold Lenovo responsible for this.

The other avenue is legal. We have laws to regulate behavior and to hold people and companies responsible for their actions. Violating these regulations, regardless of the technology used, can and should be addressed through the legal system.

At the end of the day, the key issues are trust, transparency, choice, and following the law. When someone violates these they should expect to be held accountable and to pay a price in the market.


February 23, 2015

Samba vulnerability (CVE-2015-0240)

Samba is the most commonly used Windows interoperability suite of programs on Linux and Unix systems. It uses the SMB/CIFS protocol to provide secure, stable, and fast file and print services. It can also seamlessly integrate with Active Directory environments and can function as a domain controller as well as a domain member (legacy NT4-style domain controllers are supported, but the Active Directory domain controller feature of Samba 4 is not supported yet).

CVE-2015-0240 is a security flaw in the smbd file server daemon. It can be exploited by a malicious Samba client, by sending specially-crafted packets to the Samba server. No authentication is required to exploit this flaw. It can result in remotely controlled execution of arbitrary code as root.

We believe code execution is possible but we’ve not yet seen any working reproducers that would allow this.

This flaw arises because an uninitialized pointer is passed to the TALLOC_FREE() function. It can be exploited by calling the ServerPasswordSet RPC API on the NetLogon endpoint, using a NULL session over IPC.

Note: The code snippets shown below are from samba-3.6 as shipped with Red Hat Enterprise Linux 6. (All versions of samba >= 3.5 are affected by this flaw.)

In the _netr_ServerPasswordSet() function, creds is defined as a pointer to a structure, but it is not initialized:

1203 NTSTATUS _netr_ServerPasswordSet(struct pipes_struct *p,   
1204                          struct netr_ServerPasswordSet *r)   
1205 {   
1206         NTSTATUS status = NT_STATUS_OK;   
1207         int i;  
1208         struct netlogon_creds_CredentialState *creds;

Later, the netr_creds_server_step_check() function is called with &creds:

1213  status = netr_creds_server_step_check(p, p->mem_ctx,   
1214                               r->in.computer_name,   
1215                               r->in.credential,   
1216                               r->out.return_authenticator,   
1217                               &creds);

If the netr_creds_server_step_check() function fails, it returns while creds is still uninitialized. Later in the _netr_ServerPasswordSet() function, creds is freed using the TALLOC_FREE() function, which results in a free of an uninitialized pointer.

It may be possible to control the value of creds by sending a number of specially-crafted packets. The destructor pointer called by TALLOC_FREE() could then be used to execute arbitrary code.

As mentioned above, this flaw can only be triggered if netr_creds_server_step_check() fails. This is dependent on the version of Samba used.

In Samba 4.1 and above, this crash can only be triggered after setting “server schannel = yes” in the server configuration. This is due to commit adbe6cba005a2060b0f641e91b500574f4637a36, which introduced NULL initialization into the most common code path. It is still possible to trigger an early return with a memory allocation failure, but that is less likely to occur. Therefore this issue is more difficult to exploit. The Red Hat Product Security team has rated this flaw as having important impact on Red Hat Enterprise Linux 7.

In older versions of Samba (samba-3.6 as shipped with Red Hat Enterprise Linux 5 and 6, and samba-4.0 as shipped with Red Hat Enterprise Linux 6), the above-mentioned commit does not exist. An attacker could call the _netr_ServerPasswordSet() function with a NULLed buffer, which could trigger this flaw. Red Hat Product Security has rated this flaw as having critical impact on all other versions of the samba package shipped by Red Hat.

Lastly, the version of Samba 4.0 shipped with Red Hat Enterprise Linux 6.2 EUS is based on an alpha release of Samba 4, which lacked the password change functionality and thus the vulnerability. The same is true for the version of Samba 3.0 shipped with Red Hat Enterprise Linux 4 and 5.

Red Hat has issued security advisories to fix this flaw and instructions for applying the fix are available on the knowledgebase.  This flaw is also fixed in Fedora 20 and Fedora 21.

February 21, 2015

Superfish – Man-in-the-Middle Adware

Superfish has been getting a lot of attention – the Forbes article is one of the better overviews.

Instead of jumping in and covering the details of Superfish, let’s look at how it might work in the real world.

Let’s say that you are looking for a watch and you visit Fred’s Fine Watches. Every time you want to look at a watch, someone grabs the key to the cabinet from Fred, uses a magic key creator to create a new key, opens the cabinet, grabs the watch from Fred, studies the watch, looks for “similar” watches, and jams advertising fliers for these other watches in your face – right in the middle of Fred’s Fine Watches! Even worse, they leave the key in the lock, raising the possibility that others could use it. Further, if you decide to buy a watch from Fred, they grab your credit card, read it, and then hand it to Fred.

After leaving Fred’s Fine Watches you visit your bank. You stop by your doctor’s office. You visit the DMV for a drivers license renewal. And, since this article is written in February, you visit your accountant about taxes. Someone now has all this information. They claim they aren’t doing anything with it, but there is no particular reason to trust them.

How does all this work? Superfish is a man-in-the-middle attack that destroys the protection offered by SSL (Secure Sockets Layer). It consists of three basic components: the Superfish adware program, a new SSL Root Certificate inserted into the Windows Certificate Store, and a Certificate Authority program that can issue new certificates.

SSL serves two purposes: encryption and authentication. SSL works by using a certificate that includes a public encryption key, which is used to negotiate a unique encryption key for each session. This session key is then used to encrypt all traffic for that session. There are two types of SSL certificates: public and private.

Public certificates are signed. This means that they can be verified by your browser as having been created from another certificate – you have at least some assurance of where the certificate came from. That certificate can then be verified as having been created from another certificate. This can continue indefinitely until you reach the top of the certificate tree, where you have a master or root certificate. These root certificates can’t be directly verified and must be trusted.

Root certificates are connected to Internet domains. For example, Google has the google.com root certificate, and is the only one who can create a signed certificate for mail.google.com, maps.google.com, etc.

Bills Browser Certificates, Inc., can only create signed certificates for billsbrowsercertificates.com. The details are a bit more complex, but this is the general idea – signed certificates can be traced back to a root certificate. If the owner of that root certificate is cautious, you can have a reasonable level of trust that the certificate is what it claims to be.

Your browser or OS comes with a (relatively small) list of root certificates that are considered trusted. Considerable effort goes into managing these root certificates and ensuring that they are good. Creation of new signed certificates based on these root certificates is tightly controlled by whoever owns the root certificate.
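
To see that trust store in action, here is a minimal Python sketch (Python 3.4 or later): the default SSL context loads the system’s trusted root certificates and refuses any connection whose certificate chain does not lead back to one of them. This is exactly the check that an injected root certificate, like Superfish’s, subverts:

import socket
import ssl

ctx = ssl.create_default_context()  # loads the system's trusted roots

# wrap_socket() verifies the server's certificate chain against those
# roots and raises an SSL error if verification fails.
with ctx.wrap_socket(socket.create_connection(("www.example.com", 443)),
                     server_hostname="www.example.com") as conn:
    print(conn.getpeercert()["subject"])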

Certificate signing is a rather advanced topic. Let’s summarize it by saying that the mathematics behind certificate signing is sound, that implementations may be strong or weak, and that there are ways of over-riding the implementations.

Private certificates are unsigned. They are the same as public certificates, work in exactly the same way, but can’t be verified like public certificates can. Private certificates are widely used and are a vital part of communications infrastructure.

According to reports, Lenovo added a new Superfish root certificate to the Microsoft Certificate Store on certain systems. This means that Superfish is trusted by the system. Since Superfish created this certificate, they had all the information they needed to create new signed certificates, which they did by including a certificate authority program that creates new certificates signed by the Superfish root certificate – on your system, while you are browsing. These certificates are completely normal, and there is nothing unusual about them – except the way they were created.

Again, according to reports, Superfish hijacked web sessions. Marc Rogers shows an example where Superfish has created a new SSL certificate for Bank of America. The way it works is that Superfish uses this certificate to communicate with the browser and the user. The user sees an https connection to Bank of America, with no warnings – there is, in fact, a secure encrypted session in place. Unfortunately, this connection is to Superfish. Superfish then uses the real Bank of America SSL certificate to communicate with Bank of America. This is a perfectly normal session, and BOA has no idea that anything is going on.

To recap, the user enters their bank ID and password to log in to the BOA site. This information is encrypted – and sent to Superfish. Superfish decrypts the information and then re-encrypts it to send to BOA using the real BOA SSL certificate. Going the other way, Superfish receives information from BOA, decrypts it, reads it, re-encrypts it with the Superfish BOA certificate, and sends it back to you.

Superfish apparently creates a new SSL certificate for each site you visit. The only reason that all this works is that they were able to add a new root certificate to the certificate store – without this master certificate in the trusted certificate store they would not be able to create new trusted certificates.

Superfish can also change the web page you receive – this is the real purpose of Superfish. In normal operation, Superfish modifies the web page coming back from the web site you are visiting by inserting new ads. Think about it – you have no idea what the original web site sent, only what Superfish has decided to show you!

Superfish is sitting in the middle of all your web sessions. It reads everything you send, sends arbitrary information to an external server (necessary for the image analysis it claims to perform, but can be used for anything), forges encryption, and changes the results you get back.

The real threat of Superfish is that it contains multiple attack vectors and, by virtue of the root certificate, has been granted high privileges. Further, the private key Superfish is using for their root certificate has been discovered, meaning that other third parties can create new signed certificates using the Superfish root certificate. There is no way to do secure browsing on a system with Superfish installed. And no way to trust the results of any browsing you do, secure or not.


February 19, 2015

RC4 prohibited

Originally posted on securitypitfalls:

After nearly half a year of work, Internet Engineering Task Force (IETF) Request for Comments (RFC) 7465 has been published.

What it does, in a nutshell, is disallow the use of any kind of RC4 ciphersuite, in effect making all servers and clients that use one non-standards-compliant.



February 17, 2015

SCAP Workbench

SCAP Workbench allows you to select SCAP benchmarks (content) to use, tailor an SCAP scan, run an SCAP scan on a local or remote system, and to view the results of a scan. The SCAP Workbench page notes:

The main goal of this application is to lower the initial barrier of using SCAP. Therefore, the scope is very narrow – scap-workbench only scans a single machine and only with XCCDF/SDS (no direct OVAL evaluation). The assumption is that this is enough for users who want to scan a few machines, and users with a huge amount of machines to scan will just use scap-workbench to test or hand-tune their content before deploying it with more advanced (and harder to use) tools like Spacewalk.

SCAP Workbench is designed to hide the complexity of the SCAP tools and CLI. I can vouch for the ease of use of SCAP Workbench – I’ve been using it to run SCAP and find it the easiest and most flexible way to perform SCAP scans.

SCAP Workbench is an excellent tool for tailoring SCAP benchmarks. SCAP Workbench allows you to select which Benchmark to use, and then displays a list of all the rules in the Benchmark, allowing you to select which rules to evaluate.

SCAP Workbench Tailoring
In addition, SCAP Workbench allows you to modify values in the Benchmark. In the screenshot above you see a list of rules. The Set Password Expiration Parameters rule is selected and has been expanded so that we can see the various components of this rule. We have selected the minimum password length rule, and can see the details of this rule on the right side of the window.

We see the title of this rule, the unique identifier for the rule, and the type of this rule. Since this is an xccdf:Value rule, it has an explicit value that will be checked. Since this rule is checking the minimum password length, the minimum password length must be set to this value or larger.

We see that the minimum password length in the Benchmark is 12. We can change this to another value, such as 8 characters. If we change the minimum password length check, the change will be saved in the Tailoring File – the Benchmark is not modified.

After selecting the SCAP rules you wish to evaluate and modifying values as needed, you run the scan by clicking the Scan button. The scan runs and the results are displayed in SCAP Workbench. You can also see the full SCAP report by clicking the Show Report button, or save it by clicking Save Results.


February 14, 2015

Adding an LDAP backed domain to a Packstack install

I’ve been meaning to put all the steps together to do this for a while:

Got an IPA server running on CentOS 7.
Got a Packstack all-in-one install on CentOS 7. I registered this host as a FreeIPA client, though that is not strictly required.

Converted the Horizon install to be domain-aware by editing /etc/openstack-dashboard/local_settings:

OPENSTACK_API_VERSIONS = {
    "identity": 3
}
OPENSTACK_KEYSTONE_MULTIDOMAIN_SUPPORT = True
OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = 'Default'

And restarting HTTPD.

sudo yum install python-openstackclient

The keystonerc_admin file is set for v2.0 of the Identity API. To make it work with the v3 API, run cp keystonerc_admin keystonerc_admin_v3 and edit the copy:

export OS_AUTH_URL=http://10.10.10.25:5000/v3/
export OS_IDENTITY_API_VERSION=3
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default

And confirm:

$ openstack domain list
+----------------------------------+------------+---------+----------------------------------------------------------------------+
| ID                               | Name       | Enabled | Description                                                          |
+----------------------------------+------------+---------+----------------------------------------------------------------------+
| default                          | Default    | True    | Owns users and tenants (i.e. projects) available on Identity API v2. |
+----------------------------------+------------+---------+----------------------------------------------------------------------+

Add the LDAP-backed domain like this:

$ openstack domain create YOUNGLOGIC
+---------+----------------------------------+
| Field   | Value                            |
+---------+----------------------------------+
| enabled | True                             |
| id      | a9569e236912496f9f001e73fc978baa |
| name    | YOUNGLOGIC                       |
+---------+----------------------------------+

To enable domain-specific backends, edit /etc/keystone/keystone.conf like this:

[identity]
domain_specific_drivers_enabled=true
domain_config_dir=/etc/keystone/domains

Right now this domain is backed by SQL for both Identity and Assignment. To convert it to LDAP for Identity, create a file in /etc/keystone/domains. This directory and file need to be owned by the Keystone user.

Here is my LDAP-specific file, /etc/keystone/domains/keystone.YOUNGLOGIC.conf:

# The domain-specific configuration file for the YOUNGLOGIC domain

[ldap]
url=ldap://ipa.younglogic.net
user_tree_dn=cn=users,cn=accounts,dc=younglogic,dc=net
user_id_attribute=uid
user_name_attribute=uid
group_tree_dn=cn=groups,cn=accounts,dc=younglogic,dc=net


[identity]
driver = keystone.identity.backends.ldap.Identity

Restart Keystone. Juno RDO has Keystone running in Eventlet still, so use:

sudo systemctl restart openstack-keystone.service

Now, grant the admin user an admin role on the new domain:

openstack role add --domain YOUNGLOGIC --user  admin admin

Like most things, I did my initial test using curl. Save the request body below as token-request-edmund.json:

{
    "auth": {
        "identity": {
            "methods": [
                "password"
            ],
            "password": {
                "user": {
                    "domain": {
                        "name": "YOUNGLOGIC"
                    },
                    "name": "edmund",
                    "password": "nottellingyou"
                }
            }
        }
    }
}
curl -si -d @token-request-edmund.json -H "Content-type: application/json" $OS_AUTH_URL/auth/tokens

That requests an unscoped token. It has the side effect of populating the id_mappings for the user and group ids from the user that connects.

You can then assign roles to the user like this:

openstack role add --project demo --user  9417d7b6e7d53d719106b192251e7b9560577b9c1709463a19feffdd30ea7513 _member_

I have to admit I cheated: I looked at the database:

$   echo "select * from id_mapping;"  |  mysql keystone  --user keystone_admin --password=stillnottelling 
public_id	domain_id	local_id	entity_type
7c3448d7dc5f51861444a7f974bc63c77a2685a057d8545341a1fbcafd754b96	a9569e236912496f9f001e73fc978baa	ayoung	user
9417d7b6e7d53d719106b192251e7b9560577b9c1709463a19feffdd30ea7513	a9569e236912496f9f001e73fc978baa	edmund	user
862caa65329a761556ded5e6317f3f0cbfcab839f01340b334bdd2be4e54f1c4	a9569e236912496f9f001e73fc978baa	ipausers	group
b14ecd0d21ffb1485261ffce9c95b2cf8fec3d8194bfcda4257bb0ac74f80b0e	a9569e236912496f9f001e73fc978baa	peter	user
4e2abec872c7559279612abd2834ba1a3919aad9c01035cb948a91241a831830	a9569e236912496f9f001e73fc978baa	wheel	group

But with that knowledge I can do:

openstack role add --project demo --group  862caa65329a761556ded5e6317f3f0cbfcab839f01340b334bdd2be4e54f1c4 _member_

And now every user in the ipausers group gets put in the demo project. This works upon first login to Horizon.

February 12, 2015

Debugging OpenStack with rpdb

OpenStack has many different code bases.  Figuring out how to run in a debugger can be maddening, especially if you are trying to deal with Eventlet and threading issues.  Adding HTTPD into the mix, as we did for Keystone, makes it even trickier.  Here’s how I’ve been handling things using the remote python debugger (rpdb).

rpdb is a remote python debugging tool.  To use it, you edit the python source and add the line

import rpdb; rpdb.set_trace()

and execute your program.  The first time that code gets hit, the program will stop on that line, and then open up a socket. You can connect to the socket with:

telnet localhost 4444

You can replace localhost with an IP address or a hostname.
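
If the default port is already in use, or you need to attach from a different machine, rpdb also accepts an explicit address and port; this is hedged from the rpdb documentation, so check the version you have installed:

import rpdb
# Bind on all interfaces on a non-default port, so a remote
# machine can attach with: telnet <host> 4445
rpdb.set_trace(addr="0.0.0.0", port=4445)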

In order to use this from within the unit tests run from tox on python-keystoneclient, for example, you first need to get rpdb into the venv:

. .tox/py27/bin/activate
pip install rpdb
deactivate

Note that if you put the rpdb line into code that is executed by multiple tests, then on the second and subsequent tests that hit that line of code, rpdb will report an error binding the socket because the address is already in use.

I use Emacs, and to run the code such that it matches up with the source in my local git repository, I use:

Meta-X pdb

and then I run pdb like this:

telnet localhost 4444

and GUD is happy to treat it like a local pdb session.

February 04, 2015

Life-cycle of a Security Vulnerability

Security vulnerabilities, like most things, go through a life cycle from discovery to installation of a fix on an affected system. Red Hat devotes many hours a day to combing through code, researching vulnerabilities, working with the community, and testing fixes–often before customers even know a problem exists.

Discovery

When a vulnerability is discovered, Red Hat engineers go to work verifying the vulnerability and rating it to determine its overall impact to a system. This is a very important step, as misidentifying the risk could lead to a partial fix and leave systems vulnerable to a variation of the original problem. This also allows prioritization of fixes, so that the issues with the greatest risk to customers are handled first, and issues of low or minimal risk are not passed on to customers, who also need to invest time in validating new packages for their environment.

Research

Many times a vulnerability is discovered outside of Red Hat’s domain. This means that the vulnerability must be researched and reproduced in-house to fully understand the risk involved. Sometimes reproducing a vulnerability leads to discovering other vulnerabilities which need fixes or re-engineering.

Notification

When a vulnerability has been discovered, Red Hat works with upstream developers to develop and ship a patch that fixes the problem. A CVE assignment will be made that records the vulnerability and links the problem with the fix among all applicable implementations. Sometimes the vulnerability is embedded in other software and that host software would acquire the CVE. This CVE is also used by other vendors or projects that ship the same package with the same code—CVEs assigned to software Red Hat ships are not necessarily Red Hat specific.

Patch development

One of the most difficult parts of the process is the development of the fix. The fix must remedy the vulnerability completely while not introducing any other problems along the way. Red Hat reviews all patches to verify they fix the underlying vulnerability, while also checking for regressions. Sometimes Red Hat must come up with our own patches to fix a vulnerability. When this happens, we fix not only our shipped software, but also provide the fix back upstream for possible inclusion into the master software repository. In other cases, the upstream patch is not applicable because the version of the software we ship is older; in these cases, Red Hat has to backport the patch to the version we do ship. This allows us to limit changes exclusively to those required to fix the flaw, without introducing possible regressions or API/ABI changes that could have an impact on our customers.

Quality assurance

As important as patch development, Red Hat’s QE teams validate the vulnerability fix and also check for regressions. This step can take a significant amount of time and effort depending on the package, but any potential delays introduced by the quality assurance effort are worth it, as they significantly reduce the risk that the security fix is incomplete or introduces other regressions or incompatibilities. Red Hat takes the delivery of security fixes seriously, and we want to ensure that we get it right the first time: the overhead of re-delivering a fix, not to mention the additional effort by customers to re-validate a secondary fix, can be costly.

Documentation

To make understanding flaws easier, Red Hat spends time documenting what the flaw is and what it can do. This documentation is used to describe flaws in the errata that are released and in our public CVE pages. Having descriptions of issues that are easier to understand than developer comments in patches is important to customers who want to know what the flaw is and what it can do. This allows customers to properly assess the impact of the issue to their own environment. A single flaw may have very different exposure and impact for different customers and different environments, and properly-described issues allow customers to make appropriate decisions on how, when, and if the fix will be deployed in their own environment.

Patch shipment

Once a fix has made it through the engineering and verification processes, it is time to send it to the customers. At the same time the fixes are made available in the repositories, a Red Hat Security Advisory (RHSA) is published and customers are notified using the rhsa-announce list. The RHSA will provide information on the vulnerability and will point to errata that more thoroughly explain the fix.

Customers will begin to see updates available on their system almost immediately.

Follow-on support

Sometimes questions arise when security vulnerabilities are made public. Red Hat customers have access to our technical support team, which helps support all Red Hat products. Not only can they answer questions, but they can also help customers apply fixes.

Conclusion

Handling security issues properly is a complex process that involves a number of people and steps. Ensuring these steps are dealt with correctly and all issues are properly prioritized is one of the things Red Hat provides with each subscription. The level of expertise required to properly handle security issues can be quite high. Red Hat has a team of talented individuals who worry about these things so you don’t have to.

January 29, 2015

The Travelling Saxophone

The Saxophone is a harsh mistress. She demands attention every day. A musician friend once quoted to me: “Skip a day and you know. Skip two days and your friends know. Skip three days and everyone knows.” That quote keeps me practising nightly.

Playing Sax by the Seine

By the Seine: Photo by Jamie Lennox

My work on OpenStack has me travelling a bit more than I have had to for other software projects. While companies have been willing to send me to conferences in the past, only OpenStack has had me travelling four times a year: two for the Summit and two for mid-cycle meetups of the Keystone team. Keeping on a practice schedule while travelling is tough, sometimes impossible. But the nature of the places where I am visiting makes me want to bring along my horn and play there.

The Kilo OpenStack summit was in Paris in November. The thought of playing in Paris took up residence in my imagination and wouldn’t leave. I brought the horn along, but had trouble finding a place and a time to play. The third night, I decided that I would skip the scheduled fun and go play in the middle of the Arc de Triomphe, a couple blocks away from my hotel. There is a walkway under the traffic circle, with stairs that lead up to the plaza. However, a couple of police stationed at the foot of the stairs made me wonder if playing there would be an issue, and I continued on. As I approached the far end of the walkway, I heard an accordion.

The accordion player spoke no English. I spoke less French. However, his manner indicated he was overjoyed to let me play along with him.

I shut my case to indicate that tips would still be going in his box. I was certainly not playing for the money.

He struck up a tune, and I followed along, improvising. It worked. He next said the single word “Tango” and I kick-started one off with a growl. Another tune, and then he suggested “La Vie en Rose” and I shrugged. He seemed astounded that I didn’t really know the tune. This would be the equivalent of being in New Orleans and not knowing “When the Saints Go Marching In.” I faked it, but I think his enthusiasm waned, and I packed up afterwards and headed back to the hotel.

I got one other chance to play on that trip. Saturday, prior to heading to the airport, Jamie Lennox and I toured a portion of the city, near the Eiffel tower. Again, I wasn’t playing for the money, and I didn’t want to gather crowds. So we headed down to the banks of the Seine and I played near a bridge, enjoying the acoustics of the stone.

The Keystone midcycle happened in January, and I brought my sax again. This time, I played each night, usually in the courtyard of the hotel or down along the Riverwalk. The Keystone gang joined me one night after dinner, and it was gratifying to play for people I knew. On the walk back to the hotel, Dolph and Dave Stanek (maybe others) were overly interested in their cell phones. It turned out they were setting up ww.opensax.com.

Playing by the Riverwalk

Playing by the Riverwalk: Photo by Dolph Matthews

January 28, 2015

Security improvements in Red Hat Enterprise Linux 7

Each new release of Red Hat® Enterprise Linux® is not only built on top of the previous version, but a large number of its components incorporate development from the Fedora distribution. For Red Hat Enterprise Linux 7, most components are aligned with Fedora 19, and with select components coming from Fedora 20. This means that users benefit from new development in Fedora, such as firewalld which is described below. While preparing the next release of Red Hat Enterprise Linux, we review components for their readiness for an enterprise-class distribution. We also make sure that we address known vulnerabilities before the initial release. And we review new components to check that they meet our standards regarding security and general suitability for enterprise use.

One of the first things that happens is a review of the material going into a new version of Red Hat Enterprise Linux. Each release includes new packages that Red Hat has never shipped before and anything that has never been shipped in a Red Hat product receives a security review. We look for various problems – from security bugs in the actual software to packaging issues. It’s even possible that some packages won’t make the cut if they prove to have issues that cannot be resolved in a manner we decide is acceptable. It’s also possible that a package was once included as a dependency or feature that is no longer planned for the release. Rather than leave those in the release, we do our best to remove the unneeded packages as they could result in security problems later down the road.

Previously fixed security issues are also reviewed to ensure nothing has been missed since the last version. While uncommon, it is possible that a security fix didn’t make it upstream, or was somehow dropped from a package at some point during the move between major releases. We spend time reviewing these to ensure nothing important was missed that could create problems later.

Red Hat Product Security also adds several new security features in order to better protect the system.

Before its 2011 revision, the C++ language definition was ambiguous as to what should happen if an integer overflow occurs during the size computation of an array allocation. The C++ compiler in Red Hat Enterprise Linux 7 will perform a size check (and throw std::bad_alloc on failure) if the size (in bytes) of the allocated array exceeds the width of a register, even in C++98 mode. This change affects the code generated by the compiler–it is not a library-level correction. Consequently, we compiled all of Red Hat Enterprise Linux 7 with a compiler version that performs this additional check.

When we compiled Red Hat Enterprise Linux 7, we also tuned the compiler to add “stack protector” instrumentation to additional functions. The GCC compiler in Red Hat Enterprise Linux 6 used heuristics to determine whether a function warrants “stack protector” instrumentation. In contrast, the compiler in Red Hat Enterprise Linux 7 uses precise rules that add the instrumentation to only those functions that need it. This allowed us to instrument additional functions with minimal performance impact, extending this probabilistic defense against stack-based buffer overflows to an even larger part of the code base.

Red Hat Enterprise Linux 7 also includes firewalld. firewalld allows for centralized firewall management using high-level concepts, such as zones. It also extends spoofing protection based on reverse path filters to IPv6, where previous Red Hat Enterprise Linux versions only applied anti-spoofing filter rules to IPv4 network traffic.

Every version of Red Hat Enterprise Linux is the result of countless hours of work from many individuals. Above we highlighted a few of the efforts that the Red Hat Product Security team assisted with in the release of Red Hat Enterprise Linux 7. We also worked with a number of other individuals to see these changes become reality. Our job doesn’t stop there, though. Once Red Hat Enterprise Linux 7 was released, we immediately began tracking new security issues and deciding how to fix them. We’ll further explain that process in an upcoming blog post about fixing security issues in Red Hat Enterprise Linux 7.

January 20, 2015

Running SCAP Scans

OpenSCAP can be run from the command line, but there are easier ways to do it.
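
For reference, a straight command-line scan with the oscap tool looks roughly like this; the content path and profile name are illustrative and depend on the SCAP content installed (here, the scap-security-guide package):

# evaluate the system against an XCCDF profile and produce both
# machine-readable results and a human-readable report
oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_common \
    --results /tmp/results.xml --report /tmp/report.html \
    /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml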

OpenSCAP support has been integrated into Red Hat Satellite and into the Spacewalk open source management platform.

Red Hat Satellite can push SCAP content to managed systems, run and schedule SCAP audit scans, and retrieve the resulting reports, which are accessible through the Red Hat Satellite Audit tab.

If you are going to be using SCAP in production, especially on large numbers of systems, you should really be using a management framework like Red Hat Satellite or Spacewalk.

For development, testing, tuning SCAP benchmarks, and small scale use, the SCAP Workbench is a friendly and flexible tool. We will cover this in more detail in the next post.


January 12, 2015

Security Tests – SCAP Content

While the SCAP technologies are interesting, they have limited value without security content – the actual set of security tests run by SCAP. Fortunately there is a good set of content available that can be used as a starting point.

The US Government has released a set of SCAP content that covers the baseline security required – the United States Government Configuration Baseline (USGCB), which contains security configuration baselines for a variety of computer products that are widely deployed across federal agencies. USGCB content covers Internet Explorer, Windows, and Red Hat Enterprise Linux Desktop 5.

Also from the US Government are the Department of Defense STIGs, or Security Technical Implementation Guides. A specific example is the downloadable SCAP content for RHEL 6, the Red Hat 6 STIG Benchmark, Version 1, Release 4.

A number of vendors include SCAP content in their products. This is often a sample or an example – it is enough to get you started, but does not provide a comprehensive security scan.

While the available SCAP content is a good start, most organizations will have additional needs. This can be addressed in two ways: by tailoring existing SCAP content and by writing new SCAP content.

Tailoring SCAP content involves choosing which SCAP rules will be evaluated and changing parameters.

An example of changing parameters is minimum password length. The default value might be 12 characters. You can change this in a tailoring file, perhaps to 8 characters or to 16 characters for a highly secure environment.

A common way to use SCAP is to have a large SCAP benchmark (content) which is used on all systems, and to select which rules will be used for each scan. This can be changed for each system and each run. You do this by providing the SCAP benchmark and a SCAP tailoring file when you run the SCAP scanner.
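
With the OpenSCAP scanner, for example, the tailoring file is passed alongside the benchmark at scan time; the file names and profile ID below are illustrative:

# evaluate using a tailored profile defined in a separate tailoring file
oscap xccdf eval --tailoring-file my-tailoring.xml \
    --profile my_custom_profile \
    --results /tmp/results.xml ssg-rhel7-ds.xml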

Writing new SCAP content can be a daunting task. SCAP is a rich enterprise framework – in other words, it is complex and convoluted… If you are going to be writing SCAP content (and you really should), I suggest starting with Security Automation Essentials, getting very familiar with the various websites we’ve mentioned, studying the existing SCAP content, and being prepared for a significant learning curve.


Update on Red Hat Enterprise Linux 6 and FIPS 140 validations

Red Hat achieved its latest successful FIPS 140 validation back in April 2013. Since then, a lot has happened. There have been well publicized attacks on cryptographic protocols, weaknesses in implementations, and changing government requirements. With all of these issues in play, we want to explain what we are doing about it.

One of the big changes was that we enabled support of Elliptic Curve Cryptography (ECC) and Elliptic Curve Diffie Hellman (ECDH) in Red Hat Enterprise Linux to meet the National Institute of Standards and Technology’s (NIST’s) “Suite B” requirements taking effect this year. Because we added new ciphers, we knew we needed to re-certify. Re-certification brings many advantages to our government customers: they gain the new validation while maintaining coverage from our last FIPS 140 validation effort. Another advantage of re-certification is that we have picked up fixes for BEAST, Lucky 13, Heartbleed, POODLE, and some lesser known vulnerabilities around certificate validation. It should be noted that these attacks are against higher level protocols that are not part of any crypto primitives covered by a FIPS validation. But knowing the fixes are in the packages under evaluation should give customers additional peace of mind.

The Red Hat Enterprise Linux 6 re-certification is now under way. It includes reworked packages to meet all the updated requirements that NIST has put forth taking effect Jan. 1, 2014, such as a new Deterministic Random Bit Generator (DRBG) as specified in SP 800-90A (PDF); an updated RSA key generation technique as specified in FIPS 186-4 (PDF); and updated key sizes and algorithms as specified in SP 800-131A (PDF).

Progress on the certification is moving along – we’ve completed review and preliminary testing and are now applying for Cryptographic Algorithm Validation System (CAVS) certificates. After that, we’ll submit validation paperwork to NIST. All modules being re-certified are currently listed on NIST’s Modules in Process page, except Volume Encryption (dm-crypt). Its re-certification is taking a different route because the change is so minor that it does not require CAVS testing. We expect the certifications to be completed early this year.

January 06, 2015

Securing Secure Shell

I was passed an interesting article this morning about hardening Secure Shell (SSH) against weak crypto that could fall victim to cracking by the NSA and other entities.  The article is well written and discusses why the changes are necessary in light of recent Snowden file releases.


January 05, 2015

SCAP Component Technologies

We’re going to dig into SCAP in a fair amount of detail. So, let’s start by covering the various technologies that make up SCAP:

  • XCCDF – the Extensible Configuration Checklist Description Format. An XML based language for creating machine parsable security checklists.
  • OVAL – the Open Vulnerability and Assessment Language. Standardizes how to assess and report on the machine state of computer systems.
  • OCIL – the Open Checklist Interactive Language. Asks users questions, for example, “do you know who to report security breaches to?”, and allows the user to respond with yes or no – or perhaps the name and contact information of where to report security breaches.
  • CCE – Common Configuration Enumeration. Uniquely identifies configuration characteristics. For example, how do you identify minimum password length across Windows, Unix, Linux, and Mac?
  • CPE – Common Platform Enumeration. A structured naming scheme for IT systems, software, and packaging.
  • CVE – Common Vulnerabilities and Exposures. A standard way to uniquely identify computer vulnerabilities, for example Heartbleed – CVE-2014-0160.
  • CEE – Common Event Expression. A common way to record events – i.e. a standard logging format.
  • CRE – Common Remediation Enumeration. Describes how to remediate or mitigate security vulnerabilities.
  • CVSS – Common Vulnerability Scoring System. A consistent methodology for measuring and quantifying the impact and risk of vulnerabilities identified through CVE.

Some of these are widely used. For example, the CVE Database maintained by Mitre is the common resource used for sharing information on security vulnerabilities. It has been used by security professionals around the world for over 15 years.

Others are new, such as the use of XCCDF and OVAL to create standardized security content that can be shared across organizations and industries and be used by automated scanners.
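
If you have SCAP content on hand, the oscap tool can show you what a piece of content contains – its document type (XCCDF, OVAL, or a combined data stream) and the profiles it defines. The file path here is illustrative:

# inspect a piece of SCAP content and list the profiles it provides
oscap info /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml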


December 18, 2014

Before you initiate a “docker pull”

In addition to the general challenges that are inherent to isolating containers, Docker brings with it an entirely new attack surface in the form of its automated fetching and installation mechanism, “docker pull”. It may be counter-intuitive, but “docker pull” both fetches and unpacks a container image in one step. There is no verification step and, surprisingly, malformed packages can compromise a system even if the container itself is never run. Many of the CVEs issued against Docker have been related to packaging that can lead to install-time compromise and/or issues with the Docker registry.

One way, now resolved, that such malicious images could compromise a system was a simple path traversal during the unpack step. By simply using a tarball’s capacity to unpack to paths such as “../../../”, malicious images were able to overwrite any part of the host file system they desired.

Thus, one of the most important ways you can protect yourself when using Docker images is to make sure you only use content from a source you trust and to separate the download and unpack/install steps. The easiest way to do this is simply not to use the “docker pull” command. Instead, download your Docker images over a secure channel from a trusted source and then use the “docker load” command. Most image providers also serve images directly over a secure, or at least verifiable, connection. For example, Red Hat provides SSL-accessible container images. Fedora also provides Docker images with each release.

While Fedora does not provide SSL with all mirrors, it does provide a signed checksum of the Docker image that can be used to verify it before you use “docker load”.
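
Putting that together, a download-verify-load workflow looks something like the following sketch; the URLs and file names are hypothetical:

# fetch the image and its checksum over a secure or verifiable channel
curl -O https://example.com/images/rhel7.tar.xz
curl -O https://example.com/images/rhel7.tar.xz-CHECKSUM
# verify the checksum before the image is ever unpacked
sha256sum -c rhel7.tar.xz-CHECKSUM
# only then hand the image to Docker
xz -dc rhel7.tar.xz | docker load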

Since “docker pull” automatically unpacks images, and this unpacking process has itself been compromised in the past, it is possible for typos to lead to system compromises (e.g. a malicious “rel” image downloaded and unpacked when you intended “rhel”). This typo problem can also occur in Dockerfiles. One way to protect yourself is to prevent accidental access to index.docker.io at the firewall level or by adding the following /etc/hosts entry:

127.0.0.1 index.docker.io

This will cause such mistakes to timeout instead of potentially downloading unwanted images. You can still use “docker pull” for private repositories by explicitly providing the registry:

docker pull registry.somewhere.com/image

And you can use a similar syntax in Dockerfiles:

FROM registry.somewhere.com/image

Providing a wider ecosystem of trusted images is exactly why Red Hat began its certification program for container applications. Docker is an amazing technology, but it is neither a security nor interoperability panacea. Images still need to come from sources that certify their security, level-of-support, and compatibility.

December 17, 2014

Container Security: Isolation Heaven or Dependency Hell

Docker is the public face of Linux containers and two of Linux’s unsung heroes: control groups (cgroups) and namespaces. Like virtualization, containers are appealing because they help solve two of the oldest problems to plague developers: “dependency hell” and “environmental hell.”

Closely related, dependency and environmental hell can best be thought of as the chief cause of “works for me” situations. Dependency hell simply describes the complexity inherent in a modern application’s tangled graph of the external libraries and programs it needs to function. Environmental hell is the name for the operating system portion of that same problem (i.e. which quirks, in particular which bash implementation, that quick script you wrote unknowingly relies on).

Namespaces provide the solution in much the same way as virtual memory simplified writing code on a multi-tenant machine: by providing the illusion that an application suite has the computer all to itself. In other words, “via isolation”. When a process or process group is isolated via these new namespace features, we say they are “contained.” In this way, virtualization and containers are conceptually related, but containers isolate in a completely different way, and conflating the two is just the first of a series of misconceptions that must be cleared up in order to understand how to use containers as securely as possible. Virtualization involves fully isolating programs to the point that one can use Linux, for example, while another uses BSD. Containers are not so isolated. Here are a few of the ways that “containers do not contain”:

  1. Containers all share the same kernel. If a contained application is hijacked with a privilege escalation vulnerability, all running containers *and* the host are compromised. Similarly, it isn’t possible for two containers to use different versions of the same kernel module.
  2. Several resources are *not* namespaced. Examples include normal ulimit systems still being needed to control resources such as file handles. The kernel keyring is another example of a resource that is not namespaced. Many beginning users of containers find it counter-intuitive that socket handles can be exhausted or that Kerberos credentials are shared between containers when they believe they have exclusive system access. A badly behaving process in one container could use up all the file handles on a system and starve the other containers. Diagnosing the shared resource usage is not feasible from within a container.
  3. By default, containers inherit many system-level kernel capabilities. While Docker has many useful options for restricting kernel capabilities (see the example after this list), you need a deeper understanding of an application’s needs to run it inside containers than you would if running it in a VM. The containers and the applications within them will be dependent on the capabilities of the kernel on which they reside.
  4. Containers are not “write once, run anywhere”. Since they use the host kernel, applications must be compatible with said kernel. Just because many applications don’t depend on particular kernel features doesn’t mean that no applications do.
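
As an example of the restriction options mentioned in point 3, Docker can drop kernel capabilities a contained application does not need; the image name here is illustrative:

# drop all kernel capabilities, then add back only what the app requires
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE registry.somewhere.com/image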

For these and other reasons, Docker images should be designed and used with consideration for the host system on which they are running. By only consuming images from trusted sources, you reduce the risk of deploying containerized applications that exhaust system resources or otherwise create a denial of service attack on shared resources. Docker images should be considered as powerful as RPMs and should only be installed from sources you trust. You wouldn’t expect your system to remain secure if you randomly installed untrusted RPMs, nor should you if you “docker pull” random Docker images.

In the future we will discuss the topic of untrusted images.

December 12, 2014

How to really screw up TLS

I’ve noticed a few of my favorite websites failing with some odd error from Firefox.

[Image: Firefox’s “Unable to connect securely” error message]

The Firefox error message is a bit misleading.  It actually has nothing to do with the website supporting SSL 3.0, but the advanced info is spot on.  The error “ssl_error_no_cypher_overlap” means that the client didn’t offer any ciphers that the server also supports.  Generally when I see this I assume that the server has been set up poorly and only supports unsafe ciphers.  In this case the website only supports the RC4 cipher.  I wondered why I was starting to see RC4-only configurations reappear on so many websites (especially since RC4 is very weak and on the way out).  Apparently these websites all use the F5 load balancer, which had a bad implementation of the TLS 1.0 standard, causing a POODLE-like vulnerability.

Stepping back for a moment: back in October the POODLE vulnerability hit the streets and a mass exodus from SSL 3.0 happened around the world.  I was happy to see so many people running away from the broken cryptographic protocol and very happy to see the big push to implement the latest version of TLS, TLS 1.2.  So with SSL 3.0 out of the way and the POODLE vulnerability squelched, why are we seeing problems in TLS 1.0 now?

Well, simply put, F5 load balancers don’t implement TLS 1.0 correctly.  The problem with SSL 3.0 is that the padding format isn’t checked, and apparently in the F5 devices it’s still a problem in TLS 1.0.  And while the company did offer up patches to fix the issue, some really bad advice has been circulating the Internetz telling people to only support RC4, again.  Sigh.

When RC4 finally dies a fiery death I’ll likely throw a party.  I’m sure I won’t be the only one…


December 11, 2014

Rolekit (or “How I learned to stop thinking in terms of packages”)

What’s the problem?

Let’s start with a simplification and discuss the lifecycle of software at a high-level:

  1. Research and Development – In this phase, the software is designed, coded and (hopefully) tested.
  2. Packaging – Here, we take the compiled, tested bits of the software and bundle it up into some sort of package that can be used to deliver it to a user.
  3. Deployment – An end-user takes the package and does something interesting with it (for the purists out there, I’m lumping the test, staging and production environments into the “deployment” category).

Despite the brevity of the list above, there are a lot of moving parts here. I’m going to use the Fedora process to illustrate how this all works in a pre-rolekit world and then talk a little bit about the limitations, some of the alternatives and finally how rolekit addresses the issue. First, though, I’ll answer the question I posited in the header: “What’s the problem?”

The problem to be solved is how to get useful software up and running in an end-user’s environment with the least amount of difficulty for the user. The first and most important rule in software is this: software is a means to an end, not an end unto itself. People install a piece of software in order to achieve a goal. This goal could be something relatively simple, such as “I want to listen to this MP3 I bought” or as complex as “I run the IT department for a multinational manufacturing company and I want to keep track of all my products, the rate of their sales and margins as well as what my competitors are doing”. The job of software is to enable the user to get to that desired state. To that end, I would argue this: it is far more important to help the user get started than it is to offer them every possible feature.

Some of you may interject: “But if you don’t have the feature they need, won’t they go to someone who does?”. Sure, sometimes that will happen. But you will probably discover that people will make a different tradeoff than you might think: “I can get 90% of what I need and get it set up in a few weeks” is a far more compelling statement to make to a financial decision-maker than “This product provides everything we need, but I’ll need two more full-time people to get it running next year”.

What are we doing today?

Open source development is fairly unique compared to traditional software development. One of its major advantages for development can also become its biggest challenge to deployment. Because of the breadth of open source projects out there, there is almost always someone who has done at least a piece of what you want to do already. These other projects, such as coding libraries, web application frameworks, video game engines, etc. all provide the building blocks to start your work. The great thing here is that you can pick up the pieces that you need from somewhere else and then focus your attention only on the parts that make your project unique or exciting.

However, the challenge starts happening when you get to the packaging phase. Now that you have something you want to share with the world, you need to package it in a manner that allows them to use it. There are generally two schools of thought on how to do this, each with their own strengths and weaknesses.

  1. Grab the source code (or pre-built binaries) for everything that you depend on for your project to work and package them all together in a single deliverable.
  2. Package all of your dependencies separately in their own deliverables.

I’m not going to go into the details of why, but the Fedora Project has policies that require the second option. (If you’re interested in the reasoning, I strongly recommend reading the Fedora Packaging Guidelines page on the subject). Fedora then provides a dependency-resolution mechanism that simplifies this case by ensuring that when you attempt to retrieve the package you want, it also automatically installs all of the packages that it depends on (and so on, recursively until they are all satisfied).

How do we deploy it now?

There are two schools of thought on this subject, which I will refer to as the “Fedora Approach” and the “Debian Approach”, since those two Linux distributions best represent them. (Note: my understanding of the Debian Approach is second-hand, so if I get any of the subtleties incorrect, please feel free to leave a comment and I’ll correct it).

The Debian Approach

In Debian and its derivatives (such as Ubuntu, Mint, etc.), when the package resolution is completed and the packages are downloaded, the user is required to indicate at that time their explicit decision on how the package must behave. Through a system called “debconf”, package installation is directly tied to deployment; the package installation cannot conclude without it being explicitly configured at that time. If the installation is non-interactive (such as if the installation was initiated by another service, rather than the user), the configuration must either be specified by an “answer file” (a configuration file passed to debconf stating the answers in advance) or else the package must provide a sensible set of defaults to automatically deploy it.

The Fedora Approach

In Fedora and its derivatives (such as Red Hat Enterprise Linux, CentOS, Scientific Linux, etc.), when the package resolution is completed and the packages are downloaded, that’s it. In the vast majority of cases, the software is now on the system, but it is not configured to do anything at all. (There are a few specific exceptions which have been granted by the Fedora Engineering Steering Committee for things like the firewall). On these systems, nothing will happen until the user takes an explicit action to configure and start the services.

“That sounds like the Debian Approach is better!” you may say. However, there are concerns to be had here. For one, the dependency-resolution behavior explained above comes into play; as a user, you may not be fully aware of which packages are going to be pulled in by your dependencies (even accidentally). Furthermore, just because you installed a web-server package doesn’t mean that you necessarily want it running immediately. So Fedora forces you to make these decisions explicitly, rather than implicitly: when you’re ready, you configure the software and then start it up.

Where does this fall down?

The real problem is that the concept of “packages” derives very much from the engineering side of things. A package is a logical bundling of software for the developers. Not all problems can be solved with a single package, though. For example, the FreeIPA identity-management solution requires many top-level packages including an LDAP directory server, a certificate authority server, a DNS server and others. In this, the concept of a “package” gets more than a little fuzzy. In this particular case (as has been common historically), the solution was “Let’s make another package that glues them together!”. So the FreeIPA package just adds those other packages to its dependency chain.

But just adding more packages doesn’t necessarily solve the end-user concern: How do I easily deploy this?

Enter rolekit

Rolekit was designed specifically to handle the deployment situation and to shield end-users from the concept of project-level packages. Instead, complete solutions will be “packaged” as Server Roles. Users will come to rolekit and declare a machine to be e.g. a Domain Controller, providing the minimum information necessary to set it up (today, that’s just an admin password in the Domain Controller example). Rolekit will handle all of the other necessary work under the hood, which involves downloading the appropriate packages, installing them on the system, setting up the configuration, starting the appropriate services and carefully opening up the firewall to allow access to it.

There are a lot of moving parts involved in deploying a role, but the user doesn’t really need to know what they are. If they can be shielded from much of the noise and churn inherent in package installation, configuration, service management and firewall settings, then they get back much of their time for solving the problems unique to their environments.
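
To make that workflow concrete, here is a hypothetical invocation of rolekit’s rolectl command-line client; the settings file and the exact flag for passing it are illustrative and may differ from the shipped tool:

# deploy the Domain Controller role, supplying the minimal settings as JSON
rolectl deploy domaincontroller --settings-file=/root/dc-settings.json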

Fedora and Server Roles

As of Fedora 21, we have implemented the first release of the rolekit framework as well as a single representative Role: the Domain Controller. For Fedora 22, we’re working with the Cockpit project to produce a simple and powerful graphical interface to deploy the Domain Controller Role as well as building a new Database Server Role. As the project progresses, we very much hope that others will come forward to help us build more solutions. A few that I’d love to see (but don’t have time to start on yet):

  • A fileserver role that manages Samba and NFS file-shares (maybe [s]ftp as well).
  • A mail and/or groupware server role built atop something like Kolab
  • A backup server

Welcome to the post-package world, my friends!


December 10, 2014

Analysis of the CVE-2013-6435 Flaw in RPM

The RPM Package Manager (RPM) is a powerful command-line driven package management system capable of installing, uninstalling, verifying, querying, and updating software packages. RPM was originally written in 1997 by Erik Troan and Marc Ewing. Since then RPM has been successfully used in all versions of Red Hat Linux and currently in Red Hat Enterprise Linux.

RPM offers considerable advantages over the traditional open-source software installation methodology of building from source via tarballs, especially when it comes to software distribution and management. This has led other Linux distributions to adopt RPM as either their default package management system or to offer it as an alternative to their defaults.

Like any large, widely used piece of software, RPM has gained many features over time, and several security flaws have been found in it. On several occasions Red Hat has found and fixed security issues in RPM.

Florian Weimer of Red Hat Product Security discovered an interesting flaw in RPM, which was assigned CVE-2013-6435. Firstly, let’s take a brief look at the structure of an RPM file. It consists of two main parts: the RPM header and the payload. The payload is a compressed CPIO archive of binary files that are installed by the RPM utility. The RPM header, among other things, contains a cryptographic checksum of all the installed files in the CPIO archive. The header also contains a provision for a cryptographic signature. The signature works by performing a mathematical function on the header and archive section of the file. The mathematical function can be an encryption process, such as PGP (Pretty Good Privacy), or a message digest in the MD5 format.

If the RPM is signed, one can use the corresponding public key to verify the integrity and even the authenticity of the package. However, RPM only checked the header and not the payload during the installation.

When an RPM is installed, it writes the contents of the package to its target directory and then verifies its checksum against the value in the header. If the checksum does not match, that means something is wrong with the package (possibly someone has tampered with it) and the file is removed. At this point RPM refuses to install that particular package.

Though this may seem like the correct way to handle things, it has a bad consequence. Let’s assume RPM installs a file in the /etc/cron.d directory and then verifies its checksum. This offers a small race-window, in which crond can run before the checksum is found to be incorrect and the file is removed. There are several ways to prolong this window as well. So in the end we achieve arbitrary code execution as root, even though the system administrator assumes that the RPM package was never installed.

The approach Red Hat used to solve the problem is:

  • Require the size in the header to match the size of the file in the payload. This prevents anyone from tampering with the payload, because the header is cryptographically verified. (This fix is already present in the upstream version of RPM.)
  • Set restrictive permissions while a file is being unpacked from an RPM package. This will only allow root to access those files. Also, several programs, including cron, perform a check for permission sanity before running those files.

Another approach to mitigate this issue is the use of the O_TMPFILE flag. Linux kernel 3.11 and above introduced this flag, which can be passed to open(2), to simplify the creation of secure temporary files. Files opened with the O_TMPFILE flag are created, but they are not visible in the file system. As soon as they are closed, they are deleted. There are two uses for these files: race-free temporary files and creation of initially unreachable files. These unreachable files can be written to or changed just like regular files. RPM could use this approach to create a temporary, unreachable file, run a checksum on it, and either delete it or atomically link it into place, without being vulnerable to the attack described above. However, as mentioned above, this feature is only available in Linux kernel 3.11 and above, was added to glibc 2.19, and is slowly making its way into GNU/Linux distributions.

The risk mentioned above is greatly reduced if the following precautions are followed:

  • Always check signatures of RPM packages before installing them. Red Hat RPMs are signed with cryptographic keys provided at https://access.redhat.com/security/team/key. When installing RPMs from Red Hat or Fedora repositories, Yum will automatically validate RPM packages via the respective public keys, unless explicitly told not to (via the “nogpgcheck” option and configuration directive).
  • Package downloads via Red Hat software repositories are protected via TLS/SSL so it is extremely difficult to tamper with them in transit. Fedora uses a whole-file hash chain rooted in a hash downloaded over TLS/SSL from a Fedora-run central server.

The above issue (CVE-2013-6435) has been fixed along with another issue (CVE-2014-8118), which is a potentially exploitable crash in the CPIO parser.

Red Hat customers should update to the latest versions of RPM via the following security advisories:
https://rhn.redhat.com/errata/RHSA-2014-1974.html
https://rhn.redhat.com/errata/RHSA-2014-1975.html
https://rhn.redhat.com/errata/RHSA-2014-1976.html

December 03, 2014

Disabling SSLv3 on the client and server

Recently, some Internet search engines announced that they would prefer websites secured with encryption over those that were not.  Of course there are other reasons why securing your website with encryption is beneficial: protecting authentication credentials, mitigating the use of cookies as a means of tracking and allowing access, protecting the privacy of your users, and authenticating your own server, thus protecting the information you are trying to convey to your users.  And while setting up and using encryption on a webserver can be trivial, doing it properly might take a few additional minutes.

Red Hat strives to ship sane defaults that allow both security and availability.  Depending on your clients, a more stringent or more lax configuration may be desirable.  Red Hat Support provides both written documentation and friendly people who can help make sense of it all.  Ultimately, it is the responsibility of the system owner to secure the systems they host.

Good cryptographic protocols

Protocols are the basis for all cryptography and provide the instructions for implementing ciphers and using certificates.  In the asymmetric, or public key, encryption world the protocols are all based on the Secure Sockets Layer, or SSL, protocol.  SSL has come a long way since its initial release in 1995.  Development has moved relatively quickly and the latest version, Transport Layer Security version 1.2 (TLS 1.2), is now the standard that all new software should support.

Unfortunately some of the software found on the Internet still supports or even requires older versions of the SSL protocol.  These older protocols are showing their age and are starting to fail.  The most recent example is the POODLE vulnerability which showed how weak SSL 3.0 really is.

In response to the weakened protocol, Red Hat has provided advice on disabling SSL 3.0 in its products and has helped its customers implement the best available cryptography.  This is seen in products from Apache httpd to Mozilla Firefox.  Because SSL 3.0 is quickly approaching its twentieth birthday, it’s probably best to move on to newer and better options.
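
For Apache httpd with mod_ssl, for instance, disabling the legacy protocols is a one-line configuration change:

# in ssl.conf or the SSL virtual host: allow all protocols except SSLv2 and SSLv3
SSLProtocol all -SSLv2 -SSLv3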

Of course the protocol can’t fix everything if you’re using bad ciphers.

Good cryptographic ciphers

Cryptographic ciphers are just as important for protecting your information.  Weak ciphers, like RC4, are still used on the Internet today even though better and more efficient ciphers are available.  Unfortunately the recommendations change frequently; what was suggested just a few months ago may no longer be a good choice today.  As more work goes into researching the available ciphers, weaknesses are discovered.

Fortunately there are resources available to help you stay up to date.  Mozilla provides recommended cipher choices that are updated regularly.  Broken down into three categories, system owners can determine which configuration best meets their needs.

Of course the cipher can’t fix everything if your certificates are not secure.

Certificates

Certificates are what authenticate your server to your users.  If an attacker can spoof your certificate they can intercept all traffic going between your server and users.  It’s important to protect your keys and certificates once they have been generated.  Using a hardware security module (HSM) to store your certificates is a great idea.  Using a reputable certificate authority is equally important.

Clients

Most clients that support SSL/TLS encryption automatically try to negotiate the latest version.  We found with the POODLE attack that HTTP clients, such as Firefox, could be downgraded to a weak protocol like SSL 3.0.  Because of this, many server owners went ahead and disabled SSL 3.0 to prevent the downgrade attack from affecting their users.  Mozilla has, with their latest version of Firefox, disabled SSL 3.0 by default (although it can be re-enabled for legacy support).  Now users are protected even when server owners are lax in their security (although they are still at the mercy of the server’s cipher and protocol choices).

Much of the work has already been done behind the scenes and in the development of the software that is used to serve up websites as well as consume the data that comes from these servers.  The final step is for system owners to implement the technology that is available.  While a healthy understanding of cryptography and public key infrastructure is good, it is not necessary to properly implement good cryptographic solutions.  What is important is protecting your data and that of your users.  Trust is built during every interaction, and your website is usually a large part of that interaction.

December 02, 2014

Security Audit Automation Made Easy with SCAP

Security automation can be defined as the use of standardized specifications and protocols to perform specific common security functions.

Which leads us to SCAP – the Security Content Automation Protocol, an industry and government initiative to automate security audits and compliance.

The basic concept of SCAP is that security guides should be executable content, not paper documents. You should be able to define your security requirements (or security content) in a form that can be run on a computer with no human intervention, and which produces an audit report that can be understood by both computers and people. You should be able to run these reports – effectively, to perform a complete security audit on a system – as frequently as you want.

Further, these security guides should be dynamic, extensible, customizable, and actionable.

  • Dynamic – as new security threats are discovered, the threat and how to respond to the threat should be added to the security guide.
  • Extensible – you should be able to get security content from multiple sources, as well as create your own specialized security content.
  • Customizable – you should be able to choose which security rules apply to which systems. For example, a web server in a DMZ, a database server and a development system will all have different security requirements.
  • Actionable – the security guide should not only identify security issues, it should also give you assistance in resolving these security issues. Specifically, it should help you understand what the issue is, what the risk is, and what the exposure is, as well as what steps can be taken to resolve or mitigate it.

And, of course, consistent. You may recall the discussion of password rules from a few posts back. You need to apply the same security rules across Windows, Solaris, AIX, HP-UX, Linux, Mac, and all other computers you have.

For people who want to jump ahead, good resources for SCAP include the NIST SCAP website (scap.nist.gov), the OpenSCAP project (www.open-scap.org), and the SCAP Security Guide project.


MySQL On Fedora 20 Setup

I’ve set up MySQL enough times figuring things out from docs that I decided I need to take notes.

This is a destructive re-install. Don’t do this if you value your data. In fact, just don’t do this.

Cleanup after old installs.

sudo systemctl stop mysqld.service
sudo yum erase mysql mariadb-libs
#remove files that have the vestiges of old installs
sudo rm -rf  /var/lib/mysql/
sudo rm -rf /etc/mysql/conf.d/
sudo rm -rf /etc/my.cnf.d/
sudo rm -rf /etc/my.cnf
#Install
sudo yum install mysql-server
# run the server
sudo systemctl start mysqld.service
#create a db
sudo mysqladmin create keystone

Connect to the database as root to do the basics. Yes, this could be scripted from the command line:

To create a user for ayoung:

$ sudo mysql keystone
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 5.5.39-MariaDB-wsrep MariaDB Server, wsrep_25.10.r4014

Copyright (c) 2000, 2014, Oracle, Monty Program Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [keystone]> CREATE USER 'ayoung'@'localhost' IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.00 sec)

MariaDB [keystone]> GRANT ALL PRIVILEGES ON * . * TO 'ayoung'@'localhost';
Query OK, 0 rows affected (0.00 sec)

MariaDB [keystone]> 

Log in as ayoung

$ mysql keystone --password
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 5.5.39-MariaDB-wsrep MariaDB Server, wsrep_25.10.r4014

Copyright (c) 2000, 2014, Oracle, Monty Program Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [keystone]> 
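
As noted above, this could also be scripted from the command line; a non-interactive equivalent of the session above (same hypothetical user and password) would be:

sudo mysql keystone -e "CREATE USER 'ayoung'@'localhost' IDENTIFIED BY 'password'; GRANT ALL PRIVILEGES ON *.* TO 'ayoung'@'localhost';"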

November 24, 2014

Security Checklists and the US National Checklist Program

If you are going to perform a security audit you need a checklist.

Let’s spend a minute on this. If you want a predictable outcome, you need a standard process – a standard set of steps to go through to reach that outcome. Basic stuff. But here is the tricky part: people are bad about remembering things and doing things the same way every time. If the results are important, you need a checklist.

Rather than spending a lot of time here, I’m going to hand out a reading assignment: The Checklist Manifesto: How to Get Things Right by Atul Gawande. This is one of the books I strongly recommend everyone should read. Go ahead, I’ll wait until you come back.

OK, welcome back.

Let’s take a look at applying checklists to security. The first suggestion I will make is don’t write checklists from scratch. Find one that is close to what you need and modify it. It takes several iterations and considerable experience to develop a solid process that works – the more you can build on other people’s experience, the less work you have to do. And the better your chances of getting it right!

A good resource for checklists on computer security is the US National Checklist Program. This is a repository of publicly available security checklists to provide detailed guidance on setting the security configuration of operating systems and applications.

Let’s start out with a written checklist – how about the HP LaserJet 4345 MFP Security Checklist. This is a 49 page document detailing how to secure a printer. Yes, a printer. Modern printers are actually servers with a print engine hanging off the side. They can be a major security risk. They have an internal disk drive that stores the documents being printed. Did you securely remove classified documents from the last printer you got rid of?

The document covers threat models, network security, printer settings, and ramifications of the various settings. It includes many screenshots of how to use the Web-based management interface to access and change the many settings.

The good news is that this security guide exists. The bad news is that it is a time consuming manual process to apply it. Speaking of which – who configured your printer when it was installed six years ago? Did they do it right? What has happened in the intervening time? Did someone disable security on the printer so that they could get their job done?

It looks like it is time to print out the security guide and start pointing your browser at all the printers in your organization!

There has got to be a better way to do this. And no, ignoring security until you show up on the front page of the newspaper or in front of a congressional committee isn’t a better way!…


November 19, 2014

Availability of OpenLMI in Various Linux Distributions

A quick update on the availability of OpenLMI:

I have tested Fedora, RHEL, CentOS, and OEL servers using the LMI CLI running on a Fedora system – the cross platform access works.
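
For a flavor of what that looks like, the LMI CLI takes a host option and a task-oriented subcommand; the host name below is hypothetical and the available subcommands depend on which client scripts are installed:

# from a Fedora client, list the system services on a remote CentOS 7 host
lmi -h cent7.example.com service list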

Fedora

Fedora is the primary development platform for OpenLMI. OpenLMI support has been included in Fedora starting with Fedora 18. We strongly recommend using Fedora 20 or the upcoming Fedora 21 release when using Fedora with OpenLMI, as these include the latest versions of OpenLMI. Fedora includes all OpenLMI capabilities: the CIMOM, all Providers, the client tools and all client scripts.

Red Hat Enterprise Linux

RHEL 7 includes the OpenLMI CIMOM and Providers. RHEL 7 includes the client side infrastructure (LMIShell and the LMI CLI). Many of the client scripts are available through the EPEL repository.

CentOS

CentOS 7 includes the OpenLMI CIMOM and Providers. CentOS 7 includes the client side infrastructure (LMIShell and the LMI CLI). Many of the client scripts are available through the EPEL repository.

Oracle Enterprise Linux

OEL 7 includes the OpenLMI CIMOM and Providers.

SuSE

SLES 12 includes a subset of the OpenLMI Providers. SuSE uses the sfcb CIMOM instead of the OpenPegasus CIMOM used by default in the other distributions (both sfcb and OpenPegasus ship in all of these Linux distributions).

SLES 12 includes the following OpenLMI Providers:

  • openlmi-fan
  • openlmi-hardware
  • openlmi-journald
  • openlmi-logicalfile
  • openlmi-pcp
  • openlmi-powermanagement
  • openlmi-python-base
  • openlmi-python-providers
  • openlmi-realmd
  • openlmi-service
  • openlmi-software

The SLES 12 documentation notes that “Only reading of management information is supported for the ‘openlmi’ providers.”.

SLES 12 does not include the OpenLMI storage or network Providers; thus, you cannot use OpenLMI to query or configure storage or networks on a SLES 12 system.

Debian

OpenLMI support is not currently available in Debian.

Ubuntu

OpenLMI support is not currently available in Ubuntu.


November 18, 2014

LISA’14 – Are We Making Linux Too Easy?

LISA’14, the Large Installation System Administration conference, was held in Seattle last week. I had the opportunity to give a talk on Server Management – if you are interested, the slides are available here.

One of the questions caught me completely off guard: “Aren’t you afraid that you are making system management too simple and that people won’t learn how to really manage Linux? They will just learn a few simple commands and not go any further. Today, they have to learn how Linux works and how to solve problems. OpenLMI will leave them unprepared.”

Wow… Where to start?

Thinking about this further, it could happen. In fact it will happen! Many people are looking for the quickest fix to a problem – a common way of working is to Google what you need, find something that looks like it should work, try a quick cut and paste, and move on.

OpenLMI is designed to support this. The LMI CLI is task oriented, simple, and easy to use. All you really need to use the LMI CLI is “LMI help”. The LMIShell scripts are designed to do useful work, to be easy to read, and to be modified for specific tasks.

If someone is simply looking for a way to perform a specific task, use it, and move on to the next problem, OpenLMI is a good way to go. You can use OpenLMI at a shallow level, even use it to avoid having to learn how Linux really works.

On the other hand, OpenLMI can also be used to ease into a deep knowledge of Linux: Start with the LMI CLI and use it to perform tasks. Move into LMIShell and start using and developing scripts. From there it is straightforward to develop custom automation tools. You have several ways to dive deeper into Linux administration, perhaps even developing custom OpenLMI Providers.

I would suggest that OpenLMI makes Linux more approachable. Some people will only use OpenLMI and will never go deeper – if they can do what they need to do, this seems like a reasonable approach. Some people will use OpenLMI as a tool and an entry point to mastering Linux administration; this is great.

I don’t believe everyone needs to master Linux to use it. Consider the car analogy: All some people want to do is drive a car – automatic transmissions are perfect for them. Some people want to be able to do light repairs such as oil changes. Some want to rebuild engines and repair major subsystems of the car. And some people want to design the eight speed computer controlled automatic transmissions that are part of the integrated drive train of modern cars!

What do you think? Do we face a real risk of making Linux “too easy”, or should we try to make Linux more approachable?


Automation – a Security Imperative

So far we have established:

  • Security Guides are a good idea and exist in almost all organizations.
  • Security audits are good and widely used.
  • Security guides are often poorly written, subject to interpretation, and difficult to apply.
  • Security audits are expensive and not performed as often as they should be.

Hmmm…. Well, computers are good at following rules and measuring things. And if security guide rules are precise enough to be implemented and measured, they are very close to what you need to create a computer program.

The obvious next step is to create computer programs to implement security rules and perform computer audits!

In fact, this is what has been done for years. Numerous programs have been written for security, many security capabilities are built into operating systems, and scripts to configure systems are widely used.

However: security at the enterprise level is a big, complex undertaking.

You need a large investment in tracking threats as they emerge. It would be terribly convenient if there were a standard way to talk about threats – for example, the first 6 people who identify a new computer virus are going to call it different things, unless something is done to create a standard definition.

The vast majority of computer security issues are quickly fixed after they are identified. Decades of experience show that most computer intrusions can be prevented by applying existing patches. The question is what patches need to be applied to each specific system? This is a more complex question than it appears to be – few organizations automatically apply all patches to all systems. Instead, they test patches and carefully apply specific patches to specific systems.

The challenge is knowing which patches have been applied, which patches are available, and which patches are needed for each system. What is the risk addressed by each patch, what is the impact, and how relevant is the exposure?
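
On RPM-based systems, some of this per-system accounting can already be automated; for example, yum’s security plugin can report and apply only security-relevant patches (the exact command names vary a bit between yum versions):

# list security advisories that apply to this system
yum updateinfo list security

# apply only the updates that fix security issues
yum update --security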

Creating a useful set of security rules is a huge undertaking. If each organization is 90% common with other organizations and 10% unique, it is incredibly wasteful for each organization to build 100% of the security rules themselves.

And enterprise systems are complex. You need a workflow and extensible frameworks to be able to effectively secure, manage and monitor them.

All of these things call out for an industry wide initiative to build a standard foundation for automating security.


November 17, 2014

Dynamic Policy in Keystone

Ever get that feeling that an epiphany is right around the corner? I spent a good portion of the OpenStack summit with that feeling. I knew that it would not be earth shattering, or lead me to want to rewrite Keystone, but rather a clarification of how a bunch of things should fall together. The “click” happened on the second to last day, and it can be summarized in a few key points.

When discussing the OAUTH1.0 extension to Keystone, several people commented on how it was similar to trusts, and that we should have a unified mechanism between them for delegation. During a discussion with David Chadwick, he mentioned that the role assignments themselves were a form of delegation, and lamented that we were losing the chain of delegation by how we delegate roles. So the first point was this:

Keystone should have a single, unified mechanism for delegation.

One key feature that feeds into that is the ability to break a big role into smaller ones. I had posted a spec for hierarchical roles prior to the summit, but wasn’t clear on how to implement it; I could see how it could be implemented on the token side, but all the people I talked to insisted it made more sense on the enforcement side. That is the second big point.

Role inheritance should be expanded by policy enforcement.

Policy is almost all static. Each OpenStack project has its own policy file in its own git repo. Extending it to cover user requests for things like project-specific policy or more granular roles has not been possible.

UPDATE: I’ve been asked to make clearer what problems this addresses.

  1. Determine what roles a user can assign to another user
  2. Allow a user to determine what roles they need to perform some action
  3. Allow some user interface to determine what a user is capable of doing based on their roles
  4. Establish an iterative process to solve the long-standing bug that a user with admin on any scope has admin on all scopes.
  5. Allow a user to delegate a subset of their capabilities to a remote service.

What we have now is a simple set of specs that build on each other and that will, in the end, provide a much more powerful, flexible, and consistent delegation mechanism for Keystone. Here are the general steps:

  1. Graduate oslo policy to a library
  2. Add to the policy library the essential code to enforce policy based on a keystone token.  I’ve looked at both the Keystone and Nova pieces that do this, and they are similar enough that we should not have too much problem making this happen.
  3. Add in the ability to fetch the policy.json file from Keystone.
  4. Add a rule to the Keystone policy API to return the default policy file if no policy file is specified for an endpoint.
  5. Merge the current default policy files from all of the projects into a single policy file, with namespaces that keep the rules from conflicting across services.  Reduce the duplication of rules like “admin_or_owner”  so that we have a consistent catalog of capabilities across OpenStack.  Make this merged file the default that is served out of Keystone when an endpoint asks for a policy file and Keystone does not have an endpoint specific file to give it.
  6. Make a database schema to hold the rules from the policy file.  Use this to generate the policy files served by Keystone.  There should be no functional difference between the global file and the one produced in the above merge.
  7. Use the hierarchical role definitions to generate the rules for the file above.  For example, rules that essentially say “grant access to a user with any role on this project” will now say “grant access to any user with the member role, or with any role that inherits the member role.”  The member role will be the lowest form of access.  Admin will inherit member, as will all other defined roles.
  8. Break member up into smaller roles.  For example,  we could distinguish between actions that can only read state from those that can change it:  “Observer”  and “Editor”  Member would inherit editor, and editor would inherit observer.
  9. Change the rules for specific API policy enforcement points to know about the new roles.  For example, the API to create a new image in glance might now require the editor role instead of the member role.  But, since member inherits editor, all current users will be able to perform the same set of operations.
  10. Change the role assignment mechanism so that a user can only assign a role that they themselves have on the designated scope.  In order to assign Member, the user must have the member role, or a role that inherits Member, such as admin.  Role assignment, trusts, oauth, and any other mechanism out there will follow this limitation.  We will have to work out additional limitations, such as determining what happens to a delegated role when the person that did the delegation has that role removed; perhaps one will need a specific role in order to perform “sticky” role assignments that last past your employment, or perhaps we will allow a user to pass some/all of their delegations on to another user.

 

This is still in the planning stage.  One nice thing about a plan like this is that each stage shows value on its own, so that if we only get as far as, say, stage 3, we still have a better system than we do today.  Many of the details are still hiding in the weeds, and will require more design.  But I think the above approach makes sense, and will make Keystone do what a lot of people need it to do.

Minimal Token Size

OpenStack Keystone tokens can become too big to fit in the headers between mod_wsgi and the WSGI applications. Compression mitigates the problem somewhat, but if token sizes continue to grow, eventually they outpace the benefits of compression. How can we keep them to a minimal size?

There are two variables in the size of the tokens: the packaging and the data inside. The packaging for a PKIZ token has a lower bound based on the signing algorithm. An empty CMS document of compressed data is going to be no less than 650 bytes. An unscoped token with proper compression comes in at 930 bytes. These sizes are workable for headers, but they mean that we have to keep the additional data inside the token body as small as possible.

Encoding

Let’s shift gears back to the encoding. A recent proposal suggested using symmetric encryption instead of asymmetric. The idea is that a subset of the data would be encrypted by Keystone, and the data would have to be sent back to Keystone to validate. What would this save us?

Let’s assume for a moment that we don’t want to pay any of the overhead of the CMS message format. Instead, Keystone will encrypt just the JSON and base64 the data. How much does that save us? It depends on the encryption algorithm. An empty token will be tiny: 33 bytes when encrypted like this:

openssl bf -salt -a -in cms/empty.json -out cms/empty.bf

Which, according to the openssl man page, is blowfish encrypted and base64 encoded. What about a non-trivial token? Turns out, our unscoped token is quite a bit bigger: 780 bytes for the comparable call:

openssl bf -salt -a -k key.data -in cms/auth_token_unscoped.json -out cms/auth_token_unscoped.bf

Compared with the PKIZ format at 929 bytes, the benefit does not seem all that great.

What about a scoped token with role data embedded in it, but no service catalog? It turns out the compression actually makes the PKIZ format more efficient: PKIZ is 917 bytes versus 1008 for the bf.

Content

What data is in the token?

  • Identification: what you would see in an unsigned token: user id and name, domain id and possibly name.
  • Scope: domain and project info.
  • Roles: specific to the scope.
  • Service catalog: the sets of services and endpoints that implement those services.

It is the service catalog that is so problematic. While we have stated that you can make tokens without a service catalog, doing so is really not going to allow the endpoints to make any sort of decisions about where to get resources.

There is a lot of redundant data in the catalog. We've discussed doing ID-only service catalogs. That implies that each ID is expandable on the endpoint side: the endpoints need to be able to fetch the full service catalog and then look up the endpoints by ID.

But let us think in terms of scale. If there is a service catalog with, say, 512 endpoints, we are still going to be sending tokens that carry 512 * length(endpoint_id) bytes of catalog data.

Can we do better? According to Jon Bentley in Programming Pearls, yes we can. We can use a bitmap. No, not the image format. Here a bitmap is an array of bits, each of which, when set, indicates inclusion of the member in the set.

We need two things. One, a cached version of the service catalog on the endpoints. But now we need to put a slightly stricter constraint on it: the token must match up exactly to a version of the service catalog, and the service catalog must contain that version number. I'd take the git approach: hash the service catalog document with sha256, and include that hash as the version in the token.

Second, we need to enforce ordering on the service catalog. Each endpoint must be in a repeatable location in the list. I need to be able to refer to the endpoints, not by ID, but by sequence number.

Now, what would the token contain? Two things:

  • The hash of the service catalog.
  • A bitmap of the included services.

Here's a minimal service catalog:

Index | Service name | endpoint ID
 0 | Nova | N1
 1 | Glance | G1
 2 | Neutron | T1
 3 | Cinder | C1

A service catalog that had all of the endpoints would be (b for binary) b1111 or, in hex, 0xF.

A service catalog with only Nova would be b0001 or 0x1.

Just Cinder would be b1000 or 0x8.

A service catalog with 512 endpoints would be 512 bits in length: 64 bytes, roughly the size of a sha256 hash. A comparable list of UUIDs would take 16384 characters, not including the JSON overhead of commas and quotes.
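To make the hash-plus-bitmap idea concrete, here is a minimal sketch in Python. The catalog document and the field layout are invented for illustration; the real format would be whatever Keystone settles on.

import hashlib
import json

# A canonical, ordered service catalog: the order must be stable so that
# bit positions are repeatable between Keystone and the endpoints.
catalog = [
    {"service": "Nova",    "endpoint_id": "N1"},
    {"service": "Glance",  "endpoint_id": "G1"},
    {"service": "Neutron", "endpoint_id": "T1"},
    {"service": "Cinder",  "endpoint_id": "C1"},
]

# Version the catalog the way git versions content: hash the document.
catalog_doc = json.dumps(catalog, sort_keys=True)
catalog_sha256 = hashlib.sha256(catalog_doc.encode("utf-8")).hexdigest()

# Build the bitmap: bit i set means endpoint i is included in this token.
included = {"N1", "C1"}  # just Nova and Cinder
bitmap = 0
for index, entry in enumerate(catalog):
    if entry["endpoint_id"] in included:
        bitmap |= 1 << index

print(catalog_sha256)
print(hex(bitmap))  # 0x9, i.e. b1001: Nova (bit 0) and Cinder (bit 3)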

I've done a couple of tests with token data in both the minimized and the endpoint_id only formats. With 30 endpoint ids, the compressed token size is 1969 bytes. Adding one ID to that increases the size to 1989. The minimized format is 1117 when built with the following data:

"minimizedServiceCatalog": { 
    "catalog_sha256": "7c7b67a0b88c271384c94ed7d93423b79584da24a712c2ece0f57c9dd2060924",
    "entrymap": "Ox2a9d590bdb724e6d888db96f846c9fd8" },

The ID-only format would scale up at roughly 20 bytes per endpoint; the minimized one would stay fairly fixed in length.

Are there other options? If a token without a catalog assumed that all endpoints were valid, and auth_token middleware set the environment for the request appropriately, then there is no reason to even send a catalog over.

Project filtering of endpoints could allow for definitions of a service catalog that is a subset of the overall catalog. These subordinate service catalogs could have their own ids, and be sent over in the token. This would minimize the size of data in the token at the expense of the server; a huge number of projects, each with its own service catalog, would lead to a large synchronization effort between the endpoints and the Keystone server.

If a token is only allowed to work with a limited subset of the endpoints assigned to the project, then maintaining strictly small service catalogs in their current format would be acceptable. However, this would require a significant number of changes in how users and services request tokens from Keystone.

November 12, 2014

Enterprise Linux 6.5 to 6.6 risk report

Red Hat Enterprise Linux 6.6 was released on the 14th of October, 2014, eleven months since the release of 6.5 in November 2013. So let's use this opportunity to take a quick look back over the vulnerabilities and security updates made in that time, specifically for Red Hat Enterprise Linux 6 Server.

Red Hat Enterprise Linux 6 is in its fourth year since release, and will receive security updates until November 30th 2020.

Errata count

The chart below illustrates the total number of security updates issued for Red Hat Enterprise Linux 6 Server if you had installed 6.5, up to and including the 6.6 release, broken down by severity. It’s split into two columns, one for the packages you’d get if you did a default install, and the other if you installed every single package.

During installation there actually isn’t an option to install every package, you’d have to manually select them all, and it’s not a likely scenario. For a given installation, the number of package updates and vulnerabilities that affected you will depend on exactly what you selected during installation and which packages you have subsequently installed or removed.

[Chart: Security errata 6.5 to 6.6, Red Hat Enterprise Linux 6 Server]

For a default install, from release of 6.5 up to and including 6.6, we shipped 47 advisories to address 219 vulnerabilities. 2 advisories were rated critical, 25 were important, and the remaining 20 were moderate and low.

Or, for all packages, from release of 6.5 up to and including 6.6, we shipped 116 advisories to address 399 vulnerabilities. 13 advisories were rated critical, 53 were important, and the remaining 50 were moderate and low.

You can cut down the number of security issues you need to deal with by carefully choosing the right Red Hat Enterprise Linux variant and package set when deploying a new system, and ensuring you install the latest available Update release.

 

Critical vulnerabilities

Vulnerabilities rated critical severity are the ones that can pose the most risk to an organisation. By definition, a critical vulnerability is one that could be exploited remotely and automatically by a worm. However we also stretch that definition to include those flaws that affect web browsers or plug-ins where a user only needs to visit a malicious (or compromised) website in order to be exploited. Most of the critical vulnerabilities we fix fall into that latter category.

The 13 critical advisories addressed 42 critical vulnerabilities across six different projects:

  • An update to php RHSA-2013:1813 (December 2013).  A memory corruption flaw was found in the way the openssl_x509_parse() function of the PHP openssl extension parsed X.509 certificates. A remote attacker could use this flaw to provide a malicious self-signed certificate or a certificate signed by a trusted authority to a PHP application using the aforementioned function, causing the application to crash or, possibly, allow the attacker to execute arbitrary code with the privileges of the user running the PHP interpreter.
  • An update to Java (OpenJDK):
    • RHSA-2014:0026 (January 2014).  Multiple improper permission check issues were discovered in the Serviceability, Security, CORBA, JAAS, JAXP, and Networking components in OpenJDK. An untrusted Java application or applet could use these flaws to bypass certain Java sandbox restrictions.
    • RHSA-2014:0406 (April 2014).  An input validation flaw was discovered in the medialib library in the 2D component. A specially crafted image could trigger Java Virtual Machine memory corruption when processed. A remote attacker, or an untrusted Java application or applet, could possibly use this flaw to execute arbitrary code with the privileges of the user running the Java Virtual Machine.
    • RHSA-2014:0889 (July 2014).  It was discovered that the Hotspot component in OpenJDK did not properly verify bytecode from the class files. An untrusted Java application or applet could possibly use these flaws to bypass Java sandbox restrictions.
  • An update to ruby RHSA-2013:1764 (November 2013).  A buffer overflow flaw was found in the way Ruby parsed floating point numbers from their text representation. If an application using Ruby accepted untrusted input strings and converted them to floating point numbers, an attacker able to provide such input could cause the application to crash or, possibly, execute arbitrary code with the privileges of the application.
  • An update to nss and nspr RHSA-2014:0917 (July 2014).  A race condition was found in the way NSS verified certain certificates.  A remote attacker could use this flaw to crash an application using NSS or, possibly, execute arbitrary code with the privileges of the user running that application.
  • An update to bash (Shellshock) RHSA-2014:1293 (September 2014).  A flaw was found in the way Bash evaluated certain specially crafted environment variables. An attacker could use this flaw to override or bypass environment restrictions to execute shell commands. Certain services and applications allow remote unauthenticated attackers to provide environment variables, allowing them to exploit this issue.
  • An update to Firefox:
    • RHSA-2013:1812 (December 2013).   Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to terminate unexpectedly or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:0132 (February 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:0310 (March 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:0448 (April 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:0741 (June 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:0919 (July 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:1144 (September 2014). Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.
    • RHSA-2014:1635 (October 2014).  Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox. A flaw was also found in the Alarm API, which allows applications to schedule actions to be run in the future. A malicious web application could use this flaw to bypass cross-origin restrictions.

97% of updates to correct 42 critical vulnerabilities were available via Red Hat Network either the same day or the next calendar day after the issues were public.

Previous update releases

We generally measure risk in terms of the number of vulnerabilities, but the actual effort in maintaining a Red Hat Enterprise Linux system is more related to the number of advisories we released: a single Firefox advisory may fix ten different issues of critical severity, but takes far less total effort to manage than ten separate advisories each fixing one critical PHP vulnerability.

To compare these statistics with previous update releases we need to take into account that the time between each update release is different. So looking at a default installation and calculating the number of advisories per month gives the following chart:

[Chart: Security errata per month, Red Hat Enterprise Linux 6 Server default install]

This data is interesting to get a feel for the risk of running Enterprise Linux 6 Server, but isn't really useful for comparisons with other major versions, distributions, or operating systems; for example, a default install of Red Hat Enterprise Linux 6 Server does not include Firefox, but Red Hat Enterprise Linux 5 Server does. You can use our public security measurement data and tools, and run your own custom metrics for any given Red Hat product, package set, timescales, and severity range of interest.

See also: 6.5, 6.4, 6.3, 6.2, and 6.1 risk reports.

November 10, 2014

System Audits – There Has to be a Better Way!

We’re now at the point where we can discuss a system audit. We have defined what an audit is, what security requirements are, and what a security guide is.

At the most basic level, a system audit involves examining a system to verify that it conforms to specifications. This includes operational specifications for the role the system is performing, verifying the integrity and configuration of the system, and compliance against the company security guide.

In many cases system audits are manual processes. A team of people, either internal staff or an external company hired to do the audit, goes through a written set of checklists and manually verifies system settings and configuration.

These audits are time consuming, tedious, and expensive. They are also error prone…

As a result, companies may only audit a system every six months, once a year, or even every two years.

There has to be a better way!


November 06, 2014

High Level Requirements for a Security Guide

Let’s lay out some basic requirements for a security guide:

  • The security guide must exist. It must be available, updated, and maintained.
  • The security guide must incorporate relevant government and industry requirements.
  • The security guide must be actionable. If it can’t be implemented it is useless.
  • The security guide should be pro-active, describing what should be done, not what is forbidden. And, where applicable, how to do it.
  • It should be possible to verify compliance with the security requirements through a system audit.
  • The security guide should support the company mission.

This last item may strike you as a bit odd… Recall that the reason we have computer systems is to generate business value. The security guide should balance security risk against generation of business value. If a computer system can’t be used to generate business value, you might as well get rid of it. And, of course, the most secure system is one that doesn’t exist! (Just for the record, this is a joke…)

Ideally, the security guide is a tool to improve the operation of an organization, balancing protection needs against business needs, ease of use, and the threat profile a company actually faces.


October 30, 2014

Ability to remove TLS 1.0 from httpd in CentOS 6

Due to a bug in mod_ssl, the ability to remove TLS 1.0 (and only support TLS 1.1 and/or TLS 1.2) has not been available.  The fix has now made it to CentOS 6 and you can now fine-tune your cryptographic protocols with ease.

Before the fix my /etc/httpd/conf.d/ssl.conf file had this line:

SSLProtocol all -SSLv2 -SSLv3

This allows all SSL protocols except SSLv2 and SSLv3 to be used with httpd.  This isn't a bad solution but there are a couple of sites that I'd prefer to further lock down by removing TLS 1.0 and TLS 1.1.  With the fix now in mod_ssl my settings can now look like this:

SSLProtocol all -SSLv2 -SSLv3 -TLSv1 -TLSv1.1

…and I’ll only support TLS 1.2 and beyond.  Of course doing this will significantly reduce the number of clients that can connect to my server.  According to SSLLabs I’m blocking all IE users before IE 11, Android before 4.4.2, Java 7, and Firefox 24.2.0 ESR.  But luckily I really don’t have a problem with any of these browsers for a couple of things I do so I’ll likely tighten up security there and leave my more public sites alone.

Update (2014-12-12)

NSS and mod_nss for httpd weren't discussed because they're not in use on my systems.  It should be noted that mod_nss can be configured similarly to mod_ssl; however, mod_nss does not support TLS 1.2, so you'll max out at TLS 1.1.


October 22, 2014

Configuring FreeBSD as a FreeIPA client

A recent thread on the freeipa-users mailing list highlighted one user’s experience with setting up FreeBSD as a FreeIPA client, complete with SSSD and Sudo integration. GNU+Linux systems have ipa-client-install, but the lack of an equivalent on FreeBSD means that much of the configuration must be done manually. There is a lot of room for error, and this user encountered several "gotchas" and caveats.

Services that require manual configuration include PAM, NSS, Kerberos and SSSD. Certain features may require even more services to be configured, such as sshd, for known_hosts management. Most of the steps have been outlined in a post on the FreeBSD forums.

But before one can even begin configuring all these services, SSSD, Sudo and related software and dependencies must be installed. Unfortunately, as also outlined in the forum post, non-default port options and a certain make.conf variable must be set in order to build the software such that the system can be used as a FreeIPA client. Similarly, the official binary package repositories do not provide the packages in a suitable configuration.

This post details how I built a custom binary package repository for FreeBSD and how administrators can use it to install exactly the right packages needed to operate as a FreeIPA client. Not all FreeBSD administrators will want to take this path, but those who do will not have to worry about getting the ports built correctly, and will save some time since the packages come pre-built.

Custom package repository

poudriere is a tool for creating binary package repositories compatible with FreeBSD’s next-generation pkg(8) package manager (also known as "pkgng".) The official package repositories are built using poudriere, but anyone can use it to build their own package repositories. Repositories are built in isolated jails (an OS-level virtualisation technology similar to LXC or Docker) and can build packages from a list of ports (or the entire ports tree) with customised options. A customised make.conf file can also be supplied for each jail.

Providing a custom repository with FreeIPA-compatible packages is a practical way to help people wanting to use FreeBSD with FreeIPA. It means fewer steps in preparing a system as a FreeIPA client (fewer opportunities to make mistakes), and also saves a substantial amount of time since the administrator doesn’t need to build any ports. The BSD Now podcast has a detailed poudriere tutorial; all the detail on how to use poudriere is included there, so I will just list the FreeIPA-specific configuration for the FreeIPA repository:

  • security/sudo is built with the SSSD option set
  • WANT_OPENLDAP_SASL=yes appears in the jail’s make.conf

The repository is currently being built for FreeBSD 10.0 (both amd64 and i386.) 10.1 is not far away; once it is released I will build it for 10.1 instead. If anyone out there would like it built for FreeBSD 9.3 I can do that too – just let me know!

Assuming the custom repository is available for the release and architecture of the FreeBSD system, the following script will enable the repository and install the required packages.

#!/bin/sh
pkg install -y ca_root_nss
ln -s /usr/local/share/certs/ca-root-nss.crt /etc/ssl/cert.pem
mkdir -p /usr/local/etc/pkg/repos
cat >/usr/local/etc/pkg/repos/FreeIPA.conf <<"EOF"
FreeIPA: {
  url: "https://frase.id.au/pkg/${ABI}_FreeIPA",
  signature_type: "pubkey",
  pubkey: "/usr/share/keys/pkg/FreeIPA.pem",
  enabled: yes
}
EOF
cat >/usr/share/keys/pkg/FreeIPA.pem <<EOF
-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAopt0Lubb0ur+L+VzsP9k
i4QrvQb/4gVlmr/d59lUsTr9cz5B5OtNLi+WMVcNh4EmmNIiWoVuQY4Wqjm2d1IA
VCXw+OqeAuj9nUW4jSvI/lDLyErFBXezNM5yggeesiV2ii+uO41zOjUxnSkupFzh
zOWr+Oj4kJI/iNU++3RpzyrBSmSGK9TN9k3afhyDMNlJi5SqK/wOrSjqAMfaufHE
MkJqBibDL/+xx48SbtInhtD4LIneHoOGxVtkLIcTSS5EpnIsDWZgXX6jBatv9LJe
u2UeQsKLKcCgrhT3VX+pc/aDsUFS4ZqOonLRt9mcFVxC4NDNMKsfXTCd760HQXYU
enVLydNavvGtGYQpbUWx5IT3IphaNxWANACpWrcvTawgPyGkGTPd347Nqhm5YV2c
YRf4rVX/S7U0QOzMPxHKN4siZVCspiedY+O4P6qe2R2cTyxntjLVGZcTBlXAdQJ8
UfQuuX97FX47xghxR6wyWfkXGCes2kVdVo0fF0vkYe1652SGJsfWjc5ojR9KFKkD
DN3x3Wu6kW0koZMF3Tf0rtSLDmbZEBddIPFrXo8QHiyqFtU3DLrYWGmbLRkYKnYR
KvG3XCJ6EmvMlfr8GjDIaEiGo7E7IyLusZXXzbIW2EKQdwa6p4N8wrW/30Ov53jp
rO+Bwn10+9DZTupQ3c04lsUCAwEAAQ==
-----END PUBLIC KEY-----
EOF
pkg update
pkg install -r FreeIPA -y cyrus-sasl-gssapi sssd sudo

Once the packages are installed from the custom repository, configuration can continue as indicated in the forum post.

Future efforts

This post was concerned with package installation. This is an important but relatively small part of setting up a FreeBSD client. There is more that can be done to make it easier to integrate FreeBSD (and other non-GNU+Linux systems) with FreeIPA. I will conclude this post with some ideas along this trajectory.

Recent versions of FreeIPA include the ipa-advise tool, which explains how various legacy systems can be configured to some extent as FreeIPA clients. ipa-advise config-freebsd-nss-pam-ldapd shows advice on how to configure a FreeBSD system, but the information is out of date in many respects – it references the old binary package tools (which have now been completely removed) and has no information about SSSD. This information should be updated. I have had this task on a sticky-note for a little while now, but if someone else beats me to it, that would be no bad thing.

The latest major version of SSSD is 1.12, but the FreeBSD port is back at 1.9. The 1.9 release is a long-term maintenance (LTM) release, but any efforts to bring 1.12 to FreeBSD alongside 1.9 would undoubtedly be appreciated by the port maintainer and users.

A longer term goal should be a port of (or an equivalent to) ipa-client-install for FreeBSD. Most of the software needed for FreeIPA integration on FreeBSD is similar or identical to that used on GNU+Linux, but there are some differences. It would be a time consuming task – lots of trial runs and testing – but probably not particularly difficult.

In regards to the package repository, work is underway to add support for package flavours to the FreeBSD packaging infrastructure. When this feature is ready, a small effort should be undertaken to add a FreeIPA flavour to the ports tree, and ensure that the resultant packages are made available in the official package repository. Once this is achieved, neither manual port builds nor the custom package repository will be required – everything needed to configure FreeBSD as a FreeIPA client will be available to all FreeBSD users by default.

October 20, 2014

Can SSL 3.0 be fixed? An analysis of the POODLE attack.

SSL and TLS are cryptographic protocols which allow users to securely communicate over the Internet. Their development history is no different from other standards on the Internet. Security flaws were found with older versions and other improvements were required as technology progressed (for example elliptic curve cryptography or ECC), which led to the creation of newer versions of the protocol.

It is easier to write newer standards, and maybe even implement them in code, than to adapt existing ones while maintaining backward compatibility. The widespread use of SSL/TLS to secure traffic on the Internet makes a uniform update difficult. This is especially true for hardware and embedded devices such as routers and consumer electronics which may receive infrequent updates from their vendors.

The fact that legacy systems and protocols need to be supported, even though more secure options are available, has led to the inclusion of a version negotiation mechanism in SSL/TLS protocols. This mechanism allows a client and a server to communicate even if the highest SSL/TLS version they support is not identical. The client indicates the highest version it supports in its ClientHello handshake message, then the server picks the highest version supported by both the client and the server, then communicates this version back to the client in its ServerHello handshake message. The SSL/TLS protocols implement protections to prevent a man-in-the-middle (MITM) attacker from being able to tamper with handshake messages that force the use of a protocol version lower than the highest version supported by both the client and the server.

Most popular browsers implement a different out-of-band mechanism for fallback to earlier protocol versions. Some SSL/TLS implementations do not correctly handle cases when a connecting client supports a newer TLS protocol version than supported by the server, or when certain TLS extensions are used. Instead of negotiating the highest TLS version supported by the server the connection attempt may fail. As a workaround, the web browser may attempt to re-connect with certain protocol versions disabled. For example, the browser may initially connect claiming TLS 1.2 as the highest supported version, and subsequently reconnect claiming only TLS 1.1, TLS 1.0, or eventually SSL 3.0 as the highest supported version until the connection attempt succeeds. This can trivially allow a MITM attacker to cause a protocol downgrade and make the client/server use SSL 3.0. This fallback behavior is not seen in non-HTTPS clients.

The issue related to the POODLE flaw is an attack against the "authenticate-then-encrypt" constructions used by block ciphers in their cipher block chaining (CBC) mode, as used in SSL and TLS. By using SSL 3.0, at most 256 connections are required to reliably decrypt one byte of ciphertext. Known flaws already affect RC4 and other non-block ciphers, and their use is discouraged.

Several cryptographic library vendors have issued patches which introduce TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) support to their libraries. This is essentially a fallback mechanism in which clients indicate to the server that they can speak a newer SSL/TLS version than the one they are proposing. If TLS_FALLBACK_SCSV was included in the ClientHello and the highest protocol version supported by the server is higher than the version indicated by the client, the server aborts the connection, because it means that the client is trying to fall back to an older version even though it can speak a newer one.
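The server-side check amounts to a few lines. Here is a sketch in Python, with invented structure and field names (client_hello, cipher_suites), not the API of any actual TLS library:

# Cipher suite value reserved for the fallback signal in the SCSV draft.
TLS_FALLBACK_SCSV = 0x5600

class FatalAlert(Exception):
    pass

def check_client_hello(client_hello, server_max_version):
    # The client sends the SCSV only on a fallback retry, i.e. when it
    # actually supports something newer than the version it now proposes.
    if TLS_FALLBACK_SCSV in client_hello.cipher_suites:
        if server_max_version > client_hello.version:
            # The downgrade was unnecessary, so something (possibly a MITM)
            # interfered with the earlier handshake: abort the connection.
            raise FatalAlert('inappropriate_fallback')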

Before applying this fix, there are several things that need to be understood:

  • As discussed before, only web browsers perform an out-of-band protocol fallback. Not all web browsers currently support TLS_FALLBACK_SCSV in their released versions. Even if the patch is applied on the server, the connection may still be unsafe if the browser is able to negotiate SSL 3.0.
  • Clients which do not implement out-of-protocol TLS version downgrades (generally anything which does not speak HTTPS) do not need to be changed. Adding TLS_FALLBACK_SCSV is unnecessary (and even impossible) if there is no downgrade logic in the client application.
  • Thunderbird shares a lot of its code with the Firefox web browser, including the connection setup code for IMAPS and SMTPS. This means that Thunderbird will perform an insecure protocol downgrade, just like Firefox. However, the plaintext recovery attack described in the POODLE paper does not apply to IMAPS or SMTPS, and the web browser in Thunderbird has Javascript disabled, and is usually not used to access sites which require authentication, so the impact on Thunderbird is very limited.
  • The TLS/SSL server needs to be patched to support the SCSV extension – though, as opposed to the client, the server does not have to be rebuilt with source changes applied. Just installing an upgraded TLS library is sufficient. Due to the current lack of browser support, this server-side change does not have any positive security impact as of this writing. It only prepares for a future where a significant share of browsers implement TLS_FALLBACK_SCSV.
  • If both the server and the client are patched and one of them only supports SSL 3.0, SSL 3.0 will be used directly, which results in a connection with reduced security (compared to currently recommended practices). However, the alternative is a total connection failure or, in some situations, an unencrypted connection which does nothing to protect from an MITM attack. SSL 3.0 is still better than an unencrypted connection.
  • As a stop-gap measure against attacks based on SSL 3.0, disabling support for this aging protocol can be performed on the server and the client. Advice on disabling SSL 3.0 in various Red Hat products and components is available on the Knowledge Base.

Information about (the lack of) ongoing attacks may help with a decision. Protocol downgrades are not covert attacks, particularly in this case. It is possible to log SSL/TLS protocol versions negotiated with clients and compare these versions with expected version numbers (as derived from user profiles or the HTTP user agent header). Even after a forced downgrade to SSL 3.0, HTTPS protects against tampering. The plaintext recovery attack described in the POODLE paper (Bodo Möller, Thai Duong, Krzysztof Kotowicz, This POODLE Bites: Exploiting The SSL 3.0 Fallback, September 2014) can be detected by the server, and even the sheer number of requests it generates could be noticeable.

Red Hat has done additional research regarding the downgrade attack in question. We have not found any clients that can be forcibly downgraded by an attacker other than clients that speak HTTPS. Due to this fact, disabling SSL 3.0 on services which are not used by HTTPS clients does not affect the level of security offered. A client that supports a higher protocol version and cannot be downgraded is not at issue as it will always use the higher protocol version.

SSL 3.0 cannot be repaired at this point because what constitutes the SSL 3.0 protocol is set in stone by its specification. However, starting in 1999, successor protocols to SSL 3.0 were developed called TLS 1.0, 1.1, and 1.2 (which is currently the most recent version). Because of the built-in protocol upgrade mechanisms, these successor protocols will be used whenever possible. In this sense, SSL 3.0 has indeed been fixed – an update to SSL 3.0 should be seen as being TLS 1.0, 1.1, and 1.2. Implementing TLS_FALLBACK_SCSV handling in servers makes sure that attackers cannot circumvent the fixes in later protocol versions.

October 15, 2014

POODLE – An SSL 3.0 Vulnerability (CVE-2014-3566)

Red Hat Product Security has been made aware of a vulnerability in the SSL 3.0 protocol, which has been assigned CVE-2014-3566. All implementations of SSL 3.0 are affected. This vulnerability allows a man-in-the-middle attacker to decrypt ciphertext using a padding oracle side-channel attack.

To mitigate this vulnerability, it is recommended that you explicitly disable SSL 3.0 in favor of TLS 1.1 or later in all affected packages.

A brief history

Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols designed to provide communication security over networks. The SSL protocol was originally developed by Netscape. Version 1.0 was never publicly released; version 2.0 was released in February 1995 but contained a number of security flaws which ultimately led to the design of SSL 3.0. Over the years, several flaws were found in the design of SSL 3.0 as well. This ultimately led to the development and widespread use of the TLS protocol.

Most TLS implementations remain backward compatible with SSL 3.0 to incorporate legacy systems and provide a smoother user experience. Many SSL clients implement a protocol downgrade “dance” to work around the server side interoperability issues. Once the connection is downgraded to SSL 3.0, RC4 or a block cipher with CBC mode is used; this is where the problem starts!

What is POODLE?

The POODLE vulnerability has two aspects. The first aspect is a weakness in the SSL 3.0 protocol, a padding oracle. An attacker can exploit this vulnerability to recover small amounts of plaintext from an encrypted SSL 3.0 connection, by issuing crafted HTTPS requests created by client-side Javascript code, for example. Multiple HTTPS requests are required for each recovered plaintext byte, and the vulnerability allows attackers to confirm if a particular byte was guessed correctly. This vulnerability is inherent to SSL 3.0 and unavoidable in this protocol version. The fix is to upgrade to newer versions, up to TLS 1.2 if possible.

Normally, a client and a server automatically negotiate the most recent supported protocol version of SSL/TLS. The second aspect of the POODLE vulnerability concerns this negotiation mechanism. For the protocol negotiation mechanism to work, servers must gracefully deal with a more recent protocol version offered by clients. (The connection would just use the older, server-supported version in such a scenario, not benefiting from future protocol enhancements.) However, when newer TLS versions were deployed, it was discovered that some servers just terminated the connection at the TCP layer or responded with a fatal handshake error, preventing a secure connection from being established. Clearly, this server behavior is a violation of the TLS protocol, but there were concerns that this behavior would make it impossible to deploy upgraded clients and widespread interoperability failures were feared. Consequently, browsers first try a recent TLS version, and if that fails, they attempt again with older protocol versions, until they end up at SSL 3.0, which suffers from the padding-related vulnerability described above. This behavior is sometimes called the compatibility dance. It is not part of TLS implementations such as OpenSSL, NSS, or GNUTLS; it is implemented by application code in client applications such as Firefox and Thunderbird.

Both aspects of POODLE require a man in the middle attack at the network layer. The first aspect of this flaw, the SSL 3.0 vulnerability, requires that an attacker can observe the network traffic between a client and a server and somehow trigger crafted network traffic from the client. This does not strictly require active manipulation of the network transmission, passive eavesdropping is sufficient. However, the second aspect, the forced protocol downgrade, requires active manipulation of network traffic.  As described in the POODLE paper, both aspects require the attacker to be able to observe and manipulate network traffic while it is in transit.


How are modern browsers affected by the POODLE security flaw?

Browsers are particularly vulnerable because session cookies are short and an ideal target for plain text recovery, and the way HTTPS works allows an attacker to generate many guesses quickly (either through Javascript or by downloading images). Browsers are also most likely to implement the compatibility fallback.
By default, Firefox supports SSL 3.0, and performs the compatibility fallback as described above. SSL 3.0 support can be switched off, but the compatibility fallback cannot be configured separately.

Is this issue fixed?

The first aspect of POODLE, the SSL 3.0 protocol vulnerability, has already been fixed through iterative protocol improvements, leading to the current TLS version, 1.2. It is simply not possible to address this in the context of the SSL 3.0 protocol; a protocol upgrade to one of the successors is needed. Note that TLS versions before 1.1 had similar padding-related protocol issues, which is why we recommend switching to TLS 1.1, if possible. These issues, while present, are currently not known to be exploitable in a way such as POODLE. The first aspect of the POODLE vulnerability is not exploitable with TLS 1.0. (SSL and TLS are still quite similar as protocols; the name change has non-technical reasons.)

The second aspect, caused by browsers which implement the compatibility fallback in an insecure way, has yet to be addressed. Strictly speaking, this is a security vulnerability in browsers due to the way they misuse the TLS protocol. One way to fix this issue would be to remove the compatibility dance, focusing instead on making servers compatible with clients implementing the most recent TLS implementation (as explained, the protocol supports a version negotiation mechanism, but some servers refuse to implement it).

However, there is an industry-wide effort under way to enable browsers to downgrade in a secure fashion, using a new Signaling Cipher Suite Value (SCSV). This will require updates in browsers (such as Firefox) and TLS libraries (such as OpenSSL, NSS and GNUTLS). However, we do not envision changes in TLS client applications which currently do not implement the fallback logic, and neither in TLS server applications as long as they use one of the system TLS libraries. TLS-aware packet filters, firewalls, load balancers, and other infrastructure may need upgrades as well.

Is there going to be another SSL 3.0 issue in the near future? Is there a long term solution?

Disabling SSL 3.0 will obviously prevent exposure to future SSL 3.0-specific issues. The new SCSV-based downgrade mechanism should reliably prevent the use of SSL 3.0 if both parties support a newer protocol version. Once these software updates are widely deployed, the need to disable SSL 3.0 to address this and future vulnerabilities will hopefully be greatly reduced.

SSL 3.0 is typically used in conjunction with the RC4 stream cipher. (The only other secure option in a strict, SSL 3.0-only implementation is Triple DES, which is quite slow even on modern CPUs.) RC4 is already considered very weak, and SSL 3.0 does not even apply some of the recommended countermeasures which prolonged the lifetime of RC4 in other contexts. This is another reason to deploy support for more recent TLS versions.

I have patched my SSL implementation against BEAST and LUCKY-13, am I still vulnerable?

This depends on the type of mitigation you have implemented. If you disabled protocol versions earlier than TLS 1.1 (which includes SSL 3.0), then the POODLE issue does not affect your installation. If you forced clients to use RC4, the first aspect of POODLE does not apply, but you and your users are vulnerable to all of the weaknesses in RC4. If you implemented the n/n-1 split through a software update, or if you deployed TLS 1.1 support without enforcing it, but made no other configuration changes, you are still vulnerable to the POODLE issue.

Is it possible to monitor for exploit attempts?

The protocol downgrade is visible on the server side. Usually, servers can log TLS protocol versions. This information can then be compared with user agents or other information from the profile of a logged-in user, and mismatches could indicate attack attempts.

Attempts to abuse the SSL 3.0 padding oracle part of POODLE, as described in the paper, are visible to the server as well. They result in a fair number of HTTPS requests which follow a pattern not expected during the normal course of execution of a web application. However, it cannot be ruled out that a more sophisticated adaptive chosen plain text attack avoids confirmation of guesses from the server, and this more advanced attack would not be visible to the server, only to the client and the network up to the point at which the attacker performs their traffic manipulation.

What happens when I disable SSL 3.0 on my web server?

Some old browsers may not be able to support a secure connection to your site. Estimates of the number of such browsers in active use vary and depend on the target audience of a web site. SSL protocol version logging (see above) can be used to estimate the impact of disabling SSL 3.0 because it will be used only if no TLS version is available in the client.
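With mod_ssl, for example, the negotiated protocol and cipher can be written to the access log using standard format variables; this mirrors the LogFormat shipped in the default ssl.conf:

CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"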

Major browser vendors, including Mozilla and Google, have announced that they will deactivate SSL 3.0 in their upcoming versions.

How do I secure my Red Hat-supported software?

Red Hat has put together several articles regarding the removal of SSL 3.0 from its products.  Customers should review the recommendations and test changes before making them live in production systems.  As always, Red Hat Support is available to answer any questions you may have.

October 14, 2014

Who can sign for what?

In my last post, I discussed how to extract the signing information out of a token. But just because the signature on a document is valid does not mean that the user who signed it was authorized to do so. How can we get from a signature to validating a token? Can we use that same mechanism to sign other OpenStack messages?

The following is a proposed extension to Keystone client based on existing mechanisms.

Overview

  1. Extract signer data out of the certificates
  2. Fetch the complete list of certificates from Keystone using the OS-SIMPLE-CERT extension
  3. Match the signer to the cert to validate the signature and extract the domain data for the token
  4. Fetch the mapping info from the Federation extension
  5. Use the mapping info to convert from the signing cert to a keystone user and groups
  6. Fetch the effective roles from Keystone for the user/groups for that domain
  7. Fetch policy from Keystone
  8. Execute the policy check to validate that the signer could sign for the data.

We need a method to go from the certificate used to sign the document to a valid Keystone user. Implied in there is that everything signed in an OpenStack system is going to be signed by a Keystone user. This is an expansion on how things were done in the past, but there is a pretty solid basis for this approach: in Kerberos, everything is a Principal, whether user or system.

From Tokens to Certs

The Token has the CMS Signer Info.  We can extract that information as previously shown.

The OS-SIMPLE-CERT extension has an API for fetching all of the signing certs at once:
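GET /v3/OS-SIMPLE-CERT/certificates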

This might not scale well, but it is sufficient for supporting a proof-of-concept.  It reduces the problem of "how to find the cert for this token" down to a match between the signing info and the attributes of the certificates.

To extract the data from the certificates, we can Popen the OpenSSL command to validate a certificate.  This is proper security practice anyway: while we trust the authoritative Keystone, we should verify whenever possible.  It will be expensive, but the result can be cached and reused, so it should not have to happen very often.
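A minimal sketch of that call, assuming the CA certificate has already been fetched from Keystone and written to disk; the file names and helper function are illustrative:

import subprocess

def verify_signing_cert(ca_file, cert_file):
    # Shell out to the openssl binary to check the signing certificate
    # against the CA, rather than trusting the fetched cert blindly.
    proc = subprocess.Popen(
        ['openssl', 'verify', '-CAfile', ca_file, cert_file],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, _ = proc.communicate()
    # Older openssl binaries exit 0 even on failure, so check the output too.
    return proc.returncode == 0 and out.strip().endswith('OK')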

From Certs to Users

To translate from a certificate to a user, we need to first parse the data out of the certificate. This is possible doing a call to OpenSSL. We can be especially efficient by using that call to validate the certificate itself, and then converting the response to a dictionary. Keystone already has a tool to convert a dictionary to the Identity objects (user and groups): the mapping mechanism in the Federation backend. Since a mapping is in a file we can fetch, we do not need to be inside the Keystone server to process the mapping, we just need to use the same mechanism.

Mappings

The OS-FEDERATION extension has an API to List all mappings.

And another to get each mapping.
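For reference, those calls are:

GET /v3/OS-FEDERATION/mappings
GET /v3/OS-FEDERATION/mappings/{mapping_id}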

Again, this will be expensive, but it can be cached now, and optimized in the future.

The same process that uses the mappings to translate the env-vars for an X509 certificate  to a user inside the Keystone server can be performed externally. This means extracting code from the Federation plugin of the Keystone server to python-keystoneclient.
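For illustration, a mapping could turn the certificate subject's common name into a user and put the signer into a group. This fragment follows the OS-FEDERATION mapping rule format; SSL_CLIENT_S_DN_CN is the sort of env-var mod_ssl produces for a client certificate, and the group ID is invented:

{
    "rules": [
        {
            "local": [
                {"user": {"name": "{0}"}},
                {"group": {"id": "signer_group_id"}}
            ],
            "remote": [
                {"type": "SSL_CLIENT_S_DN_CN"}
            ]
        }
    ]
}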

From User to Roles

Once we have the users and groups, we need to get the role data appropriate to the token. This means validating the token, and extracting out the domain for the project. Then we will use the Identity API to list effective role assignments.

We’ll probably have to call this once for the user ID and then once for each of the groups from the mapping in order to get the full set of roles.
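For reference, the queries look like this (the IDs are placeholders), one call for the user plus one per mapped group:

GET /v3/role_assignments?user.id={user_id}&scope.domain.id={domain_id}&effective
GET /v3/role_assignments?group.id={group_id}&scope.domain.id={domain_id}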

From Roles to Permission

Now, how do we determine if the user was capable of signing for the specified token? We need a policy file. Which one? The one abstraction we currently have is that a policy file can be associated with an endpoint. Since Keystone is responsible for controlling the signing of tokens, the logical endpoint is the authoritative Keystone server where we are getting the certificates and the rest of the data.

We get the effective policy associated with the keystone endpoint using the policy API.
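The OS-ENDPOINT-POLICY extension exposes that association; the effective policy for an endpoint can be fetched with (the endpoint ID is a placeholder):

GET /v3/endpoints/{endpoint_id}/OS-ENDPOINT-POLICY/policy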

And now we can run the user's RBAC through the policy engine to see if they can sign for the given token.  The policy engine is part of the Oslo common code.  There is some "flattening" code from the Keystone server we will want to pull over, but both of these will again land in python-keystoneclient.
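As a purely hypothetical example, the fetched policy file might contain a rule like the one below; the rule name is invented for illustration, not an actual Keystone policy target. The flattened role list for the signer would then be run through the policy engine against this rule.

{
    "identity:sign_token": "role:admin or role:service"
}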

Implementation

This is a lot of communication with Keystone, but it should not have to be done very often: once each of these API calls has been made, the response can be cached for a reasonable amount of time. For example, a caching rule could say that all data is current for a minimum of 5 minutes. After that time, if a newly submitted token has an unknown signer info, the client could refetch the certificates. The notification mechanism from Keystone could also be extended to invalidate the cache of remote clients that register for such notifications.

For validating tokens in remote endpoints, the process will be split between python-keystoneclient and keystonemiddleware.  The Middleware piece will be responsible for cache management and maintaining the state of the validation between calls.  Keystone Client will expose each of the steps with a parameter that allows the cached state to be passed in, as well as already exposing the remote API wrapping functions.

At this stage, it looks like no changes will have to be made to the Keystone server itself.  This shows the power of the existing abstractions.  In the future, some of the calls may need optimization.  For example, the fetch for certificates may need to be broken down into a call that fetches an individual certificate by its signing info.

October 10, 2014

Who Signed that Token?

The specification for multiple signers requires a mechanism to determine who signed the token, and then to determine if the signer had the authority to issue a token for the scope of the token.  These are the steps necessary to perform that validation.

The CMS document is signed by a certificate, but, due to size constraints, the certificate has been stripped out of the token.  All that remains in the token is the ‘signer info’ section of the CMS document, as defined here http://tools.ietf.org/html/rfc5652#page-13 as

SignerIdentifier ::= CHOICE {
issuerAndSerialNumber IssuerAndSerialNumber,
subjectKeyIdentifier [0] SubjectKeyIdentifier }

IssuerAndSerialNumber is defined as:

The IssuerAndSerialNumber type identifies a certificate, and thereby
an entity and a public key, by the distinguished name of the
certificate issuer and an issuer-specific certificate serial number.

The definition of Name is taken from X.501 [X.501-88], and the
definition of CertificateSerialNumber is taken from X.509 [X.509-97].

IssuerAndSerialNumber ::= SEQUENCE {
issuer Name,
serialNumber CertificateSerialNumber }

CertificateSerialNumber ::= INTEGER

http://tools.ietf.org/html/rfc5652#section-10.2.4

SubjectKeyIdentifier is a more general approach which allows the user to identify if a specific certificate matches (it has the Subject Key Identifier in it as well) but does not provide a way to identify who signed it, or where to fetch the corresponding certificate.

With the Keystone server acting as the system of reference, we know where to fetch the certificate. All signing certificates can be fetched using the OS-SIMPLE-CERT extension. So, regardless of which form the signing info takes, we could determine which certificate to use in order to verify the token.

To date, only a single certificate is used to sign tokens. Identifying the signer of the token is the first step to expanding that in the future.

What does the current OpenSSL binary call produce?  I took the DER form of one of the sample tokens and passed it through /usr/lib64/nss/unsupported-tools/derdump (one of the most useful of tools) and saw:

C-Sequence  (816)
   Object Identifier  (9)
      1 2 840 113549 1 7 2 (PKCS #7 Signed Data)
   C-[0]  (801)
      C-Sequence  (797)
         Integer  (1)
            01 
         C-Set  (13)
            C-Sequence  (11)
               Object Identifier  (9)
                  2 16 840 1 101 3 4 2 1 (SHA-256)
         C-Sequence  (309)
            Object Identifier  (9)
               1 2 840 113549 1 7 1 (PKCS #7 Data)
            C-[0]  (294)
               Octet String  (290)
                  7b 22 61 63 63 65 73 73 22 3a 20 7b 22 74 6f 6b 65 6e 
                  22 3a 20 7b 22 65 78 70 69 72 65 73 22 3a 20 22 32 31 
                  31 32 2d 30 38 2d 31 37 54 31 35 3a 33 35 3a 33 34 5a 
                  22 2c 20 22 69 64 22 3a 20 22 30 31 65 30 33 32 63 39 
                  39 36 65 66 34 34 30 36 62 31 34 34 33 33 35 39 31 35 
                  61 34 31 65 37 39 22 7d 2c 20 22 73 65 72 76 69 63 65 
                  43 61 74 61 6c 6f 67 22 3a 20 7b 7d 2c 20 22 75 73 65 
                  72 22 3a 20 7b 22 75 73 65 72 6e 61 6d 65 22 3a 20 22 
                  75 73 65 72 5f 6e 61 6d 65 31 22 2c 20 22 72 6f 6c 65 
                  73 5f 6c 69 6e 6b 73 22 3a 20 5b 5d 2c 20 22 69 64 22 
                  3a 20 22 63 39 63 38 39 65 33 62 65 33 65 65 34 35 33 
                  66 62 66 30 30 63 37 39 36 36 66 36 64 33 66 62 64 22 
                  2c 20 22 72 6f 6c 65 73 22 3a 20 5b 7b 22 6e 61 6d 65 
                  22 3a 20 22 72 6f 6c 65 31 22 7d 2c 20 7b 22 6e 61 6d 
                  65 22 3a 20 22 72 6f 6c 65 32 22 7d 5d 2c 20 22 6e 61 
                  6d 65 22 3a 20 22 75 73 65 72 5f 6e 61 6d 65 31 22 7d 
                  7d 7d 
         C-Set  (462)
            C-Sequence  (458)
               Integer  (1)
                  01 
               C-Sequence  (164)
                  C-Sequence  (158)
                     C-Set  (10)
                        C-Sequence  (8)
                           Object Identifier  (3)
                              2 5 4 5 (X520 Serial Number)
                           Printable String  (1)
                              "5"
                     C-Set  (11)
                        C-Sequence  (9)
                           Object Identifier  (3)
                              2 5 4 6 (X520 Country Name)
                           Printable String  (2)
                              "US"
                     C-Set  (11)
                        C-Sequence  (9)
                           Object Identifier  (3)
                              2 5 4 8 (X520 State Or Province Name)
                           Printable String  (2)
                              "CA"
                     C-Set  (18)
                        C-Sequence  (16)
                           Object Identifier  (3)
                              2 5 4 7 (X520 Locality Name)
                           Printable String  (9)
                              "Sunnyvale"
                     C-Set  (18)
                        C-Sequence  (16)
                           Object Identifier  (3)
                              2 5 4 10 (X520 Organization Name)
                           Printable String  (9)
                              "OpenStack"
                     C-Set  (17)
                        C-Sequence  (15)
                           Object Identifier  (3)
                              2 5 4 11 (X520 Organizational Unit Name)
                           Printable String  (8)
                              "Keystone"
                     C-Set  (37)
                        C-Sequence  (35)
                           Object Identifier  (9)
                              1 2 840 113549 1 9 1 (PKCS #9 Email Address)
                           IA5 String  (22)
                              "keystone@openstack.org"
                     C-Set  (20)
                        C-Sequence  (18)
                           Object Identifier  (3)
                              2 5 4 3 (X520 Common Name)
                           Printable String  (11)
                              "Self Signed"
                  Integer  (1)
                     11 
               C-Sequence  (11)
                  Object Identifier  (9)
                     2 16 840 1 101 3 4 2 1 (SHA-256)
               C-Sequence  (13)
                  Object Identifier  (9)
                     1 2 840 113549 1 1 1 (PKCS #1 RSA Encryption)
                  NULL  (0)
               Octet String  (256)
                  6e 93 08 58 52 dd 52 db 65 b9 aa 9b f5 87 37 bc 56 f2 
                  b5 25 05 a5 9b 37 68 cc 9e 2e f3 80 49 e2 58 d8 70 01 
                  35 0a 7b 66 c7 15 2c 65 6b b3 15 31 e6 8b 8e 27 eb 12 
                  d5 70 cd 71 b1 ae 68 fe b6 cf a6 b5 d7 a3 a6 84 d9 0d 
                  52 d9 e6 cd 38 fa b9 7e c5 09 63 76 99 14 3a f6 5b 71 
                  9c b7 90 9b 36 64 b5 f3 77 e6 5e ca e1 06 d1 bb a9 fc 
                  39 4c a5 e4 b2 0b 86 ae 46 d7 40 67 9d 82 38 3c 4e 69 
                  ee 00 d0 0d a8 d4 38 f9 8d a3 96 36 4d ed 18 6a 2f c4 
                  09 9f 13 9e 71 b4 31 5a f9 24 57 80 52 a2 dc 69 6e e4 
                  76 96 1b ef ae 2a cb 2d f7 fe 6d d9 6e db 3d e2 03 d1 
                  00 00 8d 8e 2c 13 49 bf 0a 10 09 74 c0 d9 25 2f 7d 1e 
                  8e f2 f0 ff 79 a4 ce 45 a0 4d a5 8d 4c c5 18 44 66 8a 
                  90 a5 55 c8 6d a9 53 1c a6 d0 47 c1 26 40 45 9f 05 91 
                  41 00 ad e0 03 6a 15 8b fc d4 7c c4 d1 26 34 0d a2 9b 
                  a7 e6 95 c7 

The field X520 Serial Number shows it is serial number 5 from the CA. The CA is then identified by its X500 attributes, such as

X520 Common Name

Which is "Self Signed". The following Python code will dump the signing info for the PEM version of a token. It's basically a stripped-down version of this code.

#!/usr/bin/python
import argparse

from pyasn1_modules import rfc2315, pem
from pyasn1.codec.der import decoder as der_decoder

parser = argparse.ArgumentParser()
parser.add_argument('infile', metavar='infile', help='token file to decode')
args = parser.parse_args()

in_file = open(args.infile, 'r')

idx, substrate = pem.readPemBlocksFromFile(
    in_file, ('-----BEGIN CMS-----', '-----END CMS-----'))


contentInfo, rest = der_decoder.decode(substrate, 
                                       asn1Spec=rfc2315.ContentInfo())

contentType = contentInfo.getComponentByName('contentType')
contentInfoMap = {
    (1, 2, 840, 113549, 1, 7, 1): rfc2315.Data(),
    (1, 2, 840, 113549, 1, 7, 2): rfc2315.SignedData(),
    (1, 2, 840, 113549, 1, 7, 3): rfc2315.EnvelopedData(),
    (1, 2, 840, 113549, 1, 7, 4): rfc2315.SignedAndEnvelopedData(),
    (1, 2, 840, 113549, 1, 7, 5): rfc2315.DigestedData(),
    (1, 2, 840, 113549, 1, 7, 6): rfc2315.EncryptedData()
}

content, _ = der_decoder.decode(
    contentInfo.getComponentByName('content'),
    asn1Spec=contentInfoMap[contentType]
)
signerInfos = content.getComponentByName('signerInfos')[0]
issuer_and_sn = signerInfos.getComponentByName('issuerAndSerialNumber')
issuer =  issuer_and_sn.getComponentByName('issuer')
for n in issuer[0]:
    print n[0][0]
    print n[0][1]

It does something wrong with the text fields, I suspect because they are some form of not-quite-text that needs to be parsed. I don't think I need to take this any farther, however. Instead, I plan on following an example from the OCSP generating sample code that hashes the issuer and compares the hash.
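A minimal sketch of that approach, reusing the issuer object from the script above; this is an assumption about how the comparison would work, not tested code:

import hashlib

from pyasn1.codec.der import encoder as der_encoder

# Hash the DER encoding of the issuer Name exactly as it appears on the
# wire, sidestepping the string-type problem; the same hash, computed from
# each candidate certificate's issuer field, identifies the matching cert.
issuer_der = der_encoder.encode(issuer)
issuer_hash = hashlib.sha256(issuer_der).hexdigest()
print(issuer_hash)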

UPDATE: Thanks to gsilvis, who pointed out a spurious import in the code sample.

October 09, 2014

Automated configuration analysis for Mozilla’s TLS guidelines

My friend Hubert has been doing a lot of work to make the world a little safer. I’m glad he’s getting some recognition. Here’s a great article on testing your server for proper SSL/TLS configurations.


Ansible Hostgroups from FreeIPA

Ansible provides management for a large array of servers using ssh as the access mechanism. This is a good match for  FreeIPA.  However, by default Ansible uses a flat file to store groups of hosts.  How can we get that info from FreeIPA?


If you want to run the `uptime` command on all web servers, you would define a fragment of /etc/ansible/hosts  like this:

[webservers]

alpha.example.org
beta.example.org
192.168.1.100
192.168.1.110
web1.example.com

And then run

ansible webservers -a uptime

To get Ansible to use a different scheme, use a dynamic inventory. I wrote a proof-of-concept inventory script that uses the hostgroup definitions from my IPA server to produce the JSON inventory data. The format is specified in this tutorial.

My sample ignores the command-line parameters and just returns the whole set of hostgroups.

#!/usr/bin/python
#Apache License...

import json

from ipalib import api

# Connect to the FreeIPA server using the command-line context.
api.bootstrap(context='cli')
api.finalize()
api.Backend.xmlclient.connect()

inventory = {}
hostvars = {}

# Each IPA hostgroup becomes an Ansible group containing its member hosts.
result = api.Command.hostgroup_find()['result']
for hostgroup in result:
    inventory[hostgroup['cn'][0]] = {
        'hosts': [host for host in hostgroup['member_host']]}
    for host in hostgroup['member_host']:
        hostvars[host] = {}

# Ansible reads per-host variables from the _meta section; ours are empty.
inventory['_meta'] = {'hostvars': hostvars}
print json.dumps(inventory)

I copied it to /etc/ansible/freeipa.py and ran:


$ ansible -i /etc/ansible/freeipa.py packstacked -a uptime
ayoungf20packstack.cloudlab.freeipa.org | success | rc=0 >>
20:42:33 up 141 days, 20:43, 2 users, load average: 0.22, 0.15, 0.14

multidom.cloudlab.freeipa.org | success | rc=0 >>
20:42:34 up 52 days, 3:17, 1 user, load average: 0.01, 0.03, 0.05

horizon.cloudlab.freeipa.org | success | rc=0 >>
20:42:35 up 51 days, 6:07, 2 users, load average: 0.00, 0.03, 0.05

As I said, this was a proof of concept. It does not do everything that you might want an inventory to do. I plan on fleshing it out and submitting it to the Ansible plugin repo. Meanwhile, you can look at the other examples.
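For instance, Ansible invokes a dynamic inventory script with --list for the full inventory and with --host <hostname> for a single host’s variables. Here is a minimal sketch of the plumbing a finished plugin would need; this is my illustration, not part of the proof of concept:

# Sketch: answer Ansible's --list and --host queries.
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument('--list', action='store_true')
parser.add_argument('--host')
args = parser.parse_args()

if args.host:
    print json.dumps({})         # no per-host variables yet
else:
    print json.dumps(inventory)  # the dict built above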

If you are curious, here is the output from when I run my plugin:

$ python freeipa.py | python -mjson.tool
{
    "_meta": {
        "hostvars": {
            "ayoungf20packstack.cloudlab.freeipa.org": {},
            "horizon.cloudlab.freeipa.org": {},
            "ipa.cloudlab.freeipa.org": {},
            "jboss.cloudlab.freeipa.org": {},
            "multidom.cloudlab.freeipa.org": {}
        }
    },
    "keystone-ha-cluster": {
        "hosts": [
            "horizon.cloudlab.freeipa.org",
            "ipa.cloudlab.freeipa.org",
            "jboss.cloudlab.freeipa.org"
        ]
    },
    "packstacked": {
        "hosts": [
            "ayoungf20packstack.cloudlab.freeipa.org",
            "horizon.cloudlab.freeipa.org",
            "multidom.cloudlab.freeipa.org"
        ]
    }
}

October 08, 2014

The Source of Vulnerabilities: How Red Hat finds out about vulnerabilities

Red Hat Product Security tracks lots of data about every vulnerability affecting every Red Hat product. We make all this data available on our Measurement page and from time to time write various blog posts and reports about interesting metrics or trends.

One metric we’ve not written about since 2009 is the source of the vulnerabilities we fix. We want to answer the question: how did Red Hat Product Security first hear about each vulnerability?

Every vulnerability that affects a Red Hat product is given a master tracking bug in Red Hat bugzilla. This bug contains a whiteboard field with a comma-separated list of metadata, including the dates we found out about the issue and the source. You can get a file containing all this information, already gathered, for every CVE. A few months ago we updated our ‘daysofrisk’ command line tool to parse the source information, allowing anyone to quickly create reports like this one.
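As a rough illustration, parsing that whiteboard metadata takes only a couple of lines of Python; the field names and values below are made up for illustration, not a definitive schema:

# Sketch: turn a comma-separated whiteboard field into a dict.
whiteboard = 'impact=important,public=20140605,reported=20140512,source=cert'
meta = dict(item.split('=', 1) for item in whiteboard.split(','))
print meta['source'], meta['reported']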

So let’s take a look at some example views of recent data: every vulnerability fixed in every Red Hat product in the 12 months up to 30th August 2014 (a total of 1012 vulnerabilities).

Firstly, a chart giving the breakdown of how we first found out about each issue (chart: Sources of issues):

  • CERT: Issues reported to us from a national cert like CERT/CC or CPNI, generally in advance of public disclosure
  • Individual: Issues reported to Red Hat Product Security directly by a customer or researcher, generally in advance of public disclosure
  • Red Hat: Issues found by Red Hat employees
  • Relationship: Issues reported to us by upstream projects, generally in advance of public disclosure
  • Peer vendors: Issues reported to us by other OS distributions, through relationships
    or a shared private forum
  • Internet: For issues not disclosed in advance we monitor a number of mailing lists and security web pages of upstream projects
  • CVE: If we’ve not found out about an issue any other way, we can catch it from the list of public assigned CVE names from Mitre

Next, a breakdown of whether we knew about the issue in advance. For the purposes of our reports we count finding out the same day an issue is disclosed as not knowing in advance, even though we might have had a few hours’ notice (chart: Known in advance). There are a few interesting observations from this data:

  • Red Hat employees find a lot of vulnerabilities. We don’t just sit back and wait for others to find flaws for us to fix; we actively look for issues ourselves, and these are found by engineering and quality assurance as well as our security teams. 17% of all the issues we fixed in the year were found by Red Hat employees. The issues we find are shared back in advance where possible to upstream and other peer vendors (generally via the ‘distros’ shared private forum).
  • Relationships matter. When you are fixing vulnerabilities in third party software, having a relationship with the upstream makes a big difference. But
    it’s really important to note here that this should never be a one-way street: if an upstream is willing to give Red Hat information about flaws in advance,
    then we need to be willing to add value to that notification by sanity checking the draft advisory, checking the patches, and feeding back the
    results from our quality testing. A recent good example of this is the OpenSSL CCS Injection flaw; our relationship with OpenSSL gave us advance
    notice of the issue, and we found a mistake in the advisory as well as a mistake in the patch which otherwise would have forced OpenSSL to
    issue a secondary fix after release. Only two of the dozens of companies prenotified about those OpenSSL issues actually added value back to OpenSSL.
  • Red Hat can influence the way this metric looks; without a dedicated security team a vendor could just watch what another vendor does and copy them,
    or rely on public feeds such as the list of assigned CVE names from Mitre. We can make the choice to invest to find more issues and build upstream relationships.

October 07, 2014

Yellow Sticky of Doom in the Cloud

The password managers we discussed in the last post are a good start. If you only use one system, a local password database is all you need.

Most people have multiple “devices” – a PC, a laptop, a smartphone, a tablet, and the number keeps growing. It would be terribly convenient to have access to your passwords on all of your devices, and to have everything automatically updated when you add or change a password.

This is where network – or today CLOUD BASED (highlighted for dramatic emphasis…) – password managers come into play. These networked password managers share, distribute, backup, and replicate your passwords.

Putting your passwords IN THE CLOUD should make you nervous. It is important to do your homework before choosing one – don’t just choose the first one that comes up on a search!

There are several places to look. Wikipedia has a List of Password Managers. Information Week has an article on 10 Top Password Managers. Network World published Best tools for protecting passwords. Mac World produced Mac password managers. At a minimum make sure that the password managers you are considering have at least some public review and feedback. You should also do web searches looking for user experience and any issues with the various password managers.

For cloud-based password managers, one of the most important things is to make sure that you retain control of the passwords. This is done by encrypting the password data locally, on your system, and only sending encrypted data to the cloud. Done properly, the master encryption password for the password database never leaves your system – no one, including the company hosting your password manager, can decrypt your passwords. Of course this also means that if you lose your password manager password you are out of luck; no one can recover it.
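The idea looks roughly like this; a toy sketch using the Python cryptography package, not how any particular product actually implements it:

# Toy sketch of client-side encryption: derive a key from the master
# password locally; only the ciphertext ever leaves your machine.
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

salt = os.urandom(16)  # stored alongside the ciphertext
kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
                 iterations=100000)
key = base64.urlsafe_b64encode(kdf.derive(b'master password'))

# This opaque blob is all the cloud service ever sees.
ciphertext = Fernet(key).encrypt(b'password for example.com')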

As an anecdote, not a recommendation, a thoroughly paranoid colleague who works in the security space and whose opinions I respect recommends LastPass.  I prefer open source password managers that can be audited, like KeePassX, but there don’t seem to be any with good Cloud integration.


October 01, 2014

Wherein our hero attempts to build his own OpenStack Keystone RPMs

I have a Devstack setup. I’ve hacked the Keystone repo to add some cool feature.  I want to test it out with an RDO deployment.  How do I make my own RPM for the RDO system?

This is not a how to. This is more like a police log.

sudo mkdir fedora
sudo chown ayoung:ayoung fedora
cd fedora/
git clone http://pkgs.fedoraproject.org/cgit/openstack-keystone.git

What did that get us?

$ ls openstack-keystone
0001-remove-runtime-dep-on-python-pbr.patch
0002-sync-parameter-values-with-keystone-dist.conf.patch
daemon_notify.sh
keystone-dist.conf
openstack-keystone.init
openstack-keystone.logrotate
openstack-keystone-sample-data
openstack-keystone.service
openstack-keystone.spec
openstack-keystone.sysctl
openstack-keystone.upstart
sources

When I build, I do so in /home/ayoung/rpmbuild. How:

$ cat ~/.rpmmacros
%_topdir /home/ayoung/rpmbuild
%packager Adam Young
%dist .f20_ayoung

So, first move these files into %_topdir/SOURCES

 cp * ~/rpmbuild/SOURCES/

Technically you don’t need to move the .spec file there, but it hurts nothing to do so.

OK, let’s make a first attempt at building:

$ rpmbuild -bp openstack-keystone.spec 
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz: No such file or directory

So it is looking for a tarball named keystone-2014.2.b3.tar.gz. Where does it get that name?

In openstack-keystone.spec:

%global release_name juno
%global milestone 3

%global with_doc %{!?_without_doc:1}%{?_without_doc:0}

Name:           openstack-keystone
Version:        2014.2
Release:        0.4.b%{milestone}%{?dist}
Summary:        OpenStack Identity Service
...
Source0:        http://launchpad.net/keystone/%{release_name}/%{release_name}-%{milestone}/+download/keystone-%{version}.b%{milestone}.tar.gz

Now, let’s assume that I already have that version installed with RDO, and I want to test my change on top of it. If I make the milestone sort higher than 3, rpm will treat my build as an update. I could do this by changing the spec file.

Changing
-%global milestone 3
+%global milestone 3a
would give me:
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3a.tar.gz: No such file or directory

But can I do that from the command line? Sure. Use -D, and put the macro in quotes:

$ rpmbuild -D "milestone 3b"  -bp openstack-keystone.spec 
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz: No such file or directory

Ok, so now I need a tarball that looks like the file name I will use in the rpmbuild process. Use git-archive to produce it.

cd /opt/stack/keystone
git archive -o /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz  HEAD
cd /opt/fedora/openstack-keystone/
 rpmbuild -bp openstack-keystone.spec
error: Failed build dependencies:
	python-pbr is needed by openstack-keystone-2014.2-0.4.b3.f20_ayoung.noarch

Alright! Closer. There is a Yum utility to help out with this last problem. To get the utilities:

sudo yum install yum-utils

then to get the build dependencies:

sudo yum-builddep openstack-keystone.spec

The next run of rpmbuild gets this error:

+ cd keystone-2014.2.b3
/var/tmp/rpm-tmp.NSt5Oy: line 37: cd: keystone-2014.2.b3: No such file or directory

This is because the archive command above is not putting the code in a subdirectory. There is a flag for that: --prefix

cd /opt/stack/keystone
git archive  --prefix=keystone-2014.2.b3/   -o /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz  HEAD
cd /opt/fedora/openstack-keystone
rpmbuild -bp openstack-keystone.spec
+ echo 'Patch #1 (0001-remove-runtime-dep-on-python-pbr.patch):'
Patch #1 (0001-remove-runtime-dep-on-python-pbr.patch):
+ /usr/bin/cat /home/ayoung/rpmbuild/SOURCES/0001-remove-runtime-dep-on-python-pbr.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file bin/keystone-all

Now I get an error on a patch application: this is better, as it implies I am packing the tarball correctly. That patch basically removes pbr from the keystone-all startup file. PBR does versioning-type stuff that competes with RPM.

Let’s ignore that patch for now: Keystone will run just fine with PBR in. In the spec file I comment out:

%patch0001 -p1

Code later in the spec file does something comparable, and fails because I skipped the first patch.

+ sed -i s/REDHATKEYSTONEVERSION/2014.2/ bin/keystone-all keystone/cli.py
+ sed -i s/2014.2.b3/2014.2/ PKG-INFO

Skip that code for now, too.

--- a/openstack-keystone.spec
+++ b/openstack-keystone.spec
@@ -117,7 +117,7 @@ This package contains documentation for Keystone.
 %prep
 %setup -q -n keystone-%{version}.b%{milestone}
 
-%patch0001 -p1
+#%patch0001 -p1
 %patch0002 -p1
 
 find . \( -name .gitignore -o -name .placeholder \) -delete
@@ -127,9 +127,9 @@ rm -rf keystone.egg-info
 # Let RPM handle the dependencies
 rm -f test-requirements.txt requirements.txt
 # Remove dependency on pbr and set version as per rpm
-sed -i s/REDHATKEYSTONEVERSION/%{version}/ bin/keystone-all keystone/cli.py
+#sed -i s/REDHATKEYSTONEVERSION/%{version}/ bin/keystone-all keystone/cli.py
 
-sed -i 's/%{version}.b%{milestone}/%{version}/' PKG-INFO
+#sed -i 's/%{version}.b%{milestone}/%{version}/' PKG-INFO

And the build works. Thus far, I’ve been running -bp, which just preps the source tree in the repo. Let’s see what the full rpmbuild process returns:

+ cp etc/keystone.conf.sample etc/keystone.conf
+ /usr/bin/python setup.py build
error in setup command: Error parsing /home/ayoung/rpmbuild/BUILD/keystone-2014.2.b3/setup.cfg:Exception: Versioning for this project requires either an sdist tarball, or access to an upstream git repository. Are you sure that git is installed?

There is that Python Build Reasonableness (pbr) change. OK, what is that doing? Let’s start with the last line we commented out in the spec file:

+#sed -i 's/%{version}.b%{milestone}/%{version}/' PKG-INFO

This is putting the RPM version of the python library into the Python EGG info. In my repo I have

keystone.egg-info/PKG-INFO

with

Version: 2014.2.dev154.g1af2428

I wonder if I can make the RPM version look like that? It won’t be an rpm -U to go from my RDO deployment if I do, but I can work around that with --oldpackage.

What is dev154.g1af2428? I suspect some git magic for the last part. Git log shows:

commit 1af24284bdc093dae4f027ade2ddb29656b676f0

So g1af2428 is “g” plus the abbreviated git hash, but what about dev154? It turns out it is the number of commits since the last tag.
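You can reproduce that string straight from git; a quick sketch, assuming the most recent tag is the release version (here 2014.2):

# Sketch: rebuild pbr's dev version from git describe, whose --long
# output looks like "2014.2-154-g1af2428" (tag, commit count, g + hash).
import subprocess

desc = subprocess.check_output(['git', 'describe', '--long']).strip()
tag, commits, ghash = desc.rsplit('-', 2)
print '%s.dev%s.%s' % (tag, commits, ghash)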

At this point, I resorted to IRC, and then to reading the code. The short of it is that I am submitting a patch to the Keystone spec file to remove all of the PBR-related changes. Instead, when python setup.py is called, we’ll use an environment variable to tell PBR the version number. For example:

PBR_VERSION=%{version}.%{milestone}  %{__python} setup.py build

September 30, 2014

LDAP persistent searches with ldapjdk

As part of the LDAP-based profiles feature I’ve been working on for Dogtag, it was necessary to monitor the database for changes to the LDAP profiles. For example, when a profile is updated on a clone, that change is replicated to other clones, and those other clones have to detect that change, and each instance must update its view of the profiles accordingly. This post details how the LDAP persistent search feature was used to implement this behaviour.

A naïve approach to solving this problem would have been to unconditionally refresh all profiles at a certain interval. Slightly better would be to check all profiles at a certain interval and update those that have changed. Both of these methods involve some non-trivial delay between changes being replicated to the local database, and the profile subsystem reflecting those changes.

A different approach was to use the LDAP persistent search capability. With this feature, once the search is running, the client receives immediate notification of changes. This advantage commended it over the polling approach as a more appropriate basis for a solution.

ldapjdk persistent search API

A big part of the motivation for this post was the paucity of the ldapjdk documentation with respect to persistent searches. The necessary information is all there – but it is scattered across several classes, all of which play some important part in a working implementation, but none of which tells the full story.

Hopefully some people will benefit from this information being brought together in one place and explained step by step. Let’s look at the classes involved one by one as we build up the solution.

LDAPPersistSearchControl

This is the server control that activates the persistent search behaviour. It also provides static flags for specifying what kinds of updates to listen for. Its constructor takes a union of these flags and three boolean values:

changesOnly

Whether to return existing entries that match the search criteria. For our use case, we are only interested in changes.

returnControls

Whether to return entry change controls with each search result. These controls are required if you need to know what kind of change occurred (add, modify, delete or modified DN).

isCritical

Whether this control is critical to the search operation.

The LDAPPersistSearchControl object used for our persistent search is constructed in the following way:

int op = LDAPPersistSearchControl.ADD
    | LDAPPersistSearchControl.MODIFY
    | LDAPPersistSearchControl.DELETE
    | LDAPPersistSearchControl.MODDN;
LDAPPersistSearchControl persistCtrl =
    new LDAPPersistSearchControl(op, true, true, true);

LDAPSearchConstraints

The LDAPSearchConstraints object sets various controls and parameters for an LDAP search, persistent or otherwise. In our case, we need to attach the LDAPPersistSearchControl to the constraints, as well as disable the timeout of the search, and set the results batch size to 1 so that no buffering of results will occur at the server:

LDAPSearchConstraints cons = conn.getSearchConstraints();
cons.setServerControls(persistCtrl);
cons.setBatchSize(1);
cons.setServerTimeLimit(0 /* seconds */);

LDAPSearchResults

Executing the search method of an LDAPConnection (here named conn) yields an LDAPSearchResults object. This is the same whether or not the search was persistent according to the LDAPSearchConstraints. The difference between persistent and non-persistent searches is in how results are retrieved from the results object: if the search is persistent, the hasMoreElements method will block until the next result is received from the server (or the search times out, the connection dies, et cetera).

Let’s see what it looks like to actually execute the persistent search and process its results:

LDAPConnection conn = ... /* an open LDAPConnection */

LDAPSearchResults results = conn.search(
    "ou=certificateProfiles,ou=ca," + basedn, /* search DN */
    LDAPConnection.SCOPE_ONE, /* search at one level below DN */
    "(objectclass=*)",        /* search filter */
    null,   /* list of attributes we care about */
    false,  /* whether to only include specified attributes */
    cons    /* LDAPSearchConstraints defined above */
);
while (results.hasMoreElements()) /* blocks */ {
    LDAPEntry entry = results.next();
    /* ... process result ... */
}

We see that apart from the use of the LDAPSearchConstraints to specify a persistent search and the blocking behaviour of LDAPSearchResults.hasMoreElements, performing a persistent search is the same as performing a regular search.

Let us next examine what happens inside that while loop.

LDAPEntryChangeControl

Do you recall the returnControls parameter for LDAPPersistSearchControl? If true, it ensures that each entry returned by the persistent search is accompanied by a control that indicates the type of change that affected the entry. We need to know this information so that we can update the profile subsystem in the appropriate way (was this profile added, updated, or deleted?)

Let’s look at how we do this. We are inside the while loop from above, starting exactly where we left off:

LDAPEntry entry = results.next();
LDAPEntryChangeControl changeControl = null;
for (LDAPControl control : results.getResponseControls()) {
    if (control instanceof LDAPEntryChangeControl) {
        changeControl = (LDAPEntryChangeControl) control;
        break;
    }
}
if (changeControl != null) {
    int changeType = changeControl.getChangeType();
    switch (changeType) {
    case LDAPPersistSearchControl.ADD:
        readProfile(entry);
        break;
    case LDAPPersistSearchControl.DELETE:
        forgetProfile(entry);
        break;
    case LDAPPersistSearchControl.MODIFY:
        forgetProfile(entry);
        readProfile(entry);
        break;
    case LDAPPersistSearchControl.MODDN:
        /* shouldn't happen; log a warning and continue */
        CMS.debug("Profile change monitor: MODDN shouldn't happen; ignoring.");
        break;
    default:
        /* shouldn't happen; log a warning and continue */
        CMS.debug("Profile change monitor: unknown change type: " + changeType);
        break;
    }
} else {
    /* shouldn't happen; log a warning and continue */
    CMS.debug("Profile change monitor: no LDAPEntryChangeControl in result.");
}

The first thing that has to be done is to retrieve from the LDAPSearchResults object the LDAPEntryChangeControl for the most recent search result. To do this we call results.getResponseControls(), which returns an LDAPControl[]. Each search result can arrive with multiple change controls, but we are specifically interested in the LDAPEntryChangeControl so we iterate over the LDAPControl[] until we find what we want, then break.

Next we ensure that we did in fact find the LDAPEntryChangeControl. This should always hold in our implementation but the code should handle the failure case anyway – we just log a warning and move on.

Finally, we call changeControl.getChangeType() and dispatch to the appropriate behaviour according to its value.

Interaction with the profile subsystem

Up to this point, we have seen how to use the ldapjdk API to execute a persistent LDAP search and process its results. Of course, this is just part of the story – the search somehow needs to be run in a way that doesn’t impede the regular operation of the Dogtag PKI, and needs to safely interact with the profile subsystem. Because the persistent search involves blocking calls, the procedure needs to run in its own thread.

Because this persistent search only concerns the ProfileSubsystem class, it was possible to completely encapsulate it within this class such that no changes to its API (including constructors) were necessary. An inner class Monitor, which extends Thread, actually runs the search. In this way, the code we saw above is neatly segregated from the rest of the ProfileSubsystem class, and there are no visibility issues when calling the readProfile and forgetProfile methods of the outer class.

The following simplified code conveys the essence of the complete implementation:

public class ProfileSubsystem implements IProfileSubsystem {
    public void init(...) {
        // Read profiles from LDAP into the subsystem.
        // Calls readProfile for each existing LDAPEntry.

        monitor = new Monitor(this, dn, dbFactory);
        monitor.start();
    }

    public synchronized IProfile createProfile(...) {
        // Create the profile
    }

    public void readProfile(LDAPEntry entry) {
        // Read some LDAP attributes into local vars
        createProfile(...);
    }

    private void forgetProfile(LDAPEntry entry) {
        profileId = /* read from entry */
        forgetProfile(profileId);
    }

    private void forgetProfile(String profileId) {
        // Forget about this profile.
    }

    private class Monitor extends Thread {
        public Monitor(...) {
            // constructor
        }

        public void run() {
            // Execute the persistent search as above.
            //
            // Calls readProfile and forgetProfile depending
            // on changes that occur.
        }
    }
}

So, what’s going on here? First of all, it must be emphasised that this example is simplified. For example, I have omitted details of how the monitor thread is stopped when the subsystem is shut down or reinitialised.

The monitor thread is started by the init method, once the existing profiles have been read into the profile subsystem. Executing the persistent search and handling results is the one job the monitor has to do, so it can block without affecting any other part of the system. When it receives results, it calls the readProfile and forgetProfile methods of the outer class – the ProfileSubsystem – to keep it up to date with the contents of the database.

Other parts of the system access the ProfileSubsystem as well, so consideration had to be given to synchronisation and making sure that changes to the contents of the ProfileSubsystem are done safely. In the end, the only method that was made synchronized was createProfile, which is also called by the REST interface. The behaviour of the handful of other methods that could be called simultaneously should be fine by virtue of the fact that the internal data structures used are themselves synchronised and idempotent. Hopefully I have not overlooked something important!

Conclusion

LDAP persistent searches can be used to receive immediate notification of changes that occur in an LDAP database. They support all the parameters of regular LDAP searches. ldapjdk’s API provides persistent search capabilities including the ability to discern what kind of change occurred for each result.

The ldapjdk LDAPSearchResults.hasMoreElements() method blocks each time it is called until a result has been received from the server. Because of this, it will usually be necessary to execute persistent searches asynchronously. Java threads can be employed to do this, but the usual "gotchas" of threading apply – threads must be stopped safely and the safety of methods that could be called from multiple places at the same time must be assessed. The synchronized keyword can be used to ensure serialisation of calls to methods that would otherwise be unsafe under these conditions.

September 29, 2014

Multiple Signers

You have a cloud, I have a cloud.

Neither of us is willing to surrender control of our OpenStack deployments, but we need to interoperate.

We both have Keystone servers. Those servers are the system of record for user authorization throughout our respective deployments. We each wish to maintain control of our assignments. How can we make a set of resources that can be shared? It can’t be done today. Here is why not, and how to make it possible.

Simple example: Nova in OpenStack Deployment A (OSDA) creates a VM using an image from Glance in OpenStack Deployment B (OSDB). Under the rules of policy, the VM and the image must both reside in the same project. However, the endpoints are managed by two different Keystone servers.

At the Keystone-to-Keystone level, we need to provide a shared view of the project. Since every project is owned by a domain, the correct level of abstraction is for the two Keystone servers to have a common view of the domain. In our example we create a domain in OSDA named “SharedDom” and assume it assigns “54321CBE” as the domain id. Keystone for OSDB needs to have an identical domain. Let’s loosen the rules on domain creation to allow that: have the create-domain call accept not just a name but also an explicit identifier, so OSDB can create the domain with “54321CBE” as its id.
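In terms of the v3 API, the loosened call might look something like the sketch below. This is hypothetical; Keystone does not accept a caller-supplied domain id today, and the endpoint URL and token are placeholders:

# Hypothetical sketch: create the domain in OSDB reusing the id that
# OSDA assigned. The explicit "id" field is the proposed loosening.
import json
import requests

ADMIN_TOKEN = 'placeholder-admin-token'  # a real admin token for OSDB

body = {'domain': {'id': '54321CBE', 'name': 'SharedDom', 'enabled': True}}
resp = requests.post('http://osdb.example.com:5000/v3/domains',
                     headers={'X-Auth-Token': ADMIN_TOKEN,
                              'Content-Type': 'application/json'},
                     data=json.dumps(body))
print resp.status_code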

Once we have the common domain definition, we can create a project under “SharedDom” in OSDA and, again, put a copy into OSDB.

Let us also assume that the service catalog has been synchronized across the two Keystone instances. While you might be tempted to just make each a region in one Keystone server, that surrenders control of the cloud to the remote Keystone admin, and so will not be organizationally acceptable. So, while we should tag each service catalog as a region, and keep the region names generally unique, they are still managed by distinct peer Keystone servers.

The API call that creates a VM in one endpoint by fetching an image from another accesses both endpoints with a single token. That is impossible today: which Keystone is used to allocate the token? If it is OSDA, then Glance would reject it; if OSDB, Nova would reject it.

For PKI tokens, we can extract the “Signing Data” from the token body and determine which certificate signed it. From the certificate we can determine which Keystone server signed the token. Keystone would then have to provide rules for which Keystone server was allowed to sign for a domain. By default, it would be only the authoritative Keystone server associated with the same deployment. However, Keystone from OSDA would allow Keystone from OSDB to sign for tokens under “SharedDom” only.

If both Keystone servers are using UUID tokens, we could arrange for the synchronization of all tokens for the shared domain across Keystone instances. Each endpoint would still authenticate tokens against its own Keystone server. The rules for determining which Keystone could sign for which tokens would be the same; they would just be enforced at the Keystone server level. My biggest concern with this approach is synchronization: it has a tendency toward race conditions. But it could be made to work as well.

The agreement can be set in one direction. If OSDA is managing the domain, and OSDB accepts the agreement, then OSDA can sign tokens for the domain that are used across both deployments’ endpoints. Keystone for OSDA must explicitly add a rule that says that Keystone for OSDB can sign tokens as well.

This setup works on a genteel understanding that only one of the two Keystone servers actively manages the domain. One Keystone server is delegated the responsibility of managing the domain; the other performs that delegation and agrees to stay out of it. If the shared domain proves to be too much trouble, either Keystone server can disable the domain without the agreement of the other.

As Mick Jagger sang “Hey, you!  Get off of my cloud!”

September 26, 2014

A follow-up to the Bash Exploit and SELinux.
One of the advantages of a remote exploit is being able to set up and launch attacks on other machines.

I wondered if it would be possible to set up a botnet attack using the remote attack on an Apache server with the bash exploit.

Looking at my Rawhide machine's policy:

sesearch -A -s httpd_sys_script_t -p name_connect -C | grep -v ^D
Found 24 semantic av rules:
   allow nsswitch_domain dns_port_t : tcp_socket { recv_msg send_msg name_connect } ;
   allow nsswitch_domain dnssec_port_t : tcp_socket name_connect ;
ET allow nsswitch_domain ldap_port_t : tcp_socket { recv_msg send_msg name_connect } ; [ authlogin_nsswitch_use_ldap ]


The Apache script would only be allowed to connect to (attack) a DNS server and an LDAP server. It would not be allowed to become a spam bot (no connection to mail ports) or even to attack other web services.

Could an attacker leave a back door to connect to later, even after the bash exploit is fixed?

# sesearch -A -s httpd_sys_script_t -p name_bind -C | grep -v ^D
#

Nope!  On my box the httpd_sys_script_t process is not allowed to listen on any network ports.

I guess the crackers will just have to find a machine with SELinux disabled.