July 24, 2014

Devstack mounted via NFS

Devstack allows the developer to work with the master branches for upstream OpenStack development. But Devstack performs many operations (such as replacing pip) that might be viewed as corrupting a machine, and should not be done on your development workstation. I’m currently developing with Devstack on a Virtual Machine running on my system. Here is my setup:

Both my virtual machine and my base OS are Fedora 20. To run the virtual machine, I use KVM and virt-manager. My VM is fairly beefy, with 2 GB of RAM allocated and a 28 GB hard disk.

I keep my code in git repositories on my host laptop. To make the code available to the virtual machine, I export /opt/stack via NFS and mount it on the guest VM at /opt/stack, owned by the ayoung user, which mirrors the setup on the base system.

Make sure NFS is running with:

sudo systemctl enable nfs-server.service 
sudo systemctl start  nfs-server.service

My /etc/exports:

/opt/stack/ *(rw,sync,no_root_squash,no_subtree_check)

And to apply changes made to this file (plain exportfs only lists the current exports; -r re-exports everything in /etc/exports):

sudo exportfs -ra

Make sure firewalld has the port for NFS open, but only for the internal network. For me, this is the virbr0 interface:

virbr0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255

I used the firewall-config application to modify firewalld:

For both changes, make sure the Configuration select box is set to Permanent, or you will be making the change again each time you reboot.

First add the interface:

[Screenshot: firewalld-nfs-interfaces]

And enable NFS:

[Screenshot: firewalld-nfs-ports]
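
For anyone who prefers the command line, the same two changes can be sketched with firewall-cmd. This is an illustration under assumptions, not a record of what was run here: the zone name internal is a guess (use whichever zone should own virbr0), and the run helper is hypothetical, there only so that the commands are printed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: command-line equivalent of the firewall-config steps.
# ZONE is an assumption; pick the zone that should own virbr0.
ZONE=internal
IFACE=virbr0

# Hypothetical helper: print each command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        sudo "$@"
    else
        echo "sudo $*"
    fi
}

# --permanent writes to the saved configuration, which corresponds
# to the "Permanent" setting in the GUI.
run firewall-cmd --permanent --zone="$ZONE" --add-interface="$IFACE"
run firewall-cmd --permanent --zone="$ZONE" --add-service=nfs
# Load the permanent configuration into the running firewall.
run firewall-cmd --reload
```

Run it once as-is to review the commands, then again with RUN=1 to apply them.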

In the virtual machine, I added a user (ayoung) with the same numeric user ID and group ID as on my base laptop. To find these values:

$ getent passwd ayoung
ayoung:x:14370:14370:Adam Young:/home/ayoung:/bin/bash

I admit I created the user when I installed the VM, which I did using the Anaconda installer and a DVD net-install image. However, the same thing can be done using useradd, whose -u option sets the numeric user ID. I also added the user to the wheel group, which simplifies sudo.

On the virtual machine, I created /opt/stack and let the ayoung user own it:

$ sudo mkdir /opt/stack ; sudo chown ayoung:ayoung /opt/stack

To mount the directory via nfs, I made an /etc/fstab entry:

192.168.122.1:/opt/stack /opt/stack              nfs4  defaults 0 0 

And now I can mount the directory with:

$ sudo mount /opt/stack

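I went through and updated the git repos in /opt/stack using a simple shell loop, run from /opt/stack. Globbing on */ instead of parsing the output of ls skips plain files and copes with odd directory names:

for DIR in */ ; do pushd "$DIR" ; git fetch ; git rebase origin/master ; popd ; done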

The alternative is setting RECLONE=yes in /opt/stack/devstack/localrc.

When running devstack, I had to make sure that the directory /opt/stack/data had already been created on the host machine. Devstack attempted to create it, but got an error induced by NFS.

Why did I go this route? I need to work on code running in HTTPD, namely Horizon and Keystone. That precluded me from doing all of my work in a venv on my laptop. The NFS mount gives me a few things:

  • I keep my Git repo intact on my laptop. This includes the private key used to access Gerrit.
  • I can edit using PyCharm on my Laptop.
  • I am sure that the code on my laptop and in my virtual machine is identical.

This last point is essential for remote debugging. I just got this to work for Keystone, and have submitted a patch that enables it. I’ll be working up something comparable for Horizon shortly.

July 21, 2014

Threats: William the Manager

William is concerned with his group getting their job done. He is under budget pressure, time pressure, and requirements to deliver. William is a good manager – he is concerned for his people and dedicated to removing obstacles that get in their way.

To a large degree William is measured on current performance and expectations for the next quarter. This means that he has little sympathy for other departments getting in the way of his people making the business successful! A lot of his job involves working with other groups to make sure that they meet his needs. And when they don’t, he gets them over-ruled or works around them.

When William does planning – and he does! – he is focused on generating business value and getting results that benefit him and his team. He is not especially concerned about global architecture or systems design or “that long list of hypothetical security issues”. Get the job done, generate value for the company, and move on to the next opportunity.

William sees IT departments as an obstacle to overcome – they are slow, non-responsive, and keep doing things that get in the way of his team. He sees the security team in particular as an unreasonable group of people who have no idea what things are like in the real world, and who seem to be dedicated to coming up with all sorts of ridiculous requirements that are apparently designed to keep the business from succeeding.

William, with the best of intentions, is likely to compromise and work around security controls – and often gets the support of top management in doing this. To be more blunt, if security gets in the way, it is gone! If a security feature interferes with getting work done, he will issue orders to turn that feature off. If you look at some of my other posts on the value of IT and computer systems, such as Creating Business Value, you will see that, at least in some cases, William may be right.

And this is assuming that William is a good corporate citizen, looking out for the best interests of the company. If he is just looking out for himself, the situation can be much worse.

It is not enough to try to educate William on security issues – for one thing (depending on the security feature), William may be right! The only chance for security is to find ways to implement security controls that don’t excessively impact the business units. And to keep the nuclear option for the severe cases where it is needed, such as saving credit card numbers in plain text on an Internet facing system. (Yes, this can easily happen – for example, William might set up a “quick and dirty” ecommerce system on AWS if the IT group isn’t able to meet his needs.)


July 18, 2014

Oh No! I Committed to master! What do I do?

You were working in a git repo and you committed your change to master. Happens all the time. Panic not.

Here are the steps to recover:

Create a new branch from your current master branch. This will include your new commit. (To be clear, you should replace ‘description-of-work’ with a short name for your new branch.)

git branch description-of-work

Now reset your current master branch to upstream:

git reset --hard origin/master

All fixed.
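
If you would rather rehearse this before trying it on a real repository, the following sketch replays the mistake and the recovery in a scratch directory. Everything in it is made up for the demo: the upstream and work directory names, the g helper (which only supplies a throwaway git identity) and the empty commits.

```shell
#!/bin/sh
# Sketch: rehearse the recovery in a scratch directory.
# All names here (upstream, work, description-of-work) are illustrative.
set -e
SCRATCH=$(mktemp -d)
cd "$SCRATCH"

# Helper so the demo does not depend on your global git identity.
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# A stand-in for the upstream repository, with one commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
g -C upstream commit -q --allow-empty -m "upstream commit"

# Clone it and make the mistake: commit directly to master.
git clone -q upstream work
cd work
g commit -q --allow-empty -m "my accidental commit"
MISTAKE=$(git rev-parse HEAD)

# The recovery: keep the commit on a new branch, then reset master.
git branch description-of-work
git reset -q --hard origin/master

# master is back at origin/master; the new branch holds the commit.
git log --oneline --all --decorate
```

The final log should show master and origin/master on the upstream commit, with description-of-work one commit ahead.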

Why did this work?

A branch in git points to a specific commit.  All commits are named by hashes.  For example, right now, I have a keystone repo with my master branch pointing to

$ git show master
commit bbfd58a6c190607f7063d15a3e2836e40806ef57
Merge: e523119 f18911e
Author: Jenkins <jenkins@review.openstack.org>
Date: Fri Jul 11 23:36:17 2014 +0000

Merge "Do not use keystone's config for nova's port"

This is defined by a file in .git/refs:

$ cat .git/refs/heads/master 
bbfd58a6c190607f7063d15a3e2836e40806ef57

I could create or edit files here by hand and get the same effect as git branch. Let’s do exactly that.

$ cp .git/refs/heads/master .git/refs/heads/edit-by-hand
$ git branch 
  edit-by-hand
* master
$ git show edit-by-hand 
commit bbfd58a6c190607f7063d15a3e2836e40806ef57
Merge: e523119 f18911e
Author: Jenkins <jenkins@review.openstack.org>
Date:   Fri Jul 11 23:36:17 2014 +0000

    Merge "Do not use keystone's config for nova's port"

OK, let’s modify this the right way:

$ git checkout edit-by-hand 
Switched to branch 'edit-by-hand'
$ git reset --hard HEAD~1
HEAD is now at e523119 Merge "Adds hacking check for debug logging translations"
$ git show edit-by-hand 
commit e52311945a4ab3b47a39084b51a2cc596a2a1161
Merge: b0d690a 76baf5b
Author: Jenkins <jenkins@review.openstack.org>
Date:   Fri Jul 11 22:19:03 2014 +0000

    Merge "Adds hacking check for debug logging translations"
...

That change made it into the ref file:

$ cat .git/refs/heads/edit-by-hand 
e52311945a4ab3b47a39084b51a2cc596a2a1161

Here is the history for the edit-by-hand branch:

$ git log edit-by-hand --oneline

Returns:

e523119 Merge "Adds hacking check for debug logging translations"
b0d690a Merge "multi-backend support for identity"
6aa0ad5 Merge "Imported Translations from Transifex"
...

I want the full hash for 6aa0ad5, so:

$ git show --stat 6aa0ad5
commit 6aa0ad5beb39107ffece6e5d4a068d77f7d51059

Let’s set the branch to point to this:

$ echo 6aa0ad5beb39107ffece6e5d4a068d77f7d51059 > .git/refs/heads/edit-by-hand 
$ git show
commit 6aa0ad5beb39107ffece6e5d4a068d77f7d51059
Merge: 2bca93f bf8a2e2
Author: Jenkins <jenkins@review.openstack.org>
Date:   Fri Jul 11 21:48:09 2014 +0000

    Merge "Imported Translations from Transifex"

What would you expect now? I haven’t looked yet, but I would expect git to tell me that I have a bunch of unstaged changes; basically, everything that was in the commits on master that I chopped off of edit-by-hand. Let’s look:

$ git status
On branch edit-by-hand
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   doc/source/configuration.rst
	modified:   doc/source/developing.rst
	modified:   etc/keystone.conf.sample
...
Untracked files:
  (use "git add <file>..." to include in what will be committed)

It was pretty long, so I cut out some files.

I can undo all those “changes” by setting the hash back to the original value:

$ echo e52311945a4ab3b47a39084b51a2cc596a2a1161   > .git/refs/heads/edit-by-hand 
$ git status
On branch edit-by-hand
Untracked files:
  (use "git add <file>..." to include in what will be committed)

Why did that work? Without the explicit “checkout” command, the current workspace was left unchanged.
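
The same experiment can be repeated without writing into .git by hand. The sketch below uses git update-ref, the plumbing command for moving a ref, in a scratch repository; the file name and the g identity helper are made up for the demo. Unlike echo, update-ref also copes with refs that have been packed into .git/packed-refs, in which case the file under refs/heads may not exist at all.

```shell
#!/bin/sh
# Sketch: moving a branch ref without a checkout leaves the files alone.
# The scratch repository and file name are made up for the demo.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

echo one > file.txt
git add file.txt
g commit -q -m "first"
FIRST=$(git rev-parse HEAD)

echo two > file.txt
g commit -q -am "second"

# Supported equivalent of: echo $FIRST > .git/refs/heads/master
git update-ref refs/heads/master "$FIRST"

# The branch moved back, but the index and workspace still hold the
# newer content, so git reports it as a change to be committed.
cat file.txt
git status --short
```

Here file.txt still contains "two" after the ref has been moved back, and git status lists it as staged, which mirrors the "Changes to be committed" output above.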

July 16, 2014

Kerberos for Horizon and Keystone

I have a proof-of-concept Horizon instance that authenticates with Kerberos. It has a way to go before it can be used in production, but the mechanism works.

This is not a how-to. I’ve written up some of the steps in the past. Instead, this is an attempt to illuminate some of the issues.

I started with a Packstack-based all-in-one install on a Fedora 20 instance registered as a FreeIPA client. I hand-modified Keystone to run in HTTPD.

The Horizon HTTP instance is set up with S4U2Proxy. Since both Horizon and Keystone are on the same machine, it is both the source and target of the proxy rule; a user connecting via HTTPS gets a service ticket for Horizon, which then requests a delegated service ticket for itself. I’m not seeing any traffic on the KDC when this happens, which leads me to think that the Kerberos library is smart enough to reuse the initial service ticket for the user. However, I’ve also tested S4U2Proxy from HTTPD to Keystone on a remote machine in an earlier setup, and am fairly certain that this will work when Horizon and Keystone are not co-located.

After initial configuration, I did a git clone of the repositories for the projects I needed to modify:

  • django_openstack_auth
  • keystone
  • python-keystoneclient
  • python-openstackclient

To use this code, I switched to each directory and ran:

sudo pip install -e .

Horizon

Horizon uses form based authentication. I have not modified this. Longer term, we would need to determine what UI to show based on the authentication mechanism. I would like to be able to disable form based authentication for the Kerberos case, as I think passing your Kerberos password over the wire is one of the worst security practices; we should actively discourage it.

Django OpenStack Auth and Keystone Client

Horizon uses a project called django-openstack-auth that communicates with Keystone client. This needs to work with client auth plugins. I’ve hacked this in code, but the right solution is for it to get the auth plugin out of the Django configuration options. Implied here is that django-openstack-auth should be able to use keystoneclient sessions and V3 Keystone authentication.

When a user authenticates to Horizon, they do not set a project. Thus, Horizon does not know what project to pass in the token request. The Token API has some strange behaviour when it comes to token requests without an explicit scope. If the user has a default project set, they get a token scoped to that project. If they do not have a default project set, they get an unscoped token.

Jamie Lennox has some outstanding patches to Keystone client that address the Kerberos use cases. Specifically, if the client gets an unscoped token, there is no service catalog. Since the client makes calls to Keystone based on endpoints in the service catalog, it cannot use an unscoped token. One of Jamie’s patches addresses this; if there is no service catalog, continue to use the Auth URL to talk to Keystone. This is a bit of an abuse of the Auth URL and really should only be used to get the list of domains or projects for a user. Once the user has this information, they can request a scoped token. This is what Horizon needs to do on behalf of the user.
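
As a rough sketch of that last step, the request body for trading an unscoped token for a project-scoped one could be built like this. The JSON shape follows the Identity v3 token API; the function name and the placeholder token and project IDs are hypothetical, and nothing here talks to a real Keystone.

```shell
#!/bin/sh
# Sketch of the rescoping request using the Identity v3 token API.
# The token and project ID values are placeholders, not real data.

# Body for exchanging an existing (unscoped) token for a
# project-scoped one.
scoped_token_request() {
    token_id=$1
    project_id=$2
    cat <<EOF
{"auth": {
  "identity": {"methods": ["token"], "token": {"id": "$token_id"}},
  "scope": {"project": {"id": "$project_id"}}
}}
EOF
}

# Print the body that would be POSTed, as application/json, to the
# auth/tokens path under the Auth URL.
scoped_token_request UNSCOPED_TOKEN_ID PROJECT_ID
```

Leaving out the "scope" member gives the behaviour described earlier: a token for the default project if one is set, otherwise an unscoped token.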

While Keystone can use Kerberos as a “method” value when creating a token, the current set of plugins did not allow for mapping to the DefaultDomain. There is a plugin for “external” to do that, and I subclassed that for Kerberos. There is an outstanding ticket for removing the “method” value from the plugin implementations, which will reduce the number of classes we need to implement common behavior.

To talk to a Kerberos protected Keystone, the Keystone client needs to use the Kerberos Auth plugin. However, it can only use this to get a token; the auth plugin does not handle other communication. Thus, only the /auth/tokens path should be Kerberos protected. Here is what I am using:

<Location "/keystone/krb">
  LogLevel debug
  WSGIProcessGroup keystone_krb_wsgi
  NSSRequireSSL
</Location>

<Location "/keystone/krb/v3/auth/tokens">
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  # defaults off, but to be explicit
  # Keystone should not be a proxy 
  KrbConstrainedDelegation off
  Require valid-user
  NSSRequireSSL
</Location>

<Location "/dashboard/auth/login/">
  LogLevel debug
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  KrbConstrainedDelegation on
  Require valid-user
  NSSRequireSSL
</Location>

To go further, the Kerberos protection should be optional if you wish to allow other authentication methods to be stacked with Kerberos.

This implies that the value:

Require valid-user

should be in the HTTPD conf section for Horizon, but not for Keystone. This does not yet work for me, and I will investigate further.

Manage Kerberos like a Hero

Secure web communications require cryptography. For authentication, the two HTTP based standards are Client Side Certificates and Kerberos. Of the two, only Kerberos allows for constrained delegation in a standardized way. Making Kerberos part of the standard approach to OpenStack will lead to a more secure OpenStack.

Towards efficient security code audits

Conducting a code review is often a daunting task, especially when the goal is to find security flaws. Flaws can be, and usually are, hidden in all parts and levels of the application – from the lowest level coding errors, through unsafe coding constructs and misuse of APIs, to the overall architecture of the application. The size and quality of the codebase, the quality of the (hopefully) existing documentation, and time restrictions are the main complications of the review. It is therefore useful to have a plan beforehand: know what to look for, how to find the flaws efficiently, and how to prioritize.

Code review should start by collecting and reviewing existing documentation about the application. The goal is to get a decent overall picture of the application – what the expected functionality is, what requirements can be expected from the security standpoint, and where the trust boundaries are. Not all flaws with security implications are relevant in all contexts: an effective denial of service against a server certainly has security implications, whereas a coding error in a command line application that causes excessive CPU load will probably have low impact. At the end of this phase it should be clear what the security requirements are and which flaws could have the highest impact.

Armed with this knowledge, the next step is to define the scope of the audit. A thorough review almost always requires far more resources than are available, so defining which parts will be audited and which vulnerabilities will be searched for increases the efficiency of the audit. It is, however, necessary to state all the assumptions explicitly in the report – this makes it possible for others to review them, or to revisit them in future audits.

In general there are two approaches to conducting a code review – for lack of better terminology we shall call them bottom up and top down. Of course, real audits always combine techniques from both, so this classification is mainly useful for putting the techniques in context.

The top down approach starts with the overall picture of the application and its security requirements and drills down towards lower levels of abstraction. We often start by identifying the components of the application, their relationships, and the flow of data. Drilling further down, we can choose to inspect potentially sensitive interfaces that components provide, how data is handled at rest and in motion, how access to sensitive parts of the application is restricted, and so on. From this point the audit quickly becomes very targeted – since we have a good picture of which components, interfaces and channels might be vulnerable to which classes of attacks, we can focus our search and ignore the other parts. Sometimes this will bring us down to the level of line-by-line code inspection, but this is fine – it usually means that architecturally some part of the security of the application depends on the correctness of the code in question.

The top down approach is invaluable, as it can find flaws in the overall architecture that would otherwise go unnoticed. However, it is also very demanding – it requires broad knowledge of all classes of weaknesses and threat models, and the ability to switch between abstraction levels quickly. The cost of such an audit can be reduced by reviewing the application very early in the design phase – unfortunately, most of the time this is not possible because of the development model chosen or the phase in which the audit was requested. Another way to reduce the effort is to invest in documentation and reuse it in future audits.

In the bottom up approach we usually look for indications of vulnerabilities in the code itself and investigate whether they can lead to exploitation. These indications range from outright dangerous code, misuse of APIs, dangerous coding constructs and bad practices to poor code quality – all of these may indicate the presence of a weakness in the code. The search is usually automated, as there is an abundance of tools to simplify this task, including static analyzers, code quality metric tools and the most versatile one: grep. All of these reduce the cost of finding potentially weak spots, so the cost lies in separating the wheat from the chaff. The bane of this approach is the receiver operating characteristic curve – it is difficult to improve substantially, so we are usually left with a tradeoff between false positives and false negatives.
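
To make the grep part concrete, here is a minimal sketch of such a search. The list of risky C functions is only an illustrative sample, nowhere near a complete checklist, and the sample source file exists purely so the example has something to scan.

```shell
#!/bin/sh
# Sketch: a grep pass for a few classic risky C calls.
# The function list and the sample file are illustrative only.
set -e
SRC=$(mktemp -d)
cat > "$SRC/sample.c" <<'EOF'
#include <stdio.h>
#include <string.h>
void copy(char *dst, const char *src, FILE *in) {
    strcpy(dst, src);          /* unbounded copy: should be flagged */
    fgets(dst, 8, in);         /* bounded read: should not be flagged */
}
EOF

# Match the call names only when they are not the tail of a longer
# identifier, so fgets() is not reported as gets().
PATTERN='(^|[^[:alnum:]_])(strcpy|sprintf|gets|system)[[:space:]]*\('
grep -rnE "$PATTERN" "$SRC"
```

Only the strcpy line is reported. Tightening the pattern this way trades a few false positives for the risk of missing calls hidden behind macros or wrappers, which is the tradeoff described above in miniature.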

The advantages of the bottom up approach are its relatively low resource requirements and its reusability. This means it is often easy and desirable to run such analyses as early and as often as possible. It also depends much less on the skill of the reviewer, since the patterns can be collected into a knowledge base, aided by freely available resources on the Internet. It is a good idea to create checklists to make sure all common types of weaknesses are audited for and to make this kind of review more scalable. On the other hand, the biggest disadvantage is that certain classes of weaknesses can never be found with this approach – these usually include the architectural flaws that lead to the vulnerabilities with the biggest impact.

The last step in any audit is writing a report. Even though this is usually perceived as the least productive time spent, it is an important one. A good report enables other interested parties to further scrutinize weak points, provides the information needed to make potentially hard decisions, and is a good way to share and reuse knowledge that might otherwise stay private.

July 14, 2014

Threats: Stan the Security Czar

What?!? The security guy is listed as a threat to system security?

Absolutely. Stan is knowledgeable. He knows that the world is filled with evil. And he is determined to protect his company from it.

There is a famous saying: The only truly secure computer system is one that is melted down into slag, cast into a concrete block, and dumped into the deepest ocean trench. Even then you can’t be completely sure…

The challenge is that many things done to harden a computer system make the system more difficult to use. And the Law of Unintended Consequences always comes into play. For example, to make passwords resistant to brute force attacks, you need to make them long and have them include different types of characters. And, for some reason, you need to change passwords regularly.

So, the answer is to require 16 character passwords with upper case, lower case, numbers, and special characters, containing no dictionary words, and to change them every 30 days – right? Ummm, no. This actually massively reduces security – we will talk about this more in a future post.

As another example, how about setting the inactivity timer in an application, which forces you to re-enter your username and password, to five minutes? Or perhaps two minutes or even one minute? After all, you can’t be too secure! Far from being effective security, this will result in computers being thrown off the roof of the building and lynch mobs looking for the person responsible! As well as a significant drop in productivity.

An excellent discussion of the behaviour of Stan the Security Czar occurs in the book “The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win”, which I encourage everyone to read. It shows how a focus on technology, without taking into consideration the power of people and processes, can be very expensive and actually reduce effective security.

To bring things into a sharp focus, recall our premise that the reason for IT is to support the generation of business value, and that business value comes from people using applications to transform data. Anything that interferes with any part of this reduces the value of IT – and heavy handed security approaches can massively impact the business value of IT. Without careful consideration of human and business factors, Stan is likely to do things that hinder use of computer systems and actually reduce overall security in the name of improving security. The challenge in dealing with Stan is to achieve appropriate security while maintaining the business value of IT.


July 11, 2014

Daniel J. Bernstein lecture on software (in)security

Building secure software and secure software systems is obviously an important part of my job as a developer on the FreeIPA identity management and Dogtag PKI projects here at Red Hat. Last night I had the privilege of attending a lecture by the renowned Research Professor Daniel J. Bernstein at Queensland University of Technology entitled Making sure software stays insecure (slides). The abstract of his talk:

We have to watch and listen to everything that people are doing so that we can catch terrorists, drug dealers, pedophiles, and organized criminals. Some of this data is sent unencrypted through the Internet, or sent encrypted to a company that passes the data along to us, but we learn much more when we have comprehensive direct access to hundreds of millions of disks and screens and microphones and cameras. This talk explains how we’ve successfully manipulated the world’s software ecosystem to ensure our continuing access to this wealth of data. This talk will not cover our efforts against encryption, and will not cover our hardware back doors.

Of course, Prof. Bernstein was not the "we" of the abstract. Rather, the lecture, in its early part, took the form of a thought experiment suggesting how this manipulation could be taking place. In the latter part of the lecture, Prof. Bernstein justified and discussed some security primitives he feels are missing from today’s software.

I will now briefly recount the lecture and the Q&A that followed (a reconstitution of my handwritten notes; some paraphrase and omissions have occurred), then wrap up with my thoughts about the lecture.

Lecture notes

Introduction

  • Smartphones; almost everyone has one. Pretty much anyone in the world can turn on the microphone or camera and find out what’s happening.
  • It is terrifying that people (authoritarian governments, or, even if you trust your government now, can you trust the next one?) have access to such capabilities.
  • Watching everyone, all the time, is not an effective way to catch bad guys. Yes, they are bad, but total surveillance is ineffective and violates rights.
  • Prof. Bernstein has no evidence of deliberate manipulation of software ecosystems to this end, but now embarks on a thought experiment: what if they did try?

Distract users

  • Things are labelled as "security" that actually are not, e.g. anti-virus.
  • People are told to do these things, and indeed are happy to follow along. They feel good about doing something.
  • Money gets spent on e.g. virus scanners or 2014 NIST framework compliance, instead of building secure systems. 2014 NIST definition of "protect" has 98 subcategories, none of which are about making secure software.

Distract programmers

  • Automatic low-latency security updates are viewed as a security method.
  • "Security" is defined by public security vulnerabilities. This is not security. The reality is that there are other holes that attackers are actively exploiting.

Distract researchers

  • Attack papers and competitions are prominent, and research funding is often predicated on their outcomes.
  • Research into building secure systems takes a back seat.

Discourage security

  • Tell people that "there’s no such thing as 100% security, so why even try?"
  • Tell people that "it is impossible to even define security, so give up."
  • Some people make both of these claims simultaneously.
  • Hide, dismiss or mismeasure security metric #1 (defined later).
  • Prioritise compatibility, "standards", speed, e.g. "an HTTP server in the kernel is critical for performance".

Definition of security

  • Integrity policy #1: Whenever a computer shows a file, it also tells me the source of the file.
  • Example: UNIX file ownership and permissions. Multi-user system, no file sharing. If users are not sharing files, the UNIX model if implemented correctly can enforce integrity policy #1. How can we check?
    1. Check the code that enforces the file permission rules.
    2. Check the code that allocates memory, reads and writes files, and authenticates users.
    3. Check all the kernel code (because it is all privileged).
  • The code to check is the trusted computing base (TCB). The size of the TCB is security metric #1. It is unnecessary to check or limit anything else.

Example: file sharing

  • Eve and Frank need to share files. Eve can own the file but give Frank write permissions.
  • By integrity policy #1, the operating system must record Frank as the source of the file.
  • If a process reads data from multiple sources, files written by the process must be marked with all those sources.

Example: web browsing

  • If you visit Frank’s site, browser may try to verify and show Frank as source of the file(s) being viewed. But browser TCB is huge.
  • What if, instead of the current model, you gave Frank a file upload account on your system? Files uploaded could be marked with Frank as source. Browser could then read these files.
  • Assuming the OS has this capability, it needn’t be manual. Web browsing could work this way.

Conclusion

  • Is the community even trying to build a software system with a small TCB that enforces integrity policy #1?

Q&A: Identification of sources

  • Cryptography is good for this in networked world, but current CA system is "pathetic".
  • Certificate transparency is a PKI consistency-check mechanism that may improve current infrastructure.
  • A revised infrastructure for obtaining public keys is preferable. Prof. Bernstein thinks GNUnet is interesting.
  • Smaller (i.e. actually auditable) crypto implementations are needed. TweetNaCl (pronounced "tweet salt") is a full implementation of the NaCl cryptography API in 100 tweets.

Q&A: Marking regions of file with different sources

  • I asked a question about whether there was scope within the definition of integrity policy #1 for marking regions of files with different sources, rather than marking a contiguous file with all sources.
  • Prof. Bernstein suggested that there is, but it would be better to change how we are representing that data and decompose it into separate files, rather than adding complexity to the TCB. A salient point.

Discussion

This was a thought-provoking and thoroughly enjoyable lecture. It was quite narrow in scope, defining and justifying one class of security primitives that Prof. Bernstein believes are essential. The question of how to identify a source did not come up until the Q&A. Primitives to enable privacy or anonymity did not come up at all. I suppose that by not mentioning them, Prof. Bernstein was making the point that they are orthogonal problem spaces (a sentiment I would agree with).

I should also note that there was no mention of any integrity policy #2, security metric #2, or so on. My interpretation of this is that Prof. Bernstein believes that the #1 definitions are sufficient in the domain of data provenance, but there are other reasonable interpretations.

The point about keeping the trusted computing base as simple and as small as possible was one of the big take-aways for me. His response to my question implies that he feels it is preferable to incur costs in complexity and implementation time outside the TCB, perhaps many times over, in pursuit of the goal of TCB auditability.

Finally, Prof. Bernstein is not alone in lamenting the current trust model in the PKI of the Internet. It didn’t have a lot to do with the message of his lecture, but I nevertheless look forward to learning more about GNUnet and checking out TweetNaCl.

July 09, 2014

Is there a Java Binding for LMIShell?

An interesting question just came up: “is there a Java binding for LMIShell?”

Hmm, good question – let’s answer it by digging into the OpenLMI architecture a bit.

LMIShell is a client framework written in Python. It consists of a Python language binding to the OpenLMI WBEM interface (which is CIM XML over https) which presents the OpenLMI objects as native Python objects, a set of helper functions, a set of task oriented management scripts, and a task oriented CLI interface.

LMIShell is designed to be extended by adding new management scripts (also written in Python) and CLI calls.

Java also has a language binding to the OpenLMI WBEM interface. In fact, since this is Linux, there are two of them… The Java language bindings are provided by the sblim-cim-client and sblim-cim-client2 packages. Both of these packages provide a CIM Client Class Library for Java applications which is compliant with the JCP JSR48 specification. Details about the Java Community Process and JSR48 can be found at http://www.jcp.org and http://www.jcp.org/en/jsr/detail?id=48. Note that documentation and examples are available – see the sblim-cim-client2-manual package.

Thus, there is a direct interface to the OpenLMI API from Java. An entire client application can be written in Java – in fact, there was discussion of whether LMIShell should be implemented in Python or Java.

If you want to use the LMIShell CLI from Java, that is straightforward. If you want to call LMIShell functions from Java, it can be done but is a little trickier. If you want to write a Java application directly against the OpenLMI API, use the Java language binding.

In many cases the easiest answer is likely to be to look at the LMIShell modules to see how they call the OpenLMI API, and then implement the function directly in Java using the Java language binding.


Diagnosing a Dogtag SELinux Issue

In this post, I explain an issue I had with Dogtag failing to start due to some recently added behaviour that was prohibited by Fedora’s SELinux security policy, and detail the steps that were taken to resolve it.

The Problem

A recent commit to Dogtag added the ability to archive each subsystem’s configuration file on startup. This feature is turned on by default. On each startup, each subsystem’s CS.cfg is copied to /etc/pki/<instance>/<subsystem>/archives/CS.cfg.bak.<timestamp>. A symbolic link pointing to the archived file named CS.cfg.bak is then created in the parent directory of archives/, alongside CS.cfg.
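
The archive-then-link behaviour described above can be sketched in a few lines. This is only an illustration: the real logic lives in Dogtag's shell startup scripts, and the function name archive_config and the timestamp format (inferred from the file names in the log output) are assumptions made for this example.

```python
import os
import shutil
import time


def archive_config(subsystem_dir, now=None):
    """Copy CS.cfg into archives/ and point CS.cfg.bak at the new copy.

    Hypothetical helper; it mimics the behaviour described in the post,
    not Dogtag's actual implementation.
    """
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%Y%m%d%H%M%S", now)

    archives = os.path.join(subsystem_dir, "archives")
    os.makedirs(archives, exist_ok=True)

    # Step 1: the timestamped copy (this step succeeded in the post).
    archived = os.path.join(archives, "CS.cfg.bak." + stamp)
    shutil.copy2(os.path.join(subsystem_dir, "CS.cfg"), archived)

    # Step 2: the symbolic link next to CS.cfg (this is the step that failed).
    link = os.path.join(subsystem_dir, "CS.cfg.bak")
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(archived, link)
    return archived, link
```

Note that the copy and the link are two separate filesystem operations on two different object classes (a regular file and a symbolic link), which is why one can be permitted while the other is denied.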

Having built and installed a development version of Dogtag that contained this new feature, I attempted to start Dogtag, but the service failed to start.

% sudo systemctl start pki-tomcatd@pki-tomcat.service
Job for pki-tomcatd@pki-tomcat.service failed. See 'systemctl status pki-tomcatd@pki-tomcat.service' and 'journalctl -xn' for details.

The error message gave some advice on what to do next, so I followed its advice.

% systemctl status pki-tomcatd@pki-tomcat.service
pki-tomcatd@pki-tomcat.service - PKI Tomcat Server pki-tomcat
   Loaded: loaded (/usr/lib/systemd/system/pki-tomcatd@.service; enabled)
   Active: failed (Result: exit-code) since Tue 2014-07-08 21:22:42 EDT; 1min 10s ago
  Process: 26699 ExecStop=/usr/libexec/tomcat/server stop (code=exited, status=1/FAILURE)
  Process: 26653 ExecStart=/usr/libexec/tomcat/server start (code=exited, status=143)
  Process: 32704 ExecStartPre=/usr/bin/pkidaemon start tomcat %i (code=exited, status=1/FAILURE)
 Main PID: 26653 (code=exited, status=143)

Jul 08 21:22:42 ipa-1.ipa.local systemd[1]: Starting PKI Tomcat Server pki-tomcat...
Jul 08 21:22:42 ipa-1.ipa.local pkidaemon[32704]: ln: failed to create symbolic link ‘/var/lib/pki/pki-tomcat/conf/ca/CS.cfg.bak‘: Permission denied
Jul 08 21:22:42 ipa-1.ipa.local pkidaemon[32704]: SUCCESS:  Successfully archived '/var/lib/pki/pki-tomcat/conf/ca/archives/CS.cfg.bak.20140708212242'
Jul 08 21:22:42 ipa-1.ipa.local pkidaemon[32704]: WARNING:  Failed to backup '/var/lib/pki/pki-tomcat/conf/ca/CS.cfg' to '/var/lib/pki/pki-tomcat/conf/ca/CS.cfg.bak'!
Jul 08 21:22:42 ipa-1.ipa.local pkidaemon[32704]: /usr/share/pki/scripts/operations: line 1579: 0: command not found
Jul 08 21:22:42 ipa-1.ipa.local systemd[1]: pki-tomcatd@pki-tomcat.service: control process exited, code=exited status=1
Jul 08 21:22:42 ipa-1.ipa.local systemd[1]: Failed to start PKI Tomcat Server pki-tomcat.
Jul 08 21:22:42 ipa-1.ipa.local systemd[1]: Unit pki-tomcatd@pki-tomcat.service entered failed state.

journalctl -xn gave essentially the same information as above. We can see that creation of the symbolic link failed, which led to a subsequent warning and failure to start the service. Interestingly, we can also see that creation of archives/CS.cfg.bak.20140708212242 (the target of the symbolic link) was reported to have succeeded.

The user that runs the Dogtag server is pkiuser, and everything seemed fine with the permissions in /etc/pki/pki-tomcat/ca/. The archived configuration file that was reported to have been created successfully was indeed there.

Next I looked at the Dogtag startup routines, which live in /usr/share/pki/scripts/operations. I located the offending ln -s and replaced it with a cp; that is, instead of creating a symbolic link, the startup script would now simply create CS.cfg.bak as a copy of the archived configuration file. Having made this change, I tried to start Dogtag again, and it succeeded. Something was prohibiting the creation of the symbolic link.

The Culprit

That something was SELinux.

SELinux (Security-Enhanced Linux) is a mandatory access control system for Linux that can be used to express and enforce detailed security policies. It is enabled by default in recent versions of Fedora, which ships with a reasonable default set of security policies.

The Workaround

To continue the diagnosis of this problem, I restored the original behaviour of the startup script, i.e. creating a symbolic link, and confirmed that Dogtag was once again failing to start.

The next step was to look for a way to get SELinux to permit the operation. I soon discovered setenforce(8), which is used to put SELinux into enforcing mode (setenforce 1; the default behaviour) or permissive mode (setenforce 0). As expected, running sudo setenforce 0 allowed Dogtag startup to succeed again, but obviously this was not a solution – merely a temporary workaround, acceptable in a development environment, but unacceptable for our customers and users.

The Plumbing

Having little prior experience with SELinux, and since it had reached the end of the day, I emailed the other developers for advice on how to proceed. Credit goes to Ade Lee for most of the information that follows.

SELinux logs to /var/log/audit/audit.log (on Fedora, at least). This log contains details about operations that SELinux denied (or would have denied, if it was enforcing). This log can be read by the audit2allow(1) tool, to construct SELinux rules that would allow the operations that were denied. First, I truncated the log so that it would include only the relevant failures:

% sudo sh -c ':>/var/log/audit/audit.log'

Next, with SELinux still in permissive mode so that all operations that would otherwise be denied throughout the startup process will be permitted but logged, I started the server via systemctl as before. Startup succeeded, and the audit log now contained information about all the would-have-failed operations. Here is a short excerpt from the audit log (three lines, wrapped):

type=AVC msg=audit(1404872081.435:1006): avc:  denied  { create }
  for  pid=1298 comm="ln" name="CS.cfg.bak"
  scontext=system_u:system_r:pki_tomcat_t:s0
  tcontext=system_u:object_r:pki_tomcat_etc_rw_t:s0 tclass=lnk_file
type=SYSCALL msg=audit(1404872081.435:1006): arch=c000003e
  syscall=88 success=yes exit=0 a0=7fff6b27aac0 a1=7fff6b27ab03 a2=0
  a3=7fff6b278790 items=0 ppid=1113 pid=1298 auid=4294967295 uid=994
  gid=994 euid=994 suid=994 fsuid=994 egid=994 sgid=994 fsgid=994
  tty=(none) ses=4294967295 comm="ln" exe="/usr/bin/ln"
  subj=system_u:system_r:pki_tomcat_t:s0 key=(null)
type=AVC msg=audit(1404872081.436:1007): avc:  denied  { read }
  for  pid=1113 comm="pkidaemon" name="CS.cfg.bak" dev="vda3"
  ino=134697 scontext=system_u:system_r:pki_tomcat_t:s0
  tcontext=system_u:object_r:pki_tomcat_etc_rw_t:s0 tclass=lnk_file

There were about 30 lines in the audit log. As expected, there were entries related to the failure to create a symbolic link – those are the lines above. There were also entries that didn’t seem related to the symlink failure, yet were obviously caused by the Dogtag startup.
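
For picking the interesting fields out of records like these, a small parser helps. The following is a minimal sketch, assuming each record is on a single unwrapped line and that no quoted value contains spaces; parse_avc is a name invented for this example, not part of any SELinux tool.

```python
import re

# Matches the "avc:  denied  { perms } for  key=value ..." part of a record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)"
)


def parse_avc(record):
    """Return a dict describing one AVC denial, or None for other records."""
    match = AVC_RE.search(record)
    if match is None:
        return None
    fields = {"perms": match.group("perms").split()}
    # The remainder is a run of whitespace-separated key=value pairs.
    for token in match.group("rest").split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value.strip('"')
    return fields


record = (
    'type=AVC msg=audit(1404872081.435:1006): avc:  denied  { create } '
    'for  pid=1298 comm="ln" name="CS.cfg.bak" '
    'scontext=system_u:system_r:pki_tomcat_t:s0 '
    'tcontext=system_u:object_r:pki_tomcat_etc_rw_t:s0 tclass=lnk_file'
)
denial = parse_avc(record)
print(denial["comm"], denial["perms"], denial["tclass"])
```

The fields that matter most for diagnosis are comm (the command), perms (what it tried to do), tclass (the kind of object) and the two contexts.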

To one unfamiliar with SELinux, the format of the audit log and the meaning of the entries therein is somewhat opaque. Running sudo audit2why -a distils the audit log into more human-friendly information, giving information about six denials including the symlink denial:

type=AVC msg=audit(1404872081.435:1006): avc:  denied  { create } for  pid=1298 comm="ln" name="CS.cfg.bak" scontext=system_u:system_r:pki_tomcat_t:s0 tcontext=system_u:object_r:pki_tomcat_etc_rw_t:s0 tclass=lnk_file
        Was caused by:
                Missing type enforcement (TE) allow rule.

                You can use audit2allow to generate a loadable module to allow this access.

Each message gives the user, operation and labels of resources involved in the denied operation, and the cause of the denial. It also suggests using audit2allow(1) to generate the rules that would allow the failed operations. Running sudo audit2allow -a gave the following output:

#============= pki_tomcat_t ==============

#!!!! This avc is a constraint violation.  You would need to modify the attributes of either the source or target types to allow this access.
#Constraint rule:
        constrain file { create relabelfrom relabelto } ((u1 eq u2 -Fail-)  or (t1=pki_tomcat_t  eq TYPE_ENTRY -Fail-) { POLICY_SOURCE: can_change_object_identity } ); Constraint DENIED

#       Possible cause is the source user (system_u) and target user (unconfined_u) are different.
allow pki_tomcat_t pki_tomcat_etc_rw_t:file create;
allow pki_tomcat_t pki_tomcat_etc_rw_t:file { relabelfrom relabelto };
allow pki_tomcat_t pki_tomcat_etc_rw_t:lnk_file { read create };
allow pki_tomcat_t self:process setfscreate;

I have no idea about the meanings of the warning and the constrain rule, but the other rules make more sense. In particular, the second-last rule is undoubtedly the one that will allow the creation of symbolic links. Without knowing the specifics of this rule format, I would interpret this line as,

Allow processes with the pki_tomcat_t attribute to create and read symbolic links in areas (of the filesystem) with the pki_tomcat_etc_rw_t attribute.

Admittedly, I have inferred processes and filesystem above, in no small part due to the names pki_tomcat_t and pki_tomcat_etc_rw_t, which were probably chosen by the Dogtag developers. Nevertheless, the rule format seems to do a satisfactory job of communicating the meaning of a rule, especially when descriptive labels are used.
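
That reading matches the general shape of a type enforcement rule: a source type (the domain of the process), a target type, an object class after the colon, and one or more permissions. A minimal sketch of splitting a rule into those parts follows; parse_allow is a hypothetical helper that handles only the simple form audit2allow printed above, not the full policy language.

```python
import re

ALLOW_RE = re.compile(
    r"allow\s+(?P<source>\w+)\s+(?P<target>\w+):(?P<tclass>\w+)\s+"
    r"(?:\{(?P<many>[^}]*)\}|(?P<one>\w+))\s*;"
)


def parse_allow(rule):
    """Split a simple 'allow' rule into source, target, class and permissions."""
    match = ALLOW_RE.fullmatch(rule.strip())
    if match is None:
        raise ValueError("not a simple allow rule: %r" % rule)
    many, one = match.group("many"), match.group("one")
    # Permissions appear either as a braced list or as a single bare word.
    perms = (many if many is not None else one).split()
    return {
        "source": match.group("source"),
        "target": match.group("target"),
        "tclass": match.group("tclass"),
        "perms": perms,
    }


rule = parse_allow(
    "allow pki_tomcat_t pki_tomcat_etc_rw_t:lnk_file { read create };"
)
print(rule)
```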

The Fix

The SELinux policies that permit Dogtag to manage its affairs (configuration, logging, etc.) on a Fedora system are not shipped in the pki-* packages, but rather in the selinux-policy-targeted package, which provides policies for Dogtag and many other network servers and programs.

For an issue in this package to be corrected, one has to file a bug against the selinux-policy-targeted component of the Fedora product on the Red Hat Bugzilla. A reference policy should be attached to the bug report; audit2allow will generate one when invoked with the -R or --reference argument.

% sudo audit2allow -R -i /var/log/audit/audit.log > pki-lnk_file.te
could not open interface info [/var/lib/sepolgen/interface_info]

This failed, but a web search soon revealed that the appropriate interface is generated by the sepolgen-ifgen command, which is provided by the policycoreutils-devel package.

% sudo yum install -y policycoreutils-devel
% sudo sepolgen-ifgen
% sudo audit2allow -R -i /var/log/audit/audit.log > pki-lnk_file.te
% cat pki-lnk_file.te

require {
        type pki_tomcat_etc_rw_t;
        type pki_tomcat_t;
        class process setfscreate;
        class lnk_file { read create };
        class file { relabelfrom relabelto create };
}

#============= pki_tomcat_t ==============

#!!!! This avc is a constraint violation.  You would need to modify the attributes of either the source or target types to allow this access.
#Constraint rule:
        constrain file { create relabelfrom relabelto } ((u1 eq u2 -Fail-)  or (t1=pki_tomcat_t  eq TYPE_ENTRY -Fail-) { POLICY_SOURCE: can_change_object_identity } ); Constraint DENIED

#       Possible cause is the source user (system_u) and target user (unconfined_u) are different.
allow pki_tomcat_t pki_tomcat_etc_rw_t:file create;
allow pki_tomcat_t pki_tomcat_etc_rw_t:file { relabelfrom relabelto };
allow pki_tomcat_t pki_tomcat_etc_rw_t:lnk_file { read create };
allow pki_tomcat_t self:process setfscreate;

With pki-lnk_file.te in hand, I filed a bug. Hopefully the package will be updated soon.

Conclusion

When I first ran into this issue, I had very little experience with SELinux. I now know a fair bit more than I used to – how to quickly determine whether SELinux is responsible for a given failure, and what the operations were that failed – but there is much more to learn about the workings of SELinux and the definition and organisation of policies.

As to the occurrence of the problem itself, whilst from a security standpoint it makes sense to separate the granting of privileges to software from the provision of that software, as a developer, it frustrated me that I had to submit a request to another team responsible for a different aspect of Fedora just for Dogtag to be able to create a symbolic link in its own configuration directory!

This arrangement of having the policies for myriad common servers and programs provided centrally by one or two packages is new to me. There are obvious merits to this approach – and obvious drawbacks. Perhaps there is another approach that represents the best of both worlds – security for the user, and convenience or lack of roadblocks for the developer. Perhaps I am talking about containers, à la Docker.

In the meantime, until the selinux-policy-targeted package is updated to add the symbolic link rules Dogtag needs, with SELinux still in permissive mode on my development VM, I can get on with the job of implementing LDAP profile storage in Dogtag.

July 08, 2014

Audit Belongs with Policy

Policy in OpenStack is the mechanism by which Role-Based-Access-Control is implemented. Policy is distributed in rules files which are processed at the time of a user request. Audit has come to mean the automated emission and collection of events used for security review. The two processes are related and need a common set of mechanisms to build a secure and compliant system.

This is a little rough, but I promised I would sum up our discussion.

Why Unified

The policy enforces authorization decisions. These decisions need to be audited.

Assume that both policy and audit are implemented as middleware pipeline components. If policy happens before audit, then denied operations would not emit audit events. If policy happens after audit, the audit event does not know if the request was successful or not. If audit and policy are not unified, the audit event does not know what rule was actually applied for the authorization decision.
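
To make the argument concrete, here is a minimal sketch of a unified enforcer, where the audit event is emitted from inside the policy decision and so can record both the outcome and the rule that was applied. The class name, the rule format (action mapped to a set of roles) and the event fields are all invented for illustration; this is not the Oslo policy or Keystone API.

```python
from dataclasses import dataclass, field


@dataclass
class AuditingEnforcer:
    """Toy policy enforcer that audits every decision it makes."""

    rules: dict                      # action -> set of roles allowed to do it
    events: list = field(default_factory=list)

    def enforce(self, action, roles):
        required = self.rules.get(action)
        # Deny by default when no rule covers the action.
        allowed = required is not None and not required.isdisjoint(roles)
        # Emitting here means denials are audited too, and the event
        # knows exactly which rule produced the decision.
        self.events.append({
            "action": action,
            "rule": sorted(required) if required is not None else None,
            "outcome": "allowed" if allowed else "denied",
        })
        return allowed


enforcer = AuditingEnforcer({"identity:delete_user": {"admin"}})
enforcer.enforce("identity:delete_user", ["member"])
enforcer.enforce("identity:delete_user", ["admin", "member"])
print(enforcer.events)
```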

Current Status

Rob Basham, Matt Rutkowski, Brad Topol, and Gordon Chung presented on a middleware Auditing implementation at the Atlanta Summit.

Tokens are unpacked by a piece of code called keystonemiddleware.auth_token. (Actually, it’s in keystoneclient at the moment, but it is moving.)

Auth token middleware today does too much. It unpacks tokens, but it also enforces policy on them; if an API pipeline calls into auth_token, the absence of a token will trigger the return of a ‘401 Unauthorized’.

The first step to handling this is that a specific server can set ‘delay_auth_decision = True’ in the config file, and then no policy is enforced, but the decision is instead deferred until later.
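
The difference between the two modes can be sketched as a small WSGI filter. This is a toy model, not the auth_token code: make_auth_filter and validate_token are hypothetical names, and the use of an X-Identity-Status value of Confirmed or Invalid to pass the result downstream is stated here as my understanding of the real middleware rather than taken from the discussion.

```python
def make_auth_filter(app, validate_token, delay_auth_decision=False):
    """Wrap a WSGI app with a toy token check (illustrative only)."""

    def middleware(environ, start_response):
        token = environ.get("HTTP_X_AUTH_TOKEN")
        confirmed = token is not None and validate_token(token)

        if not confirmed and not delay_auth_decision:
            # Default behaviour: the middleware itself rejects the request.
            start_response("401 Unauthorized",
                           [("Content-Type", "text/plain")])
            return [b"Authentication required"]

        # Deferred behaviour: record the result and let a later stage
        # (for example, policy enforcement) make the decision.
        environ["HTTP_X_IDENTITY_STATUS"] = (
            "Confirmed" if confirmed else "Invalid"
        )
        return app(environ, start_response)

    return middleware
```

With the decision deferred, the downstream application always runs and can apply policy, and emit audit events, for unauthenticated requests as well.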

Currently, policy enforcement is performed on a per-project basis. The Keystone code that enforces policy starts with a decorator defined here in the Icehouse codebase:

http://git.openstack.org/cgit/openstack/keystone/tree/keystone/common/controller.py?h=stable/icehouse#n87

The Glance code base uses this code:

http://git.openstack.org/cgit/openstack/glance/tree/glance/api/policy.py?h=stable/icehouse

Nova uses:

http://git.openstack.org/cgit/openstack/nova/tree/nova/policy.py?h=stable/icehouse

And the other projects are comparable. This has several implications. Probably the most significant is that policy implementation can vary from project to project, making the administrator’s life difficult.

Deep object inspection

What is different about Keystone’s implementation? It has to do with the ability to inspect objects out of the database before applying policy. If a user wants to read, modify, or delete an object, they only provide the ID to the remote server. If the server knows the project ID of the object, it can apply policy. But that information is not in the request, so the server needs to find which project owns the object. The decorator @controller.protected contains the flag get_member_from_driver, which fetches the object prior to enforcing the policy.

Nova buries the call to enforce deeper inside the controller method.
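
The fetch-then-enforce pattern can be sketched as a decorator. The names below (protected, get_member, enforce) and their signatures are invented for this illustration and do not match Keystone's actual @controller.protected.

```python
import functools


def protected(get_member, enforce):
    """Build a decorator that loads the target object before checking policy."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(context, object_id):
            # The request only carries an ID, so look the object up first
            # to learn which project owns it.
            target = get_member(object_id)
            if not enforce(context, method.__name__, target):
                raise PermissionError(
                    "%s denied on %s" % (method.__name__, object_id)
                )
            return method(context, object_id)

        return wrapper

    return decorator


# Hypothetical backing store and policy: callers may only touch objects
# that belong to their own project.
DB = {"u1": {"project_id": "p1"}}


def same_project(context, action, target):
    return context["project_id"] == target["project_id"]


@protected(DB.__getitem__, same_project)
def get_user(context, object_id):
    return DB[object_id]
```

Placing the lookup in the decorator keeps the check at the edge of the controller, in contrast to calling enforce from inside the method body.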

Path forward

Cleaning up policy implementation

  • Clean up the keystone server implementation of policy. The creation of https://github.com/openstack/keystone/blob/master/keystone/common/authorization.py was a start, but the code called from the decorators in controller.py that knows how to deal with the token data needs to be pulled into authorization.py as well.
  • Move authorization.py into keystonemiddleware.
  • Make the keystone server use the middleware implementation.
  • Convert the other projects to use the middleware implementation.
  • Convert the other projects to use “delay_auth_decision” so this can eventually be the default.

Audit Middleware

  • Put the audit middleware into Keystone middleware as-is. This lets people use audit immediately.
  • Extract the logic from the middleware into code that can be called from policy enforcement.
  • Create a config option to control the emission of audit events from policy enforcement.
  • Remove the audit middleware from the API paste config files and enable the config option for policy emission of events.

Why not Oslo-policy?

Oslo policy is a general purpose rules engine. Keeping that separate from the OpenStack RBAC specific implementation is a good separation of concerns. Other parts of OpenStack may have needs for a policy/rules enforcement that is completely separate from RBAC. Firewall configs in Neutron are the obvious first place.

 

July 07, 2014

Threats: Sphinx the Script Kiddie

Sphinx the Script Kiddie

Unlike Igor, Sphinx doesn’t have deep skills or knowledge. But he does have access to very powerful cracking toolkits that other people have developed and to people who can provide guidance and answer questions. This makes him far more dangerous than he would be if he had to rely on his own skills.

Most of the “hackers” (actually “crackers”) out there are actually like Sphinx. He may do everything from defacing web sites to identity theft and credit card fraud. In many cases he will be looking for targets of opportunity, rather than going after a specific system. He tends to use his cracking toolkits to probe every system he can find, looking for unsecured systems and common security flaws.

Much of your security strategy should be designed for Sphinx. There are a lot of them out there and they can do a lot of damage.


July 02, 2014

LMIShell on RHEL 7

Someone reported that they were having problems using LMIShell on a RHEL 7 system – they didn’t have any of the friendly commands that we have been talking about. And they were right; the full set of LMIShell scripts that provide the friendly CLI experience are not automatically installed on RHEL 7.

LMIShell on RHEL 7 is a special case – the LMIShell infrastructure is included in RHEL 7, but many of the scripts that make LMIShell easy to use are not packaged directly in RHEL 7. Instead, they are delivered through EPEL – the Extra Packages for Enterprise Linux. To effectively use LMIShell on a RHEL 7 system you need to install the EPEL repository and then install the OpenLMI Scripts from it.

One of the key characteristics of RHEL is the stability of interfaces. The OpenLMI API is stable, which allows us to include OpenLMI infrastructure and Providers in RHEL 7.

The LMIShell scripts, on the other hand, are rapidly evolving and changing. This is by design – we want the scripts to be useful, and we encourage people to modify and extend them, and hopefully to submit their changes back upstream. This is a general characteristic of system management scripts; many of them change and evolve over time.

To install the full set of LMIShell scripts on a RHEL 7 system, first install the EPEL repository by going to http://mirror.pnl.gov/epel/beta/7/x86_64/repoview/epel-release.html, downloading the package, and installing it. This will configure your system to install packages from the EPEL for RHEL 7 repository.

Next, install LMIShell with the scripts:

# yum install 'openlmi-scripts*'

This will install the LMIShell framework from RHEL 7 and all the LMIShell scripts from the EPEL repository. If you have already installed LMIShell it will simply install the scripts from EPEL.

To verify that the LMIShell scripts have been installed, issue the command “lmi help”. If you see a list of commands such as hwinfo, net, and storage, then the scripts are installed. You might also try “lmi hwinfo”, which will display information on the system and hardware configuration.


Wanted: A small crew for working on security bugs in Fedora

Do you hate security vulnerabilities?

Do you want to help make Fedora more secure?

Do you have a little extra time in your week to do a little work (no coding required)?

If you answered yes to the questions above I want you for a beta test of an idea I have to help make Fedora more secure.  I’m looking for just a few people (maybe five) to sort through security bugs and work with upstream and packagers to get patches or new releases into Fedora and help make everyone’s computing experience a little safer.  If you’re interested please contact me (sparks@fedoraproject.org 0x024BB3D1) and let me know you’re interested.


It’s all a question of time – AES timing attacks on OpenSSL

This blog post is co-authored with Andy Polyakov from the OpenSSL core team.

Advanced Encryption Standard (AES) is the most widely used symmetric block cipher today. Its use is mandatory in several US government and industry applications. Among the commercial standards, AES is a part of SSL/TLS, IPSec, 802.11i, SSH and numerous other security products used throughout the world.

Ever since the inclusion of AES as a federal standard via FIPS PUB 197, and even before that when it was known as Rijndael, there have been several attempts to cryptanalyze it. However, most of these attacks have not gone beyond the academic papers they were written in. One worth mentioning at this point is the key recovery attacks on AES-192/AES-256. A second angle to this is attacks on AES implementations via side channels. A side-channel attack exploits information which is leaked through physical channels such as power consumption, noise or timing behaviour. In order to observe such behaviour the attacker usually needs to have some kind of direct or semi-direct control over the implementation.

There has been some interest in side-channel attacks on the way OpenSSL implements AES. I suppose OpenSSL is chosen mainly because it’s the most popular cross-platform cryptographic library used on the internet. Most Linux/Unix web servers use it, along with tons of closed source products on all platforms. The earliest such attack dates back to 2005, and the recent ones are about cross-VM cache-timing attacks on the OpenSSL AES implementation described here and here. These are more alarming, mainly because with applications/data moving into the cloud, recovering AES keys from a cloud-based virtual machine via a side-channel attack could mean complete failure for the code.

After doing some research on how AES is implemented in OpenSSL there are several interesting facts which have emerged, so stay tuned.

What are cache-timing attacks?

Cache memory is random access memory (RAM) that a microprocessor can access more quickly than it can access regular RAM. As the microprocessor processes data, it looks first in the cache memory and if it finds the data there (from a previous reading of data), it does not have to do the more time-consuming reading of data from larger memory. Just like all other resources, cache is shared among running processes for efficiency and economy. This may be dangerous from a cryptographic point of view, as it opens up a covert channel, which allows a malicious process to monitor the use of these caches and possibly indirectly recover information about the input data, by carefully noting some timing information about its own cache accesses.

A particular kind of attack called the flush+reload attack works by forcing data in the victim process out of the cache, waiting a bit, then measuring the time it takes to access the data. If the victim process accesses the data while the spy process is waiting, it will get put back into the cache, and the spy process’s access to the data will be fast. If the victim process doesn’t access the data, it will stay out of the cache, and the spy process’s access will be slow. So, by measuring the access time, the spy can tell whether or not the victim accessed the data during the wait interval. All this works on the premise that the data is shared between victim and adversary.
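
The logic of one flush+reload round can be modelled without any real hardware. The sketch below is a toy simulation: the cache is a Python set, the access costs are made-up numbers, and the class and function names are invented. A real attack measures actual memory access latency on pages shared with the victim.

```python
FAST, SLOW = 1, 100  # made-up access costs: cache hit vs. cache miss


class SharedCache:
    """Toy cache shared by the spy and the victim."""

    def __init__(self):
        self.lines = set()

    def flush(self, line):
        self.lines.discard(line)

    def access(self, line):
        cost = FAST if line in self.lines else SLOW
        self.lines.add(line)  # any access brings the line into the cache
        return cost


def spy_round(cache, line, victim):
    """Return True if the victim touched `line` during the wait interval."""
    cache.flush(line)                  # 1. flush the monitored line
    victim(cache)                      # 2. wait while the victim runs
    return cache.access(line) == FAST  # 3. reload and "time" the access


def victim(cache):
    cache.access(3)  # stand-in for a table lookup that depends on a secret


cache = SharedCache()
observed = [line for line in range(8) if spy_round(cache, line, victim)]
print(observed)
```

The spy learns which line the victim used without ever reading the victim's data, only by observing its own access times.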

Note that we are not talking about the secret key being shared, but effectively public data, specifically the lookup tables discussed in the next section.

Is AES implementation in OpenSSL vulnerable to cache-timing attacks?

Any cipher relying heavily on S-boxes may be vulnerable to cache-timing attacks. The processor optimizes execution by loading these S-boxes into the cache so that concurrent accesses/lookups will not need to load them from main memory. Textbook implementations of these ciphers do not use constant-time lookups when accessing the data from the S-boxes and, worse, each lookup depends on a portion of the secret encryption key. AES-128, as per the standard, requires 10 rounds, each of which involves 16 S-box lookups.

The Rijndael designers proposed a method which results in fast software implementations. The core idea is to merge the S-box lookup with another AES operation by switching to larger pre-computed tables. There are still 16 table lookups per round. These 16 lookups are customarily split across 4 tables, so that there are 4 lookups per table per round. Each table consists of 256 32-bit entries. These are referred to as T-tables, and in the case of the current research, the way these are loaded into the cache leads to timing leakage. The leakage, as described in the paper, is quantified by the probability of a cache line not being accessed as a result of a block operation. As each lookup table, be it an S-box or a pre-computed T-table, consists of 256 entries, the probability is (1-n/256)^m, where n is the number of table elements accommodated in a single cache line, and m is the number of references to the given table per block operation. The smaller the probability, the harder it is to mount the attack.

Aren’t cache-timing attacks local? How is a virtualized environment affected?

Enter KSM (Kernel SamePage Merging). KSM enables the kernel to examine two or more already running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page. This page is then marked copy on write. If the contents of the page are modified by a guest virtual machine, a new page is created for that guest virtual machine. This means that cross-VM cache-timing attacks would now be possible. You can stop KSM or modify its behaviour. Some details are available here.

You did not answer my original question, is AES in OpenSSL affected?

In short, no. But not to settle for easy answers, let’s have a close look at how AES in OpenSSL operates. In fact there are several implementations of AES in the OpenSSL codebase and each one of them may or may not be chosen based on specific run-time conditions. Note: all of the discussion here is about OpenSSL version 1.0.1.

  • Intel Advanced Encryption Standard New Instructions, or AES-NI, is an extension to the x86 instruction set for Intel and AMD machines used since 2008. Intel processors from Westmere onwards and AMD processors from Bulldozer onwards have support for this. The purpose of AES-NI is to allow AES to be performed by dedicated circuitry; no cache is involved here, and hence it’s immune to cache-timing attacks. OpenSSL uses AES-NI by default, unless it’s disabled on purpose. Some hypervisors mask the AES-NI capability bit, which is customarily done to make sure that the guests can be freely migrated within a heterogeneous cluster/farm. In those cases OpenSSL will resort to other implementations in its codebase.
  • If AES-NI is not available, OpenSSL will either use Vector Permutation AES (VPAES) or Bit-sliced AES (BSAES), provided the SSSE3 instruction set extension is available. SSSE3 was first introduced in 2006, so there is a fair chance that this will be available in most computers used. Both of these techniques avoid data- and key-dependent branches and memory references, and therefore are immune to known timing attacks. VPAES is used for CBC encrypt, ECB and “obscure” modes like OFB, CFB, while BSAES is used for CBC decrypt, CTR and XTS.
  • In the end, if your processor does not support AES-NI or SSSE3, OpenSSL falls back to integer-only assembly code. Unlike widely used T-table implementations, this code path uses a single 256-byte S-box. This means that the probability of a cache line not being accessed as a result of a block operation would be (1-64/256)^160 ≈ 1e-20. “Would be” means that the actual probability is even lower, in fact zero, because the S-box is fully prefetched, and this is done in every round.
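
The (1-n/256)^m expression is easy to evaluate for the two table layouts discussed above. The sketch below assumes 64-byte cache lines, which gives 16 four-byte T-table entries or 64 one-byte S-box entries per line, and for the T-table case it assumes 4 lookups per table in each of 10 rounds. Those parameters, and the function name, are assumptions for illustration rather than figures from the research.

```python
def p_line_untouched(entries_per_line, lookups, table_entries=256):
    """Probability that one given cache line of a table is never accessed
    during a block operation, assuming uniformly distributed lookups."""
    return (1 - entries_per_line / table_entries) ** lookups


# T-table layout: 16 entries per 64-byte line, 4 lookups x 10 rounds.
t_table = p_line_untouched(entries_per_line=16, lookups=40)

# Single 256-byte S-box: 64 entries per line, 16 lookups x 10 rounds.
s_box = p_line_untouched(entries_per_line=64, lookups=160)

print("T-table: %.3g" % t_table)
print("S-box:   %.3g" % s_box)
```

Under these assumptions the T-table figure comes out at several percent, so an untouched line is a common and observable event, while the S-box figure is on the order of 1e-20, which agrees with the value quoted above.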

For completeness’ sake it should be noted that OpenSSL does include a reference C implementation which has no mitigations against cache-timing attacks. This is platform-independent fall-back code that is used on platforms with no assembly modules, as well as in cases when the assembler fails for some reason. On a side note, OpenSSL maintains a really minimal assembler requirement for AES-NI and SSSE3; in fact the code can be assembled on Fedora 1, even though support for these instructions was added later.

The bottom line is that if you are using a Linux distribution which comes with OpenSSL binaries, there is a very good chance that the packagers have taken pains to ensure that the reference C implementation is not compiled in. (The same thing would happen if you download the OpenSSL source code and compile it.)

It’s not clear from the research paper how the researchers were able to conduct the side channel attack. All evidence suggests that they ended up using the standard reference C implementation of AES instead of assembly modules which have mitigations in place. The researchers were contacted but did not respond to this point. Anyone using an OpenSSL binary they built themselves using the defaults, or precompiled as part of a Linux distribution, should not be vulnerable to these attacks.

June 30, 2014

Threats: Igor the Hacker

Now that we’ve taken a look at what some of the threats are, let’s look at who might be behind these threats. One goal is to determine who the greatest threat is. You may be surprised…

Igor the Hacker

Igor is who you think of when someone says “hacker”. True hackers have always been skilled. Igor is very skilled and is in it for the money. He may have the backing of considerable resources from criminal organizations or even from state entities.

There are two ways Igor may be after you. If he is building a zombie botnet for spam and DDoS attacks he will be looking for systems that are easy to take over. Normal security precautions should provide a good defense.

On the other hand, if you have assets that Igor is after, you have a real problem. Almost no level of security will be enough to stop him. And he won’t stop with computer attacks; social engineering is one of his most powerful tools. In some cases he may even resort to physical penetration to get to your systems.

Fortunately, there aren’t that many Igors around. You can’t build a security strategy around nothing but stopping Igor – it isn’t cost effective and truly hardened systems are often difficult to use. We will examine how a defense in depth approach can be used to manage Igor.

(Note: Igor is actually a cracker, not a hacker. A hacker is someone with deep computer skills who makes computers do amazing things. It describes someone with exceptional knowledge and skills. Unfortunately, hacker has been hijacked by the media to refer to criminal crackers…)


June 22, 2014

Signing PGP keys

If you’ve recently completed a key signing party or have otherwise met up with other people and have exchanged key fingerprints and verified IDs, it’s now time to sign the keys you trust.  There are several different ways of completing this task and I’ll discuss two of them now.

caff

CA Fire and Forget (caff) is a program that allows you to sign a bunch of keys (like you might have after a key signing party) very quickly.  It also adds a level of security to the signing process by forcing the other person to verify that they have both control over the email address provided and the key you signed.  The way caff does this is by encrypting the signature in an email and sending it to the person.  The person who receives the message must also decrypt the message and apply the signature themselves.  Once they sync their key with the key server the new signatures will appear for everyone.

$ gpg --keyserver hkp://pool.sks-keyservers.net --refresh-key

There is some setup of caff that needs to be done beforehand, but once you have it set up it’ll be good to go.

Installing caff

Installing caff is pretty easy although there might be a little trick.  In Fedora there isn’t a caff package.  Caff is actually in the pgp-tools package; other distros may have this named differently.

Using caff

Once you have caff installed and set up, you just need to tell caff what key IDs you would like to sign.  “man caff” will give you all the options but basically ‘caff -m no yes -u ‘ will sign all the keys listed after your key.  You will be asked to verify that you do want to sign the key and then caff will sign the key and mail it off.  The user will receive an email, per user ID on the key, with instructions on importing the signature.

Signing a key with GnuPG

The other way of signing a PGP key is to use GnuPG.  Signing a key this way will simply add the signature to the key you have locally and then you’ll need to send those keys out to the key server.

Retrieving keys using GnuPG

The first thing that you have to do is pull the keys down from the keyserver.

$ gpg --keyserver hkp://pool.sks-keyservers.net --recv-keys ...

Once you have received all the keys you can then sign them.  If someone’s key is not there you should probably contact them and ask them to add their key to the servers.  If they already have uploaded their key, it might take a couple of hours before it is sync’d everywhere.

Using GnuPG

Signing a key is pretty straightforward:

$ gpg --sign-key 1bb943db
pub 1024D/1BB943DB created: 2010-02-02 expires: never usage: SC 
 trust: unknown validity: unknown
sub 4096g/672557E6 created: 2010-02-02 expires: never usage: E 
[ unknown] (1). MariaDB Package Signing Key <package-signing-key@mariadb.org>
[ unknown] (2) Daniel Bartholomew (Monty Program signing key) <dbart@askmonty.org>
Really sign all user IDs? (y/N) y
pub 1024D/1BB943DB created: 2010-02-02 expires: never usage: SC 
 trust: unknown validity: unknown
 Primary key fingerprint: 1993 69E5 404B D5FC 7D2F E43B CBCB 082A 1BB9 43DB
MariaDB Package Signing Key <package-signing-key@mariadb.org>
 Daniel Bartholomew (Monty Program signing key) <dbart@askmonty.org>
Are you sure that you want to sign this key with your
key "Eric Harlan Christensen <eric@christensenplace.us>" (024BB3D1)
Really sign? (y/N) y

In the example I signed the MariaDB key with my key.  Once that is complete a simple:

gpg --keyserver hkp://pool.sks-keyservers.net --send-key 1BB943DB

…will send the new signature to the key servers.


June 19, 2014

Using OpenLMI to join a machine to a FreeIPA domain

Stephen Gallagher has published an article on how to use OpenLMI to join a FreeIPA domain. The article is available on his blog at sgallagh.wordpress.com

As Stephen notes:

“Traditionally, enrolling a system has been a “pull” operation, where an admin signs into the system and then requests that it be added to the domain. However, there are many environments where this is difficult, particularly in the case of large-scale datacenter or cloud deployments. In these cases, it would be much better if one could script the enrollment process.”

He covers how to use OpenLMI to update DNS, install the IPA client software, and finally join a domain. While he shows how to do these steps interactively, they can also be scripted to fully automate the process.

Good stuff, and quite simple to do.


June 18, 2014

OpenSSL Privilege Separation Analysis

As part of the security response process, Red Hat Product Security reviews the information we obtain so that future endeavors, such as source code auditing, can be aligned with where problems occur, in an attempt to prevent repeats of previous issues.

Private key isolation

When Heartbleed was first announced, a patch was proposed to store private keys in isolated memory, surrounded by an unreadable page. The idea was that the process would crash due to a segmentation violation before the private key memory was read.

However, it was quickly pointed out that the proposed patch was flawed. It did not store the private keys in the isolated memory space, and the contents of memory accessible by Heartbleed could still contain information that can be used to quickly reconstruct the private key.

The lesson learned here was that an audit of how and where private keys can be accessed, and where useful information is stored, should be undertaken to identify any potential weaknesses in the approach. Additionally, testing and verifying results would have identified that the private keys were not located in memory surrounded by unreadable memory pages.

Private key privilege separation

The idea behind private key privilege separation is to reduce the risk of an equivalent Heartbleed-style memory leak vulnerability. This can be implemented by using an application in front of the end service being protected or be implemented in the target application itself.

One example of using an application in front of the service being protected is Titus.  This application runs a separate process per TLS connection and stores the private key in another process. This helps prevent Heartbleed-style bugs from leaking private keys and other information about application state. The per-connection process model also protects against information from other connections being leaked or affected.

One drawback of the current implementation in Titus is that it fork()s and doesn’t execve() itself. If there are any memory corruption vulnerabilities present in Titus or OpenSSL, this makes writing an exploit against the target far easier than it could have been, and it potentially leaves useful information in memory that can be obtained later on.

Additionally, depending on how the chroot directories are set up, there may not be devices such as /dev/urandom available, which reduces the possible entropy sources available to OpenSSL.

Another approach is to implement the private key privilege separation in the process itself, which is what some of the OpenBSD software has started to do. The aim is that, while it won’t protect against OpenSSL vulnerabilities in and of itself, it will help restrict private keys from being leaked.

Privilege-separated OpenSSL

Sebastian Krahmer wrote an OpenSSL Privilege Separation (sslps) proof of concept which uses the Linux Kernel Secure Computing (seccomp) interface to isolate OpenSSL from the lophttpd process. This effectively reduces the available system calls that OpenSSL itself makes.

This has the advantage that if there is a memory corruption or arbitrary code execution vulnerability present in OpenSSL an attacker requires a further kernel vulnerability present either in the allowed system calls or in the lophttpd IPC mechanism to gain access.

Another possibility is that the attacker is happy to sit in the restricted OpenSSL process and monitor the SSL_read and SSL_write traffic, potentially gaining access to the private keys in memory.

While the current version of sslps doesn’t mitigate against Heartbleed-style memory leaking the private key, it helps make an attacker’s job harder in a memory corruption or arbitrary code execution vulnerability situation in OpenSSL.

It will be interesting to see if the OpenSSL or LibreSSL developers investigate using privilege separation or sandboxing in the future and what approaches are taken to implement them.

Hardware

One approach to help restrict compromises from software is to store the private keys elsewhere to prevent key compromise. One such approach is using a Hardware Security Module (HSM) to handle key generation, encryption, and signing. We may discuss using HSMs in the future.

It is also possible to use a Trusted Platform Module (TPM) to provide key generation, storage, encryption, and signing with OpenSSL, but this approach may be too slow for non-client side consideration.

Designing a new approach

Having laid out what’s available, a rough draft of an idealized approach for hardening SSL processing can now be made.

First, the various private keys should be isolated from the main processing of SSL traffic. This will help reduce the impact of Heartbleed-style memory leaks, which makes the attacker’s job of getting the private keys harder.

Second, the SSL traffic processing should be isolated from the application itself. This helps restrict the impact of bugs in OpenSSL from affecting the rest of the application and system to the maximum possible extent.

Lastly, use existing kernel features, such as executing a new process to have address space randomization and stack cookie values reapplied, as this helps reduce the amount of information available to attack other processes. Additionally, features such as seccomp could be used to restrict what the private key process and the SSL traffic process can do, which in turn helps restrict the attack surface available to a process. Furthermore, it may be possible to utilize mandatory access control (MAC) systems, such as SELinux, to further contain and restrict the processes involved.
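To make the first point more concrete, here is a minimal sketch of keeping a secret in a separate process that is started by executing a fresh interpreter rather than by a bare fork. It is not how Titus, sslps, or OpenSSL work: the HMAC is only a stand-in for a private key operation, the line-based protocol is invented for illustration, and the names KEY_HOLDER and KeyHolder are hypothetical.

```python
import subprocess
import sys

# Program run by the child. It creates its own secret, so the parent never
# holds it; the parent only ever sees MACs and ok/bad answers.
# Assumption: HMAC-SHA256 stands in for a real private key operation.
KEY_HOLDER = r'''
import hashlib, hmac, os, sys
secret = os.urandom(32)
for line in sys.stdin:
    parts = line.split()
    if len(parts) < 2:
        continue
    mac = hmac.new(secret, bytes.fromhex(parts[1]), hashlib.sha256).hexdigest()
    if parts[0] == "sign":
        print(mac, flush=True)
    elif parts[0] == "verify" and len(parts) == 3:
        print("ok" if hmac.compare_digest(mac, parts[2]) else "bad", flush=True)
'''


class KeyHolder:
    """Talks to the key-holding child over pipes (messages must be non-empty)."""

    def __init__(self):
        # A fresh interpreter is executed, so the child gets its own
        # address space layout instead of a copy of the parent's.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", KEY_HOLDER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def _ask(self, request: str) -> str:
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def sign(self, message: bytes) -> str:
        return self._ask("sign " + message.hex())

    def verify(self, message: bytes, mac: str) -> bool:
        return self._ask("verify %s %s" % (message.hex(), mac)) == "ok"

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()
```

A memory leak in the parent can then disclose messages and MACs, but not the secret itself. Restricting the child further with seccomp or SELinux, as described above, is outside the scope of this sketch.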

Potential Pitfalls

Implementing all of the above may introduce some backwards compatibility issues. An example to consider is an application that utilizes chroot() and can no longer access the executables required to implement an idealized approach. Perhaps it might be feasible to implement a fallback to a fork()-based mechanism.

There is other functionality that may be adversely affected by such restrictions, such as looking up server and client certificate validity, and this would require proper in-depth analysis. Some API incompatibilities could also get in the way.

It’s possible that the IPC mechanisms would introduce some performance impact, but overhead would be dwarfed by the cryptographic processing side, and actually may not be measurable. It may be possible to reduce the amount of overhead with some compromise of security by using shared pages, or page migration between processes to reduce the data copying aspect of IPC, and just have the IPC mechanism used for message passing.

Conclusion

We’ve covered currently existing approaches and drawn up a rough list of idealized features that would be required to help reduce the current attack surface of OpenSSL. These features would make an attacker’s job harder in compromising private keys and compromising applications that use OpenSSL. A follow-up post may look at using an OpenSSL engine to move the private key from the application itself into another process, to prevent Heartbleed-style memory leaks from disclosing the private keys.

June 17, 2014

Threats

Let’s shift back to a security discussion and take a look at threats. Any intelligent discussion of threats starts out by looking at what you are protecting, how it can be threatened, and the impact if one of the threats actually occurs. Let’s take a look at some threats:

Defacing a Web Site

In the past this has been one of the most common and visible “hacker” threats. If you have a simple “brochure ware” site, the most reasonable approach may be to simply have a good backup you can restore. If, for example, you have a DreamWeaver site, you might simply mutter something appropriate under your breath and hit the button to republish the site.

On the other hand, if you have an ecommerce site that your company depends on… This site is obviously important and must be protected.

This is an example of considering exposure, impact and cost. You shouldn’t spend too much to protect the “brochure ware” site. You shouldn’t spend too little to protect the ecommerce site. You should do the analysis of what is appropriate!

Using a System for Other Purposes

Having your system hijacked and turned into a zombie spewing malware and spam is a bad thing. In addition to the direct impact on the system, this is likely to get your whole domain blacklisted and effectively kicked off the internet. Consider both the direct and indirect impact of someone taking over your system – this is worth defending against.

Stealing Data

Data theft can be catastrophic. The cost can go far beyond the direct costs – just ask Target about their credit card breach!

From the computer side, protecting data requires solid access controls, encryption, and operational controls. But you should ask some other questions: Why do you have that data at all? Do you need to store the data? Which computers actually need access to that data? In a surprising number of cases you may not actually need that data at all! As a simple example, don’t store passwords – store password hashes! Properly salted, of course…
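As a minimal sketch of what "salted hashes" can look like, the following uses PBKDF2 from the Python standard library. The iteration count and the function names are assumptions chosen for illustration, not a recommendation from this post; pick the work factor to suit your own hardware and policy.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only.
ITERATIONS = 200_000


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest); a fresh random salt is created when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest are stored. Because each password gets its own salt, two users with the same password end up with different digests.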

Changing Data

This can be very serious. In many cases the absolute worst thing that can happen is for data to be changed. This can mean that none of the data can be trusted. Depending on the data and the change, this can range from a nuisance to life threatening. We will dig into this topic in more detail in the future.

Data Destruction

Data destruction can be malicious or accidental. What will you do if a disk drive crashes? What will you do if someone – maybe even you – “accidentally” deletes a critical file? How about an evil hacker breaking in and deleting data?

Even worse, what if the evil hacker deletes every tenth record in your database? Or if there is data corruption in part of a file or database?

Data destruction can be subtle. You need to worry about preventing it, detecting it, and recovering from it.

Changing Software

Changing software is a severe and subtle risk! The bottom line is to make sure you can detect it if it occurs – yes, this is even more important than preventing it. A good example of what can happen is a recent Computerworld Sharktank article. In this case, the people making changes to the system were authorized to do so – but the impact of those changes should have been detected.
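One simple way to detect changes is to record a checksum for every file and compare against that record later. The sketch below assumes SHA-256 and a plain dictionary as the manifest; the function names are hypothetical, and the manifest itself would need to be stored somewhere an attacker cannot modify.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict, new: dict) -> dict:
    """Report files that were added, removed, or modified between manifests."""
    report = {}
    for name in sorted(set(old) | set(new)):
        if name not in new:
            report[name] = "removed"
        elif name not in old:
            report[name] = "added"
        elif old[name] != new[name]:
            report[name] = "modified"
    return report
```

Build the manifest once after a known-good install, then rebuild it on a schedule and alert on any non-empty report.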

Degraded System Availability or Performance

If a computer is performing an important business function, availability and performance have direct and measurable cost. You need to continuously measure the availability and performance of critical application services.


June 16, 2014

PGP Keysigning Event and CACert Assertion at SELF2014

SouthEast LinuxFest is happening this upcoming weekend.  I offered to host a PGP (I’ll substitute PGP for GPG, GnuPG, and other iterations) keysigning and CACert Assertion event and have been scheduled for 6:30 PM in the Red Hat Ballroom.  Since there is a little bit of planning needed on the part of the participant I’m writing this to help the event run smoothly.

Participating in the PGP Keysigning Event

If you haven’t already, generate your PGP keys.  Setting up your particular mail client (MUA) is more than what I’ll discuss here but there are plenty of resources on the Internet.  Send me (eric@christensenplace.us – signed, preferably encrypted to 0x024BB3D1) the fingerprint of your PGP key no later than 3:00PM on Saturday afternoon.  If you don’t send me your fingerprint by that time you’ll be responsible for providing it to everyone at the keysigning event on paper.  Obtaining your key’s fingerprint can be done as follows:

$ gpg --fingerprint 024bb3d1
pub 4096R/024BB3D1 2011-08-11 [expires: 2015-01-01]
 Key fingerprint = 097C 82C3 52DF C64A 50C2 E3A3 8076 ABDE 024B B3D1
uid Eric Harlan Christensen <eric@christensenplace.us>
uid Eric "Sparks" Christensen <sparks@redhat.com>
uid Eric "Sparks" Christensen <echriste@redhat.com>
uid Eric "Sparks" Christensen <sparks@fedoraproject.org>
uid [jpeg image of size 2103]
uid Eric Harlan Christensen <sparks@gnupg.net>
sub 3072R/DCA167D5 2013-02-03 [expires: 2023-02-01]
sub 3072R/A9D8262F 2013-02-03 [expires: 2023-02-01]
sub 3072R/56EA1030 2013-02-03 [expires: 2023-02-01]

Just send me the “Key fingerprint” portion and your primary UID (name and email address) and I’ll include it on everyone’s handout.  You’ll need to bring your key fingerprint on paper for yourself to verify that what I’ve written on the paper is, indeed, correct.
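If you want to check a fingerprint from the handout against your own copy without being tripped up by spacing or letter case, a small helper like the following can do the comparison. This is only an illustrative sketch with hypothetical function names; it assumes the 40-hex-digit fingerprints shown above, where the short key ID is the last eight digits.

```python
def normalize_fingerprint(text: str) -> str:
    """Strip whitespace and upper-case a 40-hex-digit key fingerprint."""
    fp = "".join(text.split()).upper()
    if len(fp) != 40 or any(c not in "0123456789ABCDEF" for c in fp):
        raise ValueError("expected a fingerprint of 40 hex digits")
    return fp


def short_key_id(fingerprint: str) -> str:
    """Return the short key ID, the last 8 hex digits of the fingerprint."""
    return normalize_fingerprint(fingerprint)[-8:]


def fingerprints_match(a: str, b: str) -> bool:
    """Compare two fingerprints, ignoring spacing and letter case."""
    return normalize_fingerprint(a) == normalize_fingerprint(b)
```

Remember that the short key ID is only a convenience for looking keys up; the full fingerprint is what you verify at the event.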

At the event we’ll quickly do a read of all the key fingerprints and validate them as correct.  Then we’ll line up and do the ID check.  Be sure you bring a photo ID with you so that we can validate who you are with who you claim to be to the authorities.  People are generally okay with a driver’s license; some prefer a passport.  Ultimately it’s up to the individual what they will trust.

CACert Assertion

CACert is a free certificate authority that signs X509 certificates for use in servers, email clients, and code signing.  If you are interested in using CACert you need to go sign up for an account before the event.  Once you have established an account, log in and select “US – WoT Form” from the CAP Forms on the right side of the page.  Print a few of these forms and bring them with you (I hope to have a final count of the number of assurers that will be available but you’ll need one form per assurer).  You’ll need to present your ID to the assurer so they can verify who you are.  They will then award you points in the CACert system.

Questions?

If you have any questions about the event feel free to ask them here (using a comment) or email me at eric@christensenplace.us.


Generating a PGP key using GnuPG

Generating a PGP key using GnuPG (GPG) is quite simple.  The following shows my recommendations for generating a PGP key today.

$ gpg --gen-key 
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Please select what kind of key you want:
 (1) RSA and RSA (default)
 (2) DSA and Elgamal
 (3) DSA (sign only)
 (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 3072
Requested keysize is 3072 bits
Please specify how long the key should be valid.
 0 = key does not expire
 <n>  = key expires in n days
 <n>w = key expires in n weeks
 <n>m = key expires in n months
 <n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Tue 16 Jun 2015 10:32:06 AM EDT
Is this correct? (y/N) y
You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
 "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name: Given Surname
Email address: given.surname@example.com
Comment: Example
You selected this USER-ID:
 "Given Surname (Example) <given.surname@example.com>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
You need a Passphrase to protect your secret key.
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
..........+++++
.....+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++
....+++++
gpg: key 2CFA0010 marked as ultimately trusted
public and secret key created and signed.
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 2 signed: 49 trust: 0-, 0q, 0n, 0m, 0f, 2u
gpg: depth: 1 valid: 49 signed: 60 trust: 48-, 0q, 0n, 0m, 1f, 0u
gpg: depth: 2 valid: 8 signed: 17 trust: 8-, 0q, 0n, 0m, 0f, 0u
gpg: next trustdb check due at 2014-09-09
pub 3072R/2CFA0010 2014-06-16 [expires: 2015-06-16]
 Key fingerprint = F81D 16F8 3750 307C D090 4DC1 4D05 E6EF 2CFA 0010
uid Given Surname (Example) <given.surname@example.com>
sub 3072R/48083419 2014-06-16 [expires: 2015-06-16]

The above shows the complete exchange between GPG and myself.  I’ll point out a couple of selections I made and explain why I made those choices.

Key type selection

I selected the default selection of two RSA keys.  The keys used for signing and encryption will both be RSA which is strong right now.  DSA has been proven to be weak in certain instances and should be avoided in this context.  I have no comment on ElGamal as I’ve not done research here.  Ultimately the choice is up to you.

Bit strength

I’ve selected 3072 instead of the default 2048 here.  I recommend this as the minimum bit strength as this provides 128 bits of security as compared to 112 bits of security with 2048.  128 bits of security should be secure beyond 2031 as per NIST SP 800-57, Part 1, Rev 3.
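The figures quoted above come from the comparable-strength table in NIST SP 800-57 Part 1. A small lookup makes the comparison explicit. Note that the entries other than 2048 and 3072 are taken from that same NIST table rather than from this post, and that the helper name is purely illustrative.

```python
# RSA modulus size in bits -> comparable symmetric strength in bits,
# following the comparable-strengths table in NIST SP 800-57 Part 1.
NIST_RSA_STRENGTH = {
    1024: 80,
    2048: 112,
    3072: 128,
    7680: 192,
    15360: 256,
}


def security_bits(modulus_bits: int) -> int:
    """Strength of the largest table entry that does not exceed modulus_bits.

    Sizes between table rows are rounded down, which is deliberately
    conservative: a 4096-bit key is reported as 128 bits.
    """
    eligible = [size for size in NIST_RSA_STRENGTH if size <= modulus_bits]
    if not eligible:
        raise ValueError("modulus is smaller than any table entry")
    return NIST_RSA_STRENGTH[max(eligible)]
```

For example, security_bits(2048) gives 112 and security_bits(3072) gives 128, matching the numbers above.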

Key expiration

By default, I make my keys expire after a year.  This is a fail-safe and can be later modified before the expiration to extend the expiration another year.  This makes sure the key will self destruct if you ever lose control of it.

Identifying information

You’ll  now be asked to add your name and email address.  This should be self-explanatory.

Key revocation

Once you have completed your key generation, now is the time to generate the key revocation file.  If you ever lose control of your key you should immediately upload this file to the public key servers so everyone using your key will know that it has [potentially] been compromised.  Once you’ve generated this revocation just keep it somewhere safe.  You can even print it out and keep it locked up somewhere.  It’s important to do this ahead of time as you may not be able to do this later.  You’ll obviously want to substitute your own keyid for 2CFA0010.

$ gpg --gen-revoke 2CFA0010
sec 3072R/2CFA0010 2014-06-16 Given Surname (Example) <given.surname@example.com>
Create a revocation certificate for this key? (y/N) y
Please select the reason for the revocation:
 0 = No reason specified
 1 = Key has been compromised
 2 = Key is superseded
 3 = Key is no longer used
 Q = Cancel
(Probably you want to select 1 here)
Your decision? 1
Enter an optional description; end it with an empty line:
> 
Reason for revocation: Key has been compromised
(No description given)
Is this okay? (y/N) y
You need a passphrase to unlock the secret key for
user: "Given Surname (Example) <given.surname@example.com>"
3072-bit RSA key, ID 2CFA0010, created 2014-06-16
ASCII armored output forced.
Revocation certificate created.
Please move it to a medium which you can hide away; if Mallory gets
access to this certificate he can use it to make your key unusable.
It is smart to print this certificate and store it away, just in case
your media become unreadable. But have some caution: The print system of
your machine might store the data and make it available to others!
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1
Comment: A revocation certificate should follow
iQGfBCABAgAJBQJTnwtaAh0CAAoJEE0F5u8s+gAQHMQMANH1JG5gVDnp5NY4o8ji
3j6GljQ9ieY+u3c5q0c08/uSAqGvL9jmPn1QAnikAkIJGy9kNmBJ/uC6pSMcHeCW
/vYWMD/cToy63tgLOf4A8GgX2k8ttFe+DpFFSt43zbGVowykZ5AHwKImtyFwVO7M
IKQZV21uFcIDl7jb5GkymkpWRZmIrexOyIAQjpyYWQT4BFFnI7kwpYyVbmodkwE/
JaC0d5dMVT9DRLr5FGuGSpzYJEeB14GCjT2EQ1js/Bji2fguFqpzM5z77FdzhS7s
SNGgY8bioyjUN3CsyHMfPpkJi9mBDCV4gTxyLlVOdDiSdqA56mzjvrx3tnltfjyN
kFJfPDWLqXFNpzX516oOo37b3P92bSEPcIgGeTL58nVUn/BWMsoDlIbwNyjxx7Tq
YYXa2T2rbH1JHndOrmAc9X98cNrhs+vppV6SBev2MnvqobT2nqW7hKeNvwIyqunF
79fL9En2p57pQ8vH4EeRhjFSciuZZBpCEv2cMIDQGMFKVQ==
=6ljf
-----END PGP PUBLIC KEY BLOCK-----

Proper key storage

Generally speaking, your private PGP key is stored on your computer encrypted.  It is protected by the normal security measures of your computer and whatever password you set.  There is a better way.  Use a hardware security module (HSM) like a Yubikey Neo, OpenPGP card, or CryptoStick to protect your private key from disclosure.

Publishing your public key

Now that you have your PGP keys you’ll want to publish your public key to the key servers so others can easily obtain it to validate your signatures.

$ gpg  --keyserver hkps://hkps.pool.sks-keyservers.net --send-keys 2CFA0010

You’ll obviously want to substitute your own keyid for 2CFA0010.  This command will send your key to the SKS public key servers which will then replicate your key around the world in a few hours.

 


June 14, 2014

Why POpen for OpenSSL calls

Many people have questioned why I chose to use popen to call the OpenSSL binary from Keystone and the auth_token middleware. Here is my rationale:

Keystone and the other API services in OpenStack are run predominantly from the Eventlet web server. Eventlet is a continuation-based server, and requires cooperative effort to multitask. This means that if a function call misbehaves, the server is incapable of handling additional requests. A call to an asymmetric cryptography function, like signing (or verifying the signature for) a document, is expensive. There are a couple of ways that this could be problematic. The cryptographic library could call into native code without releasing the GIL. The cryptographic library call could also tie up the CPU without giving the Eventlet server the ability to schedule another greenthread.

The popen call performs the POSIX fork system call, and runs the target program in a subprocess. Greenlet has support for this kind of call. When the Greenlet version of popen is called, the greenthread making the call yields to the scheduler. The greenlet scheduler periodically checks the status of the subprocess. Meanwhile, other greenthreads can make progress in the system.

What is the price of the popen? There is the context switch: an operating system switch from one process to another. There is also the startup cost of running the other process. The executable needs to be loaded into memory. Then, the output from the parent process (in this case, the document to sign or verify) is passed via a pipe to the child process. Once the child process has completed the operation on the document, it returns the output via pipes back to the parent process. The child process is torn down. However, on a loaded system, much of this cost is only paid once. The executable is memory mapped. If one process has the executable memory mapped, additional processes only need virtual memory operations to access those same mapped pages. The certificates used in the signature process are also loaded from the file system, but are likely to be in the file cache, and thus the operations are once again pure in-memory operations. Since the data signed or validated does not hit the disk, the main cost is the marshalling from process to process.
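The round trip described above (document in through one pipe, result back through another) can be sketched with the standard subprocess module. This is not the Keystone code: the helper name run_in_child is hypothetical, the plain (non-Eventlet) subprocess module is used, and a child Python process that hashes its input stands in for the OpenSSL binary so that the example runs anywhere.

```python
import hashlib
import subprocess
import sys


def run_in_child(argv, document: bytes) -> bytes:
    """Pipe `document` to a child process and return what it writes to stdout."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() writes the input, closes stdin, and collects the output.
    output, errors = proc.communicate(document)
    if proc.returncode != 0:
        raise RuntimeError(errors.decode(errors="replace"))
    return output


# Stand-in child (an assumption for illustration): hash stdin, print the digest.
HASH_CHILD = [
    sys.executable,
    "-c",
    "import sys, hashlib; "
    "sys.stdout.write(hashlib.sha256(sys.stdin.buffer.read()).hexdigest())",
]

digest = run_in_child(HASH_CHILD, b"document to sign")
```

In the real case the argument list would name the OpenSSL executable and its options instead of HASH_CHILD. Under Eventlet, the greened version of popen is what lets other greenthreads run while the child works.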

One reason I chose popen as opposed to a library call was that, at the time, there was no clear choice of Python library to use. SMIME (also known as PKCS-7, or Cryptographic Message Syntax, CMS) is the standard for document signature. At a minimum, I wanted a mechanism that supported SMIME. While there are several native cryptographic libraries, the most widely deployed is OpenSSL, and I wanted something that made use of it.

Aside: Our team has some in-house experience with NSS, and the US Government kind of demands NSS due to it playing nicely with Common Criteria Certification and FIPS 140-2. However, most people out there are not familiar with NSS, and teaching the OpenStack world about how to deal with NSS was more than I could justify. In addition, CMS support is not in the Python-NSS library. To get even deeper: the CMSUtil command from the NSS toolkit did not seem to support stripping the certificates out of the signed document (it does, I’ve since discovered), which is required for getting the token size as small as possible. Considering that we are seeing problems with tokens exceeding header size limits, I think this is essential. NSS support is likely to gate on resolving these issues. Neither is insurmountable.

Why not do a threadpool? First was the fact that I had no crypto library to use. M2 Crypto, a long-time favorite, had just been removed from Nova. It had the operations, but was unsupported. There seems to be no other library out there that handles the whole PKCS-7 set of operations. Most do the hashing and signing just fine, but break down on the ASN format of the document. The OpenSSL Project’s own Python library does not support this. PyCrypto (currently used by barbican) doesn’t even seem to provide full X509 support.

Supposing I did have a library to choose, a threadpool would probably work fine. But then, it completely bypasses all of the benefits of Eventlet using greenthreads. Switching to a truly threaded webserver would make sense…assuming one could be found that worked well within Python’s threading limitations.

Another reason to not do a threadpool is that it would be a solution specific to Eventlet. I have long been campaigning for a transition to Apache HTTPD as the primary container for Keystone. Granted, HTTPD running in pre-fork mode would not even need to do a popen: it could just wait for the response from the library call. But then we are starting to have an explosion of options to test.

It turns out that the real price of the popen comes from the fact that the calling program is Python. When you do a fork, the Python code gets little benefit from “copy-on-write” semantics. In C, most of the code is in read-only memory, and does not need to be duplicated. The same is not true for Python, and thus all those pages need to be copied. Thus far, there have been no complaints due to this. However, it is sufficient reason to plan for a replacement to the popen approach. There are a few potential approaches, but none stands out yet.
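One likely reason the pages end up being copied, assuming the CPython interpreter, is reference counting: merely taking another reference to an object writes to that object's header, so memory that looks read-only from the program's point of view is still modified. The following sketch shows that write happening; the function name is made up for illustration.

```python
import sys


def refcount_delta_on_alias() -> int:
    """Show that binding a second name to an object changes its refcount."""
    payload = ["token data"]
    before = sys.getrefcount(payload)
    alias = payload  # no data is changed, yet the object's header is written
    after = sys.getrefcount(payload)
    del alias
    return after - before


delta = refcount_delta_on_alias()
```

On CPython the delta is 1. After a fork, a write like this in either process forces the kernel to copy the page that holds the object, even though the data itself is unchanged.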

June 13, 2014

Using OpenLMI to join a machine to a FreeIPA domain

People who have been following this (admittedly intermittent) blog for a while are probably aware that in the past I was heavily-involved in the SSSD and FreeIPA projects.

Recently, I’ve been thinking a lot about two topics involving FreeIPA. The first is how to deploy a FreeIPA server using OpenLMI. This is the subject of my efforts in the Fedora Server Role project and will be covered in greater detail in another blog post, hopefully next week.

Today’s topic involves enrollment of FreeIPA clients into the domain from a central location, possibly the FreeIPA server itself. Traditionally, enrolling a system has been a “pull” operation, where an admin signs into the system and then requests that it be added to the domain. However, there are many environments where this is difficult, particularly in the case of large-scale datacenter or cloud deployments. In these cases, it would be much better if one could script the enrollment process.

Additionally, it would be excellent if the FreeIPA Web UI (or CLI) could display a list of systems on the network that are not currently joined to a domain and trigger them to join.

There are multiple problems to solve here. The first of course is whether OpenLMI can control the joining. As it turns out, OpenLMI can! OpenLMI 1.0 includes the “realmd” provider, which acts as a remote interface to the ‘realmd’ service on Fedora 20 (or later) and Red Hat Enterprise Linux 7.0 (or later).

Now, there are some pre-requisites that have to be met before using realmd to join a domain. The first is that the system must have DNS configured properly such that realmd will be able to query it for the domain controller properties. For both FreeIPA and Active Directory, this means that the system must be able to query for the _ldap SRV entry that matches the domain the client wishes to join.
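
Before attempting a join, it can be worth checking that lookup by hand. The following is a minimal sketch, assuming the conventional `_ldap._tcp.<domain>` form of the SRV name and that the `dig` utility is installed; `ldap_srv_name` and `lookup_ldap_srv` are hypothetical helpers, not part of OpenLMI or realmd.

```python
import shutil
import subprocess


def ldap_srv_name(domain):
    """Return the SRV record name a client queries to find LDAP servers."""
    return "_ldap._tcp." + domain.strip(".").lower()


def lookup_ldap_srv(domain):
    """Query DNS for the SRV records using dig, if it is installed.

    Returns the raw answer lines ("priority weight port target"),
    or None when dig is not available on this machine.
    """
    if shutil.which("dig") is None:
        return None
    result = subprocess.run(
        ["dig", "+short", "SRV", ldap_srv_name(domain)],
        capture_output=True, text=True, check=False,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    # Only prints the name that would be queried; no network access needed.
    print(ldap_srv_name("domainname.test"))
```

An empty list from `lookup_ldap_srv` suggests the client's DNS servers do not know about the domain, which is the situation the next paragraphs deal with.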

In most deployment environments, it’s reasonable to expect that the DNS servers provided by the DHCP lease (or static assignment) will be correctly configured with this information. However, in a development or testing environment (with a non-production FreeIPA server), it may be necessary to first reconfigure the client’s DNS setup.

Since we’re already using OpenLMI, let’s see if we can modify the DNS configuration that way, using the networking provider. As it turns out, we can! Additionally, we can use the lmi metacommand to make this very easy. All we need to do is run the following command:

lmi -h <client> net dns replace x.x.x.x

With that done, we need to do one more thing before we join the domain. Right now, the realmd provider doesn’t support automatically installing the FreeIPA client packages when joining a domain (that’s on the roadmap). So for the moment, you’re going to want to run

lmi -h <client> sw install freeipa-client

(Replacing ‘freeipa-client’ with ‘ipa-client’ if you’re talking to a RHEL 7 machine).

With that done, now it’s time to use realmd to join the machine to the FreeIPA domain. Unfortunately, in OpenLMI 1.0 we do not yet have an lmi metacommand for this. Instead, we will use the lmishell python scripting environment to perform the join (don’t worry, it’s short and easy to follow!)

c = connect('server', 'username', 'password')
realm_obj = c.root.cimv2.LMI_RealmdService.first_instance()
realm_obj.JoinDomain(Domain='domainname.test', User='admin', Password='password')

In these three lines, we are connecting to the client machine using OpenLMI, getting access to the realm object (there’s only one on a system, so that’s why we use first_instance()) and then calling the JoinDomain() method, passing it the credentials of a FreeIPA administrator with privileges to add a machine, or else passing None for the User and a pre-created one-time password for domain join as the Password.
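
If you end up doing this for more than one machine, the three lines can be wrapped in a small function. What follows is only a sketch: `join_domain` is a hypothetical helper, and the fake connection exists purely so the snippet runs outside lmishell. The only OpenLMI names used are the ones from the example above, and the return value of the stub is an assumption.

```python
from types import SimpleNamespace


def join_domain(conn, domain, user=None, password=None):
    """Join the machine behind `conn` to `domain` via the realmd provider.

    Leave `user` as None to use a pre-created one-time password.
    """
    realm_obj = conn.root.cimv2.LMI_RealmdService.first_instance()
    return realm_obj.JoinDomain(Domain=domain, User=user, Password=password)


class FakeRealmdService:
    """Stand-in for the real provider object; it just records the calls."""

    def __init__(self):
        self.calls = []

    def JoinDomain(self, **kwargs):
        self.calls.append(kwargs)
        return 0  # assumed placeholder, not the provider's documented result


def fake_connection(service):
    """Build an object shaped like conn.root.cimv2.LMI_RealmdService."""
    realmd = SimpleNamespace(first_instance=lambda: service)
    cimv2 = SimpleNamespace(LMI_RealmdService=realmd)
    return SimpleNamespace(root=SimpleNamespace(cimv2=cimv2))


if __name__ == "__main__":
    service = FakeRealmdService()
    join_domain(fake_connection(service), "domainname.test", password="otp")
    print(service.calls)
```

Inside lmishell, the fake pieces go away and the call becomes join_domain(connect('server', 'username', 'password'), 'domainname.test', user='admin', password='password').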

And there you have it: barring an error, we have successfully joined a client to a domain!

Final thoughts: I mentioned above that it would be nice to be able to discover unenrolled systems on the network and display them. For this, we need to look into extending the set of attributes we have available in our SLP implementation so that we can query on this. It shouldn’t be too much work, but it’s not ready today.


June 10, 2014

OpenLMI Ships in RHEL 7

RHEL 7, the latest version of Red Hat Enterprise Linux, was announced today with immediate availability. OpenLMI is included in RHEL 7 – in fact, it was identified in the announcement keynote as one of the key new technologies in RHEL 7.

This means that OpenLMI is now available in a supported Enterprise Linux, as well as in community versions of Linux.

We encourage you to try OpenLMI in either the Enterprise or community versions. As always, see the OpenLMI website for more information.


June 05, 2014

Unattended Install of a FreeIPA Server

As a developer, I install and uninstall the application I’m working on all the time. Back when I was working on FreeIPA full time, I had a couple of functions that I used to do an unattended install with some simple defaults. I recently cleaned them up a little. Since a few people have asked me for them, I’m posting them here.

I have another set of bash functions that manages my set of developer machines. One of them sets the $DEVSERVER variable in my environment.

#The Kerberos REALM generated by this is the domain segment of the
#fully qualified domain name (FQDN) converted to uppercase.    
#If you were running it on local host, you could use `hostname -d` 
#but that doesn't work for a remote system.
ipa-gen-realm(){
    ipahost=$DEVSERVER
    IPAREALM=$( echo $DEVSERVER  | cut -f2- -d. |tr '[:lower:]' '[:upper:]' )
    echo $IPAREALM
}

#The forwarder for DNS can be defined as the existing set of
#nameservers from /etc/resolv.conf.
ipa-gen-resolver(){
     ssh $DEVSERVER " cat /etc/resolv.conf" | awk '/nameserver/ {print $2}'
} 

ipa-gen-install-command(){
    echo  ipa-server-install  -U -r $(ipa-gen-realm) -p FreeIPA4All \
          -a FreeIPA4All --setup-dns --forwarder $( ipa-gen-resolver)
}
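
For anyone who would rather drive this from Python, here is a rough equivalent of the same logic. It is a sketch, not a drop-in replacement: the function names are made up for illustration, the resolv.conf contents are passed in as text rather than fetched over ssh, and it repeats --forwarder once per nameserver, which is a deliberate difference from the shell version.

```python
def realm_from_fqdn(fqdn):
    """Upper-case everything after the first dot, like `cut -f2- -d. | tr`."""
    host, dot, domain = fqdn.partition(".")
    # cut prints the whole line when the delimiter is absent; mirror that.
    return (domain if dot else host).upper()


def nameservers(resolv_conf_text):
    """Return the addresses on `nameserver` lines of a resolv.conf body."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Stricter than the awk pattern: the first field must be the keyword,
        # so commented-out entries are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def install_command(fqdn, resolv_conf_text, password="FreeIPA4All"):
    """Build the unattended ipa-server-install command as an argument list."""
    cmd = ["ipa-server-install", "-U", "-r", realm_from_fqdn(fqdn),
           "-p", password, "-a", password, "--setup-dns"]
    for server in nameservers(resolv_conf_text):
        cmd += ["--forwarder", server]
    return cmd


if __name__ == "__main__":
    sample = "search example.com\nnameserver 192.168.122.1\n"
    print(" ".join(install_command("ipa.example.com", sample)))
```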

Kerberos and Firewalls

Most datacenters block non-standard ports at their firewalls. This includes ports for lesser used protocols. The Kerberos Key Distribution Center (KDC) listens on port 88 (TCP and UDP), which means that, practically speaking, a machine cannot get a ticket over the public internet. Last summer, Robbie Harwood interned here at Red Hat. Together, we came up with a plan to address this.

It turns out that the fine folks at Microsoft tripped over this very problem long ago, and came up with an approach: use HTTP to talk to a proxy to the KDC. Their protocol, MS-KKDCP (the Kerberos KDC Proxy Protocol), was written up in RFC form on their site. It makes sense that the MIT Kerberos approach should interoperate with the Microsoft product.

The problem with interns is that they have a nasty habit of actually going back to finish their degree. In this case, we had a working prototype by the end of the summer, but still had the long haul to getting it merged into the MIT upstream. Fortunately, we have people here at Red Hat that can make these Herculean labors look easy. In this case, Nalin Dahyabhai spent a good chunk of time these past several months dealing with the refactorings and other changes necessary to get it in.

It merged a couple nights ago. I did the happy dance the next morning.

Kerberos across the public internet still has a long path. The code which merged needs to make it into the next Kerberos release, which needs to make it into the major Linux distributions. Until that happens, we can’t rely on the tools being in place, but we can prepare for it.

Even once it is deployed, there will be issues:

  1. How do you find the right KDC for a given site?
  2. How do you configure your system for a new KDC without giving away root privilege?
  3. How do you tell your browser that you don’t have a principal for a Kerberized site, and to use a different mechanism?

Robbie's development setup is documented here:

So: here’s what you can plan for: there will be a new release of MIT Kerberos. The current plan is for a release in the fall timeframe, and we are hoping to get that version into Fedora.next. No promises, as this involves synchronizing across two distinct organizations, but it looks promising. We’ll make sure the Debian maintainers are aware as well, and try to make sure the corresponding releases have it. Meanwhile, look for notes on getting the corresponding proxy set up for FreeIPA and other MIT Kerberos server implementations. The Microsoft Proxy server is part of the terminal server product, so if you are a Microsoft shop, that is the path for you.

I’m pretty excited about this. Kerberos has the potential to vastly improve security in the public web.

UPDATE:
Nathaniel McCallum’s implementation of the KDC Proxy

OpenSSL MITM CCS injection attack (CVE-2014-0224)

In the last few years, several serious security issues have been discovered in various cryptographic libraries. Though very few of them were actually exploited in the wild before details were made public and patches were shipped, important issues like Heartbleed have led developers, researchers, and users to take code sanity of these products seriously.

Among the recent issues fixed by the OpenSSL project in version 1.0.1h, the main one that will have everyone talking is the “Man-in-the-middle” (MITM) attack, documented by CVE-2014-0224, affecting the Secure Socket Layer (SSL) and Transport Layer Security (TLS) protocols.

What is CVE-2014-0224 and should I really be worried about it?

The short answer is: it depends. But like any security flaw, it’s always safer to patch rather than defer and worry.

In order for an attacker to exploit this flaw, the following conditions need to be present.

  • Both the client and the server must be vulnerable. All versions of OpenSSL are vulnerable on the client side. Only 1.0.1 and above are currently known to be vulnerable on the server side. If either the client or the server is fixed, it is not feasible to perform this attack.
  • A Man-In-The-Middle (MITM) attacker: An attacker capable of intercepting and modifying packets off the wire. A decade back, this attack vector seemed almost impossible for anyone but Internet Service Providers as they had access to all the network devices through which most of the traffic on the internet passed.

However with the prevalence of various public wireless access points, easily available at cafes, restaurants, and even free internet access provided by some cities, MITM is now possible. Additionally, there is a variety of software available that provides the capability of faking Access Points. Once clients connect to the fake AP, an attacker could then act as a MITM for the client’s traffic. A successful MITM attack may disclose authentication credentials, sensitive information, or give the attacker the ability to impersonate the victim.

How does this attack work?

SSL/TLS sessions are initiated with the ClientHello and ServerHello handshake messages sent from the respective side. This part of the protocol is used to negotiate the attributes of the session, such as protocol version used, encryption protocol, encryption keys, Message Authentication Code (MAC) secrets and Initialization Vectors (IV), as well as the extensions supported.

For various reasons, the client or the server may decide to modify the ciphering strategies of the connection during the handshake stage (don’t confuse this with the handshake protocol). This can be achieved by using the ChangeCipherSpec (CCS) request. The CCS consists of a single packet which is sent by both the client and the server to notify that the subsequent records will be protected under the newly negotiated CipherSpec and keys.

As per the standards (RFC 2246, RFC 5246), “The ChangeCipherSpec message is sent during the handshake after the security parameters have been agreed upon, but before the verifying Finished message is sent.” OpenSSL, however, did not enforce this, and it accepted a CCS even before the security parameters were agreed upon. It is expected that accepting CCS out of order results in the state between both sides being desynchronized. Usually this should result in both sides effectively terminating the connection, unless you have another flaw present.

In order to exploit this issue, a MITM attacker would effectively do the following:

  • Wait for a new TLS connection, followed by the ClientHello / ServerHello handshake messages.
  • Issue a CCS packet in both directions, which causes the OpenSSL code to use a zero length pre master secret key. Session keys are derived using this zero length pre master secret key, and future session keys also share this weakness.
  • Renegotiate the handshake parameters.
  • The attacker is now able to decrypt or even modify the packets in transit.

OpenSSL patched this vulnerability by changing how it handles CCS packets when they are received, and how it handles zero length pre master secret values. The OpenSSL patch ensures that it is no longer possible to use master keys with zero length. It also ensures that CCS packets cannot be received before the master key has been set.
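
To make the two checks concrete, here is a toy model in Python. It is emphatically not OpenSSL code and performs no cryptography; the class and method names are invented for illustration, and it tracks only the two pieces of state the fix is concerned with.

```python
class HandshakeError(Exception):
    """Raised when a message arrives in a state where it is not allowed."""


class ToyHandshake:
    """Tracks only what matters for the CCS ordering check."""

    def __init__(self):
        self.master_secret = None
        self.cipher_changed = False

    def set_master_secret(self, secret):
        # Never accept an empty (zero length) secret.
        if not secret:
            raise HandshakeError("zero-length master secret")
        self.master_secret = secret

    def receive_ccs(self):
        # CCS is only legal once the master secret has been set.
        if self.master_secret is None:
            raise HandshakeError("ChangeCipherSpec before key agreement")
        self.cipher_changed = True


if __name__ == "__main__":
    hs = ToyHandshake()
    try:
        hs.receive_ccs()  # what the injected, early CCS amounts to
    except HandshakeError as err:
        print("rejected:", err)
```

In this model an early CCS is refused outright instead of silently switching to keys derived from an empty secret, which is the behaviour the attack depended on.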

What is the remedy?

The easiest solution is to ensure you are using the latest version of OpenSSL your distribution provides. Red Hat has issued security advisories for all of its affected products, and Fedora users should also be able to update their openssl packages to a patched version.

You will need to restart any services using OpenSSL that are not restarted automatically.

If you are a Red Hat customer, there is a tool available located at https://access.redhat.com/labs/ccsinjectiontest/ which you can use to remotely verify the latest patches have been applied and your TLS server is responding correctly.

We have additional information regarding specific Red Hat products affected by this issue that can be found at https://access.redhat.com/site/articles/904433

June 04, 2014

pam_mkhomedir versus SELinux -- Use pam_oddjob_mkhomedir
SELinux is all about separation of powers, minimal privs or reasonable privs.

If you can break a program into several separate applications, then you can use SELinux to control what each application is allowed.  Then SELinux could prevent a hacked application from doing more than expected.

The pam stack was invented a long time ago to allow customizations of the login process.  One problem with the pam_stack is it allowed programmers to slowly hack it up to give the programs more and more access.  I have seen pam modules that do some crazy stuff.

Since we confine login applications with SELinux, we sometimes come in conflict with some of the more powerful pam modules.
We in the SELinux world want to control what login programs can do.  For example we want to stop login programs like sshd from reading/writing all content in your homedir.

Why is this important?

Over the years it has been shown that login programs have had bugs that led to information leakage without the users ever being able to login to a system.

One use case of pam was the requirement of creating a homedir the first time a user logs into a system.  Usually colleges and universities use this for students logging into a shared service.  But many companies use it also.

man pam_mkhomedir
  The pam_mkhomedir PAM module will create a users home directory if it does not exist when the session begins. This allows users to be present in central database (such as NIS, kerberos or LDAP) without using a distributed file system or pre-creating a large number of directories. The skeleton directory (usually /etc/skel/) is used to copy default files and also sets a umask for the creation.


This means with pam_mkhomedir, login programs have to be allowed to create/read/write all content in your homedir.  This means we would have to allow sshd or xdm to read the content even if the user was not able to login, meaning a bug in one of these apps could allow content to be read or modified without the attacker ever logging into the machine.

man pam_oddjob_mkhomedir
       The pam_oddjob_mkhomedir.so module checks if the user's home directory exists, and if it does not, it invokes the mkhomedirfor method of the com.redhat.oddjob_mkhomedir service for the PAM_USER if the module is running with superuser privileges.  Otherwise, it invokes the mkmyhomedir method.
       The location of the skeleton directory and the default umask are determined by the configuration for the corresponding service in oddjobd-mkhomedir.conf, so they can not be specified as arguments to this module.
       If D-Bus has not been configured to allow the calling application to invoke these methods provided as part of the com.redhat.oddjob_mkhomedir interface of the / object provided by the com.redhat.oddjob_mkhomedir service, then oddjobd will not receive the request and an error will be returned by D-Bus.


Nalin Dahyabhai wrote pam_oddjob_mkhomedir many years ago to separate out the ability to create a home directory and all of the content from the login programs.  Basically the pam module sends a dbus signal to a dbus service oddjob, which launches a tool to create the homedir and its content.  SELinux policy is written to allow this application to succeed.   We end up with much less access required for the login programs.

If you want the home directory created at login time when it does not exist, use pam_oddjob_mkhomedir instead of pam_mkhomedir.
New OpenLMI Web Site

I usually hate announcements  of web site redesigns – “to enhance readability  we have moved to Pretentious_Obscure_Font and changed the borders”…

But I think you will like what we’ve done at www.openlmi.org.

First, we’ve totally redone the site navigation. There are now three major components – an introduction and overview, OpenLMI for system administrators, and OpenLMI for developers.  As OpenLMI matures we will talk more about using it, as well as our ongoing focus on core technology development.

Second, we have moved to an adaptive template. The site is now much more usable on mobile devices – try it on your tablet and phone! There are a lot of changes under the hood that you don’t care about and I won’t bore you with.

Third, we are looking for feedback. Let us know how we can make the site even better. And if we’ve broken anything we want to know about it!


June 03, 2014

Keystone tox cheat sheet

While I grumbled when run_tests.sh was deprecated with just a terse message to go read the docs about tox, I have since switched over. Here is my quick tox transition tutorial.

To list the target environments:

tox -l

Currently this is:

py26
py27
py33
pep8
docs
sample_config

To run any one of these, pass it to tox via the -e option:

To build just the docs

tox -edocs

Pep 8 check

tox -epep8

Run Unit tests for python2.7

tox -epy27

Test coverage

tox -e cover

Update config file

tox -esample_config

Run any of them with the -r flag to recreate the virtual environment, but it takes a long time.

Each of the environments listed corresponds to a virtual environment under .tox. If you want to, say, run the keystone server that you have built, from the top directory in your keystone git sandbox:

. .tox/py27/bin/activate
.tox/py27/bin/keystone-all

Note that the same thing will work for a custom keystone client:

cd /opt/stack/python-keystoneclient
. .tox/py27/bin/activate
. ~/keystonerc #pronounced keystoner-see
.tox/py27/bin/keystone token-get

If the py33 run fails with the error

‘db type could not be determined’

you should remove the file

rm .testrepository/time.dbm

To just build the virtual environment for development without running the tests:

tox -epy27 --notest -r

If you have more, let me know and I’ll update the post.

June 02, 2014

OpenLMI on YouTube

For your entertainment and amusement, Russell Doty and Stephen Gallagher have done a reprise of their Red Hat Summit presentation on OpenLMI and posted it to YouTube.

These videos answer some of the most common questions we hear about OpenLMI:

  • Why are you doing this?
  • What does it do for me?
  • How do I use it?
  • How does it fit with the rest of the system?

Four videos are available on the TechPonder channel:

Intro to OpenLMI Part 1

Intro to OpenLMI Part 2: Architecture

Intro to OpenLMI Part 3: Examples

Intro to OpenLMI Part 4: Elephants and Conclusions


Introduction to the Dogtag Python API

There is a Python binding to the Dogtag REST API under active development by Abhishek Koneru. I will be using this API to add support for Dogtag profiles in FreeIPA. This post serves as an introduction to the API, with a particular focus on the profile-related parts.

Because it’s still in development, the API is subject to change. I think the overall structure of the API is fine so hopefully any changes will be minor. The API is well documented so if in doubt, check the docstrings (calling help(<module|class|object>) is a handy way to read the docs in the interactive Python interpreter).

PKIConnection

The pki.client.PKIConnection class connects to a Dogtag instance and executes REST verbs on behalf of clients. Internally, it uses the excellent Requests library.

import pki.client

scheme = 'https'
host = 'localhost'
port = '8443'
subsystem = 'ca'
conn = pki.client.PKIConnection(scheme, host, port, subsystem)

For actions that require authentication, a client certificate is required, in PEM format. Client certificates are often distributed in the PKCS #12 format. In such cases, the following command will convert a PKCS #12 client certificate to an unencrypted PEM certificate:

$ openssl pkcs12 -nodes -in cl_cert.p12 -out cl_cert.pem

After telling the PKIConnection where to find the client certificate, the connection object will be ready to use:

conn.set_authentication_cert("/path/to/cl_cert.pem")

ProfileClient

The pki.profile.ProfileClient class proxies the profiles-related REST resources.

import pki.profile

profile_client = pki.profile.ProfileClient(conn)
profiles = profile_client.list_profiles()
for profile in profiles:
  pass  # do stuff

list_profiles() also takes optional start and size keyword arguments for pagination. For inspecting an individual profile, there is the get_profile method. But first let’s see what happens when we ask for a profile that doesn’t exist:

>>> profile = profile_client.get_profile('nope')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pki/__init__.py", line 234, in handler
    raise pki_exception
pki.ProfileNotFoundException: Profile ID nope not found

So there are nice, specific exception types. There’s a whole bunch of domain-specific exceptions, but I won’t list them here. Moving on, we can have a look at a profile that does exist:

>>> profile = profile_client.get_profile('caServerCert')
>>>
>>> profile
{'ProfileData': {'status': 'enabled', 'visible': True,
'profile_id': u'caServerCert', 'name': u'Manual Server Certificate
Enrollment', 'description': u'This certificate profile is for
enrolling server certificates.'}}
>>>
>>> dir(profile)
['Input', 'Output', 'PolicySets', '__class__', '__delattr__',
'__dict__', '__doc__', '__format__', '__getattribute__',
'__hash__', '__init__', '__module__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__', 'authenticator_id',
'authorization_acl', 'class_id', 'description', 'enabled',
'enabled_by', 'from_json', 'inputs', 'link', 'name', 'outputs',
'policy_sets', 'profile_id', 'renewal', 'visible', 'xml_output']

The relevant attributes can be gleaned from above. At the moment, there’s not a whole lot you can do with a profile object, besides look at it. It contains some metadata about the profile and lists of its inputs, outputs and policies (defaults and constraints).

There’s not much else to the profiles aspect of the API at this time. You can list profiles, inspect profiles, and enable/disable profiles, but you aren’t yet able to create new profiles or perform more advanced profile administration. Future work will (hopefully) add these capabilities.

CertClient

Although pki.profile on its own doesn’t currently offer a lot to the API end-user, some other modules do leverage the provided classes and methods in their own behaviours. pki.cert is one such module.

import pki.cert

cert_client = pki.cert.CertClient(conn)

# enrol a certificate
inputs = {
  "cert_request_type": "pkcs10",
  "cert_request": "MIIBmDCC... (a PEM certificate request)",
  "requestor_name": "John A. Citizen",
  "requestor_email": "jcitizen@example.tld",
}
enroll_req = cert_client.create_enrollment_request("caServerCert", inputs)
req_infos = cert_client.submit_enrollment_request(enroll_req)

The above instantiates a CertClient (reusing the connection object from before), creates a certificate enrollment request for the caServerCert profile (using the given inputs) and submits the certificate enrollment request. A certificate enrollment can actually involve multiple certificates, so the req_infos variable above contains a CertRequestInfoCollection object. Completing the enrollment involves iterating over this collection and approving each certificate request.

certificates = []
for req_info in req_infos:
  req_id = req_info.request_id
  cert_client.approve_request(req_id)
  cert_id = cert_client.get_request(req_id).cert_id
  certificates.append(cert_client.get_cert(cert_id))

Assuming nothing went wrong, certificates now contains a list of pki.cert.CertData objects, but it took quite a few operations to get from the enrollment request inputs to our actual certificate(s). Fortunately, the API provides a convenience method to take care of all these details:

profile_id = "caServerCert"
certificates = cert_client.enroll_cert(profile_id, inputs)

enroll_cert takes care of all the details and returns a list of CertData objects when it completes. If this particular process of certificate enrollment request generation, submission, approval and certificate retrieval turns out to be a common use case, this method will save a lot of typing, but it’s important to know how it works and what it does behind the scenes.

Let’s now have a look at one of these CertData objects:

>>> type(cert)
<class 'pki.cert.CertData'>
>>>
>>> cert
{'CertData': {'status': u'VALID', 'serial_number': u'0x17',
'subject_dn': u'CN=TestServer,O=Red Hat Inc.,L=Raleigh,ST=NC,C=US'}}
>>>
>>> dir(cert)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
'__getattribute__', '__hash__', '__init__', '__module__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'__weakref__', 'encoded', 'from_json', 'issuer_dn', 'link',
'nonce', 'not_after', 'not_before', 'pkcs7_cert_chain',
'pretty_repr', 'serial_number', 'status', 'subject_dn']
>>>
>>> cert.encoded
u'-----BEGIN CERTIFICATE-----\nMIIDFjCCA... (a PEM-encoded certificate)'

It has all the things you’d expect a data type representing a digital certificate to have.

As you might expect, enrolling new certificates is not the only way to get at a CertData object. The CertClient API supports listing and searching certificates, revocation and more. It also supports the whole gamut of CA agent operations with respect to pending certificate requests. In addition to approving requests, requests can be reviewed, rejected, assigned to another agent, and so on.

Conclusion

There are many details and features of the Dogtag Python API that were not covered in this post, but the most important details have been covered, and I hope I have conveyed the high-level organisation of the API and its common idioms.

As mentioned at the beginning of this post, the API is not yet released and is subject to change, but feel free to have a look at the code or begin experimenting with it. The Dogtag developers welcome feedback, and the pki-devel mailing list is the place to provide it.

May 31, 2014

OpenLMI: HWInfo Working as Intended on Virtual Machine

I was recently told that someone had installed OpenLMI and it wasn’t working. They ran the hwinfo command to test the installation and it wasn’t reporting the correct information – most of the information was reported as “not specified” or “N/A”.

It turns out that they had installed OpenLMI in a virtual machine – and that most of the hardware information was, in fact, not available.

In a virtual machine:

lmi> hwinfo
Hostname:         localhost.localdomain

Chassis Type:     Other
Manufacturer:     Bochs
Model:            Not Specified (Bochs)
Serial Number:    Not Specified
Asset Tag:        0
Virtual Machine:  N/A

Motherboard info: N/A

CPU:              Not Specified
Topology:         1 cpu(s), 1 core(s), 1 thread(s)
Max Freq:         2000 MHz
Arch:             x86_64

Memory:           1.0 GB
Modules:          1.0 GB, RAM (DIMM), Not Specified
Slots:            1 used, N/A total
lmi>

For the physical system this VM was running on:

lmi> hwinfo
Hostname:        testbeda

Chassis Type:    Desktop
Manufacturer:    Gigabyte Technology Co., Ltd.
Model:           GA-MA78GM-S2H
Serial Number:   Not Specified
Asset Tag:       0
Virtual Machine: N/A

Motherboard:     GA-MA78GM-S2H
Manufacturer:    Gigabyte Technology Co., Ltd.

CPU:             AMD Phenom(tm) 9550 Quad-Core Processor
Topology:        1 cpu(s), 1 core(s), 1 thread(s)
Max Freq:        3000 MHz
Arch:            x86_64

Memory:          4.0 GB
Modules:         2.0 GB, 800 MHz, None, Bank0/1
                 2.0 GB, 800 MHz, None, Bank2/3
Slots:           2 used, 4 total
lmi>

There is much more information available from the physical system. The only things missing are the asset tag and the serial number, which I haven’t assigned for this system.

The lesson here: you can use the hwinfo command on a virtual machine, but you won’t get much useful information. The VM should not be able to get this information – if it can, there is leakage between the physical and virtual systems.


May 30, 2014

Kerberos, Federation, and Horizon

I’ve been looking into enabling Kerberos for Horizon. Since Horizon passes the user’s credentials on to Keystone to get a token, Kerberos requires an additional delegation mechanism. This leads to some questions about how to handle delegation in the case of Federated Identity.

In Kerberos, the Horizon Django App would get a REMOTE_USER set by the Apache web server. REMOTE_USER by itself is not enough to send to Keystone to get a token.

Kerberos has a Delegation mechanism called “Service for User to Proxy” (abbreviated as S4U2Proxy) which would allow (only) Horizon to get a service ticket to Keystone on behalf of the user. I am pretty certain we can make this work without too much effort.

The exposure of a hacked Horizon set up this way would be limited to those active service tickets in the webserver. Since Kerberos tickets are typically 8 hours in duration, the hacker would have full access to Keystone, and Keystone only, for any user that had logged in in the past 8 hours, or that logged in while the attack was running. Essentially, it would be the same exposure as the current Horizon model would have; both ticket duration and token duration are configurable.

If Horizon under Kerberos falls back to a user-id/password mechanism, the Horizon server could fetch a much more powerful Ticket Granting Ticket (TGT) for the user and use that to get a service ticket. The exposure for this is much greater; a successful hacker could impersonate the user to all services. Unfortunately, business requirements often demand a fallback to UserID/Password for users with limited ability to administer their own machines. mod_auth_krb5 provides a means to use userid and password, and there are form based approaches, too.

For SAML we would need a comparable delegation mechanism. If Horizon were authorized to access Keystone on behalf of the user, and it provided the original SAML assertion, we would have the same exposure as the S4U2Proxy case. SAML assertions are time limited, so the hacked Horizon instance would only be able to access Keystone, and only for the duration of the SAML assertions.

In both cases, the exposure would be better than today’s approach of using User-Id/Password to get tokens from Horizon in deployments where the tokens have a default lifetime of 12 hours. Whereas a sniffed password would be usable indefinitely, a stolen assertion or service ticket would only be usable for the life of the ticket or assertion, and only so long as the Horizon service itself is still compromised.

The Keystone default token duration has been shortened to one hour, and a corresponding Service ticket policy would make sense: limit Kerberos Service tickets or SAML assertions to access Horizon to one hour.

I’m digging into X509 client certificates as well, but there does not seem to be as clear a story there. I think the short of it is “theoretically possible” but may be impractical.

Kerberos, Keystone Client, and S4U2Proxy

Since my eventual goal is to Kerberize Horizon, my next step after getting a CGI solution working was to make use of the Keystone client. Since the Kerberos auth plugin is still a work-in-progress, it required a little tweaking, but not all that much.

My basic strategy was to get a test script working, and to follow that up with a simple WSGI app. In order to get the test script, I started with Jose’s incipient patch for a Kerberos Auth plugin. However, this does not quite line up with how the server currently views Kerberos. A bit of a digression is probably in order….

The V3 Auth API requires that a request state the “methods” used, so that multi-factor authentication is possible. Since each factor often requires separate data, there is a section of the request for each of the methods to provide something custom. The standard plugin is the Password one, and for that, a request would have this fragment:

"identity": {
    "methods": [
        "password"
    ],
    "password": {
        "user": {
            "domain": {
                "name": "Default"
            },
            "name": "admin",
            "password": "freeipa4all"
        }
    }
}
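To make the per-method layout concrete, here is a minimal Python sketch that assembles the "identity" section of a V3 auth request. The helper name build_identity is hypothetical and is not part of keystoneclient; it only illustrates that every method is listed under "methods" and that a method with data of its own gets a section named after it.

```python
import json


def build_identity(methods):
    """Build the "identity" section of a V3 auth request.

    `methods` maps a method name to its method-specific data. A method
    that needs no extra data (an empty dict) is listed under "methods"
    but gets no section of its own.
    build_identity is a hypothetical helper, not a keystoneclient API.
    """
    identity = {"methods": list(methods)}
    for name, data in methods.items():
        if data:
            identity[name] = data
    return {"identity": identity}


password_auth = build_identity({
    "password": {
        "user": {
            "domain": {"name": "Default"},
            "name": "admin",
            "password": "freeipa4all",
        }
    }
})

print(json.dumps(password_auth, indent=4))
```

A second factor would simply be another key in the dictionary passed in, with its own data section next to "password".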

For Kerberos, Jose assumed that we would want the same, and made it work with a method named “kerberos”. I, personally, don’t want the overhead of changing the payload with redundant data. With Kerberos, the most important thing is that the web request be allowed to negotiate the mechanism used to authenticate. When using curl from the command line, this means passing the --negotiate flag. With the Python requests-kerberos library, it means passing this to the post request:

requests_auth = requests_kerberos.HTTPKerberosAuth(
            mutual_authentication=requests_kerberos.OPTIONAL)

On the server side, Kerberos is going to be handled by HTTPD. Yes, it is possible to do this in Eventlet, and Jose has submitted a patch for that, too. But in general, doing crypto from Python is a bad idea, especially when working with a single-threaded web server like Eventlet. However, if we do go with his server-side approach, I would like the client to be blissfully ignorant of the server side, and use the same plugin.

Up until recently, the accepted way to set up Keystone with Kerberos was to use the “external” method. This requires no specific data in the request although having it doesn’t hurt anything. Instead, if the environment variable ‘REMOTE_USER’ is specified, the Keystone Server Auth controller understands that the user has already been authenticated.

While we work out which way to go in Juno, I need something which works with the status quo. My Keystone server is a modified Packstack install, running out of the RPMs from the RDO equivalent to Icehouse. So I took his patch and made the following change:

diff --git a/keystoneclient/contrib/auth/v3/kerberos.py b/keystoneclient/contrib/auth/v3/kerberos.py
index b7f8545..2a0c4ae 100644
--- a/keystoneclient/contrib/auth/v3/kerberos.py
+++ b/keystoneclient/contrib/auth/v3/kerberos.py
@@ -29,7 +29,7 @@ class KerberosMethod(v3.AuthMethod):
                       **kwargs):
         request_kwargs['requests_auth'] = requests_kerberos.HTTPKerberosAuth(
             mutual_authentication=requests_kerberos.OPTIONAL)
-        return 'kerberos', {}
+        return 'external', {}

My script to test is relatively simple, and I’ve added it to Jose’s patch:

import os

from keystoneclient import session
from keystoneclient.contrib.auth.v3 import kerberos

# The auth URL is required; fail early with a clear message if it is missing.
try:
    OS_AUTH_URL = os.environ['OS_AUTH_URL']
except KeyError as e:
    raise SystemExit('%s environment variable not set.' % e.args[0])

# Optional: path to the CA certificate file.
OS_CACERT = os.environ.get('OS_CACERT')
kerb_auth = kerberos.Kerberos(OS_AUTH_URL)
sess = session.Session(kerb_auth, verify=OS_CACERT)
token = sess.get_token()
print(token)

Note that the machine I am running this from is not a registered ipa-client. If it were, I would not need to specify the CA Certificate file to use. Hence, that variable is optional. Before running it, I set:

OS_AUTH_URL=https://ayoungf20packstack.cloudlab.freeipa.org/keystone/krb/v3

Once I had that working, the next step was a WSGI App. In /var/www/cgi-bin/keystone/get_token.py I have

from keystoneclient import session
from keystoneclient.contrib.auth.v3 import kerberos 


def application(environ, start_response):
    status = '200 OK'
    
    OS_AUTH_URL = 'https://ayoungf20packstack.cloudlab.freeipa.org/keystone/krb/v3'
    kerb_auth = kerberos.Kerberos(OS_AUTH_URL)
    sess=session.Session(kerb_auth)
    token=sess.get_token()


    response_headers = [('Content-type', 'application/json'),
                        ('Content-Length', str(len(token)))]

    start_response(status, response_headers)

    return [token]
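Before wiring an app like this into HTTPD, it can help to exercise the WSGI plumbing on its own. The following is a sketch under assumptions: make_app and call_app are hypothetical helpers, and the lambda stands in for sess.get_token, so no Kerberos or Keystone is involved. It only shows the shape of the request/response cycle, using the standard library's wsgiref.

```python
from wsgiref.util import setup_testing_defaults


def make_app(get_token):
    """Return a WSGI app that responds with whatever get_token() returns.

    get_token is a stand-in (an assumption) for sess.get_token above.
    """
    def application(environ, start_response):
        body = get_token().encode("utf-8")  # WSGI bodies are bytes on Python 3
        start_response("200 OK", [
            ("Content-type", "application/json"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    return application


def call_app(app):
    """Invoke a WSGI app once; return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal fake request
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(make_app(lambda: "fake-token-id"))
print(status, body)
```

Swapping the lambda for the real sess.get_token gives the same app as above, so the Kerberos part can be debugged separately from the WSGI part.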

I added the configuration to an existing config file in /etc/httpd/conf.d:

WSGIScriptAlias /keystone/token /var/www/cgi-bin/keystone/get_token.py

WSGIDaemonProcess keystone_hello_wsgi user=fedora group=wheel maximum-requests=10000

<location>

  WSGIProcessGroup keystone_hello_wsgi
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  KrbConstrainedDelegation on
  Require valid-user
  NSSRequireSSL
</location>

And Now

WSGI App to get Token via S4U2Proxy

May 29, 2014

Testing S4U2Proxy

Yesterday I set up a S4U2Proxy configuration for HTTP to HTTP delegation. Today, I tested it.

I took Alexander’s approach to testing using CGI. Here’s my test page, that just fetches a token from Keystone using Curl:

#!/usr/bin/bash
OS_AUTH_URL=https://ayoungdevstack20.cloudlab.freeipa.org/keystone/krb
OS_PROJECT_NAME=demo

# Authentication happens via --negotiate, so no method data is sent in the body.
TOKEN=$(curl \
  -H "Content-Type:application/json" \
  --negotiate -u : \
  -d '{ "auth": { "identity": { "methods": []}, "scope": { "project": { "domain": { "name": "Default" }, "name": "'"$OS_PROJECT_NAME"'" } } } }' \
  -X POST "$OS_AUTH_URL/v3/auth/tokens")

echo "Content-type: application/json"
echo ""
echo "$TOKEN"
exit 0

I saved this in: /var/www/cgi-bin/s4u2test/kerberos-token-get.sh and created a configuration file for it in

/etc/httpd/conf.d/s4u2test.conf:

KrbConstrainedDelegationLock ipa

<Directory /var/www/cgi-bin/s4u2test/>
  WSGIProcessGroup keystone_krb_wsgi
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  KrbConstrainedDelegation on
  Require valid-user
</Directory>

Then hit from a web browser: GET https://ayoungdevstack20.cloudlab.freeipa.org/cgi-bin/s4u2test/kerberos-token-get.sh
which returned

{"token": {"methods": [], "roles": [{"id": "a18fd6adab1e4f238dd8da598615c3ce", "name": "Member"}, {"id": "9fe2ff9ee4384b1894a90878d3e92bab", "name": ....

To test it out, I tried a couple things. First, I performed a kinit as a couple different users, and those that did not have a role on the “demo” project get:

{"error": {"message": "User caspian has no access to project 5d15013cbebd4b1e95ad3b5785c866f7", "code": 401, "title": "Unauthorized"}}

When I comment out the line in /etc/httpd/conf.d/s4u2test.conf
# KrbConstrainedDelegation on

And restart the web server I get: Internal Server Error.
Reenable, and it works again.

S4U2Proxy for Horizon

I’ve got a packstack install, and a Kerberos-capable Keystone. Time to call it from Horizon. Time to set up S4U2Proxy.


I’ll be following the steps laid out by Alexander Bokovoy.
This is a work in progress, and I’ll update as I learn more.

First step is to enable Kerberos on the Horizon server, regardless of other login mechanism. I have /etc/httpd/conf.d/wsgi-horizon.conf with

    WSGIDaemonProcess horizon_ssl user=fedora group=wheel processes=3 threads=10 home=/opt/stack/horizon

    SetEnv APACHE_RUN_USER fedora
    SetEnv APACHE_RUN_GROUP wheel
    WSGIProcessGroup horizon

    DocumentRoot /opt/stack/horizon/.blackhole/
    Alias /media /opt/stack/horizon/openstack_dashboard/static
    Alias /static /opt/stack/horizon/static
    ErrorLog /var/log/httpd/horizon_error.log
    LogLevel warn

    CustomLog /var/log/httpd/horizon_access.log combined

    <Directory />
        Options FollowSymLinks
        AllowOverride None
    </directory>

    <directory>
        Options Indexes FollowSymLinks MultiViews
        Require all granted
        AllowOverride None
        Order allow,deny
        allow from all
    </directory>

  <location>
  WSGIProcessGroup horizon_ssl
  NSSRequireSSL
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
#this next one is needed for S4U2
# KrbConstrainedDelegation on
  Require valid-user
  </location>

WSGISocketPrefix /var/run/httpd

For my server cn=s4u2proxy,cn=etc,$SUFFIX is going to expand to: cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org

$ ldapsearch -Y GSSAPI -H ldap://ipa.cloudlab.freeipa.org -b  cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org  ""
SASL/GSSAPI authentication started
SASL username: ayoung@IPA.CLOUDLAB.FREEIPA.ORG
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base <cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org> with scope subtree
# filter: (objectclass=*)
# requesting:  
#

# s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org

# ipa-http-delegation, s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=ipa-http-delegation,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,d
 c=org

# ipa-ldap-delegation-targets, s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=f
 reeipa,dc=org

# ipa-cifs-delegation-targets, s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=ipa-cifs-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=f
 reeipa,dc=org

Now, in Alexander’s article he delegates to LDAP. But Keystone is an HTTP server. There is no entry yet for an HTTP delegation target.

Let’s look at the LDAP one:

$ ldapsearch -Y GSSAPI -H ldap://ipa.cloudlab.freeipa.org -b  cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org  "(cn=ipa-ldap-delegation-targets)"
SASL/GSSAPI authentication started
SASL username: ayoung@IPA.CLOUDLAB.FREEIPA.ORG
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base <cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org> with scope subtree
# filter: (cn=ipa-ldap-delegation-targets)
# requesting: ALL
#

# ipa-ldap-delegation-targets, s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=f
 reeipa,dc=org
objectClass: groupOfPrincipals
objectClass: top
cn: ipa-ldap-delegation-targets
memberPrincipal: ldap/ipa.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG

I’m going to make a rule specific to Keystone, not a general HTTP-to-HTTP delegation, so the first LDIF I need should look like this:

dn: cn=ipa-keystone-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
objectClass: groupOfPrincipals
objectClass: top
cn: ipa-keystone-delegation-targets
memberPrincipal: HTTP/ayoungf20packstack.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG

It means that something can get a service ticket to ayoungf20packstack on behalf of another user. What that something is will be defined by another rule. Let’s look at an existing one:

$ ldapsearch -Y GSSAPI -H ldap://ipa.cloudlab.freeipa.org -b  cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org  \
"(cn=ipa-http-delegation)"
SASL/GSSAPI authentication started
SASL username: ayoung@IPA.CLOUDLAB.FREEIPA.ORG
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base <cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org> with scope subtree
# filter: (cn=ipa-http-delegation)
# requesting: ALL
#

# ipa-http-delegation, s4u2proxy, etc, ipa.cloudlab.freeipa.org
dn: cn=ipa-http-delegation,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
objectClass: ipaKrb5DelegationACL
objectClass: groupOfPrincipals
objectClass: top
cn: ipa-http-delegation
memberPrincipal: HTTP/ipa.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG
ipaAllowedTarget: cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc
 =cloudlab,dc=freeipa,dc=org
ipaAllowedTarget: cn=ipa-cifs-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc
 =cloudlab,dc=freeipa,dc=org

Now, I can’t reuse this rule: it would give Horizon the ability to talk directly to LDAP as the user that logged in, and that is WAY too much power. I’m going to create a new delegation rule.

dn: cn=ipa-keystone-delegation,cn=s4u2proxy,cn=etc,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
objectClass: ipaKrb5DelegationACL
objectClass: groupOfPrincipals
objectClass: top
cn: ipa-keystone-delegation
memberPrincipal: HTTP/ayoungdevstack20.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG
ipaAllowedTarget: cn=ipa-keystone-delegation-targets,cn=s4u2proxy,cn=etc,dc=ipa,dc
 =cloudlab,dc=freeipa,dc=org
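The relationship between the two kinds of entries can be sketched in a few lines of Python. This is a toy model of the data only, under the assumption that a delegation is allowed when the proxy is a memberPrincipal of some rule and the target is a memberPrincipal of one of that rule's ipaAllowedTarget groups. It is not how the KDC evaluates the rules, and may_delegate is a hypothetical name.

```python
# Target groups: name -> service principals that may be delegated to.
TARGET_GROUPS = {
    "ipa-keystone-delegation-targets": {
        "HTTP/ayoungf20packstack.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG",
    },
    "ipa-ldap-delegation-targets": {
        "ldap/ipa.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG",
    },
}

# Delegation rules: name -> (proxy principals, allowed target group names).
RULES = {
    "ipa-keystone-delegation": (
        {"HTTP/ayoungdevstack20.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG"},
        {"ipa-keystone-delegation-targets"},
    ),
}


def may_delegate(proxy, target, rules=RULES, target_groups=TARGET_GROUPS):
    """True if some rule lets `proxy` obtain tickets for `target`."""
    for members, allowed_groups in rules.values():
        if proxy not in members:
            continue
        for group in allowed_groups:
            if target in target_groups.get(group, set()):
                return True
    return False


HORIZON = "HTTP/ayoungdevstack20.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG"
KEYSTONE = "HTTP/ayoungf20packstack.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG"
LDAP = "ldap/ipa.cloudlab.freeipa.org@IPA.CLOUDLAB.FREEIPA.ORG"

print(may_delegate(HORIZON, KEYSTONE))  # the new rule covers this pair
print(may_delegate(HORIZON, LDAP))      # LDAP is not in the allowed targets
```

The point of the separate rule shows up in the second call: the Horizon principal can reach Keystone but gets nothing for LDAP.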

This may be working. It may not. All I know now is I can still log in with a userid and password, but only if I’ve performed a kinit. The next step is to convert Horizon to use Kerberos to talk to Keystone. That means that Horizon has to make use of a Kerberos Auth plugin. Will this work? Tune in next time and find out.

Running the FreeIPA CLI from a non-client machine

A developer does things that are at odds with a production deployment. Case in point: the FreeIPA CLI assumes that it is run on an ipa-client machine. But as a developer, I need to talk to remote FreeIPA servers. Here’s how to make the CLI work without performing a client install.

1. Set up Kerberos
edit /etc/krb5.conf (or put into an alternative)

[realms]
 IPA.CLOUDLAB.FREEIPA.ORG = {
  kdc = ipa.cloudlab.freeipa.org:88
  master_kdc = ipa.cloudlab.freeipa.org:88
  admin_server = ipa.cloudlab.freeipa.org:749
  default_domain = cloudlab.freeipa.org
  pkinit_anchors = FILE:/etc/ipa/ca.crt
 }

[domain_realm]
 .cloudlab.freeipa.org = IPA.CLOUDLAB.FREEIPA.ORG
 cloudlab.freeipa.org = IPA.CLOUDLAB.FREEIPA.ORG

If you do not have write access to /etc, you can copy the remote one over

scp ipa.cloudlab.freeipa.org:/etc/krb5.conf .

export KRB5_CONFIG=$PWD/krb5.conf

2. Copy over the ca.crt and install into the NSS database

scp ipa.cloudlab.freeipa.org:/etc/ipa/ca.crt .
sudo certutil -A -n 'IPA CA' -d /etc/pki/nssdb -t CT,, -a -i ca.crt

Note that I had done this before, and needed to remove the old IPA CA cert with:

sudo certutil -D -n 'IPA CA' -d /etc/pki/nssdb

There doesn’t appear to be an alternative. While NSS_DEFAULT_DB_TYPE is a standard environment variable, there does not seem to be a NSS_DEFAULT_DB variable.

3. Fetch the FreeIPA config file

 scp ipa.cloudlab.freeipa.org:/etc/ipa/default.conf ./ipa.conf

4. Run the ipa command, indicating that you should use an alternative configuration file. Use a fully qualified path or you get a nasty error.

  $ ipa -c $PWD/ipa.conf user-show ayoung
  User login: ayoung
  First name: Adam
  Last name: Young
  Home directory: /home/ayoung
  Login shell: /bin/sh
  Email address: ayoung@redhat.com
  UID: 1387400001
  GID: 1387400001
  Account disabled: False
  Password: True
  Member of groups: admins, ipausers, osprey, eagle, hawk, wheel
  Kerberos keys available: True
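If you script this, the fully qualified path requirement from step 4 is easy to get wrong. Here is a small Python sketch, assuming a hypothetical helper named ipa_command, that always hands the CLI an absolute path for the alternative configuration file.

```python
import os


def ipa_command(config_path, *args):
    """Build an argv list for the ipa CLI with an alternative config file.

    The path is made absolute because a relative path makes the CLI fail.
    ipa_command is a hypothetical helper, not part of FreeIPA.
    """
    return ["ipa", "-c", os.path.abspath(config_path)] + list(args)


cmd = ipa_command("ipa.conf", "user-show", "ayoung")
print(" ".join(cmd))
```

On a machine where the ipa command is installed, the resulting list can be passed to subprocess.check_call.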

May 28, 2014

Keeping DHCP from changing the Nameserver

I’m running FreeIPA in an OpenStack lab. I don’t control the DHCP server. When a host renews its lease, the dhclient code overwrites the nameserver values in /etc/resolv.conf. To avoid this, I modified /etc/dhcp/dhclient.conf

interface "eth0" {
 prepend domain-name-servers 192.168.187.12;
}

This makes sure my custom nameserver stays at the top of the list. It’s a small hack that is perfect for developer work.
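The effect of prepend can be modeled in a few lines of Python. This is only an illustration of the ordering, not dhclient's code: the two lease-supplied addresses are made up (taken from the documentation range 192.0.2.0/24), and both function names are hypothetical.

```python
def apply_prepend(prepended, from_lease):
    """Model `prepend`: configured values go in front of the lease's values."""
    return list(prepended) + list(from_lease)


def resolv_conf_lines(servers):
    """Render the nameserver lines as they would appear in resolv.conf."""
    return ["nameserver %s" % server for server in servers]


lease_servers = ["192.0.2.1", "192.0.2.2"]  # made-up values from the DHCP lease
servers = apply_prepend(["192.168.187.12"], lease_servers)
print("\n".join(resolv_conf_lines(servers)))
```

However often the lease is renewed, the custom server is written first, so the resolver tries it before the ones the DHCP server hands out.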

TGT Forwarding and cleanup

Kerberos provides single sign-on. However, if you don’t take care, you will end up having to do a kinit on a remote machine. Not a big deal, but the TGT on the remote machine will not necessarily be cleaned up when you log out.

If you are used to ssh Key forwarding, you might be pleased to know that Kerberos provides a similar option: ssh -K will forward your ticket when you ssh in to the remote machine, and, more importantly, will remove it and related service tickets from the credentials cache when you log out.

[ayoung@ayoung530 stack]$ ssh -K ipa.cloudlab.freeipa.org
Last login: Wed May 28 17:23:29 2014 from vpn-60-114.rdu2.redhat.com
-sh-4.2$ klist
Ticket cache: FILE:/tmp/krb5cc_1387400001_Z9Vo1UZcbc
Default principal: ayoung@IPA.CLOUDLAB.FREEIPA.ORG

Valid starting       Expires              Service principal
05/28/2014 17:23:55  05/29/2014 17:05:19  krbtgt/IPA.CLOUDLAB.FREEIPA.ORG@IPA.CLOUDLAB.FREEIPA.ORG
-sh-4.2$ exit
logout
Connection to ipa.cloudlab.freeipa.org closed.
[ayoung@ayoung530 stack]$ ssh ipa.cloudlab.freeipa.org
Last login: Wed May 28 17:23:57 2014 from vpn-60-114.rdu2.redhat.com
-sh-4.2$ klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_1387400001)
-sh-4.2$ 

This behavior can be set as the default with the ssh config option: GSSAPIDelegateCredentials

[ayoung@ayoung530 stack]$ cat ~/.ssh/config 

Host *.cloudlab.freeipa.org
     GSSAPIDelegateCredentials  yes

[ayoung@ayoung530 stack]$ ssh ipa.cloudlab.freeipa.org
Last login: Wed May 28 17:19:20 2014 from vpn-60-114.rdu2.redhat.com
-sh-4.2$ klist
Ticket cache: FILE:/tmp/krb5cc_1387400001_Ca685gumJQ
Default principal: ayoung@IPA.CLOUDLAB.FREEIPA.ORG

Valid starting       Expires              Service principal
05/28/2014 17:21:14  05/29/2014 17:05:19  krbtgt/IPA.CLOUDLAB.FREEIPA.ORG@IPA.CLOUDLAB.FREEIPA.ORG
-sh-4.2$

More entropy with haveged

When a system’s entropy pool is depleted, reads from /dev/random will block. For applications that require lots of entropy, in environments where little entropy is available, long delays can result.

A side-note: on Linux, information about the amount of entropy available can be found under /proc/sys/kernel/random/, along with other parameters of the kernel entropy device and a UUID source. Be aware that other systems may not have this interface.
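As a quick way to watch the pool, here is a minimal Python sketch that reads the kernel's estimate from entropy_avail in that directory. The function name read_entropy_avail is my own, and it returns None on systems that lack the interface.

```python
import os

RANDOM_DIR = "/proc/sys/kernel/random"


def read_entropy_avail(path=os.path.join(RANDOM_DIR, "entropy_avail")):
    """Return the kernel's entropy estimate in bits, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Not Linux, no /proc, or unexpected content.
        return None


bits = read_entropy_avail()
print("entropy_avail:", bits)
```

Running this in a loop before and after starting an entropy daemon gives a rough picture of how quickly the pool is refilled.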

So if you are running out of entropy, what can you do? The haveged program exists to remedy this problem. It implements a variant of the HAVEGE (HArdware Volatile Entropy Gathering and Expansion) algorithm. In brief, HAVEGE leverages the fact that modern processors have thousands of bits of volatile internal state that affect how long it takes to execute particular routines. The nondeterminism in the time taken to execute a particular routine, also known as flutter, can be determined by reading the hardware clock counter. Using this entropy to seed a PRNG, HAVEGE can provide orders of magnitude more entropy than the standard Linux entropy device.
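The flutter is easy to observe even from a high-level language. The sketch below times the same tiny workload over and over with time.perf_counter_ns and counts how many distinct durations show up. To be clear, this is not the HAVEGE algorithm and must not be used as a random source; it only illustrates that identical work does not take identical time.

```python
import time


def timing_deltas(samples=1000):
    """Time a fixed, tiny workload repeatedly; return the durations in ns."""
    deltas = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        total = 0
        for i in range(100):  # the same work on every pass
            total += i
        deltas.append(time.perf_counter_ns() - start)
    return deltas


deltas = timing_deltas()
print("distinct durations:", len(set(deltas)), "of", len(deltas))
```

The exact numbers depend on the machine, the clock resolution and whatever else is running, which is precisely the nondeterminism being described.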

Let’s install haveged and see it in action:

sudo yum install -y haveged
sudo systemctl start haveged.service

That’s all there is to it. This runs /usr/sbin/haveged -w 1024 -v 1 --Foreground. The -w argument specifies the write wakeup threshold. When /dev/random has fewer than this many bits of entropy available, processes writing to the entropy pool are awakened. haveged wakes up, produces some entropy and feeds it to Linux for other applications to use.

The availability and quality of entropy can be tested using the rngtest tool, available in the rng-tools package. Compare running cat /dev/random | rngtest -c 1000 both with and without haveged working to feed /dev/random. You should find that haveged does a good job of ensuring ample entropy is available for programs.
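rngtest checks its input in 20,000-bit blocks using the FIPS 140-2 statistical tests. To give a feel for what such a test does, here is a Python sketch of just one of them, the monobit test. The bounds (more than 9725 and fewer than 10275 one-bits per block) are the FIPS 140-2 values as I recall them, so treat them as an assumption; the real tool applies several more tests than this one.

```python
import os

BLOCK_BYTES = 2500  # 20,000 bits per block


def monobit_ok(block):
    """Monobit test: the count of one-bits must be close to half the block."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("block must be exactly %d bytes" % BLOCK_BYTES)
    ones = sum(bin(byte).count("1") for byte in block)
    return 9725 < ones < 10275


print(monobit_ok(bytes(BLOCK_BYTES)))        # all zero bits: fails
print(monobit_ok(b"\xaa" * BLOCK_BYTES))     # exactly half ones: passes
print(monobit_ok(os.urandom(BLOCK_BYTES)))   # kernel randomness: almost always passes
```

Note that the alternating pattern passes even though it is perfectly predictable, which is why a single test is never enough.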

Another solution to low entropy on Linux is rngd, which works similarly to haveged but reads entropy from hardware RNGs. Of course, you need a hardware RNG for rngd to be effective. The default location for a hardware RNG is /dev/hwrandom; rngd uses this device by default but can be configured to use any device that provides the Linux /dev/random ioctl API. Some Linux distributions (including recent releases of Fedora) ship with rngd enabled by default.

Let it again be noted that the entropy devices provided by other operating systems may (read: do) operate differently from the Linux entropy device, and some have native support for hardware RNGs when present, so while the approach to entropy replenishment shared by haveged and rngd works well for Linux, it may be incorrect or simply unnecessary for other systems.

May 20, 2014

Kerberizing Keystone in HTTPD

Configuring Kerberos as the authentication mechanism for Keystone is not much different than Kerberizing any other Web application. The general steps are:

  1. Configure Keystone to Run with an LDAP backend
  2. Configure Keystone to Run in Apache HTTPD
  3. Register the Keystone server as a Kerberos client (I use FreeIPA)
  4. Establish a Kerberized URL for $OS_AUTH_URL

CLARIFICATION: In this configuration, Keystone uses Kerberos to authenticate, and then uses the LDAP identity backend, mapping REMOTE_USER to a user_id via the external auth plugin.

Instructions for much of this have been written on the RDO site:

Note that I use a fairly minimal LDAP configuration. I use the same attribute for user id and user name. The IPA server allows anonymous browsing, which is read only. For the rest, I take the defaults.
I do not recommend putting Assignment data in the LDAP backend. For my setup, I put assignments in SQL.

I only set the following values in the LDAP section of my keystone.conf

[identity]
driver = keystone.identity.backends.ldap.Identity
#many lines removed

[LDAP]
url=ldap://ipa.cloudlab.freeipa.org
user_tree_dn=cn=users,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
user_id_attribute=uid
user_name_attribute=uid
group_tree_dn=cn=groups,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org

[assignment]
driver = keystone.assignment.backends.sql.Assignment
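A quick sanity check of a fragment like this can be scripted. The sketch below parses the same settings with Python's standard configparser and confirms the split described above: identity in LDAP, assignments in SQL, and one attribute used for both user id and user name. Keystone itself reads its configuration through its own machinery, so this is only an external check, and the variable names are my own.

```python
import configparser

KEYSTONE_CONF_FRAGMENT = """
[identity]
driver = keystone.identity.backends.ldap.Identity

[LDAP]
url=ldap://ipa.cloudlab.freeipa.org
user_tree_dn=cn=users,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
user_id_attribute=uid
user_name_attribute=uid
group_tree_dn=cn=groups,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org

[assignment]
driver = keystone.assignment.backends.sql.Assignment
"""

# interpolation=None keeps any '%' characters in values from being interpreted.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(KEYSTONE_CONF_FRAGMENT)

identity_in_ldap = ".ldap." in parser["identity"]["driver"]
assignment_in_sql = ".sql." in parser["assignment"]["driver"]
same_attribute = (parser["LDAP"]["user_id_attribute"]
                  == parser["LDAP"]["user_name_attribute"])

print(identity_in_ldap, assignment_in_sql, same_attribute)
```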

Note that Kerberos without SSL is subject to replay attacks. You should configure the HTTPD server to run in NSS. I’ve laid out the steps to do that. In addition, you should use a real CA for managing the certificates, but you get that if you use FreeIPA.

sudo yum install mod_auth_kerb mod_nss

edit /etc/httpd/conf.d/nss.conf as I wrote about before:

Change Listen 8443 to
Listen 443

And <VirtualHost _default_:8443>
to
<VirtualHost _default_:443>

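When you install mod_nss, it puts a self-signed certificate into the NSS database used by the HTTPD server. Get rid of the old cert before requesting a new one. The command below lists the existing certificate; running it with -D in place of -L deletes it.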

certutil -d /etc/httpd/alias/ -L -n Server-Cert

How to get a certificate with Certmonger and FreeIPA

sudo ipa-getcert request -d /etc/httpd/alias -n Server-Cert -K HTTP/ayoungdevstack20.cloudlab.freeipa.org -N 'CN=ayoungdevstack20.cloudlab.freeipa.org,O=cloudlab.freeipa.org'

Make sure Apache can read the Keystone conf file. Since this contains a password (MySQL), it should not be world readable.

sudo chgrp -R apache /etc/keystone/
sudo chmod g+rx /etc/keystone/keystone.conf

edit wsgi-keystone.conf and add:

WSGIScriptAlias /keystone/krb  /var/www/cgi-bin/keystone/main

<location>
  WSGIProcessGroup keystone_wsgi
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  Require valid-user
  NSSRequireSSL
</location>

Request a Keytab

[fedora@ayoungdevstack20 conf.d]$ ipa-getkeytab -s ipa.cloudlab.freeipa.org -p HTTP/ayoungdevstack20.cloudlab.freeipa.org -k ~/openstack.keytab
Keytab successfully retrieved and stored in: /home/fedora/openstack.keytab
[fedora@ayoungdevstack20 conf.d]$ sudo mv /home/fedora/openstack.keytab /etc/httpd/conf

Apache needs to be able to read the keytab, but no one else should.

[fedora@ayoungdevstack20 conf.d]$ sudo chown apache /etc/httpd/conf/openstack.keytab
[fedora@ayoungdevstack20 conf.d]$ sudo chmod 0400 /etc/httpd/conf/openstack.keytab

Restart HTTPD before testing.

Hit from a browser:

https://ayoungdevstack20.cloudlab.freeipa.org/keystone/krb

On the client. Fetch the CA cert for NSS.

wget http://ipa.cloudlab.freeipa.org/ipa/config/ca.crt

Test with curl:

curl --cacert ca.crt   --negotiate -u :    https://ayoungdevstack20.cloudlab.freeipa.org/keystone/krb

SSL/TLS Trends

My friend Hubert has started compiling statistics of Alexa’s top 1 million websites. Specifically, he’s looking at their SSL/TLS settings and attempting to show trends in the world that is port 443. He recently released his May numbers showing a slowly but mostly improving security environment. I’m hoping he’ll be able to chart these trends in a way that makes it easier for people to consume the data and dynamically look for the data they are interested in. I guess we’ll have to wait and see what comes about. Until then I believe he’ll continue to post his monthly numbers on the Fedora Security List.


Operational Integrity

Introducing Operational Integrity

Let’s take the next step in integrity and look at the integrity of running systems over the life of an application service – operational integrity.

So far we have talked about applications as software. An application service is the application in use. Instead of thinking of it as software, we look at it from a user perspective as a set of capabilities that can be applied to business problems. An application service is the user interacting with the software, data, hardware and network infrastructure, storage, and everything required to allow a user to effectively use the application.

As an example of the difference between application and application service, consider a user of an MRP application running in a remote datacenter when a local router fails. In this case, the application is still running – however, the application service is no longer available to the user. The user doesn’t care why they can’t access the application or that other people are still running; all that matters is that the application service is not available to them!

Operational Integrity has three main components: Availability of application services, integrity of application services, and operations management.

Availability of Application Services

The first element is that the user can access the application services – that they have access to the applications and data and can use them to perform business tasks. Availability also includes performance – a system that takes a minute to respond to user input instead of a fraction of a second will destroy productivity.

Thus, any definition of availability must include a response time metric. It is worth noting that, from a user perspective, the only truly acceptable response time is zero… Studies have shown increases in user productivity from reductions in response time, even when the response time goes below a tenth of a second.
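One way to fold response time into an availability number is to count a request as served only if it succeeded and came back within the target time. The Python sketch below does exactly that; the function name, the sample data and the one-second target are all illustrative assumptions, not figures from any study.

```python
def effective_availability(requests, max_response_seconds):
    """Fraction of requests that succeeded AND met the response-time target.

    `requests` is a list of (succeeded, response_seconds) tuples.
    """
    if not requests:
        raise ValueError("no requests observed")
    good = sum(1 for ok, seconds in requests
               if ok and seconds <= max_response_seconds)
    return good / len(requests)


# Made-up sample: three successes (one of them very slow) and one failure.
observed = [(True, 0.2), (True, 0.4), (True, 61.0), (False, 0.1)]

print(effective_availability(observed, max_response_seconds=1.0))    # 0.5
print(effective_availability(observed, max_response_seconds=100.0))  # 0.75
```

Measured by uptime alone this service answered three requests out of four; measured with a one-second target it served only half of them.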

Integrity of Application Services

Integrity of application services means that you can trust the results and that information is not compromised – either by denying access to people who should have access or by allowing access to people who shouldn’t.

Integrity of application services also includes resilience – maintaining correct operation even in the presence of attacks, system failures, or human error. Experience shows that human error tends to be the greatest challenge…

Operations Management

Operations management means maintaining the quality and integrity of application services over the life of the application. Basically, this means “install once, run for years”.

Once an application is installed, people start using it. Then more people start using it, and it slows down. Ongoing system monitoring and tuning is required. More processing power, memory and storage have to be added. The application – in fact the entire infrastructure – has to be patched for security issues and bugs. Software upgrades come out and must be installed. An enterprise application may have a life of 10-15 years – or longer! Hardware must be upgraded and replaced. New technologies must be incorporated, such as cloud computing or SSD based storage. New versions of the software come out, with new features, new requirements, and new bugs. New modules become available. The application must be integrated with other applications. Problems must be solved. Backups must be done. People must be trained.

All in all, maintaining and managing an application service over a 10-15 year lifespan is a much larger job than simply installing an application and checking initial integrity.


May 19, 2014

Docker build context and symbolic links

Docker is an application container system for Linux. Under the hood it’s like FreeBSD jails, but on top of that it provides powerful image specification and indexing capabilities. Images are built up in layers; each image depends on some other image (down to a base image), so a particular image might be a small delta on some shared dependency.

In investigating ways to Dockerize FreeIPA, particularly for ease of sharing development builds, it makes sense to base builds on an image that contains all the build dependencies. Since the dependencies are the same for any build, they can be made available in a single image to shave down the build time and reduce the size of the final images.

So there will be one Dockerfile for the builddeps image. But we will need another Dockerfile for the build itself, which will depend on the builddeps image. Each Dockerfile must reside in its own directory – there is no facility for specifying a different filename, e.g. Dockerfile.builddep – yet each Dockerfile needs to access some files in the root of the repository, e.g. freeipa.spec.in.

My initial approach is to have the builddep Dockerfile live in the repository at docker/freeipa-builddep/Dockerfile. The file consists of instructions specifying the image to build the new image FROM, files to ADD into the image, and commands to RUN. The initial Dockerfile is:

FROM fedora:20
ADD ../../freeipa.spec.in freeipa.spec.in
RUN cp freeipa.spec.in freeipa-builddep.spec
RUN yum-builddep freeipa-builddep.spec

Let’s attempt to build the image:

% sudo docker build .
Uploading context 3.072 kB
Uploading context
Step 0 : FROM fedora:20
 ---> b7de3133ff98
Step 1 : ADD ../../freeipa.spec.in freeipa.spec.in
2014/05/19 16:34:21 ../../freeipa.spec.in: no such file or directory

The context of a build is the contents of the directory containing the Dockerfile. Attempting to reference files outside the context fails. The ADD instruction documentation does kindly mention this.

We certainly don’t want multiple copies of freeipa.spec.in floating around, so perhaps we can use a symbolic link. The Dockerfile now reads:

FROM fedora:20
ADD freeipa.spec.in freeipa.spec.in
RUN cp freeipa.spec.in freeipa-builddep.spec
RUN yum-builddep freeipa-builddep.spec

Creating the symlink and trying the build again:

% cd docker/freeipa-builddep/
% ln -s ../../freeipa.spec.in
% docker build .
Uploading context 3.072 kB
Uploading context
Step 0 : FROM fedora:20
 ---> b7de3133ff98
Step 1 : ADD freeipa.spec.in freeipa.spec.in
2014/05/19 16:45:06 freeipa.spec.in: no such file or directory

Docker really does not like symlinks.
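Both failures fit one rule: whatever the ADD source resolves to has to live inside the build context. The Python sketch below models that rule by recreating the repository layout in a scratch directory and checking where each path really ends up. It is a model of the behaviour seen above, not Docker's implementation; inside_context and the name link.spec.in are hypothetical, and it assumes a system that supports symbolic links.

```python
import os
import tempfile


def inside_context(context_dir, path):
    """True if `path`, taken relative to the context, resolves inside it."""
    root = os.path.realpath(context_dir)
    target = os.path.realpath(os.path.join(root, path))  # follows symlinks
    return os.path.commonpath([root, target]) == root


# Recreate the layout from this post in a scratch directory.
repo = tempfile.mkdtemp()
context = os.path.join(repo, "docker", "freeipa-builddep")
os.makedirs(context)
open(os.path.join(repo, "freeipa.spec.in"), "w").close()
open(os.path.join(context, "Dockerfile"), "w").close()
os.symlink("../../freeipa.spec.in", os.path.join(context, "link.spec.in"))

print(inside_context(context, "../../freeipa.spec.in"))  # False
print(inside_context(context, "link.spec.in"))           # False
print(inside_context(context, "Dockerfile"))             # True
```

Seen this way the symlink changes nothing: the link itself sits in the context, but the file it points to does not.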

I’m not sure how to proceed from here, and will be seeking feedback from the other FreeIPA developers since the other options are either intrusive (different Dockerfile = different branch) or hacky (e.g. pulling things in from URLs, or multiple copies of spec file in repository). Perhaps I am overlooking a nice solution, or perhaps one will come about soon given that Docker is still under heavy development.

Stay tuned.

May 14, 2014

Application and System Integrity

We have defined integrity as one of the three pillars of IT. Now let’s define what we mean by application and system integrity:

  • The application returns the expected results.
  • Applications and data are available to authorized users.
    • Access and modification based on authorization.
  • Applications and data are not available to unauthorized users.
    • Access and modification prevented.
  • Systems and applications have not been modified in an unapproved way.
    • All modifications and attempts are recorded and reported.
  • Systems and applications can be verified and validated.
  • The system is resilient.

An interesting list – we need to break it down in more detail:

The application returns the expected results.

This is really the core of everything. If we can’t rely on the application to return the expected results then nothing else matters. Note that we aren’t saying “correct results” – there may be a variety of reasons for a system to return “incorrect results”, such as rule or procedure changes, logic errors, or even using work-arounds to force a system to do what is desired. Expected results means that the system is behaving predictably.

Applications and data are available to authorized users.

Another key element of system integrity is that authorized users can always get to their applications and data.

Access and modification based on authorization.

A key part of integrity – as well as security – is that users must be authorized to access and modify data. This includes the initial setup of access controls as well as maintenance over time. For example, if a user who has access to 20 different applications leaves the company, how are the systems updated? In general, a domain-based authorization and authentication approach such as Red Hat Identity Manager or Microsoft Active Directory is strongly recommended.

Applications and data are not available to unauthorized users.

Giving access to everyone is often easy. Restricting access to only the people who are authorized to have access can be more difficult. To maintain system integrity, unauthorized users must not have access to data or applications.

Unauthorized Access and modification prevented.

The system must prevent unauthorized access to information and especially must prevent unauthorized modification of information. Doing this involves work in multiple areas, including technology, configuration, infrastructure, and policies and procedures.

Managing access to applications and data typically involves authentication and access control – often a combination of individual user access and Role Based Access Controls (RBAC).

Systems and applications have not been modified in an unapproved way.

Now we’re clearly getting into the domain of security. There are two things to keep in mind here: First, systems must be maintained. It is not acceptable to allow systems to “just work” – they must be actively maintained and monitored. Second, you must ensure that no unauthorized or unapproved modifications to the system occur.

All modifications and attempts are recorded and reported.

Key for both operations management and security is that all attempts to change a system must be noticed, recorded and reported. It is vital to track all changes made to a system.

Systems and applications can be verified and validated.

It isn’t enough for a system to work – you must be able to determine if the system has the correct software installed, that the software is properly configured, that the software is at the proper revision level, and that the software and configurations have not been modified.

Modifications can occur in multiple ways. A virus can infect a program. A hardware failure can corrupt the software, often in unusual ways. A patch can be applied incorrectly. No matter how changes occur, you must be able to verify that the software installed on the system is correct down to the bit level.
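
One common way to verify files down to the bit level is to compare a cryptographic digest of each file against a value recorded when the file was known to be good. A minimal Python sketch, assuming a hypothetical manifest that maps file paths to expected SHA-256 digests (the function names are illustrative):

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def modified_files(manifest):
    """Return the paths in `manifest` (path -> expected digest) whose
    current contents no longer match the recorded digest."""
    return [path for path, expected in manifest.items()
            if sha256_of(path) != expected]
```

The manifest itself has to be stored somewhere an attacker or a failing disk cannot alter it, otherwise the comparison proves nothing.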

The system is resilient.

Resiliency is interesting. It simply means that the system continues to function correctly in the face of degradation. You may see reductions in performance or the loss of some features, but the core capabilities of the system that generate business value must be available and produce the expected results.

System resiliency can be achieved in many ways. In future articles we will explore a variety of threats to a system and ways the system can continue to function. These threats range from hackers to hardware failure, natural disasters to user error – and even the actions of management!


Dogtag profile definitions

In the previous post I began an exploration of Dogtag’s certificate profiles feature by looking at the certificate request process and the relationship between PKCS #10 CSRs and Dogtag certificate enrolment requests, which are used to submit CSRs in the context of a profile. In this post we will look at how Dogtag profiles are defined and learn a little about how Dogtag uses them in the certificate enrolment process.

Each instance of Dogtag or Certificate Server starts out with a default set of profiles; these are found in the Dogtag instance directory in /var/lib/pki/<instance-name>/ca/profiles/ca/. There are dozens of profiles, but since we are already familiar with caServerCert let’s open up caServerCert.cfg and have a look:

desc=This certificate profile is for enrolling server certificates.
visible=true
enable=true
enableBy=admin
auth.class_id=
name=Manual Server Certificate Enrollment
input.list=i1,i2
input.i1.class_id=certReqInputImpl
input.i2.class_id=submitterInfoInputImpl
output.list=o1
output.o1.class_id=certOutputImpl
policyset.list=serverCertSet
policyset.serverCertSet.list=1,2,3,4,5,6,7,8
policyset.serverCertSet.1.constraint.class_id=subjectNameConstraintImpl
policyset.serverCertSet.1.constraint.name=Subject Name Constraint
policyset.serverCertSet.1.constraint.params.pattern=.*CN=.*
policyset.serverCertSet.1.constraint.params.accept=true
policyset.serverCertSet.1.default.class_id=userSubjectNameDefaultImpl
policyset.serverCertSet.1.default.name=Subject Name Default
policyset.serverCertSet.1.default.params.name=
policyset.serverCertSet.2.constraint.class_id=validityConstraintImpl
policyset.serverCertSet.2.constraint.name=Validity Constraint
policyset.serverCertSet.2.constraint.params.range=720
policyset.serverCertSet.2.constraint.params.notBeforeCheck=false
policyset.serverCertSet.2.constraint.params.notAfterCheck=false
policyset.serverCertSet.2.default.class_id=validityDefaultImpl
policyset.serverCertSet.2.default.name=Validity Default
policyset.serverCertSet.2.default.params.range=720
policyset.serverCertSet.2.default.params.startTime=0
policyset.serverCertSet.3.constraint.class_id=keyConstraintImpl
policyset.serverCertSet.3.constraint.name=Key Constraint
policyset.serverCertSet.3.constraint.params.keyType=-
policyset.serverCertSet.3.constraint.params.keyParameters=1024,2048,3072,4096,nistp256,nistp384,nistp521
policyset.serverCertSet.3.default.class_id=userKeyDefaultImpl
policyset.serverCertSet.3.default.name=Key Default
... (on it goes, through to policyset.serverCertSet.8.*)

There is an obvious relationship between the profile configuration, the certificate enrolment request template retrieved via pki cert-request-profile-show and the behaviour of the CA when submitting or approving enrolment requests. For example, there are two inputs: one for a certificate request (PKCS #10 CSR) and one for submitter information. These are the same two inputs we had to fill out in the XML certificate enrolment request template. And there are constraint declarations; again, we have observed the effects of these declarations when non-conformant enrolment requests were rejected.

Let’s break down the profile configuration. The top-level settings such as name, desc and enable are self-explanatory. Moving down, we see the input.list key specifying the list i1,i2, followed by keys input.i1.class_id and input.i2.class_id. This pattern of foo.list=f1,f2,.. followed by foo.f1..., foo.f2..., and so on also occurs further down for output and policyset, and seems to provide a simple, deterministic way to read ordered declarations from the profile configuration.
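
To make the foo.list pattern concrete, here is a small Python sketch that reads key=value lines and walks a list in its declared order. It is not Dogtag's own (Java) code, and the function names parse_profile and ordered_entries are invented for this illustration; the sample lines are taken from the configuration above.

```python
def parse_profile(text):
    """Parse `key=value` lines into a dict, skipping blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first "=" only, so values such as ".*CN=.*" survive.
        key, _, value = line.partition("=")
        settings[key] = value
    return settings


def ordered_entries(settings, prefix):
    """Follow the `<prefix>.list=a,b` pattern and return, in list order,
    (id, class_id) pairs read from `<prefix>.<id>.class_id`."""
    ids = [item for item in settings.get(prefix + ".list", "").split(",") if item]
    return [(item, settings.get(prefix + "." + item + ".class_id")) for item in ids]


sample = """\
input.list=i1,i2
input.i1.class_id=certReqInputImpl
input.i2.class_id=submitterInfoInputImpl
output.list=o1
output.o1.class_id=certOutputImpl
"""

settings = parse_profile(sample)
print(ordered_entries(settings, "input"))
print(ordered_entries(settings, "output"))
```

Because the order comes from the list key rather than from the position of lines in the file, the declarations can appear anywhere in the file and still be read back in a predictable sequence.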

The class_id key also occurs in the output and policy set contexts. To what does its value refer? The file /etc/pki/<instance-name>/ca/registry.cfg holds the answer, mapping the values in the profile configuration to Java classes. These classes implement interfaces relevant to their role in the profile system: IProfileInput, IProfileOutput, IPolicyConstraint for inputs, outputs and policy constraints, and IPolicyDefault and ICertInfoPolicyDefault for policy defaults.

Whilst inputs and outputs have no further configuration beyond the class_id, policy set constraints and defaults are parameterised, with each class offering named parameters that relate to its function. For example, subjectNameConstraintImpl has parameters pattern (a regular expression) and accept (boolean; I infer that this controls whether to accept or reject a CSR on match). When a profile is used, e.g. to generate an enrolment request template, submit an enrolment request, or to generate a certificate, Dogtag instantiates the classes according to the profile configuration and uses their behaviours to carry out the requested action – or to decide how or whether to carry it out.
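
As a sketch of how a pattern/accept pair could behave, the following Python function models my inference about accept. It is an assumption, not a reading of the Dogtag source: in particular, whether Dogtag requires the whole subject to match the pattern is a guess, and the subject strings in the usage lines are made up.

```python
import re


def subject_name_allowed(subject, pattern, accept=True):
    """Model of a pattern/accept constraint.

    Assumption: with accept=True the subject must match `pattern` in full;
    with accept=False a matching subject is rejected instead.
    """
    matched = re.fullmatch(pattern, subject) is not None
    return matched if accept else not matched


# Hypothetical subjects checked against the caServerCert pattern.
print(subject_name_allowed("CN=www.example.com, O=Example", ".*CN=.*"))
print(subject_name_allowed("O=Example", ".*CN=.*"))
```

Under this model a subject without a CN component fails the caServerCert constraint, which matches the rejection observed in the previous post.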

Armed with an understanding of how profiles are configured, let’s try and define a new profile. My first action was to simply copy caServerCert.cfg to caServerCertTest.cfg (ensuring the new file can be read by pkiuser). The name and desc values were changed and the subject name constraint pattern was updated to .*CN=test.* to make it easy to verify that the new profile is being used correctly. Let’s restart the server (the service name depends on the Dogtag instance name) and see if Dogtag has learned about the new profile:

$ sudo systemctl restart pki-tomcatd@pki-tomcat.service
$ pki cert-request-profile-show caServerCertTest
BadRequestException: Cannot provide enrollment template for profile `caServerCertTest`.  Profile not found

There must be more to configure. A thorough search turns up a few references to caServerCert in /etc/pki/<instance-name>/ca/CS.cfg:

...
profile.caServerCert.class_id=caEnrollImpl
profile.caServerCert.config=/var/lib/pki/<instance-name>/ca/profiles/ca/caServerCert.cfg
...
profile.list=caUserCert,caECUserCert,...,caServerCert,...
...

We have found what appears to be the canonical list of profiles and furthermore can see that the full path to the profile is configurable and that each profile specifies a class_id. The class_id values that can be used here appear in the same registry.cfg we learned about above. The classes referred to implement the IProfile interface.
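
Registering a profile therefore appears to need two kinds of change in CS.cfg: a pair of profile.<id>.* lines and a new entry in profile.list. A Python sketch of that edit on the text of the file follows. The function name is invented, the sample is a cut-down stand-in for the real file, and the assumption that the new lines should mirror the caServerCert ones is mine.

```python
def register_profile(cs_cfg_text, profile_id, config_path, class_id="caEnrollImpl"):
    """Return CS.cfg text with `profile_id` registered.

    Adds profile.<id>.class_id and profile.<id>.config lines and appends the
    id to the existing profile.list line. Raises ValueError if there is no
    profile.list line or the id is already listed.
    """
    lines = cs_cfg_text.splitlines()
    found = False
    for index, line in enumerate(lines):
        if line.startswith("profile.list="):
            ids = [item for item in line.split("=", 1)[1].split(",") if item]
            if profile_id in ids:
                raise ValueError(profile_id + " is already registered")
            lines[index] = "profile.list=" + ",".join(ids + [profile_id])
            found = True
            break
    if not found:
        raise ValueError("no profile.list line found")
    lines.append("profile." + profile_id + ".class_id=" + class_id)
    lines.append("profile." + profile_id + ".config=" + config_path)
    return "\n".join(lines) + "\n"


# Shortened sample; a real CS.cfg lists many more profiles.
sample = "profile.caServerCert.class_id=caEnrollImpl\nprofile.list=caUserCert,caServerCert\n"
print(register_profile(
    sample,
    "caServerCertTest",
    "/var/lib/pki/<instance-name>/ca/profiles/ca/caServerCertTest.cfg",
))
```

The function works on a string rather than on the file itself, so the result can be reviewed before anything under /etc/pki is overwritten.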

After adding the profile.caServerCertTest configuration, appending caServerCertTest to profile.list and restarting Dogtag again, we can finally use our new profile:

$ pki cert-request-profile-show caServerCertTest
--------------------------------------------------
Enrollment Template for Profile "caServerCertTest"
--------------------------------------------------
  Profile ID: caServerCertTest
  Renewal: false

  Input ID: i1
  Name: Certificate Request Input
  Class: certReqInputImpl

    Attribute Name: cert_request_type
    Attribute Description: Certificate Request Type
    Attribute Syntax: cert_request_type

    Attribute Name: cert_request
    Attribute Description: Certificate Request
    Attribute Syntax: cert_request

  Input ID: i2
  Name: Requestor Information
  Class: submitterInfoInputImpl

    Attribute Name: requestor_name
    Attribute Description: Requestor Name
    Attribute Syntax: string

    Attribute Name: requestor_email
    Attribute Description: Requestor Email
    Attribute Syntax: string

    Attribute Name: requestor_phone
    Attribute Description: Requestor Phone
    Attribute Syntax: string

Adding the --output <filename> argument to the above command downloads the certificate enrolment request template for our new caServerCertTest profile. Using it to submit a CSR with a subject common name (CN) not starting with "test" results in summary rejection as hoped, and submission succeeds when the CN does satisfy our constraint.

In the next post we’ll dive into some code to look at how inputs, constraints and defaults are actually implemented, and perhaps implement one or two of our own.

May 12, 2014

DAC_READ_SEARCH/DAC_OVERRIDE - common SELinux issue that people handle badly.
MYTH: ROOT is all powerful.

That root is all-powerful is a common misconception among administrators and users of Unix/Linux systems.  Many years ago the Linux kernel broke the power of root down into a series of capabilities.  Originally there was room for 32 capabilities, but that has since grown to 64.  Capabilities allow programmers to code applications in such a way that the ping command can create raw IP sockets, or httpd can bind to a port less than 1024, and then drop all of the other capabilities of root.

SELinux also controls access to all of the capabilities for a process.  A common bugzilla is for a process requiring the DAC_READ_SEARCH or DAC_OVERRIDE capability.  DAC stands for Discretionary Access Control, which means the standard Linux ownership/permission flags.  Let's look at the power of these capabilities.

more /usr/include/linux/capability.h
...
/* Override all DAC access, including ACL execute access if
   [_POSIX_ACL] is defined. Excluding DAC access covered by
   CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files
   and directories, including ACL restrictions if [_POSIX_ACL] is
   defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_READ_SEARCH  2


If you read the descriptions, they basically say that a process running as UID=0 with DAC_READ_SEARCH can read any file on the system, even if the permission flags would not allow a root process to read it.  Similarly, DAC_OVERRIDE means the process can ignore the permissions/ownership of all files on the system.  Usually when I see AVC messages that require this access, I take a look at the process UID, and almost always I see the process is running as uid=0.
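
To see which of these capabilities the kernel currently grants a process, you can decode the CapEff bitmask from /proc/<pid>/status. The Python sketch below does that using the bit numbers from the header above; the function names are made up for this example. It only reports the kernel's view of the process and says nothing about whether SELinux policy allows the capability to be used.

```python
CAP_DAC_OVERRIDE = 1      # bit numbers from linux/capability.h
CAP_DAC_READ_SEARCH = 2


def has_capability(mask_hex, cap):
    """Return True if bit `cap` is set in a hex capability mask such as the
    CapEff value shown in /proc/<pid>/status."""
    return bool(int(mask_hex, 16) & (1 << cap))


def effective_mask(status_text):
    """Extract the CapEff hex string from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CapEff line found")
```

For example, has_capability(effective_mask(open("/proc/self/status").read()), CAP_DAC_OVERRIDE) reports whether the current process holds DAC_OVERRIDE.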

What users often do when they see this access denial is to add the permission to SELinux policy, which is almost always wrong.  These AVCs indicate to me that you have permission flags too tight on a file, usually a config file.

Imagine the httpd process needs to read /var/lib/myweb/content, which is owned by the apache user and has permissions 600 set on it.

 ls -l /var/lib/myweb/content
-rw-------. 1 apache apache 0 May 12 13:50 /var/lib/myweb/content

If for some reason the httpd process needs to read this file while it is running as UID=0, the system will deny access and generate a DAC_* AVC.  A simple fix would be to change the permission on the file to be 644.

# chmod 644 /var/lib/myweb/content
# ls -l /var/lib/myweb/content
-rw-r--r--. 1 apache apache 0 May 12 13:50 /var/lib/myweb/content


Which would now allow a root process to read the file using the "other" permissions.

Another option would be to change the group to root and change the permissions to 640.

# chmod 640 /var/lib/myweb/content
# chgrp root /var/lib/myweb/content
# ls -l /var/lib/myweb/content
-rw-r-----. 1 apache root 0 May 12 13:50 /var/lib/myweb/content


Now root can read the file based on the group permissions, but others cannot read it.  You could also use ACLs to provide access.  Bottom line: this is probably not an SELinux issue, and not something you want to loosen SELinux security around.
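
The three cases above can be checked against a small model of the owner/group/other read test. This Python sketch deliberately ignores ACLs and assumes the process has neither DAC_OVERRIDE nor DAC_READ_SEARCH, which is the situation when SELinux denies them; the numeric uid/gid used for apache is only illustrative.

```python
import stat


def dac_allows_read(mode, owner_uid, owner_gid, uid, gids):
    """Classic owner/group/other read check, ignoring ACLs and ignoring
    CAP_DAC_OVERRIDE / CAP_DAC_READ_SEARCH."""
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


APACHE = 48   # illustrative uid/gid for apache; check your own system
ROOT = 0

print(dac_allows_read(0o600, APACHE, APACHE, ROOT, {ROOT}))  # False: 600, apache:apache
print(dac_allows_read(0o644, APACHE, APACHE, ROOT, {ROOT}))  # True: read via "other"
print(dac_allows_read(0o640, APACHE, ROOT, ROOT, {ROOT}))    # True: read via group root
```

Without the DAC capabilities, uid 0 is treated like any other uid by this check, which is why fixing the file's mode or group resolves the denial.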

One problem with SELinux here is that, by default, the capabilities AVC message does not tell you which object on the file system blocked the access.  The reason for this is performance, as I explained in a previous blog.


Why doesn't SELinux give me the full path in an error message?


If you turn on full auditing and regenerate the AVC, you will get the path of the object with the bad DAC Controls, as I explained in the blog.


User Needs

In the last post we postulated that business value is created by users running applications which provide value added transformations of data. Users need three things from applications:

  1. Availability
  2. Performance
  3. Integrity

Of course, applications depend on the entire IT infrastructure.

Let’s look at these in more detail.

Availability means that the user can use the application when they need to – that the application, data, and IT infrastructure work together to allow the user to generate business value.

Performance means that the application has an acceptable response time to user requests. “Acceptable response time” depends on the context. For a large simulation job, multiple hours may be entirely acceptable. For many interactive operations, only a near-instantaneous response is acceptable. For example, when entering data into a form, there should be no delay in echoing user input and in moving from one field to the next. In general, performance needs to be at a level where the user is productive and not frustrated with the system.

Integrity can have multiple meanings. In fact, integrity deserves a post of its own.


Dogtag certificate profiles – certificate requests

The certificate enrolment profiles feature of Dogtag PKI can be used to specify default values and constraints for X.509 certificate fields. This post explores Dogtag certificate profiles and their relationship with the PKCS #10 certificate signing request (CSR) format with a focus on signing request submission. Future posts in this series will focus on the Certificate Authority (CA) side of the profiles feature, and on modifying and defining profiles for specialised use cases.

Let us begin by generating a CSR. This occurs in isolation from Dogtag profiles or certificate enrolment, and is done using certutil (CSRs can also be generated with openssl req).

certutil -R -d .pki/nssdb -o no-CN.req -a -s 'C=AU, ST=Queensland, L=Brisbane, O=Red Hat'

The -o no-CN.req instructs certutil to output the CSR to a file, while -a specifies ASCII output. Note that the subject (given by -s) does not contain a common name (CN) component.

CSRs are submitted to Dogtag in the context of some certificate profile. Available profiles can be listed via pki cert-request-profile-find "", and the Certificate Enrolment Request template for a profile can be retrieved via pki cert-request-profile-show <profile ID> --output <filename>. Let’s have a look at the caServerCert profile template:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CertEnrollmentRequest>
    <ProfileID>caServerCert</ProfileID>
    <Renewal>false</Renewal>
    <SerialNumber></SerialNumber>
    <RemoteHost></RemoteHost>
    <RemoteAddress></RemoteAddress>
    <Input id="i1">
        <ClassID>certReqInputImpl</ClassID>
        <Name>Certificate Request Input</Name>
        <Attribute name="cert_request_type">
            <Value></Value>
            <Descriptor>
                <Syntax>cert_request_type</Syntax>
                <Description>Certificate Request Type</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="cert_request">
            <Value></Value>
            <Descriptor>
                <Syntax>cert_request</Syntax>
                <Description>Certificate Request</Description>
            </Descriptor>
        </Attribute>
    </Input>
    <Input id="i2">
        <ClassID>submitterInfoInputImpl</ClassID>
        <Name>Requestor Information</Name>
        <Attribute name="requestor_name">
            <Value></Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Name</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="requestor_email">
            <Value></Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Email</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="requestor_phone">
            <Value></Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Phone</Description>
            </Descriptor>
        </Attribute>
    </Input>
</CertEnrollmentRequest>

The template is XML, containing fields with attributes whose values are not yet specified. Filling out these attributes with the content of the CSR generated earlier along with some ancillary information, we end up with the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CertEnrollmentRequest>
    <ProfileID>caServerCert</ProfileID>
    <Renewal>false</Renewal>
    <SerialNumber></SerialNumber>
    <RemoteHost></RemoteHost>
    <RemoteAddress></RemoteAddress>
    <Input id="i1">
        <ClassID>certReqInputImpl</ClassID>
        <Name>Certificate Request Input</Name>
        <Attribute name="cert_request_type">
            <Value>pkcs10</Value>
            <Descriptor>
                <Syntax>cert_request_type</Syntax>
                <Description>Certificate Request Type</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="cert_request">
                <Value>
MIIBhjCB8AIBADBHMRAwDgYDVQQKEwdSZWQgSGF0MREwDwYDVQQHEwhCcmlzYmFu
ZTETMBEGA1UECBMKUXVlZW5zbGFuZDELMAkGA1UEBhMCQVUwgZ8wDQYJKoZIhvcN
AQEBBQADgY0AMIGJAoGBAJvkY6CyMdY0u7hwFzfG9ZdajT+69bbRh1vqFIArGhhv
vL09Em2MrlAhQEKF6PuAcdED7U7ryoBByeXDRfivFwQS5W5msVBkA5gZ1i9LyH82
xULvkdnNFu6He8QnxLr8+bl/r9tdlktP/3k79hHmWRpqBtOqVKtBCwMqEdPltF7H
AgMBAAGgADANBgkqhkiG9w0BAQUFAAOBgQB5Slu71g30osgQd25puSrUxNf6+eQk
KEpWfrsrpRh7nOkAo3QmBmR4L7i5tUChnIv6UGi8qTeEWNHnMBcwgoe56tg5vqpK
mmaz3W1w8hxima/cSqzqWgw4U/JMDU1nBSYz2WJTyEUUvdDD1lSsWzrqFi5f/vC3
VjjWvio/DSvrgw==
                </Value>
            <Descriptor>
                <Syntax>cert_request</Syntax>
                <Description>Certificate Request</Description>
            </Descriptor>
        </Attribute>
    </Input>
    <Input id="i2">
        <ClassID>submitterInfoInputImpl</ClassID>
        <Name>Requestor Information</Name>
        <Attribute name="requestor_name">
            <Value>ftweedal</Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Name</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="requestor_email">
            <Value>ftweedal@redhat.com</Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Email</Description>
            </Descriptor>
        </Attribute>
        <Attribute name="requestor_phone">
            <Value></Value>
            <Descriptor>
                <Syntax>string</Syntax>
                <Description>Requestor Phone</Description>
            </Descriptor>
        </Attribute>
    </Input>
</CertEnrollmentRequest>
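
Filling in the template by hand is easy to get wrong, so here is a hedged Python sketch that sets the <Value> elements by attribute name using the standard library's ElementTree. The function name and the choice to re-emit the XML declaration are my own; whether Dogtag cares about the declaration or about whitespace inside <Value> is not something I have verified.

```python
import xml.etree.ElementTree as ET

XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'


def fill_template(template_xml, values):
    """Return the enrolment request XML with each <Attribute name="...">
    whose name is a key in `values` given that value in its <Value> child."""
    # Parse from UTF-8 bytes, matching the encoding the template declares.
    root = ET.fromstring(template_xml.encode("utf-8"))
    for attribute in root.iter("Attribute"):
        name = attribute.get("name")
        value_element = attribute.find("Value")
        if name in values and value_element is not None:
            value_element.text = values[name]
    # tostring() omits the declaration for "unicode" output, so add it back.
    return XML_DECLARATION + ET.tostring(root, encoding="unicode")
```

Called with a dictionary such as {"cert_request_type": "pkcs10", "cert_request": csr_text, "requestor_name": "ftweedal"}, where csr_text holds the base64 body of the CSR, it produces a request equivalent to the one above.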

With these fields filled out, the enrolment request can now be submitted to Dogtag:

$ pki cert-request-submit no-CN-req.xml
-----------------------------
Submitted certificate request
-----------------------------
  Request ID: 12
  Type: enrollment
  Request Status: rejected
  Operation Result: success

Boo! The enrolment request was rejected. Why? Certificate profiles can specify constraints on user-supplied values in a certificate request. In this case, it was the lack of a CN field in the subject, but profiles can also summarily reject an enrolment request based on other aspects of the embedded CSR, including key type and size.

Let’s now bring some extensions into the mix by generating a new signing request – this time with a valid subject, and with the Key Usage extension configured to indicate a certificate signing certificate (i.e., an intermediate CA). It obviously makes no sense to have this extension on a server certificate, but let’s submit it with the caServerCert profile again and see what happens.

$ certutil -R -d .pki/nssdb -o usage-ca.req -a --keyUsage certSigning -s 'CN=c2.vm-096.idm.lab.bos.redhat.com'
...
$ openssl req -text < usage-ca.req
Certificate Request:
    Data:
        Version: 0 (0x0)
        Subject: CN=c2.vm-096.idm.lab.bos.redhat.com
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (1024 bit)
                Modulus:
                    00:bc:6e:11:11:6f:e5:3c:34:03:8a:5f:92:41:44:
                    ...
                    9b:bf:86:8e:df:96:9e:e6:ef
                Exponent: 65537 (0x10001)
        Attributes:
        Requested Extensions:
            X509v3 Key Usage:
                Certificate Sign
    Signature Algorithm: sha1WithRSAEncryption
         b0:4a:19:2c:c1:36:07:db:6a:bb:a9:36:0b:a4:53:c9:39:6d:
         ...

We can see that the Key Usage extension is present in the request, and contains (only) the Certificate Sign declaration. We fill out and submit the enrolment request with this CSR:

$ pki cert-request-submit usage-ca-req.xml
-----------------------------
Submitted certificate request
-----------------------------
  Request ID: 14
  Type: enrollment
  Request Status: pending
  Operation Result: success

Perhaps surprisingly, this succeeds and the enrolment request is now pending, waiting for approval (or rejection) by a CA agent. It seems that, at least for the caServerCert profile, the value of the Key Usage extension in a CSR is ignored. The agent interface does allow adjustment of the Key Usage extension, however, and enforces sensible constraints, so no request submitted in the caServerCert profile will ever result in a certificate that could be used as an intermediate CA.

We have seen that Dogtag ignores the Key Usage extension information present in a CSR, but in fact, Dogtag ignores all information in the CSR except for what it specifically extracts. Therefore, requesting a particular key signing algorithm does not necessarily result in a certificate signed using that algorithm, and an extension unknown to the selected profile (e.g., the Certificate Policies extension, which can be included in a CSR via the --extCP argument to certutil) will certainly not be present in the certificate.

As a newcomer to the Dogtag PKI I find this behaviour somewhat limiting and would like to investigate whether the profiles system supports profiles that afford more control over the presence of extensions or the signing process, or what it would take to get this support.

The next post in this series will investigate how profiles are defined and the kinds of inputs and constraints they support.