April 17, 2014

256 Bits of Security

This is an incomplete discussion of SSL/TLS authentication and encryption.  This post only goes into RSA and does not discuss DHE, PFS, elliptic curve, or other mechanisms.

In a previous post I created a 15,360-bit RSA key and timed how long it took to create the key.  Some may have thought that was some sort of stunt to check processor speed.  I mean, who needs an RSA key of such strength?  Well, it turns out that if you actually need 256 bits of security then you’ll need an RSA key of this size.

According to NIST (SP 800-57, Part 1, Rev 3), to achieve 256 bits of security you need an RSA key of at least 15,360 bits to protect the symmetric 256-bit cipher that’s being used to secure the communications (SSL/TLS).  So what does the new industry-standard RSA key size of 2048 bits buy you?  According to the same document that 2048-bit key buys you 112 bits of security.  Increasing the bit strength to 3072 will bring you up to the 128 bits that most people expect to be the minimum protection.  And this is assuming that the certificate and the certificate chain are all signed using a SHA-2 algorithm (SHA-1 only gets you 80 bits of security when used for digital signatures and hashes).

So what does this mean for those websites running AES-256 or CAMELLIA-256 ciphers?  They are likely wasting processor cycles and not adding to the overall security of the connection.  I’ll look at two examples of TLS implementations in the wild.

First, we’ll look at wordpress.com.  This website is protected using a 2048-bit RSA certificate, signed using SHA-256, and using the AES-128 cipher.  This represents 112 bits of security because of the limitation of the 2048-bit key.  The certificate is properly chained back to the GoDaddy CA, which has root and intermediate certificates that are all 2048 bits and signed using SHA-256.  Even though security is reduced by the 2048-bit key, it’s likely more efficient to use the AES-128 cipher than any other due to the AES hardware acceleration typically found in computers nowadays.

Next we’ll look at one of my domains: christensenplace.us.  This website is protected using a 2048-bit RSA certificate, signed using SHA-1, and using the CAMELLIA-256 cipher.  This represents 80 bits of security due to the limitation of the SHA-1 signature used on the certificate and the CA and intermediate certificates from AddTrust and COMODO CA.  My hosting company uses both the RC4 cipher and the CAMELLIA-256 cipher.  In this case the CAMELLIA-256 cipher is a waste of processor cycles since the certificates used aren’t nearly strong enough to support such encryption.  I block RC4 in my browser as RC4 is no longer recommended to protect anything.  I’m not really sure exactly how much security you’ll get from using RC4 but I suspect it’s less than SHA-1.
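
If you want to check these details for a site yourself, the OpenSSL command-line client can show both the certificate parameters and the negotiated cipher.  A rough sketch (example.com is a placeholder, and the exact output varies by OpenSSL version):

# show the server certificate's key size and signature algorithm
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -text | grep -E 'Public-Key|Signature Algorithm'

# show the negotiated protocol and cipher
echo | openssl s_client -connect example.com:443 2>/dev/null | grep -E 'Protocol|Cipher'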

So what to do?  Well, if system administrators are concerned with performance then using a 128-bit cipher (like AES-128) is a good idea.  For those that are concerned with security, using a 3072-bit RSA key (at a minimum) will give you 128 bits of security.  If you feel you need more bits of security than 128 then generating a solid, large RSA key is the first step.  Deciding how many bits of security you need all depends on how long you want the information to be secure.  But that’s a post for another day.
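
If you decide a larger key is warranted, generating one with OpenSSL is straightforward.  A quick sketch (the file names are placeholders, and generation time grows quickly with key size):

openssl genrsa -out server.key 3072
openssl req -new -key server.key -out server.csr

The resulting CSR can then be submitted to your CA for a SHA-2 signed certificate.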


Configuring mod_nss for Horizon

Horizon is the Web Dashboard for OpenStack. Since it manages some very sensitive information, it should be accessed via SSL. I’ve written up in the past how to do this for a generic web server. Here is how to apply that approach to Horizon.

These instructions are based on a Fedora 20 and packstack install.

As a sanity check, point a browser at your Horizon server before making any changes. If the hostname was not set before you installed packstack, you might get an exception about a bad request header suggesting you need to set ALLOWED_HOSTS. If so, you have to edit /etc/openstack-dashboard/local_settings:

ALLOWED_HOSTS = ['192.168.187.13','ayoungf20packstack.cloudlab.freeipa.org', 'localhost', ]

Once Horizon has been shown to work on port 80, proceed to install the Apache HTTPD module for NSS:

sudo yum install mod_nss

While this normally works for HTTPD, something is different with packstack; all of the HTTPD module loading is done with files in /etc/httpd/conf.d/ whereas the mod_nss RPM assumes the Fedora approach of putting them in /etc/httpd/conf.modules.d/. I suspect it has to do with the use of Puppet. To adapt mod_nss to the packstack format, after installing mod_nss, you need to mv the file:

sudo mv /etc/httpd/conf.modules.d/10-nss.conf   /etc/httpd/conf.d/nss.load

Note that mv keeps SELinux happy, but cp does not; run ls -Z to confirm:

$ ls -Z /etc/httpd/conf.d/nss.load 
-rw-r--r--. root root system_u:object_r:httpd_config_t:s0 /etc/httpd/conf.d/nss.load

If you get a bad context there, the cheating way to fix it is to yum erase mod_nss, rerun yum install mod_nss, and then do the mv. That is what I did.

Edit /etc/httpd/conf.d/nss.conf:

#Listen 8443
Listen 443

and in the virtual host entry change 8443 to 443

Add the following to /etc/httpd/conf.d/openstack-dashboard.conf

<VirtualHost *:80>
   ServerName ayoungf20packstack.cloudlab.freeipa.org
   Redirect permanent / https://ayoungf20packstack.cloudlab.freeipa.org/dashboard/
</VirtualHost>

replacing ayoungf20packstack.cloudlab.freeipa.org with your hostname.

Lower in the same file, in the <Directory> section, add

  NSSRequireSSL

to enable SSL.

SSL certificates really should not be self-signed. To have a real security strategy, your X509 certificates should be managed via a Certificate Authority. Dogtag PKI provides one, and is deployed with FreeIPA. So, for this setup, the Horizon server is registered as an IPA client.

There will be a self-signed certificate in the NSS database from the install. We need to remove that:

sudo certutil -d /etc/httpd/alias/ -D -n Server-Cert

In order to fetch the certificates for this server, we use the IPA command that tells certmonger to fetch and track the certificate.

ipa service-add HTTP/`hostname`
sudo ipa-getcert request -d /etc/httpd/alias -n Server-Cert -K HTTP/`hostname` -N CN=`hostname`,O=cloudlab.freeipa.org

If you forgot to add the service before requesting the cert, as I did on my first iteration, the request is on hold: it will be serviced in 12 (I think) hours by certmonger resubmitting it, but you can speed up the process:

sudo getcert resubmit -n Server-Cert  -d /etc/httpd/alias

You can now see the certificate with:

 sudo certutil -d /etc/httpd/alias/ -L -n Server-Cert

Now, if you restart the HTTPD server,

sudo systemctl restart httpd.service

and point a browser at http://hostname, it should get redirected to https://hostname/dashboard, and you should see a functioning Horizon application.
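
To verify the redirect without a browser, you can check the response headers; a quick sketch, assuming curl is installed (replace hostname with your server's name):

curl -sI http://hostname/ | grep -iE 'HTTP/|Location'

You should see a 301 response with a Location header pointing at the https dashboard URL.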

Note that for devstack, the steps are comparable, but different:

  • No need to mv the 10-nss.conf file from modules
  • The Horizon application is put into /etc/httpd/conf.d/horizon.conf
  • The Horizon app is in a <VirtualHost *:80> virtual host. You can’t just change this to 443, or you lose all of the config from nss.conf; the two VirtualHost sections should probably be merged.

April 14, 2014

New SELinux Feature: File Name Transitions

In Red Hat Enterprise Linux 7, we have fixed one of the biggest issues with SELinux where initial creation of content by users and administrators can sometimes get the wrong label.

The new feature makes labeling files easier for users and administrators. The goal is to prevent the accidental mislabeling of file objects.

Accidental Mislabeling

Users and administrators often create files or directories that do not have the same label as the parent directory, and then they forget to fix the label. One example of this would be an administrator going into the /root directory and creating the .ssh directory. In previous versions of Red Hat Enterprise Linux, the directory would get created with a label of admin_home_t, even though the policy requires it to be labeled ssh_home_t. Later when the admin tries to use the content of the .ssh directory to log in without a password, sshd (sshd_t) fails to read the directory’s contents because sshd is not allowed to read files labeled admin_home_t. The administrator would need to run restorecon -R -v /root/.ssh to fix the labels, and often they forget to do so.
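
You can see the difference for yourself by creating the directory and checking its label; a quick sketch (the labels mentioned in the comment are what I would expect and may vary by policy version):

mkdir /root/.ssh
ls -Zd /root/.ssh
# on Red Hat Enterprise Linux 7 the type should already be ssh_home_t;
# on older releases it would show admin_home_t until restorecon is run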

Another example would be a user creating the public_html directory in his home directory. The default label for content in the home directory is user_home_t, but SELinux requires the public_html directory to be labeled http_user_content_t, which allows the Apache process (httpd_t) to read the content. We block the Apache process from reading user_home_t as valuable information like user secrets and credit-card data could be in the user’s home directory.

File Transitions Policy

Policy writers have always been able to write a file transition rule that includes the type of the process creating the file object (NetworkManager_t), the type of the directory that will contain the file object (etc_t), and the class of the file object (file). They can also specify the type of the created object (net_conf_t):

filetrans_pattern(NetworkManager_t, etc_t, file, net_conf_t)

This policy line says that a process running as NetworkManager_t creating any file in a directory labeled etc_t will create it with the label net_conf_t.

Named File Transitions Policy

Eric Paris added a cool feature to the kernel that allows the kernel to label a file based on four characteristics instead of just three. He added the base file name (not the path).

Now policy writers can write policy rules that state:

  • If the unconfined_t user process creates the .ssh directory in a directory labeled admin_home_t, then it will get created with the label ssh_home_t: filetrans_pattern(unconfined_t, admin_home_t, dir, ssh_home_t, ".ssh")
  • If the staff_t user process creates a directory named public_html in a directory labeled user_home_dir_t, it will get labeled http_user_content_t: filetrans_pattern(staff_t, user_home_dir_t, dir, http_user_content_t, "public_html")

Additionally, we have added rules to make sure that if the kernel creates content in /dev, it will label it correctly rather than waiting for udev to fix the label.

filetrans_pattern(kernel_t, device_t, chr_file, wireless_device_t, "rfkill")

Better Security

This can also be considered a security enhancement, since in Red Hat Enterprise Linux 6, policy writers could only write rules based on the destination directory label. Consider the example above using NetworkManager_t. In Red Hat Enterprise Linux 6, a policy writer would write filetrans_pattern(NetworkManager_t, etc_t, file, net_conf_t), which means the NetworkManager process could create any file that did not already exist in an etc_t directory (/etc). If for some reason the /etc/passwd file did not exist, SELinux policy would not block NetworkManager_t from creating /etc/passwd. In Red Hat Enterprise Linux 7, we can write a tighter policy like this:

filetrans_pattern(NetworkManager_t, etc_t, file, net_conf_t, "resolv.conf")

This states that NetworkManager can only create files named resolv.conf in directories labeled etc_t. If it tries to create the passwd file in an etc_t directory, the policy falls back to checking whether NetworkManager_t is allowed to create a generic etc_t file, which is not allowed.
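
You can check the resulting label on an installed system; a small sketch (the expected type comes from the policy described above and may differ on your system):

ls -Z /etc/resolv.conf
# a file created here by NetworkManager should carry the net_conf_t type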

Bottom Line

This feature should result in fewer occurrences of accidental mislabels by users and hopefully a more secure and better-running SELinux system.

April 11, 2014

New Red Hat Enterprise Linux 7 Security Feature: systemd-journald

A lot has already been written about systemd-journald. For example, this article describes the security benefits of the journal.

I would argue that systemd-journald is not a full replacement for syslog. The syslog format is ubiquitous, and I don’t see it going away. On all Red Hat Enterprise Linux 7 machines, syslog will still be on by default. This is because it’s still the de facto mechanism for centralizing your logging data, and most tools that analyze log data read syslog data. journald actually makes syslog better, as syslog gathers its data from the journal, and because the journal runs from bootup to shutdown, it can feed more data to syslog, saving it until the syslog process starts.

When journald was first being created, many people who were working on Structured Logging got all up in arms over it because Lennart Poettering and Kay Sievers did not work with them. Despite that problem, I still like it.

When it comes to launching system apps, systemd has become the central point. It can be thought of as the system's process manager. It knows more about what is going on in the system than any other process, save for the kernel.

Years ago when the audit system was being built, Karl MacMillan of Tresys believed that some of the problems that the audit system was trying to fix could be handled by extending syslog to record all information about the sending process. You see, syslog records very little metadata about who sent the syslog message. The audit subsystem was created to record all of the critical identity data, such as all of the UIDs associated with a process as well as the SELinux context; journald now collects all of this data.

Let me give an example of where systemd-journal could be used to increase security.

SELinux controls what a process is allowed to do based on what it was designed to do. Sometimes even less, depending on the security goals of the policy writer. This means SELinux would prevent a hacked ntpd process from doing anything other than handling Network Time. SELinux would prevent the hacked ntpd from reading MySQL databases or credit-card data from a user’s home directory, even if the ntpd process was running as root. However, as the ntpd process sends syslog messages, SELinux would allow the hacked process to continue to send syslog messages.

The hacked ntpd could format syslog messages to match other daemons and potentially trick an administrator or (even better) a tool that reads the syslog file (like intrusion detection tools) into doing something bad. If all messages were verified with the systemd-journal, then the administrator or syslog analysis tool could see that ntpd_t was sending messages forged as if they were coming from the sshd daemon. The intrusion detection tools, realizing the ntpd daemon had been hacked, could then be coded to recognize those bad messages.

.cursor=s=f328cc4b2615417189ab76b00c7ae041;i=2;b=4c3d0faf6b774fb7930972c1a4a5f87
.realtime=1329940273078467
...skipping...
SYSLOG_IDENTIFIER=sshd
SYSLOG_PID=2302
MESSAGE=sshd Fake message from sshd.
_PID=2302
_UID=0
_GID=0
_COMM=ntpd
_EXE=/usr/sbin/ntpd
_CMDLINE=/usr/sbin/ntpd -n -u ntp:ntp -g
_SYSTEMD_CGROUP=/system/ntpd.service
_SYSTEMD_UNIT=ntpd.service
_SELINUX_CONTEXT=system_u:system_r:ntpd_t:s0
_SOURCE_REALTIME_TIMESTAMP=1330527027590337
_BOOT_ID=4c3d0faf6b774fb7930972c1a4a5f870
_MACHINE_ID=432d8198a8fc421caf2dca48ccde1cf2\
_HOSTNAME=x.example.com
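
In the journal record above, the fields that begin with an underscore are trusted fields added by journald itself; the sender cannot forge them, so _COMM, _EXE, and _SELINUX_CONTEXT expose the real origin (ntpd) even though SYSLOG_IDENTIFIER claims to be sshd. A hypothetical way to hunt for this kind of forgery, assuming journalctl is available:

journalctl -o verbose SYSLOG_IDENTIFIER=sshd | grep -E '_COMM=|_EXE=|_SELINUX_CONTEXT='
# any entry whose _COMM or _EXE is not sshd was sent by some other process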


April 09, 2014

Teaching Horizon to Share

Horizon is The OpenStack Dashboard. It is a Django (Python) web app. During a default installation, Horizon has resources at one level under the main hostname in the URL scheme. For example, authentication is under http://hostname/auth.

Devstack performs single system deployments. Packstack has an “all-in-one” option that does the same thing. If these deployment tools are going to deploy other services via HTTPD, Horizon needs to be taught how to share the URL space. Fortunately, this is not hard to do.

A naive approach might just say “why not reserve sub-URLs for known applications, like Keystone or Glance?” There are two reasons this is not a decent approach. First, you do not know what other applications OpenStack might need to deploy in the future. The other reason is that Django has a well-known mechanism that allows an administrator to deploy additional web UI functionality alongside the existing UI. Thus, an end deployer could very well have a custom web UI for Keystone running at http://hostname/dashboard/keystone. We can’t block that. Django provides the means for a very neat solution.

In Horizon, you can get away with setting the WEBROOT variable. I made my changes in openstack_dashboard/settings.py as I intend to submit this as a patch:

WEBROOT='/dashboard'
LOGIN_URL =  WEBROOT+'/auth/login/'
LOGOUT_URL =  WEBROOT+'/auth/logout/'
 # LOGIN_REDIRECT_URL can be used as an alternative for
 # HORIZON_CONFIG.user_home, if user_home is not set.
 # Do not set it to '/home/', as this will cause circular redirect loop
LOGIN_REDIRECT_URL =  WEBROOT 

However, you can achieve the same effect for your deployments by setting the values in openstack_dashboard/local/local_settings.py for devstack or the comparable file in a Puppet or Chef based install.

With this change, Django is no longer managing the root URL for your application. If you want to make the transition seamless, you need to provide a redirect from http://hostname to http://hostname/dashboard. This is an Apache HTTPD configuration issue, and not something you can do inside of Django.

On a Fedora based install, you will find a file /etc/httpd/conf.d/welcome.conf. In a default deployment, it points to the Fedora “Welcome” page.

Alias /.noindex.html /usr/share/httpd/noindex/index.html

There are many ways to use this. Here’s how I did it. It might not be the neatest. Feel free to comment if you have something better.

Create a file /var/www/html/index.html with the following body:

<META HTTP-EQUIV="Refresh" Content="0; URL=/dashboard/">

Then, modify welcome.conf to have:

Alias /.noindex.html /var/www/html/index.html

Now users that land on http://hostname are redirected to http://hostname/dashboard.

Amending a patch in git

From a co-worker:

amend is new to me… will the updated patch be a full patch to the original source or a patch to the previous patch?

Here’s how I explain it.

in git, nothing is ever lost, but you can rewrite history
if you do git commit (no args) it will create a new commit, and that will be HEAD for whatever branch you are on
if you then edit a file, and do

git add 
git commit --amend

it will modify that last commit with the changes
however, your original commit is still safely tucked away deep in the bowels of git.
To find it, run

 git reflog

and you will see a transaction log of everything you have ever done in that git repo.

Let me show you how this works.

 cd /opt/stack/dogtag/pki/
 #that is my repo
 git fetch;  git rebase origin/master

I am going to make a simple commit, for the README file

diff --git a/README b/README
index 2c2c36b..214c774 100644
--- a/README
+++ b/README
@@ -1,5 +1,5 @@
 # BEGIN COPYRIGHT BLOCK
-# (C) 2008 Red Hat, Inc.
+# (C) 2008-13 Red Hat, Inc.
 # All rights reserved.
 # END COPYRIGHT BLOCK

After I make that change, but before I commit:
git reflog | fpaste gives me http://paste.fedoraproject.org/92947/13970643

9738598 HEAD@{0}: rebase finished: returning to refs/heads/master
9738598 HEAD@{1}: rebase: checkout origin/master
7a02522 HEAD@{2}: clone: from ssh://git.fedorahosted.org/git/pki.git

If I execute

  git add README
  git commit -m "Updated Copyright on README"

git reflog shows

 
2c7b42d HEAD@{0}: commit: Updated Copyright on README
9738598 HEAD@{1}: rebase finished: returning to refs/heads/master
9738598 HEAD@{2}: rebase: checkout origin/master
7a02522 HEAD@{3}: clone: from ssh://git.fedorahosted.org/git/pki.git

Oops. I realize I made the date 2013 not 2014. Fix that.

 
vi README 
git add README
git commit --amend

git reflog | fpaste now shows

cad8fed HEAD@{0}: commit (amend): Updated Copyright on README
2c7b42d HEAD@{1}: commit: Updated Copyright on README
9738598 HEAD@{2}: rebase finished: returning to refs/heads/master
9738598 HEAD@{3}: rebase: checkout origin/master
7a02522 HEAD@{4}: clone: from ssh://git.fedorahosted.org/git/pki.git

Now, for some wonky reason, I want that -13 edit back
but…let's say I don’t want to lose what I was working on

 git checkout -b update-copyright

that will check out a new branch named update-copyright
It happens that I forgot to create a branch before doing work.
So now my master also has that commit on it…I’ll fix that first

 git checkout master

shows

Your branch is ahead of ‘origin/master’ by 1 commit.
(use “git push” to publish your local commits)

nah…I am not ready to push….so

 git reset --hard origin/master

now my master matches origin/master
how can I tell? Couple ways…easiest is:

git log shows my top commit is 9738598e37effc5f68e8f2d211a6273b8846a6fc

and

git log origin/master shows the exact same thing. Now, what about reflog?

9738598 HEAD@{0}: reset: moving to origin/master
cad8fed HEAD@{1}: checkout: moving from update-copyright to master
cad8fed HEAD@{2}: checkout: moving from master to update-copyright
cad8fed HEAD@{3}: commit (amend): Updated Copyright on README
2c7b42d HEAD@{4}: commit: Updated Copyright on README
9738598 HEAD@{5}: rebase finished: returning to refs/heads/master
9738598 HEAD@{6}: rebase: checkout origin/master
7a02522 HEAD@{7}: clone: from ssh://git.fedorahosted.org/git/pki.git

It shows every one of those moves. And that original commit, where I put -13?
that is

2c7b42d HEAD@{4}: commit: Updated Copyright on README

if I look at it with git show 2c7b42d

commit 2c7b42dde010848f2b60e0f585701fe2ef76e732
Author: Adam Young <ayoung>
Date:   Wed Apr 9 13:26:28 2014 -0400

    Updated Copyright on README

diff --git a/README b/README
index 2c2c36b..214c774 100644
--- a/README
+++ b/README
@@ -1,5 +1,5 @@
 # BEGIN COPYRIGHT BLOCK
-# (C) 2008 Red Hat, Inc.
+# (C) 2008-13 Red Hat, Inc.
 # All rights reserved.
 # END COPYRIGHT BLOCK

New Red Hat Enterprise Linux 7 Security Feature: PrivateTmp

One of the reasons I am really excited about Red Hat Enterprise Linux 7 is the amount of new security features we have added, and not all of them involve SELinux.

Today, I want to talk about PrivateTmp.

One of my goals over the years has been to stop system services from using /tmp. I blogged about this back in 2007.

Anytime I have discovered a daemon using /tmp, I have tried to convince the packager to move the temporary content and FIFO files to the /run directory. If the content was permanent, then it should be in /var/lib. All users on your system are able to write to /tmp, so if an application creates content in /tmp that is guessable (i.e., has a well-known name), a user could create a link file with that name in /tmp and fool the privileged app into unlinking or overwriting the destination of the link. Not only would you have to worry about users doing this, but you would also have to worry about any application that the user runs and any service that you have running on your system. They are all allowed to write to /tmp based on permissions.

Over the years, there have been several vulnerabilities (CVEs) about this. For example, CVE-2011-2722 covered a case where hplib actually included code like this:

fp = fopen ("/tmp/hpcupsfax.out", "w"); // <- VULN
system ("chmod 666 /tmp/hpcupsfax.out"); // <- "

This means that if you set up a machine running the cups daemon, a malicious user or an application that a user ran could attack your system.

I have convinced a lot of packages to stop using /tmp, but I can’t get them all. And in some cases, services like Apache need to use /tmp, because Apache runs lots of other software that might store content in /tmp.

Well, systemd has added a lot of new security features (more on these later).

PrivateTmp, which showed up in Fedora 16, is an option in systemd unit configuration files.

> man systemd.unit
  ...
  A unit configuration file encodes information about a service, 
  a socket, a device, a mount point, an automount point, a  swap
  file or partition, a start-up target, a file system path or a
  timer controlled and supervised by systemd(1).

> man systemd.exec
  NAME
    systemd.exec - systemd execution environment configuration
  SYNOPSIS
    systemd.service, systemd.socket, systemd.mount, systemd.swap
  DESCRIPTION
    Unit configuration files for services, sockets, mount points
    and swap devices share a subset of configuration options which
    define the execution environment of spawned processes.
  ...
  PrivateTmp=
    Takes a boolean argument. If true sets up a new file system
    namespace for the executed processes and mounts a private /tmp
    directory inside it, that is not shared by processes outside of
    the namespace. This is useful to secure access to temporary files
    of the process, but makes sharing between processes via /tmp
    impossible. Defaults to false.

PrivateTmp tells systemd to do the following anytime it starts a service with this option turned on:

   Allocate a private "tmp" directory
   Create a new file system namespace
   Bind mount this private "tmp" directory within the namespace over /tmp
   Start the service.

This means that processes running with this flag would see a different and unique /tmp from the one users and other daemons see or can access.
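
On a system where this is enabled you can see the effect from the host side; a rough sketch (the directory names are generated by systemd and will differ from machine to machine):

ls -ld /tmp/systemd-private-*
# each service started with PrivateTmp=true gets one of these directories,
# which is bind mounted over /tmp inside that service's namespace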

NOTE: We have found bugs using PrivateTmp in Fedora 16, so make sure you test this well before turning it on in production.

For Fedora 17, I opened a feature page that requested all daemons that were using systemd unit files and /tmp to turn this feature on by default.

Several daemons, including Apache and cups, now have PrivateTmp turned on by default in Red Hat Enterprise Linux 7.
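
To check whether a unit already has the option, or to turn it on yourself with a drop-in file, something like the following should work (the unit names here are only examples):

systemctl show httpd.service -p PrivateTmp

mkdir -p /etc/systemd/system/foo.service.d
cat > /etc/systemd/system/foo.service.d/private-tmp.conf <<'EOF'
[Service]
PrivateTmp=true
EOF
systemctl daemon-reload
systemctl restart foo.service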

Given the three options, as a developer of a system service, I still believe that you should not use /tmp. You should instead use /run or /var/lib. But if you have to use /tmp and do not communicate with other users, then use PrivateTmp. If you need to communicate with users, be careful….

April 08, 2014

New Red Hat Enterprise Linux 7 Security Feature: systemd Starting Daemons

Why is this a security feature?

In previous releases of Red Hat Enterprise Linux, system daemons would be started in one of two ways:

  • At boot, init (sysV) launches an initrc script and then this script launches the daemon.
  • An admin can log in and launch the init script by hand, causing the daemon to run.

Let me show you what this means from an SELinux point of view.

NOTE: In the code below, @ means execute, --> indicates transition, and === indicates a client/server communication.

The init process executes an Apache init script, which in turn executes the Apache executable.

init_t @ initrc_exec_t --> initrc_t @ httpd_exec_t --> httpd_t

This Apache process would end up running with the full label of:

system_u:system_r:httpd_t:s0 

If Apache created content, it would probably be labeled

system_u:object_r:httpd_sys_content_rw_t:s0 

When an administrator, probably running as unconfined_t, started or restarted a service using service httpd restart:

unconfined_t @ initrc_exec_t --> initrc_t @ httpd_exec_t --> httpd_t

Notice that the process adopts the user portion of the SELinux label of the admin who started it.

unconfined_u:system_r:httpd_t:s0

Content created by this Apache process would be:

unconfined_u:object_r:httpd_sys_content_rw_t:s0 

This inconsistency ends up confusing the user. In SELinux targeted and MLS policy, we ignore the user component of the SELinux label; even if you wanted to write policy to confine based on the user type, you can’t.

However, systemd fixes this problem. The transition is very different:

init_t @ httpd_exec_t --> httpd_t
system_u:system_r:httpd_t:s0 

When an admin restarts the Apache daemon, the transition now looks like this:

unconfined_t === init_t @ httpd_exec_t --> httpd_t
system_u:system_r:httpd_t:s0

With systemd, we don’t have the labeling problem and we can tighten up the SELinux policy.
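
One hypothetical way to see this on a running system is to look at the process labels after an admin restart; a sketch:

ps -eZ | grep httpd
# with systemd the user field stays system_u even after an unconfined
# admin runs 'systemctl restart httpd.service'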

Systemd Starting Daemons Affects More Than SELinux

Admins restarting daemons has historically meant working around a lot of vulnerabilities and administration failures. Daemons need to be coded to clean up anything leaked from the admin's process that could influence the way the daemon runs:

  • Need to clean $ENV
  • Need to change working directory in order to make sure they don’t blow up because they lack access to the current working directory (if you look at the /sbin/service script, you will see that one of the first things it does is cd /).
  • Need to do something with the terminal (close stdin, stdout, stderr after they start)
  • Any open File descriptor in the user session also needs to be closed to make sure a daemon does not potentially gain access to tcp sockets or important content that the user had access to (in SELinux, we are always in a quandary about this because if we allow the daemon access to the terminal, a hacked daemon could present the admin with passwd and trick him into revealing the admin password)
  • Change the controlling terminal
  • Change the handling of signals

If a daemon writer screws up on one of these, he could make the system vulnerable or end up with unexpected bugs.

Using systemd to start daemons guarantees that a daemon always gets started with the same environment, whether it is started at boot or restarted by an administrator.

April 02, 2014

caff gpg.conf file settings

After years of using caff for my PGP key-signing needs I finally came across the answer to a question I’ve had since the beginning.  I document it here so that I may keep my sanity next time I go searching for the information.

My question was “how do you make a specific certification in a signature?”.  As defined in RFC 1991, section 6.2.1, the four types of certifications are:

     <10> - public key packet and user ID packet, generic certification
          ("I think this key was created by this user, but I won't say
          how sure I am")
     <11> - public key packet and user ID packet, persona certification
          ("This key was created by someone who has told me that he is
          this user") (#)
     <12> - public key packet and user ID packet, casual certification
          ("This key was created by someone who I believe, after casual
          verification, to be this user")  (#)
     <13> - public key packet and user ID packet, positive certification
          ("This key was created by someone who I believe, after
          heavy-duty identification such as picture ID, to be this
          user")  (#)

Generally speaking, the default settings in caff only provide the first-level “generic” certification. Tonight I found information specific to ~/.caff/gnupghome/gpg.conf. This file can contain, as far as I know, three lines:

personal-digest-preferences SHA256
cert-digest-algo SHA256
default-cert-level 2

I can’t find any official information on this file as the man pages are a little slim on details.  That said, if you use caff you should definitely create this file and populate it with the above at a minimum, with the exception of the default-cert-level.  The default-cert-level should be whatever you feel comfortable setting it to.  My default is “2″ for key signing parties (after I’ve inspected an “official” identification card and/or passport).  The other two settings are important as they provide assurances of using a decent SHA-2 hash instead of the default (SHA-1).
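
For comparison, the same certification level can be chosen when signing a key directly with GnuPG rather than through caff; a sketch (the key ID is a placeholder):

gpg --ask-cert-level --sign-key 0xDEADBEEF
# or, non-interactively, rely on the configured default:
gpg --default-cert-level 2 --sign-key 0xDEADBEEF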


April 01, 2014

Public Key Document Signing for Oslo Messaging

The PKI version of the Keystone tokens uses a standard format for cryptographic signing of documents. Cryptographic Message Syntax (CMS) is the mechanism behind S/MIME and is well supported by the major cryptographic libraries: OpenSSL and NSS both have well-documented CMS support. Messaging in OpenStack requires guaranteed identification of the author.
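
As a rough illustration of the OpenSSL side (the file names are placeholders, and this is not necessarily the exact invocation Keystone uses), signing and verifying a document with CMS looks something like:

# sign a document, embedding the signer certificate
openssl cms -sign -in message.json -signer signing_cert.pem -inkey signing_key.pem -outform PEM -out message.signed

# verify the signature against a trusted CA certificate
openssl cms -verify -in message.signed -inform PEM -CAfile ca.pem -out message.verified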

The publish-subscribe integration pattern allows one process to produce messages and multiple processes to consume them. More complex patterns build on this idea to allow linking multiple producers with multiple consumers. Dynamic Message Routing has many specific variations, such as the process manager pattern, which have been implemented inside of OpenStack.

Dynamic Router

Let’s take the case where a compute node needs to contact a scheduler, any scheduler. The scheduler needs to know which compute node has contacted it. If a hypervisor exploit compromises a compute node, and the messages are not signed, that compute node can impersonate any other compute node in the group. If, on the other hand, messages from the compute node to the scheduler topic are signed, the scheduler can confirm that the message producer is the hypervisor in question.

Message Envelope Wrapper Pattern

Note that not every piece of the Message should be signed by the same entity. The Routing Slip Pattern provides for multiple entities to process a message. Each needs to indicate that they have seen it. Thus, “checking off” the slip means adding a cryptographic signature, most likely of the Fingerprint of the message.

Routing Table or Routing Slip Pattern

The CMS format allows the signer to embed the certificates in the signed message. The benefit is that the message is self-contained, assuming that you trust the CA certificate. The downside is that the certificate data greatly increases the message size. Even compression does not completely remove this shortcoming. In the first PKI implementation, this added certificate data was not required, as we know exactly who is signing the certificate. However, in the future, we want to be able to have multiple Keystone servers signing certificates. There is a field inside the CMS document that indicates the signer. This value can be extracted and used to determine the signer of the certificate, and thus used to select the certificate used for signature validation.

All CMS can tell us is whether the signature of a document is valid. It does not tell us if the signer had authority to sign that document. Some portion of the document has to be compared with the identity of the certificate owner to confirm authorization to sign. In the case of our compute node posting to the scheduler topic, the consumer would want to confirm the signature of the request, and then confirm that the node ID of the request matches the node ID of the certificate holder.

Directory

It is this last problem that points to a missing piece in OpenStack: a centralized registry for undercloud entities. Keystone does not have this view of the undercloud; at most it knows about endpoints. But the majority of the hosts in the undercloud are driven by the message queue, and are not HTTP endpoints. Keystone has no way to uniquely identify them.

There are several efforts that are converging on a solution.  The recent Key Distribution Server (KDS) extension to Keystone provides for a unique identifier for the sharing of symmetric keys.

Kerberos

In more traditional datacenter management approaches, hosts can be registered in LDAP and identified via Kerberos principals.

Zookeeper

The Zookeeper project provides a comparable mechanism for tracking the hosts enrolled in complex workflows.

Regardless of the mechanism, the base requirement is to provide a contextual mapping from the X509 certificate for the public key used to validate the signature to the OpenStack entity that signed the document.

A scalable system based on X509 and PKI requires a certificate authority. It does not need a certificate signed by one of the public CAs such as you would find pre-populated in your browser’s certificate database. It merely needs to be established a priori as part of the OpenStack infrastructure, with a secure mechanism for distributing the CA certificate itself.

The actual mechanism to perform the signing and verification would mirror that used for the tokens, as I talked about here.

March 26, 2014

Enhance application security with FORTIFY_SOURCE

The FORTIFY_SOURCE macro provides lightweight support for detecting buffer overflows in various functions that perform operations on memory and strings. Not all types of buffer overflows can be detected with this macro, but it does provide an extra level of validation for some functions that are potentially a source of buffer overflow flaws. It protects both C and C++ code. FORTIFY_SOURCE works by computing the number of bytes that are going to be copied from a source to the destination. In case an attacker tries to copy more bytes to overflow a buffer, the execution of the program is stopped, and the following exception is returned:

*** buffer overflow detected ***: ./foobar terminated
======= Backtrace: =========
/lib64/libc.so.6[0x382d875cff]
/lib64/libc.so.6(__fortify_fail+0x37)[0x382d906b17]
...

FORTIFY_SOURCE provides buffer overflow checks for the following functions:

memcpy, mempcpy, memmove, memset, strcpy, stpcpy, strncpy, strcat, 
strncat, sprintf, vsprintf, snprintf, vsnprintf, gets.

The Feature Test Macros man page (man feature_test_macros) states:

If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc -O1) and above, checks that shouldn’t change the behavior of conforming programs are performed.  With _FORTIFY_SOURCE set to 2  some  more  checking  is added, but some conforming programs might fail.  Some of the checks can be performed at compile time, and result in compiler warnings; other checks take place at run time, and result in a run-time error if the check fails.  Use of this macro requires compiler support, available with gcc(1) since version 4.0.

Consider the following example that shows potentially dangerous code:

// fortify_test.c
#include<stdio.h>

/* Commenting out or not using the string.h header will cause this
 * program to use the unprotected strcpy function.
 */
//#include<string.h>

int main(int argc, char **argv) {
    char buffer[5];
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));
    strcpy(buffer, argv[1]);
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));
}

We can compile the above example to use FORTIFY_SOURCE (-D_FORTIFY_SOURCE) and optimization flags (-g -O2) using the following command:

~]$ gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 fortify_test.c \
    -o fortify_test

If we disassemble the binary that is the output of the above command, we can see that no extra check function is called to uncover any potential buffer overflows when copying a string:

~]$ objdump -d ./fortify_test
0000000000400440 :
400440:  48 83 ec 18             sub $0x18,%rsp
400444:  ba 05 00 00 00          mov $0x5,%edx
400449:  bf 10 06 40 00          mov $0x400610,%edi
40044e:  48 89 e6                mov %rsp,%rsi
400451:  31 c0                   xor %eax,%eax
400453:  e8 b8 ff ff ff          callq 400410
400458:  48 b8 64 65 61 64 62    movabs $0x6665656264616564,%rax
40045f:  65 65 66
400462:  48 89 e6                mov %rsp,%rsi
400465:  ba 05 00 00 00          mov $0x5,%edx
40046a:  48 89 04 24             mov %rax,(%rsp)
40046e:  bf 10 06 40 00          mov $0x400610,%edi
400473:  31 c0                   xor %eax,%eax
400475:  c6 44 24 08 00          movb $0x0,0x8(%rsp)
40047a:  e8 91 ff ff ff          callq 400410
40047f:  31 c0                   xor %eax,%eax
400481:  48 83 c4 18             add $0x18,%rsp
400485:  c3                      retq
400486:  66 90                   xchg %ax,%ax

This means that any potential buffer overflows would be undetected and could allow an attacker to leverage the flaw in the program. Debugging the same program shows that we can overwrite arbitrary data on the stack (in our case with the character 'A', represented by '\x41'), starting at the address held in the RAX register:

$ gdb -q ./fortify_test
Reading symbols from /home/sid/security/fortify/fortify_test...done.

(gdb) br 8
Breakpoint 1 at 0x4004b8: file fortify_test.c, line 8.

(gdb) r $(python -c 'print "\x41" * 360')
Starting program: /home/sid/security/fortify/fortify_test \ 
$(python -c 'print "\x41" * 360')

Buffer Contains: ����� , Size Of Buffer is 5
Breakpoint 1, main (argc=, argv=0x7fffffffd788) at fortify_test.c:8
8 printf ("Buffer Contains: %s , Size Of Buffer is %d\n",buffer,
sizeof(buffer));

(gdb) i r
rax 0x7fffffffd690 140737488344720
rbx 0x7fffffffd788 140737488344968
rcx 0x4141414141414141 4702111234474983745
rdx 0x41 65
rsi 0x7fffffffdd40 140737488346432
rdi 0x7fffffffd7ef 140737488345071
rbp 0x0 0x0
rsp 0x7fffffffd690 0x7fffffffd690
r8 0x2d 45
r9 0x0 0
r10 0x7fffffffd450 140737488344144
r11 0x382d974c60 241283058784
r12 0x4004d4 4195540
r13 0x7fffffffd780 140737488344960
r14 0x0 0
r15 0x0 0
rip 0x4004b8 0x4004b8
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0

(gdb) x /100x $rax
0x7fffffffd690: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffd6a0: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffd6b0: 0x41414141 0x41414141 0x41414141 0x41414141
...
0x7fffffffd7d0: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffd7e0: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffd7f0: 0x41414141 0x41414141 0xffffde00 0x00007fff
0x7fffffffd800: 0xffffde6a 0x00007fff 0xffffde85 0x00007fff
0x7fffffffd810: 0xffffde93 0x00007fff 0xffffdeae 0x00007fff

Next, let us uncomment the inclusion of the string.h header in our test program and supply a string to the strcpy function with a length that exceeds the defined length of our buffer:

// fortify_test.c
#include<stdio.h>
#include<string.h>

int main(int argc, char **argv) {
    char buffer[5];
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));
    // Here the compiler knows the length of the string to be copied
    strcpy(buffer, "deadbeef");
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));
}

If we attempt to compile the above program using FORTIFY_SOURCE and an appropriate optimization flag, the compiler returns a warning because it correctly detects the overflow of the buffer variable:

~]$ gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 fortify_test.c \ 
-o fortify_test

In file included from /usr/include/string.h:636:0,
from fortify_test.c:2:
In function ‘strcpy’,
inlined from ‘main’ at fortify_test.c:7:8:
/usr/include/bits/string3.h:104:3: warning: call to 
__builtin___memcpy_chk will always overflow destination buffer 
[enabled by default]
return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
^

If we disassemble the binary output of the above command, we can see the call to <__memcpy_chk@plt>, which checks for a potential buffer overflow:

~]$ objdump -d ./fortify_test
...
00000000004004b0 :
4004b0:       48 83 ec 18             sub    $0x18,%rsp
4004b4:       ba 05 00 00 00          mov    $0x5,%edx
4004b9:       bf 80 06 40 00          mov    $0x400680,%edi
4004be:       48 89 e6                mov    %rsp,%rsi
4004c1:       31 c0                   xor    %eax,%eax
4004c3:       e8 a8 ff ff ff          callq  400470 <printf@plt>
4004c8:       48 89 e7                mov    %rsp,%rdi
4004cb:       b9 05 00 00 00          mov    $0x5,%ecx
4004d0:       ba 09 00 00 00          mov    $0x9,%edx
4004d5:       be b0 06 40 00          mov    $0x4006b0,%esi
4004da:       e8 b1 ff ff ff          callq  400490<__memcpy_chk@plt> 
4004df:       48 89 e6                mov    %rsp,%rsi
4004e2:       ba 05 00 00 00          mov    $0x5,%edx
4004e7:       bf 80 06 40 00          mov    $0x400680,%edi
4004ec:       31 c0                   xor    %eax,%eax
4004ee:       e8 7d ff ff ff          callq  400470 <printf@plt>
4004f3:       31 c0                   xor    %eax,%eax
4004f5:       48 83 c4 18             add    $0x18,%rsp
4004f9:       c3                      retq   
4004fa:       66 90                   xchg   %ax,%ax
...

However, if the program is modified so that the strcpy function takes a value that has a variable length, compiling it will not return any warnings:

// fortify_test.c
#include<stdio.h>
#include<string.h>

int main(int argc, char **argv) {
    char buffer[5];
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));

    // String length is determined at runtime
    strcpy(buffer, argv[1]);
    printf ("Buffer Contains: %s , Size Of Buffer is %d\n",
            buffer, sizeof(buffer));
}
~]$ gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 fortify_test.c \ 
    -o fortify_test
~]$

Because FORTIFY_SOURCE cannot predict the length of the string that is passed from argv[1], the compiler does not return any warning related to a buffer overflow at compile time. If we run this program and supply it with a string that triggers a buffer overflow, the program terminates:

~]$ ./fortify_test $(python -c 'print "\x41" * 360')
Buffer Contains: �Q��� , Size Of Buffer is 5
*** buffer overflow detected ***: ./fortify_test terminated
======= Backtrace: =========
/lib64/libc.so.6[0x382d875cff]
/lib64/libc.so.6(__fortify_fail+0x37)[0x382d906b17]
/lib64/libc.so.6[0x382d904d00]
./fortify_test[0x4004dd]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x382d821d65]
./fortify_test[0x400525]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fd:05 9967292 /home/sid/security/ \
fortify/fortify_test
00600000-00601000 r--p 00000000 fd:05 9967292 /home/sid/security/ \
fortify/fortify_test
00601000-00602000 rw-p 00001000 fd:05 9967292 /home/sid/security/ \
fortify/fortify_test
013ee000-0140f000 rw-p 00000000 00:00 0 [heap]
382d400000-382d420000 r-xp 00000000 fd:03 922951 /usr/lib64/ld-2.18.so
382d61f000-382d620000 r--p 0001f000 fd:03 922951 /usr/lib64/ld-2.18.so
382d620000-382d621000 rw-p 00020000 fd:03 922951 /usr/lib64/ld-2.18.so
382d621000-382d622000 rw-p 00000000 00:00 0
382d800000-382d9b4000 r-xp 00000000 fd:03 928040 /usr/lib64/ \
libc-2.18.so
382d9b4000-382dbb4000 ---p 001b4000 fd:03 928040 /usr/lib64/ \
libc-2.18.so
382dbb4000-382dbb8000 r--p 001b4000 fd:03 928040 /usr/lib64/ \
libc-2.18.so
382dbb8000-382dbba000 rw-p 001b8000 fd:03 928040 /usr/lib64/ \
libc-2.18.so
382dbba000-382dbbf000 rw-p 00000000 00:00 0
382f800000-382f815000 r-xp 00000000 fd:03 928048 /usr/lib64/ \
libgcc_s-4.8.2-20131212.so.1
382f815000-382fa14000 ---p 00015000 fd:03 928048 /usr/lib64/ \
libgcc_s-4.8.2-20131212.so.1
382fa14000-382fa15000 r--p 00014000 fd:03 928048 /usr/lib64/ \
libgcc_s-4.8.2-20131212.so.1
382fa15000-382fa16000 rw-p 00015000 fd:03 928048 /usr/lib64/ \
libgcc_s-4.8.2-20131212.so.1
7ffb727a5000-7ffb727a8000 rw-p 00000000 00:00 0
7ffb727cc000-7ffb727cf000 rw-p 00000000 00:00 0
7fffa1945000-7fffa1967000 rw-p 00000000 00:00 0 [stack]
7fffa19fe000-7fffa1a00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

To conclude this post, the use of FORTIFY_SOURCE is highly recommended when compiling source code. Developers should ensure their code uses the protected functions by including the correct headers, compile with the -D_FORTIFY_SOURCE option, and use an optimization level of 1 or greater. It is also important to review the compiler output after building the source code to spot any anomalies that FORTIFY_SOURCE detected.

For more information on FORTIFY_SOURCE, refer to http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02055.html

March 23, 2014

OpenLMI Developer Projects

“How can I get started with OpenLMI?”

From one perspective there is an easy answer – start using LMIShell scripts and lmi commands to manage systems.

At the other extreme you can write a complete OpenLMI Provider.

With the availability of LMIShell there is another answer – to develop a program or script that builds on the OpenLMI infrastructure to do useful work. Developing client tools in LMIShell provides an easy entry point to learning OpenLMI and understanding how to apply it to more complex problems. Using Python or Java you can develop reasonably small programs that do interesting things. With the examples contained in the LMIShell scripts and the OpenLMI API documentation you have everything you need to learn by doing.

Experience has shown us that the greatest challenge in using OpenLMI is understanding the underlying object model and the low level API. LMIShell provides a task-oriented interface that is easier to learn and use. After starting with LMIShell you can look inside the LMIShell scripts and see how they work with the low level API. Starting from the top down, getting immediate results, and learning by doing makes OpenLMI much more approachable. Working with LMIShell and high level interfaces to accomplish useful tasks is a great help to using OpenLMI for system management and even for doing complex tasks such as developing OpenLMI Providers.

We have added some suggestions for interesting projects to the Sandbox page on the OpenLMI web site. These are interesting, useful projects that you might consider as a starting point. Most are designed to be implemented in Python on top of LMIShell. However, if you prefer, there is nothing stopping you from implementing them in Java – or even “C” or C++! Some of them are very straightforward and some are a bit more complex. If nothing else, these might give you some ideas for your own project.

Speaking of which, if you have any ideas for interesting and useful projects built on OpenLMI, please propose them!


March 16, 2014

Security through obscurity
It's not uncommon for folks to note that if the source code is available, it's easier for the hackers to figure out what's going on. Those of us in the open source security universe have long claimed that's not true, but it's really hard to back up such a claim. It's a bit like proving the nonexistence of something.

I ran across a blog entry where someone shows how one goes about analyzing a flaw fixed in Windows to know what was fixed, and to construct a reproducer.
CVE-2014-0301 Analysis
This is really well written, and I'm quite happy to admit I learned a fair bit from it.

Remember, the only way to have good security is to have good security. There are no magic tricks.

March 12, 2014

The trouble with snprintf

At least historically, misuse of functions like strcpy, strcat, and sprintf was a common source of buffer overflow vulnerabilities. Therefore, in 1997, the Single UNIX Specification, Version 2, included a new interface for string construction that provided an explicit length of the output string: snprintf. This function can be used for string construction with explicit length checking.

Originally, it could be used in the following way:

    /* buff is a pointer to a buffer of blen characters. */
    /* Note well: This example is now incorrect. */
    char *cp = buff;
    if ((n = snprintf(cp, blen, "AF=%d ", sau->soa.sa_family)) < 0) {
        Warn1("sockaddr_info(): buffer too short ("F_Zu")", blen);
        *buff = '\0';
        return buff;
    }
    cp += n,  blen -= n;

This example is based on socat, but this coding pattern is still fairly common. The socat case is likely harmless as far as potential security impact is concerned.

The code in the above example avoids writing too much data to the buff pointer because early snprintf implementations returned -1 if the output string was truncated, based on this requirement from the Single UNIX Specification, Version 2:

Upon successful completion, these functions return the number of bytes transmitted excluding the terminating null in the case of sprintf() or snprintf() or a negative value if an output error was encountered.

However, the example code is insecure when compiled on current systems.

The specification quoted above is still ambiguous with regard to truncation, something that would be addressed during the standardization of the next version of C, ISO C99. As a result of that, in 2002, version 3 of the Single UNIX Specification was published, aligning the snprintf behavior with ISO C99:

Upon successful completion, the snprintf() function shall return the number of bytes that would be written to s had n been sufficiently large excluding the terminating null byte.

There is a remaining discrepancy between ISO C99 and POSIX regarding the EOVERFLOW return value, but we will ignore that. As far as the history can be retraced now, the GNU C Library adopted the ISO C99 behavior some time in 1998.

After this specification change, truncated output does not result in an error return value any more. Even worse, the result value exceeds the passed buffer length, making the pointer and length adjustment in the example invalid:

    cp += n,  blen -= n;

If the cp pointer and the blen argument are used in subsequent snprintf calls (which is often the case when the result from snprintf is used in pointer arithmetic), the buffer overflow vulnerability that snprintf was supposed to deal with resurfaces: cp points outside of the original buffer, and blen wraps around (after the conversion in size_t), resulting in a value that does not stop snprintf from writing to the invalid pointer.

Covering this error condition is somewhat difficult:

    char *cp = buff;
    n = snprintf(cp, blen, "AF=%d ", sau->soa.sa_family);
    if (n < 0) {
        Warn1("sockaddr_info(): snprintf failed: %s", strerror(errno));
        *buff = '\0';
        return buff;
    } else if (n >= blen) {
        Warn1("sockaddr_info(): buffer too short ("F_Zu")", blen);
        *buff = '\0';
        return buff;
    }
    cp += n,  blen -= n;

As a more convenient substitute, it is possible to ignore the return value from snprintf altogether, and acquire the number of written characters using strlen:

    char *cp = buff;
    assert(blen >= 1);
    *buff = 0;
    snprintf(cp, blen, "AF=%d ", sau->soa.sa_family);
    blen -= strlen(cp);
    cp += strlen(cp);

This code assumes that the snprintf implementation does not write an unterminated string to the destination buffer on error, which is quite reasonable as far as such assumptions go.  The value of strlen(cp) will always be less than the value of blen, so a subsequent snprintf will have room to write the null terminator.

Enhancing -D_FORTIFY_SOURCE=2 to cover the original example code reliably is difficult because GCC cannot track the size information through the pointer arithmetic following the snprintf call, so it is not available to subsequent snprintf calls. Another option would be to have snprintf abort in fortify mode when the buffer length passed in is INT_MAX or larger. Adding logic to GCC to deal with this snprintf oddity specifically is a bit dubious, considering that this only deals with misuse of a single library function.

Curiously, snprintf is not the only function that suffered from an interface change as the result of standardization. Another example is strerror_r, the thread-safe variant of strerror. Even today, it exists in two variants in the GNU C library, one that returns a pointer value (used with -D_GNU_SOURCE) and one that returns an int (the standardized version).

One can only hope that with increased openness of standardization processes and more participation from the free software community in the creation of mostly proprietary standard documents, future recurrences of this kind of problem can be avoided, for example by standardizing interfaces with conflicting implementations under completely new names.

March 11, 2014

OpenLMI CLI Interface Now Supports Networking

The OpenLMI CLI now includes commands to configure networks.

lmi> help net

Networking service management.

Usage:

net device (--help | show [<device_name> ...] | list [<device_name> ...])

net setting (--help | <operation> [<args>...])

net activate <caption> [<device_name>]

net deactivate <caption> [<device_name>]

net enslave <master_caption> <device_name>

net address (--help | <operation> [<args>...])

net route (--help | <operation> [<args>...])

net dns (--help | <operation> [<args>...])

Commands:

device Display information about network devices.

setting Manage the network settings.

activate Activate setting on given network device.

deactivate Deactivate the setting.

enslave Create new slave setting.

address Manipulate the list of IP addresses on given setting.

route Manipulate the list of static routes on given setting.

dns Manipulate the list of DNS servers on given setting.

As you can see from the help command, you can determine what network devices are available, read and modify their configuration, and bring NICs up and down.

List network devices:

lmi> net device list

ElementName OperatingStatus MAC Address

em1 InService 3C:97:0E:4B:2E:53

lo NotAvailable 00:00:00:00:00:00

virbr0-nic NotAvailable 52:54:00:DF:BD:C4

wlp3s0 Dormant 60:67:20:C9:0B:DC

Get details on a network device:

lmi> net device show em1

Device em1:

Operating Status InService

MAC Address 3C:97:0E:4B:2E:53

IP Address 10.18.57.222/255.255.127.0

IP Address fe80::3e97:eff:fe4b:2e53/64

IP Address 2620:52:0:1238:3e97:eff:fe4b:2e53/64

Default Gateway 10.18.57.254

Default Gateway ::

Default Gateway ::

DNS Server 10.5.30.160

DNS Server 10.11.5.19

Active Setting System em1

Available Setting System em1

The LMI CLI commands for networking allow you to show the device settings, change device settings, activate and deactivate a network device, and add or remove devices from a set of bonded or bridged devices.


March 10, 2014

OpenLMI At Red Hat Summit

OpenLMI will be represented at the upcoming Red Hat Summit, which is being held in San Francisco from April 14-17.

Stephen Gallagher and I will be giving a talk on OpenLMI on Tuesday, April 15, at 10:40am. This talk will provide an overview of OpenLMI, cover its functional capabilities, and demonstrate using the LMIShell CLI and Scripts to accomplish common management tasks.

There will also be an OpenLMI demo in the Red Hat Pavilion on Wednesday, April 16, from 1:00pm-3:00pm. Drop by to see OpenLMI in action and to ask questions.

Finally, we would love to have the opportunity to discuss OpenLMI with you. Contact me to see about scheduling time for a meeting. This is a great chance to meet with the experts and make sure that your needs and requirements are being addressed.


March 06, 2014

OpenLMI CLI Interface Updates

Exciting news – a major update to the OpenLMI CLI is available! The new CLI adds support for configuring networks, reworks the storage commands, provides a command hierarchy in the interactive shell, and includes internal improvements.

Let’s start with the new LMI command structure. Use lmi help to see the new interface:

lmi> help

Static commands

===============

EOF exit help

Application commands (type help <topic>):

=========================================

file group hwinfo net power service storage sw system user

Built-in commands (type :help):

===============================

:.. :cd :pwd :help

One obvious change is that all of the storage related commands have been combined into a single top level storage command – you now begin all storage operations with the keyword storage. At the same time, a new shell option supports command hierarchy. If you are going to be entering a series of storage commands, you can “move down” to that level by using the “:cd” command. You move back up to a higher level in the command hierarchy by using the “:..” command.

lmi> :cd storage

>storage>

You can now enter storage commands directly – for example, list available storage devices:

lmi> :cd storage

>storage>

>storage> list

Name Size Format

/dev/sda 320072933376 MS-DOS partition table

/dev/mapper/luks-fe998f70-9da9-4049-88db-47d9db936b82 319545147392 physical volume (LVM)

/dev/sda1 524288000 ext4

/dev/sda2 319547244544 Encrypted (LUKS)

/dev/mapper/vg_rd230-lv_home 259879075840 ext4

/dev/mapper/vg_rd230-lv_root 53687091200 ext4

/dev/mapper/vg_rd230-lv_swap 5972688896 swap

>storage>

In a change from the previous version, the storage list command now includes only the friendly device name. This means you no longer need a 200 column terminal window to get all the information on one line.

Like the previous version of the storage command, you can get detailed information on each device. Entering storage show will list details for all storage devices. Entering storage show devicename will give detailed information for just that device. For example, to get details on device sda:

>storage> show sda

Name Value

/dev/disk/by-id/ata-HITACHI_HTS725032A7E630_TF1401Y1G0EZAF:

Name Value

Type Generic block device

DeviceID /dev/disk/by-id/ata-HITACHI_HTS725032A7E630_TF1401Y1G0EZAF

Name /dev/sda

ElementName sda

Total Size 320072933376

Block Size 512

Data Type Partition Table

Partition Table Type MS-DOS

Partition Table Size (in blocks) 1

Largest Free Space 0

Partitions /dev/sda1 /dev/sda2

>storage>

Another significant improvement is support for thin provisioning through the thinpool and thinlv commands:

>storage> help thinpool

Thin Pool management.

Usage:

thinpool list

thinpool create <name> <vg> <size>

thinpool delete <tp> …

thinpool show [ <tp> ...]

Commands:

list List all thin pools on the system.

create Create Thin Pool with given name and size from a Volume Group.

delete Delete given Thin Pools.

show Show detailed information about given Thin Pools. If no

Thin Pools are provided, all of them are displayed.

Options:

vg Name of the volume group, with or without `/dev/` prefix.

tp Name of the thin pool, with or without `/dev/` prefix.

size Requested extent size of the new volume group, by default in

bytes. ‘T’, ‘G’, ‘M’ or ‘K’ suffix can be used to specify

other units (TiB, GiB, MiB and KiB) – ’1K’ specifies 1 KiB

(=1024 bytes).

The suffix is case insensitive, i.e. 1g = 1G = 1073741824 bytes.

>storage> help thinlv

Thin Logical Volume management.

Usage:

thinlv list [ <tp> ...]

thinlv create <tp> <name> <size>

thinlv delete <tlv> …

thinlv show [ <tlv> ...]

Commands:

list List available thin logical volumes on given thin pools.

If no thin pools are provided, all thin logical volumes are

listed.

create Create a thin logical volume on given thin pool.

delete Delete given thin logical volume.

show Show detailed information about given Thin Logical Volumes. If no

Thin Logical Volumes are provided, all of them are displayed.

Options:

tp Name of the thin pool, with or without `/dev/` prefix.

size Size of the new logical volume, by default in bytes.

‘T’, ‘G’, ‘M’ or ‘K’ suffix can be used to specify other

units (TiB, GiB, MiB and KiB) – ’1K’ specifies 1 KiB

(= 1024 bytes).

The suffix is case insensitive, i.e. 1g = 1G = 1073741824

bytes.

>storage>

We will take a look at other parts of the OpenLMI CLI commands in future articles.


March 05, 2014

DAC check before MAC check. SELinux will stop wine'ing.
When it comes to SELinux, one of the most aggravating bugs we see is when the kernel does a MAC check before a DAC check. 

This means SELinux checks happen before normal ownership/permission checks.  I always prefer to have the DAC check happen first.  This is important because code that is attempting the denied access usually will handle the EPERM silently and go down a different code path.    But if a MAC Failure happens, SELinux writes an AVC to the audit log, and setroubleshoot reports it to the user.

One of the biggest offenders of this was the mmap_zero check.  Every time a process tries to map low kernel memory, the kernel denies it, in both DAC and MAC.  Wine applications are notorious for this.  We block mmap_zero because it can potentially trigger kernel bugs which can lead to privilege escalation.

Eric Paris explains the vulnerability here.

Even though the MAC check was done before the DAC check, the wine applications tend to work correctly: when a wine application attempts to mmap low memory, it gets denied, and then reattempts the mmap with a higher memory value.  But because the MAC check comes first, on an SELinux system the kernel generates an AVC.  The user sees something like:

SELinux is preventing /usr/bin/wine-preloader from 'mmap_zero' accesses on the memprotect.

Reading about mmap_zero scares users into thinking their machine is vulnerable.  The only thing SELinux policy writers can do is write a dontaudit rule or allow the access, which defeats the purpose of the check.

We still want to block this access if a privileged confined process attempts it, and report the SELinux violation.  If a confined application running as root attempts an mmap_zero access, SELinux should block it and report the AVC.  If a normal unprivileged process triggers the access check, we would prefer to let DAC handle it, and not print the message.

To give you an idea of how often people have seen this: Google "SELinux mmap_zero" and you will get more than 13,000 hits.

Today the upstream kernel has been fixed to do the mmap_zero MAC check AFTER the DAC check.

Thanks to Eric Paris and Paul Moore for fixing this issue.
certmonger-session

There is more to the certmonger story. A lot more. After my last attempt I tried to use certmonger:

  • as a user-launched process
  • to get a user certificate
  • direct from the dogtag instance behind FreeIPA

I was not 100% successful, but the attempt did have some positive results.


The whole thing started with an ipa-server-install.

To be able to talk directly to Dogtag, though, I opened a few additional ports on the firewall.

I added these lines to /etc/sysconfig/iptables:

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 8443 -j ACCEPT
service iptables restart

As part of the IPA install, it records a couple of PKCS12 files with enough information to talk to your Dogtag instance. These are in /root, but I copied them to the /home/fedora directory to make them accessible via scp to my desktop.

scp fedora@hostname:ca-agent.p12 .

I ran Firefox and imported the certificates. From the menu select Edit->Preferences. That will pop up the preferences dialog; select the Advanced tab and the Certificates subtab.

[Screenshot: Advanced preferences tab]

Click View Certificates, then "Your Certificates" on that panel, and finally click the Import button.

[Screenshot: cert-manager]

Yeah, that is pretty hostile.

I had to add the hostname and IP address for the IPA server into /etc/hosts on my laptop in order to get an https connection to Dogtag, but I got it with: https://hostname:8443/ Note that it uses the certificates imported from ca-agent.p12 to connect.

First I made things work as root: I created a new CA configuration for the certmonger system daemon:
[root@ipa cas]# cat /var/lib/certmonger/cas/ipa-ca

id=dogtag-ipa
ca_is_default=0
ca_type=EXTERNAL
ca_external_helper=/usr/libexec/certmonger/dogtag-ipa-renew-agent-submit -i /etc/ipa/ca.crt -E http://ipa.openstack.freeipa.org:8080/ca/ee/ca -A https://$HOSTNAME:8443/ca/agent/ca/

And I was able to request a certificate with:

sudo getcert request -c dogtag-ipa -f /etc/pki/testcert -k /etc/pki/testkey -N cn=someothername

That submitted the request. In an ordinary Dogtag setup, certificates are not automatically approved. I had to approve it using the web UI at:
https://$HOSTNAME:8443/ca/agent/ca/

Once it is approved, you can either wait 8 hours for the next query or run a resubmit to get the status and fetch the cert:

sudo getcert resubmit -i 20140226001609

And if you forgot the id you can get it with

sudo getcert list -s

The cert is fetched and placed into /etc/pki/testcert. The key should already have been made in /etc/pki/testkey.

(Actually, certmonger crashed, due to a bug that had already been fixed but not yet landed in the yum repo. I reinstalled with an internal build, but you can find yours here. When we restarted the certmonger daemon, it picked right up and succeeded.)

Now to try and get a user cert using the user-runnable version of certmonger. This is called certmonger-session, but you don’t run it directly; it is a dbus-driven service. This was done on the IPA server machine, which has no X server on it. If there were, there would be a dbus session daemon running, but since there is not:

export DBUS_SESSION_BUS_ADDRESS=`dbus-daemon --session --fork --print-address`

It looks like this:

unix:abstract=/tmp/dbus-ym4pEDwUcj,guid=e7a9adfd8ae299213a39f57a530d3891

To run certmonger:

getcert list-cas -s

The -s option makes it a session request as opposed to talking to the systemd managed server. This creates
~/.config/certmonger but no cas subfolders under there yet. They get written upon certmonger-session exit:

killall certmonger-session

Will dump them into .config/certmonger/cas/. They are named by time stamp:

ls .config/certmonger/cas/
20140226004653  20140226004653-1  20140226004653-2  20140226004653-3

Now create .config/certmonger/cas/ipa-dogtag (while the daemon is not running, as killing the daemon will overwrite the config…) with these contents (using your own hostname):

id=ipa-dogtag
ca_is_default=0
ca_type=EXTERNAL
ca_external_helper=/usr/libexec/certmonger/dogtag-ipa-renew-agent-submit -i /etc/ipa/ca.crt -E http://ipa.openstack.freeipa.org:8080/ca/ee/ca -A https://ipa.openstack.freeipa.org:8443/ca/agent/ca/ -T caUserCert

And check that your new CA entry is available with: getcert list-cas -s

CA 'SelfSign':
	is-default: no
	ca-type: INTERNAL:SELF
	next-serial-number: 01
CA 'IPA':
	is-default: no
	ca-type: EXTERNAL
	helper-location: /usr/libexec/certmonger/ipa-submit
CA 'certmaster':
	is-default: no
	ca-type: EXTERNAL
	helper-location: /usr/libexec/certmonger/certmaster-submit
CA 'dogtag-ipa-renew-agent':
	is-default: no
	ca-type: EXTERNAL
	helper-location: /usr/libexec/certmonger/dogtag-ipa-renew-agent-submit
CA 'ipa-dogtag':
	is-default: no
	ca-type: EXTERNAL
	helper-location: /usr/libexec/certmonger/dogtag-ipa-renew-agent-submit -i /etc/ipa/ca.crt -E http://ipa.openstack.freeipa.org:8080/ca/ee/ca -A https://ipa.openstack.freeipa.org:8443/ca/agent/ca/ -T caUserCert

Need a place to hold the certs:

mkdir ~/.pki

Since you are running as a user, unconfined, you don’t need to worry about SELinux. If you were somehow to run this non-interactively, you would need to run

sudo chcon -t cert_t .pki/

Or do something more permanent.

To request a user certificate:

getcert request -c ipa-dogtag -s -f ~/.pki/ayoung.cert.pem -k ~/.pki/ayoung.key.pem -N "uid=ayoung,cn=users,cn=accounts,dc=openstack,dc=freeipa,dc=org"

and to see the request

$ getcert list -s
Number of certificates and requests being tracked: 1.
Request ID '20140226005825':
status: CA_REJECTED
ca-error: Server at "http://$HOSTNAME:8080/ca/ee/ca/profileSubmit" replied: Subject Name Not Found

WHAT! Yeah, doesn’t work yet. Kill the request for now:

getcert stop-tracking -s -i 20140226005825

So it looks like the caUserCert profile does not accept a PKCS#10 request, as it is set up to accept requests from the web browser, and those use Certificate Request Message Format (CRMF). I’ve been able to hack around it (I think) but there is still more to learn here.

March 01, 2014

FreeIPA web call from Python

This was a response to a post of mine in 2010. The comment was unformatted in the response, and I wanted to get it readable. It's a great example of making a Kerberized web call.

Courtesy of Rich Megginson

Note: requires MIT kerberos 1.11 or later if you want to skip doing the kinit, and just let the script do the kinit implicitly with the keytab.

import kerberos
import sys
import os
import logging
from requests.auth import AuthBase
import requests
import json

# Logger used by IPAAuth below.
LOG = logging.getLogger(__name__)

class IPAAuth(AuthBase):
    def __init__(self, hostname, keytab):
        self.hostname = hostname
        self.keytab = keytab
        self.token = None

        self.refresh_auth()

    def __call__(self, request):
        if not self.token:
            self.refresh_auth()

        request.headers['Authorization'] = 'negotiate ' + self.token

        return request

    def refresh_auth(self):
        if self.keytab:
            os.environ['KRB5_CLIENT_KTNAME'] = self.keytab
        else:
            LOG.warn('No IPA client kerberos keytab file given')
        service = "HTTP@" + self.hostname
        flags = kerberos.GSS_C_MUTUAL_FLAG | kerberos.GSS_C_SEQUENCE_FLAG
        try:
            (_, vc) = kerberos.authGSSClientInit(service, flags)
        except kerberos.GSSError, e:
            LOG.error("caught kerberos exception %r" % e)
            raise e
        try:
            kerberos.authGSSClientStep(vc, "")
        except kerberos.GSSError, e:
            LOG.error("caught kerberos exception %r" % e)
            raise e
        self.token = kerberos.authGSSClientResponse(vc)


hostname, url, keytab, cacert = sys.argv[1:]

request = requests.Session()
request.auth = IPAAuth(hostname, keytab)
ipaurl = 'https://%s/ipa' % hostname
jsonurl = url % {'hostname': hostname}
request.headers.update({'Content-Type': 'application/json',
                        'Referer': ipaurl})
request.verify = cacert

myargs = {'method': 'dnsrecord_add',
          'params': [["testdomain.com", "test4.testdomain.com"],
                     {'a_part_ip_address': '172.31.11.4'}],
          'id': 0}
resp = request.post(jsonurl, data=json.dumps(myargs))
print resp.json()

myargs = {'method': 'dnsrecord_find', 'params': [["testdomain.com"], {}], 'id': 0}
resp = request.post(jsonurl, data=json.dumps(myargs))
print resp.json()

Run the script like this:

python script.py ipahost.domain.tld 'https://%(hostname)s/ipa/json' myuser.keytab /etc/ipa/ca.crt

February 28, 2014

SELinux Transitions do not happen on mountpoints mounted with nosuid.
Today one of our customers was trying to run OpenShift Enterprise and it was blowing up because of SELinux.
OpenShift sets up the Apache daemon to run /var/www/openshift/broker/script/broker_ruby.

When I looked at the log, it stated that Apache was not allowed to execute broker_ruby: permission denied.

ls -lZ /var/www/openshift/broker/script/broker_ruby
Shows that broker_ruby is labeled as httpd_sys_content_t

I went and looked at the policy, and saw:

sesearch -A -s httpd_t -t httpd_sys_content_t -p execute -C
DT allow httpd_t httpdcontent : file { ioctl read write create getattr setattr lock append unlink link rename execute open } ; [ httpd_enable_cgi httpd_unified && httpd_builtin_scripting && ]


This shows that the httpd_t (Apache) process is allowed to execute the broker_ruby script if all of the following booleans are enabled:
httpd_enable_cgi, httpd_unified, httpd_builtin_scripting

Turns out they were.  I then went back and looked at the AVC.

type=AVC msg=audit(28/02/14 13:56:52.702:24992) : avc:  denied  { execute_no_trans } for  pid=6031 comm=PassengerHelper path=/var/www/openshift/broker/script/broker_ruby dev=dm-3 ino=817 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:httpd_sys_content_t:s0 tclass=file

This AVC means that the Apache daemon (httpd_t) is not allowed to execute the broker_ruby application (httpd_sys_content_t) without a transition, meaning in the current label (httpd_t).

Which I understood, since when the above booleans are turned on, httpd_t is supposed to transition to httpd_sys_script_t when executing httpd_sys_content_t.  This sesearch command shows the transition rule:

sesearch -T -s httpd_t -t httpd_sys_content_t -c process -C
DT type_transition httpd_t httpd_sys_content_t : process httpd_sys_script_t; [ httpd_enable_cgi httpd_unified && httpd_builtin_scripting && ]


Why wasn't the process transitioning?

Then I remembered that SELinux transitions do not happen on mounted partitions that are mounted with the nosuid flag.

man mount
...
       nosuid Do not allow set-user-identifier or set-group-identifier bits to take effect. (This seems safe, but is in fact rather  unsafe  if  you have suidperl(1) installed.)


SELinux designers feel that a transition can be a potential privilege escalation similar to a suid root application.  Therefore if an administrator has told the system that no suid apps should be allowed on a mount point, then it also means no SELinux transitions will happen.

Removing the nosuid flag from the mount point fixes the problem.

February 26, 2014

Security audits through reimplementation

For many networking protocols and file formats, multiple implementations exist which interoperate with each other. A new implementation of a protocol or format diverges from previous implementations in subtle ways, at least initially. Such differences can uncover previously unnoticed corner cases which are not handled properly, and sometimes reveal security vulnerabilities.

For example, in the mid-90s, it was discovered that Samba’s SMB client, smbclient, did not restrict user name length in the same way Windows does, so that you could crash Windows SMB file servers using the smbclient program.  (Microsoft fixed this around Service Pack 3 of Windows NT 4.0 in 1997, in the patch Q161830.)  Today, such issues would be considered security problems, but at the time, they were barriers to interoperability (and occurred in the other direction as well, when an unexpected reply from a Windows system crashed Samba).

In a sense, this is a form of fuzz testing with non-random input data. In fact, parsers implemented in a more declarative manner than sequential, imperative code can often be used as a template for very efficient fuzzers.

Similarly, porting from 32 bit to 64 bit exposes raw pointers used over the network.  Implementation strategies for network protocol parsers in C vary, but older code often uses (packed) structs.  If these include pointers, their sizes change, resulting in a fairly obvious interoperability failure.

Cryptographic misuse is revealed in a reimplementation as well.  A cursory look at the implementation of the kwallet encrypted password store for KDE suggests that it uses Blowfish in cipher-block-chaining (CBC) mode, with a random initialization vector.  This would be good news.  But when you try to implement an independent decoder outside of the KDE ecosystem, you will notice that you cannot actually decrypt the stored passwords using CBC mode.  Despite source code files called cbc.cc, kwallet does not actually chain cipher blocks, and uses Blowfish in electronic codebook mode, which, in combination with other issues, might make cryptanalysis of stored passwords feasible.  (To my knowledge, this case of misused cryptography was actually discovered in this manner several years ago, but reported privately and not fixed.  It was eventually rediscovered and assigned CVE-2013-7252.)

In short, every time you reimplement a protocol or file format and it causes an existing implementation to crash or enter an infinite loop, you may have discovered a security problem. In this case, please contact us.

February 25, 2014

Using Certmonger to Generate a selfsign Cert for CMS

We want to replace the shell call to openssl for certificate generation in Keystone (and the rest of OpenStack) with calls to Certmonger. Certmonger supports both OpenSSL and NSS. Certmonger can support a selfsigned approach, as well as tie in to a real Certificate Authority. Here are the steps I took to test out selfsigning, as well as my notes for follow on work.

Request a certificate:

sudo selfsign-getcert request -f /etc/pki/testcert -k /etc/pki/testkey

copy certs to /tmp and sign

 cat /opt/stack/python-keystoneclient/examples/pki/cms/auth_token_unscoped.json |  openssl cms -sign -signer /tmp/testcert -inkey /tmp/testkey -outform PEM -nosmimecap -nodetach -nocerts -noattr  -out /tmp/auth_token_unscoped.pem

and verify with

openssl cms -verify -certfile /tmp/testcert -CAfile /tmp/testcert -inform PEM -in auth_token_unscoped.pem

Need to clean up SELinux:

A working one is shown here:

matchpathcon /etc/pki/tls/private/
/etc/pki/tls/private	system_u:object_r:cert_t:s0

Need to make this look the same-ish

 matchpathcon /etc/keystone/ssl/certs/

What can certmonger do on files? Check with:

sudo sesearch --allow -s certmonger_t -c file -t cert_t

Returns

Found 3 semantic av rules:
   allow certmonger_t cert_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow nsswitch_domain cert_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain cert_t : file { ioctl read getattr lock open } ; 

So we need to add a rule in the standard policy to label /etc/keystone/ssl (and all subdirs) as cert_t.

Thanks to Nalin Dahyabhai for helping me work this out.

February 23, 2014

Hacking a Wifi Kettle

Here is a quick writeup of the protocol for the iKettle taken from my Google+ post earlier this month. This protocol allows you to write your own software to control your iKettle or get notifications from it, so you can integrate it into your desktop or existing home automation system.

The iKettle is advertised as the first wifi kettle, available in the UK since February 2014. I bought mine on pre-order back in October 2013. When you first turn on the kettle it acts as a wifi hotspot, and they supply an app for Android and iPhone that reconfigures the kettle to connect to your local wifi hotspot instead. The app then communicates with the kettle on your local network, enabling you to turn it on, set some temperature options, and get a notification when it has boiled.

Once connected to your local network the device responds to ping requests and listens on two TCP ports, 23 and 2000. The wifi connectivity is provided by a third-party serial-to-wifi interface board, which responds similarly to an HLK-WIFI-M03. Port 23 is used to configure the wifi board itself (to tell it what network to connect to and so on). Port 2000 is passed through to the processor in the iKettle to handle the main interface to the kettle.

Port 2000, main kettle interface

The iKettle wifi interface listens on TCP port 2000; all devices that connect to port 2000 share the same interface and therefore receive the same messages. The specification for the wifi serial board states that the device can only handle a few connections to this port at a time. The iKettle app also uses this port to do the initial discovery of the kettle on your network.

Discovery

Sending the string "HELLOKETTLE\n" to port 2000 will return with "HELLOAPP\n". You can use this to check you are talking to a kettle (and if the kettle has moved addresses due to DHCP you could scan the entire local network looking for devices that respond in this way). You might receive other HELLOAPP commands at later points as other apps on the network connect to the kettle.
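
Here is a minimal Python sketch of that handshake; the kettle address and the 2 second timeout are just placeholders, not values from the protocol itself:

import socket

def find_kettle(host, port=2000, timeout=2.0):
    """Return True if the device at host answers the iKettle hello."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(b"HELLOKETTLE\n")
        reply = sock.recv(64)
    finally:
        sock.close()
    return reply.startswith(b"HELLOAPP")

print(find_kettle("192.168.1.50"))   # hypothetical kettle address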

Initial Status

Once connected you need to figure out if the kettle is currently doing anything, as you will have missed any previous status messages. To do this you send the string "get sys status\n". The kettle will respond with the string "sys status key=\n" or "sys status key=X\n" where X is a single character. Bitfields in character X tell you what buttons are currently active:

Bit 6: 100C
Bit 5: 95C
Bit 4: 80C
Bit 3: 65C
Bit 2: Warm
Bit 1: On

So, for example if you receive "sys status key=!" then buttons "100C" and "On" are currently active (and the kettle is therefore turned on and heating up to 100C).
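
Decoding the status character is just bit masking. A rough Python sketch using the bit-to-button mapping from the table above:

# Bit 1 is the least significant bit of the status character.
BUTTONS = {1: "On", 2: "Warm", 4: "65C", 8: "80C", 16: "95C", 32: "100C"}

def decode_status(key_char):
    """Return the buttons that are active in a 'sys status key=X' reply."""
    value = ord(key_char)
    return [name for bit, name in sorted(BUTTONS.items()) if value & bit]

print(decode_status("!"))   # ['On', '100C'] -- on and heating to 100C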

Status messages

As the state of the kettle changes, either by someone pushing the physical button on the unit, using an app, or sending the command directly, you will get async status messages. Note that although the status messages start with "0x" they are not really hex. Here are all the messages you could see:

sys status 0x100     100C selected
sys status 0x95      95C selected
sys status 0x80      80C selected
sys status 0x100     65C selected
sys status 0x11      Warm selected
sys status 0x10      Warm has ended
sys status 0x5       Turned on
sys status 0x0       Turned off
sys status 0x8005    Warm length is 5 minutes
sys status 0x8010    Warm length is 10 minutes
sys status 0x8020    Warm length is 20 minutes
sys status 0x3       Reached temperature
sys status 0x2       Problem (boiled dry?)
sys status 0x1       Kettle was removed (whilst on)

You can receive multiple status messages given one action, for example if you turn the kettle on you should get a "sys status 0x5" and a "sys status 0x100" showing the "on" and "100C" buttons are selected. When the kettle boils and turns off you'd get a "sys status 0x3" to notify you it boiled, followed by a "sys status 0x0" to indicate all the buttons are now off.

Sending an action

To send an action to the kettle you send one or more action messages corresponding to the physical keys on the unit. After sending an action you'll get status messages to confirm them.

set sys output 0x80      Select 100C button
set sys output 0x2       Select 95C button
set sys output 0x4000    Select 80C button
set sys output 0x200     Select 65C button
set sys output 0x8       Select Warm button
set sys output 0x8005    Warm option is 5 mins
set sys output 0x8010    Warm option is 10 mins
set sys output 0x8020    Warm option is 20 mins
set sys output 0x4       Select On button
set sys output 0x0       Turn off
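
Putting the two tables together, turning the kettle on and waiting for the boil notification might look something like this rough, untested Python sketch (the address is again a placeholder):

import socket

def boil(host, port=2000):
    """Press the On button and wait for the 'reached temperature' message."""
    sock = socket.create_connection((host, port))
    try:
        sock.sendall(b"set sys output 0x4\n")     # Select On button
        buf = b""
        while b"sys status 0x3" not in buf:       # 0x3 = reached temperature
            chunk = sock.recv(256)
            if not chunk:
                break                             # connection closed
            buf += chunk
    finally:
        sock.close()

boil("192.168.1.50")   # hypothetical kettle address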

Port 23, wifi interface

The user manual for this wifi board is available online, so there is no need to repeat it here. The iKettle uses the board with the default password of "000000" and disables the web interface.

If you're interested in looking at the web interface you can enable it by connecting to port 23 using telnet or nc, entering the password, then issuing the commands "AT+WEBS=1\n", then "AT+PMTF\n", then "AT+Z\n"; after that you can open up the web server on port 80 of the kettle and change or review the settings. I would not recommend you mess around with this interface; you could easily break the iKettle in a way that you can't easily fix. The interface gives you the option of uploading new firmware, but if you do this you could get into a state where the kettle processor can't correctly configure the interface and you're left with a broken kettle. Also the firmware is just for the wifi serial interface, not for the kettle control (the port 2000 stuff above), so there probably isn't much point.

Missing functions

The kettle processor knows the temperature but it doesn't expose that in any status message. I did try brute forcing the port 2000 interface using combinations of words in the dictionary, but I found no hidden features (and the folks behind the kettle confirmed there is no temperature read out). This is a shame since you could combine the temperature reading with time and figure out how full the kettle is whilst it is heating up. Hopefully they'll address this in a future revision.

Security Implications

The iKettle is designed to be contacted only through the local network - you don't want to be port forwarding to it through your firewall, for example, because the wifi serial interface is easily crashed by too many connections or bad packets. If you have access to a local network on which there is an iKettle you can certainly cause mischief by boiling the kettle, resetting it to factory settings, and probably even bricking it forever. However the cleverly designed segmentation between the kettle control and wifi interface means it's pretty unlikely you can do something more serious like overriding safety features (i.e. keeping the kettle element on until something physically breaks).

February 15, 2014

Compressed tokens

The maximum header size between an HTTPD and a WSGI process is fixed at 8 kilobytes. With a sufficiently large catalog, the token in PKI format won’t fit. Compression seems like it would be such an easy solution. But there is a hobgoblin or two hiding in the shadows.

Background

In the current implementation (as of February 2014), PKI tokens are produced by signing a JSON document using the CMS (Cryptographic Message Syntax) utility in the OpenSSL toolkit.

The command line to sign looks something like this:

 openssl cms -sign -in $json_file -nosmimecap -signer $CERTS_DIR/signing_cert.pem -inkey $PRIVATE_DIR/signing_key.pem -outform PEM -nodetach -nocerts -noattr -out ${json_file/.json/.pem}

PEM is a base64 encoded format, and seems like a reasonable solution. It produces a document like this:

-----BEGIN CMS-----
MIIDKAYJKoZIhvcNAQcCoIIDGTCCAxUCAQExCTAHBgUrDgMCGjCCATUGCSqGSIb3
DQEHAaCCASYEggEieyJhY2Nlc3MiOiB7InRva2VuIjogeyJleHBpcmVzIjogIjIx
MTItMDgtMTdUMTU6MzU6MzRaIiwgImlkIjogIjAxZTAzMmM5OTZlZjQ0MDZiMTQ0
MzM1OTE1YTQxZTc5In0sICJzZXJ2aWNlQ2F0YWxvZyI6IHt9LCAidXNlciI6IHsi
dXNlcm5hbWUiOiAidXNlcl9uYW1lMSIsICJyb2xlc19saW5rcyI6IFtdLCAiaWQi
OiAiYzljODllM2JlM2VlNDUzZmJmMDBjNzk2NmY2ZDNmYmQiLCAicm9sZXMiOiBb
eyduYW1lJzogJ3JvbGUxJ30seyduYW1lJzogJ3JvbGUyJ30sXSwgIm5hbWUiOiAi
dXNlcl9uYW1lMSJ9fX0xggHKMIIBxgIBATCBpDCBnjEKMAgGA1UEBRMBNTELMAkG
A1UEBhMCVVMxCzAJBgNVBAgTAkNBMRIwEAYDVQQHEwlTdW5ueXZhbGUxEjAQBgNV
BAoTCU9wZW5TdGFjazERMA8GA1UECxMIS2V5c3RvbmUxJTAjBgkqhkiG9w0BCQEW
FmtleXN0b25lQG9wZW5zdGFjay5vcmcxFDASBgNVBAMTC1NlbGYgU2lnbmVkAgER
MAcGBSsOAwIaMA0GCSqGSIb3DQEBAQUABIIBAFq4JvODBIaoHiYG6KMCnBEhDjWS
CuW0gq3kbi3j8kOzb4Mr7Muq0XvGMRwDrZlkfSpzIyuri/Fzf2pW58hnjWfDHQ1S
laAWLs6csh2u80hgWpMngCN5ZVFtIIbWlE0ZuLZh8p7E0IJZnNvYmlOVrmIkRo+J
1vMr71HZr5/kFcJzFVgi8QI4XU5iBPsUWOdJJV+0jXkMHVqOX3H297CYCePaotLD
azuquE74N8KMyl8j8jE9wi9O1gVBqO4L66ePjt5zI/TrjbjKwdseqoZR1dDGlp5V
awRwRYCjsKF+asAbuASOwdSgP8V6VgTOUrZh2D8KHtclwS+URoTdVl4ypQA=
-----END CMS-----

This token needs to end up in the X-Auth-Token header in requests to the other OpenStack services. Since this is a multi-line document, it cannot be sent without removing the line breaks.

It also has a problem with encoding: the “/” characters in the Base64 encoding are not valid in a header. To handle that, we swap them for a safe character: “-”.

Since the header and footer are the same in every token, we can strip them off as well.

The following Python code converts the PEM format to a “token”

def cms_to_token(cms_text):
    start_delim = "-----BEGIN CMS-----"
    end_delim = "-----END CMS-----"
    signed_text = cms_text
    signed_text = signed_text.replace('/', '-')
    signed_text = signed_text.replace(start_delim, '')
    signed_text = signed_text.replace(end_delim, '')
    signed_text = signed_text.replace('\n', '')
    return signed_text
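
On the validation side the transformation has to be undone before the data is handed back to OpenSSL. A minimal sketch of the inverse, re-adding the delimiters, re-wrapping the Base64 body, and swapping the dashes back (an illustration, not the exact Keystone code):

def token_to_cms(signed_text):
    """Rebuild the PEM document from the header-less, dash-substituted token."""
    start_delim = "-----BEGIN CMS-----"
    end_delim = "-----END CMS-----"
    copy_of_text = signed_text.replace('-', '/')   # undo the dash substitution
    # re-wrap the Base64 body at 64 characters per line
    lines = [copy_of_text[i:i + 64] for i in range(0, len(copy_of_text), 64)]
    return "%s\n%s\n%s\n" % (start_delim, "\n".join(lines), end_delim)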

Base64

It turns out that the slash-to-dash conversion was a mistake: we now have a non-standard Base64 encoding that we have to support for the near future. Instead, we should have been using a standard Python utility to do the Base64 in a URL-safe manner. Instead of signing the token using the PEM format and converting, we could and should have signed in the binary (DER) format that is underneath it, and encoded it ourselves. Then, in validation, we could have just reversed the process. In Python:

        
        text = json.dumps(token_data).encode('utf-8')
        signed = cms.cms_sign_text(text,
                                   signing_cert_file_name,
                                   signing_key_file_name,
                                   "DER")
        encoded = PREFIX + base64.urlsafe_b64encode(signed)

zlib

Python has built in support for zlib compression. 

With the following logic:

        text = json.dumps(token_data).encode('utf-8')
        signed = cms.cms_sign_text(text,
                                   signing_cert_file_name,
                                   signing_key_file_name,
                                   "DER")
        compressed = zlib.compress(signed, 6)
        encoded = base64.urlsafe_b64encode(compressed)

We could produce a token that is the equivalent of the current signed tokens, but about 10% of the size; actual results depend on the compression level used and the token data.

cms compression

I was excited to find that the OpenSSL CMS command line supports both document signing and compression until I realized that the OpenSSL CMS command line does not support both document signing and compression at the same time.

When you call the command, it either compresses or signs, not both at once. Bah! So, it makes sense to just sign to DER format, and then compress and encode using Python libraries.

Issues

Why can’t we just jump to the format shown above? There are several issues, and understanding them should help clarify the approach to the solution.

Compatibility

First, we have a commitment to support older versions of the code for a while, and people do not upgrade all of their tools in lockstep. That means that we need a system that can accept both the older and newer formats of the tokens. On the verification side, that means that auth_token_middleware needs to handle the old and the new formats equally. We can do that with code like the following:

    try:
        data = base64.urlsafe_b64decode(token)
        decoded = decoder.decode(data)
        if decoded[0].typeId != 0:
            return decoded
    except (error.SubstrateUnderrunError, TypeError):
        try:
            copy_of_token = token.replace('-', '/')
            data = base64.urlsafe_b64decode(copy_of_token)
            decoded = decoder.decode(data)
            if decoded[0].typeId != 0:
                retval = decoded
            else:
                retval = False
            return retval
        except TypeError:
            return False

ASN1

Note that ASN1 parsing has two benefits. First, it checks that the data decoded from Base64 is real, and not garbage, which can happen by accident. In addition, it provides access to the signed data prior to calling the signature function. One of the most important pieces of information in there is the “Signer” field, which can later be used to select which certificate to use when validating the token. This will allow multiple, load balanced, Keystone servers to each have their own signing keys.

Detecting compression

The above code assumes that the underlying format is ASN1. We want to check for compression as well. The only sure-fire way is to decompress the string. This adds yet another failure case to the test of whether a token is ASN1, so it is probably best to make the change to URL-safe encoding and compression at the same time.
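
A sketch of what that detection could look like, assuming the compressed form is zlib data sitting under the URL-safe Base64 (just an illustration of the extra failure case, not the middleware code):

import base64
import zlib

def decoded_token_body(token):
    """Return the raw signed data, decompressing only when needed."""
    data = base64.urlsafe_b64decode(token)
    try:
        return zlib.decompress(data)    # token was compressed
    except zlib.error:
        return data                     # plain (uncompressed) signed data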

Code Duplication

There are two copies of the file cms.py in the Keystone code base: one in the server (keystone/common/cms.py) and one in the client (keystoneclient/common/cms.py), which was copied from the server. The first is used to sign tokens; the second is called by Auth Token Middleware to validate tokens. We have recently fixed a bug that will allow us to use the client code inside the server, and with that, we can make all of the changes for compression in one code base. This change is close to happening.

Token Format Identification

It might make more sense to prepend a text header to the token. Thus, instead of a token starting with “MII” like it currently does (assuming a typical length), we could prepend “{cms}” to indicate a CMS signed token. A compressed one would then have a different prefix, “{cmsz}”, and so on. This seems encouraging until you realize that it buys us very little: the deprecated form of the token would still have to be supported for a short time, and after that we would be left with a vestige that keeps the token from being any proper file format. The magic string {cms} would not be identified by the ‘file’ command. Adding in any additional transform (switching compression algorithm for example) would require either a different header, or internal detection of the format change.

However, the advantages seem to outweigh the disadvantages. With an explicit prefix, we know what operations to perform. I think the trick is to make the prefix forward compatible. I am currently leaning toward {keystone=v3.5} as this provides both a clear identifier of the purpose of the blob and a version number with which to move forward as formats change.

February 05, 2014

Embedded Vulnerability Detection command line tool

The Victims project is a Red Hat initiative that aims to detect known vulnerable dependencies in Java projects and deployments. Our initial focus was Java projects that were built using Maven. The victims-enforcer plug-in for Maven provides developers with immediate feedback if any of their project dependencies contain known vulnerabilities. However, until recently we did not have a good solution for scanning deployments or tools that work outside of a typical build and release cycle. The alpha release of the victims client for Java hopes to fill this gap. The victims client for Java is a simple command line tool that presently has the ability to scan jar files, directories, and pom.xml files for known vulnerabilities. It also allows you to synchronize with the victims project infrastructure and control local settings.

Getting started with the victims client for Java is relatively simple.

[Video: http://www.youtube.com/embed/E5crFoOff48]

  1. Download the alpha release of the victims tool.
  2. Build it using the ‘mvn clean package’ command. You need to have a Java SDK >= 1.5 and Maven installed on your machine.
  3. Run it: by default, the victims client for Java will compile into a standalone .jar file. This means that you can launch it from the build directory by running ‘$ java -jar target/victims-client-1.0-SNAPSHOT-standalone.jar’

To simplify the examples we can create an alias:

mkdir ~/.victims
cp target/victims-client*-standalone.jar ~/.victims
alias victims='java -jar $HOME/.victims/victims-client-1.0-SNAPSHOT-standalone.jar'

The goal of this release has been to present a small subset of capabilities to users with the aim of figuring out what additional features people require. The rest of this article will focus on the various use cases for the client tool.

The first and most important step when using the client is to synchronize with the victims Embedded Vulnerability Detection (EVD) service. This will download all the vulnerability definitions from the remote service, which can take a while. To do this you need to specify the ‘--update’ flag when running the client. Specifying ‘--update’ on subsequent runs of the tool will check to make sure that no additional updates are available.

# Getting updates
$ victims --update

Updating EVD definitions:

# Checking last update time
$ victims --db-status
Database last updated on: Mon Dec 16 14:37:48 EST 2013

With the database up to date, it is now possible to scan jar files. If the victims client for Java test ran during the build, then you should have some example files that you can run the scans against in the ‘.testdata’ directory.

To run a scan against a single jar file, simply provide the file name:

$ victims .testdata/org/springframework/spring/2.5.6/spring-2.5.6.jar
.testdata/org/springframework/spring/2.5.6/spring-2.5.6.jar VULNERABLE! CVE-2009-1190 CVE-2011-2730 CVE-2010-1622

You can also recursively scan a directory for known vulnerable artifacts:

# Warning - this will take a while..
$ victims --recursive ~/.m2
/home/gm/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar VULNERABLE! CVE-2012-5783
/home/gm/.m2/repository/xerces/xercesImpl/2.9.1/xercesImpl-2.9.1.jar VULNERABLE! CVE-2009-2625
(etc..)

If you use Maven to build your projects, you can also run the victims client for Java across your entire source directory. Any pom.xml files that are detected will have all dependency Group, Artifact, and Version (GAV) information cross checked against the victims database entries.

That covers the main use cases of the tool. We are looking for alpha testers to help improve the capabilities and iron out any bugs. If you do take the time to test this project out, please give us feedback by raising issues on GitHub, contacting us on #victi.ms on freenode, or via the development mailing list.

February 04, 2014

MIDI, ALSA, USB, and JACK

Akai recently released a USB version of their Electronic Wind Instrument (EWI) which I was able to purchase for under $200. I was fairly quickly able to get it running using QJackCtl and QSynth. But then I wanted to understand what was happening. That involved spelunking into the four subsystems that make up the title of this post.

Quick, non-canonical definitions

MIDI: Event notification framework for Music and more
ALSA: Advanced Linux Sound Architecture
USB: Universal Serial Bus
JACK: Jack Audio Connection Kit

To get a sound out of the EWI, I ran two Qt based programs:

qjackctl &
qsynth &

[Screenshot from 2014-01-27 12:22:15]

And I connected the EWI to the synth. Note that it was on the ALSA tab, not Audio nor MIDI. Nothing shows up on the Midi tab, and the Audio tab shows only System and the Synthesizer. What is going on here?

Turns out I don’t know JACK. This is OK, as it turns out there is no JACK here. QJackCtl actually talks direct to ALSA to do all of the setup.

Thanks to the folks on Freenode #jack for setting me straight.

I’ve since plugged in Rosegarden, and it shows up between the EWI and the synth.  You’ll notice that in the output below.

The command line program that shows the info I want is aconnect. The input devices are:

$ aconnect -l -i
client 0: 'System' [type=kernel]
    0 'Timer           '
    1 'Announce        '
	Connecting To: 129:0, 130:0
client 14: 'Midi Through' [type=kernel]
    0 'Midi Through Port-0'
client 20: 'EWI-USB' [type=kernel]
    0 'EWI-USB MIDI 1  '
	Connecting To: 129:0
client 129: 'rosegarden' [type=user]
    1 'sync out        '
    2 'external controller'
    3 'out 1 - General MIDI Device'
	Connecting To: 128:0

and the output devices are

$ aconnect -l -o
client 14: 'Midi Through' [type=kernel]
    0 'Midi Through Port-0'
client 20: 'EWI-USB' [type=kernel]
    0 'EWI-USB MIDI 1  '
	Connecting To: 129:0
client 128: 'FLUID Synth (qsynth)' [type=user]
    0 'Synth input port (qsynth:0)'
	Connected From: 129:3
client 129: 'rosegarden' [type=user]
    0 'record in       '
	Connected From: 0:1, 20:0
    2 'external controller'

I can kill the connection between Rosegarden and the EWI using

aconnect --disconnect 20:0 129:0

And reconnect it using

aconnect 20:0 129:0

I’ve also learned that the list of devices comes from

$ cat /proc/asound/cards
 0 [PCH            ]: HDA-Intel - HDA Intel PCH
                      HDA Intel PCH at 0xf2530000 irq 45
 1 [EWIUSB         ]: USB-Audio - EWI-USB
                      Akai Professional, LLC. EWI-USB at usb-0000:00:1a.0-1.2, full speed
 4 [ThinkPadEC     ]: ThinkPad EC - ThinkPad Console Audio Control
                      ThinkPad Console Audio Control at EC reg 0x30, fw unknown

February 03, 2014

Manualizing System Management

Since the last article talked about Automating System Management, let’s look at Manualizing System Management. By this we mean enabling a human to perform system management tasks.

Why would we want to do this? There are several reasons. At a philosophical level, it is a person that determines that something is needed. An automated system can’t decide “hmm, we need an ERP system”. People are flexible and goal oriented – they understand Why leading to What culminating in How.

Using the example above:

  • A person can notice “hmm, we are having trouble getting all the parts needed to build our products to the right place at the right time”. (Why)
  • Again, it is a person that reasons “Aha! We need an ERP system.” (What)
  • At this point there are multiple ways to deploy the selected ERP system. (How)

In general, the first time you do a task you need to do it manually. Among other reasons, there are often surprises and issues that must be overcome. For any reasonably complex task it is almost impossible to determine all of the details, dependencies, unexpected inputs, corner cases, and systems that don’t behave exactly as expected without actually performing the task. If you are dealing with a unique configuration, such as a database server with local storage, it may make sense to set it up manually rather than spending the time automating the task. People play the ultimate role in system management. Automated systems are good at doing things over and over, but can’t do something the first time.

Once we assert that automated management tools can’t do everything by themselves, we need to look at what a SysAdmin needs. An effective system for manual system management has several characteristics:

  • It presents needed information, especially state and context. A SysAdmin is typically moving rapidly between a number of systems and working on a variety of tasks. It is important for them to be able to immediately discern which system they are on and what the current state of the system is.
  • It provides task oriented functions to perform the needed tasks. These functions should support the way the SysAdmin works, not expose the underlying implementation details.
  • It should be consistent across functions. For example, all functions should use a structure like “command, options, target” rather than having some that are structured “command, target, options”. Similarly, there should be a standard keyword for “create”, rather than a mixture of “create, make, add, instantiate, etc.” across various functions.
  • It helps the user. People are good at big picture goals, but have trouble remembering the exact details of dozens of low level functions. Computers are good at details, but don’t do big picture. It would be nice if people and computers could work together (the goal of User Experience designers everywhere!).

Manual management has a number of things in common with automated management. Both need to be able to talk to systems – a standardized remote API for management functions is a powerful foundation. In fact, once the low level infrastructure is in place you can build both automated and manual systems on top of it.

Ultimately we need a hybrid system for systems management – an integrated system that supports automation, scripting, CLI, and a graphical interface, all working together.

 


February 02, 2014

Efficient Revocation Checking

The majority of web service calls in OpenStack require token validation. Checking a token ID against a list is a cheap hashtable lookup. Comparing a token to a set of events is more expensive. How can we keep costs down?

Design considerations

A revocation event is a dictionary. The keys are a subset of:

  • trust_id
  • consumer_id
  • access_token_id
  • expires_at
  • domain_id
  • project_id
  • user_id
  • role_id

All of the keys of the dictionary need to match values of the token; it is not a simple one-to-one match. For example, the user_id in the revocation event can match the trustor_id or trustee_id of the token as well as its user_id, and the token is invalid.

If the token has multiple roles associated with it, only one of the roles has to match in order for the token to be revoked.

Every event has an issued_before field. For the event to apply, the token must have an issued_at value that is earlier than the event's issued_before.

Events only have certain subsets of values.  For example, an event will only ever have expires_at specified if it also has user_id.  This rule is for revoking all tokens for a given user issued from a given token, and the expiration date is used to link the tokens back to the originator.  Another example is that a role_id will either appear by itself or in conjunction with the other fields for revoking a grant: user_id, and project_id or domain_id.
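
To make the matching rules concrete, here is a rough sketch of checking a single event against a token. The token field names used here (issued_at, trustor_id, trustee_id, role_ids) are assumptions for illustration, not the actual Keystone data model:

def event_matches(event, token):
    """Return True if every field of the revocation event matches the token."""
    if token['issued_at'] >= event['issued_before']:
        return False
    for key, value in event.items():
        if key == 'issued_before':
            continue
        if key == 'user_id':
            # user_id in the event may also match the trustor or trustee
            if value not in (token.get('user_id'),
                             token.get('trustor_id'),
                             token.get('trustee_id')):
                return False
        elif key == 'role_id':
            # only one of the token's roles has to match
            if value not in token.get('role_ids', []):
                return False
        elif token.get(key) != value:
            return False
    return True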

The expensive solution

A naive approach would check the token against each revocation record. Since most tokens are valid, it means that each token is going to run through the entire list. As the list grows, the number of comparisons grows linearly. This has the potential to perform poorly.

[Figure: token_revocation_linear]

A revoked token will, on average, pass through half the events of the list (N/2), but a valid token will pass through the entire list.  How can we do better?

 

Building an Index

If every revocation event had a user_id field specified, we could narrow our search down to those that matched the token in question by building a hashtable keyed by user id, and returning a list of the revocation events with a matching user_id.  While our situation is more complex than this, this idea points toward an efficient solution.

[Figure: token_revoke_hashmap]

We can treat any of the events that do not specify a value for user_id as “match all user_ids”.  For an event such as “revoke by domain id” which does not indicate a user id, we create a “don’t care” entry.  If the token does not match any of the events with an explicit user_id set, we look in the “don’t care” list.

This approach can be extended many levels deep: one for each of the attribute keys in the list above.  We end up building a tree, where each node is a python dictionary.  To evaluate a token, we walk the tree from root node to leaf node.  If we make it all the way to a leaf node, the token should be considered revoked.
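
A rough sketch of building such a tree out of nested dictionaries, using '*' for the “don’t care” branches (the key order matches the list given under Node Ordering below; this is an illustration, not the code under review):

_KEYS = ('trust_id', 'consumer_id', 'access_token_id', 'expires_at',
         'domain_id', 'project_id', 'user_id', 'role_id')

def add_event(tree, event):
    """Insert a revocation event into the nested-dictionary tree."""
    node = tree
    for key in _KEYS[:-1]:
        branch = event.get(key, '*')        # '*' is the "don't care" branch
        node = node.setdefault(branch, {})
    # leaf level: store the event under the last key's value
    node.setdefault(event.get(_KEYS[-1], '*'), []).append(event)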

[Figure: token_revocation_tree]

Validating a token

In order to validate a token, each value from the token must be matched to the corresponding node in the tree. It must be matched against both the explicit value and the “don’t care” value (indicated by the Kleene closure *).  If the token matches a path that takes it all the way to a leaf node of the tree, the token is revoked.

[Figure: token_check_revocation_in_tree]

Now this is a lot more efficient.  The revoked token had 4 comparisons in the example above.  A revoked token will likely be found in a fixed number of comparisons: the depth of the tree, times the number of “don’t cares” it matches.  A fully populated tree, with don’t-cares and explicit hash tables at every level, is not likely.  It's hard to estimate an average path through the tree for a non-revoked token; a lot depends on the ordering of the nodes.
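
A matching sketch of the walk itself, trying the explicit value and then the “don’t care” branch at each level (it reuses _KEYS from the sketch above and ignores the issued_before comparison for brevity):

def is_revoked(tree, token, depth=0):
    """Walk the tree; reaching a leaf means the token matches a revocation."""
    if depth == len(_KEYS):
        return True                         # made it to a leaf: revoked
    for branch in (token.get(_KEYS[depth]), '*'):
        child = tree.get(branch)
        if child is not None and is_revoked(child, token, depth + 1):
            return True
    return False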

Node Ordering

As it turns out, putting the most common attribute at the root node is not the most efficient.  Since each level of the tree will now have many “don’t care” sub-maps, each node near the top is going to require many more intermediate nodes.  It would require less space to have the most common nodes closer to the bottom of the tree.  Contrast the following two pictures.  In the first, the top node has the most children.

[Figure: Tree_top_node_heavy]

However, by moving this node to the bottom of the tree, there are significantly fewer interior nodes.

[Figure: Tree_botton_node_heavy]

The current key ordering is:

‘trust_id’, ‘consumer_id’, ‘access_token_id’, ‘expires_at’, ‘domain_id’, ‘project_id’, ‘user_id’, ‘role_id’

One assumption is that the most common revocation events will be role assignment changes. If it turns out that password changes are more common, then user_id will move to the bottom of the tree.

The “don’t care” path is processed first. This “rush to the bottom of the tree” approach again attempts to handle the expected case first, on the assumption that trust and OAuth revocations will be less common.

The order of attributes will likely get updated based on performance data.

Event Removal

Tokens do not live forever. A revocation event only needs to hang around as long as the token it covers is valid. Once the token expires, the revocation event is redundant and can be removed. An expired revocation event is removed from the tree using the same algorithm by which it was added, except that instead of creating new nodes where none exist, it removes nodes, and removes parent nodes that are left with no children.
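A sketch of that removal, reusing add_event and _ATTRIBUTES from the earlier sketches, and assuming (as the next paragraph describes) that the application also keeps the events in a list ordered by expiration, oldest first, here as (expiration, event) pairs:

def remove_event(node, event, depth=0):
    # Mirror of add_event: walk the same path, drop the event at the leaf,
    # and prune any nodes left with no children on the way back up.
    key = event.get(_ATTRIBUTES[depth])
    if depth == len(_ATTRIBUTES) - 1:
        node[key].remove(event)
    else:
        remove_event(node[key], event, depth + 1)
    if not node[key]:
        del node[key]

def remove_expired(tree, ordered_events, now):
    # Oldest expiration first, so the scan stops at the first live entry;
    # new events are simply appended to the end of the list.
    while ordered_events and ordered_events[0][0] <= now:
        _, event = ordered_events.pop(0)
        remove_event(tree, event)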

In order to clean up the tree, the application holds on to the events in a list ordered by revocation time. Periodically, the application traverses the list and finds any events that are past their expiration date. These events are removed from the tree, and from the list of held events. Since they are ordered, the application only needs to search until it finds the first event that is not expired. Any new events that come in will be younger than the current set of events and can be appended to the list.

[Figure: token_revoke_cleanup]

Status

The code that implements this is currently up for review.

Once it is merged, I will update this document.

 

January 31, 2014

Containers, your time is now. Let's look at Namespaces.
Lately I have been spending a lot of time working on Containers.  Containers are a mechanism for controlling what a process does on a system.

Resource Constraints can be considered a form of containment.

In Fedora and RHEL we use cgroups for this, and with the new systemd controls in Fedora and RHEL 7, managing cgroups has gotten a lot easier. Out of the box all of your processes are put into a cgroup based on whether they are a user, a system service, or a machine (VM). These processes are grouped at the unit level, meaning two users logged into a system each get a "Fair Share" of the system, even if one user forks off thousands of processes. Similarly, if you run an httpd service and a mariadb service, they each get an equal share of the system: even if httpd forks 1000 processes while mariadb only runs three, the 1000 httpd processes cannot dominate the machine and leave no memory or CPU for mariadb. Of course you can go into the unit files for httpd or mariadb and add a couple of simple resource constraints to limit them further.

For example, adding

MemoryLimit=500M

to the httpd.service unit file will limit the httpd processes to 500 megabytes of memory.

Security Containment

Some could say I have been working on containers for years since SELinux is a container technology for controlling what a process does on the system.  I will talk about SELinux and advanced containers in my next blog.

Process Separation Containment

The last component of containers is Namespaces.  The linux kernel implements a few namespaces for process separation.  There are currently 6 namespaces.

Namespaces can be used to isolate processes. They can create a new environment where changes to the process are not reflected in other namespaces.
Once set up, namespaces are transparent for processes.

Red Hat Enterprise Linux and Fedora currently support 5 namespaces:

  • ipc

  • ipc namespace allows you to share memory and semaphores only with processes within the namespace.

  • pid

  • pid namespace eliminates the view of other processes on the system and restarts pids at pid 1.

  • mnt

  • mnt namespace allows processes within the container to mount file systems over existing files/directories without affecting file systems outside the namespace.

  • net

  • net namespace creates network devices that can have IP addresses assigned to them, and can even have their own iptables rules and routing tables.

  • uts

  • uts namespace allows you to assign a different hostname to processes within the container. This is often useful with the network namespace.

Rawhide also supports the user namespace. We hope to add user namespace support in a future Red Hat Enterprise Linux 7 release.

User namespace allows you to map real user IDs on the host to container UIDs. For example, you can map UIDs 5000-5100 to 0-100 within the container. This means you could have uid=0 with rights to manipulate other namespaces within the container. You could, for example, set the IP address on the network-namespaced Ethernet device. Outside of the container your process would be treated as a non-privileged process. The user namespace is fairly young and people are just starting to use it.

I have put together a video showing namespaces in Red Hat Enterprise Linux 7.
https://www.youtube.com/watch?v=e4NXJ5nM-_M&feature=youtu.be

January 30, 2014

Configuring offlineimap to validate SSL/TLS certificates

I recently upgraded to Fedora 20 and quickly found my offlineimap instance failing.  I was getting all kinds of errors regarding the certificate not being authenticated.  Concerned wasn’t really the word I’d use to describe my feelings around the subject.  Turns out, the version of offlineimap in Fedora 20 (I won’t speculate as to earlier versions) requires a certificate fingerprint validation or a CA validation if SSL=yes is in the configuration file (.offlineimaprc).  I was able to remedy the situation by putting sslcacertfile = /etc/ssl/certs/ca-bundle.crt in the config file.

I won’t speculate as to the functionality in earlier versions but checking to make sure the SSL certificate is valid is quite important (MITM).  If you run across a similar problem just follow the instructions above and all should, once again, be right with the world.


January 25, 2014

Splitting a patch

To make things easier for your code reviewer, each patch should be small, and hold one well-defined change. I break this rule all the time, and it comes back to bite me. What happens is that I get heads down coding, and I have a solution that involves changes to a wide number of files and subsystems, new abstractions, etc. Here is how I am currently dealing with breaking down a big patch.

I started with a branch named revocation-events. On it I had one big commit. To make sure I could always get back to this commit, I started by creating a new branch:

git checkout -b revocation-events-split

Note that I want to be able to commit later with the commit message from my original commit.

git log --format=%B -n 1 HEAD > /tmp/commitmsg.txt

I can later use that commit message with

git commit -F /tmp/commitmsg.txt

Now, the scary part. Remove everything from the commit:

git reset --soft HEAD~1

This leaves all of your changes staged. You can either leave some staged and selectively unstage others, or unstage them all. For my case, I need to inspect each one. Since all of my files are in either the etc or the keystone subdir:

git status
# On branch throwaway
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	modified:   etc/keystone-paste.ini
#	modified:   etc/keystone.conf.sample
#	modified:   keystone/assignment/core.py

and so on, I run

git reset HEAD keystone
git reset HEAD etc

Now, I want to selectively add back in lines from a single file. To see the changes in a single file:

 git diff HEAD keystone/tests/test_notifications.py

To start off the new review, I can select certain lines from this change using

git add -i --patch keystone/tests/test_notifications.py

(I’ve just been informed that the -i is redundant with --patch)

This puts git add in interactive mode; it will query you on each hunk of the patch. You are given a list of options:
Stage this hunk [y,n,q,a,d,/,e,?]? y = yes and n = no. ? will show what the others mean.

What if a given chunk is mixed? Use the e for edit option. The below hunk has two tests. I want each in a separate commit.

diff --git a/keystone/tests/test_notifications.py b/keystone/tests/test_notifications.py
index 57741fe..d724dd4 100644
--- a/keystone/tests/test_notifications.py
+++ b/keystone/tests/test_notifications.py
@@ -228,6 +228,21 @@ class NotificationsForEntities(test_v3.RestfulTestCase):
         self.identity_api.delete_user(user_ref['id'])
         self._assertLastNotify(user_ref['id'], 'deleted', 'user')
 
+    def test_delete_domain(self):
+        domain_ref = self.new_domain_ref()
+        self.identity_api.create_domain(domain_ref['id'], domain_ref)
+        domain_ref['enabled'] = False
+        self.identity_api.update_domain(domain_ref['id'], domain_ref)
+        self.identity_api.delete_domain(domain_ref['id'])
+        self._assertLastNotify(domain_ref['id'], 'deleted', 'domain')
+
+    def test_disable_domain(self):
+        domain_ref = self.new_domain_ref()
+        self.identity_api.create_domain(domain_ref['id'], domain_ref)
+        domain_ref['enabled'] = False
+        self.identity_api.update_domain(domain_ref['id'], domain_ref)
+        self._assertLastNotify(domain_ref['id'], 'disabled', 'domain')
+
     def test_update_group(self):
         group_ref = self.new_group_ref(domain_id=self.domain_id)
         self.identity_api.create_group(group_ref['id'], group_ref)
Stage this hunk [y,n,q,a,d,/,e,?]? 

e gives me the message at the top of the screen:

 # Manual hunk edit mode -- see bottom for a quick guide
 @@ -228,6 +228,21 @@ class NotificationsForEntities(test_v3.RestfulTestCase):

And instructions at the bottom:

 # To remove '-' lines, make them ' ' lines (context).
 # To remove '+' lines, delete them.
 # Lines starting with # will be removed.

To remove the second test, I remove the lines with the + next to them:

 # Manual hunk edit mode -- see bottom for a quick guide
 @@ -228,6 +228,21 @@ class NotificationsForEntities(test_v3.RestfulTestCase):
          self.identity_api.delete_user(user_ref['id'])
          self._assertLastNotify(user_ref['id'], 'deleted', 'user')
 
 +    def test_delete_domain(self):
 +        domain_ref = self.new_domain_ref()
 +        self.identity_api.create_domain(domain_ref['id'], domain_ref)
 +        domain_ref['enabled'] = False
 +        self.identity_api.update_domain(domain_ref['id'], domain_ref)
 +        self.identity_api.delete_domain(domain_ref['id'])
 +        self._assertLastNotify(domain_ref['id'], 'deleted', 'domain')
 +
      def test_update_group(self):
          group_ref = self.new_group_ref(domain_id=self.domain_id)
          self.identity_api.create_group(group_ref['id'], group_ref)
 # ---
 # To remove '-' lines, make them ' ' lines (context).
 # To remove '+' lines, delete them.
 # Lines starting with # will be removed.
 #
 # If the patch applies cleanly, the edited hunk will immediately be
 # marked for staging. If it does not apply cleanly, you will be given
 # an opportunity to edit again. If all lines of the hunk are removed,
 # then the edit is aborted and the hunk is left unchanged.

This is in vi mode for me. To exit

:wq

Then, git commit to create a smaller patch.
(update: you can run git log -p to confirm only the changes you want made it in to the commit)

Once I have a patch basically ready to go, and I want to test it, I have a couple of choices. The simplest is to do git stash and run the unit tests, but I always get distracted. Instead of forgetting my stash, I tend to create each of the commits in turn. Once I have all three commits created, I can do git checkout HEAD~3 to check out the first commit, run the tests, and see that things are OK. To go back to the head of the branch:

git checkout revocation-events-split

If all three patches are good, and ready for review, git review will submit all three of them.

git reflog is your friend. It shows you everything you have done to change the state of your repo. I refer to the reflog all the time to reorient myself.

January 23, 2014

Java deserialization flaws: Part 2, XML deserialization

All classes which implement the java.io.Serializable interface can be serialized and deserialized, with Java handling the plumbing automatically. In the first part of this two-part series, we looked at some of the unexpected security consequences which can arise from usage of binary deserialization in Java applications. This second part of the series will focus on security issues related to XML deserialization.

XML Deserialization

An alternative approach to Java’s native binary serialization is XML serialization, where the state of in-memory objects is represented as an XML document. XML serialization capabilities are provided by a number of commonly-used libraries. The Java Architecture for XML Binding (JAXB) is the standard implementation, which is available as part of the Java runtime environment, and as a standard API in J2EE environments. Several other XML serialization libraries exist – this article will focus on just two: XMLDecoder and XStream, both of which have exposed serious security issues in recent releases.

XMLDecoder

XMLEncoder/XMLDecoder are components in the Java Development Kit (JDK) that provide long term persistence for Java beans, using an XML serialization format to achieve this. This functionality is very powerful, as the XML format can represent a series of methods that will be called to reconstruct an instance of the object. If an application uses XMLDecoder to deserialize content provided by a user, then the user could inject arbitrary code into the specification of methods to call when reconstructing the object. In other words, any application that allows a user to pass content that will be deserialized by XMLDecoder is exposing a remote code execution flaw.

Dinis Cruz et. al. reported that the Restlet REST framework did just this, using XMLDecoder to deserialize the content of XML REST API requests. This flaw was assigned CVE-2013-4221, and was patched by disabling the vulnerable functionality. While researching this issue, it was also found that Restlet provided similar functionality using binary serialization. This would not expose remote code execution by default, but could expose various security issues, as described in the first article of this series. The binary deserialization flaw in Restlet was assigned CVE-2013-4721, and was also patched by disabling the vulnerable functionality.

XStream

XStream is an open source library, external to the JDK, which aims to simplify XML serialization and deserialization. It is popular due to its ease of use. XStream does not allow the specification of deserialization logic as XMLDecoder does. However, Dinis Cruz et. al. also reported that XStream’s reflection-based approach to deserialization can be used to achieve arbitrary code execution when deserializing user-supplied XML. XStream will deserialize classes of any type. It has a special handler for dynamic proxy instances, which will resolve the implemented interfaces and handler. This allows an attacker to provide XML representing a dynamic proxy class, which implements the interface of a class the application might expect, then also implements a handler that calls arbitrary code when any of the members of the deserialized class are called.

There has been debate within the community as to how this should be resolved. Any application that is deserializing arbitrary user-supplied input is potentially vulnerable to a variety of security issues, as discussed in this series of articles. Should it then be considered a concern for applications using XStream to resolve? Some applications have already done so.

It was found that Spring OXM provided an XStreamMarshaller class, which would by default expose the remote code execution issue. Spring addressed this in documentation, warning users to apply a class type whitelist using a configuration property. It was also found that Sonatype Nexus was using XStream in a fashion that exposed an unauthenticated remote code execution flaw. This was addressed by forking the XStream library, and adding a patch that introduces support for class type whitelisting. Concurrently, the XStream project itself is now working on a patch that introduces support for class type whitelisting. The whitelisting will be disabled by default in the next minor release to maintain backwards compatibility, but will be enabled by default in the next major release. The flaw as it pertains to XStream itself has been assigned CVE-2013-7285.

Conclusion

XML serialization is potentially very dangerous when used to transport untrusted, user-supplied data. In addition to the problem of vulnerable serializable classes that is exposed by binary serialization, XML serialization also introduces several other possibilities for exposing remote code execution flaws. As an application developer, the take-home message is simple: never deserialize untrusted content using any Java serialization format. Content should always be checked to ensure that it is of an acceptable type prior to deserialization.

January 21, 2014

Automating System Management

The future of system management is automation. As the number of systems and virtual machines being managed continues to grow, and the complexity of distributed applications increases, automation is the only way to keep things running smoothly.

Specifically, we need fine grain control of systems. This means that we can do things like configure local storage and networks, start and stop services, install software and patches, and so forth. We are looking at interactive control – making changes, seeing the results of these changes, and making further changes in response. Another aspect of interactive management is responding to changes in a system, such as a hardware failure, a file system running out of space, or perhaps an attack on a system. Interactive management may have a human in the loop or the interaction may be with a script, a program, or perhaps even an advanced expert system.

This interactive manipulation complements configuration management systems such as Puppet. With a configuration management system, you put a system into a known state. With interactive manipulation you work with the system until it does what you want it to. You will usually want to use both approaches, since each has strengths and weaknesses.

This automation requires several things:

  • The ability to query a system. This includes determining its configuration (HW and SW) and current state and status. As an example, if you are monitoring the temperature of a system, is the lm-sensors service installed, configured, enabled, and currently running?
  • The ability to change a system. This includes things like configuring storage, configuring networks, changing firewall rules, and installing software.
  • Generating alerts when something interesting happens. It is not effective to poll 1,000 systems looking for items of interest; it is necessary for the 1,000 systems to tell you if something you are interested in happens. Going back to the lm-sensors example, you might want to trigger an alert when the cpu temperature exceeds 150 degrees F. You might also want to trigger an alert if the lm-sensors service fails.
  • Remote Operation. In general, you don’t want to put a complete management system on each managed system. You want to have a centralized management capability containing the business logic which manages large numbers of systems.

In designing a system to support these elements you end up with a design that has a management console (or management program or management framework) which initiates operations on remote systems. These operations are performed by a program on the remote system. A program that is intended to perform an operation when called from another system is commonly called an agent.

It is straightforward to create an agent to perform a specific task. This tends to result in the creation of large numbers of specialized agents to perform specific tasks. Unfortunately, these agents don’t always work well with each other, come from multiple sources, have to be individually installed, and produce a complex environment.

Building on the Automation Requirements, what if we create:

  • A standard way to query systems.
  • A standard way to change a system.
  • A standard set of alerts.
  • A standard remote API to perform these operations.
  • A standard infrastructure to handle things like communications, security, and interoperation.
  • All included with the operating system and maintained as part of the OS.

Building a system like this means building a standard set of tools and interfaces that can be shared by any application that needs to interact with the managed system. Having a standard API means that applications and scripts can easily call the functions that they need to use. Having a common infrastructure greatly simplifies interoperation and makes it much easier to develop management tools that touch multiple subsystems.

Including these tools with the OS means that applications have a known set of tools that they can rely on. It also means that the tools are updated and maintained to keep in sync with the OS, that security issues are addressed, and that there is a single place to report problems.

A system that implements these capabilities provides a solid foundation for developing automated tools for system management. “Automated Tools” can mean anything from a sophisticated JBoss application using Business Rules and Business Process Management to automate responses to a wide range of system alerts to a custom script to create a specific storage configuration.

A system that implements these capabilities also provides a great foundation for building interactive client applications – client applications that use a command line interface, that are built on scripts, or even a GUI interface.

These are the guiding principles for OpenLMI.


January 19, 2014

Avoiding a deep rebase when posting a patch

OpenStack Milestone Icehouse 2 (I2) is due this Tuesday. The gate is deep and the penalty for messing it up is costly. I recently had to update a patch that depends on three patches that are approved but not merged to master. Using the git review command line, all 5 patches would get resubmitted. This was too high a risk for me.

Gerrit does magic. When you post a patch to Gerrit, it does not go directly to the branch. You post to a ref, and Gerrit then performs additional workflow before creating a new ref. This is so that the old refs are accessible for review, and also so that a user can’t create an invalid reference.

To avoid accidentally rebasing the earlier patches, and triggering a dump of all the work in the gate queue, I wanted to replicate what Gerrit does without pushing updates to the dependencies.

To push an updated patch, find the Download URL on the review using the ssh protocol, and remove the revision number. Push a new version to that reference (using the ssh protocol). For example, from a recent review at version 18, this command:

git push ssh://ayoung@review.openstack.org:29418/openstack/keystone revocation-events:refs/changes/08/55908

pushed a new version of the patch which now shows as revision 19: revocation-events:refs/changes/08/55908/19

Update:
Thanks to YorikSar (Yuriy Taraday)
A better way to do it (assuming the branch is checked out):

git push gerrit HEAD:refs/for/master

January 15, 2014

CWE Vulnerability Assessment Report 2013

Common Weakness Enumeration (CWE) is a list or dictionary of common software weaknesses. Red Hat has adopted CWE as a common language for describing and classifying vulnerabilities, used as a base for evaluation and prevention of weaknesses. Results of classifications are reviewed periodically and are used to direct our efforts in strengthening security of development practices, tools, assessment services, education programs and documentation.

As a part of this effort the Red Hat Customer Portal has attained CWE compatibility after review from the MITRE Corporation. CWE IDs are currently assigned to high risk vulnerabilities in Red Hat products. This classification is available through the CVE database, and the CWE coverage of the Customer Portal is maintained with a carefully selected subset of CWE identifiers to provide a good abstraction of weaknesses useful for developers and security engineers. Statistics derived from the available data can also be used by the open source community to better understand the security challenges of developing open source software. More information is available at CWE Compatibility for Red Hat Customer Portal and previous blog posts.

Statistics for the previous year are based on 37 identified and classified vulnerabilities. The graph below shows the top 5 overall weaknesses (as assigned, including chains) with the number of their occurrences.

Top overall weaknesses in 2013

[Figure: 2013-numbers]

CWE-502 Deserialization of Untrusted Data is at the top with a total of nine occurrences. Closer investigation shows that the root cause of six of them is a vulnerability found in Ruby on Rails, first identified in January of last year. The recurring theme was the use of YAML.load to deserialize user-controlled content. The YAML standard supports user-defined data types and provides a convenient way to serialize and deserialize Ruby objects, which makes it a popular choice in many Ruby-based projects.

This weakness gives an attacker the ability to instantiate objects and could be, depending on the application, exploitable in several ways, including DoS attacks, SQL injection and arbitrary code execution. Eliminating this weakness by identifying the code paths that allow an attacker to supply content to YAML.load might prove difficult. In this case several gems were found vulnerable and dropped YAML support, and even one of the JSON parsing engines available in ActiveSupport was using YAML.load and was therefore exploitable.
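The Rails flaw is Ruby-specific, but the same class of weakness (CWE-502) is easy to reproduce in Python with PyYAML, which makes for a compact illustration; this is a generic sketch, not code from any of the affected projects:

import yaml

# An attacker-controlled document can describe arbitrary objects to construct.
# With the default loader, yaml.load() would run the command below, so the
# dangerous call is left commented out:
payload = "!!python/object/apply:os.system ['touch /tmp/owned']"
# yaml.load(payload)

# safe_load() only builds plain data types and rejects the payload outright,
# matching the "known-good inputs" advice below.
print(yaml.safe_load("{user: alice, roles: [admin]}"))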

The recommended way to prevent weaknesses such as this one is to use a defensive approach from the early stages of development. User-controlled inputs should be handled carefully and filtered using a whitelist approach of specifying “known-good” inputs rather than blacklisting malicious ones.

The CWE-428 Unquoted Search Path or Element weakness has been found four times in virtualization solutions and affected the Windows platform. Use of unquoted delimiters in file and folder names may allow an attacker with access to the file system to execute a program with the privileges of the vulnerable application. It depends heavily on the underlying system, and even though Linux platforms are not entirely immune, it is more pervasive on Windows operating systems.

CWE-78 Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’) together with CWE-96 Improper Neutralization of Directives in Statically Saved Code (‘Static Code Injection’) and CWE-89 Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’) remain top weaknesses in applications with a web interface (see the OWASP Top 10 2013).

CWE uses the term primary weakness to describe the root cause: the initial weakness that can create the conditions necessary to cause another weakness. A resultant weakness is only exposed after a previous weakness in the chain has been exploited.

Top 5 resultant weaknesses in 2013

[Figure: 2013-resultant2]

The number one resultant weakness in 2013 was CWE-119 Improper Restriction of Operations within the Bounds of a Memory Buffer, or just Buffer Error. This is a more abstract type of weakness and covers several more specific weakness bases. As this is one of the more studied problems in software security, automated tools that allow detection of problems early in the development life cycle are available. Despite four occurrences, the ratio of buffer-related vulnerabilities has dropped significantly compared to previous years: in 2012, eight out of twenty-two vulnerabilities were related to the CWE-119 weakness, and fifteen out of twenty-eight in 2011.

Classification of vulnerabilities using CWE provides good feedback on the effectiveness of already implemented security practices and gives us better insight into the nature of vulnerabilities found in our software. In the future Red Hat will continue assigning CWE identifiers to critical vulnerabilities and will consider extending the effort to include vulnerabilities with lower impact.

January 14, 2014

Using LMI Commands

The LMI CLI command processor is invoked by entering “lmi”. You can use it to enter either a single command or multiple commands. Entering “lmi” with no arguments puts you into interactive mode until you enter CTRL-D to exit the CLI processor. In interactive mode you can enter multiple LMI commands.

In general you have to provide the target system, user, and password for each LMI command. This can be a nuisance if you are entering multiple LMI commands – in this case it is best to go into the LMI command processor’s interactive mode. You will have to enter the authentication information once, and the command processor will remember it for the rest of the session. The exception to this rule is when you are managing the local system and are logged in with root privileges.

Let’s start with a simple example: getting a list of storage on the local system. This is done by entering the command:

$ lmi storage list

Entered this way, the lmi command processor will default to localhost for the system and prompt you for the username and password to use (unless you have root privileges). The output of the command is printed to the screen:

Size Format

/dev/disk/by-id/dm-name-luks-ffc4ec65-f140-4493-a991-802ad6fa20b4 /dev/mapper/luks-ffc4ec65-f140-4493-a991-802ad6fa20b4 luks-ffc4ec65-f140-4493-a991-802ad6fa20b4 249531727872 physical volume (LVM)

/dev/disk/by-id/ata-ST3250410AS_6RY0WKF9 /dev/sda sda 250059350016 MS-DOS partition table

/dev/sr0 /dev/sr0 sr0 250059350016 Unknown

/dev/disk/by-id/ata-ST3250410AS_6RY0WKF9-part1 /dev/sda1 sda1 524288000 ext4

/dev/disk/by-id/ata-ST3250410AS_6RY0WKF9-part2 /dev/sda2 sda2 249533825024 Encrypted (LUKS)

/dev/disk/by-id/dm-name-fedora-home /dev/mapper/fedora-home home 191931351040 ext4

/dev/disk/by-id/dm-name-fedora-root /dev/mapper/fedora-root root 53687091200 ext4

/dev/disk/by-id/dm-name-fedora-swap /dev/mapper/fedora-swap swap 3909091328 swap

Note that the information includes the user-friendly names – sda, sda1 and sda2 – as well as the persistent IDs. All of the LMI storage commands accept any of the device IDs. The persistent IDs are more robust on servers, where you can run into situations where the user-friendly names change. This can occur when you add or remove devices or controllers or move disks to different ports on an HBA.

If you are going to enter multiple LMI commands, you should enter the LMI shell by entering

$ lmi

lmi>

At this point you can find out what commands are available by entering “?”.

lmi> ?

Documented commands (type help <topic>):

========================================

EOF help
Application commands (type help <topic>):

=========================================

exit help lf mount partition-table service sw

fs hwinfo lv partition raid storage vg

You can ask for more help on these commands by entering help <command>, for example:

lmi> help storage

Basic storage device information.

Usage:

storage list [ <device> ...]

storage depends [ --deep ] [ <device> ...]

storage provides [ --deep ] [ <device> ...]

storage show [ <device> ...]

storage tree [ <device> ]

lmi>

Let’s now show an example of changing system state with LMI. We will use the lmi service commands to list all available services, show service status, and start and stop a service. First, run lmi service list to list all available services (this takes time to run and produces a long output, so it isn’t shown here). Then use the lmi service show, stop, and start commands:

lmi> service show cups.service

Name=cups.service

Caption=CUPS Printing Service

Enabled=True

Active=True

Status=OK

Now we can stop the cups service and then check its status:

lmi> service stop cups.service

lmi> service show cups.service

Name=cups.service

Caption=CUPS Printing Service

Enabled=True

Active=False

Status=Stopped

Finally, start the cups service and then check its status:

lmi> service start cups.service

lmi> service show cups.service

Name=cups.service

Caption=CUPS Printing Service

Enabled=True

Active=True

Status=OK

lmi>

To use the LMI command processor against a remote system with OpenLMI installed, use the -h (host) option:

$ lmi -h managedsystem.mydomain.org

> service show cups.service

username: pegasus

password:

Name=cups.service

Caption=CUPS Printing Service

Enabled=True

Active=True

Status=OK

This should be enough to get you started using LMI Commands.


January 13, 2014

file_t we hardly knew you...
file_t disappeared as a file type in Rawhide today.  It is one of the oldest types in SELinux policy.  It has been aliased to unlabeled_t.

Why did we remove it?

Let's look at the comments written in the policy source to describe file_t.

# file_t is the default type of a file that has not yet been
# assigned an extended attribute (EA) value (when using a filesystem
# that supports EAs).


Now let's look at the description of unlabeled_t:

# unlabeled_t is the type of unlabeled objects.
# Objects that have no known labeling information or that
# have labels that are no longer valid are treated as having this type.


Notice the conflict.

If a file object does not have a label assigned to it, then it would be labeled unlabeled_t. Unless it is on a file system that supports extended attributes, in which case it would be file_t?

I always hated explaining this, and we have finally removed the conflict for future Fedoras. Sadly this change has not been made in RHEL7 or any older RHELs or Fedoras.

We also added a type alias for unlabeled_t to file_t.

Note: Seandroid made this change when the policy was first being written.

One other conflict I would like to fix is that a file with a label that the kernel does not understand is labeled unlabeled_t (i.e. it has a label, but the label is invalid). I have argued for having the kernel differentiate the two situations:

  • No label -> unlabeled_t

  • Invalid Label -> invalid_t.

Upstream has pointed out that from a practical/security point of view you really need to treat them both as the same thing. Confined domains are not allowed to use unlabeled_t objects, and if it is a file system object you should run restorecon on it, putting a legitimate label on the object. I probably will not get this change, but I can always hope.

January 10, 2014

Fedora.NEXT at FOSDEM and DevConf

I just want to throw out a quick update on the progress of Fedora.NEXT work.

First of all, the deadline for the submission of the Workstation, Server and Cloud PRDs was extended one week, so they will now need to be delivered by Monday, January 20th. Assuming that the PRDs are ratified by the Fedora Board and FESCo, that means that the next phase will be execution planning.

Execution planning means that we will need to put together a list of resource needs of all varieties (for packaging QA, doc-writers, release-engineering, Ambassadors, etc.). This will also include scoping efforts to determine the delivery schedule.

This is a big task and one that will need help from throughout our community. I’ve reserved time at two of the larger upcoming European FOSS conferences to get people together and brainstorming these needs.

The first such conference is FOSDEM, the Free and Open Source Software Developers’ European Meeting, in Brussels, Belgium. I have reserved an hour meeting in the “distributions” devroom on Saturday, February 1 at 1600 local time for this purpose. If you are attending FOSDEM and have any interest in seeing Fedora succeed, I urge you to join us there and help turn these goals into actionable efforts.

The second conference is Devconf.cz, running a week after FOSDEM in Brno, the Czech Republic. Devconf is a very Fedora-friendly conference and will be dedicating an entire day to the Fedora Project: Sunday, February 9th. Matthew Miller, Fedora’s Cloud Architect, will be kicking off the Fedora Day with his keynote entitled “Fedora.next: Future of Fedora Big Picture, plus Working Group report“.

Following Matthew’s keynote, I will be leading a workshop/brainstorming session in the same room entitled “We are Fedora (and so can you!)“. The goal of this session will be similar to that of the FOSDEM devroom session: to get as many people with an interest in Fedora into the same room to help us plan how to meet the visions set forth in the Fedora product PRDs.

If you are going to either of these conferences, I very much hope you will join us!


Update: LMIShell for RHEL 7 Beta

Key characteristics of an Enterprise Linux like Red Hat Enterprise Linux are long term support and stable interfaces. The OpenLMI Providers are designed to be stable, which allowed them to be included in the RHEL 7 Beta.

On the other hand, the LMIShell scripts and commands are rapidly evolving and changing. This means that it is appropriate to include them in environments that allow changes, like Fedora, but it is too early to include them in RHEL. As a result, the LMIShell scripts and commands are packaged outside of the RHEL 7 Beta.

The result is that the RHEL 7 beta includes all software for OpenLMI on managed systems. On the client side, it includes the LMIShell infrastructure, but does not include the LMIShell scripts or commands.

For RHEL 7 Beta, the LMIShell scripts and commands are available from the openlmi.org website as an external repository. To install the LMIShell scripts and commands:

First, download http://www.openlmi.org/sites/default/files/repo/rhel7/noarch/openlmi-scripts.repo to /etc/yum.repos.d on your local system.

Then yum install “openlmi-scripts*”. (Note the quotes around “openlmi-scripts*” and the asterisk at the end of scripts. Both of these must be included for the install to work correctly.)

These scripts require openlmi-tools, which is included as a dependency and is automatically installed when you install the scripts.

To test your installation, run one of the LMIShell commands, such as lmi hwinfo.


January 08, 2014

OpenLMI on RHEL 7 Beta

Getting Started

OpenLMI is under active development, and its first public release on Red Hat Enterprise Linux is with the Red Hat Enterprise Linux 7 Beta.

Install OpenLMI

Install

OpenLMI can be installed by installing the openlmi package. This is a metapackage that installs the OpenLMI infrastructure and a base set of OpenLMI Providers. Additional Providers and other packages can be installed later.

$ yum install openlmi

Start the CIMOM

The OpenLMI CIMOM runs as a service. For security reasons, services are not automatically started. You will need to start the CIMOM manually, using the command:

$ systemctl start tog-pegasus.service

To have the service automatically started when the system boots, use the command:

$ systemctl enable tog-pegasus.service

Firewall

You will then need to open the appropriate firewall ports to allow remote access. This can be done from the firewall GUI by selecting the WBEM-https service, or can be done from the command line by entering:

$ firewall-cmd --add-port 5989/tcp

You will probably want to open this port permanently:

$ firewall-cmd --permanent --add-port 5989/tcp

SELinux

You may need to set SELinux to permissive mode:

$ setenforce 0

Remote Access

You next need to configure the users for remote access. The Pegasus CIMOM can accept either root or pegasus as users (configuring Pegasus to use other users is beyond the scope of this article). You can do one or both of the following actions; doing both will enable using OpenLMI calls using either root or pegasus as the user.

  • The user pegasus is created – without a password – when you install OpenLMI. To use the pegasus user you need to set a password by running passwd pegasus (as root).
  • Alternatively, you can edit the Pegasus access configuration file to allow root access:
    • Edit the file /etc/Pegasus/access.conf
    • Change the line “ALL EXCEPT pegasus:wbemNetwork” to “ALL EXCEPT root pegasus:wbemNetwork” and save the file.

Install OpenLMI Client

Client Software (updated)

The OpenLMI client consists of the LMIShell environment and a set of system management scripts. The OpenLMI client is installed on the client system – that is, the system that will be used to manage other systems. You don’t need to install the OpenLMI client on managed systems, and you don’t need to install OpenLMI Providers on the client system.

The easiest way to use LMIShell is to use Fedora 20 for your client system – Fedora 20 includes LMIShell and all the management scripts. These management scripts are under active development, and their interfaces were not considered sufficiently mature to include in RHEL 7 Beta. They should be included in a future release.

There are two parts to the client tools provided by the OpenLMI project. The first is the LMIShell, which is a powerful, python-based scripting environment made available in the openlmi-tools package.

You can install this package with the command:

$ yum install openlmi-tools

The second part of the client tool is the OpenLMI scripts, which are a set of Python scripts and simple shell command wrappers (using the ‘lmi’ metacommand tool) to provide very simple interaction with OpenLMI-managed systems. Because these scripts are actively evolving they are not included in the RHEL 7 Beta, and must be downloaded and installed separately:

First, download http://www.openlmi.org/sites/default/files/repo/rhel7/noarch/openlmi-scripts.repo to /etc/yum.repos.d on your local system.

Then yum install “openlmi-scripts*”. (Note the quotes around “openlmi-scripts*”.)

These scripts require openlmi-tools, which is included as a dependency and is automatically installed when you install the scripts if it has not already been installed.

Server Certificate

In order to access a remote LMI managed system, you will need to copy the Pegasus server certificate to the client system. This can be done with:

# scp root@managed-machine:/etc/Pegasus/server.pem /etc/pki/ca-trust/source/anchors/managed-machine-cert.pem

Where “managed-machine” is the name of the managed system. You then need to:

# update-ca-trust extract

Try It Out

At this point you should be ready to go! Test the installation by running an LMI command from a system with the LMIShell client and scripts installed; this sample will be explained in future articles (replace managed-system with the actual system name):

# lmi -h managed-system
lmi> hwinfo cpu
username: pegasus
password:

CPU: AMD Phenom(tm) 9550 Quad-Core Processor
Topology: 1 cpu(s), 1 core(s), 1 thread(s)
Max Freq: 3000 MHz
Arch: x86_64
lmi>


Securing Openstack’s Dashboard using Django-Secure

When it comes to security it is an unfortunate reality that technologies are rarely straightforward to use or easy to deploy. So it is quite refreshing to find something that breaks that mould. There is a fantastic project called django-secure which I believe does just this.

The idea is to provide a way to enforce secure defaults for Django projects. It achieves this in two key ways. The first is a deployment check that you can run as part of the typical django-admin / manage.py workflow; the other is a security middleware that enables security features that Django doesn’t turn on by default. To demonstrate the usage of django-secure I will walk through how to apply it to the OpenStack Horizon dashboard and discuss the features which it enables.

For those that are unfamiliar with the OpenStack project it has many facets, most of which can be administrated and controlled by a project called Horizon. It is conceivable that this dashboard could be deployed and exposed over the network so it is worth examining how to make it a little more secure. Horizon is essentially a typical django project so the steps here will be pretty universal for other projects.

The first thing that you need to do is install the django-secure package. This can be done using the Python pip package manager:

$ pip install django-secure

Next, we add django-secure into the list of INSTALLED_APPS within the Django settings. I’m assuming you are doing this within a devstack environment (you probably shouldn’t be testing things like this in production!) so the file we are looking for is /opt/stack/horizon/openstack_dashboard/settings.py (or local_settings.py overrides).

INSTALLED_APPS = (
'djangosecure',
...

With this change alone you will be able to see the benefit of django-secure. You should now be able to issue:

python manage.py checksecure

And have django-secure complain about all the things Horizon isn’t doing by default (you can see the complaints here: http://fpaste.org/61388/). This is a good start for figuring out what hardening options you can enable to make your site more secure. The remainder of this post will be devoted to delving into what those options are and what they actually do. But first let’s add one more thing to Horizon’s settings.

The django-secure project is more than just a deployment check; it is actually middleware too, so we need to enable the middleware before we continue. To do this add the following line towards the top of your MIDDLEWARE_CLASSES.

MIDDLEWARE_CLASSES = (
'djangosecure.middleware.SecurityMiddleware',
...

Ok so what does this do? This gives us the ability to enable some options that do some cool things that can help prevent a number of attacks. I’m going to break these down individually, but I’m going to begin by discussing the options that are a part of Django and how they are configured in devstack. The correct guidance for these options is available within the OpenStack security guide (http://docs.openstack.org/sec/).

SESSION_COOKIE_SECURE = True

This option will only allow the session cookie to be sent over a secure connection. Horizon does not enable this by default since devstack isn’t configured to use HTTPS.

SESSION_COOKIE_HTTPONLY = True

Horizon has this option set by default. This prevents client-side JavaScript from accessing the session cookie. This is an extremely important flag to have set to prevent leakage of the session cookie through an XSS vulnerability.

SESSION_EXPIRE_AT_BROWSER_CLOSE = True

This option is probably debatable. Essentially when set to true the user will have to reauthenticate every time they open a new browser window. The alternative will use persistent sessions that have a lifetime as long as the expiry set in the session cookie. Horizon sets this to true by default.

The django-secure framework will also check for the correct usage of CSRF and the strength of the secret key. It also introduces some useful middleware that ensures some of the best security features available within the browser are being utilized. These options are discussed below.

SECURE_SSL_REDIRECT = True

Setting this option to true will ensure that all non-HTTPS requests are sent permanent redirects to an HTTPS URL. As the Horizon dashboard is an administrative console it is essential that all traffic travels over a secure channel.

SECURE_HSTS_SECONDS = 31536000
SECURE_HSTS_INCLUDE_SUBDOMAINS = True

These two options enable HTTP Strict Transport Security. It tells compliant browsers to only access this website (and subdomains) via HTTPS for the specified time period. This helps to prevent SSL stripping attacks and ensures traffic will only travel over a secure connection.

SECURE_FRAME_DENY = True

This enables the X-Frame-Options header which tells browsers not to allow the site to be accessed via frames, a common technique employed by clickjackers.

SECURE_CONTENT_TYPE_NOSNIFF = True

This prevents the browser from guessing asset content types leading to assumptions about the content that could be exploited.

SECURE_BROWSER_XSS_FILTER = True

Enables dormant XSS filter capabilities in older browsers.
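Pulling these together, a hardened settings fragment might look like the following (a sketch of the options discussed above, assuming the djangosecure app and middleware have already been added; the HSTS lifetime is just an example value):

# Session cookies: TLS only, hidden from JavaScript, per browser session
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_HTTPONLY = True
SESSION_EXPIRE_AT_BROWSER_CLOSE = True

# django-secure middleware options
SECURE_SSL_REDIRECT = True
SECURE_HSTS_SECONDS = 31536000            # one year
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_FRAME_DENY = True
SECURE_CONTENT_TYPE_NOSNIFF = True
SECURE_BROWSER_XSS_FILTER = True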

With these settings in place, almost all of the security features that modern browsers employ will be enabled. To complete the configuration we need to enable SSL in the Apache web server. This will involve generating or obtaining SSL certificates to use with the dashboard and modifying the web server’s configuration.

Example SSL configuration for httpd.conf

<VirtualHost *:443>
ServerName openstack.example.com
SSLEngine On
SSLCertificateFile /etc/apache2/tls/dashboard_cert.pem
SSLCACertificateFile /etc/apache2/tls/dashboard_cert.pem
SSLCertificateKeyFile /etc/apache2/tls/dashboard_key.pem
SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
.
.
. (WSGI configuration here)
</VirtualHost>

Finally, running `python manage.py checksecure` again should no longer report any failures, and you should have a much more secure administrative interface to your OpenStack deployment.

January 03, 2014

Stretches for Lower Back Pain

“Sitting is the new smoking.” –Unattributed quote making the rounds.

Lower back pain is part of what has been termed “Silicon Valley Syndrome,” the side effect of using too much technology. I’ve been battling lower back pain for years.

Pain

When one sits for a long period of time, the muscles of the legs are held in a contracted position. Over time, the muscles adapt to this position, and get a shorter “resting” length. When the sitter later stands up, these muscles pull on the lower back.

[Figure: Sitting]

Here are a couple of stretches to counteract that shortening.

Hamstring

First, the hamstrings.  The start position is in the picture below.  First, pull back your hips (small green arrow) and feel the muscle tighten…and do it slow…remember, the idea is to relax the muscle, and you can’t aggressively relax.

[Figure: Hamstring]

Then, push your chest toward your toes along the long green arrow.  Don’t stretch your lower back, stretch your leg muscle.  The lower back is the hurt spot because it is the weak spot. You are trying to release the pressure on the lower back, not stretch it.

Hip Flexor

The other muscle (muscle group actually) that suffers from shortening is the Hip Flexor.

To stretch this one, I like to use the following procedure.  Starting position is shown below:

[Figure: HipFlexor]

The red line is the line of the hip flexor muscles.  The forward dot is where you should feel this exercise the most.

You probably want to put a cushion underneath the knee on the ground.  Now, push your hips forward.  You should feel it at the very top of your thigh, above the quadriceps.  You are stretching a muscle that goes through your pelvis, not the leg, and not the back.

Warmup

This is a short segment of the overall physical therapy exercise program.  The stretches themselves can be performed cold, but are far more effective if you have loosened the muscles first.  I’ve been doing 10 minutes on an exercise bike to get the blood flowing and the muscles soft.  This loosening up is probably as important, if not more so, than the stretches themselves.

Note:  All images are mine, drawn by my own hand.  Yes, I admit to them.  They are reusable under the creative commons license; Please let me know if you wish to use them.

January 02, 2014

PSA - Smart Cards are still a Hell - Instructions for CardOS cards

Some time ago I received a Smart Card from work in order to do some testing. Of course as soon as I received it I got drowned into some other work and had to postpone playing with it. Come the winter holiday break and I found some time to try this new toy. Except ...

... except I found out that the Smart Card Hell is still a Hell

I tried to find information online about how to initialize the CardOS card I got and I found very little cohesive documentation even on the sites of the tools I ultimately got to use.

The smart card landscape is still a fragmented lake of incompatibility, where the same tools work for some functions on some cards and lack anything resembling usability.

Ultimately I couldn't find out the right magic incantation for the reader and card combo I had, and instead had to ask a coworker that already used this stuff.

Luckily he had the magic scroll and it allowed me, at least, to start playing with the card. So for posterity, and for my own sake, let me register here the few steps needed to install a certificate in this setup.

I had to use no less than 3 different CLI tools to manage the job, which is insane in its own right. The tools as you will see have absurd requirements like sometimes specifying a shared object name on the CLI ... I think smart card tools still win the "Unusable jumbled mess of tools - 2013 award".

The cardos-tool --info command let me know that I have a SCM Microsystems Inc. SCR 3310 Reader using a CardOS V4.3B card. Of course you need to know in advance that your card is a CardOS one to be able to find out the tool to use ...

The very lucky thing about this card is that it can be reformatted to pristine status w/o knowing any PIN or PUK. Of course that means someone can wipe it out, but that is not a big deal in production (someone can always lock it dead by failing to enter the PIN and PUK codes enough times), and it is great for developers that keep forgetting whatever test PIN or PUK code was used with the specific card :-) This way the worst case is that you just need to format it and generate/install a new cert to keep testing.

So on to the instructions:

Format the card:

cardos-tool -f
and notice how no confirmation at all is requested, and it works as a user on my Fedora 20 machine. I find not asking for confirmation a bit bold, given this operation destroys all current content, but ... whatever ...

Create necessary PKCS#15 and set admin pins:

pkcs15-init -CT --so-pin 12345678 --so-puk 23456789
Note that you have to know that you need to create this stuff, and that a tool with obscure switches to do it also exists ...

Separately create user PIN and unlock code:

pkcs15-init -P -a 1 --pin 87654321 --puk 98765432 --so-pin 12345678 --label "My Cert"
No idea why this needs to be a separate operation, part of the magic scroll.

Finally import an existing certificate:

pkcs15-init --store-private-key /path/to/file.cert --auth-id 01 --pin 87654321 --so-pin 12345678
Again, I am not sure why this needs to be a separate command. Also note that this assumes a PEM formatted file; if you have a PKCS#12 file, use the --format pkcs12 switch to feed it in. Note that the tool assumes PKCS#12 cert files are passphrase protected, so you need to know the code before trying to upload such formatted certs onto the card.

Check everything went well with:

pkcs11-tool --module opensc-pkcs11.so -l --pin 87654321 -O
of course yet another tool, with the most amusing syntax of them all ...

... and that is all I know at this point. If you feel the need to weep at this point feel free, I am reserving a corner of my room to do just that later on after lunch ...

How come some things get blocked by SELinux in permissive mode?
SELinux can be set up to run in three modes.

* Enforcing (My favorite)
* Permissive
* Disabled

Often permissive is described as the same as enforcing except everything is allowed and logged.

For the most part this is true, except when there are bugs or an "Access Control Manager" does not respect the permissive flag.

Most of SELinux is written so that the kernel controls access, and it would be very strange for the kernel to block an access in permissive mode.

But there are several situations where we want to check access outside the kernel.  For example.

  • Can an application connect to a particular dbus daemon?

  • Can a service start a particular systemd daemon?

  • Can a root process change the password of something?

  • Will sshd allow dwalsh to login as unconfined_t?

All of these checks are not seen by the kernel.  We implement SELinux checks in places like the dbus daemon, systemd, the X server, sshd, passwd ...  When one of these services denies access you will see a USER_AVC generated rather than an AVC.  If these SELinux checks are not written correctly to check the permissive flag when an access is denied, you could get a real denial in permissive mode.

Usually we see these as bugs, but in certain situations the upstream does not want to accept patches to check the permissive flag.

If you know of a situation where this happens, open a bugzilla on it and we can work with the packager to fix the problem.

When you see an AVC or USER_AVC that is generated in permissive mode, you should see a flag that states "success=yes" in the AVC record; this indicates that the AVC was generated but the access was still allowed.  If it says "success=no" in permissive mode then that should be considered a bug.

December 20, 2013

Authorization Scope

Much of the future work we need to do on Keystone falls into issues of scope. I’m going to merely try to define the problems here and avoid talking about solutions. I’ll try to address more specific aspects in future posts.

There are two phases to the problem.

  1. Administration
  2. Use

Here are the scope issues of the Administration stage:

  1. Different services need different role-names to document specific functions, but there is no easy way to delegate the creation of those role names.
  2. Role names are often in conflict. What is meant by Admin? Depends on who you ask.
  3. There is no cross checking between the set of role assignments available and the policy rules for a given service.
  4. Any administrator can create any role assignment for any user within the scope of their project or domain. To be honest, this might not be a problem, but it keeps coming up.

Here are the scope issues of the Use stage:

  1. The service catalog is too big. By default, every endpoint for every service is included, and most are rarely used.
  2. A user cannot request a token for a specific service.
  3. A user cannot request a token for a limited set of roles.
  4. A token can be used multiple times even if it was intended for a single use.
  5. Tokens live a long time, which means that they are susceptible to abuse long after their intended use.
  6. Almost all tokens used by the CLI are used once, but don’t expire until the default expiry.
  7. Domain scoping and unscoped tokens are confusing.
  8. Only an administrator can check if a token is valid.
  9. Tokens can only be signed by a single private key, regardless of the number of Keystone servers deployed.
  10. There is no way to explicitly limit what scope Keystone can sign for, or what role assignments should be allowed from a specific Keystone server.

December 18, 2013

Awesome new coreutils with improved SELinux support

When I first started working on SELinux over 10 years ago, one of the first packages I worked on was coreutils.  We were adding SELinux support to ensure proper handling of labeling.  After that we did not touch it for several years.

Last year, I decided to investigate whether I could improve coreutils' handling of labels on initial content creation.  Well, it took a while, but my patches were finally accepted, with lots of fixes from upstream, and coreutils-8.22 just showed up today in Rawhide.

I am very excited about this release.  I believe it can allow administrators to fix one of the biggest problems users have with SELinux: objects getting created with the incorrect context.

My patches basically standardized "-Z" with no argument to indicate that you want the target to get the "default" label for its destination directory.

Example:

# touch /tmp/foobar
# mv /tmp/foobar /etc
# ls -lZ /etc/foobar
-rw-r--r--. root root staff_u:object_r:user_tmp_t:s0   /etc/foobar


As opposed to:

# touch /tmp/foobar
# mv -Z /tmp/foobar /etc
# ls -lZ /etc/foobar
-rw-r--r--. root root staff_u:object_r:etc_t:s0   /etc/foobar


The traditional behavior of a command like mv was to maintain the "security" of the object you are moving: the mv command would preserve the ownership, permissions, and SELinux labels.  The problem is that users and administrators often do not expect this.  By adding "-Z" to the mv command, the administrator guarantees that the object will get the correct label based on the destination path, which, I believe, is what the administrator expects.  The "-Z" option in coreutils now indicates the equivalent of running restorecon on the target, except that in most cases the label is correct on creation of the content.

"mv -Z /tmp/foobar /etc/foobar" == "mv /tmp/foobar /etc/foobar; restorecon /tmp/foobar"

One of the reasons we did not do this sooner, was the speed of reading in the labeling database.  The latest SELinux toolchain loads the labeling database in a fraction of the previous time, allowing us to make these changes.

Setting up coreutils alias

I would even suggest that it would be a good idea to alias

alias mv='mv -Z'

for most users.

A common mistake is to mv content around in the home directory.  A mistake I have made in the past was to copy an html file to my account on people.fedoraproject.org, then ssh into the machine and mv it into the public_html directory.  ~/public_html is labeled httpd_user_content_t, which is readable by default from the apache server, while the default label of my homedir, user_home_t, is not.

mv ~/content.html ~/public_html/

This command would end up with ~/public_html/content.html being labeled user_home_t, and the page would not show up on the web site.  Users would not know why, and would probably not know about SELinux.  But if the administrator changed the alias for the mv command, everything would just work.
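
If you have already hit this, a minimal way to diagnose and repair the label by hand (content.html is just the example file from above) is:

ls -Z ~/public_html/content.html          # shows the current, wrong label
restorecon -v ~/public_html/content.html  # resets it to the default for that path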

Other Commands

Similarly the -Z option has been implemented for all of the commands that create content in coreutils.

mknod -Z, mkdir -Z, mkfifo -Z, install -Z

Currently in init scripts we have lots of code that does:

mkdir /run/myapp; restorecon /run/myapp

Which can be replaced with

mkdir -Z /run/myapp

What about Disabled Machines, or machines that do not support SELinux?

On an SELinux disabled system, the -Z option will be ignored.

Conclusion

Getting the label correct at file creation has been improved greatly in current Fedora releases with the introduction of file name transitions.  Fixing coreutils to allow administrators to change the defaults of standard tools to set default labels on object creation is a nice complement.

alias mv='mv -iZ'
alias cp='cp -iZ'
alias mkdir='mkdir -Z'
alias mknod='mknod -Z'
alias install='install -Z'


I hope to get this new coreutils backported into RHEL7...

Security

One thing to remember about this from a security point of view: a calling confined domain would still be prevented from creating content with the default label if it is not allowed by SELinux policy to create content with that label.  The change to coreutils just allows the process to attempt to create the content with the correct label.

Thanks to coreutils upstream for working on these patches with us.

December 17, 2013

golang support for libselinux in Rawhide.
Every so often I get to spend a couple of days working on a new computer language, but it has been a while.

I am working on a project to bring SELinux support to docker.

The basic idea is to launch containers with a specific SELinux type and a random MCS label, using pretty much the same technology as we use with sVirt.  We do this using libvirt and virt-sandbox-service in Fedora now, but we want to implement similar support for docker.

One problem I had when I first started working on this project was that docker is written in the Go programming language. I did not know the Go language, and there were no libselinux bindings for Go.

Luckily, Go is fairly easy to bind to the C language using cgo.  After a couple of weeks' work, I put together selinux.go, which implements all of the functions that I needed to get containers running with SELinux labels.  Going forward, it would be nice to hook up all of the libselinux functions. (Patches welcome.)

Package will show up in libselinux-2.2.1-3.fc21

/usr/share/gocode/selinux/selinux.go

Any input for improvements to go code would be welcome.

December 13, 2013

Using the Openstack common client with Keystone

My last post showed how to load the user data using curl. This is only interesting if you love curl. It’s pretty easy to do the same thing from the command line. Now, we at Keystone central hate responsibility. We have no desire to do more than we have to. That includes writing the Command Line Client.

There is an effort afoot to move to a unified command line. Here is a sneak peek:

Getting this to work took a little finagling: when a user gets a token, it contains the URL for the Keystone admin port, and the CLI uses this to perform the user create action. There is work going on to do better discoverability (figuring out which version of the API is supported), but until then you can do the following hack (not recommended for production).

Edit the database

 mysql --user keystone --password=keystone keystone

Make the admin URL V3 specific:

update endpoint set url='http://127.0.0.1:35357/v3'  where url like 'http://127.0.0.1:35357/%';
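
Before restarting, you can sanity-check the change with a quick query against the same table and column used in the update above:

select url from endpoint where url like 'http://127.0.0.1:35357/%';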

Restart Keystone.

And you can use the command:

export OS_AUTH_URL=http://127.0.0.1:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD=freeipa4all
export OS_TENANT_NAME=admin
openstack --os-identity-api-version=3  user create testname2 --password=testme --project=demo  --domain=default

So my previous example would be reduced to:

 while read USERNAME ; do openstack --os-identity-api-version=3    user create  $USERNAME  --password=changeme --project=demo  ; done  < usernames.txt
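
Here usernames.txt is assumed to be a plain text file with one username per line, for example (hypothetical names):

alice
bob
carol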