October 30, 2014

Ability to remove TLS 1.0 from httpd in CentOS 6

Due to a bug in mod_ssl, the ability to remove TLS 1.0 (and only support TLS 1.1 and/or TLS 1.2) has not been available.  The fix has now made it to CentOS 6 and you can now fine-tune your cryptographic protocols with ease.

Before the fix my /etc/httpd/conf.d/ssl.conf file had this line:

SSLProtocol all -SSLv2 -SSLv3

This allows all SSL protocols except SSLv2 and SSLv3 to be used with httpd.  This isn’t a bad solution but there are a couple of sites that I’d prefer to further lock down by removing TLS 1.0 and TLS 1.2 1.1.  With the fix now in mod_ssl my settings can now look like this:

SSLProtocol all -SSLv2 -SSLv3 -TLSv1 -TLSv1.1

…and I’ll only support TLS 1.2 and beyond.  Of course doing this will significantly reduce the number of clients that can connect to my server.  According to SSLLabs I’m blocking all IE users before IE 11, Android before 4.4.2, Java 7, and Firefox 24.2.0 ESR.  But luckily I really don’t have a problem with any of these browsers for a couple of things I do so I’ll likely tighten up security there and leave my more public sites alone.


Security Specifications

There are many potential sources for security specifications. Some of them are government standards. For example, in the United States, HIPAA, the Health Insurance Portability and Accountability Act of 1996, specifies requirements for administrative safeguards, physical safeguards, and technical safeguards of medical records and personally identifiable information. Anyone dealing with Protected Health Information must comply with HIPAA.

The credit card industry has the Payment Card Industry Data Security Standard or PCI DSS, which must be followed by anyone who is handling credit card information.

The SANS Institute offers a wide range of security training and resources, including a set of Information Security Policy Templates that provide examples of best practices that can be customized for your organization. For example, the Server Security Policy specifies things like “all internal servers deployed at <company name> must be owned by an operational group that is responsible for system administration” – in other words, no zombie servers that no-one is responsible for!

The United States Department of Defense has prepared Security Technical Implementation Guides which specify how government computers will be configured and managed.

Of course there are numerous books on computer security, many including guidelines and checklists.

Finally, each organization must prepare their own security guide which lays out the security rules that they choose to follow. This is critical because each organization has their own set of requirements, needs, and threats. You can’t simply say “all computer systems must be completely secure” – first, this is impossible. There is a famous observation that the only truly secure computer system is one that is melted into slag, ground into dust, cast into a block of concrete, and dumped into the deepest part of the ocean. Second, implementing the highest levels of security for all systems is expensive and makes the systems very difficult to use.

As we have discussed before, computer security in the real world is a risk management exercise. Risk can’t be eliminated, it can’t be ignored, and it should be managed intelligently.

An organizations security guide should be based on applicable government and industry requirements, accepted best practices, and the specific requirements of the organization.


October 27, 2014

Computer Security Audits

In conversations with large companies and small companies, literature review and looking at best practices for security, one of the most common tools that essentially everyone uses is a security audit. In most cases the security audit is performed regularly – it isn’t just a one time event. OK, this sounds good, but what is it?

Dictionary.com definitions of audit include “an official examination and verification of accounts and records”, “the inspection or examination of a building or other facility to evaluate or improve its appropriateness, safety, efficiency or the like” as well as “to examine and verify an account by references to vouchers”.

A computer security audit means verifying that the system complies with specifications for computer security.

So far we haven’t really said anything – we’ve just laid the foundation for the three big questions:

  • What are these magical specifications for computer security? And where do they come from?
  • How is compliance against the security specifications measured?
  • How is the computer security audit performed?

Missing from this list is the huge question of what is done about lack of compliance against the security specifications. Is fixing any security issues identified in the audit considered part of the audit, or is it a separate exercise?


October 22, 2014

LISA14 – Simplified Remote Management of Linux Servers

I am giving a talk on Simplified Remote Management of Linux Servers at the upcoming LISA14 conference in Seattle, which runs from November 9-14. My talk is 9:45-10:30am on Friday, November 14. LISA is Large Installation System Administration SIG of Usenix.

If you are attending LISA I would enjoy meeting you and discussing anything around system administration, security, and open source in general! Drop me a line and let’s see about scheduling some time.

Abstract:

How do you manage a hundred or a thousand Linux servers? With practice! Managing Linux systems is typically done by an experienced system administrator using a patchwork of standalone tools and custom scripts running on each system. There is a better way to work – to manage more systems in less time with less work – and without learning an entirely new way of working.

OpenLMI (the Linux Management Infrastructure program) delivers remote management of production servers – ranging from high end enterprise servers with complex network and storage configurations to virtual guests. Designed to support bare metal servers and to directly manipulate storage, network and system hardware, it is equally capable of managing and monitoring virtual machine guests.

In this session we will show how a system administrator can use the new tools to function more effectively, focusing on how they extend and improve existing management workflows and expertise.


Configuring FreeBSD as a FreeIPA client

A recent thread on the freeipa-users mailing list highlighted one user’s experience with setting up FreeBSD as a FreeIPA client, complete with SSSD and Sudo integration. GNU+Linux systems have ipa-client-install, but the lack of an equivalent on FreeBSD means that much of the configuration must be done manually. There is a lot of room for error, and this user encountered several "gotchas" and caveats.

Services that require manual configuration include PAM, NSS, Kerberos and SSSD. Certain features may require even more services to be configured, such as sshd, for known_hosts management. Most of the steps have been outlined in a post on the FreeBSD forums.

But before one can even begin configuring all these services, SSSD, Sudo and related software and dependencies must be installed. Unfortunately, as also outlined in the forum post, non-default port options and a certain make.conf variable must be set in order to build the software such that the system can be used as a FreeIPA client. Similarly, the official binary package repositories do not provide the packages in a suitable configuration.

This post details how I built a custom binary package repository for FreeBSD and how administrators can use it to install exactly the right packages needed to operate as a FreeIPA client. Not all FreeBSD administrators will want to take this path, but those who do will not have to worry about getting the ports built correctly, and will save some time since the packages come pre-built.

Custom package repository

poudriere is a tool for creating binary package repositories compatible with FreeBSD’s next-generation pkg(8) package manager (also known as "pkgng".) The official package repositories are built using poudriere, but anyone can use it to build their own package repositories. Repositories are built in isolated jails (an OS-level virtualisation technology similar to LXC or Docker) and can build packages from a list of ports (or the entire ports tree) with customised options. A customised make.conf file can also be supplied for each jail.

Providing a custom repository with FreeIPA-compatible packages is a practical way to help people wanting to use FreeBSD with FreeIPA. It means fewer steps in preparing a system as a FreeIPA client (fewer opportunities to make mistakes), and also saves a substantial amount of time since the administrator doesn’t need to build any ports. The BSD Now podcast has a detailed poudriere tutorial; all the detail on how to use poudriere is included there, so I will just list the FreeIPA-specific configuration for the FreeIPA repository:

  • security/sudo is built with the SSSD option set
  • WANT_OPENLDAP_SASL=yes appears in the jail’s make.conf

The repository is currently being built for FreeBSD 10.0 (both amd64 and i386.) 10.1 is not far away; once it is released I will build it for 10.1 instead. If anyone out there would like it built for FreeBSD 9.3 I can do that too – just let me know!

Assuming the custom repository is available for the release and architecture of the FreeBSD system, the following script will enable the repository and install the required packages.

#!/bin/sh
pkg install -y ca_root_nss
ln -s /usr/local/share/certs/ca-root-nss.crt /etc/ssl/cert.pem
mkdir -p /usr/local/etc/pkg/repos
cat >/usr/local/etc/pkg/repos/FreeIPA.conf <<"EOF"
FreeIPA: {
  url: "https://frase.id.au/pkg/${ABI}_FreeIPA",
  signature_type: "pubkey",
  pubkey: "/usr/share/keys/pkg/FreeIPA.pem",
  enabled: yes
}
EOF
cat >/usr/share/keys/pkg/FreeIPA.pem <<EOF
-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAopt0Lubb0ur+L+VzsP9k
i4QrvQb/4gVlmr/d59lUsTr9cz5B5OtNLi+WMVcNh4EmmNIiWoVuQY4Wqjm2d1IA
VCXw+OqeAuj9nUW4jSvI/lDLyErFBXezNM5yggeesiV2ii+uO41zOjUxnSkupFzh
zOWr+Oj4kJI/iNU++3RpzyrBSmSGK9TN9k3afhyDMNlJi5SqK/wOrSjqAMfaufHE
MkJqBibDL/+xx48SbtInhtD4LIneHoOGxVtkLIcTSS5EpnIsDWZgXX6jBatv9LJe
u2UeQsKLKcCgrhT3VX+pc/aDsUFS4ZqOonLRt9mcFVxC4NDNMKsfXTCd760HQXYU
enVLydNavvGtGYQpbUWx5IT3IphaNxWANACpWrcvTawgPyGkGTPd347Nqhm5YV2c
YRf4rVX/S7U0QOzMPxHKN4siZVCspiedY+O4P6qe2R2cTyxntjLVGZcTBlXAdQJ8
UfQuuX97FX47xghxR6wyWfkXGCes2kVdVo0fF0vkYe1652SGJsfWjc5ojR9KFKkD
DN3x3Wu6kW0koZMF3Tf0rtSLDmbZEBddIPFrXo8QHiyqFtU3DLrYWGmbLRkYKnYR
KvG3XCJ6EmvMlfr8GjDIaEiGo7E7IyLusZXXzbIW2EKQdwa6p4N8wrW/30Ov53jp
rO+Bwn10+9DZTupQ3c04lsUCAwEAAQ==
-----END PUBLIC KEY-----
EOF
pkg update
pkg install -r FreeIPA -y cyrus-sasl-gssapi sssd sudo

Once the packages are installed from the custom repository, configuration can continue as indicated in the forum post.

Future efforts

This post was concerned with package installation. This is an important but relatively small part of setting up a FreeBSD client. There is more that can be done to make it easier to integrate FreeBSD (and other non-GNU+Linux systems) with FreeIPA. I will conclude this post with some ideas along this trajectory.

Recent versions of FreeIPA include the ipa-advise tool, which explains how various legacy systems can be configured to some extent as FreeIPA clients. ipa-advise config-freebsd-nss-pam-ldapd shows advice on how to configure a FreeBSD system, but the information is out of date in many respects – it references the old binary package tools (which have now been completely removed) and has no information about SSSD. This information should be updated. I have had this task on a sticky-note for a little while now, but if someone else beats me to it, that would be no bad thing.

The latest major version of SSSD is 1.12, but the FreeBSD port is back at 1.9. The 1.9 release is a long-term maintenance (LTM) release, but any efforts to bring 1.12 to FreeBSD alongside 1.9 would undoubtedly be appreciated by the port maintainer and users.

A longer term goal should be a port of (or an equivalent to) ipa-client-install for FreeBSD. Most of the software needed for FreeIPA integration on FreeBSD is similar or identical to that used on GNU+Linux, but there are some differences. It would be a time consuming task – lots of trial runs and testing – but probably not particularly difficult.

In regards to the package repository, work is underway to add support for package flavours to the FreeBSD packaging infrastructure. When this feature is ready, a small effort should be undertaken to add a FreeIPA flavour to the ports tree, and ensure that the resultant packages are made available in the official package repository. Once this is achieved, neither manual port builds nor the custom package repository will be required –
everything needed to configure FreeBSD as a FreeIPA client will be available to all FreeBSD users by default.

October 20, 2014

Can SSL 3.0 be fixed? An analysis of the POODLE attack.

SSL and TLS are cryptographic protocols which allow users to securely communicate over the Internet. Their development history is no different from other standards on the Internet. Security flaws were found with older versions and other improvements were required as technology progressed (for example elliptic curve cryptography or ECC), which led to the creation of newer versions of the protocol.

It is easier to write newer standards, and maybe even implement them in code, than to adapt existing ones while maintaining backward compatibility. The widespread use of SSL/TLS to secure traffic on the Internet makes a uniform update difficult. This is especially true for hardware and embedded devices such as routers and consumer electronics which may receive infrequent updates from their vendors.

The fact that legacy systems and protocols need to be supported, even though more secure options are available, has lead to the inclusion of a version negotiation mechanism in SSL/TLS protocols. This mechanism allows a client and a server to communicate even if the highest SSL/TLS version they support is not identical. The client indicates the highest version it supports in its ClientHello handshake message, then the server picks the highest version supported by both the client and the server, then communicates this version back to the client in its ServerHello handshake message. The SSL/TLS protocols implement protections to prevent a man-in-the-middle (MITM) attacker from being able to tamper with handshake messages that force the use of a protocol version lower than the highest version supported by both the client and the server.

Most popular browsers implement a different out-of-band mechanism for fallback to earlier protocol versions. Some SSL/TLS implementations do not correctly handle cases when a connecting client supports a newer TLS protocol version than supported by the server, or when certain TLS extensions are used. Instead of negotiating the highest TLS version supported by the server the connection attempt may fail. As a workaround, the web browser may attempt to re-connect with certain protocol versions disabled. For example, the browser may initially connect claiming TLS 1.2 as the highest supported version, and subsequently reconnect claiming only TLS 1.1, TLS 1.0, or eventually SSL 3.0 as the highest supported version until the connection attempt succeeds. This can trivially allow a MITM attacker to cause a protocol downgrade and make the client/server use SSL 3.0. This fallback behavior is not seen in non HTTPS clients.

The issue related to the POODLE flaw is an attack against the “authenticate-then-encrypt” constructions used by block ciphers in their cipher block chaining (CBC) mode, as used in SSL and TLS. By using SSL 3.0, at most 256 connections are required to reliably decrypt one byte of ciphertext. Known flaws already affect RC4 and non block-ciphers and their use is discouraged.

Several cryptographic library vendors have issued patches which introduce the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) support to their libraries. This is essentially a fallback mechanism in which clients indicate to the server that they can speak a newer SSL/TLS versions than the one they are proposing. If TLS_FALLBACK_SCSV was included in the ClientHello and the highest protocol version supported by the server is higher than the version indicated by the client, the server aborts the connection, because it means that the client is trying to fallback to a older version even though it can speak the newer version.

Before applying this fix, there are several things that need to be understood:

  • As discussed before, only web browsers perform an out-of-band protocol fallback. Not all web browsers currently support TLS_FALLBACK_SCSV in their released version. Even if the patch is applied on the server, the connection may still be unsafe if the browser is able to negotiate SSL 3.0
  • Clients which do not implement out-of-protocol TLS version downgrades (generally anything which does not speak HTTPS) do not need to be changed. Adding TLS_FALLBACK_SCSV is unnecessary (and even impossible) if there is no downgrade logic in the client application.
  • Thunderbird shares a lot of its code with the Firefox web browser, including the connection setup code for IMAPS and SMTPS. This means that Thunderbird will perform an insecure protocol downgrade, just like Firefox. However, the plaintext recovery attack described in the POODLE paper does not apply to IMAPS or SMTPS, and the web browser in Thunderbird has Javascript disabled, and is usually not used to access sites which require authentication, so the impact on Thunderbird is very limited.
  • The TLS/SSL server needs to be patched to support the SCSV extension – though, as opposed to the client, the server does not have to be rebuilt with source changes applied. Just installing an upgrade TLS library is sufficient. Due to the current lack of browser support, this server-side change does not have any positive security impact as of this writing. It only prepares for a future where a significant share of browsers implement TLS_FALLBACK_SCSV.
  • If both the server and the client are patched and one of them only supports SSL 3.0, SSL 3.0 will be used directly, which results in a connection with reduced security (compared to currently recommended practices). However, the alternative is a total connection failure or, in some situations, an unencrypted connection which does nothing to protect from an MITM attack. SSL 3.0 is still better than an unencrypted connection.
  • As a stop-gap measure against attacks based on SSL 3.0, disabling support for this aging protocol can be performed on the server and the client. Advice on disabling SSL 3.0 in various Red Hat products and components is available on the Knowledge Base.

Information about (the lack of ongoing) attacks may help with a decision. Protocol downgrades are not covert attacks, in particular in this case. It is possible to log SSL/TLS protocol versions negotiated with clients and compare these versions with expected version numbers (as derived from user profiles or the HTTP user agent header). Even after a forced downgrade to SSL 3.0, HTTPS protects against tampering. The plaintext recovery attack described in the POODLE paper (Bodo Möller, Thai Duong, Krzysztof Kotowicz, This POODLE Bites: Exploiting The SSL 3.0 Fallback, September 2014) can be detected by the server and just the number of requests generated by it could be noticeable.

Red Hat has done additional research regarding the downgrade attack in question. We have not found any clients that can be forcibly downgraded by an attacker other than clients that speak HTTPS. Due to this fact, disabling SSL 3.0 on services which are not used by HTTPS clients does not affect the level of security offered. A client that supports a higher protocol version and cannot be downgraded is not at issue as it will always use the higher protocol version.

SSL 3.0 cannot be repaired at this point because what constitutes the SSL 3.0 protocol is set in stone by its specification. However, starting in 1999, successor protocols to SSL 3.0 were developed called TLS 1.0, 1.1, and 1.2 (which is currently the most recent version). Because of the built-in protocol upgrade mechanisms, these successor protocols will be used whenever possible. In this sense, SSL 3.0 has indeed been fixed – an update to SSL 3.0 should be seen as being TLS 1.0, 1.1, and 1.2. Implementing TLS_FALLBACK_SCSV handling in servers makes sure that attackers cannot circumvent the fixes in later protocol versions.

October 15, 2014

POODLE – An SSL 3.0 Vulnerability (CVE-2014-3566)

Red Hat Product Security has been made aware of a vulnerability in the SSL 3.0 protocol, which has been assigned CVE-2014-3566. All implementations of SSL 3.0 are affected. This vulnerability allows a man-in-the-middle attacker to decrypt ciphertext using a padding oracle side-channel attack.

To mitigate this vulnerability, it is recommended that you explicitly disable SSL 3.0 in favor of TLS 1.1 or later in all affected packages.

A brief history

Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols designed to provide communication security over networks. The SSL protocol was originally developed by Netscape.  Version 1.0 and was never publicly released; version 2.0 was released in February 1995 but contained a number of security flaws which ultimately led to the design of SSL 3.0. Over the years, several flaws were found in the design of SSL 3.0 as well. This ultimately lead to the development and widespread use of the TLS protocol.

Most TLS implementations remain backward compatible with SSL 3.0 to incorporate legacy systems and provide a smoother user experience. Many SSL clients implement a protocol downgrade “dance” to work around the server side interoperability issues. Once the connection is downgraded to SSL 3.0, RC4 or a block cipher with CBC mode is used; this is where the problem starts!

What is POODLE?

The POODLE vulnerability has two aspects. The first aspect is a weakness in the SSL 3.0 protocol, a padding oracle. An attacker can exploit this vulnerability to recover small amounts of plaintext from an encrypted SSL 3.0 connection, by issuing crafted HTTPS requests created by client-side Javascript code, for example. Multiple HTTPS requests are required for each recovered plaintext byte, and the vulnerability allows attackers to confirm if a particular byte was guessed correctly. This vulnerability is inherent to SSL 3.0 and unavoidable in this protocol version. The fix is to upgrade to newer versions, up to TLS 1.2 if possible.

Normally, a client and a server automatically negotiate the most recent supported protocol version of SSL/TLS. The second aspect of the POODLE vulnerability concerns this negotiation mechanism. For the protocol negotiation mechanism to work, servers must gracefully deal with a more recent protocol version offered by clients. (The connection would just use the older, server-supported version in such a scenario, not benefiting from future protocol enhancements.) However, when newer TLS versions were deployed, it was discovered that some servers just terminated the connection at the TCP layer or responded with a fatal handshake error, preventing a secure connection from being established. Clearly, this server behavior is a violation of the TLS protocol, but there were concerns that this behavior would make it impossible to deploy upgraded clients and widespread interoperability failures were feared. Consequently, browsers first try a recent TLS version, and if that fails, they attempt again with older protocol versions, until they end up at SSL 3.0, which suffers from the padding-related vulnerability described above. This behavior is sometimes called the compatibility dance. It is not part of TLS implementations such as OpenSSL, NSS, or GNUTLS; it is implemented by application code in client applications such as Firefox and Thunderbird.

Both aspects of POODLE require a man in the middle attack at the network layer. The first aspect of this flaw, the SSL 3.0 vulnerability, requires that an attacker can observe the network traffic between a client and a server and somehow trigger crafted network traffic from the client. This does not strictly require active manipulation of the network transmission, passive eavesdropping is sufficient. However, the second aspect, the forced protocol downgrade, requires active manipulation of network traffic.  As described in the POODLE paper, both aspects require the attacker to be able to observe and manipulate network traffic while it is in transit.


How are modern browsers affected by the POODLE security flaw?

Browsers are particularly vulnerable because session cookies are short and an ideal target for plain text recovery, and the way HTTPS works allows an attacker to generate many guesses quickly (either through Javascript or by downloading images). Browsers are also most likely to implement the compatibility fallback.
By default, Firefox supports SSL 3.0, and performs the compatibility fallback as described above. SSL 3.0 support can be switched off, but the compatibility fallback cannot be configured separately.

Is this issue fixed?

The first aspect of POODLE, the SSL 3.0 protocol vulnerability, has already been fixed through iterative protocol improvements, leading to the current TLS version, 1.2. It is simply not possible to address this in the context of the SSL 3.0 protocol, a protocol upgrade to one of the successors is needed. Note that TLS versions before 1.1 had similar padding-related protocol issues, which is why we recommend to switch to TLS 1.1, if possible. These issues, while present, are currently not known to be exploitable in a way such as POODLE. The first aspect of the POODLE vulnerability is not exploitable with TLS 1.0. (SSL and TLS are still quite similar as protocols, the name change has non-technical reasons.)

The second aspect, caused by browsers which implement the compatibility fallback in an insecure way, has yet to be addressed. Strictly speaking, this is a security vulnerability in browsers due to the way they misuse the TLS protocol. One way to fix this issue would be to remove the compatibility dance, focusing instead on making servers compatible with clients implementing the most recent TLS implementation (as explained, the protocol supports a version negotiation mechanism, but some servers refuse to implement it).

However, there is an industry-wide effort under way to enable browsers to downgrade in a secure fashion, using a new Signaling Cipher Suite Value (SCSV). This will require updates in browsers (such as Firefox) and TLS libraries (such as OpenSSL, NSS and GNUTLS). However, we do not envision changes in TLS client applications which currently do not implement the fallback logic, and neither in TLS server applications as long as they use one of the system TLS libraries. TLS-aware packet filters, firewalls, load balancers, and other infrastructure may need upgrades as well.

Is there going to be another SSL 3.0 issue in the near future? Is there a long term solution?

Disabling SSL 3.0 will obviously prevent exposure to future SSL 3.0-specific issues. The new SCSV-based downgrade mechanism should reliably prevent the use of SSL 3.0 if both parties support a newer protocol version. Once these software updates are widely deployed, the need to disable SSL 3.0 to address this and future vulnerabilities will hopefully be greatly reduced.

SSL 3.0 is typically used in conjunction with the RC4 stream cipher. (The only other secure option in a strict, SSL 3.0-only implementation is Triple DES, which is quite slow even on modern CPUs.) RC4 is already considered very weak, and SSL 3.0 does not even apply some of the recommended countermeasures which prolonged the lifetime of RC4 in other contexts. This is another reason to deploy support for more recent TLS versions.

I have patched my SSL implementation against BEAST and LUCKY-13, am I still vulnerable?

This depends on the type of mitigation you have implemented. If you disabled protocol versions earlier than TLS 1.1 (which includes SSL 3.0), then the POODLE issue does not affect your installation. If you forced clients to use RC4, the first aspect of POODLE does not apply, but you and your users are vulnerable to all of the weaknesses in RC4. If you implemented the n/n-1 split through a software update, or if you deployed TLS 1.1 support without enforcing it, but made no other configuration changes, you are still vulnerable to the POODLE issue.

Is it possible to monitor for exploit attempts?

The protocol downgrade is visible on the server side. Usually, servers can log TLS protocol versions. This information can then be compared with user agents or other information from the profile of a logged-in user, and mismatches could indicate attack attempts.

Attempts to abuse the SSL 3.0 padding oracle part of POODLE, as described in the paper, are visible to the server as well. They result in a fair number of HTTPS requests which follow a pattern not expected during the normal course of execution of a web application. However, it cannot be ruled out that a more sophisticated adaptive chosen plain text attack avoids confirmation of guesses from the server, and this more advanced attack would not be visible to the server, only to the client and the network up to the point at which the attacker performs their traffic manipulation.

What happens when i disable SSL 3.0 on my web server?

Some old browsers may not be able to support a secure connection to your site. Estimates of the number of such browsers in active use vary and depend on the target audience of a web site. SSL protocol version logging (see above) can be used to estimate the impact of disabling SSL 3.0 because it will be used only if no TLS version is available in the client.

Major browser vendors including Mozilla and Google have announced that they are to deactivate the SSL 3.0 in their upcoming versions.

How do I secure my Red Hat-supported software?

Red Hat has put together several articles regarding the removal of SSL 3.0 from its products.  Customers should review the recommendations and test changes before making them live in production systems.  As always, Red Hat Support is available to answer any questions you may have.

October 14, 2014

Who can sign for what?

In my last post, I discussed how to extract the signing information out of a token. But just because the signature on a document is valid does not mean that the user who signed it was authorized to do so. How can we got from a signature to validating a token? Can we use that same mechanism to sign other OpenStack messages?

The following is a proposed extension to Keystone client based on existing mechanisms.

Overview

  1. Extract signer data out of the certificates
  2. Fetch the compete list of certificate from Keystone using the OS-SIMPLE-CERT extension
  3. Match the signer to the cert to validate the signature and extract the domain data for the token
  4. Fetch the mapping info from the Federation extension
  5. Use the mapping info to convert from the signing cert to a keystone user and groups
  6. Fetch the effective roles from Keystone for the user/groups for that domain
  7. Fetch policy from Keystone
  8. Execute the policy check to validate that the signer could sign for the data.

We need a method to go from the certificate used to sign the document to a valid Keystone user. Implied in there is that everything signed in an OpenStack system is going to be signed by a Keystone user. This is an expansion on how things were done in the past, but there is a pretty solid basis for this approach: in Kerberos, everything is a Principal, whether user or system.

From Tokens to Certs

The Token has the CMS Signer Info.  We can extract that information as previously shown.

The OS-SIMPLE-CERT extension has an API for fetching all of the signing certs as once:

This might not scale greatly, it is sufficient for supporting a proof-of-concept.  It reduces the problem of “how to find the cert for this token”  down to a match between the signing info and the attributes of the certificates.

To extract the data from the certificates, We can Popen the OpenSSL command to validate a certificate.  This is proper security practice anyway, as, while we trust the authoritative Keystone, we should verify whenever possible.  It will be expensive, but this result can be cached and reused, so it should not have to happen very often.

From Certs to Users

To translate from a certificate to a user, we need to first parse the data out of the certificate. This is possible doing a call to OpenSSL. We can be especially efficient by using that call to validate the certificate itself, and then converting the response to a dictionary. Keystone already has a tool to convert a dictionary to the Identity objects (user and groups): the mapping mechanism in the Federation backend. Since a mapping is in a file we can fetch, we do not need to be inside the Keystone server to process the mapping, we just need to use the same mechanism.

Mappings

The OS-FEDERATION extension has an API to List all mappings.

And another to get each mapping.

Again, this will be expensive, but it can be cached now, and optimized in the future.

The same process that uses the mappings to translate the env-vars for an X509 certificate  to a user inside the Keystone server can be performed externally. This means extracting code from the Federation plugin of the Keystone server to python-keystoneclient.

From User to Roles

Once we have the users and groups, we need to get the Role data appropriate to the token. This means validating the token, and extracting out the domain for the project. Then we will use the Identity API to list effective role assignments

We’ll probably have to call this once for the user ID and then once for each of the groups from the mapping in order to get the full set of roles.

From Roles to Permission

Now, how do we determine if the user was capable of signing for the specified token? We need a policy file. Which one? The one abstraction we currently have is that a policy file can be associated with an endpoint. Since keystone is responsible for controlling the signing of tokens, the logical endpoint is the authoritative keystone server where we are getting the certificates etc:

We get the effective policy associated with the keystone endpoint using the policy API.

And now we can run the users RBAC through the policy engine to see if they can sign for the given token.  The policy engine is part of oslo common.  There is some “flattening” code from the Keystone server we will want to pull over.  But of these will again land in python-keystoneclient.

Implementation

This is a lot of communication with Keystone, but it should not have to be done very often: once each of these API calls have been made, the response can be cached for a reasonable amount of time. For example, a caching rule could say that all data is current for a minimum of 5 minutes. After that time, if a newly submitted token has an unknown signer info, the client could refetch the certificates. The notification mechanism from Keystone could also be extended to invalidate the cache of remote clients that register for such notifications.

For validating tokens in remote endpoints, the process will be split between python-keystoneclient and keystonemiddleware.  The Middleware piece will be responsible for cache management and maintaining the state of the validation between calls.  Keystone Client will expose each of the steps with a parameter that allows the cached state to be passed in, as well as already exposing the remote API wrapping functions.

At this state, it looks like no changes will have to be made to the Keystone server itself.  This shows the power of the existing abstractions.  In the future, some of the calls may need optimization.   Of example, the fetch for certificates may need to be broken down into a call that fetches an individual certificate by its signing info.

October 10, 2014

Who Signed that Token?

The specification For multiple signers requires a mechanism to determine who signed the token and then determine I’d the signer had the authority to issue a token for the scope of the token.  These are the steps he he necessary to perform that validation.

The CMS document is signed by a certificate, but, due to size constraints, the certificate has been stripped out of the token.  All that remains in the token is the ‘signer info’ section of the CMS document, as defined here http://tools.ietf.org/html/rfc5652#page-13 as

SignerIdentifier ::= CHOICE {
issuerAndSerialNumber IssuerAndSerialNumber,
subjectKeyIdentifier [0] SubjectKeyIdentifier }

IssuerAndSerialNumber is defined as:

The IssuerAndSerialNumber type identifies a certificate, and thereby
an entity and a public key, by the distinguished name of the
certificate issuer and an issuer-specific certificate serial number.

The definition of Name is taken from X.501 [X.501-88], and the
definition of CertificateSerialNumber is taken from X.509 [X.509-97].

IssuerAndSerialNumber ::= SEQUENCE {
issuer Name,
serialNumber CertificateSerialNumber }

CertificateSerialNumber ::= INTEGER

http://tools.ietf.org/html/rfc5652#section-10.2.4

SubjectKeyIdentifier is a more general approach which allows the user to identify if a specific certificate matches (it has the Subject Key Identifier in it as well) but does not provide a way to identify who signed it, or where to fetch the corresponding certificate.

With the Keystone server acting as the system of reference, we know where to fetch the certificate. All signing certificates can be fetched using the OS-SIMPLE-CERT extension. So, regardless of which form the signing info takes, we could determine which certificate to use in order to verify the token.

To date, only a single certificate is used to sign tokens. Identifying the signer of the token is the first step to expanding that in the future.

What does the current OpenSSL binary call produce:  I took the DER form of one of the sample tokens and passed it through /usr/lib64/nss/unsupported-tools/derdump  (one of the most useful of tools)  and saw:

C-Sequence  (816)
   Object Identifier  (9)
      1 2 840 113549 1 7 2 (PKCS #7 Signed Data)
   C-[0]  (801)
      C-Sequence  (797)
         Integer  (1)
            01 
         C-Set  (13)
            C-Sequence  (11)
               Object Identifier  (9)
                  2 16 840 1 101 3 4 2 1 (SHA-256)
         C-Sequence  (309)
            Object Identifier  (9)
               1 2 840 113549 1 7 1 (PKCS #7 Data)
            C-[0]  (294)
               Octet String  (290)
                  7b 22 61 63 63 65 73 73 22 3a 20 7b 22 74 6f 6b 65 6e 
                  22 3a 20 7b 22 65 78 70 69 72 65 73 22 3a 20 22 32 31 
                  31 32 2d 30 38 2d 31 37 54 31 35 3a 33 35 3a 33 34 5a 
                  22 2c 20 22 69 64 22 3a 20 22 30 31 65 30 33 32 63 39 
                  39 36 65 66 34 34 30 36 62 31 34 34 33 33 35 39 31 35 
                  61 34 31 65 37 39 22 7d 2c 20 22 73 65 72 76 69 63 65 
                  43 61 74 61 6c 6f 67 22 3a 20 7b 7d 2c 20 22 75 73 65 
                  72 22 3a 20 7b 22 75 73 65 72 6e 61 6d 65 22 3a 20 22 
                  75 73 65 72 5f 6e 61 6d 65 31 22 2c 20 22 72 6f 6c 65 
                  73 5f 6c 69 6e 6b 73 22 3a 20 5b 5d 2c 20 22 69 64 22 
                  3a 20 22 63 39 63 38 39 65 33 62 65 33 65 65 34 35 33 
                  66 62 66 30 30 63 37 39 36 36 66 36 64 33 66 62 64 22 
                  2c 20 22 72 6f 6c 65 73 22 3a 20 5b 7b 22 6e 61 6d 65 
                  22 3a 20 22 72 6f 6c 65 31 22 7d 2c 20 7b 22 6e 61 6d 
                  65 22 3a 20 22 72 6f 6c 65 32 22 7d 5d 2c 20 22 6e 61 
                  6d 65 22 3a 20 22 75 73 65 72 5f 6e 61 6d 65 31 22 7d 
                  7d 7d 
         C-Set  (462)
            C-Sequence  (458)
               Integer  (1)
                  01 
               C-Sequence  (164)
                  C-Sequence  (158)
                     C-Set  (10)
                        C-Sequence  (8)
                           Object Identifier  (3)
                              2 5 4 5 (X520 Serial Number)
                           Printable String  (1)
                              "5"
                     C-Set  (11)
                        C-Sequence  (9)
                           Object Identifier  (3)
                              2 5 4 6 (X520 Country Name)
                           Printable String  (2)
                              "US"
                     C-Set  (11)
                        C-Sequence  (9)
                           Object Identifier  (3)
                              2 5 4 8 (X520 State Or Province Name)
                           Printable String  (2)
                              "CA"
                     C-Set  (18)
                        C-Sequence  (16)
                           Object Identifier  (3)
                              2 5 4 7 (X520 Locality Name)
                           Printable String  (9)
                              "Sunnyvale"
                     C-Set  (18)
                        C-Sequence  (16)
                           Object Identifier  (3)
                              2 5 4 10 (X520 Organization Name)
                           Printable String  (9)
                              "OpenStack"
                     C-Set  (17)
                        C-Sequence  (15)
                           Object Identifier  (3)
                              2 5 4 11 (X520 Organizational Unit Name)
                           Printable String  (8)
                              "Keystone"
                     C-Set  (37)
                        C-Sequence  (35)
                           Object Identifier  (9)
                              1 2 840 113549 1 9 1 (PKCS #9 Email Address)
                           IA5 String  (22)
                              "keystone@openstack.org"
                     C-Set  (20)
                        C-Sequence  (18)
                           Object Identifier  (3)
                              2 5 4 3 (X520 Common Name)
                           Printable String  (11)
                              "Self Signed"
                  Integer  (1)
                     11 
               C-Sequence  (11)
                  Object Identifier  (9)
                     2 16 840 1 101 3 4 2 1 (SHA-256)
               C-Sequence  (13)
                  Object Identifier  (9)
                     1 2 840 113549 1 1 1 (PKCS #1 RSA Encryption)
                  NULL  (0)
               Octet String  (256)
                  6e 93 08 58 52 dd 52 db 65 b9 aa 9b f5 87 37 bc 56 f2 
                  b5 25 05 a5 9b 37 68 cc 9e 2e f3 80 49 e2 58 d8 70 01 
                  35 0a 7b 66 c7 15 2c 65 6b b3 15 31 e6 8b 8e 27 eb 12 
                  d5 70 cd 71 b1 ae 68 fe b6 cf a6 b5 d7 a3 a6 84 d9 0d 
                  52 d9 e6 cd 38 fa b9 7e c5 09 63 76 99 14 3a f6 5b 71 
                  9c b7 90 9b 36 64 b5 f3 77 e6 5e ca e1 06 d1 bb a9 fc 
                  39 4c a5 e4 b2 0b 86 ae 46 d7 40 67 9d 82 38 3c 4e 69 
                  ee 00 d0 0d a8 d4 38 f9 8d a3 96 36 4d ed 18 6a 2f c4 
                  09 9f 13 9e 71 b4 31 5a f9 24 57 80 52 a2 dc 69 6e e4 
                  76 96 1b ef ae 2a cb 2d f7 fe 6d d9 6e db 3d e2 03 d1 
                  00 00 8d 8e 2c 13 49 bf 0a 10 09 74 c0 d9 25 2f 7d 1e 
                  8e f2 f0 ff 79 a4 ce 45 a0 4d a5 8d 4c c5 18 44 66 8a 
                  90 a5 55 c8 6d a9 53 1c a6 d0 47 c1 26 40 45 9f 05 91 
                  41 00 ad e0 03 6a 15 8b fc d4 7c c4 d1 26 34 0d a2 9b 
                  a7 e6 95 c7 

Th field X520 Serial Number Shows it is serial number 5 from the CA. The CA is then identified by its X500 Attributes such as

X520 Common Name

Which is “Self Signed”. The following Python code will dump the signing info for the PEM version of a token. Its basically a stripped down version of this code.

#!/usr/bin/python
import argparse
import base64
import errno
import hashlib
import logging
import zlib

from pyasn1_modules import rfc2315, rfc2459, pem
from pyasn1.codec.der import encoder, decoder

from pyasn1.type import univ, char
from pyasn1.codec.der import decoder as der_decoder


infile=None
parser = argparse.ArgumentParser()
parser.add_argument('infile', metavar='infile', help='token file to decode')
args = parser.parse_args()

in_file = open(args.infile, 'r')

idx, substrate = pem.readPemBlocksFromFile(
    in_file, ('-----BEGIN CMS-----', '-----END CMS-----'))


contentInfo, rest = der_decoder.decode(substrate, 
                                       asn1Spec=rfc2315.ContentInfo())

contentType = contentInfo.getComponentByName('contentType')
contentInfoMap = {
    (1, 2, 840, 113549, 1, 7, 1): rfc2315.Data(),
    (1, 2, 840, 113549, 1, 7, 2): rfc2315.SignedData(),
    (1, 2, 840, 113549, 1, 7, 3): rfc2315.EnvelopedData(),
    (1, 2, 840, 113549, 1, 7, 4): rfc2315.SignedAndEnvelopedData(),
    (1, 2, 840, 113549, 1, 7, 5): rfc2315.DigestedData(),
    (1, 2, 840, 113549, 1, 7, 6): rfc2315.EncryptedData()
}

content, _ = decoder.decode(
    contentInfo.getComponentByName('content'),
    asn1Spec=contentInfoMap[contentType]
)
signerInfos = content.getComponentByName('signerInfos')[0]
issuer_and_sn = signerInfos.getComponentByName('issuerAndSerialNumber')
issuer =  issuer_and_sn.getComponentByName('issuer')
for n in issuer[0]:
    print n[0][0]
    print n[0][1]

It does something wrong with the text fields, I suspect because they are some form of Not–quite-text that needs to be parsed. I don’t think I need to take this farther, however. INstead, I plan on following an example from the OCSP generating sample code that hashes the issuer and compares the hash.

UPDATE: Thanks to gsilvis who pointed out a spurious import in the code sample.

October 09, 2014

Automated configuration analysis for Mozilla’s TLS guidelines

My friend Hubert has been doing a lot of work to make better the world a little safer.  Glad he’s getting some recognition.  Here’s a great article on testing your server for proper SSL/TLS configurations.


Ansible Hostgroups from FreeIPA

Ansible provides management for a large array of servers using ssh as the access mechanism. This is a good match for  FreeIPA.  However, by default Ansible uses a flat file to store groups of hosts.  How can we get that info from FreeIPA?

 

If you want to run the `uptime` command on all web servers, you would define a fragment of /etc/ansible/hosts  like this:

[webservers]

alpha.example.org
beta.example.org
192.168.1.100
192.168.1.110
web1.example.com

And then run

ansible webservers -a uptime

In order to get ansible to use a different scheme, use a dynamic inventory.  I wrote a proof of concept one  that uses the hostgroup definitions from my IPA server to populate a json file.  The format of the file is specified in this tutorial:

My Sample ignores the command line parameters, and just returns the whole set of hostgroups.

#Apache License...

#!/usr/bin/python

import json
from ipalib import api
api.bootstrap(context='cli')
api.finalize()
api.Backend.xmlclient.connect()
inventory = {}
hostvars={}
meta={}
result =api.Command.hostgroup_find()['result']
for hostgroup in result:
    inventory[hostgroup['cn'][0]] = { 'hosts': [host for host in hostgroup['member_host']]}
    for host in hostgroup['member_host']:
        hostvars[host] = {}
inventory['_meta'] = {'hostvars': hostvars}
inv_string = json.dumps( inventory)
print inv_string

I copied it to /etc/ansible/freeipa.py and ran:

 

$ ansible -i /etc/ansible/freeipa.py packstacked -a uptime
ayoungf20packstack.cloudlab.freeipa.org | success | rc=0 >>
20:42:33 up 141 days, 20:43, 2 users, load average: 0.22, 0.15, 0.14

multidom.cloudlab.freeipa.org | success | rc=0 >>
20:42:34 up 52 days, 3:17, 1 user, load average: 0.01, 0.03, 0.05

horizon.cloudlab.freeipa.org | success | rc=0 >>
20:42:35 up 51 days, 6:07, 2 users, load average: 0.00, 0.03, 0.05

As I said, this was a proof of concept. It does not do everything that you might want to have an inventory do. I plan on fleshing it out and submitting to the Ansible plugin repo. Meanwhile, you can look at the other examples.

If you are curious, here is the output from when I run my plugin:

$ python freeipa.py | python -mjson.tool
{
    "_meta": {
        "hostvars": {
            "ayoungf20packstack.cloudlab.freeipa.org": {},
            "horizon.cloudlab.freeipa.org": {},
            "ipa.cloudlab.freeipa.org": {},
            "jboss.cloudlab.freeipa.org": {},
            "multidom.cloudlab.freeipa.org": {}
        }
    },
    "keystone-ha-cluster": {
        "hosts": [
            "horizon.cloudlab.freeipa.org",
            "ipa.cloudlab.freeipa.org",
            "jboss.cloudlab.freeipa.org"
        ]
    },
    "packstacked": {
        "hosts": [
            "ayoungf20packstack.cloudlab.freeipa.org",
            "horizon.cloudlab.freeipa.org",
            "multidom.cloudlab.freeipa.org"
        ]
    }
}

October 08, 2014

The Source of Vulnerabilities, How Red Hat finds out about vulnerabilities.

Red Hat Product Security track lots of data about every vulnerability affecting every Red Hat product. We make all this data available on our Measurement page and from time to time write various blog posts and reports about interesting metrics or trends.

One metric we’ve not written about since 2009 is the source of the vulnerabilities we fix. We want to answer the question of how did Red Hat Product Security first hear about each vulnerability?

Every vulnerability that affects a Red Hat product is given a master tracking bug in Red Hat bugzilla. This bug contains a whiteboard field with a comma separated list of metadata including the dates we found out about the issue, and the source. You can get a file containing all this information already gathered for every CVE. A few months ago we updated our ‘daysofrisk’ command line tool to parse the source information allowing anyone to quickly create reports like this one.

So let’s take a look at some example views of recent data: every vulnerability fixed in every Red Hat product in the 12 months up to 30th August 2014 (a total of 1012 vulnerabilities).

Firstly a chart just giving the breakdown of how we first found out about each issue: Sources of issues

  • CERT: Issues reported to us from a national cert like CERT/CC or CPNI, generally in advance of public disclosure
  • Individual: Issues reported to Red Hat Product Security directly by a customer or researcher, generally in advance of public disclosure
  • Red Hat: Issues found by Red Hat employees
  • Relationship: Issues reported to us by upstream projects, generally in advance of public disclosure
  • Peer vendors: Issues reported to us by other OS distributions, through relationships
    or a shared private forum
  • Internet: For issues not disclosed in advance we monitor a number of mailing lists and security web pages of upstream projects
  • CVE: If we’ve not found out about an issue any other way, we can catch it from the list of public assigned CVE names from Mitre

Next a breakdown of if we knew about the issue in advance. For the purposes of our reports we count knowing the same day of an issue as not knowing in advance, even though we might have had a few hours notice: Known in advanceThere are few interesting observations from this data:

  • Red Hat employees find a lot of vulnerabilities. We don’t just sit back and wait for others to find flaws for us to fix, we actively look for issues ourselves and these are found by engineering, quality assurance, as well as our security teams. 17% of all the issues we fixed in the year were found by Red Hat employees. The issues we find are shared back in advance where possible to upstream and other peer vendors (generally via the ‘distros’ shared private forum).
  • Relationships matter. When you are fixing vulnerabilities in third party software, having a relationship with the upstream makes a big difference. But
    it’s really important to note here that this should never be a one-way street, if an upstream is willing to give Red Hat information about flaws in advance,
    then we need to be willing to add value to that notification by sanity checking the draft advisory, checking the patches, and feeding back the
    results from our quality testing. A recent good example of this is the OpenSSL CCS Injection flaw; our relationship with OpenSSL gave us advance
    notice of the issue and we found a mistake in the advisory as well as a mistake in the patch which otherwise would have caused OpenSSL to have to have
    done a secondary fix after release. Only two of the dozens of companies prenotified about those OpenSSL issues actually added value back to OpenSSL.
  • Red Hat can influence the way this metric looks; without a dedicated security team a vendor could just watch what another vendor does and copy them,
    or rely on public feeds such as the list of assigned CVE names from Mitre. We can make the choice to invest to find more issues and build upstream relationships.

October 07, 2014

Yellow Sticky of Doom in the Cloud

The password managers we discussed in the last post are a good start. If you only use one system a local password database is all you need.

Most people have multiple “devices” – a PC, a laptop, a smartphone, a tablet, and the number keeps growing. It would be terribly convenient to have access to your passwords on all of your devices, and to have everything automatically updated when you add or change a password.

This is where network – or today CLOUD BASED (highlighted for dramatic emphasis…) – password managers come into play. These networked password managers share, distribute, backup, and replicate your passwords.

Putting your passwords IN THE CLOUD should make you nervous. It is important to do your homework before choosing one – don’t just choose the first one that comes up on a search!

There are several places to look. Wikipedia has a List of Password Managers. Information Week has an article on 10 Top Password Managers. Network World published Best tools for protecting passwords. Mac World produced Mac password managers. At a minimum make sure that the password managers you are considering have at least some public review and feedback. You should also do web searches looking for user experience and any issues with the various password managers.

For cloud based password managers, one of the most important things is to make sure that you retain control of the passwords. This is done by encrypting the password data locally, on your system, and only sending encrypted data to the cloud. Done properly, the master encryption password for the password database never leaves your system – no one, including the company hosting your password manager, can decrypt your password. Of course this also means that if you lose your password manager password you are out of luck; no one can recover it.

As an anecdote, not a recommendation, a thoroughly paranoid colleague who works in the security space and whose opinions I respect recommends LastPass.  I prefer open source password managers that can be audited, like KeePassX, but there don’t seem to be any with good Cloud integration.


October 01, 2014

Wherein our hero attempts to build his own OpenStack Keystone RPMs

I have a Devstack setup. I’ve hacked the Keystone repo to add some cool feature.  I want to test it out with an RDO deployment.  How do I make my own RPM for the RDO system?

This is not a how to. This is more like a police log.

sudo mkdir fedora
sudo chown ayoung:ayoung fedora
cd fedora/
git clone http://pkgs.fedoraproject.org/cgit/openstack-keystone.git

What did that get us?

$ ls openstack-keystone
0001-remove-runtime-dep-on-python-pbr.patch
0002-sync-parameter-values-with-keystone-dist.conf.patch
daemon_notify.sh
keystone-dist.conf
openstack-keystone.init
openstack-keystone.logrotate
openstack-keystone-sample-data
openstack-keystone.service
openstack-keystone.spec
openstack-keystone.sysctl
openstack-keystone.upstart
sources

When I build, I do so in /home/ayoung/rpmbuild. How:

$ cat ~/.rpmmacros
%_topdir /home/ayoung/rpmbuild
%packager Adam Young
%dist .f20_ayoung

So, first move these files into %_topdir/SOURCES

 cp * ~/rpmbuild/SOURCES/

Technically you don’t need to move the .spec file there, but it hurts nothing to do so.

Ok, lets make a first attempt at building:

$ rpmbuild -bp openstack-keystone.spec 
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz

So it is looking for a tarball named keystone-2014.2.b3.tar.gz. Where does it get that name:

In openstack-keystone.spec:

%global release_name juno
%global milestone 3

%global with_doc %{!?_without_doc:1}%{?_without_doc:0}

Name:           openstack-keystone
Version:        2014.2
Release:        0.4.b%{milestone}%{?dist}
Summary:        OpenStack Identity Service
...
Source0:        http://launchpad.net/keystone/%{release_name}/%{release_name}-%{milestone}/+download/keystone-%{version}.b%{milestone}.tar.gz

Now, lets assume that I already have that version installed with RDO, and I want to test my change on top of it. If I make a mile > 3 it will update. I could do this by changing the spec file.

changeing
-%global milestone 3
+%global milestone 3a
would give me :
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3a.tar.gz: No such file or directory

But can I do that from the command line? Sure. Use -D, and put hta macro in quotes:

$ rpmbuild -D "milestone 3b"  -bp openstack-keystone.spec 
error: File /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz: No such file or directory

Ok, so now I need a tarball that looks like the file name I will use in the rpmbuild process. Use git-archive to produce it.

cd /opt/stack/keystone
git archive -o /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz  HEAD
cd /opt/fedora/openstack-keystone/
 rpmbuild -bp openstack-keystone.spec
error: Failed build dependencies:
	python-pbr is needed by openstack-keystone-2014.2-0.4.b3.f20_ayoung.noarch

Alright! Closer. There is a Yum utility to help out with this last problem. TO get the utilities:

sudo yum install yum-utils

then to get the build dependencies:

sudo yum-builddep openstack-keystone.spec

The next time running an rpmbuild gets the error:

+ cd keystone-2014.2.b3
/var/tmp/rpm-tmp.NSt5Oy: line 37: cd: keystone-2014.2.b3: No such file or directory

Due to the fact that the archive command above is not putting the code in a subdir. There is a flag for that: –prefix

cd /opt/stack/keystone
git archive  --prefix=keystone-2014.2.b3/   -o /home/ayoung/rpmbuild/SOURCES/keystone-2014.2.b3.tar.gz  HEAD
cd /opt/fedora/openstack-keystone
rpmbuild -bp openstack-keystone.spec
+ echo 'Patch #1 (0001-remove-runtime-dep-on-python-pbr.patch):'
Patch #1 (0001-remove-runtime-dep-on-python-pbr.patch):
+ /usr/bin/cat /home/ayoung/rpmbuild/SOURCES/0001-remove-runtime-dep-on-python-pbr.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file bin/keystone-all

Now I get an error on a patch application: this is better, as it implies I am packing the tarball correctly. That change is basically removing pbr from the keystone-all startup file. PBR does versioning type stuff that competes with RPM.

Let’s ignore that patch for now: Keystone will run just fine with PBR in. In the spec file I comment out:

%patch0001 -p1

Code in the spec file is doing something comparable, and fails out due to me skipping first patch.

+ sed -i s/REDHATKEYSTONEVERSION/2014.2/ bin/keystone-all keystone/cli.py
+ sed -i s/2014.2.b3/2014.2/ PKG-INFO

Skip that code for now, too.

--- a/openstack-keystone.spec
+++ b/openstack-keystone.spec
@@ -117,7 +117,7 @@ This package contains documentation for Keystone.
 %prep
 %setup -q -n keystone-%{version}.b%{milestone}
 
-%patch0001 -p1
+#%patch0001 -p1
 %patch0002 -p1
 
 find . \( -name .gitignore -o -name .placeholder \) -delete
@@ -127,9 +127,9 @@ rm -rf keystone.egg-info
 # Let RPM handle the dependencies
 rm -f test-requirements.txt requirements.txt
 # Remove dependency on pbr and set version as per rpm
-sed -i s/REDHATKEYSTONEVERSION/%{version}/ bin/keystone-all keystone/cli.py
+#sed -i s/REDHATKEYSTONEVERSION/%{version}/ bin/keystone-all keystone/cli.py
 
-sed -i 's/%{version}.b%{milestone}/%{version}/' PKG-INFO
+#sed -i 's/%{version}.b%{milestone}/%{version}/' PKG-INFO

And the build works. Thus far, I’ve been running -bp which just preps the source tree in the repo. Lets see what the full rpmbuild process returns:

+ cp etc/keystone.conf.sample etc/keystone.conf
+ /usr/bin/python setup.py build
error in setup command: Error parsing /home/ayoung/rpmbuild/BUILD/keystone-2014.2.b3/setup.cfg:Exception: Versioning for this project requires either an sdist tarball, or access to an upstream git repository. Are you sure that git is installed?

There is that python build reasonableness change. OK, what is that doing? Lets start with the last line we commented out in the spec file:

+#sed -i ‘s/%{version}.b%{milestone}/%{version}/’ PKG-INF

This is putting the RPM version of the python library into the Python EGG info. In my repo I have

keystone.egg-info/PKG-INFO

with

Version: 2014.2.dev154.g1af2428

I wonder if I can make the RPM version look like that? It won’t be an rpm -U to go from my RDO deployment if I do, but I can work around that with –oldpackage.

What is dev154.g1af2428? I suspect some git magic for the last part. Git log shows:

commit 1af24284bdc093dae4f027ade2ddb29656b676f0

So g1af2428? is g + githash[0:6] but what about dev154? It turns out it is the number of commits since the last tag.

At this point, I resorted to IRC, and then reading the code. The short of it is that I am submitting a patch to the keystone spec file to remove all of the PBR related changes. Instead, when python setup.py is called, we’ll use the environment variable to tell PBR the version number. For example

PBR_VERSION=%{version}.%{milestone}  %{__python} setup.py build

September 30, 2014

LDAP persistent searches with ldapjdk

As part of the LDAP-based profiles feature I’ve been working on for the Dogtag, it was necessary to implement a feature where the database is monitored for changes to the LDAP profiles. For example, when a profile is updated on a clone, that change is replicated to other clones, and those other clones have to detect that change and each instance must update its view of the profiles accordingly. This post details how the LDAP persistent search feature was used to implement this behaviour.

A naïve approach to solving this problem would have been to unconditionally refresh all profiles at a certain interval. Slightly better would be to check all profiles at a certain interval and update those that have changed. Both of these methods involve some non-trivial delay between changes being replicated to the local database, and the profile subsystem reflecting those changes.

A different approach was to use the LDAP persistent search capability. With this feature, once the search is running, the client receives immediate notification of changes. This advantage commended it over the polling approach as a more appropriate basis for a solution.

ldapjdk persistent search API

A big part of the motivation for this post was the paucity of the ldapjdk documentation with respect to persistent searches. The necessary information is all there – but it is scattered across several classes, all of which play some important part in a working implementation, but none of which tells the full story.

Hopefully some people will benefit from this information being brought together in one place and explained step by step. Let’s look at the classes involved one by one as we build up the solution.

LDAPPersistSearchControl

This is the server control that activates the persistent search behaviour. It also provides static flags for specifying what kinds of updates to listen for. Its constructor takes a union of these flags and three boolean values:

changesOnly

Whether to return existing entries that match the search criteria. For our use case, we are only interested in changes.

returnControls

Whether to return entry change controls with each search result. These controls are required if you need to know what kind of change occured (add, modify, delete or modified DN).

isCritical

Whether this control is critical to the search operation.

The LDAPPersistSearchControl object used for our persistent search is constructed in the following way:

int op = LDAPPersistSearchControl.ADD
    | LDAPPersistSearchControl.MODIFY
    | LDAPPersistSearchControl.DELETE
    | LDAPPersistSearchControl.MODDN;
LDAPPersistSearchControl persistCtrl =
    new LDAPPersistSearchControl(op, true, true, true);

LDAPSearchConstraints

The LDAPSearchConstraints object sets various controls and parameters for an LDAP search, persistent or otherwise. In our case, we need to attach the LDAPPersistSearchControl to the constraints, as well as disable the timeout of the search, and set the results batch size to 1 so that no buffering of results will occur at the server:

LDAPSearchConstraints cons = conn.getSearchConstraints();
cons.setServerControls(persistCtrl);
cons.setBatchSize(1);
cons.setServerTimeLimit(0 /* seconds */);

LDAPSearchResults

Executing the search method of an LDAPConnection (here named conn), yields an LDAPSearchResults object. This is the same whether or the search was a persistent search according to the LDAPSearchConstraints. The different between persistent and non-persistent searches is in how results are retrieved from the results object: if the search is persistent, the hasMoreElement method will block until the next result is received from the server (or the search times out, the connection dies, et cetera).

Let’s see what it looks like to actually execute the persistent search and process its results:

LDAPConnection conn = ... /* an open LDAPConnection */

LDAPSearchResults results = conn.search(
    "ou=certificateProfiles,ou=ca," + basedn, /* search DN */
    LDAPConnection.SCOPE_ONE, /* search at one level below DN */
    "(objectclass=*)",        /* search filter */
    null,   /* list of attributes we care about */
    false,  /* whether to only include specified attributes */
    cons    /* LDAPSearchConstraints defined above */
);
while (results.hasMoreElements()) /* blocks */ {
    LDAPEntry entry = results.next();
    /* ... process result ... */
}

We see that apart from the use of the LDAPSearchConstraints to specify a persistent search and the blocking behaviour of LDAPSearchResults.hasMoreElements, performing a persistent search is the same as performing a regular search.

Let us next examine what happens inside that while loop.

LDAPEntryChangeControl

Do you recall the returnControls parameter for LDAPPersistSearchControl? If true, it ensures that each entry returned by the persistent search is accompanied by a control that indicates the type of change that affected the entry. We need to know this information so that we can update the profile subsystem in the appropriate way (was this profile added, updated, or deleted?)

Let’s look at how we do this. We are inside the while loop from above, starting exactly where we left off:

LDAPEntry entry = results.next();
LDAPEntryChangeControl changeControl = null;
for (LDAPControl control : results.getResponseControls()) {
    if (control instanceof LDAPEntryChangeControl) {
        changeControl = (LDAPEntryChangeControl) control;
        break;
    }
}
if (changeControl != null) {
    int changeType = changeControl.getChangeType();
    switch (changeType) {
    case LDAPPersistSearchControl.ADD:
        readProfile(entry);
        break;
    case LDAPPersistSearchControl.DELETE:
        forgetProfile(entry);
        break;
    case LDAPPersistSearchControl.MODIFY:
        forgetProfile(entry);
        readProfile(entry);
        break;
    case LDAPPersistSearchControl.MODDN:
        /* shouldn't happen; log a warning and continue */
        CMS.debug("Profile change monitor: MODDN shouldn't happen; ignoring.");
        break;
    default:
        /* shouldn't happen; log a warning and continue */
        CMS.debug("Profile change monitor: unknown change type: " + changeType);
        break;
    }
} else {
    /* shouldn't happen; log a warning and continue */
    CMS.debug("Profile change monitor: no LDAPEntryChangeControl in result.");
}

The first thing that has to be done is to retrieve from the LDAPSearchResults object the LDAPEntryChangeControl for the most recent search result. To do this we call results.getResponseControls(), which returns an LDAPControl[]. Each search result can arrive with multiple change controls, but we are specifically interested in the LDAPEntryChangeControl so we iterate over the LDAPControl[] until we find what we want, then break.

Next we ensure that we did in fact find the LDAPEntryChangeControl. This should always hold in our implementation but the code should handle the failure case anyway – we just log a warning and move on.

Finally, we call changeControl.getChangeType() and dispatch to the appropriate behaviour according to its value.

Interaction with the profile subsystem

Up to this point, we have seen how to use the ldapjdk API to execute a persistent LDAP search and process its results. Of course, this is just part of the story – the search somehow needs to be run in a way that doesn’t impede the regular operation of the Dogtag PKI, and needs to safely interact with the profile subsystem. Because the persistent search involves blocking calls, the procedure needs to run in its own thread.

Because this persistent search only concerns the ProfileSubsystem class, it was possible to completely encapsulate it within this class such that no changes to its API (including constructors) were necessary. An inner class Monitor, which extends Thread, actually runs the search. In this way, the code we saw above is neatly segregated from the rest of the ProfileSubsystem class, and there are no visibility issues when calling the readProfile and forgetProfile methods of the other class.

The following simplified code conveys the essence of the complete implementation:

public class ProfileSubsystem implements IProfileSubsystem {
    public void init(...) {
        // Read profiles from LDAP into the subsystem.
        // Calls readProfile for each existing LDAPEntry.

        monitor = new Monitor(this, dn, dbFactory);
        monitor.start();
    }

    public synchronized IProfile createProfile(...) {
        // Create the profile
    }

    public void readProfile(LDAPEntry entry) {
        // Read some LDAP attributes into local vars
        createProfile(...);
    }

    private void forgetProfile(LDAPEntry entry) {
        profileId = /* read from entry */
        forgetProfile(profileId);
    }

    private void forgetProfile(String profileId) {
        // Forget about this profile.
    }

    private class Monitor extends Thread {
        public Monitor(...) {
            // constructor
        }

        public void run() {
            // Execute the persistent search as above.
            //
            // Calls readProfile and forgetProfile depending
            // on changes that occur.
        }
    }
}

So, what’s going on here? First of all, it must be emphasised that this example is simplified. For example, I have omitted details of how the monitor thread is stopped when the subsystem is shut down or reinitialised.

The monitor thread is started by the init method, once the existing profiles have been read into the profile subsystem. Executing the persistent search and handling results is the one job this the monitor has to do, so it can block without affecting any other part of the system. When it receives results, it calls the readProfile and forgetProfiles methods of the outer class – the ProfileSubsystem – to keep it up to date with the contents of the database.

Other parts of the system access the ProfileSubsystem as well, so consideration had to be given to synchronisation and making sure that changes to the contents of the ProfileSubsystem are done safely. In the end, the only method that was made synchronized was createProfile, which is also called by the REST interface. The behaviour of the handful of other methods that could be called simultaneously should be fine by virtue of the fact that the internal data structures used are themselves synchronised and idempotent. Hopefully I have not overlooked something important!

Conclusion

LDAP persistent searches can be used to receive immediate notification of changes that occur in an LDAP database. They support all the parameters of regular LDAP searches. ldapjdk’s API provides persistent search capabilities including the ability to discern what kind of change occurred for each result.

The ldapjdk LDAPSearchResults.hasMoreElements() method blocks each time it is called until a result has been received from the server. Because of this, it will usually be necessary to execute persistent searches asynchronously. Java threads can be employed to do this, but the usual "gotchas" of threading apply – threads must be stopped safely and the safety of methods that could be called from multiple places at the same time must be assessed. The synchronized keyword can be used to ensure serialisation of calls to methods that would otherwise be unsafe under these conditions.

September 29, 2014

Multiple Signers

You have a cloud, I have a cloud.

Neither of use are willing to surrender control of our OpenStack deployments, but we need to inter-operate.

We both have Keystone servers. Those servers are the system of record for user authorization through out our respective deployments. We each wish to maintain control of our assignments. How can we make a set of resources that can be shared?  It can’t be done today.  Here is why not, and how to make it possible.

Simple example:  Nova in OpenStack Deployment A (OSDA) creates a VM using an image from Glance in OpenStackDeployment B. Under the rules of policy, the VM and the image must both reside in the same project. However, the endpoints are managed by two different Keystone servers.

At the Keystone to Keystone level, we need to provide a shared view of the project. Since every project is owned by a domain, the correct level of abstraction is for the two Keystone servers to have a common view of the domain.  In our example we create a Domain in OSDA named “SharedDom” and assume it assigns “54321CBE” as the domain id. Keystone for OSDB needs to have an identical domain. Lets loosen the rules on domain creation to allow that: create domain takes both name and accepts “54321CBE” as the domain’s identifier.

Once we have the common domain definition, we can create a project under “SharedDom” in OSDA and, again, put a copy into OSDB.

Let us also assume that the service catalog has been synchronized across the two Keystone images. While you might be tempted to just make each a region in one Keystone server, that surrenders the control of the cloud to the remote Keystone admin, and so will not be organizationally acceptable. So, while we should tag each service catalog as a region, and keep the region names generally unique, they still are managed by distinct peer Keystone servers.

The API call that creates a VM in one endpoint by fetching an image from another accesses both endpoints with a single token. It is impossible to do today. Which keystone is then used to allocate the token? If it is OSDA, then Glance would reject it. if OSDB, Nova would reject it.

For PKI tokens, we can extract the “Signing Data” from the token body and determine which certificate signed it. From the certificate we can determine which Keystone server signed the token. Keystone would then have to provide rules for which Keystone server was allowed to sign for a domain. By default, it would be only the authoritative Keystone server associated with the same deployment. However, Keystone from OSDA would allow Keystone from OSDB to sign for Tokens under “SharedDom” only.

If both keystone servers are using UUID tokens we could arrange for the synchronization of all tokens for the shared domain across keystone instances. Now each endpoint would still authenticate tokens against its own Keystone server. The rules would be the same as far as determining which Keystone could sign for which tokens, it would just be enforced at the Keystone server level. My biggest concern with this approach is that there are synchronization problems; it has a tendency toward race conditions. But it could be made to work as well.

The agreement can be set one direction.  If OSDA  is managing the domain, and OSDB accepts the agreement, then OSDA can sign for tokens for the domain that are used across both deployments endpoints.   Keystone for OSDA must explicitly add a rule that says that Keystone for OSDB can sign for tokens as well.

This setup works  on a gentleman’s agreement genteel understanding that only one of the two Keystone servers  actively manages the domain.  One Keystone server is delegated the responsibility of managing the domain.  The other one performs that delegation, and agrees to stay out of it.  If the shared domain is proving to be too much trouble, either Keystone server can disable the domain without the agreement of the other Keystone server.

As Mick Jagger sang “Hey, you!  Get off of my cloud!”

September 26, 2014

A follow up to the Bash Exploit and SELinux.
One of the advantages of a remote exploit is to be able to setup and launch attacks on other machines.

I wondered if it would be possible to setup a bot net attack using the remote attach on an apache server with the bash exploit.

Looking at my rawhide machine's policy

sesearch -A -s httpd_sys_script_t -p name_connect -C | grep -v ^D
Found 24 semantic av rules:
   allow nsswitch_domain dns_port_t : tcp_socket { recv_msg send_msg name_connect } ;
   allow nsswitch_domain dnssec_port_t : tcp_socket name_connect ;
ET allow nsswitch_domain ldap_port_t : tcp_socket { recv_msg send_msg name_connect } ; [ authlogin_nsswitch_use_ldap ]


The apache script would only be allowed to connect/attack a dns server and an LDAP server.  It would not be allowed to become a spam bot (No connection to mail ports) or even attack other web service.

Could an attacker leave a back door to be later connected to even after the bash exploit is fixed?

# sesearch -A -s httpd_sys_script_t -p name_bind -C | grep -v ^D
#

Nope!  On my box the httpd_sys_script_t process is not allowed to listen on any network ports.

I guess the crackers will just have to find a machine with SELinux disabled.

September 25, 2014

What does SELinux do to contain the the bash exploit?
Do you have SELinux enabled on your Web Server?

Lots of people are asking me about SELinux and the Bash Exploit.

I did a quick analysis on one reported remote Apache exploit:

http://www.reddit.com/r/netsec/comments/2hbxtc/cve20146271_remote_code_execution_through_bash/




Shows an example of the bash exploit on an apache server.  It even shows that SELinux was enforcing when the exploit happened.




SELinux does not block the exploit but it would prevent escallation of confined domains.
Why didn't SELinux block it?

SELinux controls processes based on their types, if the process is doing what it was designed to do then SELinux will not block it.

In the defined exploit the apache server is running as httpd_t and it is executing a cgi script which would be labeled httpd_sys_script_exec_t.  

When httpd_t executes a script labeled httpd_sys_script_exec_t SELinux will transition the new process to httpd_sys_script_t.

SELinux policy allowd processes running as httpd_sys_script_t is to write to /tmp, so it was successfull in creating /tmp/aa.

If you did this and looked at the content in /tmp it would be labeled httpd_tmp_t

httpd_tmp_t.

Lets look at which files httpd_sys_script_t is allowed to write to on my Rawhide box.

# sesearch -A -s httpd_sys_script_t -c file -p write -C | grep open | grep -v ^D
   allow httpd_sys_script_t httpd_sys_rw_content_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow httpd_sys_script_t anon_inodefs_t : file { ioctl read write getattr lock append open } ; 
   allow httpd_sys_script_t httpd_sys_script_t : file { ioctl read write getattr lock append open } ; 
   allow httpd_sys_script_t httpd_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 

httpd_sys_script_t is a process label which only applies to content in /proc.  This means processes running as httpd_sys_script_t can write to there process data.

anon_inodefs_t is an in memory label, most likely not on your disk.

The only on disk places it can write files labeled httpd_sys_rw_content_t and /tmp.

grep httpd_sys_rw_content_t /etc/selinux/targeted/contexts/files/file_contexts

or on my box

# find /etc -context "*:httpd_sys_rw_content_t:*"
/etc/BackupPC
/etc/BackupPC/config.pl
/etc/BackupPC/hosts
/etc/glpi

With SELinux disabled, this hacked process would be allowed to write any content that is world writable on your system as well as any content owned by the apache user or group.

Lets look at what it can read.

sesearch -A -s httpd_sys_script_t -c file -p read -C | grep open | grep -v ^D | grep -v exec_t
   allow domain locale_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t iso9660_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t httpd_sys_ra_content_t : file { ioctl read create getattr lock append open } ; 
   allow httpd_sys_script_t httpd_sys_rw_content_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow httpd_sys_script_t squirrelmail_spool_t : file { ioctl read getattr lock open } ; 
   allow domain ld_so_t : file { ioctl read getattr execute open } ; 
   allow httpd_sys_script_t anon_inodefs_t : file { ioctl read write getattr lock append open } ; 
   allow httpd_sys_script_t sysctl_kernel_t : file { ioctl read getattr lock open } ; 
   allow domain base_ro_file_type : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t httpd_sys_script_t : file { ioctl read write getattr lock append open } ; 
   allow nsswitch_domain cert_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type etc_runtime_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type fonts_cache_t : file { ioctl read getattr lock open } ; 
   allow domain mandb_cache_t : file { ioctl read getattr lock open } ; 
   allow domain abrt_t : file { ioctl read getattr lock open } ; 
   allow domain lib_t : file { ioctl read getattr lock execute open } ; 
   allow domain man_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t cifs_t : file { ioctl read getattr lock execute execute_no_trans entrypoint open } ; 
   allow domain sysctl_vm_overcommit_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t nfs_t : file { ioctl read getattr lock execute execute_no_trans entrypoint open } ; 
   allow kernel_system_state_reader proc_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain passwd_file_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain sssd_public_t : file { ioctl read getattr lock open } ; 
   allow domain cpu_online_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type public_content_rw_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain etc_runtime_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain hostname_etc_t : file { ioctl read getattr lock open } ; 
   allow domain ld_so_cache_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain sssd_var_lib_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type public_content_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain krb5_conf_t : file { ioctl read getattr lock open } ; 
   allow domain abrt_var_run_t : file { ioctl read getattr lock open } ; 
   allow domain textrel_shlib_t : file { ioctl read getattr execute execmod open } ; 
   allow httpd_sys_script_t httpd_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow domain machineid_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t mysqld_etc_t : file { ioctl read getattr lock open } ; 
   allow domain rpm_script_tmp_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain samba_var_t : file { ioctl read getattr lock open } ; 
   allow domain sysctl_crypto_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain net_conf_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type etc_t : file { ioctl read getattr execute execute_no_trans open } ; 
   allow httpd_script_type fonts_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type ld_so_t : file { ioctl read getattr execute execute_no_trans open } ; 
   allow nsswitch_domain file_context_t : file { ioctl read getattr lock open } ; 
   allow httpd_sys_script_t httpd_squirrelmail_t : file { ioctl read getattr lock append open } ; 
   allow httpd_script_type base_ro_file_type : file { ioctl read getattr lock execute execute_no_trans open } ; 
   allow httpd_sys_script_t snmpd_var_lib_t : file { ioctl read getattr lock open } ; 
   allow nsswitch_domain samba_etc_t : file { ioctl read getattr lock open } ; 
   allow domain man_cache_t : file { ioctl read getattr lock open } ; 
   allow httpd_script_type bin_t : file { ioctl read getattr lock execute execute_no_trans open } ; 
   allow httpd_script_type lib_t : file { ioctl read getattr lock execute execute_no_trans open } ; 
   allow httpd_sys_script_t httpd_sys_content_t : file { ioctl read getattr lock execute execute_no_trans entrypoint open } ; 
   allow nsswitch_domain etc_t : file { ioctl read getattr lock open } ; 
ET allow nsswitch_domain cert_t : file { ioctl read getattr lock open } ; [ authlogin_nsswitch_use_ldap ]
ET allow nsswitch_domain slapd_cert_t : file { ioctl read getattr lock open } ; [ authlogin_nsswitch_use_ldap ]
ET allow nsswitch_domain net_conf_t : file { ioctl read getattr lock open } ; [ authlogin_nsswitch_use_ldap ]
ET allow domain sysctl_kernel_t : file { ioctl read getattr lock open } ; [ fips_mode ]

Looks like a lot of types, but most of these are System Types bin_t, lib_t ld_so_t, man_t fonts_t,  most stuff under /usr etc.

It would be allowed to read /etc/passwd (passwd_t) and most content in /etc.  

It can read apache static content, like web page data.

Well what can't it read?

user_home_t - This is where I keep my credit card data
usr_tmp_t where an admin might have left something
Other content in /var
*db_t - No database data.
It can not read most of apache runtime data like apache content in /var/lib or /var/log or /etc/httpd

With SELinux disabled, this process would be allowed to read any content that is world readable on your system as well as any content owned by the apache user our group.

We also need to look at what domains httpd_sys_script_t can transition to?

# sesearch -T -s httpd_sys_script_t -c process -C | grep -v ^D
Found 9 semantic te rules:
   type_transition httpd_sys_script_t httpd_rotatelogs_exec_t : process httpd_rotatelogs_t; 
   type_transition httpd_sys_script_t abrt_helper_exec_t : process abrt_helper_t; 
   type_transition httpd_sys_script_t antivirus_exec_t : process antivirus_t; 
   type_transition httpd_sys_script_t sepgsql_ranged_proc_exec_t : process sepgsql_ranged_proc_t; 
   type_transition httpd_sys_script_t sepgsql_trusted_proc_exec_t : process sepgsql_trusted_proc_t; 

SELinux would also block the process executing a setuid process to raise its capabilities.

Now this is a horrible exploit but as you can see SELinux would probably have protected a lot/most of your valuable data on your machine.  It would buy you time for you to patch your system.

Did you setenforce 1?

September 24, 2014

Bash specially-crafted environment variables code injection attack

Update 2014-09-30 19:30 UTC

Questions have arisen around whether Red Hat products are vulnerable to CVE-2014-6277 and CVE-2014-6278.  We have determined that RHSA-2014:1306, RHSA-2014:1311, and RHSA-2014:1312 successfully mitigate the vulnerability and no additional actions need to be taken.


 

Update 2014-09-26 12:00 UTC

We have written a FAQ to address some of the more common questions seen regarding the recent bash issues.

Frequently Asked Questions about the Shellshock Bash flaws


Update 2014-09-26 02:20 UTC

Red Hat has released patched versions of Bash that fix CVE-2014-7169.  Information regarding these updates can be found in the errata.  All customers are strongly encouraged to apply the update as this flaw is being actively attacked in the wild.
Fedora has also released a patched version of Bash that fixes CVE-2014-7169.  Additional information can be found on Fedora Magazine.

Update 2014-09-25 16:00 UTC

Red Hat is aware that the patch for CVE-2014-6271 is incomplete. An attacker can provide specially-crafted environment variables containing arbitrary commands that will be executed on vulnerable systems under certain conditions. The new issue has been  assigned CVE-2014-7169.
We are working on patches in conjunction with the upstream developers as a critical priority. For details on a workaround, please see the knowledgebase article.
Red Hat advises customers to upgrade to the version of Bash which contains the fix for CVE-2014-6271 and not wait for the patch which fixes CVE-2014-7169. CVE-2014-7169 is a less severe issue and patches for it are being worked on.

Bash or the Bourne again shell, is a UNIX like shell, which is perhaps one of the most installed utilities on any Linux system. From its creation in 1980, Bash has evolved from a simple terminal based command interpreter to many other fancy uses.

In Linux, environment variables provide a way to influence the behavior of software on the system. They typically consists of a name which has a value assigned to it. The same is true of the Bash shell. It is common for a lot of programs to run Bash shell in the background. It is often used to provide a shell to a remote user (via ssh, telnet, for example), provide a parser for CGI scripts (Apache, etc) or even provide limited command execution support (git, etc)

Coming back to the topic, the vulnerability arises from the fact that you can create environment variables with specially-crafted values before calling the Bash shell. These variables can contain code, which gets executed as soon as the shell is invoked. The name of these crafted variables does not matter, only their contents. As a result, this vulnerability is exposed in many contexts, for example:

  • ForceCommand is used in sshd configs to provide limited command execution capabilities for remote users. This flaw can be used to bypass that and provide arbitrary command execution. Some Git and Subversion deployments use such restricted shells. Regular use of OpenSSH is not affected because users already have shell access.
  • Apache server using mod_cgi or mod_cgid are affected if CGI scripts are either written in Bash, or spawn subshells. Such subshells are implicitly used by system/popen in C, by os.system/os.popen in Python, system/exec in PHP (when run in CGI mode), and open/system in Perl if a shell is used (which depends on the command string).
  • PHP scripts executed with mod_php are not affected even if they spawn subshells.
  • DHCP clients invoke shell scripts to configure the system, with values taken from a potentially malicious server. This would allow arbitrary commands to be run, typically as root, on the DHCP client machine.
  • Various daemons and SUID/privileged programs may execute shell scripts with environment variable values set / influenced by the user, which would allow for arbitrary commands to be run.
  • Any other application which is hooked onto a shell or runs a shell script as using Bash as the interpreter. Shell scripts which do not export variables are not vulnerable to this issue, even if they process untrusted content and store it in (unexported) shell variables and open subshells.

Like “real” programming languages, Bash has functions, though in a somewhat limited implementation, and it is possible to put these Bash functions into environment variables. This flaw is triggered when extra code is added to the end of these function definitions (inside the enivronment variable). Something like:

$ env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
 vulnerable
 this is a test

The patch used to fix this flaw, ensures that no code is allowed after the end of a Bash function. So if you run the above example with the patched version of Bash, you should get an output similar to:

 $ env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
 bash: warning: x: ignoring function definition attempt
 bash: error importing function definition for `x'
 this is a test

We believe this should not affect any backward compatibility. This would, of course, affect any scripts which try to use environment variables created in the way as described above, but doing so should be considered a bad programming practice.

Red Hat has issued security advisories that fixes this issue for Red Hat Enterprise Linux. Fedora has also shipped packages that fixes this issue.

We have additional information regarding specific Red Hat products affected by this issue that can be found at https://access.redhat.com/site/solutions/1207723

Information on CentOS can be found at http://lists.centos.org/pipermail/centos/2014-September/146099.html.

September 18, 2014

Enterprise Linux 5.10 to 5.11 risk report

Red Hat Enterprise Linux 5.11 was released this month (September 2014), eleven months since the release of 5.10 in October 2013. So, as usual, let’s use this opportunity to take a look back over the vulnerabilities and security updates made in that time, specifically for Red Hat Enterprise Linux 5 Server.

Red Hat Enterprise Linux 5 is in Production 3 phase, being over seven years since general availability in March 2007, and will receive security updates until March 31st 2017.

Errata count

The chart below illustrates the total number of security updates issued for Red Hat Enterprise Linux 5 Server if you had installed 5.10, up to and including the 5.11 release, broken down by severity. It’s split into two columns, one for the packages you’d get if you did a default install, and the other if you installed every single package.

Note that during installation there actually isn’t an option to install every package, you’d have to manually select them all, and it’s not a likely scenario. For a given installation, the number of package updates and vulnerabilities that affected your systems will depend on exactly what you selected during installation and which packages you have subsequently installed or removed.

Security errata 5.10 to 5.11 Red Hat Enterprise Linux 5 ServerFor a default install, from release of 5.10 up to and including 5.11, we shipped 41 advisories to address 129 vulnerabilities. 8 advisories were rated critical, 11 were important, and the remaining 22 were moderate and low.

For all packages, from release of 5.10 up to and including 5.11, we shipped 82 advisories to address 298 vulnerabilities. 12 advisories were rated critical, 29 were important, and the remaining 41 were moderate and low.

You can cut down the number of security issues you need to deal with by carefully choosing the right Red Hat Enterprise Linux variant and package set when deploying a new system, and ensuring you install the latest available Update release.

Critical vulnerabilities

Vulnerabilities rated critical severity are the ones that can pose the most risk to an organisation. By definition, a critical vulnerability is one that could be exploited remotely and automatically by a worm. However we also stretch that definition to include those flaws that affect web browsers or plug-ins where a user only needs to visit a malicious (or compromised) website in order to be exploited. Most of the critical vulnerabilities we fix fall into that latter category.

The 12 critical advisories addressed 33 critical vulnerabilities across just three different projects:

  • An update to NSS/NSPR: RHSA-2014:0916(July 2014). A race condition was found in the way NSS verified certain certificates which could lead to arbitrary code execution with the privileges of the user running that application.
  • Updates to PHP, PHP53: RHSA-2013:1813, RHSA-2013:1814
    (December 2013). A flaw in the parsing of X.509 certificates could allow scripts using the affected function to potentially execute arbitrary code. An update to PHP: RHSA-2014:0311
    (March 2014). A flaw in the conversion of strings to numbers could allow scripts using the affected function to potentially execute arbitrary code.
  • Updates to Firefox, RHSA-2013:1268 (September 2013), RHSA-2013:1476 (October 2013), RHSA-2013:1812 (December 2013), RHSA-2014:0132 (February 2014), RHSA-2014:0310 (March 2014), RHSA-2014:0448 (Apr 2014), RHSA-2014:0741 (June 2014), RHSA-2014:0919 (July 2014) where a malicious web site could potentially run arbitrary code as the user running Firefox.

Updates to correct 32 of the 33 critical vulnerabilities were available via Red Hat Network either the same day or the next calendar day after the issues were public.

Overall, for Red Hat Enterprise Linux 5 since release until 5.11, 98% of critical vulnerabilities have had an update available to address them available from the Red Hat Network either the same day or the next calendar day after the issue was public.

Other significant vulnerabilities

Although not in the definition of critical severity, also of interest are other remote flaws and local privilege escalation flaws:

  • A flaw in glibc, CVE-2014-5119, fixed by RHSA-2014:1110 (August 2014). A local user could use this flaw to escalate their privileges. A public exploit is available which targets the polkit application on 32-bit systems although polkit is not shipped in Red Hat Enterprise Linux 5. It may be possible to create an exploit for Red Hat Enterprise Linux 5 by targeting a different application.
  • Two flaws in squid, CVE-2014-4115, and CVE-2014-3609, fixed by RHSA-2014:1148 (September 2014). A remote attacker could cause Squid to crash.
  • A flaw in procmail, CVE-2014-3618, fixed by RHSA-2014:1172 (September 2014). A remote attacker could send an email with specially crafted headers that, when processed by formail, could cause procmail to crash or, possibly, execute arbitrary code as the user running formail.
  • A flaw in Apache Struts, CVE-2014-0114, fixed by RHSA-2014:0474 (April 2014). A remote attacker could use this flaw to manipulate the ClassLoader used by an application server running Stuts 1 potentially leading to arbitrary code execution under some conditions.
  • A flaw where yum-updatesd did not properly perform RPM signature checks, CVE-2014-0022, fixed by RHSA-2014:1004 (Jan 2014). Where yum-updatesd was configured to automatically install updates, a remote attacker could use this flaw to install a malicious update on the target system using an unsigned RPM or an RPM signed with an untrusted key.
  • A flaw in the kernel floppy driver, CVE-2014-1737, fixed by RHSA-2014:0740 (June 2014). A local user who has write access to /dev/fdX on a system with floppy drive could use this flaw to escalate their privileges. A public exploit is available for this issue. Note that access to /dev/fdX is by default restricted only to members of the floppy group.
  • A flaw in libXfont, CVE-2013-6462, fixed by RHSA-2014:0018 (Jan 2014). A local user could potentially use this flaw to escalate their privileges to root.
  • A flaw in xorg-x11-server, CVE-2013-6424, fixed by RHSA-2013:1868 (Dec 2013). An authorized client could potentially use this flaw to escalate their privileges to root.
  • A flaw in the kernel QETH network device driver, CVE-2013-6381, fixed by RHSA-2014:0285 (March 2014). A local, unprivileged user could potentially use this flaw to escalate their privileges. Note this device is only found on s390x architecture systems.

Note that Red Hat Enterprise Linux 5 was not affected by the OpenSSL issue, CVE-2014-0160, “Heartbleed”.

Previous update releases

We generally measure risk in terms of the number of vulnerabilities, but the actual effort in maintaining a Red Hat Enterprise Linux system is more related to the number of advisories we released: a single Firefox advisory may fix ten different issues of critical severity, but takes far less total effort to manage than ten separate advisories each fixing one critical PHP vulnerability.

To compare these statistics with previous update releases we need to take into account that the time between each update release is different. So looking at a default installation and calculating the number of advisories per month gives the following chart:

Security Errata per month to 5.10This data is interesting to get a feel for the risk of running Enterprise Linux 5 Server, but isn’t really useful for comparisons with other major versions, distributions, or operating systems — for example, a default install of Red Hat Enterprise Linux 4AS did not include Firefox, but 5 Server does. You can use our public security measurement data and tools, and run your own custom metrics for any given Red Hat product, package set, time scales, and severity range of interest.

See also:
5.10, 5.9, 5.8, 5.7, 5.6, 5.5, 5.4, 5.3, 5.2, and 5.1 risk reports.

September 16, 2014

Electronic Yellow Sticky of Doom

Instead of writing passwords on a piece of paper, you can save them on the computer. The most obvious way to do this is with a text document or a spreadsheet.

Bad idea!

Your password is now subject to being hacked. Since your computer is almost certainly connected to the network (or you wouldn’t have so many passwords!), you are now threatened by the entire Internet, not just the people around you.

OK, you can save the document or spreadsheet as a password protected document. Less bad, still not good. It is a step in the right direction, but there are better ways to do it. This is a situation where you should use the right tool for the job – in this case a password manager.

A password manager is an application designed to manage your passwords. It will typically have a strong security model, the ability to organize passwords by the application or web site they go to, and the ability to generate passwords. Many password managers are designed to integrate with applications and the Web, where they can automatically provide the username and password and login for you.

Instead of trying to remember multiple passwords, all you have to do is remember a single password to open the password manager. (Yes, you can write down the password to your password manager. In fact, you probably should – and lock it up in a secure location like a safety deposit box. Think of this written copy as a backup, where security is more important than ease of access. You may even want to print out the contents of your password manager and lock them up in the safety deposit box.)

There are many password managers available, so you should do your research before choosing one. Reviews are available from a number of sources such as LWN  or Tech Radar. Be especially careful when choosing a password manager from an application store that has millions of applications – many of these applications are poorly written and may contain malware. For something as important as a password manager you need to do your homework!

An example of a highly regarded password manager is KeePass and the KeePassX version for Linux. This is an open source application, so the code has been widely reviewed. If you are running Linux, KeePassX is probably included in your Linux distribution.

KeePass creates an encrypted database of passwords on your system. In fact, it supports multiple databases, so you can keep personal and work related passwords separated. KeePass also has an internal group structure to organize passwords. This allows you to to have groups like finance, social, sports, email, and work for your various accounts.

A strength of password managers is that most of them have a password generator. Again looking at KeePass as a specific example, you can control the length of the password, uppercase/lower case, numbers, special characters, and even whitespace. You also have the option of specifying “pronounceable” passwords.(For some values of pronounceable…)

Using a password manager makes it feasible to use more secure passwords – specifying 24 characters, uppercase/lowercase/numbers/special characters and generating a unique password for each application or site is easy. It is also easy to change passwords regularly – just have your password manager generate a new password, which it then remembers.

Password managers aren’t perfect, but they are a useful tool for making passwords as good as they can be. Using a good password manager is more secure than re-using an easily guessable password across multiple applications.


September 15, 2014

Confusion with sesearch.
I just saw an email where a user was asking why sesearch is showing access but the access is still getting denied.

I'm running CentOS 6. I've httpd running which accesses a file but it results in access denied with the following --

type=AVC msg=audit(1410680693.979:40): avc:  denied  { read } for pid=987 comm="httpd" name="README.txt" dev=dm-0 ino=12573 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file

However,

sesearch -A | grep 'allow httpd_t' | grep ': file' | grep user_home_t
   allow httpd_t user_home_t : file { ioctl read getattr lock open } ;
   allow httpd_t user_home_t : file { ioctl read getattr lock open } ;


sesearch

sesearch is a great tool that we use all the time.  It allows you to analyze and look the the SELInux policy.  It is part of the setools-console package.  It uses the "Apol" libraries to examine policy, the same libraries we have used to build the new tool set sepolicy.

The problem was that he was using sesearch incorrectly.  sesearch -A shows you all possible, allow rules not just the allow rules that are currently in effect.

The user needs to add a -C option to the sesearch.  The -C options shows you the booleans required for that access.  It also shows a capital E or D indicating whether or not the boolean is enabled or disabled in policy at the beginning of the line.

On my machine, I will use a more complicated command, this command says show the allow rules for a source type of httpd_t, and a target type of user_home_t, permission=read on a class=file.

sesearch -A -C -s httpd_t -t user_home_t -p read -c file
Found 1 semantic av rules:
DT allow httpd_t user_home_type : file { ioctl read getattr lock open } ; [ httpd_read_user_content ]


As you can see on my machine the boolean is disabled, so Apache is not allowed to read general content in my homedir, which I assume was true for the user.   If  the user wants to allow httpd_t to read all general content in the users homedir you can turn on the httpd_read_user_content boolean.

If you want to allow it to read just a certain directories/files, recommended,  you should change the label on the directory.  BTW ~/public_html and ~/www already have the correct labeling.

matchpathcon ~/public_html ~/www
/home/dwalsh/public_html    staff_u:object_r:httpd_user_content_t:s0
/home/dwalsh/www    staff_u:object_r:httpd_user_content_t:s0


I would not want to let the apache process read general content in my homedir, since I might be storing critical stuff like credit card data, passwords, and unflattering pictures of me in there. :^)

September 10, 2014

What is this new unconfined_service_t type I see on Fedora 21 and RHEL7?
Everyone that has ever used SELinux knows that the unconfined_t domain is a process label that is not confined.  But this is not the only unconfined domain on a SELinux system.  It is actually the default domain of a user that logs onto a system.  In a lot of ways we should have used the type unconfined_user_t rather then unconfined_t.

By default in an SELinux Targeted system there are lots of other unconfined domains.  We have these so that users can run programs/services without SELinux interfering if SELinux does not know about them. You can list the unconfined domains on your system using the following command.

seinfo -aunconfined_domain_type -x

In RHEL6 and older versions of Fedora, we used to run system services as initrc_t by default.  Unless someone has written a policy for them.  initrc_t is an unconfined domain by default, unless you disabled the unconfined.pp module. Running unknown serivices as initrc_t allows administrators to run an application service, even if no policy has never been written for it.

In RHEL6 we have these rules:

init_t @initrc_exec_t -> initrc_t
init_t @bin_t -> initrc_t

If an administrator added an executable service to /usr/sbin or /usr/bin, the init system would run the service as initrc_t.

We found this to be problematic, though. 

The problem was that we have lots of transition rules out of initrc_t.  If a program we did not know about was running as initrc_t and executed a program like rsync to copy data between servers, SELinux would transition the program to rsync_t and it would blow up.  SELinux mistakenly would think that rsync was set up in server mode, not client mode.  Other transition rules could also cause breakage. 

We decided we needed a new unconfined domain to run services with, that would have no transition rules.  We introduced the unconfined_service_t domain.  Now we have:

init_t @bin_t -> unconfined_service_t

A process running as unconfined_service_t is allowed to execute any confined program, but stays in the unconfined_service_t domain.  SELinux will not block any access. This means by default, if you install a service that does not have policy written for it, it should work without SELinux getting in the way.

Sometimes applications are installed in fairly random directories under /usr or /opt (Or in oracle's case /u01), which end up with the label of usr_t, therefore we added these transition rules to policy.

# sesearch -T -s init_t  | grep unconfined_service_t
type_transition init_t bin_t : process unconfined_service_t;
type_transition init_t usr_t : process unconfined_service_t;
You can see it in Fedora21.

Bottom Line

Hopefully unconfined_service_t will make leaving SELinux enabled easier on systems that have to run third party services, and protect the other services that run on your system.


Note:
Thanks to Simon Sekidde and Miroslav Grepl for helping to write this blog.
TLS landscape

Transport Layer Security (TLS) or, as it was known in the beginnings of the Internet, Secure Sockets Layer (SSL) is the technology responsible for securing communications between different devices. It is used everyday by nearly everyone using the globe-spanning network.

Let’s take a closer look at how TLS is used by servers that underpin the World Wide Web and how the promise of security is actually executed.

Adoption

Hyper Text Transfer Protocol (HTTP) in versions 1.1 and older make encryption (thus use of TLS) optional. Given that the upcoming HTTP 2.0 will require use of TLS and that Google now uses the HTTPS in its ranking algorithm, it is expected that many sites will become TLS-enabled.

Surveying the Alexa top 1 million sites, most domains still don’t provide secure communication channel for their users.

Just under 40% of HTTP servers support TLS or SSL and present valid certificates.

Just under 40% of HTTP servers support TLS or SSL and present valid certificates.

Additionally, if we look at the version of the protocol supported by the servers most don’t support the newest (and most secure) version of the protocol TLSv1.2.  Of more concern is the number of sites that support the completely insecure SSLv2 protocol.

Only half of HTTPS servers support TLS 1.2

Only half of HTTPS servers support TLS 1.2

(There are no results for SSLv2 for first 3 months because of error in software that was collecting data.)

One of the newest and most secure ciphers available in TLS is Advanced Encryption Standard (AES) in Galois/Counter Mode (AES-GCM). Those ciphers provide good security, resiliency against known attacks (BEAST and Lucky13), and very good performance for machines with hardware accelerators for them (modern Intel and AMD CPUs, upcoming ARM).

Unfortunately, it is growing a bit slower than TLS adoption in general, which means that some of the newly deployed servers aren’t using new cryptographic libraries or are configured to not use all of their functions.

Only 40% of TLS web servers support AES-GCM ciphersuites.

Only 40% of TLS web servers support AES-GCM ciphersuites.

Bad recommendations

A few years back, a weakness in TLS 1.0 and SSL 3 was shown to be exploitable in the BEAST attack. The recommended workaround for it was to use RC4-based ciphers. Unfortunately, we later learned that the RC4 cipher is much weaker than it was previously estimated. As the vulnerability that allowed BEAST was fixed in TLSv1.1, using RC4 ciphers with new protocol versions was always unnecessary. Additionally, now all major clients have implemented workarounds for this attack, which currently makes using RC4 a bad idea.

Unfortunately, many servers prefer RC4 and some (~1%) actually support only RC4.  This makes it impossible to disable this weak cipher on client side to force the rest of servers (nearly 19%) to use different cipher suite.

RC4 is still used with more than 18% of HTTPS servers.

RC4 is still used with more than 18% of HTTPS servers.

The other common issue, is that many certificates are still signed using the obsolete SHA-1. This is mostly caused by backwards compatibility with clients like Windows XP pre SP2 and old phones.

SHA-256 certificates only recently started growing in numbers

SHA-256 certificates only recently started growing in numbers

The sudden increase in the SHA-256 between April and May was caused by re-issuance of certificates in the wake of Heartbleed.

Bad configuration

Many servers also support insecure cipher suites. In the latest scan over 3.5% of servers support some cipher suites that uses AECDH key exchange, which is completely insecure against man in the middle attacks. Many servers also support single DES (around 15%) and export grade cipher suites (around 15%). In total, around 20% of servers support some kind of broken cipher suite.

While correctly implemented SSLv3 and later shouldn’t allow negotiation of those weak ciphers if stronger ones are supported by both client and server, at least one commonly used implementation had a vulnerability that did allow for changing the cipher suite to arbitrary one commonly supported by both client and server. That’s why it is important to occasionally clean up list of supported ciphers, both on server and client side.

Forward secrecy

Forward secrecy, also known as perfect forward secrecy (PFS), is a property of a cipher suite that makes it impossible to decrypt communication between client and server when the attacker knows the server’s private key. It also protects old communication in case the private key is leaked or stolen. That’s why it is such a desirable property.

The good news is that most servers (over 60%) not only support, but will actually negotiate cipher suites that provide forward secrecy with clients that support it. The used types are split essentially between 1024 bit DHE and 256 bit ECDHE, scoring respectively 29% and 33% of all servers in latest scan. The amount of servers that do negotiate PFS enabled cipher suites is also steadily growing.

PFS support among TLS-enabled HTTP servers

PFS support among TLS-enabled HTTP servers

Summary

Most Internet facing servers are badly configured, sometimes it is caused by lack of functionality in software, like in case of old Apache 2.2.x releases that don’t support ECDHE key exchange, and sometimes because of side effects of using new software with old configuration (many configuration tutorials suggested using !ADH in cipher string to disable anonymous cipher suites, that unfortunately doesn’t disable anonymous Elliptic Curve version of DH – AECDH, for that, use of !aNULL is necessary).

Thankfully, the situation seems to be improving, unfortunately rather slowly.

If you’re an administrator of a server, consider enabling TLS.  Performance issues when encryption was slow and taxing on servers are long gone. If you already use TLS, double check your configuration preferably using the Mozilla guide to server configuration as it is regularly updated. Make sure you enable PFS cipher suites and put them above non-PFS ciphers and that you as well as the Certificate Authority you’ve chosen, use modern crypto (SHA-2) and large key sizes (at least 2048 bit RSA).

If you’re a user of a server and you’ve noticed that the server doesn’t use correct configuration, try contacting the administrator – he may have just forgotten about it.

September 05, 2014

Think before you just blindly audit2allow -M mydomain
Don't Allow Domains to write Base SELinux Types

A few years ago I wrote a blog and paper on the four causes of SELinux errors.

The first two most common causes were labeling issues and SELinux needs to know.

Easiest way to explain this is a daemon wants to write to a certain file and SELinux blocks
the application from writing.  In SELinux terms the Process DOMAIN (httpd_t) wants to write to the file type (var_lib_t)
and it is blocked.  Users have potentially three ways of fixing this.

  1. Change the type of the file being written.

    • The object might be mislabeled and restorecon of the object fixes the issue

    • Change the label to httpd_var_lib_t using semanage and restorecon
        semanage fcontext -a -t httpd_var_lib_t '/var/lib/foobar(/.*)?'
        restorecon -R -v /var/lib/foobar


  2. There might be a boolean available to allow the Process Domain to write to the file type
      setsebool -P HTTP_BOOLEAN 1

  3. Modify policy using audit2allow
      grep httpd_t /var/log/audit/audit.log | audit2allow -M myhttp
      semodule -i myhttpd.pp

Sadly the third option is the least recommended and the most often used. 

The problem is it requires no thought and gets SELinux to just shut up.

In RHEL7 and latest Fedoras, the audit2allow tools will suggest a boolean when you run the AVC's through it.  And setroubleshoot has been doing this for years. setroubleshoot even will suggest potential types that you could change the destination object to use.

The thing we really want to stop is domains writing to BASE types.  If I allow a confined domain to write to a BASE type like etc_t or usr_t, then a hacked system can attack other domains, since almost all other domains need to read some etc_t or usr_t content.

BASE TYPES

One other feature we have added in RHEL7 and Fedora is a list of base types.  SELinux has a mechanism for grouping types based on an attribute.
We have to new attributes base_ro_file_type and base_file_type.  You can see the objects associated with these attributes using the seinfo command.

seinfo -abase_ro_file_type -x
   base_ro_file_type
      etc_runtime_t
      etc_t
      src_t
      shell_exec_t
      system_db_t
      bin_t
      boot_t
      lib_t
      usr_t
      system_conf_t
      textrel_shlib_t

$ seinfo -abase_file_type -x
   base_file_type
      etc_runtime_t
      unlabeled_t
      device_t
      etc_t
      src_t
      shell_exec_t
      home_root_t
      system_db_t
      var_lock_t
      bin_t
      boot_t
      lib_t
      mnt_t
      root_t
      tmp_t
      usr_t
      var_t
      system_conf_t
      textrel_shlib_t
      lost_found_t
      var_spool_t
      default_t
      var_lib_t
      var_run_t

If you use audit2allow to add a rule to allow a domain to write to one of the base types:

Most likely you are WRONG

If you have a domain that is attempting to write to one of these base types, then you most likely need to change the type of the destination object using the semanage/restorecon commands mentioned above.
The difficult thing for the users to figure out; "What type should I change the object to?"

We have added new man pages that show you the types that you program is allowed to write

man httpd_selinux

Look for writable types?

If your domain httpd_t is attempting to write to var_lib_t then look for httpd_var_lib_t. "sepolicy gui" is a new gui tool to help you understand the types also.

Call to arms:
If an enterprising hacker wanted to write some code, it would be nice to build this knowledge into audit2allow.  Masters Thesis anyone???

September 03, 2014

Is your software fixed?

A common query seen at Red Hat is “our auditor says our Red Hat machines are vulnerable to CVE-2015-1234, is this true?” or “Why hasn’t Red Hat updated software package foo to version 1.2.3?” In other words, our customers (and their auditors) are not sure whether or not we have fixed a security vulnerability, or if a given package is up to date with respect to security issues. In an effort to help our security-conscious customers, Red Hat make this information available in an easy to consume format.

What’s the deal with CVEs?

Red Hat is committed to the CVE process. To quote our CVE compatibility page:

We believe that giving our users accurate and complete information about security issues is extremely important. By including CVE names when we discuss security issues in our services and products, we can help users cross-reference vulnerabilities so they spend less time investigating and categorizing security events.

Red Hat has a representative on the CVE Editorial Board and declared CVE compatibility in April 2002.

To put it simply: if it’s a security issue and we fix it in an RHSA it gets a CVE. In fact we usually assign CVEs as soon as we determine a security issue exists (additional information on determining what constitutes a security issue can be found on our blog.).

How to tell if you software is fixed?

A CVE can be queried at our public CVE page.  Details concerning the vulnerability, the CVSS v2 metrics, and security errata are easily accessible from here.

To verify you system is secure, simply check which version of the package you have installed and if the NVR of your installed package is equal to or higher than the NVR of the package in the RHSA then you’re safe.

What’s an NVR?

The NVR is the Name-Version-Release of the package. The Heartbleed RHSA lists packages such as: openssl-1.0.1e-16.el6_5.7.x86_64.rpm. So from this we see a package name of “openssl” (a hyphen), a version of 1.0.1e (a hyphen) and the release is 16.el6_5.7. Assuming you are running RHEL 6, x86_64, if you have openssl version 1.0.1e release 16.el6_5.7 or later you’re protected from the Heartbleed issue.

Please note, there is an additional field called “epoch”, this field actually supersedes the version number (and release), most packages do not have an epoch number, however a larger epoch number means that a package can override a package with a lower epoch. This can be useful, for example, if you need a custom modified version of a package that also exists in RPM repos you are already using.  By assigning an epoch number to your package RPM you can override the same version package RPMs from another repo even if they have a higher version number. So be aware, using packages that have the same name and a higher epoch number you will not get security updates unless you specifically create new RPM’s with the epoch number and the security update.

But what if there is no CVE page?

As part of our process the CVE pages are automatically created if public entries exist in Bugzilla.  CVE information may not be available if the details of the vulnerability have not been released or the issue is still embargoed.  We do encourage responsible handling of vulnerabilities and sometimes delay CVE information from being made public.

Also, CVE information will not be created if the software we shipped wasn’t vulnerable.

How to tell if your system is vulnerable?

If you have a specific CVE or set of CVEs that you are worried about you can use the yum command to see if your system is vulnerable. Start by installing yum-plugin-security:

sudo yum install yum-plugin-security

Then query the CVE you are interested in, for example on a RHEL 7 system without the OpenSSL update:

[root@localhost ~]# yum updateinfo info --cve CVE-2014-0224
===============================================
 Important: openssl security update
===============================================
 Update ID : RHSA-2014:0679
 Release : 
 Type : security
 Status : final
 Issued : 2014-06-10 00:00:00
 Bugs : 1087195 - CVE-2010-5298 openssl: freelist misuse causing 
        a possible use-after-free
 : 1093837 - CVE-2014-0198 openssl: SSL_MODE_RELEASE_BUFFERS NULL
   pointer dereference in do_ssl3_write()
 : 1103586 - CVE-2014-0224 openssl: SSL/TLS MITM vulnerability
 : 1103593 - CVE-2014-0221 openssl: DoS when sending invalid DTLS
   handshake
 : 1103598 - CVE-2014-0195 openssl: Buffer overflow via DTLS 
   invalid fragment
 : 1103600 - CVE-2014-3470 openssl: client-side denial of service 
   when using anonymous ECDH
 CVEs : CVE-2014-0224
 : CVE-2014-0221
 : CVE-2014-0198
 : CVE-2014-0195
 : CVE-2010-5298
 : CVE-2014-3470
Description : OpenSSL is a toolkit that implements the Secure 
Sockets Layer

If your system is up to date or the CVE doesn’t affect the platform you’re on then no information will be returned.

Conclusion

Red Hat Product Security makes available as much information as we can regarding vulnerabilities affecting our customers.  This information is available on our customer portal as well as within the software repositories. As you can see it is both easy and quick to determine if your system is up to date on security patches with the provided information and tools.

The following checklist can be used to check if systems or packages are affected by specific security issues:

1) Check if the issue you’re concerned about has a CVE and check the Red Hat CVE page:

https://access.redhat.com/security/cve/CVE-2014-0224

2) Check to see if your system is up to date for that issue:

sudo yum install yum-plugin-security 
yum updateinfo info --cve CVE-2014-0224

3) Alternatively you can check the package NVR in the RHSA errata listed in the CVE page (in #1) and compare it to the packages on your system to see if they are the same or greater.
4) If you still have questions please contact Red Hat Support!

Three Types of Keystone Users

Keystone supports multiple backend for Identity.  While SQL is the default, LDAP is one of the most used.  With Federation protocols, the user data won’t even be stored in the identity backend at all.  All three of these approaches have different use cases, and all work together.  The way that that I’ve come to think of them is as  three types of Keystone users:  employees, partners, and customers.  Take the following as a metaphor, not literal  truth.

LDAP is the enterprise database of choice for employees  (technically, a Directory server, accessed by LDAP, but we call the Database LDAP to distinguish from relation database which we call SQL due to the query language, to should satisfy the pedants.)  LDAP entries are managed by HR (with help from IT) and is considered read-only data by most applications in the enterprise.  The Keystone users in LDAP are people that we pay money to use OpenStack.

Partners are people from other companies that we work with.  They have their own LDAP servers, but we don’t have access to them via LDAP.  Instead, we get a SAML document, which is basically a snapshot of an LDAP query, signed with a private key.  Partners are people we work with to make money.  We neither pay them nor do they pay us.

Customers are people who pay us to use our OpenStack deployment.  There are lots of them.  They are quickly added to our Keystone store.  Since LDAP is read only, we stick them in SQL.  They belong to different companies, and people from different companies shouldn’t know about each other, so each of these companies are in their own domain.  Only SQL allows multiple domains to be added dynamically.

Obviously, this is greatly over simplified.  It does not account for service users.  SAML might be used for employees.  Another approach is that  everyone might be in LDAP, and LDAP is set to be read-write;  CERN does this.  But the three classes of users listed above each represents a different usage pattern for Keystone’s  identity store, and all three can and should  be supportable in the same Keystone deployment.

August 26, 2014

Yellow Sticky of Doom Revisited

Talking with security experts about the Yellow Sticky of Doom shows that the situation isn’t entirely bleak. They agree that posting notes on a monitor – or the bottom of a keyboard – is bad.

However, they recognize that (somewhat secure) passwords are difficult to remember and will be written down. They point out that combining written passwords with physical security can actually be a reasonable approach.

If you write your password down and place it in a locked desk drawer you achieve a significant level of security. Getting the password out of sight is a good start – rifling through someones desk drawer is usually noticed. And if you lock your desk when you leave you are establishing a reasonable level of commercial security. And the good news about desk drawers is that they can’t be accessed through the Internet!

This approach assumes that you have a reasonable level of physical security for your business or home. If you don’t, password security may be the least of your concerns.

There are a variety of ways to increase physical security, such as control of keys, using secure filing cabinets, or using a safe. Something as simple as a Locking Bar for 4 Drawer File provides significantly enhanced physical security beyond that of common desk locks.

This is an area where you need to look at security from a higher level. Once you recognize that passwords by themselves provide poor security and that passwords will be written down you can develop a rational approach. Consider computers, networks, people, policies, and physical security together – develop a real security policy, rather than passing down edicts that don’t work.

You can’t abolish the Yellow Stick of Doom. But moving it into a locked desk drawer is probably good enough.


August 21, 2014

Phishing

Kerberos was slow when talking to my demo machine. As part of debugging it, I was making DNS changes, so I pointed my machine directly to the DNS server. It was at my hosting provider, and authoritative for my domain.

As I tend to do, I idly checked Facebook. Its a bad habit, like biting nails. Sometimes I’m not even aware that I am doing it. This time, however, a browser warning brought me up short:

“Security Error: Domain Name Mismatch”

The certificate reported that it was valid for a domain that ended in the same domain name as the nameserver I was pointing at.

Someone just like me had the ability to push up whatever they wanted to the DNS server. This is usually fine: only the Authoritative DNS server for a site is allowed to replicate changes. It did mean, however, that anyone that was looking at this particular DNS server would be directed to something they were hosting themselves. I’m guessing it was a Phishing attempt as I did not actually go to their site to check.

Most of us run laptops set up to DNS from the DHCP server we connect to. Which means that if we are at a Coffee Shop, the local library, or the Gym, we are running against an unknown DNS server. The less trusted the location, the less reason to trust the DHCP server.

This is a nasty problem to work around. There are things you can do to mitigate, such as whitelisting DNS servers. The onus, however, should not be up to the end users. DNSSec attempts to address the issues. Until we have that, however, use HTTPS where ever possible. And check the certificates.

August 20, 2014

Greatest Threat: Yellow Sticky of Doom

We now get to what I consider the greatest threat to computer security: the Yellow Sticky of Doom!

Yellow Sticky

Passwords written down on yellow sticky notes. These are everywhere.

What is the difference between a secure facility and an insecure facility? In an insecure facility the yellow sticky notes are stuck to monitors. In a secure facility the yellow sticky notes are stuck to the bottom of the keyboard. In really secure facilities they are in desk drawers – and maybe even locked up!

The solution is obvious: ban people from writing down their passwords!

Except that this won’t work. Full stop. Period. Won’t. Work.

Why? Because passwords are crap for security.

Passwords that are difficult to guess or to crack with a brute force attack are impossible for people to remember – look at the ones in the yellow sticky above! All of these passwords were produced by a password generator with a high security setting. Anyone who can remember one of these passwords scares me!

Consider the usual guidelines for producing a secure password: 12-16 characters, no dictionary words, a combination of upper case, lower case, numbers, and punctuation. And changed every 1-6 months.

Right….

Human brains don’t work this way.

Correct Horse Battery Staple

If you want people to actually remember passwords, consider the way the human brain works. Look at XKCD on Password Strength: this is an example of a password that a human can remember. It builds on the way the mind and memory work, through chunking, context, and pattern recognition. Correct Horse Battery Staple has become an Internet meme – a code term referencing a way to make passwords somewhat work.

But, can your system handle it? Do you allow passwords this long? Do you allow spaces in passwords?

And look at your policies. If a person can remember a word, it is in a dictionary! The only thing a “no dictionary words” policy does is guarantee that passwords will be written down.

At a minimum, encourage pass phrases rather than classical passwords.

If you actually care about security, implement multi-factor authentication – a combination of what you know, what you have, and what you are.

Traditional passwords serve only one purpose - to allow you to blame innocent users for your mistakes. They are no longer an effective security or authentication mechanism. Forget trying to stop people from writing them down and get serious about security.

Get rid of the Yellow Sticky of Doom by making it obsolete!


August 18, 2014

Threat: Joe the Backhoe Operator

BackhoeSmall

Where Dennis the Weatherman is a proxy for all the threats nature can pose, Joe the Backhoe Operator is a proxy for man-made threats outside the data center.

Backhoe Fade is a familiar term in the telecommunications industry, where it refers to construction activities cutting cables. This can be anything from a single network link to a major fibre optic link affecting millions of people. The classic example is a backhoe operator digging in a field in the middle of nowhere who digs right through a cable, taking out a major telecommunications link.

Closely related to backhoe fade is damage to undersea cables, often from ships dragging anchors across the cables and severing them. And, of course, sharks… How Google Stops Sharks From Eating Undersea Cables

While not necessarily a classical security threat, and not a threat to system integrity in the same way as other threats we have discussed, backhoe fade is a great threat to system availability and business continuity.

Major data centers will typically have multiple redundant, physically separated network connections to allow them to route around network failures.

Unfortunately, it is much less common for individual buildings where people actually work to have such redundant network connections. If the hundreds of people in your office can’t get to the corporate data systems, it really doesn’t matter which end of the cable has been cut…


Musings on identity management

This post is an edited version of an email I sent to the Red Hat Identity Management (IdM) team mailing list that outlines the main take-aways from my first few months working on the FreeIPA identity management solution.

I’m over three months into my new gig on the identity management team at Red Hat now, so I would like to share a few thoughts about what I’ve learned about identity management.

I was excited to come into this role because of my innate interest in security and cryptography. I had little practical experience with PKI and security protocols beyond basic X.509/TLS and OpenPGP, so I have been relishing the opportunity to broaden my knowledge and experience and solve problems in this domain.

What I did not understand, when I joined, was just how much an effective IdM strategy and infrastructure can benefit businesses and large communities in the form of improved security and reduced risk (two sides of the same coin, one could argue) and of course, greater efficiency. The diversity of use cases and the versatility of our software to address these use cases also amazed me.

This added perspective motivates me to seek opportunities to talk to people and find out about their IdM needs and how existing offerings (ours or others) are falling short, and work out what we as a team can do to better meet and even anticipate their needs. It has also given me a foundation to explain to non-technical people what FreeIPA and related projects are all about, and help them understand how our solutions can help their business or community.

I say "community" above because I have begun to see that free software communities represent valuable proving grounds for FreeIPA. For example, a couple of weeks ago during PyCon Australia I was chatting to Nick Coghlan and learned that the Python community is currently struggling with a proliferation of identity silos – developer accounts, PSF memberships and roles, the main website, PyPI, and so on. Yet noone has put their hand up to address this. I didn’t quite commit to writing a PEP to fix all that (yet) but we agreed that this represents a great opportunity to employ FreeIPA to benefit an important project and community – important for our team and for Red Hat as well as for the software industry in general. How many other communities to whom we have links or on whom we rely could benefit from FreeIPA in a similar way? And how much will our solutions be improved, and new innovations discovered, by what we might learn in working with these communities to improve their identity management?

So, that’s most of what I wanted to say, but I want to thank you all for your assistance and encouragement during my first few months. It has been quite a shift adapting to working with a global team, but I am really enjoying working with you on Red Hat IdM and am excited for our future.

August 13, 2014

Fedora Security Team

Vulnerabilities in software happen.  When they get fixed it’s up to the packager to make those fixes available to the systems using the software.  Duplicating much of the response efforts that Red Hat Product Security performs for Red Hat products, the Fedora Security Team (FST) has recently been created to assist packagers get vulnerability fixes downstream in a timely manner.

At the beginning of July, there were over 500 vulnerability tickets open* against Fedora and EPEL.  Many of these vulnerabilities already had patches or releases available to remedy the problems but not all.  The Team has already found several examples of upstream not knowing that the vulnerability exists and was able to fix the issue quickly.  This is one of the reasons having a dedicated team to work these issues is so important.

In the few short weeks since the Team was created, we’ve already closed 14 vulnerability tickets and are working another 150.  We hope to be able to work in a more real-time environment once the backlog decreases.  Staying in front of the vulnerabilities will not be easy, however.  During the week of August 3rd, 27 new tickets were opened for packages in Fedora and EPEL.  While we haven’t figured out a way to get ahead of the problem, we are trying to deal with the aftermath and get fixes pushed to the users as quickly as possible.

Additional information on the mission and the Team can be found on our wiki page.  If you’d like to get involved please join us for one of our meetings and subscribe to our listserv.

 

* A separate vulnerability ticket is sometimes opened for different versions of Fedora and EPEL resulting in multiple tickets for a single vulnerability.  This makes informing the packager easier but also inflates the numbers significantly.

August 11, 2014

Getting Service Users out of LDAP

Most people cannot write to the LDAP servers except to manage their own data. Thus, OpenStack requiring the Service users in LDAP is a burden that many IT organizations cannot assume. In Juno we have support for Multiple backends for domains.

Starting with a devstack install, (and an unrelated bug fix) I created a file: /etc/keystone/domains/keystone.freeipa.conf. The naming of this file is essential: keystone..conf is the expected form. It looks like this.

# The domain-specific configuration file for the test domain
# 'domain1' for use with unit tests.

[ldap]
url=ldap://ipa.cloudlab.freeipa.org
user_tree_dn=cn=users,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org
user_id_attribute=uid
user_name_attribute=uid
group_tree_dn=cn=groups,cn=accounts,dc=ipa,dc=cloudlab,dc=freeipa,dc=org


[identity]
driver = keystone.identity.backends.ldap.Identity

And made the following changes to /etc/keystone/keystone.conf

[identity]
domain_specific_drivers_enabled=true
domain_config_dir=/etc/keystone/domains

I restarted HTTPD to get Keystone to pick up my changes:

sudo systemctl  restart httpd.service

Then followed the steps from my earlier blog post to list domains.

export TOKEN=`curl -si -d @token-request-admin.json -H "Content-type: application/json" http://localhost:35357/v3/auth/tokens | awk '/X-Subject-Token/ {print $2}'`
curl -s -H"X-Auth-Token:$TOKEN" -H "Content-type: application/json" http://localhost:35357/v3/domains | jq '.domains[] | {id, name}

Note that I used a little jq to make things easier to read.

To add the new domain:

curl  -H"X-Auth-Token:$TOKEN" -H "Content-type: application/json" -d '{"domain": {"description": "FreeIPA Backed LDAP Domain", "enabled": true, "name": "freeipa"}}'  http://localhost:35357/v3/domains

Now to test out the new domain I have a user from the LDAP server. Good Old Edmund. Here is token-request-edmund-freeipa.json

{
    "auth": {
        "identity": {
            "methods": [
                "password"
            ],
            "password": {
                "user": {
                    "domain": {
                        "name": "freeipa"
                    },
                    "name": "edmund",
                    "password": "freeipa4all"
                }
            }
        }
    }
}
-sh-4.2$ curl -si -d @token-request-edmund-freeipa.json -H "Content-type: application/json" http://localhost:35357/v3/auth/tokens 
HTTP/1.1 201 Created
Date: Mon, 11 Aug 2014 14:52:14 GMT
Server: Apache/2.4.10 (Fedora) mod_wsgi/3.5 Python/2.7.5
X-Subject-Token: PKIZ_eJx1VMmSozgUvPMVfa-oKBaD8aEPEsIgXJILEJtuBmw2eTfG5uuHcnXEzBxat5ehzHxPynjv79OBtoPpL4uE38W7RDC2FkPmrY4c1_eCAr-wjhNmIMsCvW8B335YDLiwii6oIpbTTngkO1Z4dkKcaxLybRdsJpDN7Kqyje3Tk3P1JvIGG9i9NpuUjmUSd6m6lHF7rHCLB8L8G0HVjbBlJE2FQRk2CIt6OgKFjL6eNPiKLU9s3eCZpaRZN3C-C4cK7xVROvWduy8sxwdYS4VGtVzzOizkPyR4Kvbx-DfHyeh_hhIZ7ecfR6VQ4-c3aRqjy1Wl3iSzlzvei-4lto9mJMlmdMzGSWTM2qCW1gzPqIp1uqdijYjME77nrf3EzXfLep0n0SQCGn7wBE_EkIXe4tMCzSbxX7hE2u5BR9FRFj04KkbSQkEZUAnzGo6EyJjYUyduCYINYbAlaiSTlsgk8ZU1657S2glqymx1ndgyZ2QkqGz5z0h9lijip_W4y9O455a32KXyo6ocOH0_3PkYSoDBgiyLhzUCD1Y0hiBjQMSM-LMBgQzFvo8RiOP8QEWJ7DUBgwOUyIbDsIwTfZR46j9QC8gP-UhgHPfTY8okqIZl9RJACCy0Uit6ntZ1nsIrD_U2V-XvkA0SHLLlapghUB8H5P83kTaEPvgOF-hIdpO7UNSmnLZXeLqXffBhOUwPUkvqdg2vHx_mmSu1YVy9ozbUnLAzTd5clJkqKoubnU_xTtfHMdymNRgRCdEFB4fDoC2FxKNP9iXicA6avAtquQzmJ8q-jK3XDHvVgIvNTW9P84-3dsvNNemCw2fDqUfHWX18rORAuga7U2WaXChPLVF7S9bedL3Tmtonpb4KFqnBBdP7Q9-cD0O5L8doU503fsVcC33V50pq7-N93n0cyf0xLE1MS3exq4P0pmwwnNkm0a1NWG2s3fnpuKWD36gd3k9qtu0uF4Nw5ErZ7cbnZn9wL8FQ6NvKzDy7genTjYRjlm_bL_eL59Pl-c1V9B20i3b3qWIOVsPv39JrI9gU_bsd_gH7ultg
Vary: X-Auth-Token
Content-Length: 314
Content-Type: application/json

{"token": {"issued_at": "2014-08-11T14:52:15.705349Z", "extras": {}, "methods": ["password"], "expires_at": "2014-08-11T15:52:15.705312Z", "user": {"domain": {"id": "e81f8763523b4a9287b96ce834efff12", "name": "freeipa"}, "id": "29179d551d7320e50612bd9ea9f4ec00b10c3e42341d59928da5169a4e3307cf", "name": "edmund"}}}-sh-4.2$ 

It works! Henry Nash worked hard on this. IT was supposed to land for Havana, but it got delayed by several issues. Probably the biggest was the User_Id. Notice in the Token response above that Edmund’s User ID is not the UID field from the LDAP query, but rather 29179d551d7320e50612bd9ea9f4ec00b10c3e42341d59928da5169a4e3307cf. This value is a SHA256 of the LDAP assigned filed and the domain_id combined. This mechanism was select to prevent conflicts between two different Identity Providers assigning the same value. To confirm:

$ echo "select * from id_mapping;" | mysql keystone
public_id	domain_id	local_id	entity_type
29179d551d7320e50612bd9ea9f4ec00b10c3e42341d59928da5169a4e3307cf	e81f8763523b4a9287b96ce834efff12	edmund	user
Threat: Dennis the Weatherman

HurricaneSandy

Dennis the Weatherman is a proxy for the threats that nature presents. Superstorm Sandy is a recent example of the power of weather. Some places received over a half meter of rain in less than 24 hours, as well as high winds. The combination of flooding, storm surge, high winds and downed trees wreaked havoc on businesses and data centers over an extended area.

Superstorm Sandy highlighted many factors around disaster preparedness. Some companies were able to fail over to geographically remote data centers and continue operations with minimal disruption.

Some companies had mixed experiences. One of the best examples is Peer 1 Hosting – their data center was well above the flooding and they had backup generators. Unfortunately, their diesel fuel tanks for the backup generators were in the basement… They had to form a “bucket brigade” to carry diesel fuel up 17 flights of stairs.

Other companies were simply down. Without power or network connectivity there was nothing they could do. Worse, their datacenters may have been flooded and the equipment damaged or destroyed. Worst case would be a flooded datacenter without adequate offsite disaster recovery or even backups; some companies went out of business.

In addition to hurricanes, you have to worry about flooding, tornadoes, fire and wildfire, blizzards, and earthquakes.

The good news is that the computer systems in the basement of a flooded data center which is burning down in the middle of an earthquake are not likely to be hacked…

Yes, weather is a clear and present danger to system integrity. Plan accordingly. PLAN accordingly! Exactly what will you do if your data center is under water and located in the middle of an entire region of downed trees with roads blocked by thousands of people trying to escape. And maybe even Starbucks closed!


August 06, 2014

Flock to Fedora: Day One (Afternoon)

Overview

The afternoon today has been absolutely filled with excellent talks. As with this morning, I’ve captured most of them for your entertainment and edification below. These blog entries are getting very long; perhaps I need an index.

Sessions

Fedora Workstation: Goals, Philosophy and Future

“The desktop is not dying!” cries Christian Schaller as he dives into his talk on the Fedora Workstation. In a throwback to the old saw about the Year of Linux on the Desktop, he points out that 35% of laptop sales are Chromebooks and that there are now over six hundred games available on Steam for Linux.

So what is the Fedora Workstation trying to achieve? What are its driving principles? First of all, Fedora Workstation is the only desktop out there that isn’t trying to sell you (in other words, capture information about you to sell to someone else). Fedora Workstation will consider end-user privacy one of its highest goal. By default, Workstation will not share your data.

The recent switch to focusing on delivering specific Fedora Products gave the Workstation an opportunity to start defining standards and maintaining them within a particular Product. Traditionally, Fedora was effectively a collection of exactly what upstreams provided. With the Workstation, we’ll be able to curate this code and integrate it into a cohesive whole (Author’s note: bingo!). One of the key points here is that by providing a known base, we give developers a better idea of what they can rely upon when writing their applications and services.

Christian continues on to discuss more about the future plans of the Workstation. The first of these was a discussion about how containers will fit into the Workstation world. Significant research is going into figuring out how to use container images to isolate and secure applications from one another, which will offer both flexibility and security in the long run. With some of the features offered by Wayland, this will eventually provide an excellent (and safe!) experience for users. Christian also notes that their goal is to make this all seamless from the users’ perspectives. If they notice it’s been done (without being told), they failed.

The Workstation Working group has been doing extensive research on Docker as a container implementation but has yet to make a decision on whether to standardize on Docker or else use a customized namespacing solution.

Christian then talks a bit about how Workstation is going to try to grow the platform. He acknowledges that there has been a slow decline of usage in Fedora and wants to put effort into drawing people back in. Some of the plans for this involve increased publicity through blogging and the Fedora Marketing groups. It will also involve finding ways to reduce the split between applications that run on Fedora and Red Hat Enterprise Linux. In part, that involves making sure that newer APIs are available in RHEL and older interfaces remain available in Fedora.

Another key piece of Fedora Workstation’s long-term strategy involves the creation and maintenance of a true ABI at all levels of the stack. This will help reduce the “moving target” aspect of Fedora and provide more guarantees about long-term sustainability of an application written for Fedora.

Christian takes a moment to talk about the target users. A lot of the conversation so far has been about making Fedora better for developers, but from his perspective it is also necessary to build a platform that’s useful to creators of all sorts. “Creators” may include video editors, 3D-modeling, music mixing and all sorts of other creation tasks.

After this, the conversation moved on to discuss a little bit of the challenges they face. Marketing and PR is going to be a significant challenge, particularly correcting some existing negative press that we’ve gathered over the last few years. We’ve also got to convince ourselves. There have been a lot of big changes in Fedora lately (the Three Product Plan being the most visible) and it’s clear that there’s going to be a period of adjustment as we course-correct and really figure out how we’re going to move forward.

Christian then talks about the advent of web-based applications and how that relates to the Workstation. He notes that there are users (even at Red Hat) that never open any local application other than their web browser. From this perspective, it’s very clear that we need Fedora Workstation to be a powerful mechanism for running web applications. So one of the things being worked on is ways that web applications can be tied into the desktop environment in a more integrated way.

Comments during and after the talk focused a great deal on the publicity and marketing aspects of things. It was noted that the more targeted user groups lends itself better to more controlled messaging. This will bring in more users in those particular groups who will in turn pass their good experience on by word-of-mouth.

Several questions were also asked about specific feature enhancements to things like Gnome Online Accounts (particularly around the earlier discussion of web applications). Christian indicated that the next  immediate efforts on that front will likely be focused around a better calendaring experience.

Evolving the Fedora Updates Process

I next attended a talk about the Fedora update process given by Luke Macken, author and maintainer of the Bodhi update system.

Once upon a time in Fedora Core 1 through Fedora Core 3, updates were handled via a manual process involving emails to release engineering. Starting with Fedora Core 4, a private internal updating system that was available only to Red Hat employees.

The modern world of Bodhi began in Fedora 7 at the same time that Fedora Core and Fedora extras were merged. It introduced the concept of Karma and it was written in TurboGears 1.x and it is still in production today, seven years and many revisions later.

Bodhi does a lot of things behind the scenes, being both extremely intricate and very inefficient. Luke described a number of issues that have cropped up over the years, including inflexible SQL routines and the karma process outliving its usefulness.

Luke next took a little side-trip to tell us about some of the more entertaining glitches that have cropped up over the years, including the infamous Fedora 9 GPG re-keying and numerous crashes during update pushes.

After that, he moved on to discussing the plans for the Bodhi2 project. The plan is to have it land sometime after the Fedora 21 release. We don’t want to rely on it for zero-day updates, but we’ll phase it in soon after and it should hopefully be a graceful transition.

Some of the major changes in Bodhi2 will be a comprehensive REST API, a new and improved command-line tool and major changes to the UI that should provide a better experience for the users.

Another great feature of Bodhi2 is that it will integrate with fedmsg and the new Fedora Message Service to reduce the amount of “spam” email that Bodhi sends out. Luke dives in a bit to talk about the datagrepper and datanommer mining tools that power the notification service and the set of filters that you can opt into.

Luke showed off how Bodhi2 will be tightly integrated with Taskotron to perform automated testing on updates, as well as the integration with the Fedora Badges (there are lots of them available for Bodhi!) and then on to the feedback system. He called out both the fedora-easy-karma command-line tool and the fedora-gooey-karma GUI tool for managing karma updates on Bodhi1 (and noted that they will be working together to support Bodhi2 as well).

Then he went and left me slack-jawed with the new submitter process, automating almost everything and making it almost unbelievably simple. Adding to that, the new karma system allows the submitter to select fine-grained karma controls, so they can request that specific tests have to pass karma before accepting it into the stable repository.

The talk finished up with some prognosticating about the future, particularly talking about being able to run AMI and other cloud image updates through as well.

State of the Fedora Kernel

The next stop on my whirlwind tour of the wide world of Fedora was Josh Boyer’s annual discussion on Fedora’s treatment of the kernel. First up on the agenda was an overview of the release process. Fedora focuses on having a common base kernel across all stable releases (which means that bugs and features are shared). Fedora rebases to the latest upstream kernel on a regular basis, staggering the updates back to the two stable releases.

Josh described how the old process for kernel updates was to be more conservative on updates on older Fedora releases. However, a few years ago the process changed and updates are now faster and keeps Fedora much closer to the kernel upstream.

The talk then moved on to discussing the state of open bugs in the kernel. During this talk in 2013, there were 854 open bugs against the Fedora Kernel. After last year, the kernel maintainers sat down and spent a lot of time to knock down the bugs. Today it’s down to 533, but this is still not good enough. There will be a talk on Saturday about some ways to address this.

Josh pointed out several consistent problem areas: WiFI support, suspend/resume, video playback and brightness settings, platform-specific drivers (like Fn keys) and bluetooth. “All the stuff that involves ACPI is usually broken on some platform, somewhere”.

He then moved on to talking about how they handle bugs. He pointed out that if someone files a bug, they’re contacted (the bug is set NEEDINFO) every time the kernel is rebased. If after two weeks the reporter doesn’t confirm a fix (or a continuing bug), the bug is closed INSUFFICIENT_INFO.

What does all this mean? In short, it means that the Fedora kernel maintainers are permanently saturated, the work is never-ending and they would really appreciate if people would take vacations at different times than the maintainers so they don’t always return to 200+ extra bugs. Additionally, they really need help with triage, but it’s difficult to find anyone to do so, mainly because bug triage is admittedly boring. Some steps have been made in the last couple years that really helps, particularly the ABRT bug reports and the retrace server that helps narrow down which bugs are having the widest impact. The retrace server in particular keeps statistics on the number of reports, so that helps in prioritizing the fixing efforts.

After the bug discussion, Josh moved on to talking about the situation with Fedora 21. The plan is to release either kernel 3.16 or 3.17 at release, depending on schedule slips. During the Fedora 21 process, the kernel maintenance has actually been fairly calm, despite a set of new packaging changes.

During the Fedora.next process, the Fedora Cloud Working Group made requests of the kernel team to shrink down its size. There are a lot of optional components built into the kernel and many of these weren’t actually needed in a cloud environment. So the kernel team went and broke out the available modules into a core set and a common set above that. This made the minimal installation set much smaller and reduced the space on the cloud images substantially. In addition to this, they found ways to compress the kernel modules so the storage on disk shrank quite a bit as well.

Another useful feature that was added to packaging in Fedora 21 is support for automatically-generated RPM “Provides” listing the set of modules that are present in each of the packages. This will make it easier for packagers to specify dependencies on the appropriate package (and will continue working if modules move around).

The last major change in Fedora 21 is support for 64-bit ARM hardware (aarch64), which as was noted by an audience member is now available for general purchase. It works fairly well (thanks in large part to a herculean effort by Red Hat ARM engineers) and may be promoted to a primary architecture in Fedora 22 or 23. As a side-effect of this work, it’s going to be possible to replace the slow armv7hl builders in Koji with the new aarch64 builders that will be vastly more performant.

Josh then moved on to discuss the new kernel Playground, which is a new unsupported COPR containing some new experimental features. It tracks Fedora Rawhide and 21 and provides today the Overlayfs v23 (a union filesystem) and kdbus (the high-performance kernel D-BUS implementation). These are fairly stable patches to the kernel that are still out of the main tree and therefore not really suitable for Fedora proper (yet).

In the future, it may include other features such as kpatch and kgraft (the in-kernel infrastructure for supporting live-patching the kernel).

Advocating Fedora.next

After taking a short break to catch my breath and give my fingers a rest (these blog entries are a lot of work!), I went along to Christoph Wickert’s session Advocating Fedora.next. This session was largely (but not exclusively) directed at Fedora Ambassadors, informing them how best to talk about the Fedora.next initiative to the public.

He began his talk by addressing the question of “Why?”. Why did we need to change things so substantially? After providing the delightfully glib answer “Why not?”, Cristoph described a bit about Fedora’s history. He pointed out quite eloquently how Fedora has always been about change. Fedora has never been afraid to try something new to improve.

He then tackled some of the non-reasons behind Fedora.next, specifically the rumors of our demise post-GNOME 3 and similar. The truth is that we have a strong brand with a large contributor base that is extremely closely linked to upstream development (more so than many, if not all, of the other distributions). We’ve had a decade of successes and Fedora 20 has been very positively reviewed overall.

Another such rumor was that the new products are a benefit only to Red Hat. The obvious rebuttal here is that separating products makes good sense, focusing on specific user sets. Also, Red Hat has no particular interest in a consumer workstation product.

A final common criticism is that the new working groups are a power-grab by Red Hat. The older governance has not changed (and remains community-elected). Furthermore, all of the working groups were self-nominated and contains numerous non-Red Hat community members, including Christoph himself.

Christoph then ruminates, “If we didn’t do it for those reasons, why did we do it?”. The first answer is that distributions have become boring. All major distributions have declining market share. The general impression is that the distro is a commodity and that the users don’t care which one they’re running. The user really only cares that their applications run. Things like cloud environments and containers blur this line as well.

Continuing on, Christoph calls out to the Fedora Mission statement: “The Fedora Project’s mission is to lead the advancement of free and open source software and content as a collaborative community” and asks whether we feel like we’ve been living it. With that in mind, he defines Fedora.next for us: an umbrella term for the changes in the way that we create and release Fedora.

Fedora.next was born of two proposals that were first publicised at last year’s Flock conference: “The Architecture for a More Agile Fedora” by Matthew Miller and “The Fedora Crystal Ball: Where are we going for the next five years?” by yours truly, Stephen Gallagher. As the last year has passed, the Fedora.next that we now know has become a merge of these two proposals.

Matthew’s proposal involved having different policies depending on how low in the stack that a package lived, with core functionality having stricter guidelines than packages up in the application layer. My proposal was around having three different development streams (Server, Workstation and Cloud) and possibly different release cycles. The modern vision will be a combination of the two, with three products.

Christoph also warned the Ambassadors to be aware that the Fedora installation DVD will be retired in Fedora 21, but notes that it was never truly well-maintained and that its replacement netinstalls, spins and live media should be at least sufficient.

“What is a product?”, Christoph asks, then answers. A Product is more than simply a Fedora Spin. Each product has a defined target audience, a mission statement, a product requirements document (PRD) and a technical specification. The mission statement, PRD and technical statement were all defined and discussed publicly by the Product Working Groups and ratified by the Board and FESCo. Each product contains features not present in older Fedoras and has its own working group with their own governance models.

Christoph stresses that this is not a power-grab but instead the opposite: it’s an opportunity to give more power to the specific people who are building the Products. As a member of the Workstation Working Group, he calls out to its Mission Statement and then discusses a few of the Workstation-specific cool features. He notes that what Fedora Workstation will look like in Fedora 21 will not be a huge difference from the classic Fedora Live image, but this will change over time as the Workstation comes into its own life.

He then continues on to discuss Fedora Server a bit, calling out the exciting new Cockpit system management console.

Moving on to the Fedora Cloud, Christoph asks for Matthew Miller in the audience to comment further on it. Matthew describes the pets vs. cattle metaphor and explains that Fedora Cloud is really meant to fill the “cattle” side of that metaphor. Matthew notes the work towards the Fedora Base Image and the Fedora Atomic effort.

Christoph notes that this is an excellent example of why Spins are not enough. For example, when distributing cloud images, it doesn’t really meet the definition of a Spin because it doesn’t install via anaconda.

The talk then moves on to discussing the remaining two working groups: “Base Design” and the “Environments and Stacks” groups. Talking about the “Base Design”, he stresses that the idea is for the base to be as small as possible and provide a common framework for the Products to build on. The “Environments and Stacks” working groups are focused on making developers’ lives easier, providing the ability to install known software (development) stacks, possibly different versions side-by-side.

Christoph summarizes that there has been a great deal of misinformation put out there and he calls out to the Ambassadors and everyone else to explain what’s really going on, how it works and why it’s a positive change in Fedora. The message must be positive, because the change is exciting and there’s much more to come. He cautions “it’s not just ‘the next fedora’, it’s ‘Fedora.next’.”

Daily Summary

It’s really hard to pick out one specific thing to say about Flock’s first day. Every one of the speakers I listened to today were excited, engaging and clearly love Fedora, warts and all. I think I’ll leave it there for today.


Flock to Fedora: Day One (Morning)

Overview

Today was quite a day! I attended many interesting sessions and it’s going to be quite difficult to keep this post to a reasonable length. I make no promises that I will succeed at that. Here’s a set of highlights, mostly my stream-of-consciousness notes from the sessions I attended. I’ll pare down the best details in a final wrap-up post on Saturday or Sunday. For now, consider this an exhaustively detailed set of live notes published as a digest.

This has gotten quite long as I write, so I’m going to break these up into morning and afternoon reports instead. Enjoy!

Sessions

Free and Open Source Software in Europe: Policies and Implementations

The first talk I attended today was the keynote by Gijs Hellenius, Free and Open Source Software in Europe: Policies and Implementations. It was an excellent overview of the successes and issues surrounding the use of Open Source software in the European public sector. The good news was that more and more politicians are identifying the value of Open Source, providing lower costs and avoidance of lock-in. The bad news is that extensive lobbying by large proprietary companies has done a lot of damage, thus preventing more widespread adoption.

That is all in terms of usage. Unfortunately, the situation soured somewhat because developers of public sector code are hesitant about making their code open-source (or contributing to other open-source projects directly). It was noted that the legal questions have all been answered (and plenty of examples of open-sourced public development exist). Unfortunately, a culture of conservatism exists and has slowed this advance.

A highlight of the talk was the Top three most visible open source implementations section. The first of these was the French Gendarmie who switched 72,000 systems over to Ubuntu Linux and LibreOffice Desktops. While this author would certainly prefer that they had selected an RPM-based distribution, I have to bow to the reported 80% cost reduction.

The second major win was the conversion of 32,000 PCs in government and healthcare in the Spain Extremadura areas (mostly Debian).

The third one was the City Administration of Munich, who also converted 14,800 workstations from Windows to Ubuntu Linux and LibreOffice. Excellent quote on his slide (paraphrased because it went by too quickly): “Bill Gates: Why are you doing this, Mr. Ude?” “Ude: To gain freedom.” “Gates: Freedom from what?” “Ude: From you, Mr. Gates.”

Fedora QA – What Why How…?

The second talk I attended today was an introduction to Fedora QA from Amita Sharma, a Red Hat senior quality engineer.

Amita started by talking about why Fedora QA is important, citing that Fedora’s fast schedule would otherwise lead to a very buggy and unpleasant system to use. She showed us a few bugs in Bugzilla to explain how easy it can be to participate in QA. In the simplest sense, it’s no more than spotting a bug and reporting it.

Next she discussed a bit about the team composition, meeting schedule and duties. This involved things like update and release testing, creation of automation and validation testing, filing bugs, weekly meetings and scheduling and execution of Fedora Test Days.

Some of the interesting bits that Amita pointed out were some of the various available resources for Fedora QA. She made sure to call out the daily Rawhide Report emails to the devel@lists.fedoraproject.org mailing list, as well as Adam Williamson’s Rawhide Watch blog and the regular Security Update test notice.

She next launched into a detailed explanation of the karma process for Fedora testing updates. She explained how it is used to gate updates to the stable repository until users or QA testers validate that it works properly.

Amita moved on to describing the bug workflow process, from how to identify a bug to reporting it on Bugzilla and following it through to resolution.

The next phase of Amita’s talk focused on the planning, organization and execution of a Fedora Test Day, from proposing a Test Day to creating test cases and managing the process on IRC.

Where’s Wayland?

I was originally planning to attend the Taskotron talk by Tim Flink, but the room was over capacity, so I went over to the Wayland talk being given by Matthias Clasen (which was my close second choice).

I arrived a few minutes late, so I missed the very beginning and it took me a bit of time to catch up. Matthias was describing Wayland’s capabilities and design decisions at a very low level, in many cases over my head, when I came in, so I really regretted missing the beginning.

One of the fundamental principles of Wayland is that clients are fully isolated. There’s no root window and there’s no concept of where your window is positioned relative to other windows, no “grabs” of input and no way for one Wayland application’s session to interfere directly with another. (Author’s note: this is a key security feature and a major win over X11). Wayland will allow certain clients to be privileged for special purposes such as taking screenshots and accessibility tools. (Author’s note: I had a conversation earlier in the day with Hans de Goede where we talked about implementing Android-style “intents” for such privileged access so that no application can perform display-server-wide actions without express user permission. I think this is a great idea and we need to work out with user-experience designers how best to pull this off.)

Another interesting piece that Wayland will offer is multiple interfaces. For example, it will be much easier to implement a multi-seat setup with Wayland than it was with X11 historically.

Matthias then moved on to describe why we would want to replace X11. The biggest reason is to clean out a lot of cruft that has accumulated in the X11 protocol over the years (much of it unused in modern systems). Having both the compositor and renderer in the same engine also means that you can avoid a lot of excess and duplicated work in each of the window managers.

One very important piece is the availability of sandboxing. In X11, any application talking to the server has access to the entire server. This potentially means that any compromised X application can read anything made available to X from any other application. With Wayland, it becomes very easy to isolate the display server information between processes, which makes it much more difficult to escalate privileges.

Matthias also covered some of the concerns (or arguable disadvantages), the biggest of which was that compositor crashes will now destroy the display session, whereas on older X11 approaches it was possible to restart the window manager and retain the display. Other issues involved driver support; for example the nVidia driver does not support Wayland. (Note: the Nouveau driver works just fine, it’s the proprietary driver that does not.)

The next topic was GNOME support of Wayland, which has come a very long way in the last two years. GNOME 3.10 and later has experimental support for running on Wayland. The remaining gaps on Fedora 21 are mainly drag-and-drop support, input configuration and WACOM support. (Author’s note: I’ve played with GNOME 3.13.3 running atop Wayland. Much of the expected functionality works stably, but I wouldn’t yet recommend it for daily use. Given the rate of improvement I’ve seen, this recommendation may change at F21 GA release).

Matthias made a rather bold statement that the goal to hit before making Wayland the default in Fedora is that end-users should not notice any difference (Author’s note: presumably negative difference) in their desktop.


Threat: Dave the Service Technician

ComputerServiceSmall

Dave is responsible for adding, upgrading and repairing systems. Without Dave, things will quickly go downhill in your data center.

While Dave is responsible for maintaining system integrity, he can also compromise it:

  • A drive has failed in a RAID5 set. You need to replace the failed drive and rebuild the RAID. Oops! Pulled the wrong drive. The RAID set has gone from degraded to dead. Time for a recovery operation!
  • Server17 in a rack of 36 1U “pizza box” servers needs to be power cycled. Dave hits the power button on Server18…
  • There is a short circuit in power distrubution unit in the server rack. Now you have 36 systems down!
  • Dave moves the wrong network cable in the wiring closet.
  • Don’t even think about what happens if Dave slips and bumps the Big Red Button!

EmergencyPowerOff

And if Dave happens to be malevolent, he can do things like:

  • Slip a laptop or other small computer into the wiring closet and have it snoop the internal network for data.
  • Connect internal networks directly to the Internet.
  • Steal parts, supplies, and even complete systems. Look at the number of cases where good boards are replaced and then sold on Ebay…

Basically, Dave is a proxy for all of the physical threats to system integrity that can occur in the data center.


Flock to Fedora: Intro

Today was a big day: we kicked off the first of the four-day Flock conference. This is the second time we’ve run a Flock, which is the largest annual gathering of contributors to the Fedora Project. There are a great many excellent talks scheduled for Flock this year, and I’ll only be able to attend a handful of them. Fortunately, all of the talks (but not the hackfests and workshops, sorry!) will be streamed live and recorded at the Flock YouTube Channel. My plan is to take a few notes on each of the talks I attend and put up a summary each day with the highlights. Come back tonight for my review of the first day!

 

Edit: See also Máirín Duffy’s blog for an absolutely excellent breakdown of how to follow along with Flock remotely.

 


August 04, 2014

Threat: Sally the User

SallyUser

Unlike Sam the Disgruntled Employee from our last post, Sally doesn’t have an evil bone in her body. She is dedicated, hardworking, helpful, and committed to doing a good job.

Unfortunately, she doesn’t completely understand how the system works, and sometimes enters incorrect data.

Actually, this isn’t her fault – Tom the Programmer from a few posts back probably didn’t write a usable system! I’m convinced that “Enterprise Software” means software that is hideously expensive with a poor user interface that no one would voluntarily use. I often use the phrase as user friendly as a rabid weasel to describe software, and much of the mission critical software that companies run on meets this description. But, that is a digression – let’s get back to the main point.

Since Sally is helpful and considerate, she is likely to give Fred the System Administrator her password when he calls. This isn’t just a Sally issue; virtually everyone is vulnerable to social engineering; look at the success of spear phishing against senior executives.

Sally is also likely to let Sam the Disgruntled Employee use her system if he asks with a plausible reason.

Sally is representative of the majority of people in your company. She works hard and wants to do the right thing. The systems – both computer systems and corporate procedures – need to support her in getting her job done, be resistant to mistakes, and prevent malevolent entities from using her as an attack vector. This will be a combination of training, system design, software design, management, operations, and company policies and procedures.

Basically, systems need to be designed to help Sally succeed and help prevent her from failing. This is the last place to use a heavy handed blame the employee for everything policy – it is both counter-productive and ineffective.

To be blunt, the problems you have with Sally are system failures, not user failures – the system isn’t designed to be used by typical users in the real world. In many cases the security model is much like the old physics approach of simplifying things to make it easier to deal with, where a problem statement will begin with: “Postulating a spherical cow in a vacuum, what is the trajectory…”

Unfortunately, such idealizations fall apart when real world factors come into play!


July 31, 2014

Threat: Sam the Disgruntled Employee

SamDisgruntled

I’m going to assert that Sam is the second greatest security you face. (We will encounter the greatest thread in a few more posts.) Depending on who you talk to, between 60% and 90% of corporate losses due to theft and fraud and from employees, not external threats.

This may be overstated in some areas; a lot of credit card theft and identify theft is external. See, for example, the theft of over 50M credit cards numbers at Target. Still, much of the real world theft is internal.

Sam is unhappy with your company. He wants to take from it or cause hurt. Sam may be committing fraud, copying internal documents to take to a competitor, posting damaging information on the Internet, or walking out the door in the evening with a bag full of your products or supplies.

You need to both watch for disgruntled employees and to minimize the damage they can do. Good management and good internal controls are your first line of defense. Constant awareness and vigilance are called for.

Above all, watch the people side. In some cases Sam is simply unethical – you need to find him and remove him. In other cases he is angry – this is often a management issue. In many cases he simply sees an opportunity that he can’t resist; solid internal controls will minimize this risk.

In any case, be aware that your greatest threats are usually inside your company, not outside of it!


July 30, 2014

Controlling access to smart cards

Smart cards are increasingly used in workstations as an authentication method. They are mainly used to provide public key operations (e.g., digital signatures) using keys that cannot be exported from the card. They also serve as a data storage, e.g., for the corresponding certificate to the key. In RHEL and Fedora systems low-level access to smart cards is provided using the pcsc-lite daemon, an implementation of the PC/SC protocol, defined by the PC/SC industry consortium. In brief the PC/SC protocol allows the system to execute certain pre-defined commands on the card and obtain the result. The implementation on the pcsc-lite daemon uses a privileged process that handles direct communication with the card (e.g., using the CCID USB protocol), while applications can communicate with the daemon using the SCard API. That API hides, the underneath communication between the application and the pcsc-lite daemon which is based on unix domain sockets.

However, there is a catch. As you may have noticed there is no mention of access control in the communication between applications and the pcsc-lite daemon. That is because it is assumed that the access control included in smart cards, such as PINs, pinpads, and biometrics, would be sufficient to counter most threats. That isn’t always the case. As smart cards typically contain embedded software in the form of firmware there will be bugs that can be exploited by a malicious application, and these bugs even if known they are not easy nor practical to fix. Furthermore, there are often public files (e.g., without the protection of a PIN) present on a smart card that while they were intended to be used by the smart card user, it is not always desirable to be accessible by all system users. Even worse, there are certain smart cards that would allow any user of a system to erase all smart card data by re-initializing it. All of these led us to introduce additional access control to smart cards, in par with the access control used for external hard disks. The main idea is to be able to provide fine-grained access control on the system, and specify policies such as “the user on the console should be able to fully access the smart card, but not any other user”. For that we used polkit, a framework used by applications to grant access to privileged operations. The reason of this decision is mainly because polkit has already been successfully used to grant access to external hard disks, and unsurprisingly the access control requirements for smart cards share many similarities with removable devices such as hard disks.

The pcsc-lite access control framework is now part of pcsc-lite 1.8.11 and will be enabled by default in Fedora 21. The advantages that it offers is that it can prevent unauthorized users from issuing commands to smart cards, and prevent unauthorized users from reading, writing or (in some cases) erasing any public data from a smart card. The access control is imposed during the session initialization, thus reducing to minimal any potential overhead. The default policy in Fedora 21 will treat any user on the console as authorized, as physical access to the console implies physical access to the card, but remote users, e.g., via ssh, or system daemons will be treated as unauthorized unless they have administrative rights.

Let’s now see how the smart card access control can be administered. The system-wide policy for pcsc-lite daemon is available at /usr/share/polkit-1/actions/org.debian.pcsc-lite.policy. That file is a polkit XML file that contains the default rules needed to access the daemon. The default policy that will be shipped in Fedora 21 consists of the following.

  <action id="org.debian.pcsc-lite.access_pcsc">
    <description>Access to the PC/SC daemon</description>
    <message>Authentication is required to access the PC/SC daemon</message>
    <defaults>
      <allow_any>auth_admin</allow_any>
      <allow_inactive>auth_admin</allow_inactive>
      <allow_active>yes</allow_active>
    </defaults>
  </action>

  <action id="org.debian.pcsc-lite.access_card">
    <description>Access to the smart card</description>
    <message>Authentication is required to access the smart card</message>
    <defaults>
      <allow_any>auth_admin</allow_any>
      <allow_inactive>auth_admin</allow_inactive>
      <allow_active>yes</allow_active>
    </defaults>
  </action>

The syntax format is explained in more details in the polkit manual page. The pcsc-lite relevant parts are the action IDs. The action with ID “org.debian.pcsc-lite.access_pcsc” contains the policy in order to access the pcsc-lite daemon and issue commands to it, i.e., access the unix domain socket. The latter action with ID “org.debian.pcsc-lite.access_card” contains the policy to issue commands to smart cards available to the pcsc-lite daemon. That distinction allows for example programs to query the number of readers and cards present, but not issue any commands to them. Under both policies only active (console) processes are allowed to access the pcsc-lite daemon and smart cards, unless they are privileged processes.

Polkit, is quite more flexible though. With it we can provide even more fine-grained access control, e.g., to specific card readers. For example, if we have a web server that utilizes a smart card we can restrict it to use only the smart cards under a given reader. These rules are expressed in Javascript and can be added in a separate file in /usr/share/polkit-1/rules.d/. Let’s now see how the rules for our example would look like.

polkit.addRule(function(action, subject) {
    if (action.id == "org.debian.pcsc-lite.access_pcsc" &&
        subject.user == "apache") {
            return polkit.Result.YES;
    }
});

polkit.addRule(function(action, subject) {
    if (action.id == "org.debian.pcsc-lite.access_card" &&
        action.lookup("reader") == 'name_of_reader' &&
        subject.user == "apache") {
            return polkit.Result.YES;    }
});

Here we add two rules. The first one allows the user “apache”, which is the user the web-server runs under, to access the pcsc-lite daemon. That rule explicitly allows access to the daemon because in our default policy only administrator and console user can access it. The latter rule, it allows the same user to access the smart card reader identified by “name_of_reader”. The name of the reader can be obtained using the commands pcsc_scan or opensc-tool -l.

With these changes to pcsc-lite we manage to provide reasonable default settings for the users of smart cards that apply to most, if not all, typical uses. These default settings increase the overall security of the system, by denying access to the smart card firmware, as well as to data and operations for non-authorized users.

July 29, 2014

Threat: Tom the Programmer

TomProgrammer

No discussion of system integrity and security would be complete without Tom.

Without the applications, tools, and utilities that Tom writes, computers would be nothing but expensive space heaters. Software, especially applications software, is the reason computers exist.

Tom is a risk because of the mistakes that he might make – mistakes that can crash an application or even an entire system, mistakes that can corrupt or lose data, and logic errors that can produce erroneous results.

Today, most large applications are actually groups of specialized applications working together. The classic example is three tier applications which include a database tier, a business logic tier, and a presentation tier. Each tier is commonly run on a different machine. The presentation and business logic tiers are commonly replicated for performance, and the database tier is often configured with fail-over for high availability. Thus, you add complex communications  between these application components as well as the challenge of developing and upgrading each component. It isn’t surprising that problems can arise! Building and maintaining these applications is much more challenging than a single application on a single system.

Tom is also a risk because of the things he can do deliberately – add money to his bank account, upload credit card data to a foreign system, steal passwords and user identity, and a wide range of other “interesting” things.

If Tom works for you, look for integrity as well as technical skills.

Be aware that behind every software package is a programmer or a team of programmers. They are like fire – they can do great good or great damage. And, like fire, it is easy to overlook them until something bad happens.


OTP authentication in FreeIPA

As of release 4.0.0, FreeIPA supports OTP authentication. HOTP and TOTP tokens are supported natively, and there is also support for proxying requests to a separately administered RADIUS server.

To become more familiar with FreeIPA and its capabilities, I have been spending a little time each week setting up scenarios and testing different features. Last week, I began playing with a YubiKey for HOTP authentication. A separate blog about using YubiKey with FreeIPA will follow, but first I wanted to post about how FreeIPA’s native OTP support is implemented. This deep dive was unfortunately the result of some issues I encountered, but I learned a lot in a short time and I can now share this information, so maybe it wasn’t unfortunate after all.

User view of OTP

A user has received or enrolled an OTP token. This may be a hardware token, such as YubiKey, or a software token like FreeOTP for mobile devices, which can capture the token simply by pointing the camera at the QR code FreeIPA generates.

When logging in to an IPA-backed service, the FreeIPA web UI, or when running kinit, the user uses their token to generate a single-use value, which is appended to their usual password. To authenticate the user, this single-use value is validated in addition to the usual password validation, providing an additional factor of security.

HOTP algorithm

The HMAC-based One-Time Password (HOTP) algorithm uses a secret key that is known to the validation server and the token device or software. The key is used to generate an HMAC of a monotonically increasing counter that is incremented each time a new token is generated. The output of the HMAC function is then truncated to a short numeric code – often 6 or 8 digits. This is the single-use OTP value that is transmitted to the server. Because the server knows the secret key and the current value of the counter, it can validate the value sent by the client.

HOTP is specified in RFC 4226. TOTP (Time-based One-Time Password), specified in RFC 6238, is a variation of HOTP that MACs the number of time steps since the UNIX epoch, instead of a counter.

Authentication flow

The problem I encountered was that HOTP authentication (to the FreeIPA web UI) was failing about half the time (there was no discernable pattern of failure). The FreeIPA web UI seemed like a logical place to start investigating the problem, but for a password (and OTP value) it is just the first port of call in a journey through a remarkable number of services and libraries.

Web UI and kinit

The ipaserver.rpcserver.login_password class is responsible for handling the password login process. It’s implementation reads request parameters and calls kinit(1) with the user credentials. Its (heavily abridged) implementation follows:

class login_password(Backend, KerberosSession, HTTP_Status):
    def __call__(self, environ, start_response):
        # Get the user and password parameters from the request
        query_dict = urlparse.parse_qs(query_string)
        user = query_dict.get('user', None)
        password = query_dict.get('password', None)

        # Get the ccache we'll use and attempt to get
        # credentials in it with user,password
        ipa_ccache_name = get_ipa_ccache_name()
        self.kinit(user, self.api.env.realm, password, ipa_ccache_name)
        return self.finalize_kerberos_acquisition(
            'login_password', ipa_ccache_name, environ, start_response)

    def kinit(self, user, realm, password, ccache_name):
        # get http service ccache as an armor for FAST to enable
        # OTP authentication
        armor_principal = krb5_format_service_principal_name(
            'HTTP', self.api.env.host, realm)
        keytab = paths.IPA_KEYTAB
        armor_name = "%sA_%s" % (krbccache_prefix, user)
        armor_path = os.path.join(krbccache_dir, armor_name)

        (stdout, stderr, returncode) = ipautil.run(
            [paths.KINIT, '-kt', keytab, armor_principal],
            env={'KRB5CCNAME': armor_path}, raiseonerr=False)

        # Format the user as a kerberos principal
        principal = krb5_format_principal_name(user, realm)

        (stdout, stderr, returncode) = ipautil.run(
            [paths.KINIT, principal, '-T', armor_path],
            env={'KRB5CCNAME': ccache_name, 'LC_ALL': 'C'},
            stdin=password, raiseonerr=False)

We see that the login_password object reads credentials out of the request and invokes kinit using those credentials, over an encrypted FAST (flexible authentication secure tunneling) channel. At this point, the authentication flow is the same as if a user had invoked kinit from the command line in a similar manner.

KDC

Recent versions of the MIT Kerberos key distrubution centre (KDC) have support for OTP preauthentication. This preauthentication mechanism is specified in RFC 6560.

The freeipa-server package ships the ipadb.so KDC database plugin that talks to the database over LDAP to look up principals and their configuration. In this manner the KDC can find out that a principal is configured for OTP authentication, but this is not where OTP validation takes place. Instead, an OTP-enabled principal’s configuration tells the KDC to forward the credentials elsewhere for validation, over RADIUS.

ipa-otpd

FreeIPA ships a daemon called ipa-otpd. The KDC communicates with it using the RADIUS protocol, over a UNIX domain socket. When ipa-otpd receives a RADIUS authentication packet, it queries the database over LDAP to see if the principal is configured for RADIUS or native OTP authentication. For RADIUS authentication, it forwards the request on to the configured RADIUS server, otherwise it attempts an LDAP BIND operation using the passed credentials.

As a side note, ipa-otpd is controlled by a systemd socket unit. This is an interesting feature of systemd, but I won’t delve into it here. See man 5 systemd.socket for details.

Directory server

Finally, the principal’s credentials – her distinguished name and password with OTP value appended – reach the database in the form of a BIND request. But we’re still not at the bottom of this rabbit hole, because 389 Directory Server does not know how to validate an OTP value or indeed anything about OTP!

Yet another plugin to the rescue. freeipa-server ships the libipa_pwd_extop.so directory server plugin, which handles concepts such as password expiry and – finally – OTP validation. By way of this plugin, the directory server attempts to validate the OTP value and authenticate the user, and the whole process that led to this point unwinds back through ipa-otpd and the KDC to the Kerberos client (and through the web UI to the browser, if this was how the whole process started).

Diagram

My drawing skills leave a lot to be desired, but I’ve tried to summarise the preceding information in the following diagram. Arrows show the communication protocols involved; red arrows carry user credentials including the OTP value. The dotted line and box show the alternative configuration where ipa-otpd proxies the token on to an external RADIUS server.

image

Debugging the authentication problem

At time of writing, I still haven’t figured out the cause of my issue. Binding directly to LDAP using an OTP token works every time, so it definitely was not an issue with the HOTP implementation. Executing kinit directly fails about half the time, so the problem is likely to be with the KDC or with ipa-otpd.

When the failure occurs, the dirsrv access log shows two BIND operations for the principal (in the success case, there is only one BIND, as would be expected):

[30/Jul/2014:02:58:54 -0400] conn=23 op=4 BIND dn="uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local" method=128 version=3
[30/Jul/2014:02:58:54 -0400] conn=23 op=4 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local"
[30/Jul/2014:02:58:55 -0400] conn=37 op=4 BIND dn="uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local" method=128 version=3
[30/Jul/2014:02:58:55 -0400] conn=37 op=4 RESULT err=49 tag=97 nentries=0 etime=0

The first BIND operation succeeds, but for some reason, one second later, the KDC or ipa-otpd attempts to authenticate again. It would make sense that the same credentials are used, and in that case the second BIND operation would fail (error code 49 means invalid credentials) due to the HOTP counter having been incremented in the database.

ipa-otpd does some logging via the systemd journal facility, so it was possible to observe its behaviour via journalctl --follow /usr/libexec/ipa-otpd. The log output for a failed login showed two requests being send by the KDC, thus exonerating ipa-otpd:

Aug 04 02:44:35 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: request received
Aug 04 02:44:35 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: user query start
Aug 04 02:44:35 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: user query end: uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local
Aug 04 02:44:35 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: bind start: uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local
Aug 04 02:44:36 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: request received
Aug 04 02:44:36 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: user query start
Aug 04 02:44:37 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: user query end: uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local
Aug 04 02:44:37 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: bind start: uid=ftweedal,cn=users,cn=accounts,dc=ipa,dc=local
Aug 04 02:44:37 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: bind end: success
Aug 04 02:44:37 ipa-2.ipa.local ipa-otpd[3910]: ftweedal@IPA.LOCAL: response sent: Access-Accept
Aug 04 02:44:38 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: bind end: Invalid credentials
Aug 04 02:44:38 ipa-2.ipa.local ipa-otpd[3935]: ftweedal@IPA.LOCAL: response sent: Access-Reject

The KDC log output likewise showed two KRB_AS_REQ requests coming from the client (i.e. kinit) – one of these resulted in a ticket being issued, and the other resulted in a KDC_ERR_PREAUTH_FAILED response. Therefore, after all this investigation, the cause of the problem seems to be aggressive retry behaviour in kinit.

I had been testing with MIT Kerberos version 1.11.5 from the Fedora 20 repositories. A quick scan of the Kerberos commit log turned up some promising changes released in version 1.12. Since the Fedora package for 1.11 includes a number of backports from 1.12 already, I backported the most promising change: one that relaxes the timeout if kinit connects to the KDC over TCP. Unfortunately, this did not fix the issue.

I was curious whether version the 1.12 client exhibited the same behaviour. The Fedora 21 repositories have MIT Kerberos version 1.12, so I installed a preview release and enrolled the host. OTP authentication worked fine, so the change I backported to 1.11 was either the wrong change, or needed other changes to work properly.

Since HOTP authentication in FreeIPA is somewhat discouraged due to the cost and other implications of counter synchronisation in a replicated environment, and since the problem seems to be rectified in MIT Kerberos 1.12, I was happy to conclude my investigations at this point.

Concluding thoughts

OTP authentication in FreeIPA involves a lot of different servers, plugins and libraries. To provide the OTP functionality and make all the services work together, freeipa-server ships a KDC plugin, a directory server plugin, and the ipa-otpd daemon! Was it necessary to have this many moving parts?

The original design proposal explains many of the design decisions. In particular, ipa-otpd is necessary for a couple of reasons. The first is the fact that the MIT KDC supports only RADIUS servers for OTP validation, so for native OTP support we must have some component act as a RADIUS server. Second, the KDC radius configuration is static, so configuration is simplified by having the KDC talk only to ipa-otpd for OTP validation. It is also nice that ipa-otpd is the sole arbiter of whether to proxy a request to an external RADIUS server or to attempt an LDAP BIND.

What if the KDC could dynamically work out where to direct RADIUS packets for OTP validation? It is not hard to conceieve of this, since it already dynamically learns whether a principal is configured for OTP by way of the ipadb.so plugin. But even if this were possible, the current design is arguably preferable since, unlike the KDC, we have full control over the implementation of ipa-otpd and are therefore better placed to respond to performance or security concerns in this aspect of the OTP authentication flow.

July 25, 2014

Threat: Fred the System Administrator

FredSysadmin

In terms of threat potential, Fred is off the charts. In order to do his job, he has essentially uncontrolled access to all computer resources. Fred can damage software and data in obvious or subtle ways. He can wipe out users, steal data, and wreak almost unimaginable carnage.

Fortunately, the vast majority of system administrators are conscientious, professional and honest. They are a force for good, committed to keeping systems running smoothly, data protected, and users productive.

Fred is a risk to system integrity in two ways – accidentally and deliberately.

Most of the time, the greatest threat from Fred is that he doesn’t have the resources he needs to do his job or that his hands are tied by management edicts. These factors can cause system administrators to do (or not do) things that threaten system integrity and security. If Fred is denied budget for proper backups, data is at risk. If Fred is ordered to punch a hole through the firewall to allow sales people access to the orders database, without VPN and proper authentication, systems are at risk. If Fred is ordered to allow contractors access to internal networks – see the Target case – the entire network can be exposed.

In the Target case it isn’t clear if the issue was do to a network design problem or if there were orders to provide this access. This would be interesting to know.

If Fred does go bad, there is almost no limit to the damage he can do. Even if he doesn’t compromise systems he can commit identity or credit card theft or steal company – or even national – confidential data. I don’t think I have to do more than mention the name Edward Snowden…

A number of things can be done to mitigate the threats that Fred presents:

  • Recruit and hire system administrators carefully! Look for proof of integrity as well as technical skills.
  • Ensure that your sysadmins have the training, resources and management support to do their jobs.
  • “Trust but verify.” Have regular system audits. Ensure that system access and changes are logged to a secure remote logging server. Look at the log files! Apply technology, process, and people to maintaining the integrity of system management.
  • Divide responsibilities. Large companies will have separate organizations responsible for systems, networks, storage and applications. Divide up the work and accountability to address both functional and system integrity needs.
  • Focus on detection, mitigation and remediation more than prevention. Go talk to your colleagues in Finance – they have hundreds of years of experience working with high value systems.  You will be surprised at what you can learn from them. They have evolved a model that is designed to prevent theft and misuse where possible, to detect it when it does occur, and to minimize losses. They are aware that you can’t stop everything while keeping the business going – but you should be able to minimize losses and to discover things eventually. Find out how they do policies and procedures, the ethical and business guidelines they follow, how they implement internal controls, and how they balance risk and cost. Hint: it isn’t worthwhile spending $10,000 to stop $20 in losses. But if someone is stealing $10 here and $10 there, you want to find out about it before it grows.

 


July 24, 2014

Devstack mounted via NFS

Devstack allows the developer to work with the master branches for upstream OpenStack development. But Devstack performs many operations (such as replacing pip) that might be viewed as corrupting a machine, and should not be done on your development workstation. I’m currently developing with Devstack on a Virtual Machine running on my system. Here is my setup:

Both my virtual machine and my Base OS are Fedora 20. To run a virtual machine, I use KVM and virt-manager. My VM is fairly beefy, with 2 GB of Ram allocated, and a 28 GB hard disk.

I keep my code in git repositories on my host laptop. To make the code available to the virtual machine, I export them via NFS, and mount them on the host VM in /opt/stack, owned by the ayoung user, which mirrors the setup on the base system.

Make sure NFS is running with:

sudo systemctl enable nfs-server.service 
sudo systemctl start  nfs-server.service

My /etc/exports:

/opt/stack/ *(rw,sync,no_root_squash,no_subtree_check)

And to enable changes in this file

sudo exportfs

Make sure firewalld has the port for nfs open, but only for the internal network. For me, this is interface

virbr0: flags=4163 UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255

I used the firewall-config application to modify firewalld:

For both, make sure the Configuration select box is set on Permanent or you will be making this change each time you reboot.

First add the interface:

firewalld-nfs-interfaces

And enable NFS:

firewalld-nfs-ports

In the Virtual machine, I added a user (ayoung) with the same numeric userid and group id from my base laptop. To find these values:

$ getent passwd ayoung
ayoung:x:14370:14370:Adam Young:/home/ayoung:/bin/bash

I admit I created them when I installed the VM, which I did using the Anaconda installer and a DVD net-install image. However, the same thing can be done using user-add. I also added the user to the wheel group, which simplifies sudo.

On the remote machine, I created /opt/stack and let the ayoung user own them:

$ sudo mkdir /opt/stack ; sudo chown ayoung:ayoung /opt/stack

To mount the directory via nfs, I made an /etc/fstab entry:

192.168.122.1:/opt/stack /opt/stack              nfs4  defaults 0 0 

And now I can mount the directory with:

$ sudo mount /opt/stack

I went through and updated the git repos in /opt/stack using a simple shell script.

 for DIR in `ls` ; do pushd $DIR ; git fetch ; git rebase origin/master ; popd ; done

The alternative is setting RECLONE=yes in /opt/stack/devstack/localrc.

When running devstack, I had to make sure that the directory /opt/stack/data was created on the host machine. Devstack attempted to create it, but got an error induced by nfs.

Why did I go this route? I need to work on code running in HTTPD, namely Horizon and Keystone. THat preclueded me from doing all of my work in a venv on my laptop. The NFS mount gives me a few things:

  • I keep my Git repo intact on my laptop. This includes the Private key to access Gerrit
  • I can edit using PyCharm on my Laptop.
  • I am sure that the code on my laptop and in my virtual machine is identical.

This last point is essential for remote debugging. I just go this to work for Keystone, and have submitted a patch that enables it for Keystone. I’ll be working up something comparable for Horizon here shortly.

July 21, 2014

Threats: William the Manager
William the Manager

William the Manager

William is concerned with his group getting their job done. He is under budget pressure, time pressure, and requirements to deliver. William is a good manager – he is concerned for his people and dedicated to removing obstacles that get in their way.

To a large degree William is measured on current performance and expectations for the next quarter. This means that he has little sympathy for other departments getting in the way of his people making the business successful! A lot of his job involves working with other groups to make sure that they meet his needs. And when they don’t, he gets them over-ruled or works around them.

When William does planning – and he does! – he is focused on generating business value and getting results that benefit him and his team. He is not especially concerned about global architecture or systems design or “that long list of hypothetical security issues”. Get the job done, generate value for the company, and move on to the next opportunity.

William sees IT departments as an obstacle to overcome – they are slow, non-responsive, and keep doing things that get in the way of his team. He sees the security team in particular as being an unreasonable group of people who have no idea what things are like in the real world, and who seem be be dedicated to coming up with all sorts of ridiculous requirements that are apparently designed to keep the business from succeeding.

William, with the best of intentions, is likely to compromise and work around security controls – and often gets the support of top management in doing this. To be more blunt, if security gets in the way, it is gone! If a security feature interferes with getting work done, he will issue orders to turn that feature off. If you look at some of my other posts on the value of IT and computer systems, such as Creating Business Value, you will see that, at least in some cases, William may be right.

And this is assuming that William is a good corporate citizen, looking out for the best interests of the company. If he is just looking out for himself, the situation can be much worse.

It is not enough to try to educate William on security issues – for one thing (depending on the security feature), William may be right! The only chance for security is to find ways to implement security controls that don’t excessively impact the business units. And to keep the nuclear option for the severe cases where it is needed, such as saving credit card numbers in plain text on an Internet facing system. (Yes, this can easily happen – for example, William might set up a “quick and dirty” ecommerce system on AWS if the IT group isn’t able to meet his needs.)


July 18, 2014

Oh No! I Committed to master! What do I do?

You were working in a git repo and you committed your change to master. Happens all the time. Panic not.

Here are the steps to recover

Create a new branch from your current master branch. This will include your new commit.  (to be clear, you should replace ‘description-of-work’  with a short name for your new branch)

git branch description-of-work

now reset your current master branch to upstream

git reset --hard origin/master

All fixed.

Why did this work?

A branch in git points to a specific commit.  All commits are named by hashes.  For example, right now, I have a keystone repo with my master branch pointing to

$ git show master
commit bbfd58a6c190607f7063d15a3e2836e40806ef57
Merge: e523119 f18911e
Author: Jenkins <jenkins@review.openstack.org>
Date: Fri Jul 11 23:36:17 2014 +0000

Merge "Do not use keystone's config for nova's port"

This is defined by a file in .git/refs:

$ cat .git/refs/heads/master 
bbfd58a6c190607f7063d15a3e2836e40806ef57

I could edit this file by hand and have the same effect as a git checkout. Lets do exactly that.

$ cp .git/refs/heads/master .git/refs/heads/edit-by-hand
$ git branch 
  edit-by-hand
* master
$ git show edit-by-hand 
commit bbfd58a6c190607f7063d15a3e2836e40806ef57
Merge: e523119 f18911e
Author: Jenkins <jenkins>
Date:   Fri Jul 11 23:36:17 2014 +0000

    Merge "Do not use keystone's config for nova's port"

OK, lets modify this the right way:

$ git checkout edit-by-hand 
Switched to branch 'edit-by-hand'
$ git reset --hard HEAD~1
HEAD is now at e523119 Merge "Adds hacking check for debug logging translations"
$ git show edit-by-hand 
commit e52311945a4ab3b47a39084b51a2cc596a2a1161
Merge: b0d690a 76baf5b
Author: Jenkins <jenkins>
Date:   Fri Jul 11 22:19:03 2014 +0000

    Merge "Adds hacking check for debug logging translations"
...

that made it in to:

$ cat .git/refs/heads/edit-by-hand 
e52311945a4ab3b47a39084b51a2cc596a2a1161

Here is the history for the edit-by-hand branch

$ git log edit-by-hand  --oneline

Returns

e523119 Merge "Adds hacking check for debug logging translations"
b0d690a Merge "multi-backend support for identity"
6aa0ad5 Merge "Imported Translations from Transifex"
...

I want the full hash for 6aa0ad5 so:

git show --stat 6aa0ad5
commit 6aa0ad5beb39107ffece6e5d4a068d77f7d51059

Lets set the branch to point to this:

$ echo 6aa0ad5beb39107ffece6e5d4a068d77f7d51059 > .git/refs/heads/edit-by-hand 
$ git show
commit 6aa0ad5beb39107ffece6e5d4a068d77f7d51059
Merge: 2bca93f bf8a2e2
Author: Jenkins <jenkins>
Date:   Fri Jul 11 21:48:09 2014 +0000

    Merge "Imported Translations from Transifex"

What would you expect now? I haven’t looked yet, but I would expect git to tell me that I have a bunch of unstaged changes; basically, everything that was in the commits on master that I chopped off of edit-by-hand. Lets look:

$ git status
On branch edit-by-hand
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   doc/source/configuration.rst
	modified:   doc/source/developing.rst
	modified:   etc/keystone.conf.sample
...
Untracked files:
  (use "git add <file>..." to include in what will be committed)

It was pretty long, so I cut out some files.

I can undo all those “changes” by setting the hash back to the original value:

$ echo e52311945a4ab3b47a39084b51a2cc596a2a1161   > .git/refs/heads/edit-by-hand 
$ git status
On branch edit-by-hand
Untracked files:
  (use "git add <file>..." to include in what will be committed)

Why did that work? Without the explicit “checkout” command, the current workspace was left unchanged.

July 16, 2014

Kerberos for Horizon and Keystone

I have a Horizon instance Proof of Concept. It has a way to go to be used in production, but the mechanism works.

This is not a how-to. I’ve written up some of the steps in the past. Instead, this is an attempt to illuminate some of the issues.

I started with a Packstack based all in one on Fedora 20 instance registered as a FreeIPA client. I hand modified Keystone to use HTTPD.

The Horizon HTTP instance is set up with S4U2Proxy. Since both Horizon and Keystone are on the same machine, it is both the source and target of the proxy rule; a user connecting via HTTPS gets a service ticket for Horizon, which then requests a delegated service ticket for itself. I’m not seeing any traffic on the KDC when this happens, which leads me to think that the Kerberos library is smart enough to reuse the initial service ticket for the user. However, I’ve also tested S4U2Proxy from HTTPD to Keystone on a remote machine in an earlier set up, and am fairly certain that this will work when Horizon and Keystone are not co-located.

After initial configuration, I did a git clone of the repositories for the projects I needed to modify:

  • django_openstack_auth
  • keystone
  • python-keystoneclient
  • python-openstackclient

To use this code, I switched to each directory and ran:

sudo pip install -e .

Horizon

overseas-broadening-horizons630x354

Horizon uses form based authentication. I have not modified this. Longer term, we would need to determine what UI to show based on the authentication mechanism. I would like to be able to disable Form Based authentication for the Kerberos case, as I think passing your Kerberos password over the wire is one of the worst security practices; We should actively discourage it.

Django OpenStack Auth and Keystone Client

Django_Reinhardt,_Aquarium,_New_York,_N.Y.,_ca._Nov._1946_(William_P._Gottlieb_07311)

Django Reinhardt

Horizon uses a project called django-openstack-auth that communicates with Keystone client. This needs to work with client auth plugins. I’ve hacked this in code, but the right solution is for it to get the auth plugin out of the Django configuration options. Implied here is that django-openstack-auth should be able to use keystoneclient sessions and V3 Keystone authentication.

keystone_2705841439_067c16b192

When a user authenticates to Horizon, they do not set a project. Thus, Horizon does not know what project to pass in the token request. The Token API has some strange behaviour when it comes to token requests without an explicit scope. If the user has a default project set, they get a token scoped to that project. If they do not have a default project set, they get an unscoped token.

Jamie Lennox has some outstanding patches to Keystone client that address the Kerberos use cases. Specifically, if the client gets an unscoped token, there is no service catalog. Since the client makes calls to Keystone based on endpoints in the service catalog, it cannot use an unscoped token. One of Jamie’s patches address this; if there is no service catalog, continue to use the Auth URL to talk to Keystone. This is a bit of an abuse of the Auth URL and really should only be used to get the list of domains or projects for a user. Once the user has this information, they can request a scoped token. This is what Horizon needs to do on behalf of the user.

While Keystone can use Kerberos as an “method” value when creating a token, the current set of plugins did not allow for mapping to the DefaultDomain. There is a plugin for “external” to do that, and I subclassed that for Kerberos. There is an outstanding ticket for removing the “method” value from the plugin implementations, which will reduce the number of classes we need to implement common behavior.

To talk to a Kerberos protected Keystone, the Keystone client needs to use the Kerberos Auth plugin. However, it can only use this to get a token; the auth plugin does not handle other communication. Thus, only the /auth/tokens path should be Kerberos protected. Here is what I am using:

<Location "/keystone/krb">
  LogLevel debug
  WSGIProcessGroup keystone_krb_wsgi
  NSSRequireSSL
</Location>

<Location "/keystone/krb/v3/auth/tokens">
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  # defaults off, but to be explicit
  # Keystone should not be a proxy 
  KrbConstrainedDelegation off
  Require valid-user
  NSSRequireSSL
</Location>

<Location "/dashboard/auth/login/">
  LogLevel debug
  AuthType Kerberos
  AuthName "Kerberos Login"
  KrbMethodNegotiate on
  KrbMethodK5Passwd off
  KrbServiceName HTTP
  KrbAuthRealms IPA.CLOUDLAB.FREEIPA.ORG
  Krb5KeyTab /etc/httpd/conf/openstack.keytab
  KrbSaveCredentials on
  KrbLocalUserMapping on
  KrbConstrainedDelegation on
  Require valid-user
  NSSRequireSSL
</Location>

To go further, the Kerberos protection should be optional if you wish to allow other authentication methods to be stacked with Kerberos.

This implies that the value:

Require valid-user

Should be in the HTTPD conf section for Horizon, but not for Keystone. This does not yet work for me, and I will investigate further.

Herakles_menangkap_Kerberos

Manage Kerberos like a Hero

Secure web communications require cryptography. For authentication, the two HTTP based standards are Client Side Certificates and Kerberos. Of the two, only Kerberos allows for constrained delegation in a standardized way. Making Kerberos part of the standard approach to OpenStack will lead to a more secure OpenStack.