February 09, 2012

Is GPL usage really declining?
Matthew Aslett wrote about how the proportion of projects released under GPL-like licenses appears to be declining, at least as far as various sets of figures go. But what does that actually mean? In absolute terms, GPL use has increased - any change isn't down to GPL projects transitioning over to liberal licenses. But an increasing number of new projects are being released under liberal licenses. Why is that?

The figures from Black Duck aren't a great help here, because they tell us very little about the software they're looking at. FLOSSmole is rather more interesting. I pulled the license figures from a few sites and found the following proportion of GPLed projects:

RubyForge: ~30%
Google Code: ~50%
Launchpad: ~70%

I've left the numbers rough because there's various uncertainties - should proprietary licenses be included in the numbers, is CC Sharealike enough like the GPL to count it there, that kind of thing. But what's clear is that these three sites have massively different levels of GPL use, and it's not hard to imagine why. They all attract different types of developer. The RubyForge figures are obviously going to be heavily influenced by Ruby developers, and that (handwavily) implies more of a bias towards web developers than the general developer population. Launchpad, on the other hand, is going to have a closer association with people with an Ubuntu background - it's probably more representative of Linux developers. Google Code? The 50% figure is the closest to the 56.8% figure that Black Duck give, so it's probably representative of the more general development community.

The impression gained from this is that the probability of you using one of the GPL licenses is influenced by the community that you're part of. And it's not a huge leap to believe that an increasing number of developers are targeting the web, and the web development community has never been especially attached to the GPL. It's not hard to see why - the benefits of the GPL vanish pretty much entirely when you're never actually obliged to distribute the code, and while Affero attempts to compensate from that it also constrains your UI and deployment model. No matter how strong a believer in Copyleft you are, the web makes it difficult for users to take any advantage of the freedoms you'd want to offer. It's as easy not to bother.
So it's pretty unsurprising that an increase in web development would be associated with a decrease in the proportion of projects licensed under the GPL.

This obviously isn't a rigorous analysis. I have very little hard evidence to back up my assumptions. But nor does anyone who claims that the change is because the FSF alienated the community during GPLv3 development. I'd be fascinated to see someone spend some time comparing project type with license use and trying to come up with a more convincing argument.

(Raw data from FLOSSmole: Howison, J., Conklin, M., & Crowston, K. (2006). FLOSSmole: A collaborative repository for FLOSS research data and analyses. International Journal of Information Technology and Web Engineering, 1(3), 17–26.)

comment count unavailable comments

February 07, 2012

Multitouch in X - Multitouch-touchpads
This post is part of a series on multi-touch support in the X.Org X server.
  1. Multitouch in X - Getting events
  2. Multitouch in X - Pointer emulation
  3. Multitouch in X - Touch grab handling
In this post, I'll outline the handling of touch events for multitouch-capable touchpads. Multi-touch touchpads that are supported are those that provide position information for more than one finger. The current version of the synaptics X driver does some tricks to pretend two-finger interaction on single-finger touchpads - such devices are not applicable here.

Touchpads are primarily pointer devices and any multi-touch interaction is usually a gesture. In the protocol, such devices are of the type XIDependentDevice and the server does adjust touch event delivery. I've already hinted at this here but this time I'll give a more detailed explanation.

Touch event delivery

Unlike for direct touch devices such as touchscreens, dependent devices have a different picking mechanism in the server. We assume that all gestures are semantically associated with the cursor position. For example, for scrolling, you would move the cursor on top of the window to be scrolled, then you would start scrolling. The server thus adjusts event delivery accordingly. Whereas for direct touch devices the touch events are delivered to whichever window is at the position of the touch, touch events from dependent devices are always delivered to the window underneath the pointer (grab semantics are adjusted to follow the same rules). So if you start a gesture in the top-left corner of the touchpad, the window underneath the cursor gets the events with the top-left coordinates. Note that the event and root coordinates always reflect the pointer position.

The average multi-touch touchpad has two modes of operation: single-finger operation where the purpose is to move the visible cursor and multi-finger operation which is usually interpreted into a gesture on the client. These two modes are important, as they too affect event delivery. The protocol specifies that any interaction with the device that serves to move the visible cursor only should not generate touch events, and that touch events will start once that interaction becomes a true multi-touch interaction. This leaves the drivers a little bit of leeway, but the current implementation in the synaptics driver is the following:
  1. A user places one finger on the touchpad and moves. The client will receive regular pointer events.
  2. The user places a second finger on the touchpad. The client will now receive a TouchBegin event for the first and the second touch, at their respective current positions in device coordinate range.
  3. Movement of either finger now will generate touch events, but no pointer events.
  4. Any other fingers will generate touch events only.
  5. When one of two remaining fingers on the touchpoint ceases the touch, a TouchEnd is sent for both the terminating touch and the remaining touch. The remaining finger will revert to sending pointer events.

Legacy in-driver gestures

As you are likely aware, the synaptics driver currently supports several pseudo gestures such as tap-to-click or two-finger scrolling. These gestures are interpreted in the driver, thus the server and client never see the raw data for them.

With proper multi-touch support these gestures are now somewhat misplaced. On the one hand, we want the clients to interpret multitouch, on the other hand we want the gestures to be handled in the same manner in all applications. (Coincidentally, this is also a problem that we need to solve for Wayland).

We toyed with ideas of marking emulated events so clients can filter but since we do need to be compatible to the core and XI 1.x behaviours, we only found one solution: any in-driver gestures that alter event deliver must not generate touch events. Thus, if the driver is set to handle two-finger scrolling, the clients will only see the pointer events and scroll events, they will not see touch events from two-fingers. To get two-finger scrolling handled by the client, the in-driver gesture must be disabled. The obvious side-effect of that is that you then cannot scroll in applications that don't support the gestures. Oh well, it's the price we have to pay for having integrated gesture support in the wrong place.

February 01, 2012

Google is killing Free Software

Now that I got your attention with the catchy title, lemme rephrase that in a somewhat longer sentence:

Google is the greatest danger to the Free Software movement at the current time.

I’m not sure I should presume intent because of Hanlon’s razor, but a lot of smart people concerned about Free Software work at Google, so they should at least be aware of it.

The first problem I have with Google is that they are actively working on making the world of Free Software a worse place. The best example for this is Native Client. It’s essentially a tool that allows building web pages without submitting anything even resembling source code to the client. And thereby it’s killing the “View Source” option. (You could easily build such a tool as Free Software if instead of transmitting the binary, you’d transmit the source code. Compare HTML with Flash here.)

The second problem I have with Google is that what they do actively confuses people about Free Software. Google likes to portray itself as an Open Source company and managed to accumulate a lot of geek cred that way (compared to Apple, Amazon or Microsoft). But unfortunately, a lot of people get introduced to Open Source and Free Software via Google. And withholding source code for a while, shipping source code to something similar are close enough to confuse unsuspecting bystanders. And I fear that behavior is doing more harm than good to Free Software.

All that said, I don’t think Google is stupid, illegal or even illogical in what they do. Everything they do makes sense. All I’m saying is that they’re scumbags and you should be aware of that.

January 30, 2012

The ongoing fight against GPL enforcement
GPL enforcement is a surprisingly difficult task. It's not just a matter of identifying an infringement - you need to make sure you have a copyright holder on your side, spend some money sending letters asking people to come into compliance, spend more money initiating a suit, spend even more money encouraging people to settle, spend yet more money actually taking them to court and then maybe, at the end, you have some source code. One of the (tiny) number of groups involved in doing this is the Software Freedom Conservancy, a non-profit organisation that offers various services to free software projects. One of their notable activities is enforcing the license of Busybox, a GPLed multi-purpose application that's used in many embedded Linux environments. And this is where things get interesting

GPLv2 (the license covering the relevant code) contains the following as part of section 4:

Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License.

There's some argument over what this means, precisely, but GPLv3 adds the following paragraph:

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation

which tends to support the assertion that, under V2, once the license is terminated you've lost it forever. That gives the SFC a lever. If a vendor is shipping products using Busybox, and is found to be in violation, this interpretation of GPLv2 means that they have no license to ship Busybox again until the copyright holders (or their agents) grant them another. This is a bit of a problem if your entire stock consists of devices running Busybox. The SFC will grant a new license, but on one condition - not only must you provide the source code to Busybox, you must provide the source code to all other works on the device that require source distribution.

The outcome of this is that we've gained access to large bodies of source code that would otherwise have been kept by companies. The SFC have successfully used Busybox to force the source release of many vendor kernels, ensuring that users have the freedoms that the copyright holders granted to them. Everybody wins, with the exception of the violators. And it seems that they're unenthusiastic about that.

A couple of weeks ago, this page appeared on the elinux.org wiki. It's written by an engineer at Sony, and it's calling for contributions to rewriting Busybox. This would be entirely reasonable if it were for technical reasons, but it's not - it's explicitly stated that companies are afraid that Busybox copyright holders may force them to comply with the licenses of software they ship. If you ship this Busybox replacement instead of the original Busybox you'll be safe from the SFC. You'll be able to violate licenses with impunity.

What can we do? The real problem here is that the SFC's reliance on Busybox means that they're only able to target infringers who use that Busybox code. No significant kernel copyright holders have so far offered to allow the SFC to enforce their copyrights, with the result that enforcement action will grind to a halt as vendors move over to this Busybox replacement. So, if you hold copyright over any part of the Linux kernel, I'd urge you to get in touch with them. The alternative is a strangely ironic world where Sony are simultaneously funding lobbying for copyright enforcement against individuals and tools to help large corporations infringe at will. I'm not enthusiastic about that.

comment count unavailable comments

January 28, 2012

Getting conned: eBay/Paypal fun
After seeing, this article about "How secure is Paypal for eBay sellers" in this morning's Guardian, I'll share my personal experience with you.

In October, I sold my first generation MacBook Air on eBay, and got a buyer within a day for the £500 "Buy It Now" price. "Buy It Now" requires using Paypal, and the £500 (minus commission) appeared in my Paypal account¹. After a bit of to and fro, the buyer got in contact, and suggested that he come and pick it up. Saving about £30 of shipping, and sorting out the sale faster, strike me as good ideas.

The "buyer" said he couldn't come, sent one of his "employees". A very courteous man came to pick the laptop. In hindsight, he seemed slightly uncomfortable, and looked like he was very happy to see how easy it was going to be.

The spooky thing is that within 40 minutes -- note, not 3 hours, not a day after, not the day before) -- within 40 minutes of the laptop getting picked up, the holder of the eBay and Paypal accounts submitted an "unauthorised account activity claim", leading to "payment reversal" (me owing £500 to Paypal²).

During my call to eBay's customer support, I was told that "I had nothing to worry about" (I'm guessing that would be the case as long as I repaid the £500). Paypal promptly sent a mail mentioning they needed my help, but with very little possibilities from my side ("no courier tracking number? Give us the money now").

Surrey Police failed to find the culprits, with the 2 mobile numbers associated with the con only being pay-as-you-go phones (topped up in a little convenience store in North London that only keeps a day's worth of CCTV).

So my advices:

  • If you sell anything via eBay using Paypal, send it recorded, and keep the receipt.
  • If you bought a MacBook Air first generation with the serial W88500DJ12G, it's stolen, send me an e-mail.

And as opposed to Mssrs Lodge and Reakes, Paypal didn't reimburse me anything, and I'm £500 out of pocket.

¹: I'll pass you the details on eBay referring to a closed Paypal account that meant I got conned two days later than the "buyers" anticipated.
²: On an account that was already closed, see ¹.

Update: Added mention of eBay's ludicrously bad customer service.

January 27, 2012

Wacom tablets in GNOME 3.4
Working from designs.


The cool stuff first

<object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://i.ytimg.com/vi/607uIdBmozU/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/607uIdBmozU?version=3&amp;f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata"/><param name="bgcolor" value="#FFFFFF"/><embed height="266" src="http://www.youtube.com/v/607uIdBmozU?version=3&amp;f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" type="application/x-shockwave-flash" width="320"></embed></object>
Cosimo Cecchi presents the updated Wacom settings

Go to YouTube directly if you can't see the video here.

A new arrival

As mentioned by Cosimo, we have a new library to help us implement the settings you saw: libwacom.

libwacom is there to give us metadata about tablets, whether or not they are connected to your system, the list of styli it supports, as well as information about the styli themselves. As you can see from the UI, it's pretty important that we know:

  • whether the tablet is builtin (so we know whether you can calibrate it)
  • which form factor it has
  • the list of styli it supports
  • for each stylus, its full name, the number of buttons, what it looks like
In the past, all this information was only available within the drivers (as comments), exported in different ways (sysfs attributes), non-machine readable in public documentation, or, worst of all, hidden in Wacom's internal drivers for OS X or Windows.

So if you have a Wacom tablet, send us a definition file for your tablet, so you can configure it with the impression that the software actually knows about your device.

Where's that configuration again

After knowing what each tablet had to offer, we had to have a way to match the definitions to XInput devices, assign settings per-tablet, and importantly, switch stylus configuration when the user switches stylus. This is done using the new GsdWacomDevice and GsdWacomStylus objects, shared between gnome-settings-daemon (which will apply the configuration) and gnome-control-center (which will set the configuration).

This also means we have a few debugging applications, such as list-wacom in gnome-settings-daemon, to show you the attached GsdWacomDevices, or test-wacom in gnome-control-center, to test display of particular tablets if you don't own them (this is the place where I spend a lot of time).

What's next

Peter Hutterer, my input buddy at Red Hat, who made the original Wacom panel for GNOME 3.2, and the first version of libwacom, is currently spending a lot of time on Multi-Touch, and fixing bugs I report in the Wacom driver.

Jason Gerecke, from Wacom, who did most of the initial work on calibration support, is working on the related display-mapping. This will allow choosing whether a tablet's working area is the whole desktop, or a single monitor when in multiple monitors are used.

For my part, after fixing the layout bugs that so annoy me in the settings panel, I'll be starting work on tablet button mapping. I look forward to making the LEDs on the tablet match up with the selected keyboard shortcut!

Many thanks to Cosimo and Monty for helping out with presenting the work, and doing the video.

January 26, 2012

The Case for the /usr Merge

One of the features of Fedora 17 is the /usr merge, put forward by Harald Hoyer and Kay Sievers[1]. In the time since this feature has been proposed repetitive discussions took place all over the various Free Software communities, and usually the same questions were asked: what the reasons behind this feature were, and whether it makes sense to adopt the same scheme for distribution XYZ, too.

Especially in the Non-Fedora world it appears to be socially unacceptable to actually have a look at the Fedora feature page (where many of the questions are already brought up and answered) which is very unfortunate. To improve the situation I spent some time today to summarize the reasons for the /usr merge independently. I'd hence like to direct you to this new page I put up which tries to summarize the reasons for this, with an emphasis on the compatibility point of view:

The Case for the /usr Merge

Note that even though this page is in the systemd wiki, what it covers is mostly orthogonal to systemd. systemd supports both systems with a merged /usr and with a split /usr, and the /usr merge should be interesting for non-systemd distributions as well.

Primarily I put this together to have a nice place to point all those folks who continue to write me annoyed emails, even though I am actually not even working on all of this...

Enjoy the read!

Footnotes:

[1] And not actually by me, I am just a supportive spectator and am not doing any work on it. Unfortunately some tech press folks created the false impression I was behind this. But credit where credit is due, this is all Harald's and Kay's work.

January 23, 2012

UEFI and bugs
I gave a presentation on UEFI at LCA 2012 - you can watch it here, with the bonus repeat (and different jokes) here. It's a gentle introduction to UEFI, followed by some discussion of the problems we've faced in dealing with implementation bugs.

The fundamental problem is that UEFI is a lot of code. And I really do mean a lot of code. Ignoring drivers, the x86 Linux kernel is around 30MB of code. A comparable subset of the UEFI tree is around 35MB. UEFI is of a comparable degree of complexity to the Linux kernel. There's no reason to assume that the people who've actually written this code are significantly more or less competent than an average Linux developer, so all else being equal we'd probably expect somewhere around the same number of bugs per line. Of course, not all else is equal.

Even today, basically all hardware is shipping with BIOS by default. The only people to enable UEFI are enthusiasts. Various machines will pop up all kinds of dire warnings if you try to turn it on. UEFI has had very little real world testing. And it really does show. In the few months I've been working on UEFI I've discovered machines where SetVirtualAddressMap() calls code that has already been (per spec) discarded. I've seen cases where it was possible to create variables, but not to delete them. I've seen a machine that would irreparably corrupt its firmware when you tried to set a variable. I've tripped over code that fails to parse invalid boot variables, bricking the hardware. Many vendors independently fail to report the correct framebuffer stride. And those are just the ones that have ended up on hardware which crosses my desk, which means I haven't even tested the majority of consumer-grade hardware with UEFI.

The problems with UEFI have very little to do with its design or the ability of the people implementing it. After a few years of iterative improvements it stands a good chance of being more reliable and useful than BIOS. Increased commonality of code between vendors is a blessing and a curse - in the long term it means that these bugs can be shaken out, but in the short term it means that at least one hardware-destroying bug has ended up carried by multiple vendors. Right now we're still in the short term and it's likely that we'll find yet more UEFI bugs that cause all sorts of problems. The next few years will probably be a bumpy ride.

comment count unavailable comments

January 20, 2012

Plumbers Wishlist, The Third Edition, a.k.a. "The Thank You Edition"

Last October we published a wishlist for plumbing related features we'd like to see added to the Linux kernel. Three months later it's time to publish a short update, and explain what has been implemented in the kernel, what people have started working on, and what's still missing.

The full, updated list is available on Google Docs.

In general, I must say that the list turned out to be a great success. It shows how awesome the Open Source community is: Just ask nicely and there's a good chance they'll fulfill your wishes! Thank you very much, Linux community!

We'd like to thank everybody who worked on any of the features on that list: Lucas De Marchi, Andi Kleen, Dan Ballard, Li Zefan, Kirill A. Shutemov, Davidlohr Bueso, Cong Wang, Lennart Poettering, Kay Sievers.

Of the items on the list 5 have been fully implemented and are already part of a released kernel, or already merged for inclusion for the next kernels being released.

For 4 further items patches have been posted, and I am hoping they'll get merged eventually. Davidlohr, Wang, Zefan, Kirill, it would be great if you'd continue working on your patches, as we think they are following the right approach[1] even if there was some opposition to them on LKML. So, please keep pushing to solve the outstanding issues and thanks for your work so far!

Footnotes

[1] Yes, I still believe that tmpfs quota should be implemented via resource limits, as everything else wouldn't work, as we don't want to implement complex and fragile userspace infrastructure to racily upload complex quota data for all current and future UIDs ever used on the system into each tmpfs mount point at mount time.

XKB breaking grabs - CVE-2012-0064
Given that there is a copious amount of misinformation being spread, here is a summary of CVE-2012-0064, straight from a horse's mouth.

Outline of the issue

The bug allows users to work around screen locking (e.g. gnome-screensaver) by hitting Control+Alt+keypad multiply or Control+Alt+keypad divide. This terminates the input grab the screensaver has and thus allows a user to interact with the desktop, skipping the password entry.

Affected versions

Affected is anyone running X server 1.11 or later (or release candidates thereof). So if "Xorg -version" shows something else on your box, stop worrying. I doubt any distribution would have back-ported the patches.

In Fedora/Red Hat land - the only distributions affected are Fedora 16 and current Fedora Rawhide. Both have been fixed, the F16 update is avaialble here. Note that the update is to xkeyboard-config, not to the server itself.

Fedora 15 is not affected. RHEL 4, 5, 6 and thus CentOS 4, 5, 6 and other derivatives are not affected. I believe that most other distributions have now pushed out updates as well, if you want to link to the respective updates, please do so in the comments.

Sergey has also pushed out xkeyboard-config 2.5 today with the fix included.

History of the feature

The X protocol does not allow the server to break grabs. Once a client has a grab, the server must wait for that client to release the grab, terminate, or the grab window to become unviewable. This is an issue when debugging applications - if your client has a keyboard grab, you cannot use the debugger since all key events will go to the client being debugged. To avoid this issue, the X server has had two combinations to break grabs: Control+Alt+Keypad multiply and Control+Alt+Keypad divide. They forced grab termination inside the X server and although against the protocol it made debugging possible. The option required explicit enabling in the xorg.conf.

These options were removed in server 1.4 and disabled since. Which made debugging hard, so last year we merged a patch to bring them back, together with some other features. They are triggered by XKB actions (as they used to be). The plan was to remove the XKB actions from the default keymap so that the action is available on request but not enabled by default. This is where a miscommunication happened, the removal from the default keymap never happened. So server 1.11 and vanilla xkeyboard-config ship with both the actions available and present in the current keymap. As a result, any user can break a grab from any application and thus get around screen locking.

Outline of the fix

To shoot yourself in the foot, you need two items: a gun and a trigger. We have removed the trigger. The fix we've now pushed into xkeyboard-config removes the actions from the default keymap and into an XKB option instead. So the fix does not remove the gun, but it requires the user to screw the trigger in themselves before trying to hurt themselves. In a default configuration, it is thus no longer possible to break the grab of your screensaver.

To re-enable grab debugging, run setxkbmap with "-option grab:break_actions" or enable "Allow breaking grabs with keyboard actions (warning: security risk)" in the "Miscellaneous compatibility options" in your keyboard layout configuration tool of choice.
systemd for Administrators, Part XII

Here's the twelfth installment of my ongoing series on systemd for Administrators:

Securing Your Services

One of the core features of Unix systems is the idea of privilege separation between the different components of the OS. Many system services run under their own user IDs thus limiting what they can do, and hence the impact they may have on the OS in case they get exploited.

This kind of privilege separation only provides very basic protection however, since in general system services run this way can still do at least as much as a normal local users, though not as much as root. For security purposes it is however very interesting to limit even further what services can do, and shut them off a couple of things that normal users are allowed to do.

A great way to limit the impact of services is by employing MAC technologies such as SELinux. If you are interested to secure down your server, running SELinux is a very good idea. systemd enables developers and administrators to apply additional restrictions to local services independently of a MAC. Thus, regardless whether you are able to make use of SELinux you may still enforce certain security limits on your services.

In this iteration of the series we want to focus on a couple of these security features of systemd and how to make use of them in your services. These features take advantage of a couple of Linux-specific technologies that have been available in the kernel for a long time, but never have been exposed in a widely usable fashion. These systemd features have been designed to be as easy to use as possible, in order to make them attractive to administrators and upstream developers:

  • Isolating services from the network
  • Service-private /tmp
  • Making directories appear read-only or inaccessible to services
  • Taking away capabilities from services
  • Disallowing forking, limiting file creation for services
  • Controlling device node access of services

All options described here are documented in systemd's man pages, notably systemd.exec(5). Please consult these man pages for further details.

All these options are available on all systemd systems, regardless if SELinux or any other MAC is enabled, or not.

All these options are relatively cheap, so if in doubt use them. Even if you might think that your service doesn't write to /tmp and hence enabling PrivateTmp=yes (as described below) might not be necessary, due to today's complex software it's still beneficial to enable this feature, simply because libraries you link to (and plug-ins to those libraries) which you do not control might need temporary files after all. Example: you never know what kind of NSS module your local installation has enabled, and what that NSS module does with /tmp.

These options are hopefully interesting both for administrators to secure their local systems, and for upstream developers to ship their services secure by default. We strongly encourage upstream developers to consider using these options by default in their upstream service units. They are very easy to make use of and have major benefits for security.

Isolating Services from the Network

A very simple but powerful configuration option you may use in systemd service definitions is PrivateNetwork=:

...
[Service]
ExecStart=...
PrivateNetwork=yes
...

With this simple switch a service and all the processes it consists of are entirely disconnected from any kind of networking. Network interfaces became unavailable to the processes, the only one they'll see is the loopback device "lo", but it is isolated from the real host loopback. This is a very powerful protection from network attacks.

Caveat: Some services require the network to be operational. Of course, nobody would consider using PrivateNetwork=yes on a network-facing service such as Apache. However even for non-network-facing services network support might be necessary and not always obvious. Example: if the local system is configured for an LDAP-based user database doing glibc name lookups with calls such as getpwnam() might end up resulting in network access. That said, even in those cases it is more often than not OK to use PrivateNetwork=yes since user IDs of system service users are required to be resolvable even without any network around. That means as long as the only user IDs your service needs to resolve are below the magic 1000 boundary using PrivateNetwork=yes should be OK.

Internally, this feature makes use of network namespaces of the kernel. If enabled a new network namespace is opened and only the loopback device configured in it.

Service-Private /tmp

Another very simple but powerful configuration switch is PrivateTmp=:

...
[Service]
ExecStart=...
PrivateTmp=yes
...

If enabled this option will ensure that the /tmp directory the service will see is private and isolated from the host system's /tmp. /tmp traditionally has been a shared space for all local services and users. Over the years it has been a major source of security problems for a multitude of services. Symlink attacks and DoS vulnerabilities due to guessable /tmp temporary files are common. By isolating the service's /tmp from the rest of the host, such vulnerabilities become moot.

For Fedora 17 a feature has been accepted in order to enable this option across a large number of services.

Caveat: Some services actually misuse /tmp as a location for IPC sockets and other communication primitives, even though this is almost always a vulnerability (simply because if you use it for communication you need guessable names, and guessable names make your code vulnerable to DoS and symlink attacks) and /run is the much safer replacement for this, simply because it is not a location writable to unprivileged processes. For example, X11 places it's communication sockets below /tmp (which is actually secure -- though still not ideal -- in this exception since it does so in a safe subdirectory which is created at early boot.) Services which need to communicate via such communication primitives in /tmp are no candidates for PrivateTmp=. Thankfully these days only very few services misusing /tmp like this remain.

Internally, this feature makes use of file system namespaces of the kernel. If enabled a new file system namespace is opened inheritng most of the host hierarchy with the exception of /tmp.

Making Directories Appear Read-Only or Inaccessible to Services

With the ReadOnlyDirectories= and InaccessibleDirectories= options it is possible to make the specified directories inaccessible for writing resp. both reading and writing to the service:

...
[Service]
ExecStart=...
InaccessibleDirectories=/home
ReadOnlyDirectories=/var
...

With these two configuration lines the whole tree below /home becomes inaccessible to the service (i.e. the directory will appear empty and with 000 access mode), and the tree below /var becomes read-only.

Caveat: Note that ReadOnlyDirectories= currently is not recursively applied to submounts of the specified directories (i.e. mounts below /var in the example above stay writable). This is likely to get fixed soon.

Internally, this is also implemented based on file system namspaces.

Taking Away Capabilities From Services

Another very powerful security option in systemd is CapabilityBoundingSet= which allows to limit in a relatively fine grained fashion which kernel capabilities a service started retains:

...
[Service]
ExecStart=...
CapabilityBoundingSet=CAP_CHOWN CAP_KILL
...

In the example above only the CAP_CHOWN and CAP_KILL capabilities are retained by the service, and the service and any processes it might create have no chance to ever acquire any other capabilities again, not even via setuid binaries. The list of currently defined capabilities is available in capabilities(7). Unfortunately some of the defined capabilities are overly generic (such as CAP_SYS_ADMIN), however they are still a very useful tool, in particular for services that otherwise run with full root privileges.

To identify precisely which capabilities are necessary for a service to run cleanly is not always easy and requires a bit of testing. To simplify this process a bit, it is possible to blacklist certain capabilities that are definitely not needed instead of whitelisting all that might be needed. Example: the CAP_SYS_PTRACE is a particularly powerful and security relevant capability needed for the implementation of debuggers, since it allows introspecting and manipulating any local process on the system. A service like Apache obviously has no business in being a debugger for other processes, hence it is safe to remove the capability from it:

...
[Service]
ExecStart=...
CapabilityBoundingSet=~CAP_SYS_PTRACE
...

The ~ character the value assignment here is prefixed with inverts the meaning of the option: instead of listing all capabalities the service will retain you may list the ones it will not retain.

Caveat: Some services might react confused if certain capabilities are made unavailable to them. Thus when determining the right set of capabilities to keep around you need to do this carefully, and it might be a good idea to talk to the upstream maintainers since they should know best which operations a service might need to run successfully.

Caveat 2: Capabilities are not a magic wand. You probably want to combine them and use them in conjunction with other security options in order to make them truly useful.

To easily check which processes on your system retain which capabilities use the pscap tool from the libcap-ng-utils package.

Making use of systemd's CapabilityBoundingSet= option is often a simple, discoverable and cheap replacement for patching all system daemons individually to control the capability bounding set on their own.

Disallowing Forking, Limiting File Creation for Services

Resource Limits may be used to apply certain security limits on services being run. Primarily, resource limits are useful for resource control (as the name suggests...) not so much access control. However, two of them can be useful to disable certain OS features: RLIMIT_NPROC and RLIMIT_FSIZE may be used to disable forking and disable writing of any files with a size > 0:

...
[Service]
ExecStart=...
LimitNPROC=1
LimitFSIZE=0
...

Note that this will work only if the service in question drops privileges and runs under a (non-root) user ID of its own or drops the CAP_SYS_RESOURCE capability, for example via CapabilityBoundingSet= as discussed above. Without that a process could simply increase the resource limit again thus voiding any effect.

Caveat: LimitFSIZE= is pretty brutal. If the service attempts to write a file with a size > 0, it will immeidately be killed with the SIGXFSZ which unless caught terminates the process. Also, creating files with size 0 is still allowed, even if this option is used.

For more information on these and other resource limits, see setrlimit(2).

Controlling Device Node Access of Services

Devices nodes are an important interface to the kernel and its drivers. Since drivers tend to get much less testing and security checking than the core kernel they often are a major entry point for security hacks. systemd allows you to control access to devices individually for each service:

...
[Service]
ExecStart=...
DeviceAllow=/dev/null rw
...

This will limit access to /dev/null and only this device node, disallowing access to any other device nodes.

The feature is implemented on top of the devices cgroup controller.

Other Options

Besides the easy to use options above there are a number of other security relevant options available. However they usually require a bit of preparation in the service itself and hence are probably primarily useful for upstream developers. These options are RootDirectory= (to set up chroot() environments for a service) as well as User= and Group= to drop privileges to the specified user and group. These options are particularly useful to greatly simplify writing daemons, where all the complexities of securely dropping privileges can be left to systemd, and kept out of the daemons themselves.

If you are wondering why these options are not enabled by default: some of them simply break seamntics of traditional Unix, and to maintain compatibility we cannot enable them by default. e.g. since traditional Unix enforced that /tmp was a shared namespace, and processes could use it for IPC we cannot just go and turn that off globally, just because /tmp's role in IPC is now replaced by /run.

And that's it for now. If you are working on unit files for upstream or in your distribution, please consider using one or more of the options listed above. If you service is secure by default by taking advantage of these options this will help not only your users but also make the Internet a safer place.

January 18, 2012

Why UEFI secure boot is difficult for Linux
I wrote about the technical details of supporting the UEFI secure boot specification with Linux. Despite me pretty clearly saying that this was ignoring issues of licensing and key distribution and the like, people are now using it to claim that Linux could support secure boot with minimal effort. In a sense, they're right. The technical implementation details are fairly straightforward. But they're not the difficult bit.

Secure boot requires that all code that can touch hardware be trusted

Right now, if you can run unstrusted code before the OS then you can subvert the OS. Secure boot gives you a mechanism for making sure you only run trusted code, which protects against that. So your UEFI drivers have to be signed, your bootloader has to be signed, and your bootloader must only load a signed kernel. If you've only booted trusted code then you know that your OS is safe. But, unlike trusted boot, secure boot provides no way for you to know that only trusted code was executed. That has to be ensured by OS policy.

This doesn't sound like too much of a problem. But it is. Imagine we have a signed Linux bootloader and a signed Linux kernel, and that these signatures are made with a globally trusted key. These will boot on any hardware using secure boot. Now imagine that an attacker writes a kernel module that sets up a fake UEFI environment, stops the kernel from running code and then executes the Windows bootloader - kind of like kexec, but a little more fiddly. The Windows bootloader believes that it's performing a secure boot, but in fact everything that it believes is trustworthy is the attacker's fake UEFI code. The user's encryption passphrase is logged, the Windows kernel is modified to hide the Linux code and despite everything looking fine your credit card details are on their way to China. In this scenario, the signed Linux kernel is simply used as a malware loader. The only sign that anything is wrong is that boot will be slightly slowed down.

Signing the kernel isn't enough. Signed Linux kernels must refuse to load any unsigned kernel modules. Virtualbox on Linux? Dead. Nvidia binary driver on Linux? Dead. All out of tree kernel modules? Utterly, utterly dead. Building an updated driver locally? Not going to happen. That's going to make some people fairly unhappy.

(The same applies to Windows, of course. Windows 7 allows you to disable driver integrity checks. Windows 8 will have to forbid that when the system's using secure boot)

Licensing

GPLv3 has various requirements for signing keys to be available. Microsoft's new requirement that systems support the installation of user keys would let users boot their own modified bootloaders, so that may end up being sufficient to satisfy the license. But we're then beholden on Microsoft - if they remove that requirement then users lose that freedom, and suddenly we're in an awkward licensing situation. There are ongoing conversations about exactly what we're able to do here, but it's not a solved problem.

Key distribution

The UEFI spec doesn't describe or mandate a central certifying authority. Microsoft require that everyone carry their key. We could generate our own, but we have much less sway with vendors. There's no way to guarantee that all hardware vendors will generate our key. And, obviously, if we generate a key, we can't just hand the private half out to others. That means that it becomes impossible for people to produce derivative versions of Linux distributions without getting their own key. The kind of identity verification that would be required for getting such a key is likely to be expensive, and also fairly likely to require that the distribution have a legally registered company in order to facilitate the identity verification. Think Extended Validation certificates, not Startssl Free. Hobbyist Linux distributions will be a thing of the past.

Doesn't custom mode fix this?

Microsoft's certification requirements now state that all systems must support a custom mode, implying that it will be possible for a user to install their own keys. Linux vendors would then be able to ship with their own keys on the install media and impose their own policies. Everyone's happy. It's not really good enough, though. People have spent incredible amounts of time and effort making it easy to install Linux by doing little more than putting a CD in a drive. Asking them to go into the firmware and reconfigure things adds an extra barrier that restricts the ability to install Linux to more technically skilled users. And it's even worse than that. This is the full description of the requirement for custom mode:
  1. It shall be possible for a physically present user to use the Custom Mode firmware setup option to modify the contents of the Secure Boot signature databases and the PK.
  2. If the user ends up deleting the PK then, upon exiting the Custom Mode firmware setup, the system will be operating in Setup Mode with Secure Boot turned off.
  3. The firmware setup shall indicate if Secure Boot is turned on, and if it is operated in Standard or Custom Mode. The firmware setup must provide an option to return from Custom to Standard Mode which restores the factory defaults.

There's a few things missing from this, namely:
  • Any description of the UI. It's effectively impossible to document Linux installation when the first step becomes (a) complicated and (b) vendor specific. Vendors are using the UEFI transition to differentiate themselves by coming up with their own unique firmware interfaces. Custom mode is going to look different everywhere.
  • Any description of the key format. A raw binary representation of the key? An EFI_SIGNATURE_DATA struct? A base64 encoding of one, further protected with ROT13? We just don't know.
  • Any way to use custom mode for unattended installs. It's a firmware interface that requires a physically present user. Want to install a few thousand machines over the network? This isn't a scalable approach
  • …and this one's nitpicking, but there's not actually any requirement that the user be able to add keys - a vendor could conform to this language by only letting users delete keys. This is actually ok as long as the user deletes Pk, because then we'll effectively be back in setup mode and can install our own keys from the installer, but it still results in some practical problems

So no, custom mode doesn't make everything ok. Custom mode with a mandated UI and a documented key format would be much closer, but it wouldn't solve the problem of unattended automated installs.

Summary

We can write the code required to support secure boot on Linux in a minimal amount of time - in fact, most of it's now done. But significant practical problems remain, and so far we have no workable solutions for any of them.

comment count unavailable comments

January 17, 2012

New developments in the color management world

My work in color management has been bubbling away in the background, and new features are being added slowly and carefully.

One small, but nice feature is the new metadata tags that I’ve been standardising with Florian Höch and others. Of these, MAPPING_device_id is probably most interesting. This is a key that is automatically stored in the binary ICC profile itself, and stores the device ID of the device that it was created for. This means if you re-install the system, or email the profile file to someone with identical hardware, it automatically gets added as the default profile, unless you’ve manually set the device to something better.

From a security point of view, the colord daemon is no longer being ran as root, and instead uses a private group. Most of the work was done by Vincent Untz and the OpenSuse security team, but a few Ubuntu guys helped too and now we can worry less about random library vulnerabilities affecting us.

Benedikt Morbach also switched colord to optionally use a systemd service file, which will allow us to do some cool things in the future with regard to preventing network access, respawning on failure and that kind of thing.

So slowly but surely, we’re increasing the number of things that “just work” and updating colord to use a few best practices and the latest technologies. For the future, I’m looking at a wayland extension for full screen color management using a GLSL shader, but that’s some time away before it’ll be really useful, and allow us to simplify color management for applications even more by putting all the heavy lifting in toolkits.

… so we’re getting there. :)

January 16, 2012

PulseAudio vs. AudioFlinger

Arun put an awesome article up, detailing how PulseAudio compares to Android's AudioFlinger in terms of power consumption and suchlike. Suffice to say, PulseAudio rocks, but go and read the whole thing, it's worth it.

Apparently, AudioFlinger is a great choice if you want to shorten your battery life.

January 06, 2012

Firmware bugs considered enraging
Part of our work to make it possible to use UEFI Secure Boot on Linux has been to improve our EFI variable support code. Right now this has a hardcoded assumption that variables are 1024 bytes or smaller, which was true in pre-1.0 versions of the EFI spec. Modern implementations allow the maximum variable size to be determined by the hardware, and with implementations using large key sizes and hashes 1024 bytes isn't really going to cut it. My first attempt at this was a little ugly but also fell foul of the fact that sysfs only allows writes of up to the size of a page - so 4KB on most of the platforms we're caring about. So I've now reimplemented it as a filesystem[1], which is trickier but avoids this problem nicely.

Things were almost working fine - I could read variables of arbitrary sizes, and I could write to existing variables. I was just finishing hooking up new variable creation, but in the process accidentally set the contents of the Boot0002 variable to 0xffffffff 0xffffffff 0x00000000. Boot* variables provide the UEFI firmware with the different configured boot devices on the system - they can point either at a raw device or at a bootloader on a device, and they can do so using various different namespaces. They have a defined format, as documented in chapter 9 of the UEFI spec. At boot time the boot manager reads the variables and attempts to boot from them in a configured order as found in the BootOrder variable.

Now, obviously, 0xffffffff 0x00000000 is unlikely to conform to the specification. And when I rebooted the machine, it gave me a flashing cursor and did nothing. Fair enough - I should be able to choose another boot path from the boot manager. Except the boot manager behaves identically, and I get a flashing cursor and nothing else.

I reported this to the EDK2 development list, and Andrew Fish (who invented EFI back in the 90s) pointed me at the code that's probably responsible. It's in the BDS (Boot Device Selection) library that's part of the UEFI reference implementation from Intel, and you can find it here. The relevant function is BdsLibVariableToOption, which is as follows (with irrelevant bits elided):
BdsLibVariableToOption (
  IN OUT LIST_ENTRY                   *BdsCommonOptionList,
  IN  CHAR16                          *VariableName
  )
{
  UINT16                    FilePathSize;
  UINT8                     *Variable;
  UINT8                     *TempPtr;
  UINTN                     VariableSize;
  VOID                      *LoadOptions;
  UINT32                    LoadOptionsSize;
  CHAR16                    *Description;

  //
  // Read the variable. We will never free this data.
  //
  Variable = BdsLibGetVariableAndSize (
              VariableName,
              &gEfiGlobalVariableGuid,
              &VariableSize
              );
  if (Variable == NULL) {
    return NULL;
  }
So so far so good - we read the variable from flash and put it in Variable, Variable is now 0xffffffff 0xffffffff 0x00000000. If it hadn't existed we'd have skipped over and continued. VariableSize is 12.
  //
  // Get the option attribute
  //
  TempPtr   =  Variable;
  Attribute =  *(UINT32 *) Variable;
  TempPtr   += sizeof (UINT32);
Attribute is now 0xffffffff and TempPtr points to Variable + 4.
  //
  // Get the option's device path size
  //
  FilePathSize =  *(UINT16 *) TempPtr;
  TempPtr      += sizeof (UINT16);
FilePathSize is 0xffff, TempPtr points to Variable + 6.
  //
  // Get the option's description string size
  //
  TempPtr     += StrSize ((CHAR16 *) TempPtr);
TempPtr points to 0xffff 0x0000, so StrSize (which is basically strlen) will be 4. TempPtr now points to Variable + 10.
  //
  // Get the option's device path
  //
  DevicePath =  (EFI_DEVICE_PATH_PROTOCOL *) TempPtr;
  TempPtr    += FilePathSize;
TempPtr now points to Variable + 65545 (FilePathSize is 0xffff).
  LoadOptions     = TempPtr;
  LoadOptionsSize = (UINT32) (VariableSize - (UINTN) (TempPtr - Variable));
LoadOptionsSize is now 12 - (Variable + 65545 - Variable), or 12 - 65545, or -65533. But it's cast to an unsigned 32 bit integer, so it's actually 4294901763.
  Option->LoadOptions = AllocateZeroPool (LoadOptionsSize);
  ASSERT(Option->LoadOptions != NULL);
We attempt to allocate just under 4GB of RAM. This probably fails - if it does the boot manager exits. This probably means game over. But if it somehow succeeds:
CopyMem (Option->LoadOptions, LoadOptions, LoadOptionsSize);
we then proceed to read almost 4GB of content from uninitialised addresses, and since Variable was probably allocated below 4GB that almost certainly includes all of your PCI space (which is typically still below 4GB) and bits of your hardware probably generate very unhappy signals on the bus and you lose anyway.

So now I have a machine that won't boot, and the clear CMOS jumper doesn't clear the flash contents so I have no idea how to recover from it. And because this code is present in the Intel reference implementation, doing the same thing on most other UEFI machines would probably have the same outcome. Thankfully, it's not something people are likely to do by accident - using any of the standard interfaces will always generate a valid entry, so you could only trigger this when modifying variables by hand. But now I need to set up another test machine.

[1] All code in Linux will evolve until the point where it's implemented as a filesystem.

comment count unavailable comments

January 04, 2012

The economic incentive to violate the GPL
My post yesterday on how Google gains financial benefit from vendor GPL violations contained an assertion that some people have questioned - namely, "unscrupulous hardware vendors save money by ignoring their GPL obligations". And, to be fair, as written it's true but not entirely convincing. So instead, let's consider "unscrupulous hardware vendors have economic incentives to ignore their GPL obligations".

The direct act of compliance costs money


Complying with the GPL means having the source code that built the binaries you ship. This is easy if your workflow involves putting source in at one end and getting binaries out at the other, but getting to that workflow means having a certain degree of engineering rigour. If your current build process involves mixing a bunch of known good binaries you got from somewhere but you can't remember where with a hacked up source tree that exists on someone's hard drive and then pushing all of these into a tool that only runs on Windows ME, before taking the resulting image and replacing chunks of it by hand, compliance is effectively impossible.

We all know that this is against all kinds of best practices and probably causes so many problems that it's more expensive in the long term, but retooling and hiring someone to oversee all of this takes time and money, and given the margins on many of these devices that's probably enough to make you uncompetitive for a couple of product cycles. Maybe you'll be in a better position afterwards, but you don't know that there'll be an afterwards.

Suppliers who don't provide you with the source code may be cheaper than those who do


You can't be in compliance if you don't have the source code in the first place. The same arguments that apply to the hardware vendors also apply to the people selling you your chips, so there's also an economic incentive for them to avoid complying. And there's an obvious incentive for you to choose the cheaper chipset, even if they don't comply.

Getting the source may cost money


Buying a chipset doesn't necessarily get you the software that makes it work - several silicon vendors will charge you for the SDK. But many of these devices are effectively reference platforms, so are basically identical from a hardware perspective. So if one of your competitors paid for the SDK, you can just dump the binaries off their machine, flash them onto your own boards and save yourself a decent amount of money. You obviously don't get the source, and nor do you have the standing to insist that the vendor whose binaries you misappropriated give you the source.

In the absence of enforcement, GPL compliance only works if it's the norm


Let's imagine two companies, A and B. Both build a tablet device, and buy the full SDK including source code. Both find a bunch of bugs in the vendor SDK and fix a different subset of them. They ship. A provides source code. B doesn't. B can now take A's bugfixes and incorporate them, resulting in a more compelling product without any significant extra cost. You now have two products that can sell for the same price, but B's is better. A would need to prove that B copied their bugfixes rather than simply fixing them themselves , which probably isn't going to happen.

In a larger market, if B is the only vendor who does this then their advantage isn't large - some of A's work is misappropriated by B, but A does benefit from the engineering work contributed by C, D, E, F and G. A combination of social pressure and legal threats may bring B into compliance. But if infringement is the norm, A has no incentive at all to release the source - by doing so they'll be helping not only B, but also C, D, E, F and G. Everyone undercuts A and they go out of business quite quickly.

Moral: In the absence of enforcement, if everyone else is infringing, a single company who complies is at a disadvantage.

If compliance cost nothing then everyone would do it


You can argue that cheap tablets from China are infringing simply because nobody knows better. But what's HTC's excuse? They've clearly decided that there's a benefit in holding back their source code releases[1], balancing this against the risk of being sued. They know full well what they're doing. If compliance was free they'd ship the source at the same time as they shipped the binaries. Other significant vendors are also fully aware of their obligations but choose to ignore them anyway.

Summary


There are economic incentives to infringe the GPL, and therefore (all else being equal) an infringing device can be sold for less money. All else being equal, a cheaper device will sell more units. More sales means more devices selling adverts for Google. Google makes more money because Android vendors infringe the GPL.

Edited to add:

But don't just take my word for it - Jean-Baptiste Queru says the same here (search for "scrubbing" - is there any way to link directly to a Google plus comment?)

[1] The usual argument is "We will release the source code within 120 days", implying that it's a process that takes time and we should just be patient. Every single time I've started making threatening noises, the source has appeared within a week.

comment count unavailable comments
Android, GPL violations and Google
A bit over a year ago, I wrote about how an incredible number of Android tablets on the market were in violation of the terms of the GPL. I've had rather a lot else to do since then so it's now awfully out of date - but taking a quick browse through the current stack of cheaper devices indicates that things aren't all that much better. We've got source code for some chipsets that were missing it before, but to compensate we've got a whole bunch of new hardware that's entirely lacking. It's all pretty poor, really.

At the time, I wrote the following:

"(Side note: People sometimes ask why Google aren't doing more to prevent infringing devices. For the vast majority of these cases, Google's sole contribution has been to put Android source code on a public website. Red Hat own more of the infringing code than Google do. There's no real reason why Google should be the ones taking the lead role here, and there's fairly sound business reasons why it's not in their interest to do so)"

Factually speaking, nothing's changed. Each of these devices contains code owned by Google, and Google could absolutely take legal action against the vendors. Equally, so could Red Hat, Intel, Nokia and dozens of other companies who hold copyright on portions of the code carried on these devices, and so could thousands of individuals around the world. Nobody's obliged to enforce their copyrights, and in the absence of anyone else doing so it's unreasonable to insist that Google should do it.

However.

Google gives Android away. This seems like an odd thing for them to do, given that it's a significant engineering effort and costs a lot of money to produce. But remember what Android brings to Google - it's a platform with a well-integrated mechanism for distributing advertising to users. Scanning the market shows a huge number of ad-supported apps, and Google's getting money for every single one of those that gets shown. The more Android devices, the bigger the market for apps - and the wider their advertising reach.

In other words: unscrupulous hardware vendors save money by ignoring their GPL obligations. This lets them appeal on price, increasing the number of Android devices in use and increasing Google's profits. Google makes money off other people's violation of the GPL.

Could Google do anything to stop this? Yes. They could sue for copyright infringement, but that kind of thing's time consuming and awkward and any argument about the GPL always seems to end up as a big argument involving conspiracy theories. Instead, Google could attach some extra conditions to the Android trademark. Requiring that the trademark only be attached to GPL-compliant products ought to allow Google to take advantage of the existing well-tested mechanisms for seizing counterfeit goods, providing a direct economic incentive for companies to come into compliance. For added marks, they could restrict the adwords code to devices that use the trademark - if the vendor removes the trademark, applications depending on the adwords functionality would refuse to run and Google wouldn't make money off the infringing hardware.

Or, of course, they could just carry on making extra money as a result of vendors denying users the freedoms granted by the copyright holders. Although that sounds kind of evil to me.

Edit: You probably also want to read the followup post here.

comment count unavailable comments

January 03, 2012

TVs are all awful
A discussion a couple of days ago about DPI detection (which is best summarised by this and this and I am not having this discussion again) made me remember a chain of other awful things about consumer displays and EDID and there not being enough gin in the world, and reading various bits of the internet and wikipedia seemed to indicate that almost everybody who's written about this has issues with either (a) technology or (b) English, so I might as well write something.

The first problem is unique (I hope) to 720p LCD TVs. 720p is an HD broadcast standard that's defined as having a resolution of 1280x720. A 720p TV is able to display that image without any downscaling. So, naively, you'd expect them to have 1280x720 displays. Now obviously I wouldn't bother mentioning this unless there was some kind of hilarious insanity involved, so you'll be entirely unsurprised when I tell you that most actually have 1366x768 displays. So your 720p content has to be upscaled to fill the screen anyway, but given that you'd have to do the same for displaying 720p content on a 1920x1080 device this isn't the worst thing ever in the world. No, it's more subtle than that.

EDID is a standard for a blob of data that allows a display device to express its capabilities to a video source in order to ensure that an appropriate mode is negotiated. It allows resolutions to be expressed in a bunch of ways - you can set a bunch of bits to indicate which standard modes you support (1366x768 is not one of these standard modes), you can express the standard timing resolution (the horizontal resolution divided by 8, followed by an aspect ratio) and you can express a detailed timing block (a full description of a supported resolution).

1366/8 = 170.75. Hm.

Ok, so 1366x768 can't be expressed in the standard timing resolution block. The closest you can provide for the horizontal resolution is either 1360 or 1368. You also can't supply a vertical resolution - all you can do is say that it's a 16:9 mode. For 1360, that ends up being 765. For 1368, that ends up being 769.

It's ok, though, because you can just put this in the detailed timing block, except it turns out that basically no TVs do, probably because the people making them are the ones who've taken all the gin.

So what we end up with is a bunch of hardware that people assume is 1280x720, but is actually 1366x768, except they're telling your computer that they're either 1360x765 or 1368x769. And you're probably running an OS that's doing sub-pixel anti-aliasing, which requires that the hardware be able to address the pixels directly which is obviously difficult if you think the screen is one size and actually it's another. Thankfully Linux takes care of you here, and this code makes everything ok. Phew, eh?

But ha ha, no, it's worse than that. And the rest applies to 1080p ones as well.

Back in the old days when TV signals were analogue and got turned into a picture by a bunch of magnets waving a beam of electrons about all over the place, it was impossible to guarantee that all TV sets were adjusted correctly and so you couldn't assume that the edges of a picture would actually be visible to the viewer. In order to put text on screen without risking bits of it being lost, you had to steer clear of the edges. Over time this became roughly standardised and the areas of the signal that weren't expected to be displayed were called overscan. Now, of course, we're in a mostly digital world and such things can be ignored, except that when digital TVs first appeared they were mostly used to watch analogue signals so still needed to overscan because otherwise you'd have the titles floating weirdly in the middle of the screen rather than towards the edges, and so because it's never possible to kill technology that's escaped into the wild we're stuck with it.

tl;dr - Your 1920x1080 TV takes a 1920x1080 signal, chops the edges off it and then stretches the rest to fit the screen because of decisions made in the 1930s.

So you plug your computer into a TV and even though you know what the resolution really is you still don't get to address the individual pixels. Even worse, the edges of your screen are missing.

The best thing about overscan is that it's not rigorously standardised - different broadcast bodies have different recommendations, but you're then still at the mercy of what your TV vendor decided to implement. So what usually happens is that graphics vendors have some way in their drivers to compensate for overscan, which involves you manually setting the degree of overscan that your TV provides. This works very simply - you take your 1920x1080 framebuffer and draw different sized black borders until the edge of your desktop lines up with the edge of your TV. The best bit about this is that while you're still scanning out a 1920x1080 mode, your desktop has now shrunk to something more like 1728x972 and your TV is then scaling it back up to 1920x1080. Once again, you lose.

The HDMI spec actually defines an extension block for EDID that indicates whether the display will overscan or not, but doesn't provide any way to work out how much it'll overscan. We haven't seen many of those in the wild. It's also possible to send an HDMI information frame that indicates whether or not the video source is expecting to be overscanned or not, but (a) we don't do that and (b) it'll probably be ignored even if we did, because who ever tests this stuff. The HDMI spec also says that the default behaviour for 1920x1080 (but not 1366x768) should be to assume overscan. Charming.

The best thing about all of this is that the same TV will often have different behaviour depending on whether you connect via DVI or HDMI, but some TVs will still overscan DVI. Some TVs have options in the menu to disable overscan and others don't. Some monitors will overscan if you feed them an HD resolution over HDMI, so if you have HD content and don't want to lose the edges then your hardware needs to scale it down and let the display scale it back up again. It's all awful. I recommend you drink until everything's already blurry and then none of this will matter.

comment count unavailable comments

January 02, 2012

Multitouch in X - Pointer emulation
This post is part of a series on multi-touch support in the X.Org X server.
  1. Multitouch in X - Getting events
In this post, I'll outline how pointer emulation on touch events works. This post assumes basic knowledge of the XI2 Xlib interfaces.

Why pointer emulation?

One of the base requirements of adding multitouch support to the X server was that traditional, non-multitouch applications can still be used. Multitouch should be a transparent addition, available where needed, not required where not supported.

So we do pointer emulation for multitouch events, and it's actually specified in the protocol how we do it. Mainly so it's reliable and predictable for clients.

What is pointer emulation in X

Pointer emulation simply means that for specific touch sequences, we generate pointer events. The conditions for emulation are that the the touch sequence is eligible for pointer emulation (details below) and that no other client has a touch selection on that window/grab.

The second condition is important: if your client selects for both touch and pointer events on a window, you will never see the emulated pointer events. If you are an XI 2.2 client and you select for pointer but not touch events, you will see pointer events. These events are marked with the XIPointerEmulated so that you know they come from an emulated source.

Emulation on direct-touch devices

For direct-touch devices, we emulate pointer events for a touch sequence provided the touch is the first touch on the device, i.e. no other touch sequences were active for this device when the touch started. The touch sequence is emulated until it ends, even if other touches start and end while that sequence is active.

Emulation on dependent-touch devices

Dependent touch devices do not emulate pointer events. Rather, we send the normal mouse movements from the device as regular pointer events.

Button events and button state

Pointer emulation triggers motion events and, more importantly, button events. The button number for touches is hardcoded to 1 (any more specific handling such as long-click for right buttons should be handled by touch-aware clients instead), so the detail field of an emulated button event is 1 (unless the button is logically mapped).

The button state field on emulated pointer events adjusts for pointer emulation as it would for regular button events. The button state is thus (usually) 0x0 for the emulated ButtonPress and 0x100 for the MotionNotify and ButtonRelease events.

Likewise, any request that returns the button state will have the appropriate state set, even if no emulated event actually got sent.

Grab handling works as for regular pointer events, though the interactions between touch grabs and emulated pointer grabs are somewhat complex. I'll get to that in a later post.

The confusing bit

There is one behaviour about the pointer emulation that may be confusing, even though the specs may seem logical and the behaviour is within the specs.

If you put one finger down, it will emulate pointer events. If you then put another finger down, the first finger will continue to emulate pointer events. If you now lift the first finger (keeping the second down) and put the first finger down again, that finger will not generate events. This is noticable mainly in bi-manual or multi-user interaction.

The reason this doesn't work is simple: to the X server, putting the first finger down just looks like another touchpoint appearing when there is already one present. The server does not know that this is the same finger again, it doesn't know that your intention was to emulate again with that finger. Most of the semantics for such interaction is in your head alone and hard to guess. Guessing it wrong can be quite bad, since that new touchpoint may have been part of a two-finger gesture with the second finger and whoops - instead of scrolling you just closed a window, pasted your password somewhere or killed a kitten. So we err on the side of caution, because, well, think of the kittens.
Multitouch in X - Touch grab handling
This post is part of a series on multi-touch support in the X.Org X server.
  1. Multitouch in X - Getting events
  2. Multitouch in X - Pointer emulation
In this post, I'll outline how grabs on touch events work. This post assumes basic knowledge of the XI2 Xlib interfaces.

Passive grabs

The libXi interface has one new passive grab call: XIGrabTouchBegin, which works pretty much like the existing passive grab APIs. As with event selection, you must set all three event masks XI_TouchBegin, XI_TouchUpdate and XI_TouchEnd or a BadValue error occurs. Once a passive grab activates in response to a touch, the client must choose to either accept or reject a touch. Details on that below.

Grabs activate on a TouchBegin event and due to the nature of multitouch, multiple touch grabs may be active at any time - some of them for different clients.

Active grabs

Active grabs do not have a new call, they are handled through the event masks of the existing XIGrabDevice(3) call. If a client has an active touch grab on the device, it is automatically the owner of the touch sequence (ownership is described below). If a client has an active pointer or keyboard grab on the device, it is the owner of the touch sequence for pointer emulated touch events only. Other touch events are unaffected by the grab and are processed normally.

Acceptance and rejection

Pointer grabs provide exclusive access to the device, but to some degree a client can opt to replay the event it received on the next client. We expect that touch sequences will often trigger gesture recognition, and a client may realise after a few events that it doesn't actually want that touch sequence. So we expanded the replay semantics. clients with a touch grab must choose to either accept or reject a touch.

Accepting a touch signals to the server that the touch sequence is meant for this client and no-one else. The server then exclusively delivers to that client until the terminating TouchEnd.

Rejecting a touch sequence signals that the touch sequence is not meant for this client. Once a client rejects a touch sequence, the server sends the TouchEnd event to that client (if the touch is still active) and replays the full touch sequence [1] on the next grab or window. We use the term owner of a touch sequence to talk about the current recipient.

The order of events for two clients Cg and Cw, with Cg having a grab and Cw having a regular touch event selection on a window, is thus:
TouchBegin to Cg    → 
TouchUpdate to Cg   → 
TouchUpdate to Cg   → 
                    ← Cg rejects touch
                    ← Cw becomes new owner
TouchEnd+ to Cg     →
TouchBegin* to Cw   → 
TouchUpdate* to Cw  → 
TouchUpdate* to Cw  → 
#### physical touch ends #### 
TouchEnd to Cw      →
Events with + mark an event created by the server, * mark events replayed by the server

For nested grabs, this sequence simply repeats for each client until either a grabbing client accepts the touch or the client with the event selection becomes the owner.

In the above case, the touch ended after Cg rejected the touch. If the touch ends before the current owner accepted or rejected it, the owner gets the TouchEnd event and the touch is left handing until the owner accepts or rejects it. If accepted, that's it. If rejected, the new owner gets the full sequence in one go, including the TouchEnd event. The sequence is thus:
TouchBegin to Cg    → 
TouchUpdate to Cg   → 
TouchUpdate to Cg   → 
#### physical touch ends #### 
TouchEnd to Cg      →
                    ← Cg rejects touch
                    ← Cw becomes new owner
TouchBegin* to Cw   → 
TouchUpdate* to Cw  → 
TouchUpdate* to Cw  → 
TouchEnd* to Cw     →

Touch ownership handling

One additional event type that XI 2.2 introduces is the XI_TouchOwnership event. Clients selecting for this event signal that they need to receive touch events before they're the owner of the touch sequence. This event type can be selected on both grabs and event selections.

First up: there are specific use-cases where you need this. If you don't fall into them, you're better off just skipping on ownership events, they make everything more complicated. And whether you need ownership events depends not only on you, but also the stack you're running under. On normal touch event selection, touch events are only delivered to the current owner of the touch. With multiple grabs, the delivery is sequential and delivery of touch events may be delayed.

Clients selecting for touch ownership events get the events as they occur, even if they are not the current owner. The XI_TouchOwnership event is delivered if and when they become the current owner. The last part is important: if you select for ownership events, you may receive touch events but you may not become the owner of that sequence. So while you can start reacting to that sequence, anything your app does must be undo-able in case the e.g. window manager claims the touch sequence.

If we look at the same sequence as above with two clients selecting for ownership, the sequence looks like this:
TouchBegin to Cg     → 
TouchBegin to Cw     → 
TouchOwnership to Cg →
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
                     ← Cg rejects touch
                     ← Cw becomes new owner
TouchEnd+ to Cg      →
TouchOwnership to Cw →
#### physical touch ends #### 
TouchEnd to Cw      →
Note: TouchOwnership events do not correspond to any physical event, they are always generated by the server

If a touch ends before the owner accepts, the current owner gets the TouchEnd, all others get a TouchUpdate event instead. That TouchUpdate has a flag XITouchPendingEnd set, signalling that no more actual events will arrive from this touch but the touch is still waiting for owner acceptance.
TouchBegin to Cg     → 
TouchBegin to Cw     → 
TouchOwnership to Cg →
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
#### physical touch ends #### 
TouchEnd to Cg       →
TouchUpdate to Cw    →  (XITouchPendingEnd flag set)
                     ← Cg rejects touch
                     ← Cw becomes new owner
TouchOwnership to Cw →
TouchEnd to Cw       →
In both cases, we dealt with a rejecting owner. For an accepting owner, the sequences look like this:
TouchBegin to Cg     → 
TouchBegin to Cw     → 
TouchOwnership to Cg →
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
                     ← Cg accepts touch
TouchEnd+ to Cw      →
TouchUpdate to Cg    → 
#### physical touch ends #### 
TouchEnd to Cg      →
or
TouchBegin to Cg     → 
TouchBegin to Cw     → 
TouchOwnership to Cg →
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
TouchUpdate to Cg    → 
TouchUpdate to Cw    → 
#### physical touch ends #### 
TouchEnd to Cg       →
TouchUpdate to Cw    →  (XITouchPendingEnd flag set)
                     ← Cg accepts touch
TouchEnd* to Cw      →
In the case of multiple grabs, the same strategy applies in order of grab activation. Ownership events may be selected by some clients but not others. In that case, each client is treated as requested, so the event sequence the server deals with may actually look like this:
TouchBegin to C1     → 
TouchBegin to C3     → 
TouchOwnership to C1 →
TouchUpdate to C1    → 
TouchUpdate to C3    → 
TouchUpdate to C1    → 
TouchUpdate to C3    → 
                     ← C1 rejects touch
                     ← C2 becomes new owner
TouchEnd+ to C1      →
TouchBegin* to C2    → 
TouchUpdate* to C2   → 
TouchUpdate* to C2   → 
                     ← C2 rejects touch
                     ← C3 becomes new owner
TouchEnd+ to C2      →
TouchOwnership to C3 →
#### physical touch ends #### 
TouchEnd to C3       →


[1] obviously we need to store these events so "full sequence" really means all events until the buffer was full

December 28, 2011

Translators wanted!

Does anybody want to help translate the colorhug-client project to new languages? The transifex page is here.

If any strings are difficult to translate let me know and I’ll either add translator comments or reword them. Thanks!

December 22, 2011

update on hotplug server
No new videos yet, need to fix some more rendering bugs so it looks nicer :)

So I've been working towards 3 setups:

a) intel rendering + nouveau offload
b) nouveau rendering + DVI output + intel LVDS output
c) hotplug USB with either intel or nvidia rendering.

Categorisation of devices roles:
I've identified 4 devices roles so far:
preferred master: the device is happy to be master
reluctant master: the device can be a master but would rather not be
offload slave: device can be used as an additional DRI2 renderer for a master
output slave: device can be used an additional output for a master

For the 3 setups above:
a) intel would be preferred master, nvidia would be offload slave
b) nvidia would be preferred master, intel would be output slave
c) usb devices would be output slaves, however if no master exists, usb device would be reluctant master.

I've rebased the prime work[1] on top of the dma-buf upstream work, and worked through most of the lifetime problems. Some locking issues still exist, and I'll have to get back to them. But the code works and doesn't oops randomly which is good.

prime is the kernel magic needed for this work, as it allows sharing of a buffer between two drm drivers, so for (a) it shares the dri2 front pixmap between devices, for (b/c) it shares a pixmap that the rendering gpu copies dirty updates to and the output slaves use as their scanout pixmap.

So I've done nearly all the work to share between intel and nouveau and I've done the kernel driver work for udl, but I haven't done the last piece in userspace for (c), which is to use the shared pixmap as usb scanout via the modesetting driver.

Today I hacked in a switch on the first randr command, so I can start the X server with intel as master and nouveau in offload mode. I can run gears on intel or nouveau, then after the randr command and another randr command to set a mode, the X server migrates everything to the nouveau driver, puts it in master mode, and places the intel driver into output slave mode. It seems to render my xterm + metacity content fine.

So the current short-term TODO is:
fix some issues with my nouveau/exa port rendering
fix some issues with xcompmgr
add usb output slave support.

Medium-term TODO:
worked out how to control this stuff, via randr protocol. How much information do we need to expose to clients about GPUs, and how do we control them. Open issues with atomicity of updates to avoid major uglys. Switching from intel master to nvidia master + intel outputs, means we have to reconfigure the Intel output to point at the new pixmap, but the more steps we put in there for clients to do, the more ugly and flashing we'll see on screen, however we probably want a lot of this to be client driven (i.e. gnome-settings-daemon).

Longer term TODO:
Get GLX_ARB_robustness done, now that Ian has done the context creation stuff, this should be a lot more trivial. (so trivial someone else could do it :)

[1] http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-prime-dmabuf
Multitouch in X - Getting events
This post is part of a series on multi-touch support in the X.Org X server. I recommend re-reading Thoughts on Linux multitouch from last year for some higher-level comments.
In this post, I'll outline how to identify touch devices and register for touch events.

This post assumes basic knowledge of the XI2 Xlib interfaces. Code examples should not be scrutinised for language-correctness.

New event types

XI 2.2 defines four new event types: XI_TouchBegin, XI_TouchUpdate, XI_TouchEnd are the standard events that most applications will be using. The fourth event, XI_TouchOwnership is mainly for handling specific situations where reaction speed is at a premium and gesture processing when grabs are active. I won't be covering those in this post.

Identifying touch devices

To use multitouch functionality from a client application, the client must announce support for the X Input Extension version 2.2 through the XIQueryVersion(3) request.
int major = 2, minor = 2;
XIQueryVersion(dpy, &major, &minor);
if (major * 1000 + minor < 2002)
    printf("Server does not support XI 2.2\n");
Once announced, an XIQueryDevice(3) call may return a new class type, the XITouchClass. If this class is present on a device, the device supports multitouch.The class struct itself is defined like this:
typedef struct
{
    int         type;
    int         sourceid;
    int         mode;
    int         num_touches;
} XITouchClassInfo;
The num_touches field specifies the number of simultaneous touches supported by the device. If the number is 0, we simply don't know (likely) or the device supports an unlimited number of touches (less likely). Regardless of the value expect that some devices lie, so it's best to treat this value as a guide only.

The mode field specifies the type of touch devices. We currently define two types and the server behaviour differs depending on the type:
  • XIDirectTouch for direct-input touch devices (e.g. your average touchscreen or tablet).  For this type of device, the touch events will be delivered to the windows at the of the touch point. Again, similar to what you would expect from a tablet interface - you press top left and the application top-left responds.
  • XIDependentTouch for a indirect input devices with multi-touch functionality. Touchpads are the prime example here. Touch events on such devices will be sent to the window underneath the cursor and clients are expected to interpret the touchpoints as (semantically) relative to the cursor position. For example, if your cursor is inside a Firefox window and you touch with two fingers on the top-left corner of the touchpad, Firefox will get those events. It can then decide on how to interpret those touchpoints.
A device that has a TouchClass may send touch events, but these events use the same axes as pointer events. Having said that, a touch device may still send pointer events as well - if the physical device generates both.
Your code to identify touch devices could roughly look like this:
XIDeviceInfo *info;
int nevices;

info = XIQueryDevice(display, XIAllDevices, &ndevices);

for (i = 0; i < ndevices; i++)
{
    XIDeviceInfo *dev = &info[i];
    printf("Device name %d\n", dev->name);
    for (j = 0; j < dev->num_classes; j++)
    {
        XIAnyClassInfo *class = dev->classes[j];
        XITouchClassInfo *t = (XITouchClassInfo*)class;

        if (class->type != XITouchClass)
            continue;

        printf("%s touch device, supporting %d touches.\n",
               (t->mode == XIDirectTouch) ?  "direct" : "dependent",
               t->num_touches);
    }
}

Selecting for touch events

Selecting for touch events on a window is mostly identical to pointer events. A client creates an event mask and submits it with XISelectEvents(3). One exception applies: a client must always select for all three touch events [1], XI_TouchBegin, XI_TouchUpdate, XI_TouchEnd. Selecting for one or two only will result in a BadValue error.

As for button events, only one client may select for touch events on any given window and the event delivery attempts traverse from the bottom-most window in the window tree up to the root window. Where a matching event selection is found, the event is delivered and the traversal stops.

Handling touch events

The three event types [1] are XIDeviceEvents like pointer and keyboard events. So from a client's point of view, in essence all we added was new event types.

The detail field of touch events specifies the touch ID, a unique ID for this particular touch for the lifetime of the touch sequence. Each touch sequence consists of a TouchBegin event, zero or more TouchUpdate events and one TouchEnd event. Since multiple touch sequences may be ongoing at any time, keeping track of the ID is important. The server guarantees that the touch ID is unique per device and that it will not be re-used [2]. Note that while touch IDs increase, they increase by an implementation-defined amount. Don't rely on the next touch ID to be the current ID + 1.

The button state in a touch event is the state of the physical buttons only. A TouchUpdate or TouchEnd event will thus usually have a zero button state. [3]

That's pretty much it, otherwise the handling of touch events is identical to pointer or keyboard events. Touch event handling should be straightforward and the significant deviations from the current protocol are in the grab handling, something I'll handle in a future post.

[1] I know, it's four. Good that you're paying attention.
[2] Technically ID collision may occur. For that to happen, you'd need to hold at least one touch down while triggering enough touches to exhaust a 32 bit ID range. And hope that after the wraparound you will get the same ID. There are better ways to spend your weekend.
[3] pointer emulation changes this, but I'll get to that some other time.

December 19, 2011

Multitouch patches posted
After pulling way too many 12+ hour days, I've finally polished the patchset for native multitouch support in the X.Org server into a reasonable state. The full set of patches is now on the list. And I'm still expecting this to get merged for 1.12 (and thus in time for Fedora 17).

The code is available from the multitouch branches of the following repositories:
  git://people.freedesktop.org/~whot/xserver
  git://people.freedesktop.org/~whot/inputproto
  git://people.freedesktop.org/~whot/xf86-input-evdev
  git://people.freedesktop.org/~whot/libXi
Here's a screencast running Fedora 16 with the modified X server and a little multitouch event debugging application.

<object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://i.ytimg.com/vi/TBZtSf3sgeQ/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/TBZtSf3sgeQ?version=3&amp;f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata"/> <param name="bgcolor" value="#FFFFFF"/> <embed height="266" src="http://www.youtube.com/v/TBZtSf3sgeQ?version=3&amp;f=user_uploads&amp;c=google-webdrive-0&amp;app=youtube_gdata" type="application/x-shockwave-flash" width="320"></embed></object>

Below is a short summary of what multitouch in X actually means, but one thing is important: being the windowing system, X provides multitouch support. That does not mean that every X application now supports multitouch, it merely means that they can now use multitouch if they want to. That also includes gestures, they need application support.

A car analogy: X provides a new road, the applications still have to opt to drive on it.

Multitouch events

XI 2.2 adds three main event types: XI_TouchBegin, XI_TouchUpdate and XI_TouchEnd. These three make up a touch sequence. X clients must subscribe to all three events at once and will then receive the events as they come in from the device (more or less, grabs can interfere here). Each touch event has a unique touch ID so clients can track the touches over time.

We support two device types: XIDirectDevice includes tablets and touchscreens where the events are delivered to the position the touch occurs at. XIDependentDevice includes multitouch-capable touchpads. Such devices still control a normal pointer by default, but for multi-finger gestures are possible. For such devices, the touchpoints are delivered to the window underneath the pointer.

That is pretty much the gist of it. I'll post more information over time as the release gets closer, so stay tuned.

Pointer emulation

Multitouch can be a compelling interaction method but as said above, X only provides support for multitouch. It will take a while for applications to pick it up (Carlos Garnacho is working on GTK3) and some never will. Since we still need to interact with those applications, we provide backwards-compatible pointer emulation. Again, the details are in the protocol but the gist of it is that for the first touchpoint we emulate pointer events.

That's the really nasty bit, because you now have to sync up the grab event semantics of the core, XI 1.x and XI2 protocols and wrap it all around the new grab semantics. So that if you have a multitouch app running under a window manager without multitouch support everything still works as expected.
That framework is now in place too though I expect it to still have bugs, especially in the hairier corner cases.

But other than that, it should work just as intended. I can interact with my GNOME3 desktop quite well and I get multitouch events to my test applications.

[edit Dec 20: typo fix]

December 06, 2011

A short update on multitouch
For the last couple of weeks I've been pretty much working full-time on getting multitouch/XI 2.2 ready for the merge (well, I was on holidays for a bit too). So first of all - sorry if I've been ignoring bugs or emails, I'm working to a few deadlines here. Anyway, here's a bit of a status update.

Right now, it looks like touch event delivery is working, including nested grabs.Chase Douglas started on the pointer emulation while I was away and we're now at the point where emulation works, except that pointer grabs on top of multitouch clients aren't handled yet. I'm still rather optimistic to get this into 1.12, though it's getting a bit unwieldly. Carlos Garnacho has already sent me some patches, so he's testing the lot against the GTK branches.

However, since touch support cannot simply be bolted on top and needs to be integrated properly, this has triggered some extra rewrites here and there. I'm currently some 200 commits ahead of master sync-point. I'm planning to get this number down to something sane before merging but meanwhile, sorry, I'll have to keep ignoring you until this is done.

December 05, 2011

Stalk^W Following designs, the easy way
If you want to follow all the new designs from the GNOME Design team, including work-in-progress mockups, gathering of relevant art, etc. be sure to subscribe yourself to the pages that interest you in the various sections of the GNOME Wiki.

A nice trick is using our Wiki's notification, with regex support. Head onto your notification settings page, and add those lines to the "Subscribed wiki pages":

GnomeLive:Design*
GnomeLive:GnomeShell/Design*
GnomeLive:GnomeOS/Design*

December 04, 2011

WebKitGTK+ Hackfest: Day 5
Yesterday was our sponsored dinner, at a very nice vegetarian place, followed by some cinema discussions in a bar where the toilets are hidden behind mirrored walls (most strange).

Still, quite a few happenings in code land:


The hackfest is drawing to a close, and I'll take this opportunity to thank our very kind sponsors for flights, accomodation and even feeding us in the office so we didn't have to stop hacking for long.









Also a big thank you to Igalia for providing us with hacking beer in the evenings (left-overs from Igalia's 10th anniversary party, a happy coincidence).

Many thanks to Xan, Juanjo and Alex for the hackfest organisation, and the personal chauffeur service, and my most heartfelt thanks to Juanjo for his infinite patience to our tourist needs (such as showing us the Torre de Hércules on a windy December afternoon).
Vegas Baby!
Before: No video, because no Flash, and no MP4 support


After: Video, through Totem's Vegas plugin

Totem's new Vegas browser plugin provides you with a way to watch Flash based videos, without using Flash, using libquvi's growing collection of supported sites.

Code is available from GNOME git this instant. Be sure to pass --enable-vegas-plugin=yes to compile the plugin.

December 03, 2011

WebKitGTK+ Hackfest: Day 4
The crema de ojuro took effect. While the effects simmered down, code fixing was still in full flow.


  • Philippe finished landing the fullscreen fixes for the <video>
  • Xan and Claudio started fixing GNOME 3.4 Epiphany design bugs (on the road towards the Web app design)
  • Alex, Martin, Joone and Nayan all looked into Accelerated Compositing. They all owe you, dear reader, blog posts full of nitty gritty details.
  • Jon didn't spend the day debugging bizarre browsers crashes
  • Wingo punched the air as he figured out a tricky memory allocation issue. He also listened to the Thundercats theme tune, in a loop
  • Gustavo and Dan figured out a design for multipart/x-mixed-replace support, as used by some streaming IP cameras, and Gustavo started the implementation
  • Nayan showed legendary patience waiting for tourists outside a haberdashery
  • Dan committed a number of libsoup related cleanups in WebKitGTK+, including a very impressive minus 200 lines cleanup.
WebKitGTK+ Hackfest: Day 3
Another incredible day of hacks, and UI design.


And despite the crema de ojuro, hacking carries on at the week-end. Join us in #webkit-gtk-hackfest on GIMPNet.

December 02, 2011

Improving code readability through temporary variables
We don't always have the luxury of using library interfaces that are sensibly designed and enforce readability self-explanatory (see this presentation). A fictional function may look like this:

extern void foo(struct foo *f,
Bool check_device,
int max_devices,
Bool emulate);

But a calls often end up like this:

foo(mystruct, TRUE, 0, FALSE);

Or, even worse, the function call could be:

foo(mystruct, !dev->invalid, 0, list_is_first_entry(dev->list));

The above is virtually unreadable and to understand it one needs to look at the function definition and the caller at the same time. The simple use of a temporary variable can do wonders here:

Bool check_device = !dev->invalid;
Bool emulate_pointer = list_is_first_entry(dev->list));

foo(mystruct, check_device, 0, emulate_pointer);

It adds a few lines of code (though I suspect the compiler will mostly reduce the output anyway) but it improves readability. Especially in the cases where the source field name is vastly different to the implied effect. In the second example, it's immediately obvious that pointer emulation should happen for the first entry.

December 01, 2011

WebKitGTK+ Hackfest: Day 2
After a late evening yesterday, the hackfest started a bit slower, but started picking up pace again with a big ticket item, the WebKit2 GTK+ API discussion. This was the destination for a lot of the WebKitGTK+ hackers, leaving us outsiders, well, outside. The discussion isn't quite finished.

This lead us onto a little lunchtime kick-about. The arrange 6 v. 6 game turned into a 5 v. 4 before getting to the ground, and finish as a 3 v. 4 when two of our most jet-lagged/backbroke hackers dropped out.

And then onto a lunch. And another late evening.
  • Philippe fixed more bugs in WebKitGTK+'s fullscreen video playback mode
  • Bob uploaded a new draft of his WebKitGTK+ cookbook
  • Gustavo was playing Street Fighter whilst increasing the size of his farm on Facebook (in WebApp mode!)
  • And the new buildbot is up, running, and churning through test suites in a loop, as fast as the hackers can add new code.

November 30, 2011

WebKitGTK+ Hackfest: Day 1, Afternoon
After num-num tapas for lunch (and some chocolatey cake), we got back to the agenda, with Jon presenting the design for the Web application, successor (in spirit, and perhaps in code) to Epiphany.


  • Andy did the initial work on adding new language features to JavaScriptCore (let and const, as used heavily in gnome-shell)
  • Martin and Gustavo worked on a way to automate running the WebKitGTK tests with test fonts, and are working on making all the tests automated, and reproduceable
  • Philippe fixed fullscreen support in the HTML5 YouTube player
  • Dan fixed the broken security status in Epiphany
  • Carlos worked on the WebKit2 support for windowed plugins, and the WebKit side of favicons support
  • Philippe, Martin and yours truly discussed sharing of fullscreen media controls (UI and behaviour) between WebKitGTK, Totem and Sushi, as well as a way to make fullscreening smoother.
I hear they didn't finish all the beers for Igalia's 10th anniversary party...
WebKitGTK+ Hackfest: Day 1, morning
After landing in (not so sunny) A Coruña yesterday, we started bright and early today with the WebKitGTK+ hackfest agenda.

We've got most of the topics pinned, as listed on the wiki. If you have more topics to add to the discussion, feel free to drop by #webkit-gtk-hackfest on GIMPNet IRC, and try to drum up interest in somebody championing your topic.

It looks like we could get some pretty cool demos done by the end of this week!

November 21, 2011

fakemail is handy

For debugging mail problem, e.g. when debugging some emailmerge stuff in LibreOffice recently, fakemail was really really handy when you have a bug which requires generating a couple of hundred emails in quick succession to trigger.

November 18, 2011

Clarifying the "secure boot attack"
Yesterday I wrote about an alleged attack on the Windows 8 secure boot implementation. As I later clarified, it turns out that the story was, to put it charitably, entirely wrong. The attack is a boot kit targeted towards BIOS-based boots. It lives in the MBR. It'll never be executed on any UEFI systems, let alone secure boot ones. In fact, this is precisely the kind of attack that secure boot is intended to protect against. So, context.

The MBR contains code that's executed by the BIOS at boot time. This code is unverifiable - it's permitted to have arbitrary functionality. There's only 440 bytes, but that's enough to jump to somewhere else and read code from elsewhere. There's no way for the BIOS to know that this code is malicious. And one thing this code can obviously do is load the normal boot code and modify it to behave differently. Any self-validation code in the loader can be patched out at this point. The modified loader will then load the kernel, and potentially also modify it. At this point, you've lost. Any attempts to validate the code can be redirected to the original code and so everything will look fine, up until the point where the user runs a specific application and suddenly your kernel is sending all your keystrokes over UDP to someone in Nigeria.

These attacks exist now. They're in the wild. In a normal UEFI world you'd do the same thing by just replacing the UEFI bootloader. But with secure boot you'll be able to validate that the bootloader is appropriately signed and if someone's modified it you'll drop into some remediation mode that recovers your files, from install media if necessary.

Obviously, this protection is based on all the components of secure boot (ie, everything that runs before ExitBootServices() is called) being perfect. As I said, if any of them accept untrusted input and misinterpret it in such a way that they can be tricked into running arbitrary code, you'll still have problems. But when discussing the pros and cons of secure boot, it's important to make sure that we're talking about reality rather than making provably false assertions.

comment count unavailable comments
Introducing the Journal

In the past weeks we have been working on a major new addition to systemd that will hopefully positively change the Linux ecosystem in a number of ways. But see for yourself, check out the full explanation on what we have implemented on the design document we put up on Google Docs.

November 17, 2011

Attacks on secure boot
This is interesting. It's obviously lacking in details yet, but it does highlight one weakness of secure boot. The security for secure boot is all rooted in the firmware - there's no external measurement to validate that everything functioned as expected. That means that if you can cause any trusted component to execute arbitrary code then you've won. So, what reads arbitrary user data? The most obvious components are any driver that binds to user-controlled hardware, any filesystem driver that reads user-provided filesystems and any signed bootloader that reads user-configured data. A USB drive could potentially trigger a bug in the USB stack and run arbitrary code. A malformed FAT filesystem could potentially trigger a bug in the FAT driver and run arbitrary code. A malformed bootloader configuration file or kernel could potentially trigger a bug in the bootloader and run arbitrary code. It may even be possible to find bugs in the PE-COFF binary loader. And once you have the ability to run arbitrary code, you can replace all the EFI entry points and convince the OS that everything is fine anyway.

None of this should be surprising. Secure boot is predicated upon the firmware only executing trusted material until the OS handoff. If you can sidestep that restriction then the entire chain of trust falls down. We're talking about a large body of code that was written without the assumption that it would have to be resistant to sustained attack, and which has now been put in a context where people are actively trying to break it. Bugs are pretty inevitable. I'd expect a lot of work to be done on firmware implementations between now and Windows 8 ship date.

(Update: It's probably worth pointing out that although Ars talk about it defeating secure boot, the primary sources don't seem to mention secure boot at all. Windows 8 will boot without secure boot, so it's possible that this attack merely handles the non-secure boot case)

(Further update: It turns out to be a purely BIOS based attack. In other words, it's an example of the kind of thing that secure boot is intended to prevent, not any kind of compromise of a secure boot implementation)

comment count unavailable comments
GPT disks in a BIOS world
Starting with Fedora 16 we're installing using GPT disklabels by default, even on BIOS-based systems. This is worth noting because most BIOSes have absolutely no idea what GPT is, which you'd think would create some problems. And, unsurprisingly, it does. Shock. But let's have an overview.

GPT, or GUID Partition Table, is part of the UEFI specification. It defines a partition table format that allows up to 128 partitions per disk, with 64 bit start and end values allowing partitions up to 9.4ZB (assuming 512 byte blocks). This is great, because the existing MBR partitioning format only allows up to 2.2TB when using 512 byte blocks. But most BIOSes (and most older operating systems) don't understand GPT, so plugging in a GPT-partitioned disk would result in the system believing that the drive was uninitialised. This is avoided by specifying a protective MBR. This is a valid MBR partition table with a single partition covering the entire disk (or the first 2.2TB of the disk if it's larger than that) and the partition type set to 0xee ("GPT Protective"). GPT-unaware BIOSes and operating systems will see a partition they don't understand and simply ignore it.

But how do we boot a GPT-labelled disk with a protective MBR on a system that doesn't understand GPT? The key here is that BIOS is pretty dumb. Typically a BIOS will see a disk and just attempt to execute the code in the first sector. This MBR code knows how to do the rest of the boot, including parsing the partition table if necessary. The BIOS doesn't need to care at all.

Of course, some BIOSes choose to care. We've seen a small number of machines that, when exposed to a GPT disk, refuse to boot because they parse the MBR partition map and don't like what they see. This is typically accompanied by a message along the lines of "No operating system found". What we've found is that they're looking for a partition marked with the bootable flag, and if no partitions are marked bootable they assume that there's no OS. This is in contrast to the traditional use of the flag, which is merely a hint to the MBR as to which partition boot code it should execute.

So, should we set that flag? The UEFI specification specifically forbids it - table 15 states that the BootIndicator byte must be set to 0. Once again we're left in an unfortunate position where the specification and reality collide in an awkward way.

If this happens to you after a Fedora 16 install, you have two choices. The first is to reinstall with the "nogpt" boot argument. The installer will then set up a traditional MBR partition table. The second is to boot off a live CD and run fdisk against the boot disk. It'll give a bunch of scary warnings. Ignore them. Hit "a", then "1", then "w" to write it to disk. Things ought to work then. We'll figure out something better for F17.

comment count unavailable comments
Making timeouts work with suspend
A reasonably common design for applications that want to run code at a specific time is something like:

time_t wakeup_time = get_next_event_time();
time_t now = time(NULL);
sleep(wakeup_time-now);

This works absolutely fine, except that sleep() ignores time spent with a suspended system. If you sleep(3600) and then immediately suspend for 45 minutes you'll wake up after 105 minutes, not 60. Which probably isn't what you want. If you want a timer that'll expire at a specific time (or immediately after resume if that time passed during suspend), use the POSIX timer family (timer_create, timer_settime and friends) with CLOCK_REALTIME. It's a signal-driven interface rather than a blocking one, so implementation may be a little more complicated, but it has the advantage of actually working.

comment count unavailable comments

November 13, 2011

libexttextcat 3.2.0

Released libexttextcat 3.2.0 (Extended Text Categorization used to guess the language that input text is written in). It can be found in this download dir. No code changes from 3.1.1, but adds a large collection of extra language signatures to nearly add the same language support to libexttextcat as LibreOffice supports, modulo languages that LibreOffice supports which don’t have a convenient UDHR translation to use as a basis to generate a language fingerprint.

Introducing the ColorHug open source colorimeter

For the past 3 weeks I’ve been working long nights on an open source colorimeter called the ColorHug. This is hardware that measures the colors shown on the screen and creates a color profile. Existing hardware is proprietary and 100% closed, and my hardware is open source. It has a GPL bootloader, GPL firmware image and GPL hardware schematics and PCBs. It’s faster than the proprietary hardware, and more importantly a lot cheaper.

Making hardware does cost money, and I can’t give the hardware away for free like I do my other software. I’m aiming to do an initial production run of 50 units, but I’m going to need some advanced orders just to make sure I don’t get stuck with a lot of stock and no buyers. I’m offering a 20% discount on each unit, on the assumption the first users will be testing the firmware and reporting problems. If you want to support a cool open source project, I’m asking £48 for each unit, plus postage and packaging. There’s a whole website if you want to know more about the project, and there’s even a newsletter if you don’t need hardware, but what to know how we’re getting on.

I would very much appreciate it if people could publicize this project, and help me get to my target of 50 pre-orders.

Thanks,

Richard

November 10, 2011

Blue sky, white sand, and NetworkManager 0.9.2

The view as I type 'git tag 0.9.2'... (shazwan cc-by-2.0)

There’s no better way to celebrate the release of NetworkManager 0.9.2 than a sip of ice-cold cocktail.  It’s something pink-colored — I don’t know what — and it’s phenomenal.  And if I ever run out, I just ring a bell and somebody fills it up!  It’s basically like Paradise, except Paradise doesn’t have the latest version of NetworkManager.  Here’s a hot tip: make your first half billion and buy yourself a private island.  Then move there and write open-source software for fun.  It’s a pretty great life.  After a hard day on the beach bending networks to my will, I wind down by building buried hatches solely to confuse the island’s next owner (I’m trading up to a private archipelago in a few years).

But I digress.  So many people contributed to this release.  Unfortunately they couldn’t all fly out to my private island for the release party so I’ll just have to call them out by name instead.  I’m sure they’ll take Internet fame as a consolation prize, right?  So a huge thanks to Alfredo Matos, Colin Walters, Dan Winship, David Rothlisberger, Evan Broder, Florian Echtler, Gary Ching-Pang Lin, Jirí Klimeš, Larry Reaves, Ludwig Nussel, Mathieu Trudel-Lapierre, Michael Stapelberg, Thomas Bechtold, Thomas Graf, Thomas Jarosch, Tore Anderson, Michael Biebl, Vincent Untz, Anders Feder, Giovanni Campagna, Murilo Opsfelder, David Woodhouse, all our translators, and all our testers.  It wouldn’t have happened without you.

This release packs in some great stuff aside from the usual bug fixes and pixie dust: translated country names in the mobile broadband provider wizard, VPN details in the applet’s Connection Information dialog, auto-unlocking of GSM modems, support for libnl2 and libnl3, better IPv6 handling, enhancements for nmcli, rfkill fixes for EeePCs, GObject Introspection updates, better cooperation with unmanaged devices, timestamps for VPN connections, increased dnsmasq cache size, and more.  Get your tarballs on:

http://ftp.gnome.org/pub/gnome/sources/NetworkManager/0.9/
http://ftp.gnome.org/pub/gnome/sources/network-manager-applet/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openconnect/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openswan/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openvpn/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-pptp/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-vpnc/0.9/

0.9.4: a Smörgåsbord of Freaking Awesome

What’s even more exciting is what’s all piled up for 0.9.4.  We’ve killed WEXT and now use the more robust nl80211 for talking to well-behaved kernel drivers.  We’ve uncoupled IPv4 and IPv6 addressing so that when one completes the connection is usable while we wait for the other one to complete or time out.  We’ve added bonding support, and VLANs and bridges are next.  We’ll have better firewall interaction.  We’ll probably have connectivity detection as well.  Many of these features are finished and merged to git master already.

Hey, 0.8.6 is out too!

If you’re into anachronisms, then we’ve got another release for you too.  0.8.6 got tagged earlier this week, and it’s got IPv6 fixes, auto-unlocking of GSM modems, improved usability of IP address and routing entry in the editor, notifications of mobile broadband changes, VPN information in the Connection Information dialog, better handling of gadget devices, retry of Ethernet connections on carrier bounces, allowing certificate paths in keyfile plugins, MAC address blacklists, on-the-fly recognition of newly installed VPN plugins, subject verification of 802.1x certificates, builds without PolicyKit, and much more.

So really it’s just raining NetworkManager goodness.  Except here on my island, where it’s always sunny, breezy, and absolutely perfect.  Time for another drink.  Cheers.

November 09, 2011

Better logging

First an update on deprecations: GLib and GTK+ master have been fully converted to use function attributes for deprecations now, and most uses of G_DISABLE_DEPRECATED guards in GLib headers have been removed. They are still used for deprecated things that are not symbols, like macros and enum values. I haven’t done the same for GTK_DISABLE_DEPRECATED yet, but it should happen fairly soon. Looking forward to a future of less deprecation-induced build breakage (unless you insist on -Werror, of course…).

Continuing the theme of ‘making things suck less for developers’… another favourite is logging.  It is very hard to get right; you want to add plenty of debug and diagnostic messages to help with, well, debugging. But then people complain if you fill up their syslog or .xsession-errors with tons of debug spew, like here. The GLib logging system with g_log() and the macro wrappers g_debug(), g_warning(), etc leaves much to be desired. It does have the concept of replaceable log handler functions, which is ok for complicated uses. But unfortunately, the default handler is so inflexible that many applications are forced to install a custom handler just to be able to turn debug output on and off.

As a small step towards making better logging, the default log handler in GLib master now only emits errors, warnings and criticals. Informational and debug messages are only emitted if you set the G_MESSAGES_DEBUG environment variable. The value can be a list of log domains that you are interested in, or the special value ‘all’ to get it all. As a reminder, the log domain is almost invisible in your code unless you use g_log() directly; it’s what you define by adding -DG_LOG_DOMAIN=”MyCoolApp” to your CPPFLAGS.

 

Properly booting a Mac
This is mostly for my own reference, but since it might be useful to others:

By "Properly booting" I mean "Integrating into the boot system as well as Mac OS X does". The device should be visible from the boot picker menu and should be selectable as a startup disk. For this to happen the boot should be in HFS+ format and have the following files:
  • /mach_kernel (can be empty)
  • /System/Library/CoreServices/boot.efi (may be booted, if so should be a symlink to the actual bootloader)
  • /System/Library/CoreServices/SystemVersion.plist which should look something like
    <xml version="1.0" encoding="UTF-8"?>
    <plist version="1.0">
    <dict>
            <key>ProductBuildVersion</key>
            <string></string>
            <key>ProductName</key>
            <string>Linux</string>
            <key>ProductVersion</key>
            <string>Fedora 16</string>
    </dict>
    </plist>
That's enough to get it to appear in the startup disk boot pane. Getting it in the boot picker requires it to be blessed. You probably also want a .VolumeIcon.icns in / in order to get an appropriate icon.

Now all I need is an aesthetically appealing boot loader.

comment count unavailable comments

November 08, 2011

Application/Compositor Synchronization

This blog entry continues an extended series of posts over the last couple of years. Older entries:

What we figured out in the last post was that if you can’t animate at 60fps, then from the point of achieving a smooth display, a very good thing to do is to just animate as fast as you can while still giving the compositor time to redraw. The process is represented visually below. (You can click on the image for a larger version.)

The top section shows a timeline of activity for the Application, Compositor, and X server. At the bottom, we show the contents of the application’s window pixmap, the back buffer, and the front buffer as time progresses. From this, we can get an idea of the time between the point where a user hits a key and the point where that displays on the screen: the input latency. The keystroke C almost immediately makes its way into a new application frame, and that new frame is almost immediately drawn by the compositor into the back buffer, and the back buffer is almost immediately swapped by the X server. On the other hand, the keystroke D suffers multiple delays.

What happens if we use the same algorithm when we’re unloaded – when the total drawing time is less than the interval between screen refreshes? Then it looks like:

This is basically working pretty well – but we note that even though the application is drawing quickly and the entire system is unloaded we still have a lot of input latency. If we plot the latency versus the application drawing time it looks like:

The shaded area shows the theoretical range of latencies, the solid line the theoretical average latency, and the points show min/max/avg latencies as measured in a simulation. (It should be mentioned that this is only the latency when we’re continually drawing frames. An isolated frame won’t have any problems with previously queued frames, so will appear on the screen with minimal latency.)

We could potentially improve this by having the application delay rendering a new frame – the compositor can use the time used to render the last frame to make a guess as to a “deadline” – a time by which the application needs to have the frame rendered. We can again look at a timeline plot and simulated latencies for this algorithm:

There are downsides to delaying frame render – the obvious one is that if we guess wrong and the application starts the frame too late, then we can entirely miss a frame. From a smoothness point of view this looks really bad. In general, an application should only use a deadline provided by the compositor if it has reason to believe that the next frame is roughly similar to the previous one. Another disadvantage is that the delay algorithm does cause a frame-rate cliff as soon as the time to draw a frame exceeds the vblank period – there is an instant drop from 60fps to 30fps.

Which of these two algorithms is better likely depends upon the application: if an application wants maximum animation smoothness and protection from glitches, drawing frames as early makes sense. On the other hand, if input latency is a critical factor – for a game or real-time music application, then delaying frame drawing as late as possible would be preferable.

So, what we want to do from the level of application/compositor synchronization is provide enough information to allow applications to implement different algorithms. After drawing a frame, the compositor should send a message to the application containing:

  • The expected time at which the frame will be displayed on screen
  • If possible, a deadline by which the application needs to have finished drawing the next frame to get it appear onscreen.
  • The time that the next frame will be displayed onscreen

But even without the deadline information, just having a basic response at the end of the frame already greatly improves the situation from the current situation. I’m working on a proposal to add application/compositor synchronization to the Extended Window Manager Hints specification.


November 07, 2011

Kernel Hackers Panel

At LinuxCon Europe/ELCE I had the chance to moderate the kernel hackers panel with Linus Torvalds, Alan Cox, Paul McKenney and Thomas Gleixner on stage. I like to believe it went quite well, but check it out for yourself, as a video recording is now available online:

<video controls="1" height="450" width="800"> <source src="http://free-electrons.com/pub/video/2011/elce/elce-2011-torvalds-cox-gleixner-mackenney-kernel-developer-panel-450p.webm"> </video>

For me personally I think the most notable topic covered was Control Groups, and the clarification that they are something that is needed even though their implementation right now is in many ways less than perfect. But in the end there is no reasonable way around it, and much like SMP, technology that complicates things substantially but is ultimately unavoidable.

Other videos from ELCE are online now, too.

ObexFTP in GNOME, (non-)update
If you've tried to use ObexFTP browsing (browsing files on mobile phones over Bluetooth) in GNOME in recent times, and didn't get a good experience from it (crashes, or very unreliable browsing), those problems are known, and due to the architecture used to implement the functionality.

If you want to help make ObexFTP browsing good again, please try to convince one of your coder friends to help port the existing code to use the "gobex" library that the obexd D-Bus service uses.

Unless somebody steps up in the GNOME 3.4 timeframe, I will disable the access to the functionality in gnome-bluetooth. The brokenness makes us look very bad, and the files are still available through other (cabled) means in most cases.

November 03, 2011

Understanding the current state of UEFI
This story has been floating around for a week or so. The summary is that someone bought a system that has UEFI and is having trouble installing Linux on it. In itself, not a problem. But various people have either conflated this with the secure boot issue or suggested that UEFI is a fundamentally anti-Linux technology.

Right now there are no machines shipping to the public with secure boot enabled. None at all. If you're having problems installing Linux on a machine with UEFI then it's not because of secure boot. So what is actually causing the problem?

UEFI is a complicated specification, with 2.3.1A being 2214 pages long. It's a large body of code. There's a lot of subtleties. It's very easy for people to get things wrong. For example, we've seen issues where calling SetVirtualAddressMap() resulted in the firmware referencing boot services code, a clear violation of the spec on the firmware authors' part. We've also found machines that failed to boot because grub wasn't aligning its stack properly, a clear violation of the spec on our part.

Software is difficult. People make mistakes. When something mysteriously fails to work the immediate assumption should be that you've found a bug, not a conspiracy. Over time we'll find those bugs and fix them, but until then just treat UEFI boot failures like any other bug - annoying, but not malicious.

comment count unavailable comments
UEFI secure boot white paper
As people have probably noticed, last week we published a white paper on the UEFI secure boot issue. This was written in collaboration with Canonical and wouldn't have been possible without the tireless work of Jeremy Kerr. It's the kind of problem that affects the entire Linux community, and I'm glad that we've both demonstrated that being competitors is less important than working together for the benefit of everyone.

comment count unavailable comments