January 29, 2015

Lenovo's X1 Carbon 3rd touchpad woes

Lenovo released a new set of laptops for 2015 with a new (old) feature: the trackpoint device has the physical buttons back. Last year's experiment apparently didn't work out so well.

What do we do in Linux with the last generation's touchpads? The kernel marks them with INPUT_PROP_TOPBUTTONPAD based on the PNPID [1]. In the X.Org synaptics driver and libinput we take that property and emulate top software buttons based on it. That took us a while to get sorted and feed into the myriad of Linux distributions out there but at some point last year the delicate balance of nature was restored and the touchpad-related rage dropped to the usual background noise.

Slow-forward to 2015 and Lenovo introduces the new series. In the absence of unnecessary creativity they are called the X1 Carbon 3rd, T450, T550, X250, W550, L450, etc. Lenovo did away with the un(der)-appreciated top software buttons and re-introduced physical buttons for the trackpoint. Now, that's bound to make everyone happy again. However, as we learned from Agent Smith, happiness is not the default state of humans so Lenovo made sure the harvest is safe.

What we expected to happen was that the trackpoint device has BTN_LEFT, BTN_MIDDLE, BTN_RIGHT and the touchpad has BTN_LEFT and is marked with INPUT_PROP_BUTTONPAD (i.e. it is a Clickpad). That is the case on the x220 generation and the T440 generation. Though the latter doesn't actually have trackpoint buttons and we emulated them in software.

On the X1 Carbon 3rd, the trackpoint has BTN_LEFT, BTN_MIDDLE, BTN_RIGHT but they never send events. The touchpad has BTN_LEFT and BTN_0, BTN_1 and BTN_2 [2]. Clicking the left button on the trackpoint generates BTN_0 on the touchpad device, clicking the right button generates BTN_1 on the touchpad device. So in short, Lenovo has decided to wire the newly re-introduced trackpoint buttons to the touchpad, not the trackpoint. [3] The middle button is currently dead, which is a kernel bug. In the meantime, we think of it as a security feature - never accidentally paste your password into your IRC session again!
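You can watch this wiring yourself with a few lines of Python, assuming the python-evdev bindings are installed; the device node below is an example, and reading it usually requires root:

import evdev

# find the touchpad's event node via evdev.list_devices()
dev = evdev.InputDevice('/dev/input/event4')
print(dev.name)
for event in dev.read_loop():
    if event.type == evdev.ecodes.EV_KEY and event.value == 1:
        # trackpoint button presses show up here as BTN_0 (left) and BTN_1 (right)
        print(evdev.ecodes.BTN.get(event.code, event.code))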

What does this mean for us? Neither synaptics nor evdev nor libinput currently supports this, so we've been busy writing patches like crazy. The patch goes into the kernel and udev.... The two patches needed go into the kernel and udev, and libinput. No, the three patches needed go into the kernel, udev and libinput, and synaptics. The four patches, no, wait. Amongst the projects needing patches are the kernel, udev, libinput and synaptics. I'll try again: patches are needed in the kernel, in udev, in libinput and in synaptics [4].

With those put together, things pretty much work as they're supposed to. libinput handles middle button scrolling as well this way but synaptics won't, for much the same reason it didn't work in the previous generation: synaptics can't talk to evdev and vice versa. And given that synaptics is on life support or in palliative care, depending on how you look at it, I recommend not holding your breath for a fix. Otherwise you may join it quickly.

Note that all the patches are fresh off the presses and there may be a few bits changing before they are done. If you absolutely can't live without the trackpoint's buttons you can work around it by disabling the synaptics kernel driver until the patches have trickled down to your distribution.

The tracking bug for all this is Bug 88609. Feel free to CC yourself on it. Bring popcorn.

Final note: I haven't seen logs from the T450, T550, ... devices yet, so this is only confirmed on the X1 Carbon so far. Given the hardware is essentially identical I expect it to be true for the rest of the series though.

[1] We also apply quirks for the 2013 generation because the firmware was buggy - a problem Synaptics Inc. has since fixed (though the fix currently gives us slight headaches).
[2] It is also marked with INPUT_PROP_TOPBUTTONPAD which is a bug. It uses a new PNPID but one that was in the range we previously believed was for pads without trackpoint buttons. That's an easy thing to fix.
[3] The reason for that seems to be HW design: this way they can keep the same case/keyboard and just swap the touchpad bits.
[4] synaptics is old enough to support dedicated scroll buttons, which used to send BTN_0 and BTN_1; those codes are thus interpreted as scroll up/down events.

January 28, 2015

Detecting fake flash

I’ve been using F3 to check my flash drives, and this is how I discovered my drives were counterfeit. It seems to me this kind of feature needs to be built inside gnome-multi-writer itself to avoid sending fake flash out to customers. Last night I wrote a simple tool called gnome-multi-writer-probe which does the following few things:

* Reads the existing data from the drive in 32kB chunks every 32MB-ish into RAM
* Writes random blocks of 32kB every 32MB-ish, and also stores them in RAM
* Resets the drive
* Reads all the 32kB blocks from slightly different addresses and sizes and compares them to the random data in RAM
* Writes all the saved data back to the drive.

It only takes a few seconds on most drives. It also tries to be paranoid, and saves the data back to the drive as best it can when it encounters an error. That said, please don’t use this tool on any drives that have important data on them; assume you’ll have to reformat them after using this tool. Also, it’s probably a really good idea to unmount any drives before you try this.
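The core of that probe idea is simple enough to sketch in Python. This is illustrative only: there is no error handling, no drive reset, the device node is a placeholder, and the real tool is considerably more careful:

import os

DEV = '/dev/sdX'                # placeholder - point this at the right device
BLOCK = 32 * 1024               # 32kB test blocks...
STRIDE = 32 * 1024 * 1024       # ...every 32MB

fd = os.open(DEV, os.O_RDWR)
size = os.lseek(fd, 0, os.SEEK_END)
saved, written = {}, {}
for off in range(0, size - BLOCK, STRIDE):
    os.lseek(fd, off, os.SEEK_SET)
    saved[off] = os.read(fd, BLOCK)      # keep the original data for restoring
    written[off] = os.urandom(BLOCK)
    os.lseek(fd, off, os.SEEK_SET)
    os.write(fd, written[off])
os.fsync(fd)                             # the real tool also resets the drive here
for off in sorted(written):              # fake flash wraps or drops blocks
    os.lseek(fd, off, os.SEEK_SET)
    if os.read(fd, BLOCK) != written[off]:
        print("mismatch at %dMB" % (off // (1024 * 1024)))
for off in sorted(saved):                # put the original data back
    os.lseek(fd, off, os.SEEK_SET)
    os.write(fd, saved[off])
os.close(fd)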

If you’ve got access to gnome-multi-writer from git (either from jhbuild, or from my repo) then please could you try this:

sudo gnome-multi-writer-probe /dev/sdX

Where sdX is the USB drive you want to test. I’d be interested in the output, and especially interested if you have any fake flash media you can test this with. Either leave a comment here, grab me on IRC or send me an email. Thanks.

January 27, 2015

Scammers at promo-newa.com

tl;dr Don’t use promo-newa.com, they are scammers that sell fake flash.

Longer version: For the ColorHug project we buy a lot of the custom parts direct from China at a fraction of the price available to us in the UK, even with import tax considered. It would be impossible to produce such a low cost device and still make enough money to make it worth giving up our evenings and weekends. This often means sending thousands of dollars to sketchy-looking companies willing to take on small (to them!) custom orders of a few thousand parts.

So far we’ve been very lucky, until last week. I ordered 1000 customized 1GB flash drives to use as a LiveUSB image rather than using a LiveCD. I checked out the company as usual, and ordered a sample. The sample came back good quality, with 1GB of fast flash. Payment in full was sent, which isn’t unusual for my other suppliers in China.

Fast forward a few weeks. 1000 USB drives arrived, which look great. Great, until you start using them with GNOME MultiWriter, which kept throwing validation warnings. With the awesome F3 and a few remove-insert cycles, the f3probe tool told me the flash chip was fake, reporting the capacity to be 1GB when it was actually 96MB looped around 10 times.

Taking the drives apart you could also see the chip itself was different from the sample, and the plastic molding and metal retaining tray were of lower quality. I contacted the seller, who said he would speak to the factory later that day. The seller got back to me today, and told me that the factory has produced “B quality drives” and, basically, that I got what I paid for. For another 1600 USD they would send me the 1GB ICs, which I would have to switch in the USB units myself. Fool me once, shame on you; fool me twice, shame on me.

I suppose people can use the tiny flash drives to get the .icc profile off the LiveCD image, which was always a stumbling block for some people, but basically the drives are worthless to me as LiveUSB devices. I’m still undecided whether to include them in the ColorHug box; i.e. is a free 96MB drive better than them all going into landfill?

As this is China, I understand all my money is gone. The company listing is gone from Alibaba, so there’s not a lot I can do there. So that other people can hopefully avoid the same mistake, I’ve listed all the details here, which will hopefully become googleable:

Promo-Newa Electronic Limited(Shenzhen)
Wei and Ping Group Limited(Hongkong)  

Office: Building A, HuaQiang Garden, North HuaQiang Road, Futian district, Shenzhen China, 0755-3631 4600
Factory: Building 4, DengXinKeng Industrial Zone, JiHua Road,LongGang District, Shenzhen, China
Registered Address: 15/B—15/F Cheuk Nang Plaza 250 Hennessy Road, HongKong
Email: sales@promo-newa.com
Skype: promonewa

January 22, 2015

Sandboxed applications for GNOME

It is no secret that we’ve been interested in sandboxed applications for a while. It is evident here, here, here or here, to name just a few.

What may not be widely known yet is that we have been working on putting together a working implementation of these ideas. Alexander Larsson has made steady progress, and we’re now almost at the point where it is useful for  other people to start playing with it.

If you want to go straight to the playing part, you can head to this wiki page, which has all the links and explanations you need.


Some rights reserved by whiteoakart

Why sandboxed apps?

There are several reasons:

  • We want to make it possible for 3rd parties to create and distribute applications that work on multiple distributions.
  • We want to run the applications with as little access as possible to the host (for example user files or network access), so that users can try applications without necessarily having to trust them fully.
  • We want to make it much easier to write applications – jhbuild has its uses, but it is an endless source of frustration and a very high hurdle for new contributors.

Traditionally, the only answer available to people who need to distribute an executable that works across several Linux distributions is to statically link all but the lowest-level dependencies.

That is not only wasteful in terms of bandwidth for downloading, but also at runtime, when every application loads its own copy of those dependencies, instead of sharing them.

And the real problem comes when one of those dependencies needs to be updated, e.g. because of a security issue (some people still remember the infamous zlib double-free incident). Dealing with this in a fairly efficient way is a strong point of the Linux packaging model, as its proponents are quick to point out.

So, are sandboxed apps any better? By themselves, they aren’t.

Runtimes and bundles

We suggest introducing the concept of a runtime to help with this. A runtime provides a well-defined environment that an app can run in – one way to think of it is as a /usr filesystem with fixed contents.
Typical examples would be “GNOME 3.14” or “KDE 5.6”.

It is important to note: you can have multiple runtimes installed on the system, and even multiple versions of the same runtime. Things that are not included in the runtime will still have to be bundled with the application – but the problem becomes much more manageable.

What about the applications themselves? In the filesystem, an app bundle is simply a directory which contains a metadata file that describes the application, what runtime it needs, and various other things. The actual contents of the app bundle are in a subdirectory. The last component of an app bundle is another subdirectory, containing the various files that are needed by the host system and the session to present the app to the user: desktop files, icons, etc.
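As a purely hypothetical sketch (the exact key names may differ from the real implementation), the metadata file for an editor might read:

[Application]
name=org.gnome.gedit
runtime=org.gnome.Platform/x86_64/3.15
sdk=org.gnome.Sdk/x86_64/3.15
command=gedit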

The only way to run such a bundled application is through a helper, which sets up the  sandbox. It uses kernel namespaces and bind mounts to isolate the application from the host system and its filesystem. The app bundle contents get mounted under /self, and the runtime gets mounted under /usr.

But what about the developer experience? The runtime idea has a counterpart that helps with this: the developer runtime, or sdk. It is basically the runtime with the ‘devel’ parts added, including tools like a compiler and a debugger. And similar to the ‘xdg-app run’ command that sets up a sandbox to run an application in, there is an ‘xdg-app build’ command that sets up a ‘developer sandbox’ with the sdk.

Is this progress?

One question I expect is: What about freedom? This sounds just like corporate walled gardens and app stores. I think this is a fair question – we are trying to replicate some of the strong points of the app store model. The existing examples of this model have a strong flavor of control, and focus entirely on consumption as opposed to creation.

But I think we can actually turn this into a freedom-enhancing change, if we pay attention while building it.

One vision I have is that we could have a “Modify this application” context menu item in gnome-shell which downloads the sources of the app bundle, sets up the right sdk, opens the sources in your favourite IDE, where you can make your modifications, build it, test it and create a new bundle that you can share with your friends.

In particular the last part (wrapping your modifications in an easy-to-share form) is really not easy in the traditional distribution world, where everything has to be a package that comes from your distributor.

This might be a great fit for gnome-builder, which will hopefully gain support for building bundled applications. Coincidentally, the gnome-builder project is just entering the last week of its fundraising campaign – if you haven’t donated yet, you have 7 days left to do so.

Our implementation

Some notable facts about the implementation that Alex has been working on:

  • Both runtimes and app bundles can be installed per-user and system-wide, and they can coexist perfectly fine with traditional applications. There’s no need for everybody to adopt this model at once; we can transition to it gradually.
  • We use OSTree to distribute both runtimes and applications, as well as updates. OSTree is a good fit for this, because its use of content-based addressing and hardlinks transparently makes runtimes and bundles share disk space, and at the same time it doesn’t impose strong requirements on the host system. But OSTree does not have to be the only distribution mechanism – the definition of the filesystem layout for applications and the sandboxing setup has no dependencies on it.
  • The build tooling supports using rpmbuild and rpms to build runtimes and applications. With this, what we do becomes very similar to the rpm-ostree project: They use rpms to populate OS images on the server side, we use rpms to put together runtimes and applications. In both cases, the results get distributed to end users via OSTree.
  • We have a repository with a few example applications and a yocto-based runtime for GNOME 3.15.

What's next?

There are lots of smaller (and some bigger) things left to do.

Currently, we are working on making gnome-software show and handle these application bundles in addition to  traditional packaged applications.

Our short-term goal is to provide an initial test version of a ‘reference runtime’ for the GNOME 3.16 release.

If you want to learn more, you can study the SandboxedApps wiki page that I’ve already mentioned, or you can come to DevConf.cz, where Alex will be presenting his work.

Lenovo T440, T540 latest generation touchpad issues

It seems we can't ever get rid of the issues with this series. Daniel Martin filed a kernel bug for the latest series of these devices (Oct 2014) and it looks like they all need manual fixing again.

When the *40 series first came out, the PS/2 firmware was buggy and advertised bogus coordinate ranges for the x/y axes. Since we use those coordinate ranges to set up the size and position of software buttons (very much needed since that series did away with the physical trackpoint buttons) we added kernel patches for each of those laptops. The kernel would match on the PNPID (e.g. LEN0036 on a T440) and fix the min/max range for both axes based on actual measurements. Since this requires someone to have a laptop, measure it, file a bug or send a patch, wait for it to get into the kernel and then wait for it to get into distros, it took quite a while to get all models supported.

Lenovo updated the series in Oct 2014 and it's starting to get into the hands of users. As it turns out, the touchpads now have different coordinate ranges but the same PNPID. And the values reported by the firmware are still bogus, so we need the same quirk, again, for each touchpad. Update 22/01/15: looks like the ranges are correct enough, so we don't need to update all ranges separately.

So in short: if you have one of the latest series *40 touchpads, your touchpad software buttons will be off. CC yourself on the kernel bug and if you have a model that's not listed there yet, add the required data. Eventually that will end up in the kernel and then everything is hunky-dory again. Until then, have a drink on behalf of the Synaptics/Lenovo QA departments.

Now the obvious question: why does this work with Windows? They don't use the PS/2 protocol but the SMBus/RMI4 interface and thus PS/2 firmware correctness is apparently not top priority for the QA departments. But the SMBus protocol requires the Host Notify feature, which caused Synaptics to reimplement the i2c driver for Windows. And that's what is shipped/preinstalled as driver. We don't support Host Notify on Linux, so there goes that idea. But there's strong suspicion that's not the only piece of the puzzle that's missing anyway...

Update 22/01/15: The min/max ranges advertised seem to be correct in the newer versions which would indicate that Synaptics has fixed the firmware. That's great (except for re-using the PNPID). Now we need to just detect that and drop the quirks for the newer touchpads. Hans has a good suggestion for how to do this, so with a bit of luck this will end up being only one kernel patch instead of one per device.

January 21, 2015

Plugable USB Hubs

Joshua from Plugable sent me 4 different USB hubs this week so they could be added as quirks to gnome-multi-writer. If you’re going to be writing lots of USB drives, the Plugable USB3 hubs now work really well. I’ve got a feeling that inserting and removing the drive is going to be slower than the actual writing and verifying now…

Moving update information from the distribution to upstream

I’ve been talking to various people about the update descriptions we show to the user. Without exception, the messages we show to end users are really bad. For example, the overly-complex-but-not-actually-useful:

Screenshot from 2015-01-21 10:56:34

Or, the even more-to-the-point:

Update to 3.15.4

I’m guilty of both myself. Why is this? Typically this text is written by an over-worked and under-paid packager doing updates to many applications and packages. Sometimes the packager might be the upstream maintainer, or at least involved in the project, but many times it’s just some random person that got fingered to maintain a particular package. That doesn’t make them the ideal person to write beautiful prose and text that thousands of end users are going to read. It also doesn’t make sense to write the same beautiful prose again and again for every distribution out there.

So, what do we need? We need a person who probably wrote the code, or at least signed it off, who cares about the project and cares about the user experience. i.e. the upstream maintainer.

What I’m proposing is that we ask upstream maintainers to write the release information in a way that can be shown in the software center. NEWS files are not standardized, and you don’t typically have a NEWS file for each application in your upstream tarball, so we need something else.

Surprise, surprise: it’s AppStream to the rescue. AppStream has a <release> object that fits the bill almost completely; you can put in upstream version information and long (optionally translated) formatted descriptions.
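Such a release entry in an appdata file looks along these lines (version, date and text are invented for illustration):

<releases>
  <release version="3.15.4" timestamp="1421971200">
    <description>
      <p>This release fixes a crash when adding a joystick.</p>
    </description>
  </release>
</releases>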

Of course, you don’t want to write both NEWS and the various appdata files at release time, as that just increases the workload of the overly-busy upstream maintainer. In this case we can use appstream-util appdata-to-news in the buildsystem and generate the former from the latter automatically. We’re outputting markdown for the NEWS file, which seems to be a fairly good approximation of what NEWS files actually look like, at least for GNOME.

For a real-world example, see the GNOME MultiWriter example commit that uses this.

There are several problems with this approach. One is that the translators might have to translate lots more text; and the obvious solution to that seems to be to only mark strings to be translated for stable versions. Alas, projects like GNOME don’t allow any new strings in stable versions, so we’ll either have to come up with an “except for release notes” amendment to that, or just say that all the release notes are only ever available in the C locale.

The huge thing to take away from this blog, if you are intending to use this new feature, is that update descriptions have to be understandable by end users. “Various bug fixes” is not helpful, but “Fixes a crash when adding a joystick” is. End users neither care about nor understand “Use libgusb rather than libusbx”, and technical details that do not affect the UI or UX of the application should be omitted.

This doesn’t work for distribution releases, e.g. 3.14.1-1 to 3.14.1-2, but typically these are not huge changes that we need to show so prominently to the user.

I’m also writing a news-to-appdata.py script, so if anyone wants to take the plunge on an existing project it might be good to wait for that unless you like lots of copy and pasting.

Comments, as always, welcome.

January 19, 2015

GNOME Wayland porting – the endgame

Mosaic by Alison's Eyes.

It’s been a while since I mentioned Wayland in this space – of course that doesn’t mean that work on GNOME Wayland support has stopped.

Quite the opposite, in fact. The GNOME 3.15.4 release that is due this week will close a number of the remaining gaps:

  • libinput 0.8 has been released, with most of the APIs that we need
  • Handling of input configuration has largely moved to mutter (the picture is not quite as clean as it could be, because we still need support for X configurations that do not use libinput in gnome-settings-daemon)
  • The corresponding control-center changes are about to land too
  • gnome-shell supports pointer barriers for the hot corner under Wayland as well
  • GTK+ popovers are using subsurfaces now, so they can extend beyond window boundaries
  • GTK+ prefers the Wayland backend over the X backend, if both are available

One gap that has not been closed yet is support for Wacom tablets. The required libinput changes and Wayland protocol extensions did not quite make it into libinput 0.8 – but much of the work has been done and will hopefully land before long.

The heroes of this latest round of Wayland progress are Carlos Garnacho, Rui Matos, Jonas Ådahl, Jasper St. Pierre, Peter Hutterer and Hans de Goede, to name just a few.

I can hear you ask the question:

That's all great, but when is it going to be done?!

Clearly, we want to reach an endpoint where we can declare the Wayland port done and use it by default in Fedora. So, in order to not drag this out forever, we are aiming for the following:

  •  Use Wayland for the login screen in Fedora 22

Why? The login screen is pretty self-contained and we don’t have to worry about application compatibility or support for exotic devices. And we will get Wayland to run on (almost) all systems this way, which should give a lot more exposure and help shake out lingering bugs and hardware issues.

  • Use the Wayland backends of the various toolkits (mainly GTK+, but also SDL and maybe Qt), if we are in a Wayland session.

Why? This ensures that we get the maximum amount of exposure from the brave souls who are trying the Wayland session today, while not affecting the experience of everybody else who prefers the traditional X session.

As mentioned above, this change is happening for GTK+ in 3.15.4.

  • Make the Wayland session the default in rawhide, soon after we’ve branched for F22.

Why? If we want to have a shot at switching the default for Fedora 23, we need to do this anyway. And doing it sooner rather than later is the only logical choice.


xf86-input-libinput compatibility with evdev and synaptics

A Fedora 22 feature is to use the libinput X.Org driver as the default driver for all non-tablet devices. This replaces the current defaults of synaptics for touchpads and evdev for anything else (tablets usually use the wacom driver, if installed). As expected, changing a default has some repercussions both for users and for developers. These are outlined below, based on the libinput 0.8 release. Future versions may add features, so check with your latest local version. Generally, the behaviour should stay roughly the same; big changes such as devices not being detected or not working are most likely bugs. Some behaviours are new, e.g. always-on palm detection, top software buttons on specific touchpads, etc. If in doubt, check the libinput documentation for hints on whether something is supposed to work in a particular manner.

Changes visible to Users

Any custom xorg.conf snippets will cease to work, if they are properly stacked. Options set by snippets are almost always exclusive to one particular driver. When the default driver changes, the snippet may not apply to the device anymore. Whether they stop working depends on whether the Driver line is present. Consider this example snippet:

Section "InputClass"
Identifier "enable tapping"
MatchProduct "my touchpad"
Driver "synaptics"
Option "TapButton1" "1"
EndSection

This snippet does two things: it assigns the synaptics driver to the "my touchpad" device and sets the option TapButton1. The assignment will override the default libinput assignment, i.e. this device won't change behaviour; you just don't get to use any new features. If the Driver line is not present then this snippet won't do anything: the libinput driver does not provide a TapButton1 option. It is safe to leave the snippet in place; options that are not supported by a driver are simply ignored.

The xf86-input-libinput man page has a list of options that can be set. For example, the above snippet would have this equivalent:


Section "InputClass"
Identifier "enable tapping"
MatchDriver "libinput"
MatchProduct "my touchpad"
Option "Tapping" "on"
EndSection

Note that this matches on the driver rather than assigning the driver. Since options are driver-specific, this is the correct approach.

The other visible change is a difference in default pointer speed. We have fine-tuning of pointer acceleration on our TODO lists, but we're not quite there yet and any help would be appreciated. In the meantime you may see changes in pointer acceleration.

Finally, you may see that certain features have gone the way of the dodo. Specifically, the touchpad code exposes a lot fewer knobs to tweak. Most of that is intentional, some of it may warrant discussion. If there is a particular feature you are missing or that doesn't work as you want it to, please file a bug.

Changes visible to developers

These changes affect desktop environments, specifically the parts that configure input devices. The changes fall into four categories: pointer acceleration, button mapping, touchpad disabling and device properties. The property "libinput Send Events Modes Available" exists on all devices, so it can be used to determine whether a device is handled by the libinput driver.
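A quick way to check from a shell (the device name is an example; use whatever xinput lists on your system):

xinput list-props "SynPS/2 Synaptics TouchPad" | grep "libinput Send Events"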

Pointer acceleration

The X server exposes a variety of knobs for its pointer acceleration code. The oldest knob (and specified in the core protocol) is the XChangePointerControl request. In some environments this is exposed as a single slider, in others it's split into multiple settings (Acceleration and Threshold, for example).

libinput does away with this and only exposes a single 1-value float property "libinput Accel Speed" with a range of -1 (slowest) to 1 (fastest). The XChangePointerControl request has no effect on a libinput device. It is up to you how to map the current speed mappings into the [-1, 1] range.
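For example, setting the speed from a shell looks like this (again, the device name is an example):

xinput set-prop "SynPS/2 Synaptics TouchPad" "libinput Accel Speed" 0.5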

Button mapping

The X server provides button mapping through the XSetPointerMapping request. This is most commonly used to apply a left-handed configuration and to enable natural scrolling. The call will continue to work with the libinput driver, but better methods are available.

The property "libinput Left Handed Enabled" takes a single boolean 8-bit value to enable and disable left-handed mode. Unlike the X request this will automatically take care of the tapping configuration (and other things in the future). If the property is not available on a device, that device has no left-handed mode.

The property "libinput Natural Scrolling Enabled" takes a single boolean 8-bit value to enable and disable natural scrolling. This applies to smooth scrolling and legacy button scrolling (which the libinput driver doesn't do anyway). If the property is not available on a device, that device has no natural scrolling mode.

Touchpad disabling

In the synaptics driver, disabling the touchpad is usually done with the "Synaptics Off" property. This is used by syndaemon to turn the touchpad off while typing. libinput does this by default, so it is safe to simply ignore this property. Don't bother starting syndaemon; it won't control the libinput driver.

Device properties

Any code that handles a driver-specific property (prefixed by "evdev" or "synaptics") will stop working. These properties are not exposed by the libinput driver (we tried, it was not viable). KDE's kcm_touchpad module is a particularly bad offender here, it exposes almost every toggle the driver ever had. Make sure the code behaves well if the properties are not present and you're basically good to go.

If you decide to handle libinput-specific properties, the general rule is: if a single-value property is not present, that configuration does not apply to this device. Bitmask-style properties are split into a "libinput foo Available" and a "libinput foo Enabled" property. The former lists the available settings, the latter enables a specific setting. Have a look at the xf86-input-libinput source for details on each property.

The Whole Damn World Takes Effect to NetworkManager 1.0


2004

Facebook launched.

The first Ubuntu release appeared.

It was the Year of the Linux Desktop.

Novell had just bought Ximian and Mono happened.

Google IPOed.

Firefox 1.0 showed up.

This was your cellphone and PDAs were still a thing.

This love took you over and made you think you got it.

And NetworkManager was first released.

Fast forward to 2014…


NetworkManager 1.0!

Right before the 2014 holidays, and more than 10 years after the first line of NetworkManager was typed, we released version 1.0.  A huge milestone on the way to making NetworkManager more cooperative, more flexible, more configurable, and more useful than ever before.

How you ask?

1: libnm: the new GLib client library

For all the GLib/GObject users out there, we’ve rebuilt libnm-util and libnm-glib from the ground up into a new single library called libnm.  It uses GDBus instead of dbus-glib.  It provides GIO-style asynchronous methods. It also exposes IP addresses, MAC addresses, and other properties as strings instead of byte arrays, and combines the old NMClient and NMRemoteSettings objects into a single NMClient object, among other things.

from gi.repository import GLib, NM

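# walk all devices known to NetworkManager and print their IPv4 addresses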
for dev in NM.Client.new(None).get_devices():
    ipcfg = dev.get_ip4_config()
    if ipcfg:
        for addr in ipcfg.get_addresses():
            print "(%s) %s/%d" % (dev.get_iface(), addr.get_address(), addr.get_prefix())

2: a smaller, faster DHCP client

While it doesn’t do DHCPv6 (yet!) this internal client (based off systemd/connman code) is much faster than dhclient and dhcpcd, and doesn’t consume huge amounts of memory like dhclient.  Use the ‘dhcp=internal’ option in NetworkManager.conf to enable it and let us know how it works.  We’ll be adding DHCPv6 support and enhancing the recognized options in the near future.
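That is, assuming the stock configuration file location:

# /etc/NetworkManager/NetworkManager.conf
[main]
dhcp=internal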

3: configure and quit

Have a more static configuration and still want to use NetworkManager configuration and API to manage it?  The ‘configure-and-quit=yes’ option in NetworkManager.conf will configure your interfaces and quit the NM process, spawning small helpers to preserve DHCP and IPv6 addresses.  This saves cycles (and therefore money) and is simpler to manage.

4: more cooperative

Continuing the trend, NetworkManager 1.0 does a much better job of leaving externally configured interfaces alone until you tell it to do something.  In addition to improvements for IPv6 sysctl recognition and user-added route preservation, externally created virtual interfaces are no longer automatically set IFF_UP, and NetworkManager handles external master/slave relationship changes more smoothly.

5: more powerful nmcli

We’ve added PolicyKit and interactive password support to nmcli, allowing full command-line-only operation for most network connections, even for less privileged users.  There’s a new ‘nmcli dev connect’ command that brings up an interface using the best available connection.  You can also delete virtual interfaces directly through nmcli.
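For example (the interface names here are just examples; check nmcli's help for the exact subcommands in your version):

nmcli dev connect wlan0
nmcli dev delete team0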

6: improved IPv6

We’ve ensured that if network interfaces are supposed to be down and unconfigured, the kernel doesn’t assign a link-local address to them, to prevent potential security issues when you think networking is down. We’ve also added support for IPv6 WWAN connections and fixes to respect router-delivered MTUs.

7: Bluetooth DUN support

Bluez5 changed API for Dial-Up-Networking functionality, which broke the NetworkManager support.  At long last we’ve added that support back, no thanks to Bluez.  Happy mobile networking!

8: more flexible and cooperative routing

Every interface that can have a default route now gets one, and NetworkManager manages the priorities to ensure they don’t conflict.  Plus, if you need to, you can manually manage priorities on a per-connection basis to prefer WiFi over WWAN or WWAN over ethernet, or whatever you need.

9: fewer dependencies

We’ve also removed some direct dependencies (PolicyKit), slimmed down code, and split functionality into selectable plugins, leading to easier installs on limited systems and better configurability.

That’s just the tip of the iceberg; we’ve improved almost every part of NetworkManager and we’re not stopping there.  We’re planning improvements to container use-cases, WiFi, VPNs, power savings, client APIs, and much  more.  2015 is gonna be a great year, and not just because the version number is greater than 1!

January 15, 2015

Providing the physical movement of wheel events in libinput

libinput 0.8 was released yesterday. One feature I'd like to talk about here: the change to provide mouse wheel events as physical distances.

Mouse wheel movement comes in clicks. In the evdev protocol they're sent via the REL_WHEEL and REL_HWHEEL axes, with a value of 1 per click. Spinning the wheel fast enough will give you a higher value per event, but it's still just a multiple of the physical clicks. This is a conundrum for libinput.

libinput exports scroll events as "axis" events; the value returned by libinput_event_pointer_get_axis_value() for touchpads and button scrolling is in "pixels". libinput doesn't have a concept of pixels of course, but the compositor will likely take the relative motion and apply it to the cursor. Scroll events are in the same coordinate space, i.e. two-finger scrolling has the same feel as moving the pointer. This continuous coordinate space is at odds with the discrete values coming from a wheel. We added axis sources to the API so you can now tell whether an event was generated by the wheel or by some other scroll method. But still, the discrete values from a wheel are at odds with the other sources.

For libinput 0.8, we changed the default reporting mode of the wheel. For the click count, a new call libinput_event_pointer_get_axis_value_discrete() provides that number (positive or negative, depending on direction). libinput_event_pointer_get_axis_value() on a wheel now returns the movement of the wheel in degrees. That gives us a continuous coordinate space that also opens up new possibilities: if you know the rotation of a mouse wheel in degrees, you know things like "has the wheel been turned a full rotation". I don't quite know how, but I'm sure there are interesting interfaces you can make from that :)

Of course, the physical properties don't change, the degrees are always a multiple of the click count, and on most mice one click count is a 15 degree movement. The Logitech M325 is a notable exception here with a 20 degree angle. This isn't advertised by the hardware of course so we rely on the udev hwdb to set it where it differs from the default. The patch for this has been pushed to systemd and will soon arrive at a distribution near you.
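The hwdb entry for this is roughly of the following shape (the match line here is illustrative, not the exact one shipped for the M325):

mouse:usb:v046dp*:name:Logitech M325:*
 MOUSE_WHEEL_CLICK_ANGLE=20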

And to answer a question I'm sure will come up: those mice that support a free spinning scrollwheel don't change the reporting mode. The G500s for example merely moves a physical bit that provides friction and the click feel/noise. The device still reports in 15 degree angle counts. I strongly suspect that other mice with this feature are the same (if not, we can now work this continuous motion into libinput and handle it properly).

gnome-battery-bench

One thing we want to do for the next versions of GNOME and Fedora is to improve battery performance. Your laptop may well be advertised by the manufacturer to have “up to 10 hours of battery life” or some such claim. You probably don’t get anywhere near this.

Let’s put out some rough numbers here to give an overall sense of scale for the problem. For a modern ultrabook:

  • The battery is 50 watt-hours (Wh) – it can power a load of 50W for an hour or a load of 5W for 10 hours.
  • The baseline idle consumption of the system – RAM refresh, the power consumption of peripherals in power-saving mode, etc. – is 5W.
  • The screen and keyboard backlights, if both turned on to 100%, draw 5W.
  • The CPU/GPU can sustain about 15W – this is a thermal limit, so it can draw more for short bursts, but over time it will be throttled to an average.
  • All other peripherals (Wifi, bluetooth, touchpad, etc.) can use about 5W of power combined when not in power-saving mode.
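Putting numbers on that list, the back-of-the-envelope math is just watt-hours divided by watts:

battery_wh = 50.0
for load_w in (5, 15, 30):
    print("%2dW load: %.1f hours" % (load_w, battery_wh / load_w))
# 5W: 10.0 hours, 15W: 3.3 hours, 30W: 1.7 hours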

So the power draw of the system can range from about 5W (the manufacturer’s 10 hours) to 30W (1 hour 40 minutes). If you have such an ultrabook, how is it doing right now? I’d guess it’s using about 15W unless you pay a lot of attention to power usage. Some of the things that might be going wrong:

  • Your keyboard/screen backlights are likely higher than is needed.
  • Some of the devices on your system don’t have power-saving turned on properly.
  • You likely have some background CPU activity (webpage ads, for example).

Of course, if you are running a compilation in the background, you want your CPU to be using that full 15W of power to get it done as soon as possible – in this case, your battery isn’t going to last very long. But a lot of usage is closer to idle.

Measuring power usage

I’ve made assertions above about power used by different things. But how did I actually measure that? powertop is the state of the art for measuring power usage and tweaking it on Linux. It will show you a figure for the current battery discharge rate, but it bounces around by several watts, partly because powertop’s own data collection loads the system. The effect of a kernel option is usually much smaller than that. One of the larger effects I discovered on my laptop was that turning on USB autosuspend for the touchscreen saves about 150mW. When you tweak a tunable in powertop, without a way to measure power usage more accurately, it’s hard to know whether any observed differences are real.

To support figuring out what is going on with power, I wrote gnome-battery-bench. What it does is pretty simple – it plays back recorded sequences of events in a loop and monitors battery charge to estimate power usage. Because battery usage is being averaged over minutes, a lot of the noise is averaged out. And the event sequences can be changed to exercise different usage patterns by the user.

gnome-battery-bench

The above screenshot shows gnome-battery-bench running a “Light Duty” benchmark that combines scrolling around in a Wikipedia page and typing in gedit. Instantaneous usage bounces around a lot from the activity and from random noise, but after a few cycles the averaged power and estimated battery lifetime converge. The corresponding idle power usage is about 5.5W, so we then know that we’re using about 2.9W from the activity.

gnome-battery-bench is designed as a graphical application because I want to encourage people to explore with it and find out interactively what is using power on their system. And graphing is also useful so that the user can see when something is going wrong with the measurement; sometimes batteries will report data that jumps around. But there’s also a command line version that can be used for automatic scripting of benchmarks.

I decided to use recorded sequences of events for a couple of reasons: first, it’s easy for anybody to create new test sequences – you just run the gnome-battery-bench command line tool in record mode and do what you want to test. Second, playing back event sequences at a low level simulates user interaction very accurately. There is little CPU overhead, and as far as the desktop is concerned it’s exactly like user input. Of course, not everything can be easily tested by simply playing back canned sequences, but our goal here isn’t to test everything, just to be able to test some things that are reasonably representative.

The gnome-battery-bench README file describes in more detail how it works and how to install it on your system.

Next steps

gnome-battery-bench is basically usable as is. The main remaining thing to do with it is to spend some time designing and recording a couple of sequences that better reflect actual usage. The current tests I checked in are basically just placeholders.

On the operating system, we need to make sure that we are actually shipping with as many power-saving options on for peripherals as can be supported. In particular, “SATA link power management” makes a several-watt difference.

Backlight management is another place we can make improvements. Some problems are simply bad defaults. If ambient light sensors are present on the system, using them can be a big win. If not, simply using appropriate defaults is already an improvement.

Beyond that, in GNOME, we can optimize application and system code for efficiency and to not do things unnecessarily often in the background. Eventually I’d like to figure out a way to have power consumption also tracked by perf.gnome.org so we can see how code changes affect our power consumption and avoid regressions.


January 09, 2015

Finding hidden applications with GNOME Software

When you do a search in GNOME Software it returns any result of any application with AppStream metadata and with a package name it can resolve in any remote repository. This works really well for software you’re installing from the main distribution repos, but less well for some other common cases.

Let’s say I want to install Google Chrome so that my 2-year-old daughter can ring me on hangouts and tell me that dinner is ready. Let’s search for Chrome on my Fedora Rawhide system.

Screenshot from 2015-01-09 16:37:45

Whoa! Wait, how did you do that? First, this exists in /etc/yum.repos.d/google-chrome.repo — the important line being enabled_metadata=1. This means “download just the metadata even when enabled=0” and means we can get information about what packages are available in repos we are not enabling by default for legal or policy reasons.

[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/x86_64
enabled=0
gpgcheck=1
repo_gpgcheck=1
enabled_metadata=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub

We’ve also got a little XML document with the AppStream metadata (just the long description and keywords) called /usr/share/app-info/xmls/google-chrome.xml which could be included in the usual vendor-supplied fedora-22.xml if that’s what we want to do.

Screenshot from 2015-01-09 16:40:09

The other awesome feature this unlocks is when we have addon repos that are not enabled by default. For instance, my utopia repo full of super new free software applications could be included in Fedora, and if the user types in the search word we ask if the repo should be enabled. This would solve a lot of use cases if we could ship .repo files for a few popular COPRs of stuff we don’t (yet) ship in Fedora, but are otherwise free and open source software.

Screenshot from 2015-01-09 16:51:00

All the components to do this are upstream in Fedora 22 (you need a new librepo, libhif, PackageKit, libappstream-glib and gnome-software, phew!) although I’m sure we’ll be tweaking the UI and UX before Fedora 22 is released. Comments welcome.


GNOME MultiWriter 3.15.2

I’ve just released GNOME MultiWriter 3.15.2, which is the first release that makes it pretty much feature complete for me.

Reads and writes are now spread over root hubs to increase throughput. If you’ve got a hub with more than 7 ports and the port numbers don’t match the decals on the device please contact me for more instructions.

In this release I’ve also added the ability to completely wipe a drive (write the image, then NULs to pad it out to the size of the media) and made that and the verification step optional. We also now show a warning dialog to the user the very first time the application is used, and some global progress in the title bar so you can see the total read and write throughput of the application.

With this release I’ve now moved the source to git.gnome.org and will do future releases to ftp.gnome.org like all the other GNOME modules. If you see something obviously broken and you have GNOME commit access, please just jump in and fix it. The translators have done a wonderful job using Transifex, but now I’m letting the just-as-awesome GNOME translator teams handle localisation.

If you’ve got a few minutes, and want to try it out, you can clone the git repo or install a package for Fedora.

Richard

January 05, 2015

GNOME MultiWriter and Large Hubs

Today I released the first version of GNOME MultiWriter, and built it for Rawhide and F21. It’s good enough for a first release, although there are still a few things left to do. The most important now is probably the self-profiling mode so that we know the best number of parallel threads to use for the read and the write. I want this to Just Work without any user interaction, but I’ll have to wait for my shipment of USB drives to arrive before I can add that functionality.

Also important to the UX is how we display failed devices. Most new USB devices accept the ISO image without a fuss, but the odd device will disconnect before completion or throw a write error. In this case it’s important to know which device is the one that belongs in the rubbish bin. This is harder than you think, as the electrical port number is not always what matches the decal on the plastic box.

For my test system I purchased a 10-port USB hub. I was interested to know how the vendor implemented this, as I don’t know of a SOIC chip that can drive more than 7 ports. It turns out, my 10-port hub is actually a 4-port hub, with a 7-port hub attached to the last port of the first hub. The second hub was also wired 1,2,3,4,7,6,5 rather than 1,2,3,4,5,6,7. This could cause my dad some issues when we tell him that device #5 needs removing.

I’ve merged some code into GNOME MultiWriter to work around this, but if you’ve got a hub with >7 ports I’d be interested to know if this works for you, or if we need to add some more VID/PID matches. If you do try this out you need libgusb from git master today. Helpfully gnome-multi-writer outputs quirk info to the command line if you use --verbose, so that makes debugging this stuff easier.

January 02, 2015

Introducing GNOME MultiWriter

I spent last night writing a GNOME application to duplicate a ton of USB devices. I looked at mdcp, Clonezilla and also just writing something loopy in bash, but I need something simple my dad could use for a couple of hours a week without help.

It’s going to be super useful for me when I start sending our LiveUSB disks in the ColorHug box (rather than LiveCDs) and possibly useful to other people wanting to duplicate a USB drive for QA testing on a small group of people, an XFCE live CD of Fedora rawhide for a code sprint, and that kind of thing.

GNOME MultiWriter allows you to write and verify an ISO file to up to 20 USB devices at once.

Screenshot from 2015-01-02 16:24:35

Bugs (and especially pull requests) accepted on GitHub; if there’s sufficient interest I’ll move the project to git.gnome.org after a few releases.

December 22, 2014

OpenHardware : Ambient Light Sensor

My OpenHardware post about an entropy source got loads of high quality comments, huge thanks to all who chimed in. There look to be a few existing projects producing OpenHardware, and the various comments have links to the better ones. I’ll put this idea back on the shelf for another holiday-hacking session. I’ve still not given up on the SD card interface, although it looks like emulating a storage device might be the easiest and quickest route for any future project.

So, on to the next idea. An OpenHardware USB ambient light sensor. A lot of hardware doesn’t have a way of testing the ambient light level. Webcams don’t count; they use up waaaay too much power and the auto-white balance is normally hardcoded in hardware. So I was thinking of producing a very low cost mini-dongle to measure the ambient light level so that lower-spec laptops could save tons of power. With smartphones people are now acutely aware that up to 60% of their battery power is just making the screen light up, and I’m sure we could be smarter about what we do in GNOME. The problem, traditionally, has been the lack of hardware with this support.

Anyone interested?

December 19, 2014

OpenHardware Random Number Generator

Before I spend another night reading datasheets: would anyone be interested in an OpenHardware random number generator in a full-size SD card format? The idea being you insert the RNG into the SD slot of your laptop, leave it there, and the kernel module just slurps trusted entropy when required.

Why SD? It’s a port that a lot of laptops already have empty, and on server class hardware you can just install a PCIe addon card. I think I can build such a thing for less than $50, but at the moment I’m just waiting for parts for a prototype so that’s really just a finger-in-the-air estimate. Are there enough free software people who care about entropy-related stuff?

December 17, 2014

Actually shipping AppStream metadata in the repodata

For the last couple of releases Fedora has been shipping the appstream metadata in a package. First it was the gnome-software package, but this wasn’t an awesome dep for KDE applications like Apper and was a pain to keep updated. We then moved the data to an appstream-data package, but this was just as much of a hack that was slightly more palatable for KDE. What I’ve wanted for a long time is to actually ship the metadata as metadata, i.e. next to the other files like primary.xml.gz on the mirrors.

I’ve just pushed the final patches to libhif, PackageKit and appstream-glib, which means that if you ship metadata of type appstream and appstream-icons in repomd.xml then they get downloaded automatically and decompressed into the right place so that gnome-software and apper can use the data magically.

I had not worked on this much before, as appstream-builder (which actually produces the two AppStream files) wasn’t suitable for the Fedora builders for two reasons:

  • Even just processing the changed packages, it took a lot of CPU, memory, and thus time.
  • Downloading screenshots from random websites all over the internet wasn’t something that a build server can do.

So, createrepo_c and modifyrepo_c to the rescue. This is what I’m currently doing for the Utopia repo.

createrepo_c --no-database x86_64/
createrepo_c --no-database SRPMS/
modifyrepo_c					\
	--no-compress				\
	/tmp/asb-md/appstream.xml.gz		\
	x86_64/repodata/
modifyrepo_c					\
	--no-compress				\
	/tmp/asb-md/appstream-icons.tar.gz	\
	x86_64/repodata/

If you actually do want to create the metadata on the build server, this is what I use for Utopia:

appstream-builder			\
	--api-version=0.8		\
	--origin=utopia			\
	--cache-dir=/tmp/asb-cache	\
	--enable-hidpi			\
	--max-threads=4			\
	--min-icon-size=48		\
	--output-dir=/tmp/asb-md	\
	--packages-dir=x86_64/		\
	--temp-dir=/tmp/asb-icons	\
	--screenshot-uri=http://people.freedesktop.org/~hughsient/fedora/21/screenshots/

For Fedora, I’m going to suggest getting the data files from alt.fedoraproject.org during compose. It’s not ideal as it still needs a separate server to build them on (currently sitting in the corner of my office) but gets us a step closer to what we want. Comments, as always, welcome.

December 16, 2014

Viewing the Xorg.log with journalctl

Those running Fedora Rawhide or GNOME 3.12 may have noticed that there is no Xorg.log file anymore. This is intentional, gdm now starts the X server so that it writes the log to the systemd journal. Update 29 Mar 2014: The X server itself has no capabilities for logging to the journal yet, but no changes to the X server were needed anyway. gdm merely starts the server with a /dev/null logfile and redirects stdin/stderr to the journal.

Thus, to get the log file use journalctl, not vim, cat, less, notepad or whatever your $PAGER was before.

This leaves us with the following commands.


journalctl -e _COMM=Xorg

Which would conveniently show something like this:

Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "wacom"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Lenovo Optical USB Mouse: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Integrated Camera: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Sleep Button: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Video Bus: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Power Button: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (EE) Server terminated successfully (0). Closing log file.

The -e toggle jumps to the end and only shows 1000 lines, but that's usually enough. journalctl has a bunch more options described in the journalctl man page. Note the PID in square brackets though. You can easily limit the output to just that PID, which makes it ideal for attaching the log to a bug report.

journalctl _COMM=Xorg _PID=5438

Previously the server kept only a single backup log file around, so if you restarted twice after a crash, the log was gone. With the journal it's now easy to extract the log file from that crash five restarts ago. It's almost like the future is already here.

Update 16/12/2014: This post initially suggested to use journactl /usr/bin/Xorg. Using _COMM is path-independent.

Fedora 21

Added 16/12/2014: If you recently updated to/installed Fedora 21 you'll notice that the above command won't show anything. As part of the Xorg without root rights feature Fedora ships a wrapper script as /usr/bin/Xorg. This script eventually executes /usr/libexec/Xorg.bin which is the actual X server binary. Thus, on Fedora 21 replace Xorg with Xorg.bin:


journalctl -e _COMM=Xorg.bin
journalctl _COMM=Xorg.bin _PID=5438

Note that we're looking into this so that in a few updates' time we don't need a special command here.

December 10, 2014

A looking glass for GTK+

gnome-shell has a nice integrated developer tool called “Looking glass”. To bring it up, simply enter “lg” into the command prompt:

Alt-F2 lg

This brings up a translucent overlay with an interactive JavaScript prompt. Among the nice things it offers are tab completion and an object picker.

Here is how it looks:

Looking glass

If you haven’t tried it yet, you really should.

Over the years, many people have asked,

"Can I use Looking glass to debug my GTK+ application?"

So far, the answer has always been no. But now, the always awesome Alex Larsson has stepped up and made essentially the same functionality available for the GTK+ Inspector:

Interactive Inspector

As you can see, it has an object picker, tab completion and a result history just like its ‘big brother’. Alex made a little demo video for it.

To avoid a gjs dependency in GTK+ itself and the associated cyclic dependency issues, this is implemented as an extension in a loadable module.  The code currently lives on github, in Alex’ gjs-inspector repository.

Enjoy

December 08, 2014

A look at new developer features
As the development window for GNOME 3.16 advances, I've been adding a few new developer features, selfishly, so I could use them in my own programs.

Connectivity support for applications

Picking up from where Dan Winship left off, we've merged support for applications to detect network availability, especially the "connected to a network but not to the Internet" case.

In glib/gio, watch the value of the "connectivity" property on GNetworkMonitor.
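
Something like this minimal Python sketch shows the idea (assuming a GLib new enough to have the new property, and PyGObject for the bindings):

from gi.repository import Gio, GLib

def connectivity_changed(monitor, pspec):
    # one of Gio.NetworkConnectivity.LOCAL/LIMITED/PORTAL/FULL
    print monitor.get_connectivity()

monitor = Gio.NetworkMonitor.get_default()
monitor.connect("notify::connectivity", connectivity_changed)
GLib.MainLoop().run()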

Grilo automatic network awareness

This glib/gio feature allows us to hide Grilo sources from applications' view when the Internet or LAN access they require isn't available. This should be landing very soon, once we've made the new feature optional based on the presence of the new GLib.

Totem

And finally, this means we'll soon be able to show a nice placeholder when no network connection is available, and there are no channels left.

Grilo Lua resources support

A long-standing request, GResources support has landed for Grilo Lua plugins. When a script is loaded, we'll look for a separate GResource file with ".gresource" as the suffix, and automatically load it. This means you can use a local icon for sources with the URL "resource:///org/gnome/grilo/foo.png". Your favourite Lua sources will soon have icons!

Grilo Opensubtitles plugin

The developers affected by this new feature may be a group of one, but if the group is ever to expand, it's the right place to do it. This new Grilo plugin will fetch the list of available text subtitles for specific videos, given their "hashes", which are now exported by Tracker.

GDK-Pixbuf enhancements

I can point you to the NEWS file for the latest version, but the main gains are that GIF animations won't eat all your memory, that DPI metadata is now supported in the JPEG, PNG and TIFF formats, and that image viewers can now tell whether a TIFF file is multi-page, so they can open it in a more capable viewer.

Batched inserts, and better filters in GOM

Does what it says on the tin. This is useful for populating the database more quickly than through piecemeal inserts, and it also means you don't need to chain inserts when inserting multiple items.

Mathieu also worked on fixing the priority of filters when building complex queries, as well as supporting more than 2 items in a filter ("foo OR bar OR baz" for example).

December 06, 2014

pointer acceleration in libinput - building a DPI database for mice

click here to jump to the instructions

Mice have an optical sensor that tells them how far they moved in "mickeys". Depending on the sensor, a mickey is anywhere from 1/100 to 1/8200 of an inch or less. The current "standard" resolution is 1000 DPI, but older mice will have 800 DPI, 400 DPI etc. Resolutions above 1200 DPI are generally reserved for gaming mice with (usually) switchable resolution, and it's an arms race between manufacturers over who can advertise the highest numbers.

HW manufacturers are cheap bastards so of course the mice don't advertise the sensor resolution. Which means that for the purpose of pointer acceleration there is no physical reference. That delta of 10 could be a millimeter of mouse movement or a nanometer; you just can't know. And if pointer acceleration works on input without reference, it becomes useless and unpredictable. That is partially intended: HW manufacturers advertise that a lower resolution will provide more precision while sniping and a higher resolution means faster turns while running around doing rocket jumps. I personally don't think that there's much difference between 5000 and 8000 DPI anymore; the mouse is so sensitive that if you sneeze your pointer ends up next to Philae. But then again, who am I to argue with marketing types.

For us, useless and unpredictable is bad, especially in the use-case of everyday desktops. To work around that, libinput 0.7 now incorporates the physical resolution into pointer acceleration. And to do that we need a database, which will be provided by udev as of systemd 218 (unreleased at the time of writing). This database lists the various devices and their physical resolution, together with their sampling rate. udev sets the resolution as the MOUSE_DPI property that we can read in libinput and use as a reference point in the pointer accel code. In the simplest case, the entry lists a single resolution with a single frequency (e.g. "MOUSE_DPI=1000@125"); for switchable gaming mice it lists multiple resolutions with their frequencies and marks the default with an asterisk ("MOUSE_DPI=400@50 800@50 *1000@125 1200@125"). And you can and should help us populate the database so it gets useful really quickly.
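
To illustrate the format, here's a quick Python sketch for parsing such a property value (the format is from the hwdb, the parsing code is merely my illustration):

# parse a MOUSE_DPI value like "400@50 800@50 *1000@125 1200@125"
def parse_mouse_dpi(value):
    default = None
    entries = []
    for item in value.split():
        dpi, _, hz = item.lstrip("*").partition("@")
        entry = (int(dpi), int(hz))
        entries.append(entry)
        if item.startswith("*"):
            default = entry
    # single-resolution mice list one entry and no asterisk
    if default is None and len(entries) == 1:
        default = entries[0]
    return default, entries

print parse_mouse_dpi("400@50 800@50 *1000@125 1200@125")
# prints: ((1000, 125), [(400, 50), (800, 50), (1000, 125), (1200, 125)])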

How to add your device to the database

We use udev's hwdb for the database list. The upstream file is in /usr/lib/udev/hwdb.d/70-mouse.hwdb, the ruleset to trigger a match is in /usr/lib/udev/rules.d/70-mouse.rules. The easiest way to add a match is with the libevdev mouse-dpi-tool (version 1.3.2). Run it and follow the instructions. The output looks like this:


$ sudo ./tools/mouse-dpi-tool /dev/input/event8
Mouse Lenovo Optical USB Mouse on /dev/input/event8
Move the device along the x-axis.
Pause 3 seconds before movement to reset, Ctrl+C to exit.
Covered distance in device units: 264 at frequency 125.0Hz | |^C
Estimated sampling frequency: 125Hz
To calculate resolution, measure physical distance covered
and look up the matching resolution in the table below
16mm 0.66in 400dpi
11mm 0.44in 600dpi
8mm 0.33in 800dpi
6mm 0.26in 1000dpi
5mm 0.22in 1200dpi
4mm 0.19in 1400dpi
4mm 0.17in 1600dpi
3mm 0.15in 1800dpi
3mm 0.13in 2000dpi
3mm 0.12in 2200dpi
2mm 0.11in 2400dpi

Entry for hwdb match (replace XXX with the resolution in DPI):
mouse:usb:v17efp6019:name:Lenovo Optical USB Mouse:
MOUSE_DPI=XXX@125
Take those last two lines, add them to a new local file /etc/udev/hwdb.d/71-mouse.hwdb. Rebuild the hwdb, trigger it, and done:

$ sudo udevadm hwdb --update
$ sudo udevadm trigger /dev/input/event8
Leave out the device path if you're not on systemd 218 yet. Check if the property is set:

$ udevadm info /dev/input/event8 | grep MOUSE_DPI
E: MOUSE_DPI=1000@125
And that shows everything worked. Restart X/Wayland/whatever uses libinput and you're good to go. If it works, double-check the upstream instructions, then file a bug against systemd with those two lines and assign it to me.
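
Incidentally, the resolution table the tool prints is simple arithmetic: the distance in inches is the covered device units divided by the DPI. A little Python sketch that reproduces the table from the run above:

units = 264  # device units covered in the recording above
for dpi in range(400, 2600, 200):
    inches = units / float(dpi)
    # millimeters are truncated, matching the tool's output
    print "%dmm %.2fin %ddpi" % (int(inches * 25.4), inches, dpi)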

Trackballs are a bit hard to measure like this; my suggestion is to check the manufacturer's website first for any resolution data.

Update 2014/12/06: trackball comment added, udevadm trigger comment for pre 218

December 01, 2014

Why the 255 keycode limit in X is a real problem

A long-standing and unfixable problem in X is that we cannot send a number of keys to clients because their keycode is too high. This doesn't affect any of the normal keys for typing, but a lot of multimedia keys, especially "newly" introduced ones.

X has a maximum keycode of 255, and "Keycodes lie in the inclusive range [8,255]". The reason for the offset 8 keeps escaping me but it doesn't matter anyway. Effectively, it means that we are limited to 247 keys per keyboard. Now, you may think that this would be enough and thus the limit shouldn't really affect us. And you're right. This post explains why it is a problem nonetheless.

Let's discard any ideas about actually increasing the limit from 8 bit to 32 bit. It's hardwired into too many open-coded structs, so that is simply not an option. You'd be breaking every X client out there, so at this point you might as well rewrite the display server and aim for replacing X altogether. Oh wait...

So why aren't 247 keycodes enough? The reason is that large chunks of that range are unused and wasted.

In X, the keymap is an array in the form keysyms[keycode] = some keysym (that's a rather simplified view, look at the output from "xkbcomp -xkb $DISPLAY -" for details). The actual value of the keycode doesn't matter in theory, it's just an index. Of course, that theory only applies when you're looking at one keyboard at a time. We need to ship keymaps that are useful everywhere (see xkeyboard-config) and for that we need some sort of standard. In the olden days this meant every vendor had their own keycodes (see /usr/share/X11/xkb/keycodes) but these days Linux normalizes it to evdev keycodes. So we know that KEY_VOLUMEUP is always 115 and we can thus hook it up to just work out of the box. That however leaves us with huge ranges of unused keycodes because every device is different. My keyboard does not have a CD eject key, but it has volume control keys. I have a key to start a web browser but I don't have a key to start a calculator. Others' keyboards do have those keys though, and they expect those keys to work. So the default keymap needs to map all the possible keycodes to the matching keysyms, and suddenly our 247 keycodes per keyboard become 247 for all keyboards ever made. And that is simply not enough.
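
To put numbers on that limit, here's a small sketch using the evemu Python bindings (see the evemu post further down) to check whether a given evdev key fits into X's keycode range, with the +8 offset from above:

import evemu

def x_keycode_for(name):
    # X keycodes are the evdev keycodes shifted up by 8
    x_code = evemu.event_get_value("EV_KEY", name) + 8
    return x_code if x_code <= 255 else None

print x_keycode_for("KEY_VOLUMEUP")  # 115 + 8 = 123, fits
print x_keycode_for("KEY_MICMUTE")   # 248 + 8 = 256, no keycode left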

To work around this, we'd need locally hardware-adjusted keymaps generated at runtime. After loading the driver we can look at the keys that exist, remap higher keycodes into an unused range and then communicate that mapping so the keysyms can be moved to the newly mapped keycodes. This is... complicated. evdev doesn't know about keymaps. When gnome-settings-daemon applied your user-specific layout, evdev didn't get told about it. GNOME on the other hand has no idea that evdev previously re-mapped higher keycodes. So when g-s-d applies your setting, it may overwrite the remapped keysym with the one from the default keymaps (and evdev won't notice).

As usual, none of this is technically unfixable. You could figure out a protocol extension through which drivers can talk to the server and the clients to notify them of remapped keycodes. This of course needs to be added to evdev, the server, libX11, probably xkbcomp and libxkbcommon, and of course to all desktop environments that set the layout. To write the patches you need a deep understanding of XKB, which would definitely make your skillset a rare one, probably make you quite employable and possibly put you on the fast track for your nearest mental institution. XKB and happiness don't usually go together, but at least the jackets will keep you warm.

Because of the above, we go with the simple answer: "X can't handle keycodes over 255"

November 29, 2014

is this a protocol? displaylink3
I'm not sure

but if hd0;u]; means anything to anyone from displaylink, or is the first unencrypted bytes they send, then oops.

Looks like I have some work to do next week.

November 28, 2014

LibreOffice Coverity Defect Density 0.00

So today's statistics for the latest Coverity run over LibreOffice:

LibreOffice:  5,973,881 lines of code in Selected Components and 0.00 defect density

Defect density is measured by the number of defects per 1,000 lines of code, identified by the Coverity platform
11,751 Total defects, 21 Outstanding, 331 Dismissed, 11,399 Fixed
That's the dashboard-reported figure. There are 21 unresolved warnings at the moment, which works out at a true defect density of 0.003515303, so we're in rounding-to-zero territory. I reckon 11 of the remaining are really false positives, but I'd still like to figure out how to "wiggle" the code to get their data validity check detected correctly.

November 24, 2014

GTK+ Inspector update

GTK+ Inspector is a debugging tool that is built directly into GTK+ and is available in every GTK+ application by using the shortcuts Ctrl-Shift-d or Ctrl-Shift-i.

Since I last wrote about it, a number of things have changed, so it is time to give an update on the state of GtkInspector as of GTK+ 3.15.2.

The UI has been revamped a bit to make best use of the limited space in the inspector window.  Some of our newer widgets, such as GtkStackSwitcher, GtkSidebar and GtkSearchEntry, were helpful here:

The object list has a new search implementation. It tries to deal better with search in a tree than the built-in search in GtkTreeView. Please try it and let me know what you think.

We’ve added a new feature: object statistics. This is made possible by corresponding new functionality in GLib. To enable it, run your application with

GOBJECT_DEBUG=instance-count

The inspector is now using a separate display connection. This isolates it from many of the changes that you can make in it, such as CSS tweaks:

After 3.14, we have started to integrate OpenGL rendering into GTK+. This is reflected in the inspector, which shows information about the OpenGL stack and offers some GL-related debug settings:

More recently, I’ve spent my coding time helping to make glade support all of the new GTK+ widgets and features. We are not quite there yet, but you can already use client-side decorations, GtkHeaderBar, GtkSearchBar, GtkStack, GtkStackSwitcher and GtkSidebar with glade from git master.

I hope to add a few more new widgets to this list soon. My personal goal for this effort is to use glade for all the ui files inside GTK+.

November 11, 2014

systemd For Administrators, Part XXI

Container Integration

Containers have been one of the hot topics on Linux for a while now. Container managers such as libvirt-lxc, LXC or Docker are widely known and used these days. In this blog story I want to shed some light on systemd's integration points with container managers, to allow seamless management of services across container boundaries.

We'll focus on OS containers here, i.e. the case where an init system runs inside the container, and the container hence in most ways appears like an independent system of its own. Much of what I describe here is available on pretty much any container manager that implements the logic described here, including libvirt-lxc. However, to make things easy we'll focus on systemd-nspawn, the mini-container manager that is shipped with systemd itself. systemd-nspawn uses the same kernel interfaces as the other container managers, but is less flexible, as it is designed to be a container manager that is as simple to use as possible and "just works", rather than trying to be a generic tool you can configure in every low-level detail. We use systemd-nspawn extensively when developing systemd.

Anyway, so let's get started with our run-through. Let's start by creating a Fedora container tree in a subdirectory:

# yum -y --releasever=20 --nogpg --installroot=/srv/mycontainer --disablerepo='*' --enablerepo=fedora install systemd passwd yum fedora-release vim-minimal

This downloads a minimal Fedora system and installs it in /srv/mycontainer. This command line is Fedora-specific, but most distributions provide similar functionality in one way or another. The examples section in the systemd-nspawn(1) man page contains a list of the various command lines for other distributions.

We now have the new container installed; let's set an initial root password:

# systemd-nspawn -D /srv/mycontainer
Spawning container mycontainer on /srv/mycontainer
Press ^] three times within 1s to kill container.
-bash-4.2# passwd
Changing password for user root.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
-bash-4.2# ^D
Container mycontainer exited successfully.
#

We use systemd-nspawn here to get a shell in the container, and then use passwd to set the root password. After that the initial setup is done, hence let's boot it up and log in as root with our new password:

$ systemd-nspawn -D /srv/mycontainer -b
Spawning container mycontainer on /srv/mycontainer.
Press ^] three times within 1s to kill container.
systemd 208 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ)
Detected virtualization 'systemd-nspawn'.

Welcome to Fedora 20 (Heisenbug)!

[  OK  ] Reached target Remote File Systems.
[  OK  ] Created slice Root Slice.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Created slice System Slice.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Reached target Slices.
[  OK  ] Listening on Delayed Shutdown Socket.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Socket.
         Starting Journal Service...
[  OK  ] Started Journal Service.
[  OK  ] Reached target Paths.
         Mounting Debug File System...
         Mounting Configuration File System...
         Mounting FUSE Control File System...
         Starting Create static device nodes in /dev...
         Mounting POSIX Message Queue File System...
         Mounting Huge Pages File System...
[  OK  ] Reached target Encrypted Volumes.
[  OK  ] Reached target Swap.
         Mounting Temporary Directory...
         Starting Load/Save Random Seed...
[  OK  ] Mounted Configuration File System.
[  OK  ] Mounted FUSE Control File System.
[  OK  ] Mounted Temporary Directory.
[  OK  ] Mounted POSIX Message Queue File System.
[  OK  ] Mounted Debug File System.
[  OK  ] Mounted Huge Pages File System.
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Create static device nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Trigger Flushing of Journal to Persistent Storage...
         Starting Recreate Volatile Files and Directories...
[  OK  ] Started Recreate Volatile Files and Directories.
         Starting Update UTMP about System Reboot/Shutdown...
[  OK  ] Started Trigger Flushing of Journal to Persistent Storage.
[  OK  ] Started Update UTMP about System Reboot/Shutdown.
[  OK  ] Reached target System Initialization.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Login Service...
         Starting Permit User Sessions...
         Starting D-Bus System Message Bus...
[  OK  ] Started D-Bus System Message Bus.
         Starting Cleanup of Temporary Directories...
[  OK  ] Started Cleanup of Temporary Directories.
[  OK  ] Started Permit User Sessions.
         Starting Console Getty...
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Login Service.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.

Fedora release 20 (Heisenbug)
Kernel 3.18.0-0.rc4.git0.1.fc22.x86_64 on an x86_64 (console)

mycontainer login: root
Password:
-bash-4.2#

Now we have everything ready to play around with the container integration of systemd. Let's have a look at the first tool, machinectl. When run without parameters it shows a list of all locally running containers:

$ machinectl
MACHINE                          CONTAINER SERVICE
mycontainer                      container nspawn

1 machines listed.

The "status" subcommand shows details about the container:

$ machinectl status mycontainer
mycontainer:
       Since: Mi 2014-11-12 16:47:19 CET; 51s ago
      Leader: 5374 (systemd)
     Service: nspawn; class container
        Root: /srv/mycontainer
     Address: 192.168.178.38
              10.36.6.162
              fd00::523f:56ff:fe00:4994
              fe80::523f:56ff:fe00:4994
          OS: Fedora 20 (Heisenbug)
        Unit: machine-mycontainer.scope
              ├─5374 /usr/lib/systemd/systemd
              └─system.slice
                ├─dbus.service
                │ └─5414 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-act...
                ├─systemd-journald.service
                │ └─5383 /usr/lib/systemd/systemd-journald
                ├─systemd-logind.service
                │ └─5411 /usr/lib/systemd/systemd-logind
                └─console-getty.service
                  └─5416 /sbin/agetty --noclear -s console 115200 38400 9600

With this we see some interesting information about the container, including its control group tree (with processes), IP addresses and root directory.

The "login" subcommand gets us a new login shell in the container:

# machinectl login mycontainer
Connected to container mycontainer. Press ^] three times within 1s to exit session.

Fedora release 20 (Heisenbug)
Kernel 3.18.0-0.rc4.git0.1.fc22.x86_64 on an x86_64 (pts/0)

mycontainer login:

The "reboot" subcommand reboots the container:

# machinectl reboot mycontainer

The "poweroff" subcommand powers the container off:

# machinectl poweroff mycontainer

So much about the machinectl tool. The tool knows a couple more commands; please check the man page for details. Note again that even though we use systemd-nspawn as container manager here, the concepts apply to any container manager that implements the logic described here, including libvirt-lxc for example.

machinectl is not the only tool that is useful in conjunction with containers. Many of systemd's own tools have been updated to explicitly support containers too! Let's try this (after first starting the container up again, repeating the systemd-nspawn command from above):

# hostnamectl -M mycontainer set-hostname "wuff"

This uses hostnamectl(1) on the local container and sets its hostname.

Similarly, many other tools have been updated for connecting to local containers. Here's systemctl(1)'s -M switch in action:

# systemctl -M mycontainer
UNIT                                 LOAD   ACTIVE SUB       DESCRIPTION
-.mount                              loaded active mounted   /
dev-hugepages.mount                  loaded active mounted   Huge Pages File System
dev-mqueue.mount                     loaded active mounted   POSIX Message Queue File System
proc-sys-kernel-random-boot_id.mount loaded active mounted   /proc/sys/kernel/random/boot_id
[...]
time-sync.target                     loaded active active    System Time Synchronized
timers.target                        loaded active active    Timers
systemd-tmpfiles-clean.timer         loaded active waiting   Daily Cleanup of Temporary Directories

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

49 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

As expected, this shows the list of active units on the specified container, not the host. (Output is shortened here, the blog story is already getting too long).

Let's use this to restart a service within our container:

# systemctl -M mycontainer restart systemd-resolved.service

systemctl has more container support than just the -M switch, though. With the -r switch it shows the units running on the host, plus all units of all local, running containers:

# systemctl -r
UNIT                                        LOAD   ACTIVE SUB       DESCRIPTION
boot.automount                              loaded active waiting   EFI System Partition Automount
proc-sys-fs-binfmt_misc.automount           loaded active waiting   Arbitrary Executable File Formats File Syst
sys-devices-pci0000:00-0000:00:02.0-drm-card0-card0\x2dLVDS\x2d1-intel_backlight.device loaded active plugged   /sys/devices/pci0000:00/0000:00:02.0/drm/ca
[...]
timers.target                                                                                       loaded active active    Timers
mandb.timer                                                                                         loaded active waiting   Daily man-db cache update
systemd-tmpfiles-clean.timer                                                                        loaded active waiting   Daily Cleanup of Temporary Directories
mycontainer:-.mount                                                                                 loaded active mounted   /
mycontainer:dev-hugepages.mount                                                                     loaded active mounted   Huge Pages File System
mycontainer:dev-mqueue.mount                                                                        loaded active mounted   POSIX Message Queue File System
[...]
mycontainer:time-sync.target                                                                        loaded active active    System Time Synchronized
mycontainer:timers.target                                                                           loaded active active    Timers
mycontainer:systemd-tmpfiles-clean.timer                                                            loaded active waiting   Daily Cleanup of Temporary Directories

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

191 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

Here we first see the units of the host, followed by the units of the one container we currently have running. The units of the containers are prefixed with the container name and a colon (":"). (The output is shortened again for brevity's sake.)

The list-machines subcommand of systemctl shows a list of all running containers, inquiring the system managers within the containers about system state and health. More specifically it shows if containers are properly booted up, or if there are any failed services:

# systemctl list-machines
NAME         STATE   FAILED JOBS
delta (host) running      0    0
mycontainer  running      0    0
miau         degraded     1    0
waldi        running      0    0

4 machines listed.

To make things more interesting we have started two more containers in parallel. One of them has a failed service, which results in the machine state being degraded.

Let's have a look at journalctl(1)'s container support. It too supports -M to show the logs of a specific container:

# journalctl -M mycontainer -n 8
Nov 12 16:51:13 wuff systemd[1]: Starting Graphical Interface.
Nov 12 16:51:13 wuff systemd[1]: Reached target Graphical Interface.
Nov 12 16:51:13 wuff systemd[1]: Starting Update UTMP about System Runlevel Changes...
Nov 12 16:51:13 wuff systemd[1]: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Nov 12 16:51:13 wuff systemd[1]: Started Update UTMP about System Runlevel Changes.
Nov 12 16:51:13 wuff systemd[1]: Startup finished in 399ms.
Nov 12 16:51:13 wuff sshd[35]: Server listening on 0.0.0.0 port 24.
Nov 12 16:51:13 wuff sshd[35]: Server listening on :: port 24.

However, it also supports -m to show the combined log stream of the host and all local containers:

# journalctl -m -e

(Let's skip the output here completely, I figure you can extrapolate how this looks.)

But it's not only systemd's own tools that understand container support these days; procps sports support for it, too:

# ps -eo pid,machine,args
 PID MACHINE                         COMMAND
   1 -                               /usr/lib/systemd/systemd --switched-root --system --deserialize 20
[...]
2915 -                               emacs contents/projects/containers.md
3403 -                               [kworker/u16:7]
3415 -                               [kworker/u16:9]
4501 -                               /usr/libexec/nm-vpnc-service
4519 -                               /usr/sbin/vpnc --non-inter --no-detach --pid-file /var/run/NetworkManager/nm-vpnc-bfda8671-f025-4812-a66b-362eb12e7f13.pid -
4749 -                               /usr/libexec/dconf-service
4980 -                               /usr/lib/systemd/systemd-resolved
5006 -                               /usr/lib64/firefox/firefox
5168 -                               [kworker/u16:0]
5192 -                               [kworker/u16:4]
5193 -                               [kworker/u16:5]
5497 -                               [kworker/u16:1]
5591 -                               [kworker/u16:8]
5711 -                               sudo -s
5715 -                               /bin/bash
5749 -                               /home/lennart/projects/systemd/systemd-nspawn -D /srv/mycontainer -b
5750 mycontainer                     /usr/lib/systemd/systemd
5799 mycontainer                     /usr/lib/systemd/systemd-journald
5862 mycontainer                     /usr/lib/systemd/systemd-logind
5863 mycontainer                     /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
5868 mycontainer                     /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt102
5871 mycontainer                     /usr/sbin/sshd -D
6527 mycontainer                     /usr/lib/systemd/systemd-resolved
[...]

This shows a process list (shortened). The second column shows the container a process belongs to. All processes shown with "-" belong to the host itself.

But it doesn't stop there. The new "sd-bus" D-Bus client library we have been preparing in the systemd/kdbus context knows containers too. While you use sd_bus_open_system() to connect to your local host's system bus, sd_bus_open_system_container() may be used to connect to the system bus of any local container, so that you can execute bus methods on it.

sd-login.h and machined's bus interface provide a number of APIs to add container support to other programs too. They support enumeration of containers as well as retrieving the machine name from a PID and similar.

systemd-networkd also has support for containers. When run inside a container it will by default run a DHCP client and IPv4LL on any veth network interface named host0 (this interface is special under the logic described here). When run on the host, networkd will by default provide a DHCP server and IPv4LL on any veth network interface named "ve-" followed by the container name.

Let's have a look at one last facet of systemd's container integration: the hook-up with the name service switch. Recent systemd versions contain a new NSS module, nss-mymachines, that makes the names of all local containers resolvable via gethostbyname() and getaddrinfo(). This only applies to containers that run within their own network namespace. With the systemd-nspawn command shown above the container shares the network configuration with the host however; hence let's restart the container, this time with a virtual veth network link between host and container:

# machinectl poweroff mycontainer
# systemd-nspawn -D /srv/mycontainer --network-veth -b

Now, (assuming that networkd is used in the container and outside) we can already ping the container using its name, due to the simple magic of nss-mymachines:

# ping mycontainer
PING mycontainer (10.0.0.2) 56(84) bytes of data.
64 bytes from mycontainer (10.0.0.2): icmp_seq=1 ttl=64 time=0.124 ms
64 bytes from mycontainer (10.0.0.2): icmp_seq=2 ttl=64 time=0.078 ms

Of course, name resolution not only works with ping, it works with all other tools that use libc gethostbyname() or getaddrinfo() too, among them the venerable ssh.
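
Anything that resolves names through libc gets this for free; here's a quick Python sketch, using the container name from above:

import socket

# getaddrinfo() goes through NSS, so nss-mymachines kicks in
# exactly like it does for ping
for family, socktype, proto, canonname, sockaddr in socket.getaddrinfo("mycontainer", None):
    print sockaddr[0]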

And this is pretty much all I want to cover for now. We briefly touched a variety of integration points, and there's a lot more still if you look closely. We are working on even more container integration all the time, so expect more new features in this area with every systemd release.

Note that the whole machine concept is actually not limited to containers, but covers VMs too to a certain degree. However, the integration is not as close, as access to a VM's internals is not as easy as for containers, as it usually requires a network transport instead of allowing direct syscall access.

Anyway, I hope this is useful. For further details, please have a look at the linked man pages and other documentation.

Analysing input events with evemu

Over the last couple of years, we've put some effort into better tooling for debugging input devices. Benjamin's hid-replay is an example of a low-level tool that's great for helping with kernel issues; evemu is great for userspace debugging of evdev devices. evemu has recently gained better Python bindings, so today I'll explain how those make it really easy to analyse event recordings.

Requirement: evemu 2.1.0 or later

The input needed to make use of the Python bindings is either a device directly or an evemu recordings file. I find the latter a lot more interesting: it enables me to record multiple users/devices first and then run the analysis later. So let's go with that:


$ sudo evemu-record > mouse-events.evemu
Available devices:
/dev/input/event0: Lid Switch
/dev/input/event1: Sleep Button
/dev/input/event2: Power Button
/dev/input/event3: AT Translated Set 2 keyboard
/dev/input/event4: SynPS/2 Synaptics TouchPad
/dev/input/event5: Lenovo Optical USB Mouse
Select the device event number [0-5]: 5
That pipes any event from the mouse into the file, until terminated by ctrl+c. It's just a text file; feel free to leave it running for hours.

Now for the actual analysis. The simplest approach is to read all events from a file and print them:


#!/usr/bin/env python

import sys
import evemu

filename = sys.argv[1]
# create an evemu instance from the recording,
# create=False means don't create a uinput device from it
d = evemu.Device(filename, create=False)

for e in d.events():
    print e
That prints out all events, so the output should look identical to the input file's event list, something like this:

E: 7.817877 0000 0000 0000 # ------------ SYN_REPORT (0) ----------
E: 7.821887 0002 0000 -001 # EV_REL / REL_X -1
E: 7.821903 0000 0000 0000 # ------------ SYN_REPORT (0) ----------
E: 7.825872 0002 0000 -001 # EV_REL / REL_X -1
E: 7.825879 0002 0001 -001 # EV_REL / REL_Y -1
E: 7.825883 0000 0000 0000 # ------------ SYN_REPORT (0) ----------

Each event is an evemu.InputEvent object, with the properties type, code and value, and the timestamp accessible as sec and usec (i.e. the underlying C struct). The most useful method of the object is InputEvent.matches(type, code), which takes both integer values and strings:


if e.matches("EV_REL"):
    print "this is a relative event of some kind"
elif e.matches("EV_ABS", "ABS_X"):
    print "absolute X movement"
elif e.matches(0x03, 0x01):
    print "absolute Y movement"

A practical example: let's say we want to know the maximum delta value our mouse sends.



import sys
import evemu

filename = sys.argv[1]
# create an evemu instance from the recording,
# create=False means don't create a uinput device from it
d = evemu.Device(filename, create=False)

if not d.has_event("EV_REL", "REL_X") or \
   not d.has_event("EV_REL", "REL_Y"):
    print "%s isn't a mouse" % d.name
    sys.exit(1)

deltas = []

for e in d.events():
    if e.matches("EV_REL", "REL_X") or \
       e.matches("EV_REL", "REL_Y"):
        deltas.append(e.value)

max_delta = max([abs(x) for x in deltas])
print "Maximum delta is %d" % max_delta
And voila, with just a few lines of code we've analysed a set of events. The rest is up to your imagination. So far I've used scripts like this to help us implement palm detection, figure out ways to deal with high-DPI mice, estimate the required size for top software buttons on touchpads, etc.

Especially for printing event values, a couple of other functions come in handy here:


evtype = evemu.event_get_value("EV_REL")
evcode = evemu.event_get_value("EV_REL", "REL_X")

strtype = evemu.event_get_name(evtype)
strcode = evemu.event_get_name(evtype, evcode)
They do what you'd expect from them, and both functions take either strings or actual numeric type/code values. The same exists for input properties.

Wacom Intuos Creative Stylus 2 and the missing support on Linux

The following was debugged and discovered by Benjamin Tissoires, I'm merely playing the editor and publisher. All credit and complimentary beverages go to him please.

Wacom recently added two interesting products to its lineup: the Intuos Creative Stylus 2 and the Bamboo Stylus Fineline. Both are styli only, without an accompanying physical tablet, and they are aimed at the Apple iPad market. The basic idea here is that touch location is provided by the system, and the pen augments that with buttons, pressure and whatever else. The tips of the styli are 2.9mm (Creative Stylus 2) and 1.9mm (Bamboo Fineline), so definitely smaller than your average finger, and smaller than most other touch pens. This could of course be useful for any touch-capable Linux laptop; it's a cheap way to get an artist's tablet. The official compatibility list includes only iPads, but then that hasn't stopped anyone in the past.

We enjoy a good relationship with the Linux engineers at Wacom, so naturally the first thing was to ask if they could help us out here. Unfortunately, the answer was no. Or more specifically (and heavily paraphrased): "those devices aren't really general purpose, so we wouldn't want to disclose the spec". That of course immediately prompted Benjamin to go and buy one.

From Wacom's POV not disclosing the specs makes sense, and the reason will become more obvious below. The styli are designed for a specific use-case; if Wacom claims that they can work in any use-case, they have a lot to lose - mainly from the crowd that blames the manufacturer if something doesn't work as they expect. Think of when netbooks were first introduced and people complained that they weren't full-blown laptops, despite the comparatively low price...

The first result: the stylus works on most touchscreens (and Benjamin has a few of those) but not on all of them. Specifically, the touchscreen on the Asus N550JK didn't react to it. So that's warning number 1: it may not work on your specific laptop and you probably won't know until you try.

Pairing works, provided you have a Bluetooth 4.0 chipset and your kernel supports it (tested on 3.18-rc3). Problem is: you can connect the device but you don't get anything out of it. Why? Bluetooth LE. Let's expand on that: Bluetooth LE uses the Generic Attribute Profile (GATT). The actual data is divided into Profiles, Services and Characteristics, which are clearly named by committee and stand for the general topic, subtopic/item and data point(s). So in the example here the Profile is Heart Rate Profile, the Service is Heart Rate Measurement and the Characteristic is the actual count of "lub-dub" on your ticker [1]. All are predefined. Again, why does this matter? Because what we're hoping for is the HID Service or the HID over GATT Service. In both cases we could then use the kernel's uhid module to get the stylus to work. Alas, the actual output of the device is:


[bluetooth]# info C5:37:E8:73:57:BE
Device C5:37:E8:73:57:BE
Name: Stylus1
Alias: Stylus1
Appearance: 0x0341
Paired: yes
Trusted: yes
Blocked: no
Connected: yes
LegacyPairing: no
UUID: Vendor specific (00001523-1212-efde-1523-785feabcd123)
UUID: Generic Access Profile (00001800-0000-1000-8000-00805f9b34fb)
UUID: Generic Attribute Profile (00001801-0000-1000-8000-00805f9b34fb)
UUID: Device Information (0000180a-0000-1000-8000-00805f9b34fb)
UUID: Battery Service (0000180f-0000-1000-8000-00805f9b34fb)
UUID: Vendor specific (6e400001-b5a3-f393-e0a9-e50e24dcca9e)
Modalias: usb:v056Ap0329d0001
So we can see GAP and GATT, Device Information and Battery Service (both predefined) and 2 vendor-specific profiles (i.e. "magic fairy dust"). And this is where Benjamin got stuck - each of these may have a vendor-specific handshake, protocol, etc. And it's not even certain he'll be able to set the device up so it talks to him. So warning number 2: you can see and connect the device, but it'll talk gibberish (or nothing).

Now, it's probably possible to reverse engineer that if you have sufficient motivation. We don't. The Bluetooth spec is available though, once you work your way through that you can start working on the vendor specific protocol which we know nothing about.

Last but not least: the userspace component. The device itself is not ready-to-use: it provides pressure, but you'd still have to associate it with the right touch point. That's not trivial, especially in the presence of other touch points (the outside of your hand while using the stylus, for example). So we'd need to add support for this in the X drivers and libinput to make it work. Wacom and/or OS X presumably solved this for iPads, but even there it doesn't just work. The applications need to support it and "You do have to do some digging to figure out to connect the stylus to your favorite art apps -- it's a different procedure for each one, but that's common among these styluses." That's something we wouldn't do the same way on the Linux desktop. So warning number 3: even if you can make the kernel work, it won't work as you expect in userspace, and getting it to work is a huge task.

Now all that pretty much boils down to: is it worthwhile? Our consensus so far was "no". I guess Wacom was right in holding back the spec after all. These devices won't work on any tablet and even if they would, we don't have anything in the userspace stack to actually support them properly. So in summary: don't buy the stylus if you plan to use it in Linux.

[1] lub-dub is good. ta-lub-dub is not. you don't want lub-dub-ta. wikipedia

November 07, 2014

LibreOffice Coverity Defect Density 0.02

Coverity Defect Density: LibreOffice vs Average

We run LibreOffice through Coverity approximately once a week. According to Coverity's overview dashboard our current status is:

LibreOffice: 7,271,857 lines of code and 0.02 defect density

Open Source Defect Density By Project Size

Lines of Code (LOC)      Defect Density
Less than 100,000        0.35
100,000 to 499,999       0.5
500,000 to 1 million     0.7
More than 1 million      0.65
Note: Defect density is measured by the number of defects per 1,000 lines of code, identified by the Coverity platform. The numbers shown above are from our 2013 Coverity Scan Report, which analyzed 250 million lines of open source code.
The "lines of code" here is 7,271,857 vs 9,500,825 in older reports because I'm now building against system-libraries instead of building those in-tree in order to speed up the process. Those "external" libraries have always been marked as "ignore in analysis" in coverity so that change has no effect on the defect density of our own code.

If anyone knows how we could rework our code or otherwise automatically silence https://communities.coverity.com/thread/2993 that would be great. This false positive keeps cropping up in uses of uno::Sequence, so the same warnings keep popping up.

We're now at that happy place where each run we get a small and manageable number of genuinely new warnings in recently modified code, rather than the same old ones again and again as general refactoring perturbs the code enough for them to be newly detected.

more on Displaylink3 and HDCP encryption
okay another braindump (still nothing working).

The git repo mentioned in the previous post has all the code I've hacked up so far.

I finished writing the HDCP protocol stages, sending all the messages and getting replies from the device.

So I've successfully reached a point where I've negotiated an HDCP session key with the device, and we are both happy about it. Unfortunately I've no idea what I'm meant to be encrypting to send to the device. The next packet the USB traces contain is 384 bytes of encrypted data.

Now HDCP v2 had a vulnerability in its key negotiation, and I've written code to try and use this fact. So I've taken a trace I made from Windows, extracted the necessary bits, and using that I've managed to derive the master key used in that trace, and subsequently managed to derive the session key for it. So I've replayed the first encrypted packet from the trace to the device and got an encrypted response the same as in the trace.

I've tried changing a bit in the session key, riv value and data I'm sending, and doing that causes the device not to reply with the answer. This to me implies that the device is using the HDCP cipher to encode the control channel. Now HDCP does say you should only do this for video streams, but maybe DisplayLink forgot to read that bit.

Now where does this leave me? In theory I should be able to replay the full trace (haven't had time yet) and I should see the same picture on screen as I did (though I can't remember what monitor/device I used, so I might have to retrace and restage my tests before then).

However I really need to decrypt the encrypted data in the trace, and from reading the HDCP spec the only values I need to feed the AES engine are ks ^ lc128, riv, streamctr, inputctr. I'm assuming streamctr and inputctr are 0 for the first packet (I could be wrong, maybe they use some wacky streamctr to avoid messing with hdcp), riv and ks I've captured. So lc128 is possibly the crux.

Now what is lc128? It's a secret 128-bit value in the HDCP world, given only to HDCP adopters. It's normally something you'd store in hw on the GPU etc. as an input to the hw cipher. But in displaylink there is no GPU encrypting the data. Now it's possible that displaylink don't use the same lc128 as the HDCP people, unlikely but possible. Maybe they cipher their streams with their own lc128, and only use the official hdcp lc128 for actual HDCP streams.

I don't think lc128 has leaked; I'm not sure what the consequences of it leaking would be, but hey, it's just a magic number, and if displaylink are using it as an input to their AES code, it must be in RAM at some point, so now I need to figure out ways to work that out. I'm not sure how long it would take to brute force a 128-bit key space; probably impossible.

At any point if someone from DisplayLink wants to talk, you know where to find me :-)

November 06, 2014

Select and toggle off master elements directly via delete

To remove the footer, header, date/time and slide-number master elements in LibreOffice, the user has to toggle them off via view->master->master elements. Actually selecting the corresponding placeholder preview of the element in master view and pressing delete doesn't do anything.

In LibreOffice 4.4 this infuriating behavior is changed and LibreOffice will do the right thing and toggle off any selected master elements when you press delete.

November 01, 2014

Hardware support news
Trackballs

I dusted off (literally) my Logitech Marble trackball to replace the Intuos tablet + mouse combination that I was using to cut down on the lateral movement of my right arm which led to back pains.

Not that you care about that one bit, but it meant that I needed a way to get a scroll wheel working with this scroll-wheel-less trackball. That's now implemented in gnome-settings-daemon for GNOME 3.16. You'd run:


gsettings set org.gnome.settings-daemon.peripherals.trackball scroll-wheel-emulation-button 8

With "8" being the mouse button number to use to make the trackball ball into a wheel. We plan to add an interface to configure this in the Settings.

Touchscreens

Touchscreens are now switched off when the screensaver is on. This means you'll usually need to use one of the hardware buttons on tablets, or a mouse or keyboard on laptops to turn the screen back on.

Note that you'll need a kernel patch to avoid surprises when the touchscreen is re-enabled.

More touchscreens

The driver for the Goodix touchscreen found in the Onda v975w is now upstream as well.

October 31, 2014

a day with DisplayLink USB3 and HDCP
So for some reason I decided to look at the displaylink usb3 adaptors today. (no good news).

This blog post is so I don't forget all of this when I page it out. Notes: HDCP 1.0 being broken doesn't matter to this; maybe HDCP v2.0 being a bit broken could be used, but I'm not sure how!

The displaylink USB3 protocol is based on the HDCP protocol. I've traced the first few packets and it clearly looks like the host sends two packets

AKE_Init,
AKE_Transmitter_Info

and the device sends back
AKE_Send_Cert

at least.

AKE_Send_Cert contains a 522-byte certificate, containing a receiver id, public key, some misc bytes and a signature generated with the DCP LLC private key, which you have to verify.

So the HDCP v2.2 spec contains the DCP LLC public key, and I've written some code to verify the certificate using openssl, but it totally fails to work. This is probably due to me doing something stupid, or not understanding what I'm doing; if you are openssl-knowledgeable and want to look, the hack fest is
http://cgit.freedesktop.org/~airlied/dl3dev/

It might be that the DisplayLink devices use a different signing key than the DCP LLC one.

That repo contains some code to talk to the device (currently disabled) and do the initial sequence, along with an attempt to verify the cert.

Now once I get past this hurdle, a larger one seems to remain: the HDCP 2.0 spec has a global secret 128-bit value called LC128, which everyone who implements HDCP gets and hides somewhere. It's probably sitting in the displaylink driver in hex, but I'd hope they at least hide it better than that. It may also possibly be supplied by the OS, Windows or OS X (I've no clue yet). That value is used in the key negotiation.

Now it might be possible that Displaylink allow non-HDCP encrypted data to be sent to the device, in which case: win, if I can find out where/how to do that. Or it might be that the device requires HDCP and decrypts non-HDCP content before sending it over VGA/DVI. I've no ideas yet on that front either.

Ah well, probably enough learning for today. I knew nothing about HDCP this morning, so I can't say it made my life any better learning about it :-P

October 30, 2014

Hacker News metrics (first rough approach)
I'm not a huge fan of Hacker News[1]. My impression continues to be that it ends up promoting stories that align with the Silicon Valley narrative of meritocracy, technology will fix everything, regulation is the cancer killing agile startups, and discouraging stories that suggest that the world of technology is, broadly speaking, awful and we should all be ashamed of ourselves.

But as a good data-driven person[2], wouldn't it be nice to have numbers rather than just handwaving? In the absence of a good public dataset, I scraped Hacker Slide to get just over two months of data in the form of hourly snapshots of stories, their age, their score and their position. I then applied a trivial test:
  1. If the story is younger than any other story
  2. and the story has a higher score than that other story
  3. and the story has a worse ranking than that other story
  4. and at least one of these two stories is on the front page
then the story is considered to have been penalised.

(note: "penalised" can have several meanings. It may be due to explicit flagging, or it may be due to an automated system deciding that the story is controversial or appears to be supported by a voting ring. There may be other reasons. I haven't attempted to separate them, because for my purposes it doesn't matter. The algorithm is discussed here.)

Now, ideally I'd classify my dataset based on manual analysis and classification of stories, but I'm lazy (see [2]) and so just tried some keyword analysis:
Keyword     Penalised   Unpenalised
Women       13          4
Harass      2           0
Female      5           1
Intel       2           3
x86         3           4
ARM         3           4
Airplane    1           2
Startup     46          26

A few things to note:
  1. Lots of stories are penalised. Of the front page stories in my dataset, I count 3240 stories that have some kind of penalty applied, against 2848 that don't. The default seems to be that some kind of detection will kick in.
  2. Stories containing keywords that suggest they refer to issues around social justice appear more likely to be penalised than stories that refer to technical matters
  3. There are other topics that are also disproportionately likely to be penalised. That's interesting, but not really relevant - I'm not necessarily arguing that social issues are penalised out of an active desire to make them go away, merely that the existing ranking system tends to result in it happening anyway.

This clearly isn't an especially rigorous analysis, and in future I hope to do a better job. But for now the evidence appears consistent with my innate prejudice - the Hacker News ranking algorithm tends to penalise stories that address social issues. An interesting next step would be to attempt to infer whether the reasons for the penalties are similar between different categories of penalised stories[3], but I'm not sure how practical that is with the publicly available data.

(Raw data is here, penalised stories are here, unpenalised stories are here)


[1] Moving to San Francisco has resulted in it making more sense, but really that just makes me even more depressed.
[2] Ha ha like fuck my PhD's in biology
[3] Perhaps stories about startups tend to get penalised because of voter ring detection from people trying to promote their startup, while stories about social issues tend to get penalised because of controversy detection?

appdata-tools is dead

PSA: If you’re using appdata-validate, please switch to appstream-util validate from the appstream-glib project. If you’re also using the M4 macro, just replace APPDATA_XML with APPSTREAM_XML. I’ll ship both the old binary and the old m4 file in appstream-glib for a little bit, but I’ll probably remove them again the next time we bump ABI. That is all. :)

On joining the FSF board
I joined the board of directors of the Free Software Foundation a couple of weeks ago. I've been travelling a bunch since then, so haven't really had time to write about it. But since I'm currently waiting for a test job to finish, why not?

It's impossible to overstate how important free software is. A movement that began with a quest to work around a faulty printer is now our greatest defence against a world full of hostile actors. Without the ability to examine software, we can have no real faith that we haven't been put at risk by backdoors introduced through incompetence or malice. Without the freedom to modify software, we have no chance of updating it to deal with the new challenges that we face on a daily basis. Without the freedom to pass that modified software on to others, we are unable to help people who don't have the technical skills to protect themselves.

Free software isn't sufficient for building a trustworthy computing environment, one that not merely protects the user but respects the user. But it is necessary for that, and that's why I continue to evangelise on its behalf at every opportunity.

However.

Free software has a problem. It's natural to write software to satisfy our own needs, but in doing so we write software that doesn't provide as much benefit to people who have different needs. We need to listen to others, improve our knowledge of their requirements and ensure that they are in a position to benefit from the freedoms we espouse. And that means building diverse communities, communities that are inclusive regardless of people's race, gender, sexuality or economic background. Free software that ends up designed primarily to meet the needs of well-off white men is a failure. We do not improve the world by ignoring the majority of people in it. To do that, we need to listen to others. And to do that, we need to ensure that our community is accessible to everybody.

That's not the case right now. We are a community that is disproportionately male, disproportionately white, disproportionately rich. This is made strikingly obvious by looking at the composition of the FSF board, a body made up entirely of white men. In joining the board, I have perpetuated this. I do not bring new experiences. I do not bring an understanding of an entirely different set of problems. I do not serve as an inspiration to groups currently under-represented in our communities. I am, in short, a hypocrite.

So why did I do it? Why have I joined an organisation whose founder I publicly criticised for making sexist jokes in a conference presentation? I'm afraid that my answer may not seem convincing, but in the end it boils down to feeling that I can make more of a difference from within than from outside. I am now in a position to ensure that the board never forgets to consider diversity when making decisions. I am in a position to advocate for programs that build us stronger, more representative communities. I am in a position to take responsibility for our failings and try to do better in future.

People can justifiably conclude that I'm making excuses, and I can make no argument against that other than to ask to be judged by my actions. I hope to be able to look back at my time with the FSF and believe that I helped make a positive difference. But maybe this is hubris. Maybe I am just perpetuating the status quo. If so, I absolutely deserve criticism for my choices. We'll find out in a few years.


October 28, 2014

Weekend hack, updated

After a bit more debugging, and with some help from Jasper, here is the inspector debugging an X11 application while displaying under Wayland:

Wayland inspection!

It also turns out that the GTK_INSPECTOR_DISPLAY environment variable is sufficient to make this work. I started this demo with

GTK_DEBUG=interactive GDK_BACKEND=x11,wayland \
GTK_INSPECTOR_DISPLAY=wayland-0 ./gtk3-demo
Update

The reverse (application displaying under Wayland, inspector on X) also works:

Reverse inspection

October 25, 2014

A weekend hack

I’ve been working on making GtkInspector use a different display connection. This helps isolate it from some of the changes you can trigger from inside the inspector UI. Then I thought, why not use a different backend?!

We did enough work on GDK backend separation that it could almost work. But since we didn’t add API to actually connect to specific backends (users and applications get some control with GDK_BACKEND and gdk_set_allowed_backends()), nobody has ever used multiple backends in the same process. And things that don’t get used don’t work. So some fixes were necessary.
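
For illustration, here is roughly how an application can use that control; a minimal sketch built only on the public gdk_set_allowed_backends() API mentioned above:

#include <gtk/gtk.h>

int
main (int argc, char *argv[])
{
  /* Must be called before gtk_init(): ask GDK to try Wayland first
   * and fall back to X11, the same thing GDK_BACKEND=wayland,x11
   * does from the environment. */
  gdk_set_allowed_backends ("wayland,x11");

  gtk_init (&argc, &argv);

  GtkWidget *window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
  g_signal_connect (window, "destroy", G_CALLBACK (gtk_main_quit), NULL);
  gtk_widget_show_all (window);
  gtk_main ();

  return 0;
}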

But now I have gtk3-demo running under X while the inspector is using the Broadway backend and shows up in the web browser:

[video: http://blogs.gnome.org/mclasen/files/2014/10/multidisplay.webm]

To my knowledge this is the first time a GTK+ application is using multiple GDK backends at the same time.

Anticipating some questions:

  • What is this good for?

Not that much, really. I consider it mostly a curiosity. It does prove that the GDK backend separation is (almost) working.

For most cases where this could be used with the inspector, it is probably preferable to move the inspector out of process and communicate via D-Bus. There are some people who are interested in this, so it may happen.

  • Can I use this in my application?

No. I haven’t committed the hack that makes the inspector use a different backend. If we wanted to support this, we would need to add a real API to connect to a specific backend.

What I did commit though is the code that makes the inspector always use a separate display connection (with the same backend), so you get the better isolation that can be seen in the video.
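
The mechanics of a separate connection look roughly like this; a sketch of the idea, not the actual inspector code (the helper name is mine):

#include <gtk/gtk.h>

static GtkWidget *
open_window_on_second_display (void)
{
  const char *name = g_getenv ("GTK_INSPECTOR_DISPLAY");
  /* Open a second connection, possibly to a different display,
   * and place the window on it. */
  GdkDisplay *display = name ? gdk_display_open (name) : NULL;
  GtkWidget *window = gtk_window_new (GTK_WINDOW_TOPLEVEL);

  if (display != NULL)
    gtk_window_set_screen (GTK_WINDOW (window),
                           gdk_display_get_default_screen (display));

  gtk_widget_show_all (window);
  return window;
}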

  • Does this also work with Wayland?

No. I first tried to make it work with Wayland+X, but the Wayland backend is more complicated than Broadway. I’m still getting crashes fairly soon. Broadway pretty much just worked, after a round of fixes.

October 23, 2014

perf.gnome.org – introduction

My talk at GUADEC this year was titled Continuous Performance Testing on Actual Hardware, and covered a project that I’ve been spending some time on for the last 6 months or so. I tackled this project because of accumulated frustration that we weren’t making consistent progress on performance with GNOME. For one thing, the same problems seemed to recur. For another thing, we would get anecdotal reports of performance problems that were very hard to put a finger on. Was the problem specific to some particular piece of hardware? Was it a new problem? Was it a problem that we had already addressed? I wrote some performance tests for gnome-shell a few years ago – but running them sporadically wasn’t that useful. Running a test once doesn’t tell you how fast something should be, just how fast it is at the moment. And if you run the tests again in 6 months, even if you remember what numbers you got last time, even if you still have the same development hardware, how can you possibly figure out what change is responsible? There will have been thousands of changes to dozens of different software modules.

Continuous testing is the goal here – every time we make a change, to run the same tests on the same set of hardware, and then to make the results available with graphs so that everybody can see them. If something gets slower, we can then immediately figure out what commit is responsible.

We already have a continuous build server for GNOME, GNOME Continuous, which is hosted on build.gnome.org. GNOME Continuous is a creation of Colin Walters, and internally uses Colin’s ostree to store the results. ostree, for those not familiar with it, is a bit like Git for trees of binary files, and in particular for operating systems. Because ostree can efficiently share common files and represent the difference between two trees, it is a great way to both store lots of build results and distribute them over the network.

I wanted to start with the GNOME Continuous build server – for one thing so I wouldn’t have to babysit a separate build server. There are many ways that the build can break, and we’ll never get away from having to keep an eye on them. Colin and, more recently, Vadim Rutkovsky were already doing that for GNOME Continuous.

But actually putting performance tests into the set of tests that are run by build.gnome.org doesn’t work well. GNOME Continuous runs its tests on virtual machines, and a performance test on a virtual machine doesn’t give the numbers we want. For one thing, server hardware is different from desktop hardware – it generally has very limited graphics acceleration, it has completely different storage, and so forth. For another thing, a virtual machine is not an isolated environment – other processes and unpredictable caching will affect the numbers we get – and any sort of noise makes it harder to see the signal we are looking for.

Instead, what I wanted was to have a system where we could run the performance tests on standard desktop hardware – not requiring any special management features.

Another architectural requirement was that the tests would keep on running, no matter what. If a test machine locked up because of a kernel problem, I wanted to be able to continue on, update the machine to the next operating system image, and try again.

The overall architecture is shown in the following diagram:

HWTest Architecture

The most interesting thing to note in the diagram is that the test machines don’t directly connect to build.gnome.org to download builds or perf.gnome.org to upload the results. Instead, test machines are connected over a private network to a controller machine which supervises the process of updating to the next build and actually running the tests. The controller has two forms of control over the process – first, it controls the power to the test machines, so at any point it can power cycle a test machine and force it to reboot. Second, the test machines are set up to network boot from the controller, so that after power cycling the controller machine can determine what to boot – a special image to do an update or the software being tested. The systemd journal from the test machine is exported over the network to the controller machine so that the controller machine can see when the update is done, and collect test results for publishing to perf.gnome.org.

perf.gnome.org is live now, and tests have been running for the last three months. In that period, the tests have run thousands of times, and I haven’t had to intervene once. Here’s perf.gnome.org catching a regression (fix):

perf.gnome.org regression

I’ll cover more details of how the hardware testing setup works and how performance tests are written in future posts – for now you can find some more information at https://wiki.gnome.org/Projects/HardwareTesting.


Linux Container Security
First, read these slides. Done? Good.

(Edit: Just to clarify - these are not my slides. They're from a presentation Jerome Petazzoni gave at Linuxcon NA earlier this year)

Hypervisors present a smaller attack surface than containers. This is somewhat mitigated in containers by using seccomp, selinux and restricting capabilities in order to reduce the number of kernel entry points that untrusted code can touch, but even so there is simply a greater quantity of privileged code available to untrusted apps in a container environment when compared to a hypervisor environment[1].
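
To make that mitigation concrete, here is a minimal sketch of the kind of syscall filtering involved, using libseccomp; the three-syscall whitelist is purely illustrative, a real container policy allows a few hundred:

#include <errno.h>
#include <seccomp.h>
#include <unistd.h>

int
main (void)
{
  /* Deny everything not explicitly whitelisted with EPERM; every
   * syscall left off the list is one fewer kernel entry point that
   * untrusted code can touch. */
  scmp_filter_ctx ctx = seccomp_init (SCMP_ACT_ERRNO (EPERM));
  if (ctx == NULL)
    return 1;

  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (read), 0);
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (write), 0);
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (exit_group), 0);

  /* Load the filter into the kernel; it applies to this process and
   * anything it subsequently spawns. */
  if (seccomp_load (ctx) < 0)
    return 1;
  seccomp_release (ctx);

  write (STDOUT_FILENO, "filtered\n", 9);  /* allowed */
  return 0;  /* exit_group is allowed, so we can still exit cleanly */
}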

Does this mean containers provide reduced security? That's an arguable point. In the event of a new kernel vulnerability, container-based deployments merely need to upgrade the kernel on the host and restart all the containers. Full VMs need to upgrade the kernel in each individual image, which takes longer and may be delayed due to the additional disruption. In the event of a flaw in some remotely accessible code running in your image, an attacker's ability to cause further damage may be restricted by the existing seccomp and capabilities configuration in a container. They may be able to escalate to a more privileged user in a full VM.

I'm not really compelled by either of these arguments. Both argue that the security of your container is improved, but in almost all cases exploiting these vulnerabilities would require that an attacker already be able to run arbitrary code in your container. Many container deployments are task-specific rather than running a full system, and in that case your attacker is already able to compromise pretty much everything within the container. The argument's stronger in the Virtual Private Server case, but there you're trading that off against losing some other security features - sure, you're deploying seccomp, but you can't use selinux inside your container, because the policy isn't per-namespace[2].

So that seems like kind of a wash - there are maybe marginal increases in practical security for certain kinds of deployment, and perhaps marginal decreases for others. We end up coming back to the attack surface, and it seems inevitable that that's always going to be larger in container environments. The question is, does it matter? If the larger attack surface still only results in one more vulnerability per thousand years, you probably don't care. The aim isn't to get containers to the same level of security as hypervisors, it's to get them close enough that the difference doesn't matter.

I don't think we're there yet. Searching the kernel for bugs triggered by Trinity shows plenty of cases where the kernel screws up from unprivileged input[3]. A sufficiently strong seccomp policy plus tight restrictions on the ability of a container to touch /proc, /sys and /dev helps a lot here, but it's not full coverage. The presentation I linked to at the top of this post suggests using the grsec patches - these will tend to mitigate several (but not all) kernel vulnerabilities, but there are tradeoffs in (a) ease of management (having to build your own kernels) and (b) performance (several of the grsec options reduce performance).

But this isn't intended as a complaint. Or, rather, it is, just not about security. I suspect containers can be made sufficiently secure that the attack surface size doesn't matter. But who's going to do that work? As mentioned, modern container deployment tools make use of a number of kernel security features. But there's been something of a dearth of contributions from the companies who sell container-based services. Meaningful work here would include things like:

  • Strong auditing and aggressive fuzzing of containers under realistic configurations
  • Support for meaningful nesting of Linux Security Modules in namespaces
  • Introspection of container state and (more difficult) the host OS itself in order to identify compromises

These aren't easy jobs, but they're important, and I'm hoping that the lack of obvious development in areas like this is merely a symptom of the youth of the technology rather than a lack of meaningful desire to make things better. But until things improve, it's going to be far too easy to write containers off as a "convenient, cheap, secure: choose two" tradeoff. That's not a winning strategy.

[1] Companies using hypervisors! Audit your qemu setup to ensure that you're not providing more emulated hardware than necessary to your guests. If you're using KVM, ensure that you're using sVirt (either selinux or apparmor backed) in order to restrict qemu's privileges.
[2] There's apparently some support for loading per-namespace Apparmor policies, but that means that the process is no longer confined by the sVirt policy.
[3] To be fair, last time I ran Trinity under Docker under a VM, it ended up killing my host. Glass houses, etc.

An early view of GTK+ 3.16

A number of new features have landed in GTK+ recently. These are now available in the 3.15.0 release. Here is a quick look at some of them.

Overlay scrolling

We’ve had long-standing feature requests to turn scrollbars into overlaid indicators, for touch systems. An implementation of this idea has been merged now. We show traditional scrollbars when a mouse is detected, otherwise we fade in narrow, translucent indicators. The indicators are rendered on top of the content and don’t take up extra space. When you move the pointer over the indicator, it turns into a full-width scrollbar that can be used as such.
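
Applications that don’t want the new behaviour can opt out per scrolled window. A minimal sketch, assuming the setter that accompanies the new property (the name could still change before 3.16):

#include <gtk/gtk.h>

static void
disable_overlay_scrolling (GtkScrolledWindow *sw)
{
  /* FALSE restores the traditional, always-visible scrollbars;
   * the default is TRUE, i.e. the new indicator style. */
  gtk_scrolled_window_set_overlay_scrolling (sw, FALSE);
}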

[video: http://blogs.gnome.org/mclasen/files/2014/10/overlay-scroll2.webm]

Other new scrolling-related features are support for synchronized scrolling of multiple scrolled windows with a shared scrollbar (like in the meld side-by-side view), and an ::edge-overshot signal that is generated when the user ‘overshoots’ the scrolling at either end.
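
Here is a rough sketch of how these pieces look from application code; the synchronization simply comes from sharing one GtkAdjustment, and the callback name is mine (assuming the signal signature as merged):

#include <gtk/gtk.h>

static void
on_edge_overshot (GtkScrolledWindow *sw,
                  GtkPositionType    pos,
                  gpointer           user_data)
{
  /* Emitted when the user overshoots scrolling at an edge; a good
   * place to dynamically load more content, for example. */
  if (pos == GTK_POS_BOTTOM)
    g_print ("overshot the bottom edge\n");
}

static GtkWidget *
build_synced_view (GtkWidget *left_child, GtkWidget *right_child)
{
  /* Two scrolled windows sharing one vertical adjustment scroll in
   * sync, like the meld side-by-side view. */
  GtkAdjustment *vadj = gtk_adjustment_new (0, 0, 0, 0, 0, 0);
  GtkWidget *left  = gtk_scrolled_window_new (NULL, vadj);
  GtkWidget *right = gtk_scrolled_window_new (NULL, vadj);
  GtkWidget *box = gtk_box_new (GTK_ORIENTATION_HORIZONTAL, 6);

  gtk_container_add (GTK_CONTAINER (left), left_child);
  gtk_container_add (GTK_CONTAINER (right), right_child);
  gtk_box_pack_start (GTK_BOX (box), left,  TRUE, TRUE, 0);
  gtk_box_pack_start (GTK_BOX (box), right, TRUE, TRUE, 0);

  g_signal_connect (left, "edge-overshot",
                    G_CALLBACK (on_edge_overshot), NULL);
  return box;
}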

OpenGL support

This is another very old request – GtkGLExt and GtkGLArea have been around for more than a decade. In 3.16, GTK+ will come with a GtkGLArea widget.

[video: http://blogs.gnome.org/mclasen/files/2014/10/opengl.webm]

Alex’ commit message explains all the details, but the high-level summary is that we now render with OpenGL when we have to, and we can fully control the stacking of pieces that are rendered with OpenGL or with cairo: you can have a translucent popup over a 3D scene, or mix buttons into your 3D views.

While it is nice to have a GLArea widget, the real purpose is to prepare GDK for Emmanuele’s scene graph work, GSK.
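
For the curious, using the new widget looks roughly like this; a minimal sketch assuming the GtkGLArea API as merged (details may still shift before 3.16):

#include <epoxy/gl.h>
#include <gtk/gtk.h>

/* "render" is emitted whenever the area needs to be redrawn; GDK has
 * already made the GL context current for us. */
static gboolean
on_render (GtkGLArea *area, GdkGLContext *context, gpointer user_data)
{
  glClearColor (0.2, 0.2, 0.4, 1.0);
  glClear (GL_COLOR_BUFFER_BIT);
  return TRUE;  /* drawing is done */
}

static GtkWidget *
create_gl_area (void)
{
  GtkWidget *gl_area = gtk_gl_area_new ();
  g_signal_connect (gl_area, "render", G_CALLBACK (on_render), NULL);
  return gl_area;
}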

A Sidebar widget

Ikey Doherty contributed the GtkSidebar widget. It is a nice and clean widget to turn the pages of a GtkStack.

sidebar
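
Hooking it up takes only a few lines. A sketch assuming the API as contributed; since the widget is brand new, names could still change before 3.16:

#include <gtk/gtk.h>

static GtkWidget *
build_sidebar_and_stack (void)
{
  GtkWidget *stack = gtk_stack_new ();
  gtk_stack_add_titled (GTK_STACK (stack),
                        gtk_label_new ("One"), "page1", "Page One");
  gtk_stack_add_titled (GTK_STACK (stack),
                        gtk_label_new ("Two"), "page2", "Page Two");

  /* The sidebar lists the stack’s pages by title and switches
   * between them. */
  GtkWidget *sidebar = gtk_sidebar_new ();
  gtk_sidebar_set_stack (GTK_SIDEBAR (sidebar), GTK_STACK (stack));

  GtkWidget *box = gtk_box_new (GTK_ORIENTATION_HORIZONTAL, 0);
  gtk_box_pack_start (GTK_BOX (box), sidebar, FALSE, FALSE, 0);
  gtk_box_pack_start (GTK_BOX (box), stack, TRUE, TRUE, 0);
  return box;
}
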
IPP Printing

The GTK+ print dialog can now handle IPP printers which don’t provide a PPD to describe their capabilities.

Pure CSS theming

For the last few years, we’ve implemented more and more CSS functionality in the GTK+ style code. In 3.14, we were able to turn Adwaita into a pure CSS theme. Since CSS has clear semantics that don’t include ‘call out to arbitrary drawing code’, we are not loading and using theme engines anymore.

We’ve landed this change early in 3.15 to give theme authors enough time to convert their themes to CSS.

More to come

With these features, we are about halfway through our plans for 3.16. You can look at the GTK+ roadmap to see what else we hope to achieve in the next few months.

October 21, 2014

A GNOME Kernel wishlist
GNOME has long had relationships with Linux kernel development, in that we would have some developers do our bidding, helping us solve hard problems. Features like inotify, memfd and kdbus were all originally driven by the desktop.

I've posted a wishlist of kernel features we'd like to see implemented on the GNOME Wiki, and referenced it on the kernel mailing-list.

I hope it sparks healthy discussions about alternative (and possibly existing) features, allowing us to make instant progress.

October 15, 2014

GNOME Software and Fonts

A few people have asked me now “How do I make my font show up in GNOME Software?” and until today my answer has been something along the lines of “mrrr, it’s complicated”.

What we used to do is treat each font file in a package as an application, and then try to merge them together using some metrics found in the font and 444 semi-automatically generated AppData files from a manually updated .csv file. This wasn’t ideal, as fonts were being renamed, added and removed, which quickly made the .csv file obsolete. The summaries and descriptions were not translated and were hard to modify. We used the pre-0.6 format AppData files as the MetaInfo specification had not existed when this stuff was hacked up just in time for Fedora 20.

I’ve spent the better part of today making this a lot more sane, but in the process I’m going to need a bit of help from packagers in Fedora, and maybe even helpful upstreams. These are the notes of what I’ve got so far:

Font components are supersets of font faces, so we’d include fonts together that make a cohesive set; for instance, “SourceCode” would consist of “SourceCodePro”, “SourceSansPro-Regular” and “SourceSansPro-ExtraLight”. This is so the user can press one button and get a set of fonts, rather than having to install something new when they’re in an application designing something. Font components need a one-line summary for GNOME Software and optionally a long description. The icon and screenshots are automatically generated.

So, what do you need to do if you maintain a package with a single font, or where all the fonts are shipped in the same (sub)package? Simply ship a file like this as /usr/share/appdata/Liberation.metainfo.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Copyright 2014 Your Name <you@domain> -->
<component type="font">
  <id>Liberation</id>
  <metadata_license>CC0-1.0</metadata_license>
  <name>Liberation</name>
  <summary>Open source versions of several commercial fonts</summary>
  <description>
    <p>
      The Liberation Fonts are intended to be replacements for Times New Roman,
      Arial, and Courier New.
    </p>
  </description>
  <update_contact>richard_at_hughsie_dot_com</update_contact>
  <url type="homepage">http://fedorahosted.org/liberation-fonts/</url>
</component>

There can be up to 3 paragraphs of description, and the summary has to be just one line. Try to avoid too much technical content here; this is designed to be shown to end-users who probably don’t know what TTF means or what MSCoreFonts are.

It’s a little trickier when there are multiple source tarballs for a font component, or when the font is split up into subpackages by a packager. In this case, each subpackage needs to ship something like this as /usr/share/appdata/LiberationSerif.metainfo.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Copyright 2014 Your Name <you@domain> -->
<component type="font">
  <id>LiberationSerif</id>
  <metadata_license>CC0-1.0</metadata_license>
  <extends>Liberation</extends>
</component>

This won’t end up in the final metadata (or be visible) in the software center, but it will tell the metadata extractor that LiberationSerif should be merged into the Liberation component. All the automatically generated screenshots will be moved to the right place too.

Moving the metadata to font packages makes the process much more transparent, letting packagers write their own descriptions and actually influence how things show up in the software center. I’m happy to push some of my existing content from the .csv file upstream.

These MetaInfo files are not supposed to replace the existing fontconfig files, nor do I think they should be merged into one file or format. If your package just contains one font used internally, or where there is only partial coverage of the alphabet, I don’t think we want to show this in GNOME Software, and thus it doesn’t need any new MetaInfo files.

October 14, 2014

GNOME Summit wrap-up
Day 3

The last day of the Summit was a hacking-focused day.

In one corner, Rob Taylor was bringing up Linux and GNOME on an ARM tablet. Next to him, Javier Jardon was working on a local gnome-continuous instance. Michael Catanzaro and Robert Schroll were working on D-Bus communication to let geary use WebKit2. Christian Hergert was adding editor preferences to gnome-builder and reviewed design ideas with Ryan Lerch. I spent most of the day teaching glade about new widgets in GTK+. I also merged OpenGL support for GTK+.

Overall, a quite productive day.

Thanks again to the sponsors, and to Walter Bender and MIT for hosting us.

October 12, 2014

GNOME Summit update

The Boston Summit is well underway, time for an update on what is happening.

Day 1

Breakfast

Around 20 people got together Saturday morning and, after bagels and coffee, gathered in a room to collect topics that everybody was interested in working on. With Christian in attendance, gnome-builder was high on the list, but OSTree and gnome-continuous also showed up several times.

Our first session was discussing issues application writers have with GTK+. Examples that were mentioned are the headerbar child order changing between 3.10 and 3.12, and icon sizes changing for some icons between 3.12 and 3.14. We haven’t come to any great conclusions as to what we can do here, other than being careful about changes. GTK+ has a README file that gets updated with release notes for each major release, but it is not very visible. I will make an effort to draw attention to such potentially problematic changes earlier in the cycle.

Another kind of problem that was pointed out is that the old way of doing action-based menus with GtkAction and GtkUIManager has been deprecated, but the ‘new way’ with GAction, GMenuModel and GtkApplication has not really been fully developed yet; at least, there is no great documentation for migrating from the old to the new, or for just starting from scratch with the new way. The HowDoI‘s for this are a start, but not sufficient.
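
For readers who haven’t seen the ‘new way’, here is a minimal sketch of what it looks like in C, with one application action exposed in the app menu (the action name and labels are just examples):

#include <gtk/gtk.h>

static void
quit_activated (GSimpleAction *action, GVariant *parameter, gpointer app)
{
  g_application_quit (G_APPLICATION (app));
}

/* Connected to GtkApplication's "startup" signal. */
static void
startup (GApplication *app, gpointer user_data)
{
  /* Actions live on the application (or a window)... */
  GSimpleAction *quit = g_simple_action_new ("quit", NULL);
  g_signal_connect (quit, "activate", G_CALLBACK (quit_activated), app);
  g_action_map_add_action (G_ACTION_MAP (app), G_ACTION (quit));

  /* ...and menus are plain data, referring to actions by name. */
  GMenu *menu = g_menu_new ();
  g_menu_append (menu, "_Quit", "app.quit");
  gtk_application_set_app_menu (GTK_APPLICATION (app), G_MENU_MODEL (menu));
}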

RMS

After the morning session, Richard Stallman came by and spent an hour with us, discussing the history of free software, current challenges, and how GNOME fits into this larger picture from his perspective.

In the afternoon, we had a long session discussing ‘future apps’, which is hard to summarize. It revolved around runtimes, SDKs, bundling, pros and cons of btrfs vs ostree, portals, how to get apps ported, and what to do about Qt apps. This thread covers a lot of the same ground.

I managed to land GType statistics support in GObject and GtkInspector.

Day 2

Over breakfast, we talked about the best way to communicate with a webkit webview process, which somehow led us to kdbus. Takeaways from this discussion:

  • We (GNOME release team) should push for getting all users of webkit ported to webkit2 this cycle
  • We should create a document that explains what application and service authors and distributors need to do when porting to kdbus

After breakfast, we started out with a live demo of gnome-builder, which looks really promising. Christian is reusing a lot of existing things, from gtksourceview and devhelp to gitg and glade. But having a polished, unified application bringing it all together makes a real difference!

As this session was winding down, I started testing a GNOME 3.14.0 Wayland session with the goal of recording as many of the smaller (and larger) issues as possible in bugs. The result so far can be found here. While I reported a long list of Wayland issues today, all is not bad. It was no problem at all to write this blog post in a Wayland session!

Thanks to Red Hat and Codethink, who are sponsoring this event.

October 11, 2014

And now for some hardware (Onda v975w)
Prodded by Adam Williamson's fedlet work, and by my inability to get an Android phone to display anything, I bought an x86 tablet.

At first, I was more interested in buying a brand-name one, such as the Dell Venue 8 Pro Adam has, or the Lenovo Miix 2 that Benjamin Tissoires doesn't seem to get enough time to hack on. But all those tablets are around 300€ at most retailers, and have a smaller 7 or 8-inch screen.

So I bought a "not exported out of China" tablet, the 10" Onda v975w. The prospect of getting a no-name tablet scared me a little. Would it be as "good" (read bad) as a PadMini or an Action Pad?


Vrrrroooom.


Well, the hardware's pretty decent, and feels rather solid. There's a small amount of light leakage on the side of the touchscreen, but nothing too noticeable. I wish it had a button on the bezel to mimic the Windows button on some other tablets, but the edge gestures should replace it nicely.

The screen is pretty gorgeous and its high DPI triggers the eponymous mode in GNOME.

With help of various folks (Larry Finger, and the aforementioned Benjamin and Adam), I got the tablet to a state where I could use it to replace my force-obsoleted iPad 1 to read comic books.

I've put up a wiki page with the status of hardware/kernel support. It doesn't contain all my notes just yet (sound is working, touchscreen will work very very soon, and various "basic" features are being worked on).

I'll be putting up the fixed-up Wi-Fi driver and more instructions about installation on the Wiki page.

And if you want to make the jump, the tablets are available at $150 plus postage from Aliexpress.

Update: On Google+ and in comments of this blog, it was pointed out that the seller on Aliexpress was trying to scam people. All my apologies, I just selected the cheapest from this website. I personally bought it on Amazon.fr using NewTec24 FR as the vendor.

October 02, 2014

Actions have consequences (or: why I'm not fixing Intel's bugs any more)
A lot of the kernel work I've ended up doing has involved dealing with bugs on Intel-based systems - figuring out interactions between their hardware and firmware, reverse engineering features that they refuse to document, improving their power management support, handling platform integration stuff for their GPUs and so on. Some of this I've been paid for, but a bunch has been unpaid work in my spare time[1].

Recently, as part of the anti-women #GamerGate campaign[2], a set of awful humans convinced Intel to terminate an advertising campaign because the site hosting the campaign had dared to suggest that the sexism present throughout the gaming industry might be a problem. Despite being awful humans, it is absolutely their right to request that a company choose to spend its money in a different way. And despite it being a dreadful decision, Intel is obviously entitled to spend their money as they wish. But I'm also free to spend my unpaid spare time as I wish, and I no longer wish to spend it doing unpaid work to enable an abhorrently-behaving company to sell more hardware. I won't be working on any Intel-specific bugs. I won't be reverse engineering any Intel-based features[3]. If the backlight on your laptop with an Intel GPU doesn't work, the number of fucks I'll be giving will fail to register on even the most sensitive measuring device.

On the plus side, this is probably going to significantly reduce my gin consumption.

[1] In the spirit of full disclosure: in some cases this has resulted in me being sent laptops in order to figure stuff out, and I was not always asked to return those laptops. My current laptop was purchased by me.

[2] I appreciate that there are some people involved in this campaign who earnestly believe that they are working to improve the state of professional ethics in games media. That is a worthy goal! But you're allying yourself to a cause that disproportionately attacks women while ignoring almost every other conflict of interest in the industry. If this is what you care about, find a new way to do it - and perhaps deal with the rather more obvious cases involving giant corporations, rather than obsessing over indie developers.

For avoidance of doubt, any comments arguing this point will be replaced with the phrase "Fart fart fart".

[3] Except for the purposes of finding entertaining security bugs


September 30, 2014

GTK+ widget templates now in Javascript
Let's get the features in early!

If you're working on a Javascript application for GNOME, you'll be interested to know that you can now write GTK+ widget templates in gjs.

Many thanks to Giovanni for writing the original patches. And now to a small example:

const Lang = imports.lang;
const Gtk = imports.gi.Gtk;

const MyComplexGtkSubclass = new Lang.Class({
    Name: 'MyComplexGtkSubclass',
    Extends: Gtk.Grid,
    Template: 'resource:///org/gnome/myapp/widget.xml',
    Children: ['label-child'],

    _init: function(params) {
        this.parent(params);

        this._internalLabel = this.get_template_child(MyComplexGtkSubclass,
                                                      'label-child');
    }
});

And now you just need to create your widget:

let content = new MyComplexGtkSubclass();
content._internalLabel.set_label("My updated label");

You'll need gjs from git master to use this feature. And if you see anything that breaks, don't hesitate to file a bug against gjs in the GNOME Bugzilla.