Jonathan Lam

Core Developer @ Hudson River Trading

VEIKK v3 driver notes (again)

On 7/12/2021, 7:58:20 PM

Update 12/19/21: Only got around to writing this now, so some of the ideas may be a little stale. This describes some of the design decisions I made in summer.

This is the fourth, and probably final, installment of a series of blog posts about the VEIKK Linux driver that I've been working on since 2019. Too late have I realized that I don't have enough free time on my hands to bring this to production and to actively maintain it -- the stress of school and work is simply too great¹. The previous posts are (on my old blog):

By the most recent post, I had decided that button mappings would be best in userspace, in order to separate mechanism (kernel) from policy (userspace). This is essentially the premise of v3. (v2, the current stable version, still has most of the configuration in the kernel using sysfs parameters, and button mappings would be hell using this configuration.) It was my goal to implement button mappings in v3, but clearly that was more complicated than I had expected. I'll try to explain some of the difficulties here.

What new discoveries have been made since the last blog post?

The major discovery is that there is a magic code that can be sent to the digitizers that changes the event codes that are sent to the driver on a button press. These are easy to interpret and do not introduce "collisions" as the default event codes do. (This can be thought of as the equivalent of n-key rollover on a keyboard.) More information can be found here; this led to the development of the v3-alpha-ff0a branch. I was very excited about this because I was able to use a USB packet sniffer to detect this -- by comparing the packets sent with and without the driver, I was able to determine the magic bytes.

On the VEIKK side, there have been some advancements as well. VEIKK has released some new tablets since the last blog post, as well as official Linux .deb and .rpm drivers. (They appear to be LGPL but users appear to have problems locating the source.) This means that I can finally stop worrying about my stopgap solution being inadequate².

What happened to "the current solution"?

"The current solution" refers to the section in the last blog post that uses this title; it refers to sending complicated keycodes like Ctrl+Alt+Shift+Keypad - for a macro button to prevent collisions, and to send standard keycodes to userspace that can be represented by X tools like xbindkeys. This was not meant to be a good solution, but it was meant to be "good enough" and not introduce new mapping tools.

After a year of thought, I realized that the world is a messy place and there really is not a great solution -- "the current solution" makes too many bad assumptions. Those complicated keycodes, while unlikely to be used by the user in any other situation and thus unlikely to cause xbindkeys to misfire, also prevent you from using any modifier keys for anything else while using the macro buttons. Also, as mentioned in the post, xbindkeys is not a perfect solution: it has a decent learning curve, and it requires "proxy keys" and another tool to emulate key combinations or execute a command, and those "other tools" are all somewhat finicky.

The conclusion is to create a custom tool for the job -- the existing tools will not do without feeling extremely hacky and decentralized (making it a nightmare for someone to get introduced to the system). By writing my own mapping script, users who want to make a change to the mapping mechanism only have to look in one place, and do not have to go through some of the archaic documentation for the other tools. I wrote about this conclusion here. Since this configuration tool is so different from the previous versions, this led to the creation of two new repositories: @jlam55555/veikk-driver and veikk-config.

"The new current solution"

We can solve a lot of problems by writing our own mapping daemon. This daemon can work at the evdev layer rather than at the X layer (and not be constrained to the 255-key limit of X keycodes). It will use uinput to send arbitrarily-complex key combinations, and it will use subprocess to spawn commands. systemd will manage starting the daemon at startup, and udev will be used to alert the daemon of new VEIKK devices being plugged in. All of this has been more or less covered in the previous post(s). dbus is used as IPC to communicate with the daemon, and pkexec is used to gain elevated privileges (through the proper dbus configuration).
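To make the "evdev layer" idea a bit more concrete, here is a minimal sketch of how raw events read from a /dev/input/event* node could be decoded using only the standard library. The struct layout and the 0/1/2 value convention for key events come from the Linux input API; the names InputEvent, decode_event, and describe_key are hypothetical and not taken from veikk-config, which would more likely rely on a wrapper library.

```python
import struct
from typing import NamedTuple, Optional, Tuple

# struct input_event from <linux/input.h>: a struct timeval (two native
# longs) followed by __u16 type, __u16 code, and __s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01  # event type used for key and button events
KEY_STATES = {0: "keyup", 1: "keydown", 2: "repeat"}


class InputEvent(NamedTuple):
    sec: int
    usec: int
    type: int
    code: int
    value: int


def decode_event(raw: bytes) -> InputEvent:
    """Unpack one raw event of EVENT_SIZE bytes read from a device node."""
    return InputEvent(*struct.unpack(EVENT_FORMAT, raw))


def describe_key(event: InputEvent) -> Optional[Tuple[int, str]]:
    """Return (code, state) for key events and None for everything else."""
    if event.type != EV_KEY:
        return None
    return event.code, KEY_STATES.get(event.value, "unknown")
```

A daemon built this way would read EVENT_SIZE bytes at a time from the device node and pass each key event on to whatever mapping is configured for that code.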

The mess of components above was actually not too bad. The problems started coming after that. Again, I'm writing this half a year after working through these problems, so some of the details have faded since then, but I can still get the gist from the commit history. The driver implementation is almost exactly that of the v3-alpha-ff0a branch, since no changes need to be made on the driver side. The configuration tool is a Python package -- Python is the language of choice because it provides many convenient wrappers around low-level Linux APIs, such as uinput or udev³.

Modeling the system

Databases class taught me the importance of modeling entities in complex relationships. The general entities in the system are:

VeikkDaemon: Singleton class representing the mapper daemon.
VeikkDevice: Class representing a single physical device plugged into the system.
VeikkConfig: Class representing a complete mapping configuration (i.e., the pen transform, and a mapping from buttons to commands).
Command: Abstract class that represents commands, which perform some action on some input (either pen or button events). This has three main subclasses: PenTransformCommand (for pen events), and KeyComboCommand and ProgramCommand (for button events). There is also a trivial NoopCommand that represents an unmapped button.
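The relationships between these entities can be sketched roughly as follows. Only the class names come from the list above; every field, method name, and return value here is an assumption made for illustration, and the commands merely return a description of what they would do rather than touching uinput or spawning anything.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class Command(ABC):
    """Performs some action on an input event (pen or button)."""

    @abstractmethod
    def execute(self, event):
        ...


@dataclass
class NoopCommand(Command):
    def execute(self, event):
        return None  # unmapped button: do nothing


@dataclass
class KeyComboCommand(Command):
    keys: Tuple[str, ...]

    def execute(self, event):
        # A real daemon would emit these keys through uinput.
        return ("keys", self.keys)


@dataclass
class ProgramCommand(Command):
    argv: List[str]

    def execute(self, event):
        # A real daemon would spawn this program.
        return ("spawn", self.argv)


@dataclass
class PenTransformCommand(Command):
    scale: float = 1.0
    offset: float = 0.0

    def execute(self, event):
        # Hypothetical 1D case: the event is a single coordinate.
        return self.scale * event + self.offset


@dataclass
class VeikkConfig:
    pen: PenTransformCommand = field(default_factory=PenTransformCommand)
    buttons: Dict[int, Command] = field(default_factory=dict)

    def command_for(self, button: int) -> Command:
        return self.buttons.get(button, NoopCommand())


@dataclass
class VeikkDevice:
    path: str
    config: VeikkConfig = field(default_factory=VeikkConfig)

    def handle_button(self, button: int):
        return self.config.command_for(button).execute(button)
```

In this sketch a VeikkDaemon would simply own a collection of VeikkDevice objects, each carrying its own VeikkConfig.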

The user-facing API mainly involves a configuration script, veikkctl. This will communicate with the daemon (veikk), which will be running in the background as a system process managed by systemd. These scripts, along with the proper dbus, udev, and systemd configurations, will be installed by the package manager⁴.

GUI complexity

There are many things to consider when creating a graphical interface: how much more user-friendly the program will be, programming speed, licensing, etc. The first time I built a GUI for the v1 driver, when I knew or cared little about these concerns, I chose to use GTK and C, but it was very messy. The next time, for v2, I chose to use Qt and C++, which was a relatively nice experience, but many users had problems with the Qt installation, and the Qt framework is frankly overpowered for the purposes of this configuration tool.

For simplicity, I tried to stay as far away from a graphical interface as possible. The configuration tool would then be primarily a CLI -- all of the configurations would be done via command-line arguments. The only option that doesn't fit this pattern well is a screen mapping -- it is much easier and more intuitive to specify an area of the screen to map if the user can drag a rectangular area on the screen rather than having to enter coordinates or a transform matrix manually. As a result, a screen mapping tool involving a semi-transparent overlay and mouse dragging was implemented using the wxPython GUI library, and a fallback using xlib was implemented where the user can specify the rectangle bounds by typing them in.
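Whichever way the rectangle is obtained, it has to be turned into a pen transform. A minimal sketch of that conversion, assuming the pen reports coordinates normalized to [0, 1] on each axis and that the transform is a pair of per-axis (scale, offset) tuples; the function names and the tuple layout are hypothetical rather than the actual veikk-config representation:

```python
from typing import Tuple

Affine1D = Tuple[float, float]  # (scale, offset), hypothetical layout


def rect_to_transform(
    rect: Tuple[float, float, float, float],
    screen: Tuple[float, float],
) -> Tuple[Affine1D, Affine1D]:
    """Convert a dragged rectangle (left, top, width, height), in pixels,
    into per-axis transforms on normalized screen coordinates."""
    left, top, width, height = rect
    screen_w, screen_h = screen
    if width <= 0 or height <= 0:
        raise ValueError("the mapped area must have a positive size")
    return (
        (width / screen_w, left / screen_w),
        (height / screen_h, top / screen_h),
    )


def apply_transform(
    transform: Tuple[Affine1D, Affine1D],
    point: Tuple[float, float],
) -> Tuple[float, float]:
    """Map a normalized pen position into normalized screen coordinates."""
    (sx, ox), (sy, oy) = transform
    x, y = point
    return (sx * x + ox, sy * y + oy)
```

For example, mapping the tablet to the right half of a 1920x1080 screen gives rect_to_transform((960, 0, 960, 1080), (1920, 1080)), under which the pen's top-left corner lands at the horizontal midpoint of the screen.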

Software licensing

I'm not too familiar with licensing, but this is one of the aspects that you have to consider when productionizing a program. As far as I know, most kernel and OS APIs in Linux are licensed under GPL v2. Qt and wxPython are licensed under LGPL (which is compatible with GPL v2), and xlib is licensed under MIT (which is compatible with LGPL/GPL v2). Thus, it should be acceptable to license both the driver and the configuration tool under GPL v2.

Setting privileges with systemd/dbus

Creating virtual devices and capturing evdev devices requires superuser access. The daemon must therefore run as root, which cannot happen with user-level services, so it must be run as a system-level service (it is started after the multi-user target).

However, we wish for the user to be able to manipulate settings without sudo access -- otherwise it would be terribly inconvenient. Thus we would like veikkctl to be run as a regular user. dbus allows us to set permissions so that root services listening on the system bus can be communicated with from non-root programs. However, we may not want to do this, due to the danger of a malicious agent changing a macro to spawn a malicious program or key combination. See the next section.

Options for spawning programs

When compared to the KeyComboCommand, the ProgramCommand is more complicated, because there are more factors to consider when spawning a process than simply a keycode translation:

Setting effective user: If the daemon is running as root and can launch an arbitrary program specified by a (non-root) user, then this VEIKK driver is essentially an attack vector that allows you to run any program on a computer. Thus, we need to specify a (likely non-root) user whose uid will be set as the effective uid when a command is spawned. For now, the effective user is specified along with the command. The veikk daemon should somehow authenticate that user -- how exactly that should be done, I'm not sure. Also, we cannot simply say that the program should be run as the "current logged-in user," as there may be multiple logged-in users, and the system-level daemon has no way to tell which user the device is associated with.
Running on an X display: Again, the daemon is associated with the system and not a particular user, as is the VEIKK device, so it is difficult to know which user caused an event. As different users are running different X displays, the daemon may need to set the DISPLAY envvar in order to have the GUI run for the correct user.
Running in a terminal: Some commands may not launch a GUI but may have command-line output. There should be an option to run commands in a terminal window. (subprocess.Popen has no terminal option of its own, so this would mean prefixing the command with a terminal emulator invocation.)
Key trigger event: Should the program trigger on keyup or keydown? Presumably, keydown is the most intuitive option, but there may be cases to be made for keyup as well (or perhaps even the (repeated) keypress event).
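The four considerations above could be combined along these lines. This is only a sketch under several assumptions: the function names are hypothetical, the terminal is given as an argv prefix such as ["xterm", "-e"], the user= argument of subprocess.Popen requires Python 3.9+ on POSIX, and authenticating the requested user is left out entirely.

```python
import os
import subprocess
from typing import Dict, List, Optional, Sequence, Tuple

# evdev key event values mapped to trigger names.
TRIGGERS = {0: "keyup", 1: "keydown", 2: "repeat"}


def should_trigger(value: int, trigger: str = "keydown") -> bool:
    """Decide whether a key event with this value fires the command."""
    return TRIGGERS.get(value) == trigger


def build_spawn(
    argv: Sequence[str],
    user: str,
    display: Optional[str] = None,
    terminal: Optional[Sequence[str]] = None,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], dict]:
    """Return (argv, kwargs) suitable for subprocess.Popen(argv, **kwargs)."""
    env = dict(os.environ if base_env is None else base_env)
    if display is not None:
        env["DISPLAY"] = display  # target the right user's X display
    if terminal is not None:
        argv = [*terminal, *argv]  # e.g. ["xterm", "-e", "htop"]
    # user= makes the child run as that user instead of root.
    return list(argv), {"user": user, "env": env}


def spawn(argv: Sequence[str], user: str, **options) -> subprocess.Popen:
    """Actually start the process; needs enough privilege to switch users."""
    full_argv, kwargs = build_spawn(argv, user, **options)
    return subprocess.Popen(full_argv, **kwargs)
```

Keeping build_spawn separate from spawn means the argument handling can be checked without root and without launching anything.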

Configuration file format

The choice of configuration file format is important. INI style files are common for Linux utilities; JSON, YAML, and increasingly TOML files are common for application-level software. YAML was chosen somewhat arbitrarily out of the last three -- it is simple and human readable.

Even when the format is chosen, there are still decisions to be made. The PyYAML package allows for options when exporting, such as whether YAML's "flow style" should be used or not. Additionally, custom (de)serializers have to be written for custom classes that should be dumped.

An example of this is that tuples with a specific meaning should be clearly labeled with their intent; we encode pen transformations as tuples. To serialize such a tuple with a custom label, we wrap it in a special class (AffineTransform1D or AffineTransform2D) and implement custom (de)serializers on those classes.
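A minimal sketch of what such a labeled (de)serializer might look like with PyYAML. The class name AffineTransform1D is taken from the text, but its fields, the tag string, and the ConfigDumper/ConfigLoader helpers are assumptions for illustration rather than the actual veikk-config code.

```python
from dataclasses import dataclass

import yaml  # PyYAML (third-party)

TAG = "!AffineTransform1D"  # hypothetical tag name


@dataclass(frozen=True)
class AffineTransform1D:
    """Wraps a (scale, offset) pair so that it gets its own YAML label."""
    scale: float
    offset: float


class ConfigDumper(yaml.SafeDumper):
    """Dumper subclass so the global SafeDumper is left untouched."""


class ConfigLoader(yaml.SafeLoader):
    """Loader subclass so the global SafeLoader is left untouched."""


def represent_affine_1d(dumper, data):
    # Emit the pair as a tagged flow-style sequence.
    return dumper.represent_sequence(
        TAG, [data.scale, data.offset], flow_style=True
    )


def construct_affine_1d(loader, node):
    scale, offset = loader.construct_sequence(node)
    return AffineTransform1D(scale, offset)


ConfigDumper.add_representer(AffineTransform1D, represent_affine_1d)
ConfigLoader.add_constructor(TAG, construct_affine_1d)


def dump_config(config) -> str:
    return yaml.dump(config, Dumper=ConfigDumper, default_flow_style=False)


def load_config(text: str):
    return yaml.load(text, Loader=ConfigLoader)
```

Because the tag is registered on subclasses of the safe dumper and loader, only the explicitly registered classes can be constructed when a configuration file is read back.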

What now?

In the past, I was always worrying about when I could finish the driver, so that the people who were using it wouldn't feel like they were using abandonware. Now, both due to a better sense of my own priorities and knowing that VEIKK has released an official driver for Linux, I do not have future plans to work on the driver. The sad thing is that almost the entire design is here and thought out in these blog posts; the rest is down to implementation, but that is tedious and puts me in a time deficit. This is the most interesting and rewarding project I have ever attempted; and yet, after going through the process of thinking through all the little details, I have more critical things to do with my time. If anyone reads these blog posts and wants to have a go at it, feel free to do so, and also feel free to ping me with questions. Although I don't expect that that will happen.

Footnotes

1. Is it hypocritical to say that when I'm spending the time to write this post? I hope not. My justification is that the time and effort spent to write this post will pay off more for myself and for readers than a half-attempt to productionize the code.

2. There is the argument that I should have no obligation in the first place, but I did receive tablets for free in order to develop a driver for them. The legalities are questionable.

3. This is one of the things that I really admire Python for -- it is very good as a high-level C for systems-level programming. This is in contrast to the spaghetti dependency-breaking broken-notebook data-science code that is all too often seen that tarnishes its reputation.

4. I didn't have time to work with the distribution system much. Since this is a Python package, it can be downloaded through pip, but I'm not sure about pip/distutils's conventions with installing files to system locations. It may be better to distribute this not through pip at all, but only through distribution-specific package managers like pacman or apt.