Jonathan Lam

Core Developer @ Hudson River Trading



Understanding the tty subsystem: Overview and architecture

On 4/7/2023, 5:30:04 PM



Preface 04/07/2023: This is my attempt at understanding terminal devices in Linux, both from a summary perspective and from an implementation perspective (for my OS project). This is the result of roughly two days of frantic Internet searching, and was originally written as part of my OS project documentation [1].

Update 05/01/2023: I've corrected and extended bits of this post after working on the keyboard driver and better understanding the tty layer. In particular, I've updated parts related to line discipline and the event subsystem.

Update 05/04/2023: This blog post was getting too long, so I've split it into three parts. This is part 1, which summarizes the terminal subsystem and goes over its architecture. Part 2 describes the line discipline, which defines the behavior of the terminal device. Part 3 describes two of the buffering data structures used in the tty subsystem.


What is a terminal device?

The terminal is an abstraction used to manage I/O from a serial device. In its simplest form, let's assume we have a single keyboard, and a non-multitasking computer (only running one process at any time, and this process is in the foreground). Then, the terminal acts as the interface between the user and the process.

MASTER SIDE                          SLAVE SIDE
===========                          ==========

              --< output queue <-- (stdout)
      screen /                    \
user <-------       TERMINAL       --> process
    keyboard \                    /
              --> input queue --- (stdin)

This is all a terminal is. It has a "master" (interactive) and a "slave" (process) side, and two queues (one for input and one for output). The master side can have some additional functionality, such as sending signals to the slave. Like a pipe, both sides can write to the file: output on one side becomes input on the other. In particular, the master side writes keyboard input to the input buffer (queue) and reads from the output buffer (queue) to the screen; the slave side reads from the input buffer (stdin) and writes to the output buffer (stdout).
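To make the two-queue picture concrete, here is a minimal sketch in Python. The Terminal class and its method names are made up for illustration; a real tty also involves a driver and a line discipline, which come later in this post.

```python
from collections import deque


class Terminal:
    """Toy model: two byte queues shared by a master and a slave side."""

    def __init__(self):
        self.input_queue = deque()   # master -> slave (keyboard to stdin)
        self.output_queue = deque()  # slave -> master (stdout to screen)

    # Master (interactive) side
    def master_write(self, data: bytes) -> None:
        self.input_queue.extend(data)

    def master_read(self) -> bytes:
        data = bytes(self.output_queue)
        self.output_queue.clear()
        return data

    # Slave (process) side
    def slave_read(self) -> bytes:
        data = bytes(self.input_queue)
        self.input_queue.clear()
        return data

    def slave_write(self, data: bytes) -> None:
        self.output_queue.extend(data)


term = Terminal()
term.master_write(b"ls\n")        # the user types a command
command = term.slave_read()       # the process reads it from stdin
term.slave_write(b"file.txt\n")   # the process prints to stdout
screen = term.master_read()       # the screen shows the output
```

Like a pipe, each side only ever sees what the other side wrote.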

In the old days, there were physical terminals (teletype machines, or tty), which were machines used to input data to and display output from mainframes, and the upper and lower lines were two pairs of physical wires (RX/TX). Modern terminals only exist as a software abstraction, but the name tty persists.


Virtualization

The picture becomes muddier if we consider more complex usage scenarios. This usually involves virtualizing (multiplexing) one of the components in the previous simplified scenario.

Multiple serial input devices?

Introduce multiple terminal devices. Each terminal is implemented in Linux as a device file (under /dev), and processes can interface with any of these files as if they were regular files. For example, the standard keyboard input may live at /dev/tty*, serial ports at /dev/ttyS*, USB serial adapters at /dev/ttyUSB*, etc.

Multiple process sessions for one screen?

Multiplex (virtualize) the screen over multiple terminals. The virtual terminal (vt) driver does this, multiplexing the screen over seven different terminals (/dev/tty[1-7]), each running its own session (shell). The keyboard is only "connected" to the active terminal session.
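As a rough sketch of this multiplexing (the class and method names are invented for illustration and do not mirror the kernel's vt code), each virtual terminal keeps its own buffers while the keyboard and screen follow whichever terminal is active:

```python
class VirtualTerminals:
    """Toy vt multiplexer: one keyboard and screen, several terminals."""

    def __init__(self, count: int = 7):
        # One input and one output buffer per terminal, numbered from 1.
        self.inputs = {n: bytearray() for n in range(1, count + 1)}
        self.outputs = {n: bytearray() for n in range(1, count + 1)}
        self.active = 1

    def switch_to(self, number: int) -> None:
        if number not in self.inputs:
            raise ValueError(f"no such terminal: {number}")
        self.active = number

    def keyboard_input(self, data: bytes) -> None:
        # The keyboard is only "connected" to the active terminal.
        self.inputs[self.active] += data

    def process_output(self, number: int, data: bytes) -> None:
        # Any session may keep writing, even while it is in the background.
        self.outputs[number] += data

    def screen(self) -> bytes:
        # Only the active terminal's output is shown.
        return bytes(self.outputs[self.active])


vts = VirtualTerminals()
vts.keyboard_input(b"ls\n")            # goes to terminal 1
vts.process_output(2, b"job done\n")   # terminal 2 writes in the background
vts.switch_to(2)                       # e.g., the user presses Alt+F2
vts.keyboard_input(b"top\n")           # now goes to terminal 2
```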

Multiple processes or multiple devices attached to one terminal?

Multiplex the master and/or the slave side of a terminal. The GNU screen utility is the prototypical example of this, turning the single-channel semantics into a broadcast semantics when multiple processes connect to the same terminal device. In other words, input from terminal device(s) should show up in the input buffers for all slave process(es), and output from all slave process(es) should show up on the output buffer for all terminal device(s).

Emulated (non-physical) master device?

So far, the master side input comes directly from the keyboard (or some other physical serial device) and the output goes directly to the text-mode display. However, it may be useful to have master-side I/O not be connected to a physical device; it will be emulated by software instead. This is called a pseudoterminal, and is provided by the pty driver.

The standard use case for this is to create a terminal emulator (e.g., xterm) in a graphical environment such as X.org. A graphical application in X.org does not have direct access to physical serial devices (i.e., it cannot add a hook into the driver for the physical device). However, it does have indirect access to event streams through the Linux event subsystem (e.g., at /dev/input/event* via the evdev input event driver), and X.org will also send input events to the foreground application.

The way a terminal emulator works is by creating a master/slave terminal device pair by opening the special file /dev/ptmx. Then, it sends keyboard events to the master-side input buffer, and characters read from the master-side output buffer are displayed in the graphical application window.
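The following sketch shows the same flow from user space, assuming a Linux system with a mounted devpts filesystem. It uses Python's os.openpty(), which asks the C library for a master/slave pair (on Linux this typically goes through /dev/ptmx). A real terminal emulator would also fork a shell attached to the slave side, which is omitted here.

```python
import os

# Allocate a pseudoterminal pair; both values are file descriptors.
master_fd, slave_fd = os.openpty()
slave_name = os.ttyname(slave_fd)   # a device path, e.g. under /dev/pts/

# A process attached to the slave side writes to its "stdout"...
os.write(slave_fd, b"hello from the slave\n")

# ...and the terminal emulator reads it from the master side, where it
# would then be drawn in the application window.
received = os.read(master_fd, 1024)

os.close(slave_fd)
os.close(master_fd)
```

With the default terminal settings, the bytes read on the master side may differ slightly from what was written (for instance, the newline may arrive as a carriage return plus line feed). That translation is the line discipline at work, which is introduced later in this post.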


Terminology

The words "terminal" and "console" are commonly (ab)used colloquially to mean most, if not all, of the below terms. Since the concept of a terminal or console has changed over time, these terms are somewhat fluid; however, I try to capture my modern interpretation of the terms below.


Overall architecture

The vt stack

The text-mode physical terminals (typically /dev/tty[1-6] on Linux) use the vt driver. Keyboard events are sent directly to the vt driver, which forwards them to the input buffer of the selected physical terminal. Data written to the output buffer is displayed on the screen using the text-mode console driver. Processes can read from and write to the slave side of the terminal device using the /dev/tty* file interface.

Text mode/physical terminal

+---------------+  +-----------------+
| /usr/bin/bash |  | print to screen |
| process       |  | console driver  |
+---------------+  +-----------------+
              ^             ^
 stdin/stdout |             | terminal output buffer
              |             |
+-------------v-------------|---------+
| +--------------+  +---------------+ |
| | /dev/tty5    |  | /dev/tty5     | |
| | slave driver |  | master driver | |
| +--------------+  +---------------+ |
|             vt driver        ^      |
+------------------------------|------+
                               |
               keyboard events |
                               |
                       +-----------------+
                       | /dev/event*     |
                       | input subsystem |
                       +-----------------+
                               ^
                     scancodes |
                               |
                       +---------------+
                       | Keyboard IRQ  |
                       +---------------+

The pty stack

Terminal emulators use the pty driver to obtain a terminal device (through the /dev/ptmx interface). The terminal emulator application receives keyboard events when it is the foreground application, and forwards these to the master-side input buffer. Data written to the output buffer are displayed in the terminal emulator window. Processes can read from and write to the slave side of the terminal device using the /dev/pts/* file interface.

Graphical mode/pseudoterminal

+---------------+           +--------------------+
| /usr/bin/bash |           | /usr/bin/xterm     |
| process       |           | terminal emulator  |
+---------------+           +--------------------+
              ^             ^  |                 ^
 stdin/stdout |    terminal |  | keyboard events |
              |     out buf |  |                 |
+-------------v-------------|--v------+  +-------------+
| +--------------+  +---------------+ |  | X.org event |
| | /dev/pts/3   |  | /dev/ptmx     | |  | layer       |
| | slave driver |  | master driver | |  +-------------+
| +--------------+  +---------------+ |    ^
|             pty driver              |    | keyboard events
+-------------------------------------+    |
                                         +-----------------+
                                         | /dev/event*     |
                                         | input subsystem |
                                         +-----------------+
                                           ^
                                           | scancodes
                                           |
                                         +---------------+
                                         | Keyboard IRQ  |
                                         +---------------+

A unified architecture and the introduction of ldisc

In the above diagrams for the vt and pty stacks, we envision the terminal layer as a driver. A driver's purpose is to abstract away hardware to a uniform software interface, so this would be appropriate if a terminal were simply a bidirectional serial channel with no extra semantics associated with it. This is how we've described a terminal so far.

However, we've been omitting the fact that the terminal does have special semantics associated with it. This functionality gives the terminal the behavior we're all used to today, such as echoing, line editing (canonical mode), and signal handling (e.g., Ctrl+C sends SIGINT). Collectively, we call these the line discipline, or ldisc for short.

Thus, we now envision the terminal layer as two separate entities: the terminal driver (e.g., vt or pty), which provides a uniform interface to the master-side serial hardware input and output devices; and the tty core layer or line discipline, which provides a uniform interface for software readers and writers of the terminal. This provides a clear separation of mechanism and policy. The terminal driver provides a uniform interface to forward data from a serial input device to the line discipline, and write output to an output device (whether this be rendering to a text console or writing to a serial device). The line discipline interfaces between the terminal driver and the software slave. This is how an updated architecture might look for the vt driver.

+-----------+
| /bin/bash |
| shell     |
+-----------+
  ^
  | stdin/stdout
  v
+------------------------+
| /dev/tty5              |
| character device layer |
| in the filesystem      |
+------------------------+
  ^
  |
+-v-------------------+ slave side
| +-----------------+ |
| | n_tty           | |
| | line discipline | |
| +-----------------+ |
|    ^                |
|    |                | tty layer
|    v                |
| +-----------------+ |
| | vt              | |
| | terminal driver | |
| +-----------------+ |
+-^----------------\--+ master side
  |                 \
  | keyboard events  \ terminal output
  |                   v
+-----------------+  +----------------+
| keyboard driver |  | console driver |
+-----------------+  +----------------+

For the pseudoterminal case, we replace the vt driver with the pty driver, and the master "hardware" devices are actually software sources.
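To give a feel for the policy the line discipline adds on top of a plain byte channel, here is a toy canonical-mode discipline in Python. It is only a sketch under my own assumptions: the class, its callbacks, and the exact echo behavior are invented for illustration and are much simpler than Linux's n_tty.

```python
import signal


class LineDiscipline:
    """Toy canonical-mode discipline between a driver and a process."""

    def __init__(self, driver_write, send_signal):
        self.driver_write = driver_write  # called to echo bytes to the output
        self.send_signal = send_signal    # called with a signal number
        self.line = bytearray()           # the line currently being edited
        self.ready = []                   # completed lines for the reader

    def receive(self, data: bytes) -> None:
        for byte in data:
            if byte == 0x03:                      # ^C: interrupt
                self.line.clear()
                self.send_signal(signal.SIGINT)
            elif byte in (0x08, 0x7F):            # backspace / DEL: erase
                if self.line:
                    self.line.pop()
                    self.driver_write(b"\b \b")   # rub out on the screen
            elif byte in (0x0A, 0x0D):            # end of line
                self.line.append(0x0A)
                self.driver_write(b"\r\n")
                self.ready.append(bytes(self.line))
                self.line.clear()
            else:
                self.line.append(byte)
                self.driver_write(bytes([byte]))  # echo

    def read(self) -> bytes:
        # The slave only ever sees complete lines.
        return self.ready.pop(0) if self.ready else b""


echoed, signals = [], []
ldisc = LineDiscipline(echoed.append, signals.append)
ldisc.receive(b"lx\x7fs\n")   # type "lx", erase the "x", type "s", Enter
first_line = ldisc.read()
ldisc.receive(b"sleep\x03")   # ^C discards the partial line
```

The driver side (driver_write) stays a dumb byte sink, while all editing, echo, and signal decisions live in the discipline. That split is the separation of mechanism and policy described above.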

I explore the line discipline in part 2 of this blog post.


Terminal driver design

For this section, we consider the case of a stereotypical vt-like terminal driver, which reads input from the keyboard and renders output to a screen. Sample code for the vt driver can be found in drivers/tty/vt/*.c.

Terminal driver interface

The terminal driver interface comprises one major interface: a write() function called by the line discipline to write output to the device. This is defined by the tty_operations data structure in include/linux/tty_driver.h.

Notably, we do not need to provide a read() interface; instead, any event generated by the input device should be passed to the line discipline's input-receiving interface, which I'll call receive_input() (the corresponding operation in Linux's tty_ldisc_ops is receive_buf()). In Linux, the tty core will manage buffering through the use of the flip buffer data structure, so the driver need not concern itself with this.
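The shape of this interface can be sketched as follows. The class names and the on_input() hook are hypothetical, and this is not the layout of Linux's tty_operations; the point is only that the driver exposes write() and pushes input upward rather than being read from.

```python
class ConsoleDriver:
    """Hypothetical vt-like driver: exposes write(), has no read()."""

    def __init__(self):
        self.ldisc = None
        self.screen = bytearray()   # stands in for the display

    def attach(self, ldisc) -> None:
        self.ldisc = ldisc

    def write(self, data: bytes) -> int:
        """Called by the line discipline to output bytes."""
        self.screen += data
        return len(data)

    def on_input(self, data: bytes) -> None:
        """Called from the input path; pushes bytes up to the ldisc."""
        self.ldisc.receive_input(data)


class EchoDiscipline:
    """Hypothetical pass-through discipline that only echoes."""

    def __init__(self, driver):
        self.driver = driver
        self.pending = bytearray()  # bytes waiting for the slave to read

    def receive_input(self, data: bytes) -> None:
        self.pending += data
        self.driver.write(data)     # echo back to the display


driver = ConsoleDriver()
ldisc = EchoDiscipline(driver)
driver.attach(ldisc)
driver.on_input(b"hi")              # e.g., triggered by a keyboard event
```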

Keyboard input driver

The goal of the keyboard driver and input subsystem is to provide a sequence of keyboard events for applications to use. Each keyboard event may contain information such as a timestamp, the type of event (keydown, keypress (repeat), or keyup), the keycode, the state of keyboard modifiers and toggle keys (e.g., Shift, Ctrl, CapsLock, etc.), and the ASCII value.

For the purposes of the terminal, we only care about keydown and keypress events; keyup events are discarded.

The terminal also only cares about the sequence of ASCII values generated by the keys. Most keyboard events will have an ASCII value associated with them. Some examples (assuming a QWERTY keyboard layout):

The keyboard driver or input subsystem will use the modifier keys to generate an appropriate ASCII value. Sometimes, keys will not be associated with an ASCII value, such as the left cursor arrow key or the Insert key; these may generate multi-byte escape sequences (beginning with the escape byte ^[). If only modifier keys are pressed, no ASCII sequence is generated.

A terminal driver such as vt or pty that takes a stream of keyboard events must transform it into a stream of ASCII characters to send to the line discipline.
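A minimal sketch of this translation step is below. The KeyEvent type, the key names, and the translate() function are assumptions made for illustration. The escape sequences are the ones commonly sent for these keys, and Shift is only handled for letters to keep the example short.

```python
from dataclasses import dataclass

# Hypothetical key names mapped to the escape sequences they commonly send.
ESCAPE_SEQUENCES = {
    "left": b"\x1b[D", "right": b"\x1b[C",
    "up": b"\x1b[A", "down": b"\x1b[B", "insert": b"\x1b[2~",
}


@dataclass
class KeyEvent:
    key: str            # "a", "1", "left", "shift", ...
    kind: str           # "keydown", "keypress" (repeat), or "keyup"
    shift: bool = False
    ctrl: bool = False


def translate(event: KeyEvent) -> bytes:
    """Return the bytes a key event contributes to the input stream."""
    if event.kind == "keyup":
        return b""                                # key releases are discarded
    if event.key in ESCAPE_SEQUENCES:
        return ESCAPE_SEQUENCES[event.key]
    if len(event.key) != 1:
        return b""                                # lone modifier keys
    char = event.key
    if event.ctrl and char.isalpha():
        return bytes([ord(char.upper()) & 0x1F])  # ^A is 0x01, ^C is 0x03
    if event.shift and char.isalpha():
        char = char.upper()
    return char.encode("ascii")


events = [
    KeyEvent("shift", "keydown"),
    KeyEvent("h", "keydown", shift=True),
    KeyEvent("h", "keyup"),
    KeyEvent("i", "keydown"),
    KeyEvent("c", "keydown", ctrl=True),
]
stream = b"".join(translate(event) for event in events)
```

Masking with 0x1F is what turns Ctrl plus a letter into the corresponding control character.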

Text console output driver

The console driver should be able to display text to the screen. In the simplest case, the BIOS graphics mode is set to VGA text mode, and text is rendered by writing to a character buffer at a fixed offset.

The console should probably manage a scrollback buffer used to back the video memory buffer, and implement scrolling once the cursor moves past the bottom of the screen. Additionally, scrolling may be implemented if the scrollback buffer is taller than the screen height.

Most characters will be printable characters in the ASCII range 0x20-0x7E. However, some characters are control characters: notably, ^J (line feed) moves the cursor down one row; ^M (carriage return) moves the cursor to the beginning of the line; and ^H (backspace) moves the cursor to the left, unless it is already at the left edge of the screen. How other nonprintable characters and non-ASCII characters (bytes with the high bit set) should be printed is up to the console.
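Here is a small sketch of such a console, with a Python list of rows standing in for the VGA character buffer. The class is invented for illustration, handles only the three control characters mentioned above, and drops lines that scroll off the top instead of keeping a scrollback buffer.

```python
class TextConsole:
    """Toy text console: a rows x cols grid of characters plus a cursor."""

    def __init__(self, rows: int = 25, cols: int = 80):
        self.rows, self.cols = rows, cols
        self.grid = [[" "] * cols for _ in range(rows)]
        self.row = self.col = 0

    def _line_feed(self) -> None:
        if self.row + 1 < self.rows:
            self.row += 1
        else:                                   # at the bottom: scroll up
            self.grid.pop(0)
            self.grid.append([" "] * self.cols)

    def write(self, data: bytes) -> None:
        for byte in data:
            if byte == 0x0A:                    # ^J: down one row
                self._line_feed()
            elif byte == 0x0D:                  # ^M: back to column 0
                self.col = 0
            elif byte == 0x08:                  # ^H: left, stop at the edge
                self.col = max(0, self.col - 1)
            elif 0x20 <= byte <= 0x7E:          # printable ASCII
                if self.col == self.cols:       # wrap at the right edge
                    self.col = 0
                    self._line_feed()
                self.grid[self.row][self.col] = chr(byte)
                self.col += 1
            # Other bytes are ignored in this sketch.

    def lines(self) -> list:
        return ["".join(row).rstrip() for row in self.grid]


console = TextConsole(rows=2, cols=10)
console.write(b"one\r\ntwo\r\nthrex\be")   # third line scrolls "one" away
```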

A more advanced console driver should be able to handle additional drawing features and perform more efficient rendering. In particular, most terminals support the VT100 (ANSI) escape codes, which are multi-byte sequences that begin with the escape character ^[ that perform actions such as changing the text or background color. I don't know much about efficient rendering, but I imagine it means to defer work as necessary (as I imagine rendering can be an expensive operation, even for text).
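As an illustration of what handling these escape codes involves, the sketch below splits an output stream into runs of text tagged with a foreground color. It understands only the reset code (0) and the eight basic foreground colors (30-37) of the "m" (select graphic rendition) sequence; all other sequences are recognized and skipped. The parse() function and its return format are my own invention.

```python
import re

# CSI sequence: ESC [ parameters final-byte; a final "m" changes attributes.
CSI = re.compile(rb"\x1b\[([0-9;]*)([@-~])")
COLORS = ["black", "red", "green", "yellow",
          "blue", "magenta", "cyan", "white"]


def parse(data: bytes) -> list:
    """Split output into (text, foreground color) runs."""
    runs, color, pos = [], None, 0
    for match in CSI.finditer(data):
        if match.start() > pos:
            runs.append((data[pos:match.start()].decode("ascii"), color))
        pos = match.end()
        if match.group(2) != b"m":
            continue                      # only color changes are handled
        for param in match.group(1).split(b";"):
            code = int(param or b"0")     # an empty parameter means 0
            if code == 0:
                color = None              # reset to the default color
            elif 30 <= code <= 37:
                color = COLORS[code - 30]
    if pos < len(data):
        runs.append((data[pos:].decode("ascii"), color))
    return runs


runs = parse(b"plain \x1b[31mred\x1b[0m done")
```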


In the Linux kernel

Use cases of tty

The tty interface can be used whenever you have a situation similar to the motivating scenario. We've already covered the classic vt and pty driver examples. Other prototypical examples include the ssh and telnet programs, which are interactive sessions to some external computer. In this case, the slave process is not a local process, but rather a shell process (ssh) or some process writing to a port (telnet) on a remote machine.

Source code

Resources


Footnotes

1. I originally wrote this in Markdown, and translated it to Pugjs (the templating system I use for my website) using this tool. Unfortunately, it introduced many errors that I had to hand-correct, so excuse any formatting mistakes here.

2. I used the Emacs notation here, where ^X indicates Ctrl+X.


© Copyright 2023 Jonathan Lam