Core Developer @ Hudson River Trading
On 5/4/2023, 5:47:27 PM
This is the second of a three-part series describing the tty subsystem. See the first part describing the subsystem and giving an overview of its architecture here. See the third part describing two relevant buffering data structures here.
Initially, I had thought that these "extra terminal behaviors" were unnecessary and not critical to my understanding and implementation of the terminal in Linux. I was very wrong; not only does line discipline make up many useful and familiar behaviors that create the look and feel of a terminal, but keeping it separate from the tty driver helps preserve the distinction between policy and mechanism in the kernel.
While a tty device is mostly a "dumb" device, acting as a bidirectional channel between the keyboard/console (master) and a process (slave), a number of useful special semantics have evolved over the years to suit the asymmetric, interactive terminal interface. Some well-known examples include:
Overall, this behavior is called the line discipline, and it describes the behavior ("policy") of the terminal device. Recall from the earlier blog post that the other major component of the tty/terminal subsystem is the terminal driver, which provides an interface to the input and output serial hardware devices ("mechanism"). All operations on a terminal device go through the line discipline interface.
The line discipline interface comprises three functions for normal operation:
receive_buf(): called by the terminal driver to send input to the line discipline
read(): called by the slave to read from the line discipline (input) buffer
write(): called by the slave to write to the output buffer; should forward writes to the terminal driver
Additionally, the line discipline should be able to receive ioctls to change its behavior, i.e., through the termios interface described below. This interface is described by struct tty_ldisc_ops, which is defined in
The interactions between the line discipline and the tty driver are summarized in the following diagram.
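        +-----------------------+
        |        process        |
        +-----------------------+
   write() |                 | read()
        +-----------------------+
        |         ldisc         |
        +-----------------------+
   write() |                 | receive_buf()
        +-----------------------+
        |      tty driver       |
        +-----------------------+
        +---------+   +---------+
        | input   |   | output  |
        | serial  |   | serial  |
        | device  |   | device  |
        +---------+   +---------+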
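To make the three entry points concrete, here is a toy model of the data flow in the diagram. It is only a sketch: the class name ToyLineDiscipline and the driver_write hook are hypothetical, and the real interface lives in the kernel's struct tty_ldisc_ops, not in Python.

```python
from collections import deque


class ToyLineDiscipline:
    """Toy model of the three ldisc entry points (not kernel code)."""

    def __init__(self, driver_write):
        self.input_buffer = deque()       # bytes waiting for the process
        self.driver_write = driver_write  # hypothetical hook into the tty driver

    def receive_buf(self, data: bytes) -> None:
        # Called by the "driver" when input arrives from the device.
        self.input_buffer.extend(data)

    def read(self, nbytes: int) -> bytes:
        # Called by the "process" to consume buffered input.
        count = min(nbytes, len(self.input_buffer))
        return bytes(self.input_buffer.popleft() for _ in range(count))

    def write(self, data: bytes) -> int:
        # Called by the "process"; forwarded to the driver for output.
        self.driver_write(data)
        return len(data)


sent_to_device = []
ldisc = ToyLineDiscipline(driver_write=sent_to_device.append)
ldisc.receive_buf(b"ls\n")   # driver -> ldisc
typed = ldisc.read(16)       # ldisc -> process
ldisc.write(b"file.txt\n")   # process -> ldisc -> driver
print(typed, sent_to_device)
```

This model passes bytes through untouched, which is roughly what raw mode does; the policy described in the rest of this post (line editing, echoing, signals) would sit inside receive_buf() and write().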
Before going into specifics about line discipline behavior, we should review special ASCII characters, notation, and common keys.
In ASCII, there are 128 characters. Characters 0-31 are special, non-printable control characters. Characters 32-126 are printable, and character 127 (DEL) is another control character. Any character with the high bit set (characters 128-255; historically the parity bit) is not valid ASCII and may be handled normally or filtered out using terminal settings (e.g.,
In this section we focus on the low 32 characters. Each of these characters can be entered with a control sequence; for example, ASCII 0x01 can be entered using Ctrl+A; we denote this using Emacs notation as ^A. Some keys on the keyboard are mapped to special keys, such as Enter being mapped to ^M. Important special characters are summarized in the below table; a more comprehensive table can be found here.
|Name |Key
|line feed (LF) |^J
|carriage return (CR) |^M or Enter
|escape (ESC) |^[ or Esc
|delete (DEL) |^? or Bksp
The ^H, ^J, and ^M characters are understood by a terminal console driver; they are commands to move the cursor left, down, and to the beginning of a line, respectively. The ^J character signals the end of a line in canonical mode. The ^? character is used to delete backwards in canonical mode. The ^C character is used to send the SIGINT signal when isig is enabled.
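The mapping between control keys and ASCII codes is a single bit flip, which a short sketch can make explicit. The helper names caret_notation and ctrl are made up for this illustration.

```python
def caret_notation(byte: int) -> str:
    """Render one ASCII code the way the text writes it (e.g. 0x01 -> ^A)."""
    if byte < 0x20 or byte == 0x7F:
        # Flipping bit 6 maps 0x00-0x1F onto '@'..'_' and 0x7F onto '?'.
        return "^" + chr(byte ^ 0x40)
    return chr(byte)


def ctrl(key: str) -> int:
    """Return the code produced by Ctrl+<key> for keys in '@'..'_' and '?'."""
    return ord(key.upper()) ^ 0x40


for code in (0x01, 0x03, 0x08, 0x0A, 0x0D, 0x1B, 0x7F):
    print(f"0x{code:02X} -> {caret_notation(code)}")
```

For example, ctrl("C") gives 0x03 and caret_notation(0x0D) gives "^M", matching the notation used throughout this post.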
There may be some confusion around Enter (which produces a carriage return rather than a newline character) and Bksp (which produces a delete character rather than a backspace character). I believe this is mostly for historical reasons, but I am not too sure. The mixup between ^M (produced by Enter) and ^J (universally understood by Linux to mean end-of-line) is frequent enough that a common terminal setting exists to convert ^M to ^J called
In canonical mode (a.k.a., cooked mode), special characters may be used to provide editing within a line. Usually these are the erase (default ^? or Bksp) and kill (default ^U) keys, which erase the last character and the whole line, respectively.
Since you can edit a line, a read operation on a terminal in canonical mode will not complete until the end of line is reached (^J is sent). Similarly, no more than one line will be sent for any read command, no matter how many bytes are requested.
The opposite of cooked mode is called raw mode. In raw mode, reads return as soon as there is data (possibly throttled for performance), and the erase and kill characters have no special meaning.
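Canonical-mode editing can be observed without a real keyboard. The sketch below assumes Linux and uses a pseudo-terminal pair from Python's pty module as a stand-in: writing to the master end plays the role of typing, and reading from the slave end is what a program would see. The erase and kill characters are set explicitly to the defaults named above.

```python
import os
import pty
import termios

# Assumption: a Linux pseudo-terminal stands in for a real terminal here.
# The master end plays the keyboard; the slave end is what a program reads.
master_fd, slave_fd = pty.openpty()

# Make the relevant settings explicit instead of relying on defaults.
attrs = termios.tcgetattr(slave_fd)
attrs[3] |= termios.ICANON                   # lflag: canonical (cooked) mode
attrs[3] &= ~termios.ECHO                    # keep the example quiet
attrs[6][termios.VERASE] = b"\x7f"           # erase = ^?
attrs[6][termios.VKILL] = b"\x15"            # kill  = ^U
termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

# "Type" two lines: a typo fixed with erase, and a line wiped with kill.
os.write(master_fd, b"helo\x7flo\n")         # h e l o <erase> l o Enter
os.write(master_fd, b"oops\x15second\n")     # o o p s <kill> second Enter

first = os.read(slave_fd, 100)               # at most one line per read
second = os.read(slave_fd, 100)
print(first, second)

os.close(master_fd)
os.close(slave_fd)
```

Even though each read asks for 100 bytes, it returns a single edited line, and the erase and kill characters themselves never reach the reader.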
The line discipline uses a 4 KB ring buffer (by default) to manage data. In canonical mode, data is not sent to the application until a line feed (^J) character is written to the input buffer.
One aspect of this behavior is that when a program reads input from a cooked-mode terminal, the read call doesn't finish until the LF character is sent. A call to getchar() in libc would not return as soon as a character is entered, unless that character was a line feed; instead, it would read the entire line and return the first byte of the terminal buffer.
Having a fixed-size line editing buffer also means that extra characters are discarded once the buffer is exhausted. If the input buffer is full, further input is still processed for signals, echoing, and so on, but the characters themselves are not stored and are lost. termios(3) documents this behavior. We can observe this by entering more than 4096 characters of input [1] for a program reading from stdin in cooked mode, and checking how many characters are actually received.
Note that this buffer overflow can also happen in raw mode if the buffer is not emptied quickly enough.
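The overflow experiment can be automated rather than typed or pasted. This is a sketch under the assumption of a Linux pty with the default line discipline; the 5000-byte payload is an arbitrary size chosen to exceed the buffer.

```python
import os
import pty
import termios

# Assumption: Linux pty with the default N_TTY line discipline.
master_fd, slave_fd = pty.openpty()

attrs = termios.tcgetattr(slave_fd)
attrs[3] |= termios.ICANON        # cooked mode, so input is held per line
attrs[3] &= ~termios.ECHO         # no echo, so nothing piles up on the master
termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

# "Type" 5000 characters and only then press Enter.
payload = b"x" * 5000 + b"\n"
sent = 0
while sent < len(payload):
    sent += os.write(master_fd, payload[sent:])

# Collect the single line that the line discipline kept.
line = b""
while not line.endswith(b"\n"):
    line += os.read(slave_fd, 8192)

print(f"typed {len(payload)} bytes, read back {len(line)} bytes")

os.close(master_fd)
os.close(slave_fd)
```

If the behavior documented in termios(3) holds on your system, the line read back should be capped at the buffer size, newline included, with the excess characters silently dropped.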
Usually, when interacting with a terminal, we are able to see each character that we type. This is called echoing; it works by "echoing" (copying) each byte from the input buffer to the output buffer, so that it gets displayed.
For printable characters, echoing does exactly what we expect. What happens for non-printable (control) characters?
Control characters are echoed in Emacs notation (e.g., "^@" for Ctrl+2): they are escaped before being copied to the output buffer. The slave side still receives the unescaped characters, and control characters that a program writes to the output buffer are not escaped automatically.
Some control keys will be handled specially in cooked mode and thus not be printed, such as ^?.
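The difference between what is echoed and what the program receives can be seen side by side. The sketch below assumes a Linux pty, where the master end sees what a console would display; the flags are set explicitly so the result does not depend on system defaults.

```python
import os
import pty
import select
import termios

# Assumption: Linux pty; the master end sees what a console would display.
master_fd, slave_fd = pty.openpty()

attrs = termios.tcgetattr(slave_fd)
attrs[1] |= termios.OPOST | termios.ONLCR      # oflag: LF is shown as CR LF
attrs[3] |= termios.ICANON | termios.ECHO | termios.ECHOCTL   # lflag
termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

os.write(master_fd, b"hi\x01\n")               # h, i, Ctrl+A, Enter

# What the program receives: the raw, unescaped bytes.
received = os.read(slave_fd, 100)

# What gets displayed: the echo, with ^A escaped as two printable characters.
echoed = b""
while select.select([master_fd], [], [], 0.5)[0]:
    echoed += os.read(master_fd, 100)

print("program received:", received)
print("console displayed:", echoed)

os.close(master_fd)
os.close(slave_fd)
```

The program sees the single byte 0x01, while the display side gets the two characters "^" and "A". The trailing newline is echoed as CR LF because output post-processing is enabled in this sketch.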
Echoing can also be disabled (e.g., when entering passwords) using the
Terminal devices in Linux can be configured using the termios C interface. This interface exposes the
tcsetattr() functions to fetch and set the terminal configuration via ioctl()s, respectively. The terminal configuration exists as a set of flags that define the terminal behavior; some sample flags from the termios interface are shown below:
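Python's termios module is a thin wrapper over the same C calls, so it is a convenient way to sketch the fetch-modify-set pattern. The helpers set_lflag and has_lflag are hypothetical names for this example, and a pty is used as the target so the snippet does not disturb your own terminal.

```python
import os
import pty
import termios

# Indices into the list returned by termios.tcgetattr().
IFLAG, OFLAG, CFLAG, LFLAG, ISPEED, OSPEED, CC = range(7)


def set_lflag(fd: int, flag: int, enabled: bool) -> None:
    """Set or clear one local-mode flag (e.g. ECHO, ICANON, ISIG) on fd."""
    attrs = termios.tcgetattr(fd)                   # fetch current settings
    if enabled:
        attrs[LFLAG] |= flag
    else:
        attrs[LFLAG] &= ~flag
    termios.tcsetattr(fd, termios.TCSANOW, attrs)   # apply immediately


def has_lflag(fd: int, flag: int) -> bool:
    """Report whether a local-mode flag is currently set on fd."""
    return bool(termios.tcgetattr(fd)[LFLAG] & flag)


# Assumption: a pty stands in for the terminal; sys.stdin.fileno() would
# be the usual target in an interactive program.
master_fd, slave_fd = pty.openpty()

set_lflag(slave_fd, termios.ECHO, False)
echo_after_disable = has_lflag(slave_fd, termios.ECHO)

set_lflag(slave_fd, termios.ECHO, True)
echo_after_enable = has_lflag(slave_fd, termios.ECHO)

print(echo_after_disable, echo_after_enable)

os.close(master_fd)
os.close(slave_fd)
```

The settings always travel as a whole: a program fetches the full structure, changes the bits it cares about, and writes the structure back.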
If you are writing a C program that manages terminal properties (e.g., a program like bash), then using the termios C interface directly is fine. However, the stty shell utility is a convenient way to change terminal properties on demand. For example, we can enable echoing using stty echo, disable echoing using stty -echo, enable raw mode using stty raw, and enable cooked mode using stty cooked. There are many more options available to match much of the termios interface; see the manpages for
sh rather than
If you try to experiment with terminal features on your own using the stty shell command, you may get unexpected results if you use the bash shell. At least, they will be unexpected if you don't understand what bash does under the hood (as I didn't); most of the time, the good ol' Bourne shell sh will give the expected result.
To give a simple illustration, try entering the following experiments in
sh. The following experiments are all done in raw mode, by first entering stty raw Enter into the shell [2] [3].
cat -A Enter ^C
cat -A ^J ^C
Here are my results:
$ cat -A
$ cat -A^M^Cabc^?
$ cat -A
$ cat -A^J^C^Caabbcc^?^?
Phew! There's a lot of nuance here. Before going through each example, it'll be easier if I provide the overall reason for the differences upfront:
bash changes the terminal settings when prompting the user for a command. That is, it provides nice line-editing features via user-level software, and not via the terminal itself. However, before exec-ing a program (e.g., cat), it restores the terminal settings. sh doesn't provide any custom line-editing semantics in the prompt, so we see the truer terminal behavior. To summarize, bash overrides the terminal settings in the command prompt, while sh doesn't [4]; however, both share the same behavior within a program executed by the shell.
Another thing we need to look into is exactly what stty raw does, since it changes a number of terminal flags. Looking at the manpage for stty(1), we see that the raw option is shorthand for:
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -icanon -opost -isig -iuclc -ixany -imaxbel -xcase min 1 time 0
That's a handful, but the main options we care about are:
Setting just these options rather than stty raw should provide (almost) identical output. That should be enough to go through these examples.
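As a rough programmatic counterpart, the sketch below clears only the four flags that the walkthrough in this section refers to (-icrnl, -opost, -icanon, -isig) plus min 1 time 0. Treating these as the relevant subset is an assumption on my part, and a Linux pty again stands in for the terminal.

```python
import os
import pty
import termios

# Assumption: these four flags are the subset the walkthrough relies on;
# the names below are the termios constants behind the stty option names.
master_fd, slave_fd = pty.openpty()

attrs = termios.tcgetattr(slave_fd)
attrs[0] &= ~termios.ICRNL        # -icrnl: do not turn ^M into ^J on input
attrs[1] &= ~termios.OPOST        # -opost: no output post-processing
attrs[3] &= ~termios.ICANON       # -icanon: no line buffering or editing
attrs[3] &= ~termios.ISIG         # -isig: ^C is just a byte, not a signal
attrs[6][termios.VMIN] = 1        # min 1: a read returns once 1 byte arrives
attrs[6][termios.VTIME] = 0       # time 0: no inter-byte timer
termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

os.write(master_fd, b"a\r\x03\x7f")   # a, Enter (^M), ^C, Bksp (^?)

# No newline was sent, yet the bytes are readable, and none were rewritten.
received = b""
while len(received) < 4:
    received += os.read(slave_fd, 100)

print(received)

os.close(master_fd)
os.close(slave_fd)
```

With these settings, ^M, ^C, and ^? all arrive as ordinary bytes, which is the baseline to keep in mind when reading the shell experiments.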
In the bash example, ^C still sends the SIGINT signal despite the -isig flag [5]. A new prompt is displayed, and the command whoami is entered and executed by pressing Enter, which actually sends a ^M character that bash translates to ^J despite the -icrnl flag being set. Note that the following prompt is indented; this is due to the -opost flag that is also set by
In the sh equivalent, the story is much simpler. The ^C does not send a signal and is not treated specially. whoami is entered, followed by a ^M, which is also not treated specially. No command is executed because ^J is not sent.
This one is pretty clear. bash implements its own editing semantics. sh doesn't, and thus the ^? character that is sent when pressing Bksp is not treated as a special character.
Now, we introduce a subprocess spawned by the shell that will read from the terminal. cat -A echoes special characters using the Emacs caret notation.
In the bash version, after pressing Enter we execute the cat command, and it begins listening for input. Unlike bash, the cat command doesn't implement any line editing, so it simply receives each character from the terminal and echoes it out. Since the terminal is in raw mode, the terminal returns characters one-by-one rather than waiting for the end of the line, hence the repeated characters; it also does not handle ^C and Bksp specially.
In the sh version, we might expect the same, except for one caveat: the Enter key sends ^M, not ^J, so we do not actually execute the cat command. Recall that Enter doesn't send a newline character if icrnl is disabled.
This is almost the same as the previous version, except that we explicitly send ^J rather than ^M/Enter.
I apologize for going into this much depth in this section, but bash's behavior profoundly confused me at the beginning. My suggestion for messing around with terminal settings is to work in
cat, both of which will not implement line-editing behavior or change terminal settings.
include/linux/tty_ldisc.h: Defines critical data structures struct tty_ldisc and
drivers/tty/n_tty.c: The default ldisc implementation.
drivers/tty/tty_ldisc.c: ldisc utility functions and wrapper code.
termios(3): C API to configure terminal settings using
stty(1): Shell command to modify terminal settings; usually simpler than using
N_TTY_BUF_SIZE == 4096 is the default ldisc buffer size.
1. 4096 characters is a lot of typing... easier to generate a long text file and copy-paste it into stdin.
2. You can also try them in cooked mode without doing stty raw beforehand, although the results will probably be as expected. Understanding how the shell interacts in raw mode was the difficult part for me.
3. I am following the Emacs notation for control keys to avoid any ambiguity in the examples shown, as the Ctrl+C could look like a sequence of three characters rather than a keyboard combination.
4. If you want to see exactly what bash does, you can run any of the above examples under strace bash. Look for ioctls being sent to the terminal device used to set terminal settings before handling prompt input, and to reset terminal settings before executing a command. When reading the command prompt, bash enables raw mode and disables echoing for the prompt, and handles the raw-mode input directly.
5. Note that in the command prompt, ^C does not send any signal, since no program is currently being executed by the shell. Instead, it cancels the current prompt. In other words, shell programs set a SIGINT signal handler to cancel the current prompt. This is not relevant to the question at hand; I just found it interesting because it was not something I had thought about previously.
© Copyright 2023 Jonathan Lam