Linuxgraphy by Strabo
- Unofficial FMI class Discord server 22.01.2023 || 18:00 - 21:00
- In front of a tiny audience of other uni students
- Historical context
- Structure of UNIX-likes
- UNIX-style file system
- Linux from the inside
- Files and directories
- Common configuration files
- System management
- Software licensing
- What makes up a useful OS
- Init system
- Package manager
- Desktop manager
- War on distros
- Demo - setup from zero
1. Historical context
1945 - 1955
You directly loaded instructions into memory and let it execute your “code” (processor instructions). Nothing else ran on the machine.
This process was labor intensive: a qualified operator loaded your program, dumped the memory contents, removed any external media, reset the machine and loaded the next job.
1956 - 1959
Computers became faster, but a lot of time was spent on managing jobs.
resident monitors: very small programs, which always resided in memory, and monitored what the state of the current job was.
Jobs were loaded in series (batches). When the current job finished, memory would be dumped and the next one would automatically be loaded and started.
Peripherals (tapes, punch cards, …) were extremely slow.
Multiprogramming: while the current job is waiting for a peripheral, another job would be started.
Travel reservation system by American Airlines: where travel agents would search, price and book services. A computer system now had to support:
- User accounts with concurrent access
- Fast response time and ease of use
- Storage organization, also following the rise of disks
- Data communication links (often phone lines)
More and more businesses started using computers, thanks to minicomputers, so demand for OS software increased.
1964 - 1969
IBM’s System/360 line of computers, each with expansion capabilities, backwards compatibility, all under one instruction set and operating system.
Multics, an influential operating system that was designed for a General Electric mainframe. Some of its novel ideas include, but are not limited to:
- Time-sharing: sharing resources of a single machine among many users (via multiprogramming and multi-tasking)
- Single level store (aka virtual memory): Processes would write into contiguous memory, but in reality it is scattered and managed by the operating system.
- Protection rings: Allowing different levels of access to different resources (programs). In general, this also often requires a CPU protected mode.
- Hierarchical file system: Arbitrary directories with subdirectories, rather than just having files next to each other or having directories which could only contain files.
- Command processor (that is separated from the OS): More on that later
1969: Death of Multics and birth of UNIX
Ken Thompson image by National Inventors Hall of Fame
Ken Thompson and Dennis Ritchie in 1973
After Bell Labs pulled out of Multics, Thompson still had some desire to work on operating systems. The tools he made while rewriting one of his video games on a PDP-7 led to a whole operating system: UNIX.
Building on, expanding and improving many of Multics’ ideas, UNIX became the father of modern OSs.
2. Structure of UNIX-likes
- shell: A programmable replaceable command-line interpreter, with utilities for managing the whole system and support for pipelines and I/O redirection
- UNIX-style file system: Filesystem as a single rooted tree, objects (nodes) in the file system are inodes and an inode can be (at least) a regular file, a directory or a device. Permissions are set per user and group.
- kernel: Program that manages all communication and operations between the hardware and software
Command-line shells operate solely with text, usually with only ASCII characters. There are usually two terms used when talking about connecting to a computer (to the shell), those are “terminal” (“console”) and “teleprinter” (“teletype”).
Video displays became widely available in the late 1970s, so before that access was done with a “teleprinter”, a typewriter-printer combo.
Teletype Model 33 by Arnold Reinhold
Today we don’t usually use such specialised hardware, but the names persist with slightly different meanings.
Virtual Terminal (Console): on some modern UNIX-likes (like Linux), the kernel/OS provides special devices through which a console is implemented/simulated directly with the connected computer and display. It is sometimes called a “tty”.
- systems that don’t provide it usually expose another way to connect a console (now external hardware and/or software is required), like a serial port
Terminal Emulator: a special program that emulates a virtual terminal (within another display system)
Pseudoterminal (pty): a device that behaves, from the point of a user, like a terminal (emulator), but could do other additional things
- ssh, a program to remotely connect to a device’s shell, establishes a pseudoterminal. You use your (local) terminal to send and receive text to and from a remote terminal.
Most commands are regular programs, which can live in any folder listed in the variable “PATH”. Each command can be given parameters, separated by spaces, each (by convention) being either text values or options (switches).
Options usually start with a hyphen (-current-time) and often support abbreviations (-a -b -c to -abc).
Newer conventions also allow options to start with two hyphens (--current-time), though two hyphens on their own (--) indicate that the rest is a plain string.
More rarely you might see --OPTION=VALUE or just OPTION=VALUE.
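The single-hyphen conventions can be tried out with the POSIX getopts builtin. This is only a sketch: the function name parse_flags and the flags -a, -b and -c are made up for illustration, and getopts itself does not handle the two-hyphen long options.

```shell
#!/bin/sh
# Hypothetical helper: records which of the flags -a, -b, -c were given.
parse_flags() {
    seen=""
    OPTIND=1                      # reset so the function can be called repeatedly
    while getopts "abc" opt; do
        case "$opt" in
            a|b|c) seen="$seen$opt" ;;
            *)     return 1 ;;    # unknown option
        esac
    done
    shift $((OPTIND - 1))         # what is left are plain text values
    printf '%s\n' "flags=$seen rest=$*"
}

separate=$(parse_flags -a -b -c file.txt)   # options given one by one
combined=$(parse_flags -abc file.txt)       # the abbreviated form
literal=$(parse_flags -a -- -b)             # "--" ends option parsing, so -b is a value
```

Both the separate and the combined call should report the same flags, while the last call treats -b as ordinary text.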
UNIX shells always have at least these additional functionalities:
pipelines: being able to redirect the output from one command to another without limit
command1 [args...] | command2 [args...] | ...
I/O redirection: redirection of input and output, allowing insertion of input from a file or storage of output into a file:
command [args...] < filepath
command [args...] > filepath
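A small sketch of both features together, assuming only standard utilities (printf, sort, head, grep, mktemp). The file name fruit.txt and its contents are invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)                      # scratch directory for the example

# Output redirection: store three lines in a file (overwrites the file).
printf 'pear\napple\nfig\n' > "$workdir/fruit.txt"

# Input redirection: sort reads the file as its standard input.
# Pipeline: head then reads whatever sort wrote.
first=$(sort < "$workdir/fruit.txt" | head -n 1)

# A second pipeline: count the sorted lines that contain the letter p.
count=$(sort < "$workdir/fruit.txt" | grep -c 'p')

rm -r "$workdir"
```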
2.2. UNIX-style file system
- Rooted singular tree of directories (the root folder of which is called “root”, denoted with a “/”)
- Each tree object (node) is an i(ndex)node, containing type, ownership, access, internal data and any other metadata, but not name!
- Directory inodes (files) contain a table with rows of names and inode numbers, mapping a file name to its contents. Each row is called a hard link. Every directory has a hard link to itself, called “.”, and a hard link to its parent, called “..”.
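Hard links can be observed from the shell. A rough sketch, using a temporary directory and made-up file names (original, alias); ls -i prints the inode number next to each name.

```shell
#!/bin/sh
workdir=$(mktemp -d)
echo "hello" > "$workdir/original"

# Add a second name (hard link) for the same inode.
ln "$workdir/original" "$workdir/alias"

# ls -i prints the inode number in front of the name.
inode_a=$(ls -i "$workdir/original" | awk '{print $1}')
inode_b=$(ls -i "$workdir/alias" | awk '{print $1}')

# Removing one name leaves the data reachable through the other.
rm "$workdir/original"
content=$(cat "$workdir/alias")

# Every directory lists the entries "." and ".." (matched here as whole lines).
dot_entries=$(ls -a "$workdir" | grep -c -x -F -e . -e ..)

rm -r "$workdir"
```

Both names should report the same inode number, and the content survives the removal of the first name.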
The main attributes that every inode has are as follows:
- file type: Each inode represents some sort of file (data), but a “file” can also be a directory or device, so we need to know how to handle it.
There are 7 main types (but there can exist more, depending on OS):
- regular: just a plain old file
- directory: as explained, a file containing hard links to other files. Each directory is allowed to appear only once in a single parent.
- symbolic link: points to any file (or directory). It contains the path to that object (relative or absolute, as a simple string), so a symlink can point to something that does not exist.
- FIFO special: aka named pipe, a special file that one process writes into and another reads from. It takes the concept of shell pipes to inter-process communication.
- block special: a device which is randomly accessible, like a hard drive or CD-ROM drive.
- character special: devices which are accessed via serial streams of input or output, like keyboards, mice, graphics cards and teletypes
- sockets: files for inter-process communication. Compared to FIFO specials, they can be used by more than two processes, used in both directions and support passing file descriptors and packets
Side-step into users and groups
A user in UNIX is essentially a small collection of data, the most important being a unique ID (number), a name (string), a group ID (number; files created by the user are in that group) and a password.
Groups are also small collections of data, but much simpler, comprised only of a unique ID (number), a name (string) and a list of users that are “in” the group. Their main purpose is to simplify access control.
The user with ID 0 is called root. It is the “system administrator”, and all actions made by the system itself are done as that user. Every user has their own “home” folder (under “/home/USERNAME/”, except for root, whose home is “/root/”), in which they store personal files, as well as user-specific configuration files.
file permissions: control the freedom of a user to read (list, for directories), change and execute (enter, for directories) files.
File permissions are split into three sets: what the owner can do, what group members can do and what anyone else can do. Each set stores some combination of three permissions: read, write and execute.
Each permission is represented by a bit in the order above (101 means you can read, cannot write, can execute) and the overall permissions are often shown as a three digit number or character string.
111101100 will be shown as 754 or rwxr-xr--.
link count: the number of hard links that point to the inode. Every directory inode has (normally) at least two: one from the parent and one from itself (“.”), plus one more for the “..” of each subdirectory. A file is only deleted when there are no hard links left to it (or, for a directory, when only its own “.” is left).
User ID of owner, Group ID, file size and timestamps
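The mapping from permission bits to the two usual notations can be worked through by hand. Below is a minimal sketch for the bit pattern 111101100; the helper names digit_to_rwx and mode_to_string are hypothetical, and the cross-check at the end assumes ls -l prints the permission string in columns 2 to 10.

```shell
#!/bin/sh
# Hypothetical helper: turn one octal digit (0-7) into its rwx triplet.
digit_to_rwx() {
    d=$1
    out=""
    if [ $((d & 4)) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $((d & 2)) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $((d & 1)) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s' "$out"
}

# Hypothetical helper: turn a three-digit mode such as 754 into rwxr-xr--.
mode_to_string() {
    mode=$1
    owner=${mode%??}          # first digit
    rest=${mode#?}
    group=${rest%?}           # second digit
    other=${mode#??}          # third digit
    printf '%s%s%s\n' "$(digit_to_rwx "$owner")" "$(digit_to_rwx "$group")" "$(digit_to_rwx "$other")"
}

# 111 101 100 in binary is 7 5 4 in octal.
symbolic=$(mode_to_string 754)

# Cross-check on a real file: chmod accepts the same three-digit number.
tmpfile=$(mktemp)
chmod 754 "$tmpfile"
listed=$(ls -l "$tmpfile" | cut -c 2-10)
rm "$tmpfile"
```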
The kernel has the core functionality of the operating system and bridges the gap between programs and hardware. Some important subsystems include scheduling, file, device, process and memory management.
To preserve the “everything is a file” methodology, devices and processes can be handled as files (processes, for instance, show up under /proc).
Often, in modern kernels, they aren’t actual files on a hard drive, but “virtual” files, where operations on them are handled by the kernel in a different manner than normal.
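A quick way to poke at such files, as a sketch only. It assumes the conventional /dev/null device exists, and it reads /proc/self/status only if the system provides it (Linux does; other UNIX-likes may not).

```shell
#!/bin/sh
# /dev/null is a character special file: whatever is written to it is discarded.
if [ -c /dev/null ]; then null_type="character special"; else null_type="other"; fi
echo "this text disappears" > /dev/null
null_size=$(wc -c < /dev/null | tr -d ' ')     # reading it back yields nothing

# On Linux, /proc/self describes whichever process opens it.
if [ -r /proc/self/status ]; then
    proc_first_line=$(head -n 1 /proc/self/status)
else
    proc_first_line="(no /proc on this system)"
fi
```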
Linux is not an operating system, it is a kernel!!!
3. Linux from the inside
Enough theory, time to have some fun and learn Linux!
We’ll be using Minimal Linux Live (just a bootable ISO) because:
- very small, comprised of only the Linux kernel, GNU C library and Busybox
- made by a fellow Bulgarian
We’re going to explore and look around the following stuff:
- Shells: What a modern shell would often support on a Linux based OS
- Files and directories: Naming conventions and commands to work with them
- Common configuration files: Relevant configuration files, their purpose and syntax
- System management: Other commands with which to modify your system
If you’re following at home
Owing to its size, not everything we need comes out of the box, so I created a script which installs what’s needed. If you want to try it out yourself, after booting up Minimal Linux Live, type out and run:
wget -q https://syndamia.com/talks/linuxgraphy-by-strabo/mll-set.sh && chmod +x mll-set.sh && ./mll-set.sh
If you get the message “TLS certificate validation not implemented”, ignore it.
In summary, it installs bash, coreutils and shadow “packages” (from http://s.minos.io/archive/bifrost/x86_64) to /usr/local/bin, creates /etc/passwd, /etc/bashrc and /etc/profile with some values in them and starts bash.
Job control: Management of currently running groups of processes
&: A process can be run in the background with an ampersand (&) at the end of the command:
command [args...] &
Ctrl+z: suspend (stop but leave in the background) the currently running (foreground) job
bg: start a suspended job in the background
fg: resume the last job to be put in the background and make it the current job with which we’re interacting
jobs: list all active (background) jobs
- Every job is identified with a Job ID (number), which you can use by prepending a percent sign. Example, to resume the job with JID 4:
fg %4
- For managing everything, usually there is a job table.
Upon shell termination, the shell tells all jobs in that table to terminate and waits for them.
disown: remove a job from the job table
kill: send a signal to the process or job. That process then has to handle it accordingly.
Common ones are:
-KILL to immediately stop the process (forcefully, it cannot be handled),
-QUIT to quit it,
-ABRT to cancel the current action,
-TERM to shut down (orderly),
-STOP to suspend it (forcefully, it cannot be handled; -CONT resumes it)
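Background jobs and signals can be combined in a few lines. A sketch, assuming a common shell such as bash or dash, where a process killed by a signal is reported by wait as 128 plus the signal number (TERM is signal 15). The process ID from $! is used instead of a %JID so that this also works inside scripts.

```shell
#!/bin/sh
# Start a long-running command in the background; $! holds its process ID.
sleep 30 &
pid=$!

# Ask it to shut down in an orderly way.
kill -TERM "$pid"

# wait returns the job's exit status once it has finished.
status=0
wait "$pid" || status=$?
```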
heredoc: File literal, meaning it is a user “string” which is interpreted as a file
- start with << NAME, where NAME can be anything you want; it is used to mark the beginning and end of a heredoc. << is also a redirection symbol
- on every new line write out your text; line breaks and spacing will be preserved
- to end it, write out NAME on its own at the beginning of a new line. Example:
cat << MYFILE
This is some text
MYFILE
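One detail worth trying out: with a plain delimiter the shell still expands variables inside the heredoc, while a quoted delimiter passes the text through literally. A small sketch, where the delimiter GREETING, the variable name and the file names are all arbitrary choices:

```shell
#!/bin/sh
workdir=$(mktemp -d)
name="world"

# Unquoted delimiter: $name is expanded before cat sees the text.
cat > "$workdir/expanded.txt" << GREETING
Hello, $name!
GREETING

# Quoted delimiter: the body is kept exactly as written.
cat > "$workdir/literal.txt" << 'GREETING'
Hello, $name!
GREETING

expanded=$(cat "$workdir/expanded.txt")
literal=$(cat "$workdir/literal.txt")
rm -r "$workdir"
```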
control operators: Control what command is executed, depending on exit status
expr1 && expr2: Run expr2, only if expr1 exited successfully
expr1 || expr2: Run expr2 if expr1 exited unsuccessfully
expr1 ; expr2: Run expr2 after expr1 finished (no matter how)
- Every program returns an exit status (number) to the shell, zero being a successful closing, and all others indicating failure.
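The three operators and the exit status can be seen with the standard true and false commands, which do nothing except exit with status 0 and 1. The variable names below are just labels for this sketch.

```shell
#!/bin/sh
and_ran="no"; or_ran="no"; and_skipped="yes"

true  && and_ran="yes"        # left side succeeded, so the right side runs
false || or_ran="yes"         # left side failed, so the right side runs
false && and_skipped="no"     # left side failed, so the right side is skipped

# ";" only sequences commands; the first one's exit status is not consulted.
sequence=$(printf 'a' ; printf 'b')

# $? holds the exit status of the command that just finished.
status=0
sh -c 'exit 3' || status=$?
```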
redirection: Outside of having > for I/O redirection, often there is also
>>: Acts like >, putting text into a file, but rather than overwriting everything, it appends it
<<: As explained, for heredocs
<<<: To the right is a string (herestring), and it is interpreted as a file
cat <<< "Hello World!"
- Standard input, standard output and standard error are all files with which the shell works.
Typing text in the shell puts it into stdin, command output is put into stdout (and shown to the user) and errors are put into stderr (also shown to the user).
Each one of them is numbered from 0 to 2, and you can specify redirection by appending or prepending it to the redirection symbol.
N>outputfile: redirects the output from standard stream N into outputfile
N>&M: redirects the output from standard stream N into standard stream M
N<inputfile: redirects input file contents to standard stream N
N<&M: redirects standard stream M to standard stream N
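A sketch of the numbered forms in action. The function talk is a made-up command that writes one line to stream 1 and one line to stream 2; redirections on a single command are applied from left to right.

```shell
#!/bin/sh
# Hypothetical command that uses both output streams.
talk() {
    echo "normal output"          # goes to stream 1 (stdout)
    echo "error output" >&2       # goes to stream 2 (stderr)
}

only_stdout=$(talk 2>/dev/null)       # stream 2 is thrown away
only_stderr=$(talk 2>&1 >/dev/null)   # 2 goes where 1 pointed, then 1 is thrown away
both=$(talk 2>&1)                     # 2 is merged into 1

# > overwrites, >> appends.
logfile=$(mktemp)
echo "first run"  >  "$logfile"
echo "second run" >> "$logfile"
log_lines=$(wc -l < "$logfile" | tr -d ' ')
rm "$logfile"
```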
- built-in test command: With test you can do conditional expressions. You can also often use brackets, [ args... ], instead of test args...
- Some of the available file checks
-e FILENAME: FILENAME exists
-d FILENAME: FILENAME is a directory
-h FILENAME: FILENAME is a symbolic link
-w FILENAME: you can write in FILENAME
- Some of the available string checks
-n STR: string STR has nonzero length
-z STR: string STR has length zero
STR1 = STR2 and STR1 != STR2: self explanatory
- Some of the available number checks
INT1 -eq INT2: equal integers
INT1 -gt INT2: INT1 > INT2
INT1 -ge INT2: INT1 >= INT2
- Combining checks: -a is binary AND, -o is binary OR, parentheses for grouping (escaped with a backslash)
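A few of these checks strung together, as a sketch. The file names notes.txt and shortcut are invented, and each passing check appends a word to the variable checks so the outcome can be inspected afterwards.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/notes.txt"
ln -s notes.txt "$workdir/shortcut"

checks=""
[ -e "$workdir/notes.txt" ] && checks="$checks exists"
[ -d "$workdir" ]           && checks="$checks directory"
[ -h "$workdir/shortcut" ]  && checks="$checks symlink"
test -z ""                  && checks="$checks empty-string"
[ "abc" != "abd" ]          && checks="$checks different"
[ 7 -ge 7 ]                 && checks="$checks greater-or-equal"
[ 7 -gt 7 ]                 || checks="$checks not-greater"

rm -r "$workdir"
```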
scripting support: Rather than running a series of commands manually, you can put them all in a file and run that
#: Everything between a # and the end of the line denotes a comment
#!: If on the first line of the file, is treated as a shebang, which specifies what interpreter to be used for the script
- Each command (line) is executed in series, from top to bottom
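A tiny script can be written and run like this. It is only a sketch: greet.sh and its contents are invented, and the script is handed to sh explicitly so the example also works where the temporary directory forbids direct execution. After the chmod, running the file directly would make the system pick the interpreter from the shebang line instead.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Write a two-command script; the quoted delimiter keeps $1 literal.
cat > "$workdir/greet.sh" << 'SCRIPT'
#!/bin/sh
# Greets whoever is named in the first argument.
echo "Hello, $1!"
echo "Goodbye, $1!"
SCRIPT

chmod +x "$workdir/greet.sh"               # mark the file as executable
output=$(sh "$workdir/greet.sh" Strabo)    # commands run from top to bottom
rm -r "$workdir"
```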
compound commands: Shells also often offer more advanced structures, with which to more easily write complex expressions
3.2. Files and directories
/bin: main command programs (binaries), used by users
/etc: system configuration files
/sbin: command programs, used for the system operation
/usr: other system resources
/usr/bin: all other user command binaries
/usr/lib: library files
/usr/local: software installed locally by the system administrator
/usr/share: architecture independent data, like manuals
/var: variable data, like logs and cache
/home: home directories of all users
/lib: libraries and kernel modules
/mnt: temporarily mounted filesystems
/proc: process information
/root: home directory for the root user
General file system commands
mkdir: create directories
rm: remove a (hardlink to) file or directory
shred: overwrite file contents
mv: move file (or directory) to another directory
cp: copy a file
ln: create hard (or soft, with -s) links to files (inodes)
ls: list directory contents
dd: do many things with files, like modifying and duplicating data across devices and partitions, (basic) data recovery and generating files
du: estimate file space usage
stat: display the status of a file or file system
chown: change owner of an inode
chgrp: change group of an inode
chmod: change file modes
find: search for a file inside a directory (recursively)
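Several of these commands in one short session, as a sketch inside a throwaway directory. All file and directory names (docs, drafts, report.txt, latest) are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)

mkdir -p "$workdir/docs/drafts"                     # create nested directories
echo "v1" > "$workdir/docs/drafts/report.txt"
cp "$workdir/docs/drafts/report.txt" "$workdir/docs/report-copy.txt"
mv "$workdir/docs/drafts/report.txt" "$workdir/docs/report.txt"
ln -s report.txt "$workdir/docs/latest"             # soft link, stores a relative path
chmod 600 "$workdir/docs/report.txt"                # owner may read and write, nobody else

# find walks the tree recursively; -type f keeps regular files only.
found=$(find "$workdir" -type f -name 'report*' | sort)
file_count=$(printf '%s\n' "$found" | wc -l | tr -d ' ')
via_link=$(cat "$workdir/docs/latest")              # reading through the symlink

rm -r "$workdir"
```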
Working with text (files)
echo: prints a given string to standard output
cat: sequentially prints files to standard output
less: view files with support for scrolling
grep: searches for matches in a file, using simplified regular expressions. Often there is egrep, which allows for extended regex, and fgrep, where strings are matched as-is.
sed: filter and transform text from files, according to regex
tr: translate, map (more than one to one for example) and delete characters
cut: remove parts of text from each line
head: print the first lines (or characters) of a file
tail: print the last lines (or characters) of a file
strings: find embedded text inside binary files
truncate: shrink or extend the size of a file
wc: count lines, words and bytes inside a file
diff: compare files and show their differences
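These tools are mostly used together. A sketch on an invented four-line file, roles.txt, where each line holds a name and a role separated by a colon:

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'alice:admin\nbob:staff\ncarol:admin\ndave:guest\n' > "$workdir/roles.txt"

admins=$(grep -c ':admin$' "$workdir/roles.txt")                         # count matching lines
first_name=$(head -n 1 "$workdir/roles.txt" | cut -d: -f1)                # field 1 of line 1
shouted=$(tail -n 1 "$workdir/roles.txt" | tr '[:lower:]' '[:upper:]')    # map lower to upper case
renamed=$(sed 's/staff/employee/' "$workdir/roles.txt" | sed -n '2p')     # substitute, then print line 2
lines=$(wc -l < "$workdir/roles.txt" | tr -d ' ')                         # number of lines

rm -r "$workdir"
```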
Working with devices
mount: bind another file system to a location in the file system
umount: remove binding to a file system
3.3 Common configuration files
/etc/group: the data for all groups is stored here, one group per line, with colon-separated fields
/etc/passwd: the data for all users is stored here, one user per line, with colon-separated fields
/etc/shadow: stores other user data, especially the password as a hashed string, also one user per line with colon-separated fields
/etc/fstab: mountings of devices to the file system, read on boot by mount
/etc/hosts: list of host names (domain names) and their corresponding IP address, when a DNS server doesn’t do the job
/etc/bashrc: global defaults and aliases used by the bash shell
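Files like /etc/passwd are easy to pick apart from the shell. The sketch below assumes the usual seven colon-separated fields described in the passwd(5) manual page (name, password placeholder, user ID, group ID, comment, home directory, login shell). The entry itself is made up and not taken from a real system.

```shell
#!/bin/sh
# A made-up entry in /etc/passwd layout.
entry='strabo:x:1000:1000:Strabo of Amaseia:/home/strabo:/bin/bash'

# IFS=: makes read split the line on colons, one variable per field.
IFS=: read -r name password uid gid comment home shell << EOF
$entry
EOF
```

On a real system the same read could be fed from grep '^USERNAME:' /etc/passwd instead of the heredoc.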
3.4. System management
- System information
date: display current date and time
hostname: show (or set) the systems hostname (network name)
free: display free memory on the system
- Users and groups
useradd: creates a new user (with a password)
userdel: remove a user (and, with -r, their home directory)
usermod: modify a user account, including their name and groups
passwd: change the password of a user. With -l the user can be locked (impossible to log in with a password)
groupadd: create a new group
groupdel: remove a group
groupmod: modify an existing group
w: show who is logged in and what they are doing
who: show who is logged in
whoami: print the current username
login: establish a new connection to the system
4. Software licensing
It is important to note that most software in Linux, including the kernel itself, is under a variety of open licenses. An open license is a license which allows others to reuse the original work, under some restrictions.
For software, such a license is applied to the source code from which the original application was made. Some commonly used ones, with a (NON-LEGALLY BINDING) summary include:
- MIT: the author is not responsible for anything that the code does
- BSD-3-Clause: permissive like MIT (not copyleft), but the author’s name may not be used to endorse derived products; the author is not responsible for anything that the code does
- GNU GPL: disclose the source code of derived works and use the same license (copyleft)
This is important, since pretty much everything can be freely modified by anyone (for the better or worse, but generally better).
5. What makes up a useful OS
We talked about the modularity of a Linux-based operating system. Overall, these are the main elements that make up a desktop operating system you might want to use:
- Bootloader: load the kernel and operating system
- Init system: well managed way to start everything inside the OS
- Package manager: way to manage our binaries
- Desktop manager: graphical shell
When a computer boots, it needs to start an operating system, but you can’t expect it to know how to boot any operating system. Skipping over the details, the computer starts a small program called the bootloader, which handles loading the OS.
On Linux, the most used one is called GRUB. With it you can also specify which kernel to load, with which parameters to load the kernel and which OS to start.
5.2. Init system
In Minimal Linux Live, the whole system is started by some hard-coded scripts. This isn’t very modular or extensible, which is why we need a special system.
Generally, the init system loads a series of services, where each service is just a file that specifies what needs to be run and how.
The most common init systems are
5.3. Package manager
Running a program is as simple as putting the executable in a nice folder, like
However, this process isn’t very convenient if you’re only using a shell, and is more tedious still if you’re using a graphical shell.
The solution is to have all programs in one location, where you can download, install and potentially update a program with one simple command, just using the program’s name. Other good features include installing dependencies and checking if a package is tampered with.
There is another “type” of package manager, those similar to portage, where packages are actually instructions on how to download and compile a specific version of the program.
5.4. Window systems
A window system is your graphical shell: the graphical way to interact with your computer (the kernel). Its main components are:
- Window manager: the overall system with which you create, remove, resize, etc. windows. They are generally separated into two types:
- Stacking: this is what you know, windows are just rectangles floating about, and can be put one over another
- Tiling: windows take up a predefined (but changeable) portion of the screen
- Compositor: windows have an unseen buffer, allowing them to be processed beforehand. This allows for adding transparency, animation effects and more.
- Session manager: application that manages the window managers, allowing you to use different managers or multiple users to start their own graphical sessions.
6. War on distros
A distribution is a complete set of all the components that we’ve talked about so far.
Overall, there are a handful of distributions on which most other distros are based:
- Slackware: as the oldest still maintained distro, it served as inspiration for a lot of other distros. It aims to be as simple and as close to UNIX as possible.
- Debian (and Ubuntu): it is the most popular distribution (if we also include Ubuntu, since they’re so similar). Ubuntu aims to be modern and fancy, while Debian aims to be as stable as possible, which makes it the most popular in server usage.
- Gentoo: the distribution which defined non-binary software distribution
- Arch Linux: a very popular and modern distribution with the goal of providing the latest and greatest software.
We haven’t touched on this much, but Linux distributions aren’t limited to desktop or server computers. Two less common kinds are:
- AOSP: the base of the whole Android operating system
- OpenWRT: a router operating system
7. Demo - setup from zero
Now for the best part: we’ll be installing a Linux distribution on a laptop running Windows 10!
I’ve chosen to install Linux Mint, since it is generally targeted towards newcomers and will feel familiar enough to Windows users.
Currently there won’t be an official recording of the process, you should’ve been here live!
Thank you for your time
The slides contain, at most, small paraphrased sentences from the sources below. They mostly served as personal educational/reference tools.
- Wikipedia and Stack Exchange have been used extensively and linked where appropriate.
- CS416 - Operating Systems Design by Paul Krzyzanowski at Rutgers State University
- Origins and History of UNIX, 1969-1995 from The Art of Unix Programming by Eric Steven Raymond
- Ken Thompson interviewed by Brian Kernighan at VCF East 2019 by Vintage Computer Federation
- Hard links and Unix file system nodes from DAT 2330 Introduction to Operating Systems by Ian D. Allen
- /usr hierarchy from Filesystem Hierarchy Standard by Linux Foundation, Daniel Quinlan, Paul ‘Rusty’ Russell, Christopher Yeoh
- User & Group Names from Linux Standard Base Core Specification by Free Standards Group
- Managing User Accounts from Running Linux, Third Edition by Matt Welsh, Matthias Kalle Dalheimer, and Lar Kaufman
Images on slides “2.1. shell” and “2.2. UNIX-style file system” are made by me and licensed as content.
The script mll-set.sh is licensed under MIT (though I don’t see it as substantial enough to be licensed under anything).