File Descriptor

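A file descriptor is a small non-negative integer that a process uses to refer to an open file or other input/output resource: stdin, stdout, stderr, a regular file, a pipe, or a socket. It is essentially a handle that points to the underlying source or destination.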

Why?

  • It provides a layer of abstraction: your application doesn't need to know where its output actually goes.
  • It makes it possible to redirect input and output.

Example

On Linux, a running process is exposed under /proc/<pid>, and its file descriptors are listed in the fd directory there.

Each entry is a symlink to the actual file. For example:

❯ ls -l /proc/316783/fd
total 0
lrwx------ 1 austin austin 64 Jan 26 22:50 0 -> /dev/pts/1
l-wx------ 1 austin austin 64 Jan 26 22:50 1 -> /dev/pts/1
lrwx------ 1 austin austin 64 Jan 26 22:50 2 -> /dev/pts/1

All three point to the terminal.

  • 0: stdin
  • 1: stdout
  • 2: stderr
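The same listing can be produced from inside a program. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper name `list_open_fds` is made up for illustration.

```python
import os


def list_open_fds(pid="self"):
    """Return {fd_number: target} read from /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink(),
            # e.g. the one listdir() itself used to scan the directory.
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(list_open_fds().items()):
        print(fd, "->", target)
```

Run from a terminal without redirection, this should show 0, 1 and 2 pointing at the same /dev/pts device, matching the `ls -l` output above.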

For example, consider the following program:

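import os
import sys
import time

# Print the pid to stderr so it stays visible when stdout is redirected.
print(f"pid: {os.getpid()}", file=sys.stderr)
time.sleep(10)  # keep the process alive long enough to inspect /proc/<pid>/fd
print("Test")

❯ python3 fd.py > output.txt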
pid: 327471

Here, stdout is mapped to a file and the pid is printed to stderr. As a result, we have:

❯ ls -l /proc/327471/fd
total 0
lrwx------ 1 austin austin 64 Jan 26 22:53 0 -> /dev/pts/3
l-wx------ 1 austin austin 64 Jan 26 22:53 1 -> /home/austin/projects/learn/os/output.txt
lrwx------ 1 austin austin 64 Jan 26 22:53 2 -> /dev/pts/3

fd/1 becomes a symlink to our file. The same thing happens if we redirect stderr instead:

import os
import time

print(f"pid: {os.getpid()}")
time.sleep(10)
print("Test")

❯ python3 fd.py 2> output.txt
pid: 335823

2> maps fd/2 (stderr) to output.txt. As a result, the process's descriptors look like this:

❯ ls -l /proc/335823/fd
total 0
lrwx------ 1 austin austin 64 Jan 26 22:56 0 -> /dev/pts/3
lrwx------ 1 austin austin 64 Jan 26 22:56 1 -> /dev/pts/3
l-wx------ 1 austin austin 64 Jan 26 22:56 2 -> /home/austin/projects/learn/os/output.txt
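To make the mechanism concrete, here is a rough sketch of what a redirection like this amounts to: open the target file, then use `os.dup2` to make fd 2 refer to it. The helper name `run_with_stderr_redirected` is hypothetical. A real shell does this in the child process just before executing the program, so it never needs to restore anything; this sketch restores fd 2 only because it runs in a single process.

```python
import os


def run_with_stderr_redirected(path, action):
    """Point fd 2 at `path` while `action()` runs, then restore it."""
    saved = os.dup(2)  # duplicate the current stderr so it can be restored
    target = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.dup2(target, 2)  # fd 2 now refers to the same open file as `target`
        action()
    finally:
        os.dup2(saved, 2)  # put the original stderr back
        os.close(saved)
        os.close(target)


if __name__ == "__main__":
    # Writes go to output.txt instead of the terminal while redirected.
    run_with_stderr_redirected(
        "output.txt", lambda: os.write(2, b"written via fd 2\n")
    )
```

The code inside `action` keeps writing to fd 2 without knowing where it leads, which is the abstraction file descriptors provide.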

[!note]
The common form 2>&1 maps fd/2 to whatever fd/1 currently points to.
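As an illustration of 2>&1, the sketch below starts a child process with stdout sent to a file and stderr sent to the same place, using `subprocess.STDOUT`. The `CHILD` snippet and the function name `run_merged` are made up for this example; it is a rough equivalent of `python3 -c ... > output.txt 2>&1`, not a description of how any particular shell implements it.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child program: one line to stdout, one line to stderr.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


def run_merged(path):
    """Run CHILD with fd 1 pointing at `path` and fd 2 pointing wherever fd 1 does."""
    with open(path, "w") as out:
        subprocess.run(
            [sys.executable, "-c", CHILD],
            stdout=out,                # like: > path
            stderr=subprocess.STDOUT,  # like: 2>&1
            check=True,
        )
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_merged(os.path.join(d, "output.txt")))
```

Both lines end up in the file. Their order can differ from the order of the print calls, because Python buffers stdout and stderr differently when they are not attached to a terminal.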

Similarly, for stdin:

❯ python3 fd.py < output.txt
pid: 342234
Test
❯ ls -l /proc/342234/fd
total 0
lr-x------ 1 austin austin 64 Jan 26 22:58 0 -> /home/austin/projects/learn/os/output.txt
lrwx------ 1 austin austin 64 Jan 26 22:58 1 -> /dev/pts/3
lrwx------ 1 austin austin 64 Jan 26 22:58 2 -> /dev/pts/3

If we open a file in our application, a new file descriptor is created for that file:

import os
import time

print(f"pid: {os.getpid()}")

with open("test_file", "r") as file:
    time.sleep(10)
    content = file.read()
    print(content)

print("Test")

❯ python3 fd.py
pid: 349720
hello world
Test
❯ ls -l /proc/349720/fd
total 0
lrwx------ 1 austin austin 64 Jan 26 23:00 0 -> /dev/pts/3
lrwx------ 1 austin austin 64 Jan 26 23:00 1 -> /dev/pts/3
lrwx------ 1 austin austin 64 Jan 26 23:00 2 -> /dev/pts/3
lr-x------ 1 austin austin 64 Jan 26 23:00 3 -> /home/austin/projects/learn/os/test_file

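The number is 3 here because 0, 1 and 2 are already taken: a new descriptor gets the lowest number that is not currently in use. If the process already had other files open, the number would be higher.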
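How the number is chosen can be checked directly. On POSIX systems, `open` returns the lowest-numbered descriptor that is not currently open, so a number freed by `close` is handed out again by the next `open`. The sketch below assumes a single-threaded program (no other thread opening files in between); the function name `demo_fd_reuse` is hypothetical.

```python
import os


def demo_fd_reuse():
    """Open two descriptors, close the first, and see which number comes next."""
    first = os.open(os.devnull, os.O_RDONLY)
    second = os.open(os.devnull, os.O_RDONLY)
    os.close(first)  # frees the lower of the two numbers
    reused = os.open(os.devnull, os.O_RDONLY)  # gets the lowest free number
    os.close(second)
    os.close(reused)
    return first, second, reused


if __name__ == "__main__":
    first, second, reused = demo_fd_reuse()
    print(f"first={first} second={second} reused={reused}")
```

In a plain script, `reused` equals `first`: the kernel fills the gap left by the closed descriptor rather than continuing to count upward.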