- Overview
- Demo
- Features
- How It Works
- Project Structure
- Build
- Usage
- Error Handling
- OS Concepts Deep-Dive
- Allowed Functions (42 Subject)
- Testing
- Key Concepts Mastered
- Author

pipex is a C program that replicates the behavior of the Unix shell pipeline:

    < infile cmd1 | cmd2 > outfile

Built as part of the 42 School curriculum, this project dives deep into inter-process communication (IPC) by wiring together four fundamental Unix system calls:

| System Call | Purpose |
|---|---|
| `pipe()` | Creates a unidirectional kernel buffer connecting two file descriptors |
| `fork()` | Spawns a child process that inherits the parent's file descriptor table |
| `dup2()` | Redirects stdin/stdout by replacing a file descriptor with another |
| `execve()` | Replaces the current process image with a new program |

Together, these four primitives are the engine behind every Unix shell pipeline you have ever typed.

- Shell-faithful piping: reproduces `< infile cmd1 | cmd2 > outfile` exactly
- Two child processes connected through a kernel pipe buffer
- PATH resolution: searches every directory in `PATH` to locate the command binary
- Empty command handling: gracefully manages blank or whitespace-only commands
- Permission & access checks: validates `infile`/`outfile` permissions before executing
- Descriptive error messages: zsh-style output for "command not found" and missing files
- Clean memory management: all allocated arrays and strings are freed before exit
- Embedded libraries: ships with its own `libft`, `ft_printf`, and `get_next_line`
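
The PATH resolution feature can be illustrated with a short sketch. This is not the project's actual `parsing.c`: `find_in_path` is a hypothetical helper, it takes the value of `PATH` as a plain string, and it uses libc helpers such as `snprintf` and `strchr` for brevity rather than the `libft` equivalents the 42 subject would require.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: look up `cmd` in the colon-separated directory list
 * `path_value` (the text that follows "PATH=" in the environment).
 * Returns a malloc'd "dir/cmd" string for the first entry that access()
 * reports as executable, or NULL if there is no match.
 * Empty PATH entries are skipped in this sketch. The caller frees the result.
 */
char *find_in_path(const char *cmd, const char *path_value)
{
    const char *start = path_value;

    if (cmd == NULL || *cmd == '\0' || path_value == NULL)
        return NULL;
    while (1)
    {
        const char *end = strchr(start, ':');
        size_t dir_len = end ? (size_t)(end - start) : strlen(start);

        if (dir_len > 0)
        {
            size_t total = dir_len + 1 + strlen(cmd) + 1;
            char *candidate = malloc(total);

            if (candidate == NULL)
                return NULL;
            /* Build "<dir>/<cmd>" for this PATH entry. */
            snprintf(candidate, total, "%.*s/%s", (int)dir_len, start, cmd);
            if (access(candidate, X_OK) == 0)
                return candidate;
            free(candidate);
        }
        if (end == NULL)
            break;
        start = end + 1;
    }
    return NULL;
}
```

A caller would typically pass the returned path as the first argument of `execve()` and treat a `NULL` result as the "command not found" case.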

    argv[1]      argv[2]       argv[3]      argv[4]
    infile ───►  cmd1   ───►   cmd2   ───►  outfile
               (child 1)     (child 2)
                   │             ▲
                   └─── pipe ────┘
              pipe_fd[1]     pipe_fd[0]
               (write)         (read)


1. The parent process calls `pipe()` to obtain two file descriptors: `pipe_fd[0]` (read end) and `pipe_fd[1]` (write end).
2. `fork()` is called twice to create two child processes.
3. Child 1 (`cmd1`):
   - Closes `pipe_fd[0]` (it will only write).
   - Redirects `stdout` → `pipe_fd[1]` with `dup2()`.
   - Opens `infile` and redirects `stdin` → `infile_fd` with `dup2()`.
   - Calls `execve()` to run `cmd1`.
4. Child 2 (`cmd2`):
   - Closes `pipe_fd[1]` (it will only read).
   - Redirects `stdin` → `pipe_fd[0]` with `dup2()`.
   - Opens `outfile` and redirects `stdout` → `outfile_fd` with `dup2()`.
   - Calls `execve()` to run `cmd2`.
5. The parent closes both ends of the pipe and calls `waitpid()` on both children.
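
The steps above can be sketched in C roughly as follows. This is a minimal illustration, not the project's actual `pipex.c`: the names `run_pipeline`, `run_first`, and `run_second` are hypothetical, and the sketch assumes each command arrives as a ready-made `argv` array whose first element is an absolute path, so no PATH lookup or argument splitting happens here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Child 1: stdout -> pipe write end, stdin <- infile, then become cmd1. */
static void run_first(const char *infile, char *const argv[], int pipe_fd[2])
{
    int in_fd;

    close(pipe_fd[0]);                        /* this child only writes */
    if (dup2(pipe_fd[1], STDOUT_FILENO) == -1)
        _exit(1);
    close(pipe_fd[1]);                        /* fd 1 now refers to the pipe */
    in_fd = open(infile, O_RDONLY);
    if (in_fd == -1)
    {
        perror(infile);
        _exit(1);
    }
    if (dup2(in_fd, STDIN_FILENO) == -1)
        _exit(1);
    close(in_fd);
    execve(argv[0], argv, environ);
    perror(argv[0]);                          /* reached only if execve failed */
    _exit(127);
}

/* Child 2: stdin <- pipe read end, stdout -> outfile, then become cmd2. */
static void run_second(const char *outfile, char *const argv[], int pipe_fd[2])
{
    int out_fd;

    close(pipe_fd[1]);                        /* this child only reads */
    if (dup2(pipe_fd[0], STDIN_FILENO) == -1)
        _exit(1);
    close(pipe_fd[0]);
    out_fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd == -1)
    {
        perror(outfile);
        _exit(1);
    }
    if (dup2(out_fd, STDOUT_FILENO) == -1)
        _exit(1);
    close(out_fd);
    execve(argv[0], argv, environ);
    perror(argv[0]);
    _exit(127);
}

/* Returns the exit status of cmd2, or -1 if the pipeline could not be set up. */
int run_pipeline(const char *infile, char *const cmd1[],
                 char *const cmd2[], const char *outfile)
{
    int pipe_fd[2];
    int status = 0;
    pid_t pid1;
    pid_t pid2;

    if (pipe(pipe_fd) == -1)
        return -1;
    pid1 = fork();
    if (pid1 == 0)
        run_first(infile, cmd1, pipe_fd);     /* never returns */
    pid2 = (pid1 == -1) ? -1 : fork();
    if (pid2 == 0)
        run_second(outfile, cmd2, pipe_fd);   /* never returns */
    close(pipe_fd[0]);                        /* parent keeps neither end, */
    close(pipe_fd[1]);                        /* so cmd2 can reach EOF     */
    if (pid1 != -1)
        waitpid(pid1, NULL, 0);
    if (pid2 == -1)
        return -1;
    if (waitpid(pid2, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Returning the status of the last command mirrors what a shell reports for a pipeline; whether the real project does the same is not stated in this README.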

    Parent process
    │
    ├─ pipe(pipe_fd)                  # kernel allocates read/write FD pair
    │
    ├─ fork() ──────────────► Child 1
    │                           │  close(pipe_fd[0])
    │                           │  dup2(pipe_fd[1], STDOUT)    # write → pipe
    │                           │  dup2(infile_fd, STDIN)      # read ← infile
    │                           │  execve(cmd1, ...)           # become cmd1
    │
    ├─ fork() ──────────────► Child 2
    │                           │  close(pipe_fd[1])
    │                           │  dup2(pipe_fd[0], STDIN)     # read ← pipe
    │                           │  dup2(outfile_fd, STDOUT)    # write → outfile
    │                           │  execve(cmd2, ...)           # become cmd2
    │
    ├─ close(pipe_fd[0])
    ├─ close(pipe_fd[1])
    ├─ waitpid(child1, ...)
    └─ waitpid(child2, ...)

Why close pipe ends in the parent?
A reader sees EOF on a pipe only once every copy of the write end has been closed. If the parent keeps `pipe_fd[1]` open, `cmd2` will never see EOF on its stdin and will hang forever waiting for more data.
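
The EOF rule can be observed in a single process. The sketch below is an illustration only; `drain_after_close` is a hypothetical name, and it assumes the message is short enough to fit in the pipe buffer, since nothing reads while it is being written.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes `msg` into a fresh pipe, closes the only write end, then reads
 * until read() returns 0 (EOF). Stores the data in `out` as a C string.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t drain_after_close(const char *msg, char *out, size_t out_size)
{
    int fd[2];
    size_t total = 0;
    ssize_t n = 1;

    if (out_size == 0 || pipe(fd) == -1)
        return -1;
    if (write(fd[1], msg, strlen(msg)) == -1)
    {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);   /* last write end gone: the reader can now reach EOF */
    while (n > 0 && total < out_size - 1)
    {
        n = read(fd[0], out + total, out_size - 1 - total);
        if (n > 0)
            total += (size_t)n;
    }
    close(fd[0]);
    if (n == -1)
        return -1;
    out[total] = '\0';
    return (ssize_t)total;
}
```

If the `close(fd[1])` call were removed, the final `read()` would block instead of returning 0, which is the same hang `cmd2` would experience.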

    pipex/
    ├── README.md
    ├── assests/
    │   ├── baseImage.png
    │   └── DemoImage.png
    ├── subject/                   # 42 project subject PDF
    └── Project/
        ├── Makefile
        ├── pipex.h                # header: includes & prototypes
        ├── main.c                 # argument validation + entrypoint
        ├── pipex.c                # core pipe/fork/dup2/execve logic
        ├── pipex_utils.c          # execute_commands + norm helpers
        ├── parsing.c              # PATH resolution & command lookup
        ├── helperFunctions.c      # small utility functions
        └── Include/
            ├── libft/             # custom C standard library
            ├── ft_printf/         # custom printf implementation
            └── get_next_line/     # custom line reader


    cd Project
    make

This compiles everything (including the embedded libraries) and produces:

    ./pipex

| Command | Effect |
|---|---|
| `make clean` | Remove object files |
| `make fclean` | Remove object files and the binary |
| `make re` | Full rebuild from scratch |


    ./pipex <infile> <cmd1> <cmd2> <outfile>

| Argument | Description |
|---|---|
| `infile` | Source file: read as stdin for `cmd1` |
| `cmd1` | First command (with optional arguments) |
| `cmd2` | Second command (with optional arguments) |
| `outfile` | Destination file: receives stdout of `cmd2` (created/truncated) |


    # Count lines containing 'error' in a log file
    ./pipex server.log "grep error" "wc -l" result.txt

    # Sort the lines of a text file and drop duplicates
    ./pipex words.txt "cat" "sort -u" sorted.txt

    # Extract the second field from a CSV and sort it
    ./pipex data.csv "cut -d, -f2" "sort" output.txt

Equivalent shell command:

    < infile cmd1 | cmd2 > outfile

- Each command is passed as a single quoted argument on the shell (e.g. `"grep -i error"`).
- Internally, the program splits on spaces (`ft_split(argv[i], ' ')`), so complex shell quoting (nested quotes, glob expansions, variable substitutions) is not supported, consistent with the 42 subject specification.
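
To make the quoting limitation concrete, here is a small stand-in for a space-only splitter. It is an assumption-laden sketch: `split_on_spaces` and `free_argv` are hypothetical names, and the real `ft_split` in the project's `libft` may differ in details such as how it handles allocation failures.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array of malloc'd strings. */
void free_argv(char **argv)
{
    size_t i = 0;

    if (argv == NULL)
        return;
    while (argv[i] != NULL)
        free(argv[i++]);
    free(argv);
}

/*
 * Hypothetical stand-in for ft_split(s, ' '): splits `s` on runs of spaces
 * and returns a NULL-terminated array suitable for execve(). Quotes get no
 * special treatment, which is exactly the limitation described above.
 */
char **split_on_spaces(const char *s)
{
    const char *p = s;
    size_t count = 0;
    size_t i = 0;
    char **argv;

    while (*p != '\0')                       /* first pass: count tokens */
    {
        while (*p == ' ')
            p++;
        if (*p != '\0')
            count++;
        while (*p != '\0' && *p != ' ')
            p++;
    }
    argv = malloc((count + 1) * sizeof(*argv));
    if (argv == NULL)
        return NULL;
    p = s;
    while (i < count)                        /* second pass: copy tokens */
    {
        const char *start;
        size_t len;

        while (*p == ' ')
            p++;
        start = p;
        while (*p != '\0' && *p != ' ')
            p++;
        len = (size_t)(p - start);
        argv[i] = malloc(len + 1);
        if (argv[i] == NULL)
        {
            free_argv(argv);                 /* argv[i] is NULL, so this stops there */
            return NULL;
        }
        memcpy(argv[i], start, len);
        argv[i][len] = '\0';
        i++;
    }
    argv[count] = NULL;
    return argv;
}
```

With this splitter, `"grep -i error"` becomes the three tokens `grep`, `-i`, `error`, while `"grep 'a b'"` becomes `grep`, `'a`, `b'`: the quotes are passed through literally instead of grouping the words.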

| Situation | Behaviour |
|---|---|
| `infile` does not exist | Prints `zsh: no such file or directory: <infile>` and stops |
| `infile` not readable | Prints permission error and stops |
| `outfile` not writable | Prints permission error and stops |
| Command not found in `PATH` | Prints `zsh: command not found: <cmd>` for each missing command |
| `PATH` missing from environment | Reports invalid environment and exits cleanly |
| Empty command string | Handled gracefully: passes data through unchanged |
| Blank `infile`/`outfile` argument | Detected at startup; reports no such file or directory |
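
The infile rows of the table could be implemented along these lines. This is a sketch under assumptions: `check_infile` is a hypothetical helper, `fprintf` is used for brevity instead of the project's `ft_printf`, and the exact wording of the permission message is a guess because the table does not quote it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if `path` exists and is readable.
 * Otherwise prints a zsh-style message to stderr and returns the errno
 * value describing the problem (ENOENT for a blank or missing path).
 */
int check_infile(const char *path)
{
    int err;

    if (path == NULL || *path == '\0')
    {
        fprintf(stderr, "zsh: no such file or directory: %s\n", path ? path : "");
        return ENOENT;
    }
    if (access(path, R_OK) == 0)
        return 0;
    err = errno;                /* save before any call that may overwrite it */
    if (err == ENOENT)
        fprintf(stderr, "zsh: no such file or directory: %s\n", path);
    else
        fprintf(stderr, "zsh: %s: %s\n", strerror(err), path);  /* wording assumed */
    return err;
}
```

Note that a check with `access()` followed by a later `open()` leaves a small window in which the file can change, so the result of `open()` still needs to be checked.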

In Unix/Linux, everything is a file: regular files, devices, sockets, and pipes all share the same unified I/O interface. Each open resource is represented by an integer called a file descriptor (FD).
By convention, every process starts with three FDs already open:

| FD | Name | Default connection |
|---|---|---|
| 0 | `stdin` | keyboard |
| 1 | `stdout` | terminal |
| 2 | `stderr` | terminal |

`dup2(oldfd, newfd)` makes `newfd` refer to the same open file as `oldfd`, atomically closing whatever `newfd` referred to before. This is exactly how pipex rewires the standard streams of child processes before calling `execve()`.
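
A small sketch shows the redirection in isolation. `write_via_stdout` is a hypothetical name; unlike pipex, which redirects in a child and never needs to undo it, this example saves and restores the original stdout with `dup()` so the calling process is left unchanged.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Temporarily points fd 1 (stdout) at `path`, writes `msg` through fd 1,
 * then restores the original stdout. Returns 0 on success, -1 on failure.
 */
int write_via_stdout(const char *path, const char *msg)
{
    int file_fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    int saved_stdout;
    int ok;

    if (file_fd == -1)
        return -1;
    saved_stdout = dup(STDOUT_FILENO);        /* keep a handle on the old stdout */
    if (saved_stdout == -1)
    {
        close(file_fd);
        return -1;
    }
    if (dup2(file_fd, STDOUT_FILENO) == -1)   /* fd 1 now refers to the file */
    {
        close(file_fd);
        close(saved_stdout);
        return -1;
    }
    close(file_fd);                           /* fd 1 keeps the file open */
    ok = write(STDOUT_FILENO, msg, strlen(msg)) == (ssize_t)strlen(msg);
    dup2(saved_stdout, STDOUT_FILENO);        /* put the original stdout back */
    close(saved_stdout);
    return ok ? 0 : -1;
}
```

The code that writes to fd 1 never learns that its output went to a file, which is why a program started with `execve()` needs no changes to take part in a pipeline.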
When the OS creates a process via fork(), it allocates a Process Control Block (PCB) in the kernel:
| Field | Description |
|---|---|
| PID | Unique process identifier |
| PPID | Parent's PID |
| Process state | Running / Ready / Blocked / Zombie |
| Program counter | Address of the next instruction |
| CPU registers | Saved context for context switching |
| Memory maps | Code, stack, heap regions |
| Open file table | List of open FDs (including pipe FDs!) |
| Scheduling info | Priority, CPU time used |
`fork()` creates an almost-identical copy of the parent PCB. After `execve()`, the process image is fully replaced, but the file descriptor table survives (except for descriptors marked close-on-exec), which is the key mechanism that makes pipe redirection work.
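
The survival of the descriptor table across `execve()` can be checked with one child. In this sketch, `capture_echo` is a hypothetical name and `/bin/echo` is assumed to exist at that path; the child rewires fd 1 before the exec, and the freshly loaded program writes into the pipe without knowing it.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Runs "/bin/echo <word>" in a child whose stdout was redirected into a pipe
 * before execve(), and collects the output in `out` as a C string.
 * Returns the number of bytes captured, or -1 on error.
 */
ssize_t capture_echo(const char *word, char *out, size_t out_size)
{
    int fd[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n = 1;

    if (out_size == 0 || pipe(fd) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
    {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0)
    {
        char *const argv[] = {"/bin/echo", (char *)word, NULL};

        close(fd[0]);
        if (dup2(fd[1], STDOUT_FILENO) == -1)
            _exit(1);
        close(fd[1]);
        execve(argv[0], argv, environ);   /* fd 1 stays attached to the pipe */
        _exit(127);                       /* reached only if execve failed */
    }
    close(fd[1]);                         /* parent must not hold a write end */
    while (n > 0 && total < out_size - 1)
    {
        n = read(fd[0], out + total, out_size - 1 - total);
        if (n > 0)
            total += (size_t)n;
    }
    close(fd[0]);
    waitpid(pid, NULL, 0);
    if (n == -1)
        return -1;
    out[total] = '\0';
    return (ssize_t)total;
}
```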
The kernel manages processes in logical queues:
- Ready queue: waiting for CPU time
- Wait/blocked queue: waiting on I/O or an event
- Priority / multilevel feedback queues: used by more advanced schedulers

Common algorithms: FCFS · SJF · Round Robin · Priority · Multilevel Feedback Queue

A pipe is a kernel-managed, in-memory, circular buffer that lets two processes communicate without any disk I/O. It exposes two file descriptors:

    [write end fd[1]] ──────────────────► [read end fd[0]]
     (producer writes)   kernel buffer     (consumer reads)

Key properties:
- Unidirectional: data flows in one direction only
- FIFO: bytes are read in the exact order they were written
- Blocking: `write()` blocks when the buffer is full; `read()` blocks when it is empty
- Auto-EOF: the read end receives EOF when all write ends are closed (this is why unused pipe ends must always be closed!)
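
The "buffer full" condition behind the blocking property can be probed without actually blocking. The sketch below is illustrative only: `pipe_capacity` is a hypothetical name, and it relies on `fcntl()` with `O_NONBLOCK`, which is not among the functions the 42 subject allows and is used here purely for demonstration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Fills a fresh pipe in non-blocking mode until write() fails with EAGAIN,
 * which marks the point where a normal (blocking) write() would go to sleep.
 * Returns the number of bytes the pipe accepted, or -1 on error.
 */
long pipe_capacity(void)
{
    int fd[2];
    char chunk[1024] = {0};
    long total = 0;
    ssize_t n;
    int flags;
    int err;

    if (pipe(fd) == -1)
        return -1;
    flags = fcntl(fd[1], F_GETFL);
    if (flags == -1 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == -1)
    {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    while ((n = write(fd[1], chunk, sizeof(chunk))) > 0)
        total += n;
    err = errno;                       /* save before close() can change it */
    close(fd[0]);
    close(fd[1]);
    if (n == -1 && err != EAGAIN && err != EWOULDBLOCK)
        return -1;
    return total;
}
```

The value returned depends on the system; on Linux the default pipe buffer is commonly 64 KiB.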

| | Unnamed pipe (`pipe()`) | Named pipe / FIFO (`mkfifo()`) |
|---|---|---|
| Created by | `pipe(int fd[2])` | `mkfifo(path, mode)` |
| Visible in filesystem | No | Yes (appears as a special file) |
| Usable between unrelated processes | No: requires a common ancestor | Yes: any process can open by path |
| Lifetime | Until all FDs are closed | Until explicitly deleted |
| Typical use | Parent ↔ child IPC | Long-lived inter-process channels |

pipex uses unnamed pipes, the classic approach for short-lived parent-child communication.
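
For contrast with the unnamed pipe pipex uses, a named pipe can be exercised as follows. This is only a sketch: `fifo_roundtrip` is a hypothetical name, `mkfifo()` and `kill()` are outside the 42 subject's allowed list, and although the writer here is a forked child for convenience, the two sides share nothing but the filesystem path.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Creates a FIFO at `path`, has a child open it by name and write `msg`,
 * reads the message back in the parent, then removes the FIFO.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t fifo_roundtrip(const char *path, const char *msg,
                       char *out, size_t out_size)
{
    pid_t pid;
    int rfd;
    size_t total = 0;
    ssize_t n = 1;

    if (out_size == 0 || mkfifo(path, 0600) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
    {
        unlink(path);
        return -1;
    }
    if (pid == 0)
    {
        int wfd = open(path, O_WRONLY);   /* blocks until a reader opens it */

        if (wfd == -1 || write(wfd, msg, strlen(msg)) == -1)
            _exit(1);
        close(wfd);
        _exit(0);
    }
    rfd = open(path, O_RDONLY);           /* blocks until the writer opens it */
    if (rfd == -1)
    {
        kill(pid, SIGKILL);               /* writer would otherwise wait forever */
        waitpid(pid, NULL, 0);
        unlink(path);
        return -1;
    }
    while (n > 0 && total < out_size - 1)
    {
        n = read(rfd, out + total, out_size - 1 - total);
        if (n > 0)
            total += (size_t)n;
    }
    close(rfd);
    waitpid(pid, NULL, 0);
    unlink(path);                         /* a FIFO persists until deleted */
    if (n == -1)
        return -1;
    out[total] = '\0';
    return (ssize_t)total;
}
```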
The 42 subject restricts which library functions may be used. The system calls and library functions exercised in this project are:

    open  close  read  write      # file I/O
    malloc  free                  # memory management
    perror  strerror              # error reporting
    access                        # permission checking
    dup  dup2                     # FD duplication / redirection
    pipe                          # IPC pipe creation
    fork  wait  waitpid           # process creation & synchronization
    execve                        # process image replacement

Run pipex and compare its output directly against the shell:

    echo "hello world" > infile
    ./pipex infile "cat" "wc -w" out_pipex
    < infile cat | wc -w > out_shell
    diff out_pipex out_shell   # should produce no output (files are identical)

    # grep + wc
    ./pipex /etc/passwd "grep root" "wc -l" result.txt
    < /etc/passwd grep root | wc -l

    # cat + head
    ./pipex /etc/hosts "cat" "head -3" result.txt
    < /etc/hosts cat | head -3

    # Command not found (verify error message)
    ./pipex infile "nonexistentcmd" "wc -l" result.txt

    # Missing infile (verify error message)
    ./pipex no_such_file "cat" "wc -l" result.txt

    # Permission denied on outfile
    touch locked && chmod 000 locked
    ./pipex infile "cat" "wc -l" locked

    # Compare every cmd1/cmd2 combination against the shell
    for cmd1 in "cat" "grep a" "sort"; do
      for cmd2 in "wc -l" "wc -w" "head -1"; do
        ./pipex infile "$cmd1" "$cmd2" /tmp/out_pipex
        eval "< infile $cmd1 | $cmd2 > /tmp/out_shell"
        diff /tmp/out_pipex /tmp/out_shell \
          && echo "PASS: $cmd1 | $cmd2" \
          || echo "FAIL: $cmd1 | $cmd2"
      done
    done

Completing this project put the following low-level Unix concepts into practice:
- Inter-process communication (IPC) via anonymous pipes
- File descriptor manipulation with `dup2()` for I/O redirection
- Process forking and the parent/child relationship model
- PATH environment variable parsing for dynamic binary resolution
- `execve()` semantics: the process image replacement model
- Zombie prevention with `waitpid()`
- Deadlock avoidance: closing all unused pipe ends so EOF propagates correctly
- Resource cleanup: freeing every heap allocation before process exit
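
As a brief illustration of the zombie-prevention point, the sketch below forks a child, lets it exit, and reaps it. `reap_child` is a hypothetical name used only for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that exits immediately with `code`, then reaps it with
 * waitpid() so no zombie entry is left in the process table.
 * Returns the child's exit status, or -1 on error or abnormal termination.
 */
int reap_child(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                    /* child: terminate right away */
    if (waitpid(pid, &status, 0) == -1) /* parent: collect the exit status */
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                          /* e.g. the child was killed by a signal */
}
```

Until the parent calls `waitpid()`, a terminated child remains in the process table as a zombie; the call both releases that entry and delivers the exit status.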

