CS341 Notes

Note: this post was written 102 days ago and last modified 36 days ago; some of the information may be out of date.

file descriptor:

0 → standard input (stdin)
1 → standard output (stdout)
2 → standard error (stderr)

dup() and dup2()

dup(fd) → creates a new file descriptor number pointing to the same
file as fd.
dup2(fd, newfd) → same, but you choose the new descriptor number
(closes it first if open).

File Descriptor

It’s just a number the OS gives you when you open a file.

Think of it as a ticket number to access the file.

int fd = open("data.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
// fd might be 3 because 0,1,2 are stdin, stdout, stderr
write(fd, "Hello\n", 6); // writes to the file via fd

dup(fd)

Makes a copy of the descriptor → new number, same file.

int newfd = dup(fd);  // creates newfd (e.g., 4) same file as fd
write(newfd, "World\n", 6); // continues writing at same position

dup2(fd, newfd)

Copies fd into exactly newfd.

If newfd was open, it closes it first.

dup2(fd, STDOUT_FILENO); // now printf goes into fd's file
printf("This goes into the file!\n");
fflush(stdout); // stdout is buffered, so flush to make sure the text reaches the file

unistd.h

It declares many system call interfaces for Unix-like operating systems (Linux, macOS, BSD, etc.).

Useful functions:

strlen() → declared in <string.h> (not unistd.h); handy for computing the byte count to pass to write().

open(...)

flags → control the behavior of opening the file (read/write mode, create, truncate, append, etc.).

mode (type mode_t) → sets the permissions of the file, and only applies when the call actually creates the file (O_CREAT); the process umask is then applied to it.

How to read a file

#include "source.h"
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
// Read a file line by line and extract the quoted title from each row.
// Remember to free() the buffer that getline() allocates.
void find_movie_score() {
    char* buffer = NULL;
    size_t capacity = 0;
    FILE* fp = fopen("movies.csv", "r");
    if (fp == NULL) {
        perror("fopen");
        return;
    }
    int year;
    int score;
    char movie[30];
    getline(&buffer, &capacity, fp); // skip the first line (presumably the CSV header)
    while (getline(&buffer, &capacity, fp) != -1) {
        // Read up to 29 characters, stopping if a " is seen.
        // 29 is one less than the 30-byte buffer, which leaves room for the terminating '\0'.
        int result = sscanf(buffer, "%d,%d,\"%29[^\"]\"", &year, &score, movie);
        if (result != 3) {
            continue; // malformed row: movie may not have been filled in
        }
        // Compare strings with strcmp(); == would only compare pointers.
        if (strcmp(movie, "New_York,_New_York") == 0) {
            printf("%d", score);
            break;
        }
    }
    free(buffer);
    fclose(fp);
}

Network

Open Systems Interconnection 7 layer model (OSI Model)
- Layer 1: The Physical Layer. These are the actual waves that carry the bauds across the wire. As an aside, bits don’t cross the wire because in most mediums you can alter two characteristics of a wave – the amplitude and the frequency – and get more bits per clock cycle.
- Layer 2: The Link Layer. This is how each of the agents reacts to certain events (error detection, noisy channels, etc). This is where Ethernet and WiFi live.
- Layer 3: The Network Layer. This is the heart of the Internet. The bottom two layers deal with communication between two computers that are directly connected. This layer deals with routing packets from one endpoint to another.
- Layer 4: The Transport Layer. This layer specifies how the slices of data are received. The bottom three layers make no guarantee about the order that packets are received and what happens when a packet is dropped. Using different protocols, this layer can.
- Layer 5: The Session Layer. This layer makes sure that if a connection in the previous layers is dropped, a new connection in the lower layers can be established, and it looks like nothing happened to the end-user.
- Layer 6: The Presentation Layer. This layer deals with encryption, compression, and data translation. For example, portability between different operating systems, such as translating Unix newlines to Windows newlines.
- Layer 7: The Application Layer. HTTP and FTP are both defined at this level. This is typically where we define protocols across the Internet. As programmers, we only go lower when we think we can create algorithms that are more suited to our needs than all of the below.

int required_family = AF_INET; // Change to AF_INET6 for IPv6. "AF" stands for Address Family.

struct ifaddrs *myaddrs, *ifa; // struct ifaddrs holds information about one network interface (e.g., eth0, wlan0)

// getifaddrs() retrieves a linked list of all network interfaces (both active and inactive) on your system.
if (getifaddrs(&myaddrs) != 0) {
  perror("getifaddrs");
  exit(1);
}
char host[256], port[256];

for (ifa = myaddrs; ifa != NULL; ifa = ifa->ifa_next) {
  if (ifa->ifa_addr == NULL) continue; // some interfaces have no address; check before dereferencing
  int family = ifa->ifa_addr->sa_family;
  if (family == required_family) {
    int ret = getnameinfo(ifa->ifa_addr,
    (family == AF_INET) ? sizeof(struct sockaddr_in) :
    sizeof(struct sockaddr_in6),
    host, sizeof(host), port, sizeof(port),
    NI_NUMERICHOST | NI_NUMERICSERV);
    if (0 == ret) {
      puts(host);
    }
  }
}
freeifaddrs(myaddrs); // release the list allocated by getifaddrs()

getnameinfo() converts a binary network address into a text string.

Parameters:

ifa->ifa_addr: the raw socket address structure.

The second argument: size of the structure — IPv4 and IPv6 have different sizes.

host: buffer to store the IP address.

port: buffer to store the port number.

Flags NI_NUMERICHOST | NI_NUMERICSERV mean:

NI_NUMERICHOST: return the numeric IP (like "192.168.1.10") instead of a hostname.

NI_NUMERICSERV: return the numeric port (like "80") instead of service name (like "http").

To get your IP Address from the command line use ifconfig or Windows’ ipconfig.

To look up the IP address of a remote website, use getaddrinfo. It can convert a human-readable domain name (e.g. www.illinois.edu) into IPv4 and IPv6 addresses, and it returns a linked list of addrinfo structs:

struct addrinfo {
  int              ai_flags;
  int              ai_family;
  int              ai_socktype;
  int              ai_protocol;
  socklen_t        ai_addrlen;
  struct sockaddr *ai_addr;
  char            *ai_canonname;
  struct addrinfo *ai_next;
};

First, use getaddrinfo to build a linked-list of possible connections. Secondly, use getnameinfo to convert the binary address of one of those into a readable form.

  struct addrinfo hints, *infoptr;
  memset(&hints, 0, sizeof(hints)); // zero every field before setting the ones we need
  hints.ai_family = AF_INET; // AF_INET means IPv4 only addresses

  // Get the machine addresses
  int result = getaddrinfo("www.bbc.com", NULL, &hints, &infoptr);
  if (result) {
    fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(result));
    exit(1);
  }

  struct addrinfo *p;
  char host[256];

  for(p = infoptr; p != NULL; p = p->ai_next) {
    // Get the name for all returned addresses
    getnameinfo(p->ai_addr, p->ai_addrlen, host, sizeof(host), NULL, 0, NI_NUMERICHOST);
    puts(host);
  }

  freeaddrinfo(infoptr); // release the list allocated by getaddrinfo

The socket call creates a network socket and returns a descriptor that can be used with read and write. In this sense, it is the network analog of open, which returns a descriptor for a file, except that we haven’t connected the socket to anything yet!

int socket(int domain, int socket_type, int protocol);

connect to an address

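struct addrinfo current, *result;
memset(&current, 0, sizeof(struct addrinfo));
current.ai_family = AF_INET;
current.ai_socktype = SOCK_STREAM;

int rc = getaddrinfo(info->hostname, info->port, &current, &result);
if (rc != 0) {
  fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
  exit(1);
}

// sock_fd is a descriptor returned earlier by socket(AF_INET, SOCK_STREAM, 0)
if (connect(sock_fd, result->ai_addr, result->ai_addrlen) == -1) {
  perror("connect");
  exit(1);
}

freeaddrinfo(result);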

Simple TCP server:

socket, bind, listen, accept

socket programming (server and client)

Process memory layout

OS Kernel Space
Stack
...
Heap
BSS
Data
Text
...

OS Kernel Space

Reserved for the operating system kernel and device drivers.

User processes cannot access this space directly (for protection and stability).

Contains critical code and data structures needed to manage the system.

Stack

Grows downwards in memory (toward lower addresses) on most common architectures, such as x86.

Stores function call information: return addresses, local variables, and function parameters.

Managed automatically by the compiler and runtime.

Each thread in a process gets its own stack.

Heap

Grows upwards in memory (toward higher addresses).

Used for dynamically allocated memory (e.g., via malloc in C, new in C++).

Managed by the programmer (or garbage collector in managed languages).

Can fragment over time due to frequent allocations and deallocations.

BSS (Block Started by Symbol)

Contains uninitialized global and static variables.

Typically zero-initialized at program startup.

For example: static int count; (without an initializer).

Data Segment

Contains initialized global and static variables.

Divided into:

Read/write section: for variables that can change during execution.

Read-only section: for constants that must not change.

Example: static int count = 5;

Text Segment (Code Segment)

Stores the compiled program instructions (machine code).

Usually marked as read-only to prevent accidental modification of code.

May be shared among processes running the same program to save memory.

VIM

Press Esc (goes back to Normal mode).
Then type one of these commands (they start with :):

:w → Save (write) the file.
:q → Quit.
:wq → Save and quit.
:q! → Quit without saving.

Arrow keys → work as expected.
Or use the classic Vim keys:

h → move left
l → move right
j → move down
k → move up

Moving by words

w → jump forward to the beginning of the next word
e → jump to the end of the current/next word
b → jump backward to the beginning of a word

Moving by lines

0 (zero) → move to the start of the line
^ → move to the first non-space character of the line
$ → move to the end of the line

Moving by text blocks

gg → go to the top of the file
G → go to the bottom of the file
:n → jump to line n (example: :15 goes to line 15)

Searching

/word → search forward for “word”
?word → search backward for “word”
n → repeat the last search in the same direction
N → repeat the last search in the opposite direction

Scrolling

Ctrl + d → scroll half a page down
Ctrl + u → scroll half a page up
Ctrl + f → scroll one full page down
Ctrl + b → scroll one full page up

Split Vertically or Horizontally:

vim -O main.c utils.h // vertical split
vim -o main.c utils.h // horizontal split
:vsplit utils.h // inside vim
vim -p main.c utils.h // tabs alternative

Moving Between Splits

Ctrl-w h → move to left split

Ctrl-w l → move to right split

Ctrl-w j → move to split below

Ctrl-w k → move to split above

Ctrl-w w → cycle through all windows

Resizing Splits

Ctrl-w = → equalize all splits

Ctrl-w > → increase width

Ctrl-w < → decrease width

Ctrl-w + → increase height

Ctrl-w - → decrease height

GDB

break main // set a breakpoint at function main()
delete // delete all breakpoints
layout src // open the text UI (TUI) with the source window
info breakpoints // list breakpoint information

Cursor Movement

Keys    Action
Ctrl-a    Move to beginning of line
Ctrl-e    Move to end of line
Ctrl-b    Move backward one character (like ←)
Ctrl-f    Move forward one character (like →)
Alt-b    Move backward one word
Alt-f    Move forward one word

Editing Text

Keys    Action
Ctrl-d    Delete character under cursor
Backspace    Delete character before cursor
Ctrl-k    Kill (cut) text from cursor to end of line
Ctrl-u    Kill (cut) text from cursor to beginning of line
Alt-d    Kill (cut) word after cursor
Alt-Backspace    Kill (cut) word before cursor

History Navigation

Keys    Action
Ctrl-p    Previous command (like ↑)
Ctrl-n    Next command (like ↓)
Ctrl-r    Reverse search in history
Ctrl-g    Cancel search

Misc

Keys    Action
Ctrl-l    Clear the screen (like clear)
Ctrl-y    Yank (paste) last killed text
Ctrl-t    Swap character before cursor with current one

Get started with your assignment

// open session for: fa25-cs341-276.cs.illinois.edu
cd fa25_cs341_NETID
git remote add release https://github.com/illinois-cs-coursework/fa25_cs341_.release.git
git pull release main --allow-unrelated-histories --no-rebase
git push origin main

git add submission.txt
git commit -m "My Submission" (commit your work to git).
git push origin main (push your committed work to Github – Required for your changes to be visible to the Autograder).

WSL

If you want to visit some files in your WSL via Windows file explorer, use

\\wsl$\Ubuntu\home\

Valgrind

valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes ./shell
valgrind --leak-check=full --show-leak-kinds=all ./your_program
