Homework 4 (Due Mar. 28, 2016)
XV6
In this homework, you will gain a broad knowledge of xv6, based
on the original UNIX, Sixth Edition (also called Version 6 UNIX).
This was the first version of UNIX to be widely released outside
Bell Labs. It was released in May 1975 for the DEC PDP-11 minicomputer.
You can refer to the hyper-referenced Doxygen copy.
One other pointer that will help you is that it often pays to read the
(relatively short) .h file before reading the corresponding .c file. The
.h file provides the data structures and the names of functions that
manipulate them.
Your job is to read the source code, and answer the following questions.
For those questions of the form "Where is", you must indicate:
- the filename;
- the line number within that file; and
- the data structure or function name
PROCESSES:
- Where is the data structure for the process table?
- When there is a context switch from one process to another, where
are the values of the registers of the old process saved?
- What are the possible states of a process? Also, give a brief
phrase describing the purpose of each state.
- What is the function that does a context switch between two
processes?
- Explain how the context switch function works.
- What function calls the context switch function? Explain in
detail what the calling function does. (The doxygen hyper-linking is
not perfect here. You may have to use 'grep' on
/course/cs3650/unix-xv6/* )
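Before reading the kernel's context switch, it may help to see the same
idea at user level. The sketch below is only an analogy and is not taken
from xv6: it assumes a Linux/glibc machine and uses the ucontext routines
(getcontext, makecontext, swapcontext). The names task and
run_two_switches are made up for illustration. Each call to swapcontext
saves the registers of the running context into its first argument and
loads the registers stored in its second argument.

```c
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;   /* saved registers live here */
static char task_stack[64 * 1024];      /* private stack for the task */
static int steps;

/* Runs on task_stack; yields to main once, then finishes. */
static void task(void)
{
    steps++;
    swapcontext(&task_ctx, &main_ctx);  /* save task, resume main */
    steps++;
}

/* Switches into the task twice and reports how far it got. */
int run_two_switches(void)
{
    steps = 0;
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;       /* resume main when task returns */
    makecontext(&task_ctx, task, 0);

    swapcontext(&main_ctx, &task_ctx);  /* task runs until it yields */
    swapcontext(&main_ctx, &task_ctx);  /* task resumes and finishes */
    return steps;
}
```

When you answer the questions above, compare where this sketch keeps the
saved registers (the ucontext_t variables) with where xv6 keeps them.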
PROCESS STARTUP:
- Suppose a new program is going to start. This requires a call to the
system call, exec(). On what lines does the operating system create the
first call frame, to be used by the user process's main()?
- The first call frame must have local variables argc and argv. Where
is the value of argv found in the exec() call?
- On what lines does the function create the process table entry for
the new process?
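As a reminder of what main() expects in that first call frame: the C
standard guarantees that argv[argc] is a null pointer, so argc can always
be recovered by walking argv. The helper below, count_args, is a
hypothetical name used only for this illustration; it is not part of xv6.

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated argv array.
   The result is the value main() would receive as argc. */
int count_args(char *const argv[])
{
    int argc = 0;
    while (argv[argc] != NULL)
        argc++;
    return argc;
}
```

Whatever exec() builds for the new process must give main() an argv array
of this shape.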
SYSTEM CALLS:
In class, we discussed how a system call (e.g., open()) is really a function
in the C runtime library, libc.so, and that function then calls 'syscall()'
with the integer for the 'open' system call. This is similar to when you
use 'syscall' in the MARS assembler, and you put the system call number in
register $v0 before the call.
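You can try the same idea from C on a Linux machine (an assumption: this
sketch uses glibc's syscall() wrapper and the SYS_getpid constant from
<sys/syscall.h>, and it does not run on xv6 or in MARS). getpid is used
instead of open only because it takes no arguments. The function name
raw_getpid is made up for illustration.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Ask the kernel for our process id by system call number,
   bypassing the ordinary getpid() library wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Both paths should reach the same kernel code, so raw_getpid() should
return the same value as getpid().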
In these questions, we will follow the chain of control from a user program
that calls 'open()' to the code in 'libc' to the syscall in the kernel, and
finally to the function in the kernel that actually does the work of the
'open()' system call.
- The file grep.c makes a call to 'open()'. The definition of 'open()'
is inside 'usys.S'. It makes use of the macro 'SYSCALL'.
Note that inside the macro, '$SYS_ ## name' uses the preprocessor's
token-pasting operator '##'. It expands to the concatenation
of 'SYS_' and the value of the macro parameter, "name".
The assembly instruction 'int' is the interrupt instruction in
x86 assembly. The 'int' assembly instruction takes as an argument
an integer, 'T_SYSCALL'.
The code in usys.S is complex because it uses the C preprocessor.
But, roughly, SYSCALL(open) will expand to the assembly code
in lines 4 through 9 of usys.S, where the (x86 assembly) instruction:
"movl $SYS_ ## name, %eax"
expands to:
"movl $SYS_open, %eax"
The value of SYS_open can be found in the include file, "syscall.h".
The instruction:
"int $T_SYSCALL"
uses information from "traps.h". The "int" instruction is an
"interrupt" instruction. It transfers control to the kernel at the
handler address for interrupt number 64 (found in traps.h).
If you do
"grep SYS_open /course/cs5600sp16/resources/unix-xv6-source/*"
it will lead you to:
/course/cs5600sp16/resources/unix-xv6-source/syscall.c
That file defines the "syscalls" array, which is used by the
function "syscall".
Finally, here is the question:
Give the full details of how a call to 'open()' in grep.c will
call the function 'sys_open()' in sysfile.c, inside the operating
system kernel.
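The two mechanisms mentioned above, token pasting with '##' and an array
indexed by system call number, can be tried in a small stand-alone C
sketch. This is a toy under stated assumptions, not the xv6 code: the
TOY_ prefix, the call numbers, and the names TOYNUM, toy_open, toy_close,
handlers, and dispatch are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical call numbers, standing in for those in syscall.h. */
#define TOY_open  1
#define TOY_close 2

/* Token pasting: TOYNUM(open) becomes TOY_open, which is then 1. */
#define TOYNUM(name) TOY_ ## name

static int toy_open(void)  { return 100; }
static int toy_close(void) { return 200; }

/* Table of handlers indexed by call number; slot 0 is left NULL. */
static int (*const handlers[])(void) = {
    [TOY_open]  = toy_open,
    [TOY_close] = toy_close,
};

/* Look up the handler for num and call it; -1 for unknown numbers. */
int dispatch(int num)
{
    size_t n = sizeof handlers / sizeof handlers[0];
    if (num > 0 && (size_t)num < n && handlers[num] != NULL)
        return handlers[num]();
    return -1;
}
```

Here dispatch(TOYNUM(open)) calls toy_open. Your answer should explain
how the real kernel obtains the number and which table it consults.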
FILES AND FILE DESCRIPTORS:
In class, we've talked about files and file descriptors. We have
not yet discussed i-nodes. For these questions, you can think of
an i-node as a location on disk that has the "table of contents"
for all information about a file.
In these questions, we will follow the chain of control from
open() to a file descriptor, to a "file handle" (including the
offset into the file), to the i-node.
- The function 'sys_open()' returns a file descriptor 'fd'.
To do this, it allocates a new file handle with 'filealloc()',
and it allocates a new file descriptor with 'fdalloc()'.
Where is the file descriptor allocated? Also, you will see that
the file descriptor is one entry in an array. What is the algorithm
used to choose which entry in the array to use for the new file descriptor?
[ Comment: The name 'NOFILE' means "number of files". "No." is sometimes
used as an abbreviation for the word "number". ]
- As you saw above, the file descriptor turned out to be an index
in an array. What is the name of the array for which the file
descriptor is an index? Also, what is the type of one entry in
that array?
- The type that you saw in the above question is what I was calling
a "file handle" (with an offset into the file, etc.).
What is the name of the field that holds the offset into the file?
We saw it in the function 'sys_open()'.
- Remember when we mentioned a call to 'filealloc()' above?
Since the return value of 'filealloc()' is only a file handle,
we need to initialize it. Presumably, we will initialize it with
a file offset of 0. What is the line number in 'sys_open()' where
we initialize the file offset to 0?
- The file handle type was initialized to 'FD_INODE'. What are the
other types that it could have been initialized to?
- Suppose a file handle had been initialized to FD_PIPE. Find the
'struct' that holds the information about a pipe. For each field
in that struct, explain briefly (just a phrase) the purpose of that
field.
- By examining the function 'sys_dup()', you can discover how a
system call to 'dup()' will manipulate both a file descriptor
and a "file handle". Describe what it does in each of the
two cases.
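The difference between a file descriptor and a "file handle" can also be
observed from user space. The sketch below assumes a POSIX host such as
Linux (it uses mkstemp and lseek, which xv6 does not provide), and the
function name dup_shares_offset is made up for illustration. It checks
that two descriptors produced by dup() see the same file offset.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if a dup()'ed descriptor shares its file offset with
   the original descriptor, 0 if it does not, and -1 on error. */
int dup_shares_offset(void)
{
    char path[] = "/tmp/dupdemoXXXXXX";
    int fd = mkstemp(path);             /* create and open a temp file */
    if (fd < 0)
        return -1;
    unlink(path);                       /* file vanishes once closed */
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    int fd2 = dup(fd);                  /* second descriptor, same file */
    if (fd2 < 0) {
        close(fd);
        return -1;
    }
    off_t moved = lseek(fd, 2, SEEK_SET);   /* move offset through fd */
    off_t seen = lseek(fd2, 0, SEEK_CUR);   /* read offset through fd2 */
    close(fd);
    close(fd2);
    return (fd2 != fd && moved == 2 && seen == 2) ? 1 : 0;
}
```

On a POSIX system this is expected to return 1: the two descriptors are
different integers, but the offset is stored once and shared. Your answer
for 'sys_dup()' should say where xv6 stores each of the two.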