Homework 1: Simple UNIX programs

This assignment will make you more familiar with how to build simple Unix programs, create Makefiles, look at assembly code, and use the GDB debugger. The things you learn will prepare you for the more complex environment of the xv6 kernel required for future homework. You can do this assignment on any operating system that supports the Unix API (Linux Openlab machines, your laptop running Linux or a Linux VM, and even macOS). You don't need to set up xv6 for this assignment. Submit your work through Gradescope (see instructions at the bottom of this page).

For Mac / OSX users: support for 32-bit applications is deprecated in the latest versions of your system. If you have already updated to macOS Catalina or have updated Xcode, we recommend that you do the homework on the Openlab machines.

Part 1: Simple UNIX programs

Download main.c and look it over. This is a skeleton for a simple UNIX program. To compile main.c, you need a C compiler, such as gcc. On Openlab machines, you can compile the skeleton with the following command:

    $ gcc main.c

This will produce an a.out file, which you can run:

    $ ./a.out

Alternatively, you can pass an additional option to gcc to give a more meaningful name to the compiled binary:

    $ gcc main.c -o hello

Here gcc will compile your program as hello. In the rest of the assignment you will explore how to automate program development with Makefiles, learn how to debug your code with GDB, and disassemble the program to verify your understanding of assembly language.

Part 2: Simple Makefiles

This part of the homework is adapted from https://opensource.com/article/18/8/what-how-makefile and http://mrbook.org/blog/tutorials/make/. It aims to introduce you to the basics of Makefiles and the make tool, which provides a way to compile complex software projects like xv6 and the Linux kernel.

If you want to run or update a task when certain files are updated, the make utility can come in handy. The make utility requires a file, Makefile (or makefile), which defines a set of tasks to be executed. You may have used make to compile a program from source code. Most open source projects use make to compile a final executable binary, which can then be installed using make install.

We'll explore make and Makefile using basic and advanced examples. Before you start, ensure that make is installed on your system.

Note: we will create three different makefiles (Makefile1, Makefile2, and Makefile3) in this part.

Let's start by printing the classic "Hello World" on the terminal. Create an empty directory myproject containing a file Makefile with this content:

    say_hello:
    	echo "Hello World"

Now run the file by typing make inside the directory myproject. The output will be:

    $ make
    echo "Hello World"
    Hello World

In the example above, say_hello behaves like a function name, as in any programming language. This is called the target. The prerequisites or dependencies follow the target. For the sake of simplicity, we have not defined any prerequisites in this example. The command echo "Hello World" is called the recipe. The recipe uses prerequisites to make a target. The target, prerequisites, and recipes together make a rule.

To summarize, below is the syntax of a typical rule. Each recipe line must begin with a tab character, not spaces:

    target: prerequisites
    <TAB> recipe

As an example, a target might be a binary file that depends on prerequisites (source files).
On the other hand, a prerequisite can also be a target that depends on other dependencies:

    final_target: sub_target final_target.c
    	Recipe_to_create_final_target

    sub_target: sub_target.c
    	Recipe_to_create_sub_target

It is not necessary for the target to be a file; it could be just a name for the recipe, as in our example. We call these "phony targets."

Going back to the example above, when make was executed, the entire command echo "Hello World" was displayed, followed by the actual command output. We often don't want that. To suppress echoing the command itself, we need to start the recipe line with @:

    say_hello:
    	@echo "Hello World"

Now try to run make again. The output should display only this:

    $ make
    Hello World

Let's add a few more phony targets, generate and clean, to the Makefile:

    say_hello:
    	@echo "Hello World"

    generate:
    	@echo "Creating empty text files..."
    	touch file-{1..10}.txt

    clean:
    	@echo "Cleaning up..."
    	rm *.txt

If we try to run make after the changes, only the target say_hello will be executed. That's because the first target in the makefile is the default target. It is often called the default goal, and this is the reason you will see all as the first target in most projects. It is the responsibility of all to call other targets. We can override this behavior using a special variable called .DEFAULT_GOAL. Let's include that at the beginning of our makefile:

    .DEFAULT_GOAL := generate

This will run the target generate as the default:

    $ make
    Creating empty text files...
    touch file-{1..10}.txt

As the name suggests, .DEFAULT_GOAL can name only one target at a time. This is why most makefiles include all as a target that can call as many targets as needed. Let's include the phony target all and remove .DEFAULT_GOAL:

    all: say_hello generate

    say_hello:
    	@echo "Hello World"

    generate:
    	@echo "Creating empty text files..."
    	touch file-{1..10}.txt

    clean:
    	@echo "Cleaning up..."
    	rm *.txt

Before running make, let's include another special target, .PHONY, where we list all the targets that are not files. For these targets, make will run the recipe regardless of whether a file with that name exists or what its last modification time is. Here is the complete makefile:

    .PHONY: all say_hello generate clean

    all: say_hello generate

    say_hello:
    	@echo "Hello World"

    generate:
    	@echo "Creating empty text files..."
    	touch file-{1..10}.txt

    clean:
    	@echo "Cleaning up..."
    	rm *.txt

Running make should now call say_hello and generate:

    $ make
    Hello World
    Creating empty text files...
    touch file-{1..10}.txt

It is good practice not to call clean in all or to put it as the first target. clean should be invoked manually, as the first argument to make, when cleaning is needed:

    $ make clean
    Cleaning up...
    rm *.txt

Now that you have an idea of how a basic makefile works and how to write a simple makefile, let's look at some more advanced examples.

Something more real

Now let's try to create a simple Makefile that we can use to compile our programs:

    all:
    	gcc main.c -o hello

Now you can run:

    make

Instead of running GCC manually, the Makefile lets you compile main.c into the hello program. In our example the only target in the Makefile is all. The make utility will try to resolve this target if no other targets are specified. Finally, make compiles the program according to the command line we gave it.

When you submit your work, rename this makefile to Makefile1. You can always pass a custom makefile name to make with the -f option:

    make -f Makefile1

Using dependencies

Sometimes it is useful to use different targets. It makes the Makefile more modular and allows assembling a complex project from multiple pieces.

    all: hello

    hello: main.o
    	gcc main.o -o hello

    main.o: main.c
    	gcc -c main.c

    clean:
    	rm *.o hello

Now we see that the target all has only one dependency (i.e., hello), but no system commands. In order for make to execute correctly, it has to meet all the dependencies of the called target.
Each of the dependencies is looked up among the available targets and executed if found. In this example we also see a target called clean. It is useful to have such a target if you want a fast way to get rid of all the object files and executables:

    make clean

Using variables and comments

You can also use variables when writing Makefiles. They come in handy in situations where you want to change the compiler or the compiler options (create this makefile and submit it as Makefile3).

    # This is how you write comments
    # Use gcc as a compiler
    CC=gcc
    # CFLAGS will be the options we'll pass to the compiler
    CFLAGS=-Wall

    all: hello

    hello: main.o
    	$(CC) $(CFLAGS) main.o -o hello

    main.o: main.c
    	$(CC) -c $(CFLAGS) main.c

    clean:
    	rm *.o hello

Variables can be very useful. To use them, just assign a value to a variable before you start writing your targets. After that, you can use them with the dereference operator $(VAR).

If you want to know more...

With this brief introduction to Makefiles, you can create some very sophisticated mechanisms for compiling your projects. However, this is just the tip of the iceberg. More documentation is available here: Make documentation.

Here is an example of a more automated Makefile that you might use in one of your projects (not required for this homework).

    CC=gcc
    # Compiler flags
    CFLAGS=-Wall
    # Linker flags
    LDFLAGS=
    # You can add multiple source files here separated with spaces
    SOURCES=main.c
    # Replace .c with .o creating a list of object files
    OBJECTS=$(SOURCES:.c=.o)
    # Name executable
    EXECUTABLE=hello

    all: $(SOURCES) $(EXECUTABLE)

    $(EXECUTABLE): $(OBJECTS)
    	$(CC) $(LDFLAGS) $(OBJECTS) -o $@

    # Rule that tells make how to make an object file out of a .c file
    .c.o:
    	$(CC) -c $(CFLAGS) $< -o $@

    clean:
    	rm *.o $(EXECUTABLE)

What to submit

Three makefiles (Makefile1, Makefile2, and Makefile3) you created to compile your hello example.

Part 3: Debugging programs with GDB

On UNIX systems the main debugger is GDB (GNU debugger).
To be able to comfortably debug your code, compile it with the -g option, which instructs the compiler to generate debug symbols (variable names, source lines, etc.) for the program. For example, change your Makefile to have

    CFLAGS=-Wall -g -m32 -fno-pic

This will compile your hello program with debugging symbols (-g flag), as a 32-bit x86 executable (-m32 flag), and for simplicity avoid generating position independent code (-fno-pic flag). Then you can start your program under the control of gdb:

    gdb hello

This starts gdb ready to execute your hello program. To get it running, type the run command at the GDB prompt (or r, short for run):

    (gdb) run

Now the program runs and finishes after printing "Hello world".

GDB is a feature-rich debugger, and it will take you some time to learn all the features. Here are a couple of starting points: GDB tutorial, GDB intro and GDB Cheat Sheet.

Probably the best resource for this homework is Operating Systems from 0 to 1, Chapter 6, Runtime Inspection and Debug (it is strongly recommended to read this chapter).

At a high level you need only two main things: 1) breakpoints and 2) the ability to examine data. Breakpoints can be set with the "b" command inside gdb.

Breakpoints and single stepping

Just to make debugging a bit more realistic, let's add another function to our simple program. Let's change it to compute the sum of the numbers from 0 to n-1. You can do this by implementing the following function

    unsigned long sum(int n) {
        int i;
        unsigned long sum = 0;

        for (i = 0; i < n; i++) {
            sum = sum + i;
        }

        return sum;
    }

and calling it from main()

    int main(void) {

        unsigned long s;

        s = sum(100);
        printf("Hello world, the sum:%ld\n", s);
        return 0;
    }

Running the program on its own is not that useful. Let's try setting a breakpoint on the main function to examine what the program is actually doing. For that, type break main:

    (gdb) break main
    Breakpoint 1 at 0x56b: file main.c, line 26.
    (gdb) r
    Starting program: ...

    Breakpoint 1, main () at main.c:26
    26          s = sum(100);
    (gdb)

The debugger stopped at the beginning of the main function (line 26 of main.c). You can examine the source code of the program by typing list or l:

    (gdb) list
    21
    22      int main(void) {
    23
    24          unsigned long s;
    25
    26          s = sum(100);
    27          printf("Hello world, the sum:%ld\n", s);
    28          return 0;
    29      }
    30

Now you can execute the program line by line by typing next (execute the next line, or n for short), and step (s) to step into a function. Try stepping into the sum function by running step:

    (gdb) s
    sum (n=100) at main.c:13
    13          unsigned long sum = 0;

Here I'm inside the sum function, where I type l to list the source code, and then type n to execute it line by line (I type n once, and then simply hit "Enter", asking GDB to repeat the last command for me):

    (gdb) l
    8       #include

TUI: Graphical User Interface

The second most useful feature is the TUI mode that turns GDB into a real modern debugger. Here are a couple of links to TUI.

You can switch into TUI by pressing Ctrl-X and then "1", or start gdb in TUI mode right away:

    gdb hello -tui

Or by typing this at the gdb prompt (this command doesn't work on Openlab, so there you'll have to use Ctrl-X and then 1, but normally it works):

    (gdb) tui enable

Start the program from the beginning and single step it with n and s. The source code of the program will scroll in the TUI window in the top part of the screen.

Examining data

You can print values of variables with print (p for short), e.g., print the values of i and sum:

    (gdb) p i
    (gdb) p sum

Conditional breakpoints

While debugging programs it's often useful to see what the program is doing right before it crashes. One way to do this is to step through, one at a time, every statement of the program until we get to the point of execution where we want to see what is happening.
This works, but sometimes you may want to just run to a particular section of code and stop execution at that point so you can examine data at that location.

GDB allows you to set conditional breakpoints. For example, let's break inside the loop of the sum function when the index i is equal to 50. I first list the source code to get the exact source lines and then set a breakpoint in main.c at line 16 with break main.c:16:

    (gdb) l
    11      unsigned long sum(int n) {
    12          int i;
    13          unsigned long sum = 0;
    14
    15          for (i = 0; i < n; i++) {
    16              sum = sum + i;
    17          }
    18
    19          return sum;
    20      }
    (gdb) break main.c:16
    Breakpoint 2 at 0x56555543: file main.c, line 16.

This breakpoint will trigger on every iteration of the loop. If we want it to fire only when i is 50, we add the following condition. I use 2 because this is the number of the breakpoint that I set at line 16.

    (gdb) condition 2 i==50

I now continue execution of the program with the continue or c command.

    (gdb) c
    Continuing.

    Breakpoint 2, sum (n=100) at main.c:16
    16              sum = sum + i;

When the breakpoint is hit I check that the value of i is really 50:

    (gdb) p i
    $1 = 50
    (gdb)

Exploring crashes

Now, let's take a look at how you can use GDB to debug your crashing programs. First, let's generate a program that crashes. Add a global variable a[32] to your program (it's an array of 32 integers).

    int a[32];

Let's then add a function that makes an out-of-bounds array access:

    unsigned long crash_array(int n) {
        int i;
        unsigned long sum = 0;

        for (i = 0; i < n; i++) {
            sum = sum + a[i];
        }

        return sum;
    }

If you invoke this function with n larger than 32, it reads past the end of the array and may crash (note that you might get lucky and it will not crash, i.e., not all out-of-bounds accesses cause a crash in C programs). To be safe, let's invoke it with n equal to 100,000.

    s = crash_array(100000);
    printf("crash array sum:%ld\n", s);

If you make this program and run it, it will crash:

    $ ./hello
    Hello world, the sum:4950
    Segmentation fault (core dumped)

Now, to understand the crash you can run it under gdb:

    (gdb) r
    Starting program: /home/aburtsev/doc/OS_Stuff/Flux/git/personal/classes/os-class/cs143a/hw/hello
    Hello world, the sum:4950

    Program received signal SIGSEGV, Segmentation fault.
    0x56555566 in crash_array (n=100000) at main-full.c:18
    18              sum = sum + a[i];

You can use the backtrace (bt) command to look at the backtrace (the chain of function invocations leading to the crash):

    (gdb) bt
    #0  0x56555566 in crash_array (n=100000) at main-full.c:18
    #1  0x565555ec in main () at main-full.c:45

Here, GDB tells you that crash_array got a segfault at line 18 in main-full.c. You see that there are two stack frames available (0 for crash_array and 1 for main). You can use the frame (f) command to select a frame and then inspect it:

    (gdb) f 0
    #0  0x56555566 in crash_array (n=100000) at main-full.c:18
    18              sum = sum + a[i];
    (gdb) l
    13      unsigned long crash_array(int n) {
    14          int i;
    15          unsigned long sum = 0;
    16
    17          for (i = 0; i < n; i++) {
    18              sum = sum + a[i];
    19          }
    20
    21          return sum;
    22      }

We know that line 18 is the crashing line. We can print the value of the local variable i:

    (gdb) p i
    $1 = 35824

It is equal to 35824. This should give you enough information to see why the program crashed.

Now fix the crash_array function to prevent the crash.

What to submit

The Makefile and the updated main.c program.

Part 4: Exploring assembly code

In this part of the homework you will explore the assembly code of the sum function.

Re-start GDB and set a breakpoint on the sum function:

    (gdb) b sum
    Breakpoint 1 at 0x533: file main.c, line 13.
    (gdb) r
    Starting program: /home/aburtsev/doc/OS_Stuff/Flux/git/personal/classes/os-class/cs143a/hw/hello

    Breakpoint 1, sum (n=100) at main.c:13
    13          unsigned long sum = 0;

GDB can disassemble the code of the program with the disassemble (or disas) command:

    (gdb) disas
    Dump of assembler code for function sum:
       0x5655552d <+0>:     push   %ebp
       0x5655552e <+1>:     mov    %esp,%ebp
       0x56555530 <+3>:     sub    $0x10,%esp
    => 0x56555533 <+6>:     movl   $0x0,-0x4(%ebp)
       0x5655553a <+13>:    movl   $0x0,-0x8(%ebp)
       0x56555541 <+20>:    jmp    0x5655554d

Unfortunately, the default syntax for the disassembly is AT&T. So the first thing to do is switch to the Intel syntax we were using in class. Invoke the set disassembly-flavor intel command and disassemble the function again:

    (gdb) set disassembly-flavor intel
    (gdb) disas
    Dump of assembler code for function sum:
       0x5655552d <+0>:     push   ebp
       0x5655552e <+1>:     mov    ebp,esp
       0x56555530 <+3>:     sub    esp,0x10
    => 0x56555533 <+6>:     mov    DWORD PTR [ebp-0x4],0x0
       0x5655553a <+13>:    mov    DWORD PTR [ebp-0x8],0x0
       0x56555541 <+20>:    jmp    0x5655554d

Your assignment is to explain every line of the assembly dump and submit it as a text file.

Note: besides reading the assembly to work out what it does, you can also use GDB to step through the assembly code of the program. You can simply type

    (gdb) layout asm

to switch TUI to the assembly mode. Then you can use two new GDB commands, nexti (ni) and stepi (si), for stepping through the assembly instructions (i stands for "instruction", e.g., next instruction and step instruction).

    (gdb) si

You can use the split layout to show both the assembly and the C source code windows at the same time.

    (gdb) layout split

You can also show the registers on the screen (don't forget the list of TUI commands; they can be very handy).

    (gdb) layout regs

What to submit

A text file that explains every line of the assembly dump.

Part 5: Exploring calling conventions

In this final part of the homework you will explore the calling conventions for passing arguments on the stack.

Re-start GDB, set a breakpoint on the sum function, and run the program.

    (gdb) b sum
    Breakpoint 1 at 0x533: file main.c, line 13.
    (gdb) r
    Starting program: /home/aburtsev/doc/OS_Stuff/Flux/git/personal/classes/os-class/cs143a/hw/hello

    Breakpoint 1, sum (n=100) at main.c:13
    13          unsigned long sum = 0;
    (gdb)

Use the info reg command (short for info registers) to see the contents of the registers:

    (gdb) info regs
    Undefined info command: "regs".  Try "help info".
    (gdb) info reg
    eax            0xf7fb0dd8       -134541864
    ecx            0xffffc8c0       -14144
    edx            0xffffc8e4       -14108
    ebx            0x0              0
    esp            0xffffc874       0xffffc874
    ebp            0xffffc884       0xffffc884
    esi            0xf7faf000       -134549504
    edi            0x0              0
    eip            0x56555533       0x56555533

Use the x command to inspect the stack:

    (gdb) x/24x $esp
    0xffffc874:     0xf7faf000      0x00000000      0xf7e0760b      0xf7faf3fc
    0xffffc884:     0xffffc8a8      0x56555572      0x00000064      0x00000001
    0xffffc894:     0xffffc954      0xffffc95c      0x565555c1      0xf7fe59b0
    0xffffc8a4:     0xffffc8c0      0x00000000      0xf7defe81      0xf7faf000
    0xffffc8b4:     0xf7faf000      0x00000000      0xf7defe81      0x00000001
    0xffffc8c4:     0xffffc954      0xffffc95c      0xffffc8e4      0x00000001

Explain every value from the dump that you get (similar to the one above but generated on your machine), i.e., which one is a local variable inside sum, which is the frame pointer, what the arguments to sum are, and so on. The more values you can explain the better.

Hints: start from main, i.e., set a breakpoint on main, step through it, and watch what gets pushed on the stack and how the values there change.

Note: GCC generates code that keeps the stack aligned at 16 bytes, hence instead of allocating 8 bytes for the two local variables in sum (i and sum, 4 bytes each in a 32-bit build) it allocates 16.

What to submit

Submit a text file explaining every value on the stack.

Submit your work

Submit your solution through Gradescope (Gradescope CS143A Principles of Operating Systems). Place each part of the assignment into a folder named part2, part3, part4, or part5, then pack the folders into a zip archive and submit it. Please name the makefiles Makefile1, Makefile2, and Makefile3 for part 2. Please submit .txt files for parts 4 and 5. You can resubmit as many times as you wish. If there are any problems with the structure, the autograder will tell you. Parts 4 and 5 of the assignment will be graded manually after the deadline.

The structure of the zip file should be the following:

    /
      - /part2
        - Makefile1
        - Makefile2
        - Makefile3
        - main.c
      - /part3
        - Makefile
        - main.c
      - /part4
        - explanation.txt
      - /part5
        - explanation.txt
Updated: October, 2019