CSE 130-02 Computer Principles

Assignment 03
CSE 130-02: Principles of Computer System Design, Spring 2021
Due: Thursday, June 3 at 11:59PM
Goals
The goal of Assignment 3 is to create an HTTP reverse proxy with a cache. That means that
your program will act as an HTTP server but will actually forward received requests to the
requested object’s origin server in case the object is not cached locally, and then forward the
response from the origin server to the original client. In order to avoid sending requests to
servers and thus reduce response time and traffic on the network, the proxy cache (or proxy
server) will keep copies of objects returned from previous GET requests in a local cache in
memory. If a future GET request is made for a cached object, the proxy will return the cached
copy, as long as the cached copy is recent enough. Thus your program will act as both a server
and a client.
We will provide you with a working HTTP server executable that you can use to test your proxy
server, and we recommend you use this instead of your Assignment 1 HTTP server
implementation (the provided executable implements more features). Your main goal is to
implement the reverse proxy. That said, your reverse proxy server will have to handle the same
types of requests that you have seen before (GET, HEAD and PUT), and must support
persistent connections.
As usual, you must have a design document along with your README.md in your git repository.
Your code must build an executable named httpproxy using make.
Programming assignment: HTTP reverse proxy
Design document
Before writing code for this assignment, as with every other assignment, you must write up a
design document. Your design document must be called DESIGN.pdf, and must be in PDF
format (you can easily convert other document formats, including plain text, to PDF).
Your design document should describe the design of your code in enough detail that a
knowledgeable programmer could duplicate your work. This includes descriptions of the data
structures you use, non-trivial algorithms and formulas, and a description of each function with
its purpose, inputs, outputs, and assumptions it makes about inputs or outputs.
Write your design document before you start writing code. It’ll make writing code a lot
easier. It will help you think about what you need to do for this assignment, and it can help you
identify possible problems with your planned implementation before you have invested hours in
it. Also, if you want help with your code, the first thing we’re going to ask for is your design
document. We’re happy to help you with the design, but we can’t debug code without a design
any more than you can.
Since a lot of the system in Assignment 3 is similar to Assignments 1 and 2, we expect you’re
going to “copy” a good part of your design from your previous designs. This is fine, as long as
it’s your previous assignment you’re copying from. This will let you focus on the new stuff in
Assignment 3.
Start early on the design. This program can be built independently of previous assignments, but
if you didn’t get the previous assignments to work, you will probably need help. Please see the
course staff ASAP for help in that case.
TESTING AND ASSIGNMENT QUESTION
In the design document, you will also describe the testing you did on your program and answer
any short questions below. The testing can be unit testing (testing of individual functions or
smaller pieces of the program) or whole-system testing, which involves running your code in
particular scenarios.
For Assignment 3, please answer the following questions:
● Using a large file (e.g. 100 MiB — adjust according to your computer’s capacity) and the
provided HTTP server:
○ Start the server with only one thread in the same directory as the large file (so
that it can provide it to requests);
○ Start your proxy with no cache and request the file ten times. How long does it
take?
○ Now stop your proxy and start again, this time with cache enabled for that file.
Request the same file ten times. How long does it take?
● Aside from caching, what other uses can you consider for a reverse proxy?
Program functionality
You may not use standard libraries for HTTP; you have to implement this yourself. You may use
standard networking (and file system) system calls, but not any FILE * calls except for printing
to the screen (e.g., error messages). Note that string functions like sprintf() and sscanf()
aren’t FILE * calls.
Your code must be in C and be compiled with no errors or warnings using the following flags:
-Wall -Wextra -Wpedantic -Wshadow
Once again your program will take a port number as a parameter, but this time it will be followed
by another port number to identify the address of the HTTP server. Unlike Assignment 2, your
reverse proxy does not need to be multithreaded. Those two parameters can be
accompanied by three optional parameters that configure the cache: “c”, a non-negative integer
specifying the capacity of the cache (the number of items that can be stored); “m”, a
non-negative integer specifying the maximum file size to be stored in the cache; “u”, a flag
option that enables the Least Recently Used (LRU) replacement policy (the default replacement
policy is First In First Out, FIFO). The default values for “c” and “m” are 3 and 65536,
respectively. The following examples are then valid:
● ./httpproxy 9090 8080 -c 4 -u
○ Starts httpproxy on port 9090
○ Communicates with a server running on port 8080
○ Cache can hold four files
○ Each file in cache can have at most 65536 bytes
○ The replacement policy is LRU
● ./httpproxy 8181 1234
○ Starts httpproxy on port 8181
○ Communicates with a server running on port 1234
○ Cache can hold three files
○ Each file in cache can have at most 65536 bytes
○ The replacement policy defaults to FIFO (no “u” option is given)
● ./httpproxy -m 100 7373 2525
○ Starts httpproxy on port 7373
○ Communicates with a server running on port 2525
○ Cache can hold three files
○ Each file in cache can have at most 100 bytes
○ The replacement policy defaults to FIFO (no “u” option is given)
● ./httpproxy 8383 -c 1 -u 3434 -m 100000000
○ Starts httpproxy on port 8383
○ Communicates with a server running on port 3434
○ Cache can hold one file
○ Each file in cache can have at most 100000000 bytes
○ The replacement policy is LRU
● ./httpproxy 7654 -m 512 -c 4 1234
○ Starts httpproxy on port 7654
○ Communicates with a server running on port 1234
○ Cache can hold four files
○ Each file in cache can have at most 512 bytes
○ The replacement policy defaults to FIFO (no “u” option is given)
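The examples above place the options before, between, and after the two port numbers. One way to accept all of these forms with getopt(3) is sketched below. It is only an illustration: the struct proxy_config type and the parse_args and parse_nonneg helpers are hypothetical names, not part of the provided skeleton. The leading '+' in the option string is a GNU extension that asks getopt to stop at each non-option argument instead of reordering argv (POSIX implementations stop there anyway), so the loop can record a port number and then resume option parsing.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for the parsed command line. */
struct proxy_config {
    long listen_port; /* first positional argument */
    long server_port; /* second positional argument */
    long capacity;    /* -c, number of cached items (default 3) */
    long max_size;    /* -m, largest cacheable file in bytes (default 65536) */
    bool lru;         /* -u, LRU instead of the default FIFO */
};

/* Parse a non-negative decimal integer; returns -1 on malformed input. */
static long parse_nonneg(const char *s)
{
    char *end;
    long v = strtol(s, &end, 10);
    return (*s == '\0' || *end != '\0' || v < 0) ? -1 : v;
}

/* Fill *cfg from argv. Returns 0 on success, -1 on a malformed command line. */
int parse_args(int argc, char **argv, struct proxy_config *cfg)
{
    long ports[2] = { -1, -1 };
    int nports = 0;

    cfg->capacity = 3;
    cfg->max_size = 65536;
    cfg->lru = false;
    optind = 1; /* restart scanning in case this is called more than once */

    while (optind < argc) {
        int opt = getopt(argc, argv, "+c:m:u");
        if (opt == -1) {
            /* getopt stopped at a non-option: treat it as a port number. */
            if (optind >= argc)
                break;
            if (nports == 2)
                return -1; /* too many positional arguments */
            ports[nports++] = parse_nonneg(argv[optind++]);
            continue;
        }
        switch (opt) {
        case 'c': cfg->capacity = parse_nonneg(optarg); break;
        case 'm': cfg->max_size = parse_nonneg(optarg); break;
        case 'u': cfg->lru = true; break;
        default:  return -1; /* unknown option or missing value */
        }
    }
    if (nports != 2 || ports[0] < 1 || ports[0] > 65535 ||
        ports[1] < 1 || ports[1] > 65535 ||
        cfg->capacity < 0 || cfg->max_size < 0)
        return -1;
    cfg->listen_port = ports[0];
    cfg->server_port = ports[1];
    return 0;
}
```

A main function would call parse_args(argc, argv, &cfg) once and print a usage message to stderr when it returns -1.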
Proxying
You are implementing a reverse proxy, which means that the client is not aware that it is
communicating through a proxy. That means your proxy will receive requests from the client in
the same way that your previous server received them. The proxy will forward requests to the
server when the requested object is not cached locally, then receive the corresponding
responses from the server and forward them to the client. When the proxy receives a GET
request for a resource that is cached, it should verify that the cached copy is not obsolete by
sending a HEAD request to the server and checking the Last-Modified header line, whose value
is a date/time for when the requested object was last modified. The provided HTTP server
already implements this header so you can use it directly. An example of the Last-Modified
header is given below.
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
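The revalidation step described above needs two small pieces: a HEAD request to send to the server, and a way to pull the Last-Modified value out of the response headers. The sketch below shows one possible shape for both. The function names, the "localhost" Host value, and the assumption that the server writes the header name exactly as "Last-Modified: " are illustrative choices, not requirements of the assignment.

```c
#include <stdio.h>
#include <string.h>

/* Format a HEAD request for `path` (for example "/file1").
 * Returns the request length, or -1 if it does not fit in buf. */
int build_head_request(char *buf, size_t size, const char *path, int server_port)
{
    int n = snprintf(buf, size,
                     "HEAD %s HTTP/1.1\r\nHost: localhost:%d\r\n\r\n",
                     path, server_port);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Copy the value of the Last-Modified header out of a NUL-terminated
 * header block. Returns 0 on success, -1 if the header is absent or the
 * value does not fit in `out`. */
int find_last_modified(const char *headers, char *out, size_t size)
{
    static const char name[] = "Last-Modified: ";
    const char *p = strstr(headers, name);
    size_t len;

    if (p == NULL)
        return -1;
    p += sizeof name - 1;       /* skip past the header name */
    len = strcspn(p, "\r\n");   /* the value ends at the line break */
    if (len >= size)
        return -1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}
```

The request would be written to the server socket with write(2) or send(2), and the header block read back before calling find_last_modified.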
Your reverse proxy should compare the Last-Modified date with the date of the stored object. If
the stored object is the same age or is newer, the proxy can respond directly to the client
without forwarding the request to the server (it still has to check that the file is not obsolete,
however; see Caching). Otherwise the proxy should forward the request to the server as usual.
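The comparison can be sketched with strptime(3), as suggested in the Hints section. The version below is a minimal illustration under a few assumptions: the date always uses the format of the example above, the proxy stores the Last-Modified date it saw when it cached the object, and the names parse_http_date, compare_http_dates, and cached_copy_is_current are hypothetical. It compares the broken-down fields directly, so no time zone conversion is involved.

```c
#define _XOPEN_SOURCE 700 /* standard feature-test macro that exposes strptime() */
#define __USE_XOPEN       /* the handout's hint; helps if a system header came first */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Parse a date such as "Wed, 21 Oct 2015 07:28:00 GMT" into *out.
 * Returns 0 on success, -1 if the text does not match that format. */
int parse_http_date(const char *value, struct tm *out)
{
    const char *end;

    memset(out, 0, sizeof *out);
    end = strptime(value, "%a, %d %b %Y %H:%M:%S GMT", out);
    return (end == NULL || *end != '\0') ? -1 : 0;
}

/* Negative, zero, or positive when a is earlier than, equal to, or later
 * than b. Both dates are assumed to be in GMT. */
int compare_http_dates(const struct tm *a, const struct tm *b)
{
    const int fa[] = { a->tm_year, a->tm_mon, a->tm_mday,
                       a->tm_hour, a->tm_min, a->tm_sec };
    const int fb[] = { b->tm_year, b->tm_mon, b->tm_mday,
                       b->tm_hour, b->tm_min, b->tm_sec };

    for (size_t i = 0; i < sizeof fa / sizeof fa[0]; i++)
        if (fa[i] != fb[i])
            return fa[i] < fb[i] ? -1 : 1;
    return 0;
}

/* True when the cached copy is the same age as, or newer than, the
 * server's copy, so the proxy may answer from the cache. */
bool cached_copy_is_current(const struct tm *cached, const struct tm *server)
{
    return compare_http_dates(cached, server) >= 0;
}
```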
Caching
Your proxy has a number of parameters defining how the cache will work. The first parameter,
specified by the option “c”, defines how many items the cache can hold. That is just the number
of files that can exist in the cache at once. If another file were to be added to the cache once it
reaches the maximum number of files, then the new file should replace one of the existing files
in the cache, according to a replacement policy.
The second parameter, specified by the option “m”, defines the maximum size of a file to be
cached. That is the size in bytes, as present in the Content-Length header line of a response to
a GET request. Your proxy should only add to its cache files that are equal in size or smaller
than this value. What if the file is larger? In that case the file will not be stored in the cache.
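The size check can be sketched as a small predicate over the response headers. This is an illustration only: fits_in_cache is a hypothetical name, and the sketch assumes the buffer holds just the NUL-terminated header block and that the server spells the header exactly as "Content-Length:".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true when the header block advertises a Content-Length that is
 * less than or equal to max_size (the value of the "m" option). A missing
 * or malformed header is treated as "do not cache". */
bool fits_in_cache(const char *headers, long max_size)
{
    const char *p = strstr(headers, "Content-Length:");
    long length;

    if (p == NULL || sscanf(p, "Content-Length: %ld", &length) != 1)
        return false;
    return length >= 0 && length <= max_size;
}
```

With the default limit, a response announcing Content-Length: 65536 would be cached and one announcing 65537 would only be forwarded.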
The third parameter is the replacement policy. The default policy is First In First Out (FIFO),
meaning that if a file has to be replaced, the file that was the oldest to be added to the cache
should be replaced. If the flag “u” is provided when starting the proxy, the policy will be instead
Least Recently Used (LRU), meaning that the file to be replaced is the one that has spent the
most time in the cache without being requested.
As an example of how these two replacement policies differ, consider three consecutive
requests for files A, B, and A again, with caching of both files. If a fourth request for file C
arrives, and the cache can only hold two files, FIFO will result in A leaving the cache, because it
was added first, while LRU will result in B leaving the cache, because A was requested more
recently.
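The A, B, A, C example can be reproduced with a small model that tracks only file names and two counters per entry. This is a sketch under simplifying assumptions, not a full cache: it stores no file contents, MAX_SLOTS and NAME_LEN are arbitrary limits chosen for the illustration, and all type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SLOTS 16 /* hypothetical upper bound for this sketch */
#define NAME_LEN 32  /* longer names are truncated in this sketch */

struct slot {
    char name[NAME_LEN];
    unsigned long added; /* tick when the file entered the cache */
    unsigned long used;  /* tick of the most recent request */
};

struct cache {
    struct slot slots[MAX_SLOTS];
    int count;          /* slots currently in use */
    int capacity;       /* value of "c", clamped to MAX_SLOTS */
    bool lru;           /* true when "u" was given */
    unsigned long tick; /* logical clock, advanced on every request */
};

void cache_init(struct cache *c, int capacity, bool lru)
{
    memset(c, 0, sizeof *c);
    c->capacity = capacity > MAX_SLOTS ? MAX_SLOTS : capacity;
    c->lru = lru;
}

bool cache_contains(const struct cache *c, const char *name)
{
    for (int i = 0; i < c->count; i++)
        if (strcmp(c->slots[i].name, name) == 0)
            return true;
    return false;
}

/* Record a GET for `name`. On return, `evicted` holds the name of the file
 * that was pushed out, or "" when nothing was evicted. */
void cache_request(struct cache *c, const char *name, char evicted[NAME_LEN])
{
    unsigned long now = ++c->tick;
    int target;

    evicted[0] = '\0';
    for (int i = 0; i < c->count; i++) {
        if (strcmp(c->slots[i].name, name) == 0) {
            c->slots[i].used = now; /* hit: only LRU looks at this field */
            return;
        }
    }
    if (c->capacity <= 0)
        return; /* caching disabled */
    if (c->count < c->capacity) {
        target = c->count++; /* free slot available, no eviction */
    } else {
        /* Victim: smallest `added` for FIFO, smallest `used` for LRU. */
        target = 0;
        for (int i = 1; i < c->count; i++) {
            unsigned long a = c->lru ? c->slots[i].used : c->slots[i].added;
            unsigned long b = c->lru ? c->slots[target].used
                                     : c->slots[target].added;
            if (a < b)
                target = i;
        }
        snprintf(evicted, NAME_LEN, "%s", c->slots[target].name);
    }
    snprintf(c->slots[target].name, NAME_LEN, "%s", name);
    c->slots[target].added = now;
    c->slots[target].used = now;
}
```

With a capacity of two, requesting A, B, A, and then C evicts A under FIFO and B under LRU, matching the description above. A real cache entry would also hold the file contents, their length, and the stored Last-Modified date.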
Cached files should be held in memory, without creating any files on disk.
Testing your code
You should test your code on your own system. You can run the server and the proxy on
localhost using a port number above 1024 (e.g., 8888). Come up with requests you can
make of your server, and try them using curl(1). curl takes a URL and will make requests
for those. By default these requests are of the GET type. Some useful options:
● -T <file>: makes curl send a PUT request, sending the contents of <file>. <file> does
not need to match the resource name in the URL;
● -I: makes curl send a HEAD request;
● -v: runs curl in verbose mode. By default curl will only print the body of the received
response (or the full response if it sent a HEAD request). In verbose mode curl will also
print the request and the response headers, identified by a ‘>’ if curl is sending it and a
‘<’ if curl is receiving it;
● -o <file>: curl saves the output to <file>;
curl can also send multiple requests if it has multiple URLs as parameters. As with the
previous assignment, connections are persistent and any connection may contain multiple
requests. Note that you’ll need to run your server in one terminal, and make requests using
curl in a separate terminal. You can see examples of curl commands in the Hints section. For
more on curl check https://everything.curl.dev/h...
Remember that in this assignment you are working on a proxy that will communicate
with a server, so your curl requests should be directed towards the proxy. That is, if the
server is running on port 8080 and the proxy is on port 9090, you would use the command
● curl http://localhost:9090/01234ab...
to request file 01234abcde01234 through the proxy.
You might also consider cloning a new copy of your repository (from GitLab) to a clean directory
to see if it builds properly, and runs as you expect. That’s an easy way to tell if your repository
has all of the right files in it. You can then delete the newly-cloned copy of the directory on your
local machine once you’re done with it.
README
As for previous assignments, your repository must include a README file (README.md). The README.md file
should be short, and contain any instructions necessary for running your code. You should also
list limitations or issues in README.md, telling a user if there are any known issues with your
code.
Submitting your assignment
All of your files for Assignment 3 must be in the asgn3 directory in your git repository. When you push
your repository to GitLab@UCSC, make sure of the following:
● There are no “bad” files in the asgn3 directory (e.g., object files).
● Your assignment builds in asgn3 using make to produce httpproxy.
● All required files (source files, DESIGN.pdf, README.md) are present in asgn3. Note
that you do not have to write a client program nor a server, only the proxy.
After pushing your submission to GitLab, submit your commit id to this Google Form:
https://forms.gle/QSh7EnfxuN4...
Hints
● Start early on the design. This program can be built independently of previous
assignments, but if you didn’t get the previous assignments to work, you will probably
need help. Please see the course staff ASAP for help in that case.
● Reuse your code from assignments 1 and 2. (No need to cite this)
● We have updated the skeleton code for this assignment. It now has one function to
create a socket as a client. You can use it, or you can just copy the new function to the
code base you have already built.
● Aggressively check for and report errors via a response. Transfers may be aborted on
errors. However, the server doesn’t exit on an error; it deals with the error appropriately
(sending the corresponding error code for the client if possible) and ends the connection
in that thread, leaving the thread free to handle another connection.
● Use getopt(3) to parse options from the command line. Read the man pages and see
examples on how it’s used. Ask the course staff if you have difficulty using it after
reading this material.
● Your commit must contain the following files:
○ README.md
○ DESIGN.pdf
○ Makefile
○ source file(s) for the proxy
It may not contain any .o files or other compiled files. It may not contain data files that
you create for testing either. You may, if you wish, include the “source” files for your
DESIGN.pdf in your repo, but you don’t have to. After running make, your directory must
contain httpproxy. Your source files must be .c files (and the corresponding headers,
if needed).
● You can use the strptime(3) function to parse the content of the Last-Modified header
line. To use it in your code, define __USE_XOPEN and then include time.h, in this order.
That will look like this in your code:

#define __USE_XOPEN
#include <time.h>

● If you need help, use online documentation such as man pages and documentation on
Makefiles. If you still need help, ask the course staff.
Grading
As with all of the assignments in this class, we will be grading you on all of the material you turn
in, with the approximate distribution of points as follows: design document and answer to
assignment question (30%); coding practices (10%); functionality (60%).
Your code must compile to be graded. A submission that does not compile may receive a
maximum grade as low as 5%.
