Programming Assignment III: OpenMP Programming

The purpose of this assignment is to familiarize yourself with OpenMP programming.

Get the source code:

$ wget https://nctu-sslab.github.io/PP-s21/HW3/HW3.zip
$ unzip HW3.zip -d HW3
$ chmod -R 700 HW3 # avoid cheating
$ cd HW3

1. Part 1: Parallelizing Conjugate Gradient Method with OpenMP

Conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations. It is often used to solve partial differential equations, or applied on some optimization problems. You may get more information on Wikipedia.

Please enter the part1 folder:

$ cd part1

You can build and run the CG program by:

$ make; ./cg

In this assignment, you are asked to parallelize a serial implementation of the conjugate gradient method using OpenMP. The serial implementation is in cg.c and cg_impl.c. Please refer to the README file for more information about the code.

In order to parallelize the conjugate gradient method, you may want to use profiling tools to help you explain the performance difference before and after your modification. As it probably just works ineffectively if you simply add a parallel-for directive for each for loop, you may want to use profiling tools, such as perf, time, gprof, Valgrind (not limited to these tools), to profile your program to better understand how much the improvement is from the code snippets you have modified.

The following instructions may help you finish this part:

See the requirements to finish this part.

You can also run our grader via: ./cg_grader, which reports the correctness of your program and a performance score.

Part 2: Parallelizing PageRank Algorithm with OpenMP

image

In this part, you will implement a simple implementation of page rank. A good implementation of this assignment will be able to run these algorithms on graphs containing hundreds of millions of edges on a multi-core machine in only seconds.

Please enter the part2 folder:

$ cd part2

Read the following sections and accomplish the requirements of this part.

2.1 Background: Representing Graphs

The starter code operates on directed graphs, whose implementation you can find in common/graph.h and common/graph_internal.h. We recommend you begin by understanding the graph representation in these files. A graph is represented by an array of edges (both outgoing_edges and incoming_edges), where each edge is represented by an integer describing the id of the destination vertex. Edges are stored in the graph sorted by their source vertex, so the source vertex is implicit in the representation. This makes for a compact representation of the graph, and also allows it to be stored contiguously in memory. For example, to iterate over the outgoing edges for all nodes in the graph, you’d use the following code which makes use of convenient helper functions defined in common/graph.h (and implemented in common/graph_internal.h):

for (int i=0; i<num_nodes(g); i++) {
    // Vertex is typedef'ed to an int. Vertex* points into g.outgoing_edges[]
    const Vertex* start = outgoing_begin(g, i);
    const Vertex* end = outgoing_end(g, i);
    for (const Vertex* v=start; v!=end; v++)
        printf("Edge %u %u\n", i, *v);
}

2.2 Task: Implementing Page Rank

As a simple warm up exercise to get comfortable using the graph data structures, and to get acquainted with a few OpenMP basics, we’d like you to begin by implementing a basic version of the well-known page rank algorithm.

Please take a look at the pseudocode provided to you in the function pageRank(), in the file page_rank/page_rank.cpp. You should implement the function, parallelizing the code with OpenMP. Just like any other algorithm, first identify independent work and any necessary synchronization.

You can run your code, checking correctness and performance against the staff reference solution using:

./pr <PATH_TO_GRAPH_DIRECTORY>/com-orkut_117m.graph 

If you are working on our workstation machines, we’ve located a copy of the graph and some other graphs at /HW3/graphs/. You can also download the graphs from http://sslab.cs.nctu.edu.tw/~acliu/all_graphs.tgz. (But be careful, this is a 3GB download. Do not replicate the data on the workstations or download the zipped tar file to the workstations for conservation of storage space.)

Some interesting real-world graphs include:

Some useful synthetic, but large graphs include:

There are also some very small graphs for testing. If you look in the /tools directory of the starter code, you’ll notice a useful program called graphTools.cpp that can be used to make your own graphs as well.

By default, the pr program runs your page rank algorithm with an increasing number of threads (so you can assess speedup trends). However, since runtimes at low core counts can be long, you can explicitly specify the number of threads to only run your code under a single configuration.

./pr %GRAPH_FILENAME% 8

Your code should handle cases where there are no outgoing edges by distributing the probability mass on such vertices evenly among all the vertices in the graph. That is, your code should work as if there were edges from such a node to every node in the graph (including itself). The comments in the starter code describe how to handle this case.

You can also run our grader via: ./pr_grader <PATH_TO_GRAPH_DIRECTORY>, which reports the correctness of your program and a performance score for four specific graphs.

3. Requirements

  1. Part 1: You will modify only cg_impl.c to improve the performance of the CG program, such as inserting OpenMP pragmas to parallelize parts of the program. Of course, you may add any other variables or functions if necessary.
  2. Part 2: You will modify only page_rank.cpp to implement and parallelize the algorithms with OpenMP.

4. Grading Policies

NO CHEATING!! You will receive no credit if you are found cheating.

We will judge your code with the following aspects:

Total of 100%:

5. Evaluation Platform

Your program should be able to run on UNIX-like OS platforms. We will evaluate your programs on the workstations dedicated for this course. You can access these workstations by ssh with the following information.

The workstations are based on Ubuntu 18.04 with Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz processors. g++-10, clang++-11, and libomp5 have been installed.

IP Port User Name Password
140.113.215.195 37076 ~ 37080, 37091~37094 {student_id} {Provided by TA}

ATTENTION: Never touch 37095. It is for NIS and NFS.

Login example:

$ ssh <student_id>@140.113.215.195 -p <port>

You can run the graders (built from the source code) on the workstation. We will use the same grader to judge your code:

$ ./cg_grader
$ ./pr_grader  /HW3/graphs/

6. Submission

All your files should be organized in the following hierarchy and zipped into a .zip file, named HW3_xxxxxxx.zip, where xxxxxxx is your student ID.

Directory structure inside the zipped file:

Zip the files with the following command:

$ zip HW3_xxxxxxx.zip cg_impl.c page_rank.cpp

You can use the testing script test_hw3 to check your answer for reference only. Run test_hw3 in a dictionary that contains your HW3_XXXXXXX.zip file on the workstation. test_hw3 checks if the zip file is correct, and runs each graders.

Be aware that test_hw3 runs very heavy work which results in high CPU usage, we recommend you first use ./cg_grader and ./pr_grader to check your programs at local. Only use test_hw3 before your submission.

$ test_hw3

You will get NO POINT if your ZIP’s name is wrong or the ZIP hierarchy is incorrect.

Be sure to upload your zipped file to new E3 e-Campus system by the due date.

Due Date: 2021/04/23 00:00

7. References