Project Specification and Proposal:
Proposal Due Wed. Mar. 9, 2011, 1:30pm
Final Project Delivery Due Monday, May 9th 9am.
Specification
Learning Objective:
The project uses a parallel program, written in any language and targeting any of the parallel architectures we have at our disposal, as a means of applying ideas from class about parallel algorithms and parallel programming/computation. The problem domain, computation, implementation, and evaluation are not constrained, so you can explore and be creative. Focus on what interests you, and try hard to put the parallelism ideas into practice.
Project Goals:
Your goals in the project are (a) to challenge yourself in designing and implementing a parallel computation, and (b) to reveal to me (mostly in the concluding report) much of what you have learned in class. Remember that we have no final in this class, so the project and report must serve as my means of assessing your overall learning and mastery of the material.
Tasks:
- Select a computation that interests you and that would benefit from parallelism. A few topics are listed below, and many similar topics can be found on the Web, but the best topic is one that interests you and about which you may have some special knowledge. You can also draw on the ideas in Chapter 11 of the book, on Capstone Project Ideas.
- Select a language to write the program in. At this point, we have covered basic material on PThreads and MPI. We will have units on OpenMP and CUDA/GPGPU programming after Spring Break. You can venture in another direction, but we may have to work to get the resources needed.
- Write an initial program for the problem; call it P1. The purpose of P1 is to have a solution from which you can revise and improve the computation. Do not be too ambitious; get a solution working quickly for the core computation. Accept a possibly naïve parallel solution. (A sequential solution is unacceptable except for unimportant parts of the computation like initialization.) Avoid embellishments and fancy I/O, and accept simplifying constraints on the solution ("n is a power of 2").
- Using the CTA performance model presented in the book, your understanding of parallel computers, your knowledge of parallel algorithms, and your general CS smarts, critique the P1 program. That is, identify places where there are inefficiencies. Improve P1 to create P2, or for some projects, create a competitive P2.
- Gather evidence about the performance of P1 and P2 to test your understanding of whether the "improvement" actually improved the program. Generally, this evidence will involve running your program on the cluster machine or other parallel processor.
- Write a report describing what you did, how you analyzed your program (the critique task above), how you improved it and why, and what the experimental evidence was. Include a listing of your commented program. The report should be complete and well written, and should include a bibliography and supporting performance evaluation. Your grade will come primarily from the report, although if you have a demonstration, I will consider that as well.
Proposal
Your proposal should be roughly two pages long (not including the bibliography), and should include the following information:
- Project Description: At least 2-3 paragraphs describing the goal of your project, including the problem domain, and what problem you are trying to solve. From the taxonomy of capstone ideas in the book, categorize your project as implementing and/or comparing implementations of one or more existing parallel algorithms, or competing with standard benchmarks, or in developing some new parallel computation.
- Plan of Attack: How will you execute your project over the six weeks? You can use the book's suggested list of components/steps for the categories of capstone projects as a guide in enumerating your strategy.
- Milestones: Indicate deliverables through the second half of the semester that you (and I) can use to determine if you are on track. These must be concrete and measurable, and you need at least two intermediate points. On these dates, you will submit brief (1 page) reports on your progress with specifics on how you fared against your milestones.
- Related Work/Literature Search: Based on your project, there will be related work. Even if, by the time of the proposal, you have not read and/or digested the related work, you should be able to at least give an annotated bibliography of papers, web sites, books, etc. that you will use in learning enough about the problem domain and existing parallel algorithms that pertain to your project.
Possible Topic Areas
Your Topic Here
The best project topic is one that interests you. If you have a topic you like, think about how a project might go, then send me an email outline of what you'd like to try. Look at the list of titles for Table 11.1 in the book to see if one of them sparks your interest.
Find an Application from one of the Sciences
Professors in Physics, Chemistry, and Biology often have problems of significant computational size that could benefit from a parallel solution. They may already have a sequential solution that you can work to parallelize, or they may have a parallel solution that you can evaluate and then improve.
Commonly Cited Parallel Applications
The online literature is filled with examples that are generally thought to be good candidates for parallel solution: MPEG compression, Smith-Waterman genome matching computation, many body (gravitation) simulation, etc. The examples usually involve large amounts of data or computation, or both. The experiments needed to assess P1 versus P2 do not have to be large, only large enough to demonstrate whatever point is being made.
Game Searches
Because board games have a succinct description, they are a common example of a work queue approach; moreover, searching is a task that is often improved by parallelism. If you have an interest in games, implement a search for a board configuration with a certain property.
Graph Computations
The All Pairs Shortest Path was an easy computation in ZPL. Find a computation on a graph and develop a ZPL solution. For example, the closest pair of points (Euclidean) is a computation that often uses a k-d tree partitioning of the point space. A regular k-d tree is a structure that can easily be imposed on a linear array of points. Once partitioned, the points can be moved to individual processors with remap, and the closest-pair computation performed locally, with neighbor exchanges for points near the partition boundaries.
Compete Against a Benchmark
There are a variety of parallel benchmark suites to be found on the Web, such as the NAS Parallel Benchmarks (NPB), SPEC HPC2002, Cray’s Application Kernel Matrix (AKM), or Stanford's SPLASH (including a Barnes-Hut N-Body). Some of these computations can be large, but one approach is to formulate your parallel solution using the principles from class, and lift the scalar code from the benchmark (assuming a compatible base language). In creating your P1 and P2 programs, you need to apply a significant, new idea that is not part of published examples of the benchmark. In addition to comparing your P1 and P2 performance, compare your result to a solution from the suite’s site.