16 episodes

Virtually all semiconductor market domains, including PCs, game consoles, mobile handsets, servers, supercomputers, and networks, are converging on concurrent platforms. There are two important reasons for this trend. First, concurrent processors can potentially use chip space and power more effectively than traditional monolithic microprocessors for many demanding applications. Second, an increasing number of applications that traditionally used application-specific integrated circuits (ASICs) are now implemented with concurrent processors to improve functionality and reduce engineering cost. The real challenge is to develop application software that uses these concurrent processors effectively to meet efficiency and performance goals.

The aim of this course is to give students knowledge of, and hands-on experience in, developing application software for processors with massively parallel computing resources. In general, we refer to a processor as massively parallel if it can complete more than 64 arithmetic operations per clock cycle. Many commercial offerings from NVIDIA, AMD, and Intel already provide this level of concurrency. Programming these processors effectively requires in-depth knowledge of parallel programming principles, as well as of the parallelism models, communication models, and resource limitations of these processors. The course is aimed at students who want to develop exciting applications for these processors, as well as those who want to develop programming tools and future implementations for them.

Visit the CS193G companion website for course materials.

Programming Massively Parallel Processors with CUDA Stanford University

- Technology

- 9 JUN 2010
- video
16. Parallel Sorting (April 20, 2010)

Michael Garland of NVIDIA Research discusses parallel sorting methods, which make searching, categorization, and building data structures in parallel easier. (April 20, 2010)
- 3 sec
- 9 JUN 2010
- video
15. Optimizing Parallel GPU Performance (May 20, 2010)

John Nickolls discusses how to optimize parallel GPU performance. (May 20, 2010)
- 4 sec
- 9 JUN 2010
- video
14. Path Planning System on the GPU (May 18, 2010)

Avi Bleiweiss delivers a lecture on building a path planning system on the GPU. (May 18, 2010)
- 3 sec
- 27 MAY 2010
- video
6. Parallel Patterns I (April 15, 2010)

Students are taught how to effectively program massively parallel processors using the CUDA C programming language. Students also develop familiarity with the language itself and are exposed to the architecture of modern GPUs. (April 15, 2010)
- 2 sec
- 26 MAY 2010
- video
12. NVIDIA OptiX: Ray Tracing on the GPU (May 11, 2010)

Steven Parker, Director of High Performance Computing and Computational Graphics at NVIDIA, speaks about ray tracing. (May 11, 2010)
- 4 sec
- 26 MAY 2010
- video
13. Future of Throughput (May 13, 2010)

William Dally gives a guest lecture on the end of "denial architecture" and the rise of throughput computing. (May 13, 2010)
- 4 sec