16 episodes

Virtually all semiconductor market domains, including PCs, game consoles, mobile handsets, servers, supercomputers, and networks, are converging to concurrent platforms. There are two important reasons for this trend. First, these concurrent processors can potentially offer more effective use of chip space and power than traditional monolithic microprocessors for many demanding applications. Second, an increasing number of applications that traditionally used Application Specific Integrated Circuits (ASICs) are now implemented with concurrent processors in order to improve functionality and reduce engineering cost. The real challenge is to develop applications software that effectively uses these concurrent processors to achieve efficiency and performance goals.



The aim of this course is to give students knowledge and hands-on experience in developing applications software for processors with massively parallel computing resources. In general, we refer to a processor as massively parallel if it can complete more than 64 arithmetic operations per clock cycle. Many commercial offerings from NVIDIA, AMD, and Intel already offer such levels of concurrency. Programming these processors effectively requires in-depth knowledge of parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors. The course is aimed at students who want to develop exciting applications for these processors, as well as those who want to develop programming tools and future implementations for them.




Visit the CS193G companion website for course materials.

Programming Massively Parallel Processors with CUDA Stanford

    • Technology

    • video
    1. Introduction to Massively Parallel Computing (March 30, 2010)

    Jared Hoberock of NVIDIA gives the introductory lecture to CS 193G: Programming Massively Parallel Processors. (March 30, 2010)

    • 1 hr 7 min
    • video
    2. Introduction to CUDA (April 1, 2010)

    science, technology, computer science, CS, software engineering, programming, parallel processors, CUDA, language, code, Computers, coding, MP0, MP1, hardware, software, memory management, GPU, CPU, memory, parallel code, kernel, threads, launch, thread b

    • 1 hr 15 min
    • video
    3. CUDA Threads & Atomics (April 6, 2010)

    Atomic operations in CUDA and the associated hardware are discussed. (April 6, 2010)

    • 49 min
    • video
    4. CUDA Memories (April 8, 2010)

    Jared Hoberock of NVIDIA lectures on CUDA memory spaces for CS 193G: Programming Massively Parallel Processors. (April 8, 2010)

    • 57 min
    • video
    5. Performance Considerations (April 13, 2010)

    Lukas Biewald of Delores Labs discusses performance considerations, including memory coalescing, shared memory bank conflicts, control-flow divergence, occupancy, and kernel launch overheads. (April 13, 2010)

    • 59 min
    • video
    6. Parallel Patterns I (April 15, 2010)

    Students are taught how to effectively program massively parallel processors using the CUDA C programming language. Students also develop familiarity with the language itself and are exposed to the architecture of modern GPUs. (April 15, 2010)

    • 37 min
