<PREV> <INDEX> <NEXT>

Massively Parallel Computers - GPU Computing

Hardware architecture (Nvidia, AMD)

fpg1.math - NVIDIA Tesla C2075:

[figure: CPU-GPU system layout]


GPU architecture:

Streaming Processors (SP): share control logic and instruction cache ->
Streaming Multiprocessors (SM) ->
building block of the GPU

[figure: GPU architecture]

GPU RAM is separate from system RAM: memory is not shared between the host and the GPU, so data must be copied explicitly between the two.
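Because the two memories are distinct, every transfer is an explicit call. A minimal CUDA sketch of the copy pattern (array name and size are illustrative):

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

int main(void) {
    const int n = 1024;
    size_t bytes = n * sizeof(float);

    float *h_a = (float *)malloc(bytes);   // lives in system RAM
    float *d_a = NULL;                     // will live in GPU RAM

    cudaMalloc((void **)&d_a, bytes);      // allocate on the device
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);  // host -> device
    /* ... kernel launches would go here ... */
    cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost);  // device -> host

    cudaFree(d_a);
    free(h_a);
    return 0;
}
```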


Software

Programming

CUDA:

Steps:

1. Allocate memory on the device (cudaMalloc)
2. Copy input data from host to device (cudaMemcpy)
3. Launch the kernel on the device
4. Copy results back from device to host (cudaMemcpy)
5. Free device memory (cudaFree)

Memory model:

[figure: CUDA memory model]
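The steps above can be sketched as one complete CUDA program, a vector addition (names, sizes, and the block size of 256 are illustrative choices, not prescribed by the notes):

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// Device kernel: one thread computes one element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // 1. Allocate host and device memory.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    float *d_a, *d_b, *d_c;
    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);

    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // 2. Copy inputs host -> device.
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // 3. Launch: enough blocks of 256 threads to cover n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // 4. Copy the result device -> host.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    // 5. Free device (and host) memory.
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```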


Compile and execute

c++  -c -> compile the host sources into object files
nvcc -c -> compile the device kernel into an object file
c++     -> link the objects (with the CUDA runtime library) into the final binary
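Concretely, the build might look like the following (filenames and the -lcudart flag are illustrative; exact paths and flags depend on the CUDA installation):

```shell
c++  -c main.cpp   -o main.o          # host code -> object file
nvcc -c kernel.cu  -o kernel.o        # device kernel -> object file
c++  main.o kernel.o -lcudart -o app  # link the final binary
./app
```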


