Configurations of Shared Memory
- Each SM has 64 KB of on-chip memory, which is partitioned between shared memory and L1 cache.
- On Kepler devices, L1 cache is used to cache local memory accesses such as register spills.
Per-device Configuration
- API: cudaDeviceSetCacheConfig()
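A minimal sketch of the device-wide setting, assuming a single CUDA device and the runtime API. The choice of cudaFuncCachePreferShared is only an illustration; the value is a preference, and devices whose shared memory/L1 split is not configurable may ignore it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Ask the runtime to favor shared memory over L1 for kernels on the
    // current device. This is a preference, not a guarantee.
    cudaError_t err = cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetCacheConfig: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Read the preference back to see what the device reports.
    cudaFuncCache config;
    err = cudaDeviceGetCacheConfig(&config);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetCacheConfig: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Device prefers shared memory: %s\n",
           config == cudaFuncCachePreferShared ? "yes" : "no");
    return 0;
}
```

The other accepted values are cudaFuncCachePreferNone, cudaFuncCachePreferL1 and cudaFuncCachePreferEqual.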
Per-kernel Configuration
- A per-kernel configuration can also override the device-wide setting.
- API: cudaFuncSetCacheConfig()
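The per-kernel override might be used as in the sketch below. The kernel reverseBlock, its 256-element tile, and the launch configuration are hypothetical and exist only to give the call something to act on; as with the device-wide setting, the value is a preference that the runtime may not honor.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 256  // hypothetical block size for this illustration

// Hypothetical kernel: reverses the elements of each block through shared memory.
__global__ void reverseBlock(int *data) {
    __shared__ int tile[BLOCK];
    int t = threadIdx.x;
    int base = blockIdx.x * blockDim.x;
    tile[t] = data[base + t];
    __syncthreads();
    data[base + t] = tile[blockDim.x - 1 - t];
}

int main() {
    // Prefer a larger shared memory partition for this kernel only.
    cudaError_t err = cudaFuncSetCacheConfig(reverseBlock, cudaFuncCachePreferShared);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncSetCacheConfig: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int host[BLOCK];
    for (int i = 0; i < BLOCK; ++i) host[i] = i;

    int *dev = nullptr;
    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return 1;
    cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);

    reverseBlock<<<1, BLOCK>>>(dev);

    // cudaMemcpy waits for the kernel to finish before copying back.
    err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // After the reversal, the first element should hold the last input value.
    printf("%s\n", host[0] == BLOCK - 1 ? "reversed" : "mismatch");
    return 0;
}
```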
Configuration of Bank Size
- A larger bank size may yield higher bandwidth for shared memory access, but may also result in more bank conflicts, depending on the application's shared memory access patterns.
- You can use the following function to set a new bank size on devices with configurable shared memory banks:
cudaDeviceSetSharedMemConfig()
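A minimal sketch of changing the bank size, assuming a device with configurable shared memory banks (for example Kepler). The 8-byte setting is chosen only for illustration; on devices with a fixed bank size the call is expected to have no effect, so the value read back may differ from the one requested.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Request 8-byte shared memory banks on the current device.
    cudaError_t err = cudaDeviceSetSharedMemConfig(cudaSharedMemBankSizeEightByte);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetSharedMemConfig: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Query the bank size the device actually reports.
    cudaSharedMemConfig config;
    err = cudaDeviceGetSharedMemConfig(&config);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetSharedMemConfig: %s\n", cudaGetErrorString(err));
        return 1;
    }

    if (config == cudaSharedMemBankSizeEightByte) {
        printf("Bank size: 8 bytes\n");
    } else if (config == cudaSharedMemBankSizeFourByte) {
        printf("Bank size: 4 bytes\n");
    } else {
        printf("Bank size: device default\n");
    }
    return 0;
}
```

An 8-byte bank size tends to suit kernels that access shared memory mostly as double or other 64-bit values, while 4-byte banks match 32-bit access patterns.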