4、TORCH.CUDA

这个包增加了对 CUDA 张量类型的支持,它实现了与 CPU 张量相同的功能,但它们利用 GPU 进行计算。

它被延迟初始化,因此您可以随时导入它,并使用 is_available() 来确定您的系统是否支持 CUDA。CUDA 语义有更多关于使用 CUDA 的细节。

StreamContext

Context-manager that selects a given stream.

 

can_device_access_peer

Checks if peer access between two devices is possible.

 

current_blas_handle

Returns cublasHandle_t pointer to current cuBLAS handle

 

current_device

Returns the index of a currently selected device.

 

current_stream

Returns the currently selected Stream for a given device.

 

default_stream

Returns the default Stream for a given device.

 

device

Context-manager that changes the selected device.

 

device_count

Returns the number of GPUs available.

 

device_of

Context-manager that changes the current device to that of given object.

 

get_arch_list

Returns list CUDA architectures this library was compiled for.

 

get_device_capability

Gets the cuda capability of a device.

 

get_device_name

Gets the name of a device.

 

get_device_properties

Gets the properties of a device.

 

get_gencode_flags

Returns NVCC gencode flags this library was compiled with.

 

init

Initialize PyTorch’s CUDA state.

 

ipc_collect

Force collects GPU memory after it has been released by CUDA IPC.

 

is_available

Returns a bool indicating if CUDA is currently available.

 

is_initialized

Returns whether PyTorch’s CUDA state has been initialized.

 

set_device

Sets the current device.

 

set_stream

Sets the current stream.This is a wrapper API to set the stream.

 

stream

Wrapper around the Context-manager StreamContext that selects a given stream.

 

synchronize

Waits for all kernels in all streams on a CUDA device to complete.

Random Number Generator

 

get_rng_state

Returns the random number generator state of the specified GPU as a ByteTensor.

 

get_rng_state_all

Returns a list of ByteTensor representing the random number states of all devices.

 

set_rng_state

Sets the random number generator state of the specified GPU.

 

set_rng_state_all

Sets the random number generator state of all devices.

 

manual_seed

Sets the seed for generating random numbers for the current GPU.

 

manual_seed_all

Sets the seed for generating random numbers on all GPUs.

 

seed

Sets the seed for generating random numbers to a random number for the current GPU.

 

seed_all

Sets the seed for generating random numbers to a random number on all GPUs.

 

initial_seed

Returns the current random seed of the current GPU.

Communication collectives

comm.broadcast

Broadcasts a tensor to specified GPU devices.

comm.broadcast_coalesced

Broadcasts a sequence tensors to the specified GPUs.

comm.reduce_add

Sums tensors from multiple GPUs.

comm.scatter

Scatters tensor across multiple GPUs.

comm.gather

Gathers tensors from multiple GPU devices.

Streams and events

 

Stream

Wrapper around a CUDA stream.

 

Event

Wrapper around a CUDA event.

Memory management

 

empty_cache

Releases all unoccupied cached memory currently held by the caching allocator so that those can be used in other GPU application and visible in nvidia-smi.

 

list_gpu_processes

Returns a human-readable printout of the running processes and their GPU memory use for a given device.

 

memory_stats

Returns a dictionary of CUDA memory allocator statistics for a given device.

 

memory_summary

Returns a human-readable printout of the current memory allocator statistics for a given device.

 

memory_snapshot

Returns a snapshot of the CUDA memory allocator state across all devices.

 

memory_allocated

Returns the current GPU memory occupied by tensors in bytes for a given device.

 

max_memory_allocated

Returns the maximum GPU memory occupied by tensors in bytes for a given device.

 

reset_max_memory_allocated

Resets the starting point in tracking maximum GPU memory occupied by tensors for a given device.

 

memory_reserved

Returns the current GPU memory managed by the caching allocator in bytes for a given device.

 

max_memory_reserved

Returns the maximum GPU memory managed by the caching allocator in bytes for a given device.

 

set_per_process_memory_fraction

Set memory fraction for a process.

 

memory_cached

Deprecated; see memory_reserved().

 

max_memory_cached

Deprecated; see max_memory_reserved().

 

reset_max_memory_cached

Resets the starting point in tracking maximum GPU memory managed by the caching allocator for a given device.

 

reset_peak_memory_stats

Resets the “peak” stats tracked by the CUDA memory allocator.

NVIDIA Tools Extension (NVTX)

nvtx.mark

Describe an instantaneous event that occurred at some point.

nvtx.range_push

Pushes a range onto a stack of nested range span.

nvtx.range_pop

Pops a range off of a stack of nested range spans.

 

 

 

你可能感兴趣的:(pytorch)