torch.cuda.comm.broadcast_coalesced

torch.cuda.comm.broadcast_coalesced(tensors, devices, buffer_size=10485760)

Broadcasts a sequence of tensors to the specified GPUs.

Small tensors are first coalesced into a buffer to reduce the number of synchronizations.
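The coalescing idea can be illustrated on the CPU. The sketch below is not the library's internal implementation; `coalesce` and `uncoalesce` are hypothetical helper names introduced here, and the sketch assumes all tensors share one dtype. It packs several small tensors into one flat buffer, which could then be transferred in a single copy, and splits the buffer back into the original shapes.

```python
import torch

def coalesce(tensors):
    """Pack same-dtype tensors into one flat 1-D buffer (hypothetical helper)."""
    return torch.cat([t.reshape(-1) for t in tensors])

def uncoalesce(flat, like):
    """Split a flat buffer back into views shaped like the tensors in `like`."""
    sizes = [t.numel() for t in like]
    chunks = torch.split(flat, sizes)  # one 1-D chunk per original tensor
    return [chunk.view_as(t) for chunk, t in zip(chunks, like)]

weights = [torch.ones(2, 3), torch.zeros(4), torch.full((1, 2), 5.0)]
flat = coalesce(weights)              # a single buffer of 6 + 4 + 2 elements
restored = uncoalesce(flat, weights)  # same shapes and values as `weights`
```

Moving one buffer instead of three separate tensors is what lets a broadcast issue fewer transfers, and therefore fewer synchronization points.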

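Parameters

tensors (sequence) – tensors to broadcast. They must all be on the same device, either CPU or GPU.
devices (Iterable[torch.device, str or int]) – an iterable of GPU devices among which to broadcast.
buffer_size (int) – maximum size of the buffer used for coalescing.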

Returns

A tuple containing copies of the tensors, placed on devices.
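A minimal usage sketch follows. `broadcast_to_all_gpus` is a hypothetical wrapper, not part of PyTorch. It assumes the source tensors should first be moved to the first target GPU, and it returns `None` on machines without CUDA so the snippet still runs there.

```python
import torch
import torch.cuda.comm as comm

def broadcast_to_all_gpus(tensors, buffer_size=10485760):
    """Hypothetical wrapper: copy `tensors` to every visible GPU."""
    if not torch.cuda.is_available():
        return None  # no GPUs to broadcast to
    devices = list(range(torch.cuda.device_count()))
    # Assumption: keep all source tensors on one device, the first target GPU.
    src = [t.to(f"cuda:{devices[0]}") for t in tensors]
    return comm.broadcast_coalesced(src, devices, buffer_size=buffer_size)

params = [torch.randn(3, 3), torch.randn(5)]
copies = broadcast_to_all_gpus(params)

if copies is not None:
    # Assumption: results are grouped per device, one entry per input tensor.
    for dev_idx, per_device in enumerate(copies):
        print(dev_idx, [t.device for t in per_device])
```

The loop assumes the result is indexed first by device and then by tensor; verify that layout against the PyTorch version you are running.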
