Tensor Basics

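ATen, the tensor library that underpins PyTorch, is a simple tensor library that exposes Torch's Tensor operations directly in C++17. ATen's API is generated automatically from the same declarations PyTorch uses, so the two APIs stay in sync over time.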

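Tensor types are resolved dynamically, so the API is generic and contains no templates. In other words, there is a single Tensor type. It can hold a CPU tensor or a CUDA tensor, and its elements can be Double, Float, Int, and so on. This design makes it easy to write generic code without templating everything.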
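To make the idea of "one Tensor type, element type resolved at runtime" concrete, here is a minimal standard-library sketch. The names `DynTensor` and `DType` are hypothetical and exist only for this illustration; this is not how ATen is implemented, only a toy showing the same design choice.

```cpp
#include <cstdint>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical names throughout; a toy, not ATen's implementation.
enum class DType { Float, Double, Int };

// A single non-template class: the element type is a runtime property.
class DynTensor {
 public:
  explicit DynTensor(std::vector<float> v) : data_(std::move(v)) {}
  explicit DynTensor(std::vector<double> v) : data_(std::move(v)) {}
  explicit DynTensor(std::vector<std::int32_t> v) : data_(std::move(v)) {}

  // Report the element type that was chosen when the object was built.
  DType dtype() const {
    switch (data_.index()) {
      case 0: return DType::Float;
      case 1: return DType::Double;
      default: return DType::Int;
    }
  }

  // The dispatch on the element type happens once per call, at runtime.
  double sum() const {
    return std::visit(
        [](const auto& v) {
          double total = 0.0;
          for (auto x : v) total += static_cast<double>(x);
          return total;
        },
        data_);
  }

 private:
  std::variant<std::vector<float>, std::vector<double>,
               std::vector<std::int32_t>> data_;
};

// Generic code needs no template parameter: it accepts any DynTensor.
inline bool is_positive_sum(const DynTensor& t) { return t.sum() > 0.0; }
```

A caller can pass a float-backed or an int-backed `DynTensor` to `is_positive_sum` without the function being a template, which is the property the paragraph above describes.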

For the provided API, see https://pytorch.com.tw/cppdocs/api/namespace_at.html#functions. An excerpt:

Tensor atan2(const Tensor & other) const;
Tensor & atan2_(const Tensor & other);
Tensor pow(Scalar exponent) const;
Tensor pow(const Tensor & exponent) const;
Tensor & pow_(Scalar exponent);
Tensor & pow_(const Tensor & exponent);
Tensor lerp(const Tensor & end, Scalar weight) const;
Tensor & lerp_(const Tensor & end, Scalar weight);
Tensor histc() const;
Tensor histc(int64_t bins) const;
Tensor histc(int64_t bins, Scalar min) const;
Tensor histc(int64_t bins, Scalar min, Scalar max) const;

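In-place operations are also provided. They always carry a trailing underscore (_) to indicate that they modify the Tensor they are called on.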
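The naming convention can be illustrated with a small standard-library sketch. `Vec` is a hypothetical toy type; only the shape of the two signatures (`pow` is `const` and returns a new object, `pow_` returns a reference to the modified object) mirrors the excerpt above.

```cpp
#include <cmath>
#include <vector>

// Hypothetical toy type; only the naming convention mirrors ATen.
struct Vec {
  std::vector<double> v;

  // Out-of-place: const method, returns a new object, leaves *this untouched.
  Vec pow(double exponent) const {
    Vec out = *this;
    out.pow_(exponent);
    return out;
  }

  // In-place: trailing underscore, mutates *this, and returns a reference
  // so that calls can be chained.
  Vec& pow_(double exponent) {
    for (double& x : v) x = std::pow(x, exponent);
    return *this;
  }
};
```

With this sketch, `a.pow(2.0)` leaves `a` unchanged, while `a.pow_(2.0).pow_(2.0)` raises every element of `a` to the fourth power.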

Efficient Access to Tensor Elements

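When you use Tensor-wide operations, the relative cost of dynamic dispatch is very small. In some cases, however, especially in your own kernels, you need efficient element-wise access, and dynamic dispatch inside an element-wise loop is very expensive. ATen provides accessors, which are created with a single dynamic check that the Tensor has the expected type and number of dimensions. The accessor then exposes an API for accessing the Tensor's elements efficiently.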

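Accessors are temporary views of a Tensor. They are valid only for the lifetime of the Tensor they view, so they should be used only locally within a function, much like iterators.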

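Note that accessors are not compatible with CUDA tensors inside kernel functions. There you must use a packed accessor instead, which behaves the same way but copies the tensor metadata rather than pointing to it.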

It is therefore recommended to use accessors for CPU tensors and packed accessors for CUDA tensors.
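The "check once, then index cheaply" pattern can be sketched in standard C++ without LibTorch. Everything here (`ToyTensor`, `Accessor2D`, `make_accessor`, `trace`) is a hypothetical name for illustration; the real accessor types are more general, and this sketch only shows where the single dynamic check sits relative to the loop.

```cpp
#include <cstdint>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a dynamically typed tensor.
enum class DType { Float, Double };

struct ToyTensor {
  DType dtype;
  std::vector<std::int64_t> sizes;    // one entry per dimension
  std::vector<std::int64_t> strides;  // in elements, not bytes
  void* data;                         // untyped storage, owned elsewhere
};

// Statically typed, non-owning 2-D view: indexing is plain pointer arithmetic.
template <typename T>
class Accessor2D {
 public:
  Accessor2D(T* data, std::int64_t rows, std::int64_t cols,
             std::int64_t s0, std::int64_t s1)
      : data_(data), rows_(rows), cols_(cols), s0_(s0), s1_(s1) {}
  std::int64_t size(int dim) const { return dim == 0 ? rows_ : cols_; }
  T& operator()(std::int64_t i, std::int64_t j) const {
    return data_[i * s0_ + j * s1_];
  }

 private:
  T* data_;
  std::int64_t rows_, cols_, s0_, s1_;
};

template <typename T>
constexpr DType dtype_of() {
  return std::is_same_v<T, float> ? DType::Float : DType::Double;
}

// The only dynamic check: performed once, when the accessor is created.
template <typename T>
Accessor2D<T> make_accessor(const ToyTensor& t) {
  if (t.dtype != dtype_of<T>()) throw std::invalid_argument("dtype mismatch");
  if (t.sizes.size() != 2) throw std::invalid_argument("expected 2 dimensions");
  return Accessor2D<T>(static_cast<T*>(t.data), t.sizes[0], t.sizes[1],
                       t.strides[0], t.strides[1]);
}

// The loop body contains no type checks and no dispatch.
inline float trace(const ToyTensor& t) {
  auto a = make_accessor<float>(t);
  float sum = 0;
  for (std::int64_t i = 0; i < a.size(0); ++i) sum += a(i, i);
  return sum;
}
```

Because the accessor only stores a pointer and strides, it must not outlive the buffer it views, which is the same lifetime rule stated above for real accessors.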

CPU Accessors

torch::Tensor foo = torch::rand({12, 12});

// assert foo is 2-dimensional and holds floats.
auto foo_a = foo.accessor<float,2>();
float trace = 0;

for(int i = 0; i < foo_a.size(0); i++) {
  // use the accessor foo_a to get tensor data.
  trace += foo_a[i][i];
}

CUDA Accessors

__global__ void packed_accessor_kernel(
    torch::PackedTensorAccessor64<float, 2> foo,
    float* trace) {
  int i = threadIdx.x;
  gpuAtomicAdd(trace, foo[i][i]);
}

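// The kernel dereferences device memory, so foo must live on the GPU.
torch::Tensor foo = torch::rand({12, 12}, torch::device(torch::kCUDA));

// assert foo is 2-dimensional and holds floats.
auto foo_a = foo.packed_accessor64<float,2>();
// The result must be in device memory too; a pointer to a host variable
// is not valid inside the kernel.
torch::Tensor trace = torch::zeros({1}, foo.options());

packed_accessor_kernel<<<1, 12>>>(foo_a, trace.data_ptr<float>());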

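In addition to PackedTensorAccessor64 and packed_accessor64, there are the corresponding PackedTensorAccessor32 and packed_accessor32, which use 32-bit integers for indexing. This can be faster on CUDA, but it may cause the index computations to overflow.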
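As a rough way to reason about that overflow risk, the sketch below computes the largest element offset a tensor can produce and compares it with the 32-bit signed limit. `fits_32bit_indexing` is a hypothetical helper written for this illustration, not an ATen function, and it assumes non-negative strides measured in elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: returns true when every element offset of a tensor
// with the given sizes and (non-negative, element-based) strides fits in a
// signed 32-bit integer. The arithmetic is done in 64 bits.
inline bool fits_32bit_indexing(const std::vector<std::int64_t>& sizes,
                                const std::vector<std::int64_t>& strides) {
  std::int64_t max_offset = 0;
  for (std::size_t d = 0; d < sizes.size(); ++d) {
    if (sizes[d] == 0) return true;  // an empty tensor has nothing to address
    max_offset += (sizes[d] - 1) * strides[d];
  }
  return max_offset <= std::numeric_limits<std::int32_t>::max();
}
```

For the 12 x 12 tensor used above, the largest offset is 11 * 12 + 11 = 143, which fits easily. For a contiguous 50000 x 50000 tensor it is 49999 * 50000 + 49999 = 2,499,999,999, which exceeds 2,147,483,647, so 64-bit indexing would be the safe choice there.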

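Note that the templates can take additional parameters, such as the pointer restriction and the integer type used for indexing. See the documentation for a complete description of the accessor and packed accessor templates.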

Using Externally Created Data

If your tensor data is already allocated in memory (CPU or CUDA), you can view that memory as a Tensor in ATen:

float data[] = { 1, 2, 3,
                 4, 5, 6 };
torch::Tensor f = torch::from_blob(data, {2, 3});

These tensors cannot be resized because ATen does not own the memory, but otherwise they behave like ordinary tensors.
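What "viewing memory without owning it" implies can be shown with a small standard-library sketch. `BlobView` and `view_of` are hypothetical names for this illustration and are not part of ATen; the point is only that the view aliases the caller's buffer, so writes are visible on both sides and the buffer must outlive the view.

```cpp
#include <cstddef>

// Hypothetical non-owning 2-D view: it stores a pointer, never a copy,
// and it never frees or reallocates the memory it points to.
struct BlobView {
  float* data;
  std::size_t rows;
  std::size_t cols;

  // Row-major element access into the caller's buffer.
  float& at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Wrap an existing buffer; the caller keeps ownership of `data`.
inline BlobView view_of(float* data, std::size_t rows, std::size_t cols) {
  return BlobView{data, rows, cols};
}
```

Since the view cannot allocate, there is no sensible way for it to grow, which matches the restriction on resizing described above.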

Scalars and Zero-Dimensional Tensors

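In addition to Tensor objects, ATen includes Scalars, which represent a single number. Like a Tensor, a Scalar is dynamically typed and can hold any of ATen's number types. Scalars can be constructed implicitly from C++ number types. Scalars are needed because some functions, such as addmm, take numbers along with Tensors and expect those numbers to have the same dynamic type as the Tensor. They are also used in the API to mark places where a function always returns a Scalar value, such as sum.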

namespace torch {
Tensor addmm(Scalar beta, const Tensor & self,
             Scalar alpha, const Tensor & mat1,
             const Tensor & mat2);
Scalar sum(const Tensor & self);
} // namespace torch

// Usage.
torch::Tensor a = ...
torch::Tensor b = ...
torch::Tensor c = ...
torch::Tensor r = torch::addmm(1.0, a, .5, b, c);
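The role of a dynamically typed number can be sketched with the standard library alone. `ToyScalar` and `addmm_1x1` are hypothetical names for this illustration; the sketch only shows implicit construction from C++ numbers and conversion to the element type of the data, applied to the formula addmm computes, beta * self + alpha * (mat1 * mat2), reduced here to single floats.

```cpp
#include <cstdint>
#include <variant>

// Hypothetical toy; not ATen's Scalar.
class ToyScalar {
 public:
  // Non-explicit constructors allow implicit conversion from C++ numbers.
  ToyScalar(double v) : v_(v) {}
  ToyScalar(std::int64_t v) : v_(v) {}
  ToyScalar(int v) : v_(static_cast<std::int64_t>(v)) {}

  bool is_floating_point() const { return std::holds_alternative<double>(v_); }

  // Convert the stored number to the element type the caller works in.
  template <typename T>
  T to() const {
    return std::visit([](auto x) { return static_cast<T>(x); }, v_);
  }

 private:
  std::variant<std::int64_t, double> v_;
};

// 1x1 version of beta * self + alpha * (mat1 * mat2): the scalar arguments
// are converted to float, the element type of the data.
inline float addmm_1x1(ToyScalar beta, float self, ToyScalar alpha,
                       float mat1, float mat2) {
  return beta.to<float>() * self + alpha.to<float>() * (mat1 * mat2);
}
```

A call such as `addmm_1x1(1.0, 2.0f, .5, 3.0f, 4.0f)` passes plain C++ literals, which become `ToyScalar` values implicitly, in the same spirit as the `1.0` and `.5` arguments in the usage above.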

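In addition to Scalars, ATen allows Tensor objects to be zero-dimensional. Such a Tensor holds a single value, and it can be a reference to a single element of a larger Tensor. It can be used anywhere a Tensor is expected. Zero-dimensional Tensors are usually created by operations such as select, which reduce the number of dimensions of a Tensor.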

torch::Tensor two = torch::rand({10, 20});
two[1][2] = 4;
// ^^^^^^ <- zero-dimensional Tensor
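How two successive selections produce a zero-dimensional reference into the larger tensor can be sketched with a toy strided view. `View`, `select0`, and `item` are hypothetical names for this illustration, not ATen's implementation; the sketch assumes a float buffer owned by the caller.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical strided view over a float buffer it does not own.
struct View {
  float* data;
  std::vector<std::int64_t> sizes;
  std::vector<std::int64_t> strides;  // in elements

  std::int64_t dim() const { return static_cast<std::int64_t>(sizes.size()); }

  // Fix index `i` along dimension 0. The result has one dimension fewer
  // and aliases the same memory, shifted by i * strides[0] elements.
  View select0(std::int64_t i) const {
    View out{data + i * strides[0], {}, {}};
    out.sizes.assign(sizes.begin() + 1, sizes.end());
    out.strides.assign(strides.begin() + 1, strides.end());
    return out;
  }

  // Meaningful for a zero-dimensional view, which holds exactly one value.
  float& item() const { return *data; }
};
```

Starting from a 10 x 20 view, `two.select0(1).select0(2)` has zero dimensions, and assigning through `item()` writes to element (1, 2) of the original buffer, which parallels `two[1][2] = 4` above.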