torch.nn.init

Created on: Jun 11, 2019 | Last updated on: Jul 07, 2022

Warning

All functions in this module are intended to initialize neural network parameters, so they all run in torch.no_grad() mode and are not tracked by autograd.
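The no-grad behavior described above can be checked with a minimal sketch. The variable names (`p`, `still_leaf`, `no_grad_fn`) are illustrative only.

```python
import torch
from torch import nn

# A Parameter requires grad by default. A plain in-place op on such a leaf
# would raise outside torch.no_grad(); the init functions avoid this by
# running under no_grad internally.
p = nn.Parameter(torch.empty(3, 5))
nn.init.normal_(p)

still_leaf = p.is_leaf          # the parameter is still a leaf tensor
no_grad_fn = p.grad_fn is None  # no autograd history was recorded
print(still_leaf, no_grad_fn, p.requires_grad)
```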

torch.nn.init.calculate_gain(nonlinearity, param=None)

Return the recommended gain value for the given nonlinearity function.

The values are as follows:

nonlinearity	gain
Linear / Identity	$1$
Conv{1,2,3}D	$1$
Sigmoid	$1$
Tanh	$\frac{5}{3}$
ReLU	$\sqrt{2}$
Leaky ReLU	$\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}$
SELU	$\frac{3}{4}$

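Warning

To implement Self-Normalizing Neural Networks, use nonlinearity='linear' instead of nonlinearity='selu'. This gives the initial weights a variance of 1 / N, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for SELU sacrifices the normalization effect in exchange for more stable gradient flow in rectangular layers.

Parameters

nonlinearity (Literal['linear', 'conv1d', 'conv2d', 'conv3d', 'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d', 'sigmoid', 'tanh', 'relu', 'leaky_relu', 'selu']) – the nonlinear function (nn.functional name)
param (Optional[Union[int, float]]) – optional parameter for the nonlinear function

Return type

float

Example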

>>> gain = nn.init.calculate_gain(
...     "leaky_relu", 0.2
... )  # leaky_relu with negative_slope=0.2
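As a sanity check, the table entries can be compared against what calculate_gain returns. This is a minimal sketch; the `expected` dictionary simply restates the table and `negative_slope = 0.2` is an arbitrary illustrative value.

```python
import math

from torch import nn

# Closed-form gains copied from the table above, keyed by (nonlinearity, param).
negative_slope = 0.2
expected = {
    ("linear", None): 1.0,
    ("sigmoid", None): 1.0,
    ("tanh", None): 5.0 / 3.0,
    ("relu", None): math.sqrt(2.0),
    ("leaky_relu", negative_slope): math.sqrt(2.0 / (1.0 + negative_slope**2)),
    ("selu", None): 3.0 / 4.0,
}

# Ask the library for the same gains.
gains = {key: nn.init.calculate_gain(key[0], key[1]) for key in expected}
for key, value in gains.items():
    print(key, value)
```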

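torch.nn.init.uniform_(tensor, a=0.0, b=1.0, generator=None)

Fill the input Tensor with values drawn from the uniform distribution $\mathcal{U}(a, b)$.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
a (float) – the lower bound of the uniform distribution
b (float) – the upper bound of the uniform distribution
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example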

>>> w = torch.empty(3, 5)
>>> nn.init.uniform_(w)

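torch.nn.init.normal_(tensor, mean=0.0, std=1.0, generator=None)

Fill the input Tensor with values drawn from the normal distribution $\mathcal{N}(\text{mean}, \text{std}^2)$.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
mean (float) – the mean of the normal distribution
std (float) – the standard deviation of the normal distribution
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example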

>>> w = torch.empty(3, 5)
>>> nn.init.normal_(w)
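On a large tensor, the sample statistics should land close to the requested mean and std. A minimal sketch, with std=0.02 and the tensor shape chosen only for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

w = torch.empty(1000, 1000)
nn.init.normal_(w, mean=0.0, std=0.02)

sample_mean = w.mean().item()
sample_std = w.std().item()
print(sample_mean, sample_std)  # expected to be close to 0.0 and 0.02
```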

torch.nn.init.constant_(tensor, val)

Fill the input Tensor with the value $\text{val}$.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
val (float) – the value to fill the tensor with

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.constant_(w, 0.3)

torch.nn.init.ones_(tensor)

Fill the input Tensor with the scalar value 1.

Parameters: tensor (Tensor) – an n-dimensional torch.Tensor
Return type: Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.ones_(w)

torch.nn.init.zeros_(tensor)

Fill the input Tensor with the scalar value 0.

Parameters: tensor (Tensor) – an n-dimensional torch.Tensor
Return type: Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.zeros_(w)
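In practice, constant_, ones_, and zeros_ are often applied to the parameters of an existing model. The sketch below uses a hypothetical helper, `init_weights`, together with `nn.Module.apply`; the constant 0.01 for Linear weights is purely illustrative and not a recommended setting.

```python
import torch
from torch import nn


def init_weights(module: nn.Module) -> None:
    """Hypothetical helper: fill parameters with fixed values per layer type."""
    if isinstance(module, nn.Linear):
        nn.init.constant_(module.weight, 0.01)  # illustrative value only
        nn.init.zeros_(module.bias)
    elif isinstance(module, nn.BatchNorm1d):
        nn.init.ones_(module.weight)   # scale starts at 1
        nn.init.zeros_(module.bias)    # shift starts at 0


model = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 2))
model.apply(init_weights)  # apply() visits every submodule recursively
```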

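torch.nn.init.eye_(tensor)

Fill the 2-dimensional input Tensor with the identity matrix.

Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.

Parameters: tensor (Tensor) – a 2-dimensional torch.Tensor
Return type: Tensor

Example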

>>> w = torch.empty(3, 5)
>>> nn.init.eye_(w)
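To illustrate the identity-preserving property, the sketch below initializes a square, bias-free Linear layer with eye_ and checks that it returns its input unchanged. The layer sizes are arbitrary.

```python
import torch
from torch import nn

layer = nn.Linear(4, 4, bias=False)
nn.init.eye_(layer.weight)

x = torch.randn(2, 4)
with torch.no_grad():
    y = layer(x)  # x @ I.T, so the output should equal the input
identity_preserved = torch.allclose(y, x)

# For a non-square tensor, ones are placed on the main diagonal only.
w = torch.empty(3, 5)
nn.init.eye_(w)
print(identity_preserved)
print(w)
```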

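torch.nn.init.dirac_(tensor, groups=1)

Fill the {3, 4, 5}-dimensional input Tensor with the Dirac delta function.

Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible. If groups>1, each group of channels preserves identity.

Parameters

tensor (Tensor) – a {3, 4, 5}-dimensional torch.Tensor
groups (int, optional) – number of groups in the conv layer (default: 1)

Return type

Tensor

Example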

>>> w = torch.empty(3, 16, 5, 5)
>>> nn.init.dirac_(w)
>>> w = torch.empty(3, 24, 5, 5)
>>> nn.init.dirac_(w, 3)
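The identity property for convolutions can be sketched as follows. The setup is an assumption chosen for illustration: equal input and output channel counts, a 3x3 kernel, padding of 1 so the spatial size is kept, and no bias.

```python
import torch
from torch import nn

conv = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)
nn.init.dirac_(conv.weight)  # a single 1 at the kernel center of each matching channel

x = torch.randn(1, 3, 8, 8)
with torch.no_grad():
    y = conv(x)

# Each output channel should reproduce the corresponding input channel.
channels_preserved = torch.allclose(y, x, atol=1e-6)
print(channels_preserved)
```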

torch.nn.init.xavier_uniform_(tensor, gain=1.0, generator=None)

Fill the input Tensor with values using a Xavier uniform distribution.

The method is described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010). The resulting tensor will have values sampled from $\mathcal{U}(-a, a)$, where

$$a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$$

Also known as Glorot initialization.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
gain (float) – an optional scaling factor
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain("relu"))
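The bound $a$ can be computed by hand and compared with the sampled values. A minimal sketch for the (3, 5) tensor from the example, where fan_out = 3 and fan_in = 5; the small tolerance only guards against float32 rounding.

```python
import math

import torch
from torch import nn

w = torch.empty(3, 5)  # shape is [fan_out, fan_in]
gain = nn.init.calculate_gain("relu")
nn.init.xavier_uniform_(w, gain=gain)

# a = gain * sqrt(6 / (fan_in + fan_out))
a = gain * math.sqrt(6.0 / (5 + 3))
within_bound = bool((w.abs() <= a + 1e-6).all())
print(a, within_bound)
```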

torch.nn.init.xavier_normal_(tensor, gain=1.0, generator=None)

Fill the input Tensor with values using a Xavier normal distribution.

The method is described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010). The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^2)$, where

$$\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}$$

Also known as Glorot initialization.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
gain (float) – an optional scaling factor
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.xavier_normal_(w)
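With enough samples, the empirical standard deviation should approach the formula above. A minimal sketch on a larger tensor; the shape and seed are arbitrary.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(400, 600)  # fan_out = 400, fan_in = 600
nn.init.xavier_normal_(w)  # gain defaults to 1.0

expected_std = math.sqrt(2.0 / (600 + 400))
sample_std = w.std().item()
print(expected_std, sample_std)
```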

torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu', generator=None)

Fill the input Tensor with values using a Kaiming uniform distribution.

The method is described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015). The resulting tensor will have values sampled from $\mathcal{U}(-\text{bound}, \text{bound})$, where

$$\text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$$

Here fan_mode is fan_in or fan_out, as selected by mode. Also known as He initialization.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
a (float) – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
mode (Literal['fan_in', 'fan_out']) – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (Literal['linear', 'conv1d', 'conv2d', 'conv3d', 'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d', 'sigmoid', 'tanh', 'relu', 'leaky_relu', 'selu']) – the nonlinear function (nn.functional name), recommended for use only with 'relu' or 'leaky_relu' (default).
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.kaiming_uniform_(w, mode="fan_in", nonlinearity="relu")

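Note

Be aware that fan_in and fan_out are calculated assuming that the weight matrix is used in a transposed manner (i.e., x @ w.T in Linear layers, where w.shape = [fan_out, fan_in]). This is important for correct initialization. If you plan to use x @ w, where w.shape = [fan_in, fan_out], pass in a transposed weight matrix, e.g. nn.init.kaiming_uniform_(w.T, ...).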
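The transposed-weight case from the note can be sketched as follows. The sizes `fan_in = 5` and `fan_out = 3` are illustrative; since w.T is a view of w, filling the view in place also fills w.

```python
import math

import torch
from torch import nn

fan_in, fan_out = 5, 3
w = torch.empty(fan_in, fan_out)  # stored as [fan_in, fan_out], used as x @ w

# Pass the transposed view so the function sees shape [fan_out, fan_in].
nn.init.kaiming_uniform_(w.T, mode="fan_in", nonlinearity="relu")

# bound = gain * sqrt(3 / fan_in), with gain = sqrt(2) for ReLU
bound = math.sqrt(2.0) * math.sqrt(3.0 / fan_in)
within_bound = bool((w.abs() <= bound + 1e-6).all())

x = torch.randn(2, fan_in)
out = x @ w
print(within_bound, out.shape)
```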

torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu', generator=None)

Fill the input Tensor with values using a Kaiming normal distribution.

The method is described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015). The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^2)$, where

$$\text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}$$

Here fan_mode is fan_in or fan_out, as selected by mode. Also known as He initialization.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
a (float) – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
mode (Literal['fan_in', 'fan_out']) – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (Literal['linear', 'conv1d', 'conv2d', 'conv3d', 'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d', 'sigmoid', 'tanh', 'relu', 'leaky_relu', 'selu']) – the nonlinear function (nn.functional name), recommended for use only with 'relu' or 'leaky_relu' (default).
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.kaiming_normal_(w, mode="fan_out", nonlinearity="relu")
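The fan_out variant can be checked against the formula in the same way. A minimal sketch; the shape and seed are arbitrary, and gain is $\sqrt{2}$ for ReLU.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(512, 256)  # fan_out = 512, fan_in = 256
nn.init.kaiming_normal_(w, mode="fan_out", nonlinearity="relu")

# std = gain / sqrt(fan_out)
expected_std = math.sqrt(2.0) / math.sqrt(512)
sample_std = w.std().item()
print(expected_std, sample_std)
```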

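Note

As with kaiming_uniform_, fan_in and fan_out are calculated assuming that the weight matrix is used in a transposed manner (i.e., x @ w.T in Linear layers, where w.shape = [fan_out, fan_in]). If you plan to use x @ w, where w.shape = [fan_in, fan_out], pass in a transposed weight matrix, e.g. nn.init.kaiming_normal_(w.T, ...).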

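torch.nn.init.trunc_normal_(tensor, mean=0.0, std=1.0, a=-2.0, b=2.0, generator=None)

Fill the input Tensor with values drawn from a truncated normal distribution.

The values are effectively drawn from the normal distribution $\mathcal{N}(\text{mean}, \text{std}^2)$, with values outside $[a, b]$ redrawn until they are within the bounds. The method used for generating the random values works best when $a \leq \text{mean} \leq b$.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor
mean (float) – the mean of the normal distribution
std (float) – the standard deviation of the normal distribution
a (float) – the minimum cutoff value
b (float) – the maximum cutoff value
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example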

>>> w = torch.empty(3, 5)
>>> nn.init.trunc_normal_(w)
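Note that a and b are absolute cutoffs, not multiples of std, so with a small std the defaults of ±2 truncate almost nothing. The sketch below sets the cutoffs to two standard deviations explicitly; std=0.02 and the shape are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)

std = 0.02
w = torch.empty(1000, 100)
# Cutoffs are given in the same units as the values, here +/- 2 * std.
nn.init.trunc_normal_(w, mean=0.0, std=std, a=-2 * std, b=2 * std)

lo, hi = w.min().item(), w.max().item()
print(lo, hi)  # both expected to lie within [-0.04, 0.04]
```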

torch.nn.init.orthogonal_(tensor, gain=1, generator=None)

Fill the input Tensor with a (semi) orthogonal matrix.

Described in Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.

Parameters

tensor (Tensor) – an n-dimensional torch.Tensor, where $n \geq 2$
gain (float) – an optional scaling factor
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example

>>> w = torch.empty(3, 5)
>>> nn.init.orthogonal_(w)
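For a (3, 5) tensor the result is semi-orthogonal: with fewer rows than columns, the rows are orthonormal, so $w w^\top$ should be close to the 3x3 identity. A minimal sketch with the default gain of 1:

```python
import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(3, 5)
nn.init.orthogonal_(w)

gram = w @ w.T  # 3x3 Gram matrix of the rows
rows_orthonormal = torch.allclose(gram, torch.eye(3), atol=1e-5)
print(rows_orthonormal)
```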

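torch.nn.init.sparse_(tensor, sparsity, std=0.01, generator=None)

Fill the 2-dimensional input Tensor as a sparse matrix.

The non-zero elements will be drawn from the normal distribution $\mathcal{N}(0, \text{std}^2)$ (std = 0.01 by default), as described in Deep learning via Hessian-free optimization - Martens, J. (2010).

Parameters

tensor (Tensor) – a 2-dimensional torch.Tensor
sparsity (float) – the fraction of elements in each column to be set to zero
std (float) – the standard deviation of the normal distribution used to generate the non-zero values
generator (Optional[Generator]) – the torch Generator to sample from (default: None)

Return type

Tensor

Example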

>>> w = torch.empty(3, 5)
>>> nn.init.sparse_(w, sparsity=0.1)
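The per-column sparsity can be verified by counting zeros. A minimal sketch with 10 rows and sparsity=0.5, so 5 entries per column are expected to be zeroed; the shape and seed are arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(10, 4)
nn.init.sparse_(w, sparsity=0.5, std=0.01)

# Count the zero entries in each column.
zeros_per_column = (w == 0).sum(dim=0)
print(zeros_per_column)
```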
