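Decoding / Encoding images and videos

The torchvision.io module provides utilities for decoding and encoding images and videos.

Image Decoding

Torchvision currently supports decoding JPEG, PNG, WEBP, GIF, AVIF, and HEIC images. JPEG decoding can also be done on CUDA GPUs.

The main entry point is the `decode_image()` function, which you can use as an alternative to `PIL.Image.open()`. It decodes images straight into image Tensors, which saves you a conversion step and lets you run your transforms/preprocessing natively on tensors.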

from torchvision.io import decode_image

img = decode_image("path_to_image", mode="RGB")
img.dtype  # torch.uint8

# Or
raw_encoded_bytes = ...  # read encoded bytes from your file system
img = decode_image(raw_encoded_bytes, mode="RGB")

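`decode_image()` automatically detects the image format and calls the corresponding decoder (except for HEIC and AVIF images; see `decode_avif()` and `decode_heic()` for details). You can also use the lower-level, format-specific decoders, which can be more powerful, for example if you want to encode/decode JPEGs on CUDA.

`decode_image`(input[, mode, ...])	Decode an image into a uint8 Tensor, from a file path or from raw encoded bytes.
`decode_jpeg`(input[, mode, device, ...])	Decode a JPEG image into a 3D RGB or grayscale Tensor, on CPU or CUDA.
`decode_png`(input[, mode, apply_exif_orientation])	Decode a PNG image into a 3-dimensional RGB or grayscale Tensor.
`decode_webp`(input[, mode])	Decode a WEBP image into a 3-dimensional RGB[A] Tensor.
`decode_avif`(input[, mode])	Decode an AVIF image into a 3-dimensional RGB[A] Tensor.
`decode_heic`(input[, mode])	Decode a HEIC image into a 3-dimensional RGB[A] Tensor.
`decode_gif`(input)	Decode a GIF image into a 3- or 4-dimensional RGB Tensor.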

`ImageReadMode`(value)

Allows automatic conversion to RGB, RGBA, etc. while decoding.

Deprecated decoding functions

`read_image`(path[, mode, apply_exif_orientation])

[DEPRECATED] Use `decode_image()` instead.

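Image Encoding

For encoding, JPEG (CPU and CUDA) and PNG are supported.

`encode_jpeg`(input[, quality])	Encode an RGB Tensor into raw encoded JPEG bytes, on CPU or CUDA.
`write_jpeg`(input, filename[, quality])	Take an input Tensor in CHW layout and save it to a JPEG file.
`encode_png`(input[, compression_level])	Take an input Tensor in CHW layout and return a buffer with the contents of its corresponding PNG file.
`write_png`(input, filename[, compression_level])	Take an input Tensor in CHW layout (or HW for grayscale images) and save it to a PNG file.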

IO operations

`read_file`(path)	Return the byte contents of a file as a uint8 1D Tensor.
`write_file`(filename, data)	Write the contents of a uint8 1D Tensor to a file.

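Video - DEPRECATED

Warning

DEPRECATED: All the video decoding and encoding capabilities of torchvision are deprecated from version 0.22 and will be removed in version 0.24. We recommend that you migrate to TorchCodec, where we will consolidate the future decoding/encoding capabilities of PyTorch.

`read_video`(filename[, start_pts, end_pts, ...])	[DEPRECATED] Read a video from a file, returning both the video frames and the audio frames.
`read_video_timestamps`(filename[, pts_unit])	[DEPRECATED] List the timestamps of the video frames.
`write_video`(filename, video_array, fps[, ...])	[DEPRECATED] Write a 4D Tensor in [T, H, W, C] format to a video file.

Fine-grained video API

In addition to the `read_video` function, we provide a high-performance lower-level API that offers more fine-grained control than `read_video`, while fully supporting TorchScript.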

`VideoReader`(src[, stream, num_threads])

[DEPRECATED] Fine-grained video-reading API.