VideoDecoder

class torchcodec.decoders.VideoDecoder(source: Union[str, Path, RawIOBase, BufferedReader, bytes, Tensor], *, stream_index: Optional[int] = None, dimension_order: Literal['NCHW', 'NHWC'] = 'NCHW', num_ffmpeg_threads: int = 1, device: Optional[Union[str, device]] = 'cpu', seek_mode: Literal['exact', 'approximate'] = 'exact')

A single-stream video decoder.

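Parameters:

source (str, pathlib.Path, bytes, torch.Tensor or file-like object) –
The source of the video.
- If str: a local path or a URL to a video file.
- If pathlib.Path: a path to a local video file.
- If bytes object or torch.Tensor: the raw encoded video data.
- If file-like object: video data is read from the object on demand. The object must expose the methods read(self, size: int) -> bytes and seek(self, offset: int, whence: int) -> int. Read more in: Streaming data through file-like support.
stream_index (int, optional) – Specifies which video stream to decode frames from. Note that this index is absolute across all media types. If left unspecified, the best stream is used.

dimension_order (str, optional) –

The dimension order of the decoded frames. This can be either "NCHW" (default) or "NHWC", where N is the batch size, C is the number of channels, H is the height, and W is the width of the frames.

Note

Frames are natively decoded in NHWC format by the underlying FFmpeg implementation. Converting those into NCHW format is a cheap no-copy operation that allows these frames to be transformed using the torchvision transforms (https://pytorch.com.tw/vision/stable/transforms.html).

num_ffmpeg_threads (int, optional) – The number of threads to use for decoding. Use 1 for single-threaded decoding, which may be best when running multiple VideoDecoder instances in parallel. Use a higher number for multi-threaded decoding, which may be best when running a single VideoDecoder instance. Passing 0 lets FFmpeg decide on the number of threads. Default: 1.
device (str or torch.device, optional) – The device to use for decoding. Default: "cpu".
seek_mode (str, optional) – Determines whether frame access is "exact" or "approximate". Exact mode guarantees that requesting frame i always returns frame i, but it requires an initial scan of the file. Approximate mode is faster because it avoids scanning the file, but it is less accurate because it uses the file's metadata to calculate where frame i probably is. Default: "exact". Read more about this parameter in: Exact vs Approximate seek mode: Performance and accuracy comparison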
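
The file-like option for `source` only needs an object with `read` and `seek`. The following is a minimal sketch of such an object, not part of torchcodec: `CountingReader` is a hypothetical name, and it wraps an in-memory `io.BytesIO` holding placeholder bytes rather than real video data.

```python
import io


class CountingReader:
    """Hypothetical file-like wrapper exposing only read() and seek()."""

    def __init__(self, raw: io.BufferedIOBase):
        self._raw = raw
        self.bytes_read = 0  # total bytes handed out through read()

    def read(self, size: int) -> bytes:
        data = self._raw.read(size)
        self.bytes_read += len(data)
        return data

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        # Mirrors io semantics: returns the new absolute position.
        return self._raw.seek(offset, whence)


# Placeholder payload (an assumption for illustration); a real use would
# wrap an opened video file or a network stream instead.
reader = CountingReader(io.BytesIO(b"0123456789"))
first = reader.read(4)
position = reader.seek(-2, io.SEEK_END)
tail = reader.read(100)
```

An object like `reader` could then be passed as `source`, provided it yields valid encoded video. Counting bytes this way is one possible method to observe how much data is pulled on demand.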

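Variables:

metadata (VideoStreamMetadata) – Metadata of the video stream.
stream_index (int) – The stream index that this decoder retrieves frames from. If a stream index was provided at initialization, this is the same value. If it was left unspecified, this is the best stream.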

Examples using VideoDecoder

Exact vs Approximate seek mode: Performance and accuracy comparison

Accelerated video decoding on GPUs with CUDA and NVDEC

Decoding a video with VideoDecoder

Streaming data through file-like support

Parallel video decoding: multi-processing and multi-threading

How to sample video clips
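__getitem__(key: Union[Integral, slice]) → Tensor

Return the frame or frames at the given index or range, as tensors.

Note

To decode multiple frames, we recommend the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: key (int or slice) – The index or range of the frame(s) to retrieve.
Returns: The frame or frames at the given index or range.
Return type: torch.Tensor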

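get_frame_at(index: int) → Frame

Return a single frame at the given index.

Note

To decode multiple frames, we recommend the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: index (int) – The index of the frame to retrieve.
Returns: The frame at the given index.
Return type: Frame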
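
How an index is located depends on the `seek_mode` chosen at construction. The sketch below is a simplified model of the idea described for that parameter, not torchcodec's actual implementation: the function names, the `scanned_pts` table, and the "index divided by average frame rate" estimate are all assumptions made for illustration.

```python
def approximate_pts(index: int, average_fps: float) -> float:
    """Guess the timestamp of frame `index` from metadata alone (assumed model)."""
    return index / average_fps


def exact_pts(index: int, scanned_pts: list[float]) -> float:
    """Look up the timestamp of frame `index` in a table built by scanning."""
    return scanned_pts[index]


# Hypothetical variable-frame-rate stream: 4 frames spread over 2 seconds.
scanned_pts = [0.0, 0.25, 0.5, 1.5]
average_fps = len(scanned_pts) / 2.0  # 2 frames per second

exact = exact_pts(1, scanned_pts)        # start time taken from the scan
approx = approximate_pts(1, average_fps)  # start time estimated from metadata
```

In this made-up stream the estimate for frame 1 lands on the start time of frame 2, which shows why a metadata-based guess can miss when frames are not evenly spaced. For constant-frame-rate video the two values would coincide.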

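get_frame_played_at(seconds: float) → Frame

Return a single frame played at the given timestamp in seconds.

Note

To decode multiple frames, we recommend the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: seconds (float) – The time stamp in seconds at which the frame is played.
Returns: The frame that is played at seconds.
Return type: Frame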
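
To make "the frame played at a timestamp" concrete, here is a minimal sketch under one stated assumption: a frame stays on screen from its own start time until the next frame's start time. `index_played_at`, `pts`, and `end` are hypothetical names for illustration and are not part of the VideoDecoder API.

```python
from bisect import bisect_right


def index_played_at(seconds: float, pts: list[float], end: float) -> int:
    """Return the index of the frame on screen at `seconds`.

    `pts` holds each frame's start time in ascending order and `end` is the
    time at which the last frame stops being displayed.
    """
    if not pts[0] <= seconds < end:
        raise ValueError(f"{seconds} is outside [{pts[0]}, {end})")
    # Last frame whose start time is <= seconds.
    return bisect_right(pts, seconds) - 1


pts = [0.0, 0.04, 0.08, 0.12]  # hypothetical 25 fps stream, 4 frames
```

With these values, a timestamp of 0.05 falls between the starts of frames 1 and 2, so frame 1 is the one on screen; a timestamp exactly equal to a start time selects the frame that starts there.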

get_frames_at(indices: list[int]) → FrameBatch

Return the frames at the given indices.

Parameters: indices (list of int) – The indices of the frames to retrieve.
Returns: The frames at the given indices.
Return type: FrameBatch
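
A common use of this batch method is to sample a fixed number of frames spread across a video. The sketch below shows one possible way to do that; `evenly_spaced_indices` and `sample_frames` are hypothetical helpers, and the code assumes the decoder supports `len()` to report its frame count, which this section does not state.

```python
def evenly_spaced_indices(num_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` indices spread across [0, num_frames)."""
    if num_samples <= 0 or num_frames <= 0:
        return []
    step = num_frames / num_samples
    # Take the middle of each of the num_samples equal-width bins.
    return [int(step * (i + 0.5)) for i in range(num_samples)]


def sample_frames(decoder, num_samples: int):
    """Decode the sampled frames in one batched call.

    Assumes `decoder` supports len() and get_frames_at(indices=...).
    """
    indices = evenly_spaced_indices(len(decoder), num_samples)
    return decoder.get_frames_at(indices=indices)
```

Passing all indices in a single call follows the recommendation above to prefer batch methods over repeated single-frame lookups.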

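get_frames_in_range(start: int, stop: int, step: int = 1) → FrameBatch

Return multiple frames at the given index range.

Frames are in the half-open range [start, stop).

Parameters:

start (int) – Index of the first frame to retrieve.
stop (int) – End of the index range (exclusive, per Python conventions).
step (int, optional) – Step size between frames. Default: 1.

Returns:

The frames within the specified range.

Return type:

FrameBatch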
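
Because `stop` is exclusive, consecutive ranges can share a boundary value without any frame being requested twice. The following sketch illustrates this by splitting a frame count into chunks; `chunk_bounds` is a hypothetical helper, not a torchcodec function.

```python
from typing import Iterator


def chunk_bounds(num_frames: int, chunk_size: int) -> Iterator[tuple[int, int]]:
    """Yield (start, stop) pairs that tile [0, num_frames) without overlap."""
    for start in range(0, num_frames, chunk_size):
        yield start, min(start + chunk_size, num_frames)


# 10 frames in chunks of 4: the last chunk is shorter.
bounds = list(chunk_bounds(num_frames=10, chunk_size=4))
```

Each pair could then be passed as `get_frames_in_range(start=start, stop=stop)` to decode a video in batches of bounded size.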

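get_frames_played_at(seconds: list[float]) → FrameBatch

Return the frames played at the given timestamps in seconds.

Parameters: seconds (list of float) – The timestamps in seconds at which the frames are played.
Returns: The frames that are played at seconds.
Return type: FrameBatch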

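get_frames_played_in_range(start_seconds: float, stop_seconds: float) → FrameBatch

Return multiple frames in the given time range.

Frames are in the half-open range [start_seconds, stop_seconds). The pts of each returned frame, in seconds, lies inside that half-open range.

Parameters:

start_seconds (float) – Time, in seconds, of the start of the range.
stop_seconds (float) – Time, in seconds, of the end of the range. As a half-open range, the end is excluded.

Returns:

The frames within the specified range.

Return type:

FrameBatch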
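
The pts rule stated above can be sketched as a simple filter. This is an illustration of that rule taken literally, not the library's implementation: `indices_played_in_range` and the `pts` values are hypothetical, and how the decoder treats a frame that starts before `start_seconds` but is still on screen is not covered here.

```python
def indices_played_in_range(
    pts: list[float], start_seconds: float, stop_seconds: float
) -> list[int]:
    """Return indices of frames whose pts lies in [start_seconds, stop_seconds)."""
    return [i for i, t in enumerate(pts) if start_seconds <= t < stop_seconds]


pts = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical stream at 2 frames per second

selected = indices_played_in_range(pts, 0.5, 1.5)
empty = indices_played_in_range(pts, 1.0, 1.0)
```

The frame starting at 1.5 is left out of `selected` because the end of the range is excluded, and a range whose start equals its end selects nothing.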
