Int8DynamicActivationInt8WeightConfig

class torchao.quantization.Int8DynamicActivationInt8WeightConfig(layout: Optional[Layout] = PlainLayout(), act_mapping_type: Optional[MappingType] = MappingType.SYMMETRIC, weight_only_decode: bool = False, set_inductor_config: bool = True)

Configuration for applying int8 dynamic, symmetric, per-token activation quantization and int8 per-channel weight quantization to linear layers.
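To make the terms "per-token" and "per-channel" concrete, the following is a minimal pure-Python sketch of the arithmetic, not torchao's actual implementation. It assumes a symmetric integer range of ±127 and a small `eps` floor on the scale; the helper names `quantize_rows_symmetric` and `int8_dynamic_linear` are hypothetical and exist only for this illustration.

```python
def quantize_rows_symmetric(rows, qmax=127, eps=1e-5):
    """Quantize each row with its own symmetric scale.

    Returns (integer rows, per-row scales). qmax and eps are assumed
    values for this sketch, not taken from torchao.
    """
    q_rows, scales = [], []
    for row in rows:
        # Symmetric mapping: zero-point is 0, scale covers the largest magnitude.
        scale = max(max(abs(v) for v in row) / qmax, eps)
        q_rows.append([max(-qmax, min(qmax, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def int8_dynamic_linear(x, w):
    """x: [tokens][in_features], w: [out_features][in_features]."""
    qx, sx = quantize_rows_symmetric(x)  # per-token scales, computed at run time
    qw, sw = quantize_rows_symmetric(w)  # per-output-channel scales
    # Integer dot product, then rescale by the token scale and channel scale.
    return [
        [sum(a * b for a, b in zip(qx_t, qw_o)) * sx_t * sw_o
         for qw_o, sw_o in zip(qw, sw)]
        for qx_t, sx_t in zip(qx, sx)
    ]


x = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]   # two tokens, three features
w = [[0.2, -0.4, 0.1], [1.0, 0.5, -0.5]]    # two output channels
y_quant = int8_dynamic_linear(x, w)
y_ref = [[sum(a * b for a, b in zip(x_t, w_o)) for w_o in w] for x_t in x]
```

"Dynamic" here means the activation scales are recomputed from each input at run time, whereas the weight scales can be computed once when the model is quantized. Comparing `y_quant` with `y_ref` shows the rounding error introduced by the int8 representation.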

Parameters:

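layout – Optional[Layout] = PlainLayout() - The tensor layout for the quantized weights. Controls how the quantized data is stored and accessed.
act_mapping_type – Optional[MappingType] = MappingType.SYMMETRIC - The mapping type for activation quantization. SYMMETRIC uses symmetric quantization around zero.
weight_only_decode – bool = False - If True, only the weights are quantized during the forward pass, and activations are kept in their original precision during decode operations.
set_inductor_config – bool = True - If True, adjusts the torchinductor settings to recommended values for better performance with this quantization scheme.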
