| A | |
|---|---|
1 | nyNPU-RTL/Master/gemma3NE2B_int4_quantization.py |
2 | 총 3개의 safetensors 파일을 변환 시작합니다... |
3 | |
4 | [model-00002-of-00003.safetensors] 처리 중... |
5 | -> [양자화 O] model.audio_tower.conformer.0.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
6 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
7 | -> [양자화 O] model.audio_tower.conformer.0.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
8 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
9 | -> [양자화 O] model.audio_tower.conformer.0.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
10 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
11 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
12 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
13 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
14 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
15 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
16 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
17 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
18 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
19 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
20 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
21 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
22 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
23 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
24 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
25 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
26 | -> [양자화 X (원본유지)] model.audio_tower.conformer.0.norm.weight : 원본 형태 torch.Size([1536]) |
27 | -> [양자화 O] model.audio_tower.conformer.1.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
28 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
29 | -> [양자화 O] model.audio_tower.conformer.1.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
30 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
31 | -> [양자화 O] model.audio_tower.conformer.1.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
32 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
33 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
34 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
35 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
36 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
37 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
38 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
39 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
40 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
41 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
42 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
43 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
44 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
45 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
46 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
47 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
48 | -> [양자화 X (원본유지)] model.audio_tower.conformer.1.norm.weight : 원본 형태 torch.Size([1536]) |
49 | -> [양자화 O] model.audio_tower.conformer.2.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
50 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
51 | -> [양자화 O] model.audio_tower.conformer.2.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
52 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
53 | -> [양자화 O] model.audio_tower.conformer.2.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
54 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
55 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
56 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
57 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
58 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
59 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
60 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
61 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
62 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
63 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
64 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
65 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
66 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
67 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
68 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
69 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
70 | -> [양자화 X (원본유지)] model.audio_tower.conformer.2.norm.weight : 원본 형태 torch.Size([1536]) |
71 | -> [양자화 O] model.audio_tower.conformer.3.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
72 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
73 | -> [양자화 O] model.audio_tower.conformer.3.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
74 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
75 | -> [양자화 O] model.audio_tower.conformer.3.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
76 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
77 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
78 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
79 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
80 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
81 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
82 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
83 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
84 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
85 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
86 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
87 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
88 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
89 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
90 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
91 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
92 | -> [양자화 X (원본유지)] model.audio_tower.conformer.3.norm.weight : 원본 형태 torch.Size([1536]) |
93 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
94 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
95 | -> [양자화 X (원본유지)] model.audio_tower.subsample_conv_projection.conv_0.conv.weight : 원본 형태 torch.Size([128, 1, 3, 3]) |
96 | -> [양자화 X (원본유지)] model.audio_tower.subsample_conv_projection.conv_0.norm.weight : 원본 형태 torch.Size([128]) |
97 | -> [양자화 X (원본유지)] model.audio_tower.subsample_conv_projection.conv_1.conv.weight : 원본 형태 torch.Size([32, 128, 3, 3]) |
98 | -> [양자화 X (원본유지)] model.audio_tower.subsample_conv_projection.conv_1.norm.weight : 원본 형태 torch.Size([32]) |
99 | -> [양자화 O] model.audio_tower.subsample_conv_projection.input_proj_linear.weight : 원본 형태 (1536, 1024) |
100 | -> [양자화 O] model.language_model.altup_projections.0.weight : 원본 형태 (2048, 2048) |
101 | -> [양자화 O] model.language_model.altup_projections.1.weight : 원본 형태 (2048, 2048) |
102 | -> [양자화 O] model.language_model.altup_projections.2.weight : 원본 형태 (2048, 2048) |
103 | -> [양자화 X (원본유지)] model.language_model.altup_unembed_projections.0.weight : 원본 형태 torch.Size([2048, 2048]) |
104 | -> [양자화 X (원본유지)] model.language_model.altup_unembed_projections.1.weight : 원본 형태 torch.Size([2048, 2048]) |
105 | -> [양자화 X (원본유지)] model.language_model.altup_unembed_projections.2.weight : 원본 형태 torch.Size([2048, 2048]) |
106 | -> [양자화 X (원본유지)] model.language_model.embed_tokens_per_layer.weight : 원본 형태 torch.Size([262144, 7680]) |
107 | -> [양자화 X (원본유지)] model.language_model.layers.26.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
108 | -> [양자화 X (원본유지)] model.language_model.layers.26.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
109 | -> [양자화 X (원본유지)] model.language_model.layers.26.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
110 | -> [양자화 X (원본유지)] model.language_model.layers.26.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
111 | -> [양자화 X (원본유지)] model.language_model.layers.26.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
112 | -> [양자화 X (원본유지)] model.language_model.layers.26.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
113 | -> [양자화 X (원본유지)] model.language_model.layers.26.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
114 | -> [양자화 X (원본유지)] model.language_model.layers.26.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
115 | -> [양자화 X (원본유지)] model.language_model.layers.26.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
116 | -> [양자화 O] model.language_model.layers.26.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
117 | -> [양자화 X (원본유지)] model.language_model.layers.26.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
118 | -> [양자화 X (원본유지)] model.language_model.layers.26.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
119 | -> [양자화 X (원본유지)] model.language_model.layers.26.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
120 | -> [양자화 X (원본유지)] model.language_model.layers.26.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
121 | -> [양자화 X (원본유지)] model.language_model.layers.26.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
122 | -> [양자화 X (원본유지)] model.language_model.layers.26.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
123 | -> [양자화 X (원본유지)] model.language_model.layers.27.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
124 | -> [양자화 X (원본유지)] model.language_model.layers.27.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
125 | -> [양자화 X (원본유지)] model.language_model.layers.27.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
126 | -> [양자화 X (원본유지)] model.language_model.layers.27.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
127 | -> [양자화 X (원본유지)] model.language_model.layers.27.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
128 | -> [양자화 X (원본유지)] model.language_model.layers.27.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
129 | -> [양자화 X (원본유지)] model.language_model.layers.27.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
130 | -> [양자화 X (원본유지)] model.language_model.layers.27.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
131 | -> [양자화 X (원본유지)] model.language_model.layers.27.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
132 | -> [양자화 O] model.language_model.layers.27.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
133 | -> [양자화 O] model.language_model.layers.27.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
134 | -> [양자화 O] model.language_model.layers.27.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
135 | -> [양자화 X (원본유지)] model.language_model.layers.27.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
136 | -> [양자화 X (원본유지)] model.language_model.layers.27.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
137 | -> [양자화 X (원본유지)] model.language_model.layers.27.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
138 | -> [양자화 X (원본유지)] model.language_model.layers.27.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
139 | -> [양자화 X (원본유지)] model.language_model.layers.27.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
140 | -> [양자화 X (원본유지)] model.language_model.layers.27.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
141 | -> [양자화 X (원본유지)] model.language_model.layers.27.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
142 | -> [양자화 O] model.language_model.layers.27.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
143 | -> [양자화 O] model.language_model.layers.27.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
144 | -> [양자화 X (원본유지)] model.language_model.layers.27.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
145 | -> [양자화 O] model.language_model.layers.27.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
146 | -> [양자화 O] model.language_model.layers.27.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
147 | -> [양자화 X (원본유지)] model.language_model.layers.28.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
148 | -> [양자화 X (원본유지)] model.language_model.layers.28.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
149 | -> [양자화 X (원본유지)] model.language_model.layers.28.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
150 | -> [양자화 X (원본유지)] model.language_model.layers.28.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
151 | -> [양자화 X (원본유지)] model.language_model.layers.28.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
152 | -> [양자화 X (원본유지)] model.language_model.layers.28.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
153 | -> [양자화 X (원본유지)] model.language_model.layers.28.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
154 | -> [양자화 X (원본유지)] model.language_model.layers.28.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
155 | -> [양자화 X (원본유지)] model.language_model.layers.28.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
156 | -> [양자화 O] model.language_model.layers.28.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
157 | -> [양자화 O] model.language_model.layers.28.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
158 | -> [양자화 O] model.language_model.layers.28.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
159 | -> [양자화 X (원본유지)] model.language_model.layers.28.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
160 | -> [양자화 X (원본유지)] model.language_model.layers.28.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
161 | -> [양자화 X (원본유지)] model.language_model.layers.28.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
162 | -> [양자화 X (원본유지)] model.language_model.layers.28.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
163 | -> [양자화 X (원본유지)] model.language_model.layers.28.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
164 | -> [양자화 X (원본유지)] model.language_model.layers.28.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
165 | -> [양자화 X (원본유지)] model.language_model.layers.28.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
166 | -> [양자화 O] model.language_model.layers.28.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
167 | -> [양자화 O] model.language_model.layers.28.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
168 | -> [양자화 X (원본유지)] model.language_model.layers.28.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
169 | -> [양자화 O] model.language_model.layers.28.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
170 | -> [양자화 O] model.language_model.layers.28.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
171 | -> [양자화 X (원본유지)] model.language_model.layers.29.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
172 | -> [양자화 X (원본유지)] model.language_model.layers.29.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
173 | -> [양자화 X (원본유지)] model.language_model.layers.29.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
174 | -> [양자화 X (원본유지)] model.language_model.layers.29.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
175 | -> [양자화 X (원본유지)] model.language_model.layers.29.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
176 | -> [양자화 X (원본유지)] model.language_model.layers.29.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
177 | -> [양자화 X (원본유지)] model.language_model.layers.29.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
178 | -> [양자화 X (원본유지)] model.language_model.layers.29.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
179 | -> [양자화 X (원본유지)] model.language_model.layers.29.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
180 | -> [양자화 O] model.language_model.layers.29.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
181 | -> [양자화 O] model.language_model.layers.29.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
182 | -> [양자화 O] model.language_model.layers.29.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
183 | -> [양자화 X (원본유지)] model.language_model.layers.29.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
184 | -> [양자화 X (원본유지)] model.language_model.layers.29.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
185 | -> [양자화 X (원본유지)] model.language_model.layers.29.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
186 | -> [양자화 X (원본유지)] model.language_model.layers.29.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
187 | -> [양자화 X (원본유지)] model.language_model.layers.29.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
188 | -> [양자화 X (원본유지)] model.language_model.layers.29.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
189 | -> [양자화 X (원본유지)] model.language_model.layers.29.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
190 | -> [양자화 O] model.language_model.layers.29.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
191 | -> [양자화 O] model.language_model.layers.29.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
192 | -> [양자화 X (원본유지)] model.language_model.layers.29.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
193 | -> [양자화 O] model.language_model.layers.29.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
194 | -> [양자화 O] model.language_model.layers.29.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
195 | -> [양자화 X (원본유지)] model.language_model.norm.weight : 원본 형태 torch.Size([2048]) |
196 | -> [양자화 X (원본유지)] model.language_model.per_layer_model_projection.weight : 원본 형태 torch.Size([7680, 2048]) |
197 | -> [양자화 X (원본유지)] model.language_model.per_layer_projection_norm.weight : 원본 형태 torch.Size([256]) |
198 | 저장 완료: Master/gemma3NE4B_INT4_Q/quantized_model-00002-of-00003.safetensors |
199 | |
200 | [model-00003-of-00003.safetensors] 처리 중... |
201 | -> [양자화 O] model.audio_tower.conformer.10.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
202 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
203 | -> [양자화 O] model.audio_tower.conformer.10.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
204 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
205 | -> [양자화 O] model.audio_tower.conformer.10.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
206 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
207 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
208 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
209 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
210 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
211 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
212 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
213 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
214 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
215 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
216 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
217 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
218 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
219 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
220 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
221 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
222 | -> [양자화 X (원본유지)] model.audio_tower.conformer.10.norm.weight : 원본 형태 torch.Size([1536]) |
223 | -> [양자화 O] model.audio_tower.conformer.11.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
224 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
225 | -> [양자화 O] model.audio_tower.conformer.11.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
226 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
227 | -> [양자화 O] model.audio_tower.conformer.11.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
228 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
229 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
230 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
231 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
232 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
233 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
234 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
235 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
236 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
237 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
238 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
239 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
240 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
241 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
242 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
243 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
244 | -> [양자화 X (원본유지)] model.audio_tower.conformer.11.norm.weight : 원본 형태 torch.Size([1536]) |
245 | -> [양자화 O] model.audio_tower.conformer.4.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
246 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
247 | -> [양자화 O] model.audio_tower.conformer.4.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
248 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
249 | -> [양자화 O] model.audio_tower.conformer.4.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
250 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
251 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
252 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
253 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
254 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
255 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
256 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
257 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
258 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
259 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
260 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
261 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
262 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
263 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
264 | -> [양자화 X (원본유지)] model.audio_tower.conformer.4.norm.weight : 원본 형태 torch.Size([1536]) |
265 | -> [양자화 O] model.audio_tower.conformer.5.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
266 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
267 | -> [양자화 O] model.audio_tower.conformer.5.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
268 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
269 | -> [양자화 O] model.audio_tower.conformer.5.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
270 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
271 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
272 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
273 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
274 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
275 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
276 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
277 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
278 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
279 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
280 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
281 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
282 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
283 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
284 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
285 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
286 | -> [양자화 X (원본유지)] model.audio_tower.conformer.5.norm.weight : 원본 형태 torch.Size([1536]) |
287 | -> [양자화 O] model.audio_tower.conformer.6.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
288 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
289 | -> [양자화 O] model.audio_tower.conformer.6.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
290 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
291 | -> [양자화 O] model.audio_tower.conformer.6.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
292 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
293 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
294 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
295 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
296 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
297 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
298 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
299 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
300 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
301 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
302 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
303 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
304 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
305 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
306 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
307 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
308 | -> [양자화 X (원본유지)] model.audio_tower.conformer.6.norm.weight : 원본 형태 torch.Size([1536]) |
309 | -> [양자화 O] model.audio_tower.conformer.7.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
310 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
311 | -> [양자화 O] model.audio_tower.conformer.7.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
312 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
313 | -> [양자화 O] model.audio_tower.conformer.7.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
314 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
315 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
316 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
317 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
318 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
319 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
320 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
321 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
322 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
323 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
324 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
325 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
326 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
327 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
328 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
329 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
330 | -> [양자화 X (원본유지)] model.audio_tower.conformer.7.norm.weight : 원본 형태 torch.Size([1536]) |
331 | -> [양자화 O] model.audio_tower.conformer.8.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
332 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
333 | -> [양자화 O] model.audio_tower.conformer.8.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
334 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
335 | -> [양자화 O] model.audio_tower.conformer.8.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
336 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
337 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
338 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
339 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
340 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
341 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
342 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
343 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
344 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
345 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
346 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
347 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
348 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
349 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
350 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
351 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
352 | -> [양자화 X (원본유지)] model.audio_tower.conformer.8.norm.weight : 원본 형태 torch.Size([1536]) |
353 | -> [양자화 O] model.audio_tower.conformer.9.attention.attn.k_proj.weight : 원본 형태 (1536, 1536) |
354 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.attention.attn.per_dim_scale : 원본 형태 torch.Size([192]) |
355 | -> [양자화 O] model.audio_tower.conformer.9.attention.attn.q_proj.weight : 원본 형태 (1536, 1536) |
356 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.attention.attn.relative_position_embedding.pos_proj.weight : 원본 형태 torch.Size([1536, 1536]) |
357 | -> [양자화 O] model.audio_tower.conformer.9.attention.attn.v_proj.weight : 원본 형태 (1536, 1536) |
358 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.attention.post.weight : 원본 형태 torch.Size([1536, 1536]) |
359 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.attention.post_norm.weight : 원본 형태 torch.Size([1536]) |
360 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.attention.pre_attn_norm.weight : 원본 형태 torch.Size([1536]) |
361 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_end.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
362 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_end.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
363 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_end.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
364 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_end.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
365 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_start.ffw_layer_1.weight : 원본 형태 torch.Size([6144, 1536]) |
366 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_start.ffw_layer_2.weight : 원본 형태 torch.Size([1536, 6144]) |
367 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_start.post_layer_norm.weight : 원본 형태 torch.Size([1536]) |
368 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.ffw_layer_start.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
369 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.lconv1d.conv_norm.weight : 원본 형태 torch.Size([1536]) |
370 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.lconv1d.depthwise_conv1d.weight : 원본 형태 torch.Size([1536, 1, 5]) |
371 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.lconv1d.linear_end.weight : 원본 형태 torch.Size([1536, 1536]) |
372 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.lconv1d.linear_start.weight : 원본 형태 torch.Size([3072, 1536]) |
373 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.lconv1d.pre_layer_norm.weight : 원본 형태 torch.Size([1536]) |
374 | -> [양자화 X (원본유지)] model.audio_tower.conformer.9.norm.weight : 원본 형태 torch.Size([1536]) |
375 | -> [양자화 X (원본유지)] model.embed_audio.embedding.weight : 원본 형태 torch.Size([128, 1536]) |
376 | -> [양자화 X (원본유지)] model.embed_audio.embedding_projection.weight : 원본 형태 torch.Size([2048, 1536]) |
377 | -> [양자화 X (원본유지)] model.embed_audio.hard_embedding_norm.weight : 원본 형태 torch.Size([1536]) |
378 | -> [양자화 X (원본유지)] model.embed_audio.soft_embedding_norm.weight : 원본 형태 torch.Size([1536]) |
379 | -> [양자화 X (원본유지)] model.embed_vision.embedding.weight : 원본 형태 torch.Size([128, 2048]) |
380 | -> [양자화 X (원본유지)] model.embed_vision.embedding_projection.weight : 원본 형태 torch.Size([2048, 2048]) |
381 | -> [양자화 X (원본유지)] model.embed_vision.hard_embedding_norm.weight : 원본 형태 torch.Size([2048]) |
382 | -> [양자화 X (원본유지)] model.embed_vision.soft_embedding_norm.weight : 원본 형태 torch.Size([2048]) |
383 | 저장 완료: Master/gemma3NE4B_INT4_Q/quantized_model-00003-of-00003.safetensors |
384 | |
385 | [model-00001-of-00003.safetensors] 처리 중... |
386 | -> [양자화 X (원본유지)] model.language_model.embed_tokens.weight : 원본 형태 torch.Size([262400, 2048]) |
387 | -> [양자화 X (원본유지)] model.language_model.layers.0.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
388 | -> [양자화 X (원본유지)] model.language_model.layers.0.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
389 | -> [양자화 X (원본유지)] model.language_model.layers.0.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
390 | -> [양자화 X (원본유지)] model.language_model.layers.0.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
391 | -> [양자화 X (원본유지)] model.language_model.layers.0.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
392 | -> [양자화 X (원본유지)] model.language_model.layers.0.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
393 | -> [양자화 X (원본유지)] model.language_model.layers.0.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
394 | -> [양자화 X (원본유지)] model.language_model.layers.0.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
395 | -> [양자화 X (원본유지)] model.language_model.layers.0.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
396 | -> [양자화 O] model.language_model.layers.0.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
397 | -> [양자화 O] model.language_model.layers.0.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
398 | -> [양자화 O] model.language_model.layers.0.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
399 | -> [양자화 X (원본유지)] model.language_model.layers.0.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
400 | -> [양자화 X (원본유지)] model.language_model.layers.0.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
401 | -> [양자화 X (원본유지)] model.language_model.layers.0.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
402 | -> [양자화 X (원본유지)] model.language_model.layers.0.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
403 | -> [양자화 X (원본유지)] model.language_model.layers.0.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
404 | -> [양자화 X (원본유지)] model.language_model.layers.0.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
405 | -> [양자화 X (원본유지)] model.language_model.layers.0.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
406 | -> [양자화 O] model.language_model.layers.0.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
407 | -> [양자화 O] model.language_model.layers.0.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
408 | -> [양자화 X (원본유지)] model.language_model.layers.0.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
409 | -> [양자화 O] model.language_model.layers.0.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
410 | -> [양자화 O] model.language_model.layers.0.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
411 | -> [양자화 X (원본유지)] model.language_model.layers.1.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
412 | -> [양자화 X (원본유지)] model.language_model.layers.1.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
413 | -> [양자화 X (원본유지)] model.language_model.layers.1.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
414 | -> [양자화 X (원본유지)] model.language_model.layers.1.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
415 | -> [양자화 X (원본유지)] model.language_model.layers.1.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
416 | -> [양자화 X (원본유지)] model.language_model.layers.1.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
417 | -> [양자화 X (원본유지)] model.language_model.layers.1.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
418 | -> [양자화 X (원본유지)] model.language_model.layers.1.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
419 | -> [양자화 X (원본유지)] model.language_model.layers.1.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
420 | -> [양자화 O] model.language_model.layers.1.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
421 | -> [양자화 O] model.language_model.layers.1.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
422 | -> [양자화 O] model.language_model.layers.1.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
423 | -> [양자화 X (원본유지)] model.language_model.layers.1.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
424 | -> [양자화 X (원본유지)] model.language_model.layers.1.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
425 | -> [양자화 X (원본유지)] model.language_model.layers.1.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
426 | -> [양자화 X (원본유지)] model.language_model.layers.1.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
427 | -> [양자화 X (원본유지)] model.language_model.layers.1.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
428 | -> [양자화 X (원본유지)] model.language_model.layers.1.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
429 | -> [양자화 X (원본유지)] model.language_model.layers.1.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
430 | -> [양자화 O] model.language_model.layers.1.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
431 | -> [양자화 O] model.language_model.layers.1.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
432 | -> [양자화 X (원본유지)] model.language_model.layers.1.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
433 | -> [양자화 O] model.language_model.layers.1.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
434 | -> [양자화 O] model.language_model.layers.1.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
435 | -> [양자화 X (원본유지)] model.language_model.layers.10.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
436 | -> [양자화 X (원본유지)] model.language_model.layers.10.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
437 | -> [양자화 X (원본유지)] model.language_model.layers.10.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
438 | -> [양자화 X (원본유지)] model.language_model.layers.10.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
439 | -> [양자화 X (원본유지)] model.language_model.layers.10.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
440 | -> [양자화 X (원본유지)] model.language_model.layers.10.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
441 | -> [양자화 X (원본유지)] model.language_model.layers.10.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
442 | -> [양자화 X (원본유지)] model.language_model.layers.10.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
443 | -> [양자화 X (원본유지)] model.language_model.layers.10.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
444 | -> [양자화 O] model.language_model.layers.10.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
445 | -> [양자화 O] model.language_model.layers.10.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
446 | -> [양자화 O] model.language_model.layers.10.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
447 | -> [양자화 X (원본유지)] model.language_model.layers.10.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
448 | -> [양자화 X (원본유지)] model.language_model.layers.10.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
449 | -> [양자화 X (원본유지)] model.language_model.layers.10.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
450 | -> [양자화 X (원본유지)] model.language_model.layers.10.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
451 | -> [양자화 X (원본유지)] model.language_model.layers.10.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
452 | -> [양자화 X (원본유지)] model.language_model.layers.10.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
453 | -> [양자화 X (원본유지)] model.language_model.layers.10.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
454 | -> [양자화 O] model.language_model.layers.10.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
455 | -> [양자화 O] model.language_model.layers.10.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
456 | -> [양자화 X (원본유지)] model.language_model.layers.10.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
457 | -> [양자화 O] model.language_model.layers.10.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
458 | -> [양자화 O] model.language_model.layers.10.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
459 | -> [양자화 X (원본유지)] model.language_model.layers.11.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
460 | -> [양자화 X (원본유지)] model.language_model.layers.11.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
461 | -> [양자화 X (원본유지)] model.language_model.layers.11.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
462 | -> [양자화 X (원본유지)] model.language_model.layers.11.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
463 | -> [양자화 X (원본유지)] model.language_model.layers.11.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
464 | -> [양자화 X (원본유지)] model.language_model.layers.11.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
465 | -> [양자화 X (원본유지)] model.language_model.layers.11.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
466 | -> [양자화 X (원본유지)] model.language_model.layers.11.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
467 | -> [양자화 X (원본유지)] model.language_model.layers.11.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
468 | -> [양자화 O] model.language_model.layers.11.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
469 | -> [양자화 O] model.language_model.layers.11.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
470 | -> [양자화 O] model.language_model.layers.11.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
471 | -> [양자화 X (원본유지)] model.language_model.layers.11.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
472 | -> [양자화 X (원본유지)] model.language_model.layers.11.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
473 | -> [양자화 X (원본유지)] model.language_model.layers.11.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
474 | -> [양자화 X (원본유지)] model.language_model.layers.11.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
475 | -> [양자화 X (원본유지)] model.language_model.layers.11.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
476 | -> [양자화 X (원본유지)] model.language_model.layers.11.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
477 | -> [양자화 X (원본유지)] model.language_model.layers.11.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
478 | -> [양자화 O] model.language_model.layers.11.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
479 | -> [양자화 O] model.language_model.layers.11.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
480 | -> [양자화 X (원본유지)] model.language_model.layers.11.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
481 | -> [양자화 O] model.language_model.layers.11.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
482 | -> [양자화 O] model.language_model.layers.11.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
483 | -> [양자화 X (원본유지)] model.language_model.layers.12.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
484 | -> [양자화 X (원본유지)] model.language_model.layers.12.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
485 | -> [양자화 X (원본유지)] model.language_model.layers.12.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
486 | -> [양자화 X (원본유지)] model.language_model.layers.12.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
487 | -> [양자화 X (원본유지)] model.language_model.layers.12.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
488 | -> [양자화 X (원본유지)] model.language_model.layers.12.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
489 | -> [양자화 X (원본유지)] model.language_model.layers.12.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
490 | -> [양자화 X (원본유지)] model.language_model.layers.12.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
491 | -> [양자화 X (원본유지)] model.language_model.layers.12.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
492 | -> [양자화 O] model.language_model.layers.12.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
493 | -> [양자화 O] model.language_model.layers.12.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
494 | -> [양자화 O] model.language_model.layers.12.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
495 | -> [양자화 X (원본유지)] model.language_model.layers.12.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
496 | -> [양자화 X (원본유지)] model.language_model.layers.12.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
497 | -> [양자화 X (원본유지)] model.language_model.layers.12.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
498 | -> [양자화 X (원본유지)] model.language_model.layers.12.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
499 | -> [양자화 X (원본유지)] model.language_model.layers.12.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
500 | -> [양자화 X (원본유지)] model.language_model.layers.12.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
501 | -> [양자화 X (원본유지)] model.language_model.layers.12.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
502 | -> [양자화 O] model.language_model.layers.12.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
503 | -> [양자화 O] model.language_model.layers.12.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
504 | -> [양자화 X (원본유지)] model.language_model.layers.12.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
505 | -> [양자화 O] model.language_model.layers.12.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
506 | -> [양자화 O] model.language_model.layers.12.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
507 | -> [양자화 X (원본유지)] model.language_model.layers.13.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
508 | -> [양자화 X (원본유지)] model.language_model.layers.13.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
509 | -> [양자화 X (원본유지)] model.language_model.layers.13.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
510 | -> [양자화 X (원본유지)] model.language_model.layers.13.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
511 | -> [양자화 X (원본유지)] model.language_model.layers.13.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
512 | -> [양자화 X (원본유지)] model.language_model.layers.13.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
513 | -> [양자화 X (원본유지)] model.language_model.layers.13.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
514 | -> [양자화 X (원본유지)] model.language_model.layers.13.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
515 | -> [양자화 X (원본유지)] model.language_model.layers.13.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
516 | -> [양자화 O] model.language_model.layers.13.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
517 | -> [양자화 O] model.language_model.layers.13.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
518 | -> [양자화 O] model.language_model.layers.13.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
519 | -> [양자화 X (원본유지)] model.language_model.layers.13.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
520 | -> [양자화 X (원본유지)] model.language_model.layers.13.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
521 | -> [양자화 X (원본유지)] model.language_model.layers.13.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
522 | -> [양자화 X (원본유지)] model.language_model.layers.13.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
523 | -> [양자화 X (원본유지)] model.language_model.layers.13.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
524 | -> [양자화 X (원본유지)] model.language_model.layers.13.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
525 | -> [양자화 X (원본유지)] model.language_model.layers.13.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
526 | -> [양자화 O] model.language_model.layers.13.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
527 | -> [양자화 O] model.language_model.layers.13.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
528 | -> [양자화 X (원본유지)] model.language_model.layers.13.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
529 | -> [양자화 O] model.language_model.layers.13.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
530 | -> [양자화 O] model.language_model.layers.13.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
531 | -> [양자화 X (원본유지)] model.language_model.layers.14.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
532 | -> [양자화 X (원본유지)] model.language_model.layers.14.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
533 | -> [양자화 X (원본유지)] model.language_model.layers.14.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
534 | -> [양자화 X (원본유지)] model.language_model.layers.14.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
535 | -> [양자화 X (원본유지)] model.language_model.layers.14.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
536 | -> [양자화 X (원본유지)] model.language_model.layers.14.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
537 | -> [양자화 X (원본유지)] model.language_model.layers.14.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
538 | -> [양자화 X (원본유지)] model.language_model.layers.14.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
539 | -> [양자화 X (원본유지)] model.language_model.layers.14.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
540 | -> [양자화 O] model.language_model.layers.14.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
541 | -> [양자화 O] model.language_model.layers.14.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
542 | -> [양자화 O] model.language_model.layers.14.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
543 | -> [양자화 X (원본유지)] model.language_model.layers.14.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
544 | -> [양자화 X (원본유지)] model.language_model.layers.14.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
545 | -> [양자화 X (원본유지)] model.language_model.layers.14.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
546 | -> [양자화 X (원본유지)] model.language_model.layers.14.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
547 | -> [양자화 X (원본유지)] model.language_model.layers.14.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
548 | -> [양자화 X (원본유지)] model.language_model.layers.14.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
549 | -> [양자화 X (원본유지)] model.language_model.layers.14.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
550 | -> [양자화 O] model.language_model.layers.14.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
551 | -> [양자화 O] model.language_model.layers.14.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
552 | -> [양자화 X (원본유지)] model.language_model.layers.14.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
553 | -> [양자화 O] model.language_model.layers.14.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
554 | -> [양자화 O] model.language_model.layers.14.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
555 | -> [양자화 X (원본유지)] model.language_model.layers.15.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
556 | -> [양자화 X (원본유지)] model.language_model.layers.15.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
557 | -> [양자화 X (원본유지)] model.language_model.layers.15.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
558 | -> [양자화 X (원본유지)] model.language_model.layers.15.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
559 | -> [양자화 X (원본유지)] model.language_model.layers.15.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
560 | -> [양자화 X (원본유지)] model.language_model.layers.15.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
561 | -> [양자화 X (원본유지)] model.language_model.layers.15.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
562 | -> [양자화 X (원본유지)] model.language_model.layers.15.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
563 | -> [양자화 X (원본유지)] model.language_model.layers.15.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
564 | -> [양자화 O] model.language_model.layers.15.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
565 | -> [양자화 O] model.language_model.layers.15.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
566 | -> [양자화 O] model.language_model.layers.15.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
567 | -> [양자화 X (원본유지)] model.language_model.layers.15.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
568 | -> [양자화 X (원본유지)] model.language_model.layers.15.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
569 | -> [양자화 X (원본유지)] model.language_model.layers.15.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
570 | -> [양자화 X (원본유지)] model.language_model.layers.15.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
571 | -> [양자화 X (원본유지)] model.language_model.layers.15.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
572 | -> [양자화 X (원본유지)] model.language_model.layers.15.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
573 | -> [양자화 X (원본유지)] model.language_model.layers.15.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
574 | -> [양자화 O] model.language_model.layers.15.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
575 | -> [양자화 O] model.language_model.layers.15.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
576 | -> [양자화 X (원본유지)] model.language_model.layers.15.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
577 | -> [양자화 O] model.language_model.layers.15.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
578 | -> [양자화 O] model.language_model.layers.15.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
579 | -> [양자화 X (원본유지)] model.language_model.layers.16.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
580 | -> [양자화 X (원본유지)] model.language_model.layers.16.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
581 | -> [양자화 X (원본유지)] model.language_model.layers.16.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
582 | -> [양자화 X (원본유지)] model.language_model.layers.16.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
583 | -> [양자화 X (원본유지)] model.language_model.layers.16.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
584 | -> [양자화 X (원본유지)] model.language_model.layers.16.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
585 | -> [양자화 X (원본유지)] model.language_model.layers.16.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
586 | -> [양자화 X (원본유지)] model.language_model.layers.16.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
587 | -> [양자화 X (원본유지)] model.language_model.layers.16.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
588 | -> [양자화 O] model.language_model.layers.16.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
589 | -> [양자화 O] model.language_model.layers.16.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
590 | -> [양자화 O] model.language_model.layers.16.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
591 | -> [양자화 X (원본유지)] model.language_model.layers.16.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
592 | -> [양자화 X (원본유지)] model.language_model.layers.16.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
593 | -> [양자화 X (원본유지)] model.language_model.layers.16.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
594 | -> [양자화 X (원본유지)] model.language_model.layers.16.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
595 | -> [양자화 X (원본유지)] model.language_model.layers.16.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
596 | -> [양자화 X (원본유지)] model.language_model.layers.16.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
597 | -> [양자화 X (원본유지)] model.language_model.layers.16.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
598 | -> [양자화 O] model.language_model.layers.16.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
599 | -> [양자화 O] model.language_model.layers.16.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
600 | -> [양자화 X (원본유지)] model.language_model.layers.16.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
601 | -> [양자화 O] model.language_model.layers.16.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
602 | -> [양자화 O] model.language_model.layers.16.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
603 | -> [양자화 X (원본유지)] model.language_model.layers.17.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
604 | -> [양자화 X (원본유지)] model.language_model.layers.17.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
605 | -> [양자화 X (원본유지)] model.language_model.layers.17.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
606 | -> [양자화 X (원본유지)] model.language_model.layers.17.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
607 | -> [양자화 X (원본유지)] model.language_model.layers.17.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
608 | -> [양자화 X (원본유지)] model.language_model.layers.17.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
609 | -> [양자화 X (원본유지)] model.language_model.layers.17.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
610 | -> [양자화 X (원본유지)] model.language_model.layers.17.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
611 | -> [양자화 X (원본유지)] model.language_model.layers.17.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
612 | -> [양자화 O] model.language_model.layers.17.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
613 | -> [양자화 O] model.language_model.layers.17.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
614 | -> [양자화 O] model.language_model.layers.17.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
615 | -> [양자화 X (원본유지)] model.language_model.layers.17.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
616 | -> [양자화 X (원본유지)] model.language_model.layers.17.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
617 | -> [양자화 X (원본유지)] model.language_model.layers.17.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
618 | -> [양자화 X (원본유지)] model.language_model.layers.17.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
619 | -> [양자화 X (원본유지)] model.language_model.layers.17.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
620 | -> [양자화 X (원본유지)] model.language_model.layers.17.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
621 | -> [양자화 X (원본유지)] model.language_model.layers.17.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
622 | -> [양자화 O] model.language_model.layers.17.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
623 | -> [양자화 O] model.language_model.layers.17.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
624 | -> [양자화 X (원본유지)] model.language_model.layers.17.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
625 | -> [양자화 O] model.language_model.layers.17.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
626 | -> [양자화 O] model.language_model.layers.17.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
627 | -> [양자화 X (원본유지)] model.language_model.layers.18.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
628 | -> [양자화 X (원본유지)] model.language_model.layers.18.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
629 | -> [양자화 X (원본유지)] model.language_model.layers.18.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
630 | -> [양자화 X (원본유지)] model.language_model.layers.18.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
631 | -> [양자화 X (원본유지)] model.language_model.layers.18.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
632 | -> [양자화 X (원본유지)] model.language_model.layers.18.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
633 | -> [양자화 X (원본유지)] model.language_model.layers.18.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
634 | -> [양자화 X (원본유지)] model.language_model.layers.18.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
635 | -> [양자화 X (원본유지)] model.language_model.layers.18.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
636 | -> [양자화 O] model.language_model.layers.18.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
637 | -> [양자화 O] model.language_model.layers.18.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
638 | -> [양자화 O] model.language_model.layers.18.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
639 | -> [양자화 X (원본유지)] model.language_model.layers.18.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
640 | -> [양자화 X (원본유지)] model.language_model.layers.18.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
641 | -> [양자화 X (원본유지)] model.language_model.layers.18.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
642 | -> [양자화 X (원본유지)] model.language_model.layers.18.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
643 | -> [양자화 X (원본유지)] model.language_model.layers.18.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
644 | -> [양자화 X (원본유지)] model.language_model.layers.18.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
645 | -> [양자화 X (원본유지)] model.language_model.layers.18.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
646 | -> [양자화 O] model.language_model.layers.18.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
647 | -> [양자화 O] model.language_model.layers.18.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
648 | -> [양자화 X (원본유지)] model.language_model.layers.18.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
649 | -> [양자화 O] model.language_model.layers.18.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
650 | -> [양자화 O] model.language_model.layers.18.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
651 | -> [양자화 X (원본유지)] model.language_model.layers.19.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
652 | -> [양자화 X (원본유지)] model.language_model.layers.19.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
653 | -> [양자화 X (원본유지)] model.language_model.layers.19.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
654 | -> [양자화 X (원본유지)] model.language_model.layers.19.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
655 | -> [양자화 X (원본유지)] model.language_model.layers.19.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
656 | -> [양자화 X (원본유지)] model.language_model.layers.19.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
657 | -> [양자화 X (원본유지)] model.language_model.layers.19.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
658 | -> [양자화 X (원본유지)] model.language_model.layers.19.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
659 | -> [양자화 X (원본유지)] model.language_model.layers.19.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
660 | -> [양자화 O] model.language_model.layers.19.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
661 | -> [양자화 O] model.language_model.layers.19.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
662 | -> [양자화 O] model.language_model.layers.19.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
663 | -> [양자화 X (원본유지)] model.language_model.layers.19.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
664 | -> [양자화 X (원본유지)] model.language_model.layers.19.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
665 | -> [양자화 X (원본유지)] model.language_model.layers.19.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
666 | -> [양자화 X (원본유지)] model.language_model.layers.19.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
667 | -> [양자화 X (원본유지)] model.language_model.layers.19.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
668 | -> [양자화 X (원본유지)] model.language_model.layers.19.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
669 | -> [양자화 X (원본유지)] model.language_model.layers.19.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
670 | -> [양자화 O] model.language_model.layers.19.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
671 | -> [양자화 O] model.language_model.layers.19.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
672 | -> [양자화 X (원본유지)] model.language_model.layers.19.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
673 | -> [양자화 O] model.language_model.layers.19.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
674 | -> [양자화 O] model.language_model.layers.19.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
675 | -> [양자화 X (원본유지)] model.language_model.layers.2.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
676 | -> [양자화 X (원본유지)] model.language_model.layers.2.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
677 | -> [양자화 X (원본유지)] model.language_model.layers.2.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
678 | -> [양자화 X (원본유지)] model.language_model.layers.2.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
679 | -> [양자화 X (원본유지)] model.language_model.layers.2.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
680 | -> [양자화 X (원본유지)] model.language_model.layers.2.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
681 | -> [양자화 X (원본유지)] model.language_model.layers.2.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
682 | -> [양자화 X (원본유지)] model.language_model.layers.2.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
683 | -> [양자화 X (원본유지)] model.language_model.layers.2.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
684 | -> [양자화 O] model.language_model.layers.2.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
685 | -> [양자화 O] model.language_model.layers.2.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
686 | -> [양자화 O] model.language_model.layers.2.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
687 | -> [양자화 X (원본유지)] model.language_model.layers.2.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
688 | -> [양자화 X (원본유지)] model.language_model.layers.2.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
689 | -> [양자화 X (원본유지)] model.language_model.layers.2.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
690 | -> [양자화 X (원본유지)] model.language_model.layers.2.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
691 | -> [양자화 X (원본유지)] model.language_model.layers.2.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
692 | -> [양자화 X (원본유지)] model.language_model.layers.2.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
693 | -> [양자화 X (원본유지)] model.language_model.layers.2.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
694 | -> [양자화 O] model.language_model.layers.2.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
695 | -> [양자화 O] model.language_model.layers.2.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
696 | -> [양자화 X (원본유지)] model.language_model.layers.2.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
697 | -> [양자화 O] model.language_model.layers.2.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
698 | -> [양자화 O] model.language_model.layers.2.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
699 | -> [양자화 X (원본유지)] model.language_model.layers.20.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
700 | -> [양자화 X (원본유지)] model.language_model.layers.20.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
701 | -> [양자화 X (원본유지)] model.language_model.layers.20.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
702 | -> [양자화 X (원본유지)] model.language_model.layers.20.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
703 | -> [양자화 X (원본유지)] model.language_model.layers.20.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
704 | -> [양자화 X (원본유지)] model.language_model.layers.20.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
705 | -> [양자화 X (원본유지)] model.language_model.layers.20.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
706 | -> [양자화 X (원본유지)] model.language_model.layers.20.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
707 | -> [양자화 X (원본유지)] model.language_model.layers.20.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
708 | -> [양자화 O] model.language_model.layers.20.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
709 | -> [양자화 O] model.language_model.layers.20.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
710 | -> [양자화 O] model.language_model.layers.20.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
711 | -> [양자화 X (원본유지)] model.language_model.layers.20.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
712 | -> [양자화 X (원본유지)] model.language_model.layers.20.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
713 | -> [양자화 X (원본유지)] model.language_model.layers.20.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
714 | -> [양자화 X (원본유지)] model.language_model.layers.20.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
715 | -> [양자화 X (원본유지)] model.language_model.layers.20.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
716 | -> [양자화 X (원본유지)] model.language_model.layers.20.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
717 | -> [양자화 X (원본유지)] model.language_model.layers.20.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
718 | -> [양자화 O] model.language_model.layers.20.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
719 | -> [양자화 O] model.language_model.layers.20.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
720 | -> [양자화 X (원본유지)] model.language_model.layers.20.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
721 | -> [양자화 O] model.language_model.layers.20.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
722 | -> [양자화 O] model.language_model.layers.20.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
723 | -> [양자화 X (원본유지)] model.language_model.layers.21.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
724 | -> [양자화 X (원본유지)] model.language_model.layers.21.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
725 | -> [양자화 X (원본유지)] model.language_model.layers.21.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
726 | -> [양자화 X (원본유지)] model.language_model.layers.21.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
727 | -> [양자화 X (원본유지)] model.language_model.layers.21.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
728 | -> [양자화 X (원본유지)] model.language_model.layers.21.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
729 | -> [양자화 X (원본유지)] model.language_model.layers.21.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
730 | -> [양자화 X (원본유지)] model.language_model.layers.21.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
731 | -> [양자화 X (원본유지)] model.language_model.layers.21.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
732 | -> [양자화 O] model.language_model.layers.21.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
733 | -> [양자화 O] model.language_model.layers.21.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
734 | -> [양자화 O] model.language_model.layers.21.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
735 | -> [양자화 X (원본유지)] model.language_model.layers.21.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
736 | -> [양자화 X (원본유지)] model.language_model.layers.21.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
737 | -> [양자화 X (원본유지)] model.language_model.layers.21.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
738 | -> [양자화 X (원본유지)] model.language_model.layers.21.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
739 | -> [양자화 X (원본유지)] model.language_model.layers.21.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
740 | -> [양자화 X (원본유지)] model.language_model.layers.21.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
741 | -> [양자화 X (원본유지)] model.language_model.layers.21.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
742 | -> [양자화 O] model.language_model.layers.21.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
743 | -> [양자화 O] model.language_model.layers.21.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
744 | -> [양자화 X (원본유지)] model.language_model.layers.21.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
745 | -> [양자화 O] model.language_model.layers.21.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
746 | -> [양자화 O] model.language_model.layers.21.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
747 | -> [양자화 X (원본유지)] model.language_model.layers.22.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
748 | -> [양자화 X (원본유지)] model.language_model.layers.22.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
749 | -> [양자화 X (원본유지)] model.language_model.layers.22.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
750 | -> [양자화 X (원본유지)] model.language_model.layers.22.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
751 | -> [양자화 X (원본유지)] model.language_model.layers.22.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
752 | -> [양자화 X (원본유지)] model.language_model.layers.22.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
753 | -> [양자화 X (원본유지)] model.language_model.layers.22.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
754 | -> [양자화 X (원본유지)] model.language_model.layers.22.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
755 | -> [양자화 X (원본유지)] model.language_model.layers.22.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
756 | -> [양자화 O] model.language_model.layers.22.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
757 | -> [양자화 O] model.language_model.layers.22.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
758 | -> [양자화 O] model.language_model.layers.22.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
759 | -> [양자화 X (원본유지)] model.language_model.layers.22.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
760 | -> [양자화 X (원본유지)] model.language_model.layers.22.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
761 | -> [양자화 X (원본유지)] model.language_model.layers.22.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
762 | -> [양자화 X (원본유지)] model.language_model.layers.22.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
763 | -> [양자화 X (원본유지)] model.language_model.layers.22.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
764 | -> [양자화 X (원본유지)] model.language_model.layers.22.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
765 | -> [양자화 X (원본유지)] model.language_model.layers.22.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
766 | -> [양자화 O] model.language_model.layers.22.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
767 | -> [양자화 O] model.language_model.layers.22.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
768 | -> [양자화 X (원본유지)] model.language_model.layers.22.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
769 | -> [양자화 O] model.language_model.layers.22.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
770 | -> [양자화 O] model.language_model.layers.22.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
771 | -> [양자화 X (원본유지)] model.language_model.layers.23.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
772 | -> [양자화 X (원본유지)] model.language_model.layers.23.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
773 | -> [양자화 X (원본유지)] model.language_model.layers.23.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
774 | -> [양자화 X (원본유지)] model.language_model.layers.23.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
775 | -> [양자화 X (원본유지)] model.language_model.layers.23.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
776 | -> [양자화 X (원본유지)] model.language_model.layers.23.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
777 | -> [양자화 X (원본유지)] model.language_model.layers.23.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
778 | -> [양자화 X (원본유지)] model.language_model.layers.23.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
779 | -> [양자화 X (원본유지)] model.language_model.layers.23.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
780 | -> [양자화 O] model.language_model.layers.23.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
781 | -> [양자화 O] model.language_model.layers.23.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
782 | -> [양자화 O] model.language_model.layers.23.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
783 | -> [양자화 X (원본유지)] model.language_model.layers.23.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
784 | -> [양자화 X (원본유지)] model.language_model.layers.23.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
785 | -> [양자화 X (원본유지)] model.language_model.layers.23.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
786 | -> [양자화 X (원본유지)] model.language_model.layers.23.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
787 | -> [양자화 X (원본유지)] model.language_model.layers.23.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
788 | -> [양자화 X (원본유지)] model.language_model.layers.23.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
789 | -> [양자화 X (원본유지)] model.language_model.layers.23.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
790 | -> [양자화 O] model.language_model.layers.23.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
791 | -> [양자화 O] model.language_model.layers.23.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
792 | -> [양자화 X (원본유지)] model.language_model.layers.23.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
793 | -> [양자화 O] model.language_model.layers.23.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
794 | -> [양자화 O] model.language_model.layers.23.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
795 | -> [양자화 X (원본유지)] model.language_model.layers.24.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
796 | -> [양자화 X (원본유지)] model.language_model.layers.24.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
797 | -> [양자화 X (원본유지)] model.language_model.layers.24.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
798 | -> [양자화 X (원본유지)] model.language_model.layers.24.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
799 | -> [양자화 X (원본유지)] model.language_model.layers.24.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
800 | -> [양자화 X (원본유지)] model.language_model.layers.24.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
801 | -> [양자화 X (원본유지)] model.language_model.layers.24.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
802 | -> [양자화 X (원본유지)] model.language_model.layers.24.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
803 | -> [양자화 X (원본유지)] model.language_model.layers.24.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
804 | -> [양자화 O] model.language_model.layers.24.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
805 | -> [양자화 O] model.language_model.layers.24.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
806 | -> [양자화 O] model.language_model.layers.24.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
807 | -> [양자화 X (원본유지)] model.language_model.layers.24.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
808 | -> [양자화 X (원본유지)] model.language_model.layers.24.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
809 | -> [양자화 X (원본유지)] model.language_model.layers.24.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
810 | -> [양자화 X (원본유지)] model.language_model.layers.24.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
811 | -> [양자화 X (원본유지)] model.language_model.layers.24.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
812 | -> [양자화 X (원본유지)] model.language_model.layers.24.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
813 | -> [양자화 X (원본유지)] model.language_model.layers.24.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
814 | -> [양자화 O] model.language_model.layers.24.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
815 | -> [양자화 O] model.language_model.layers.24.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
816 | -> [양자화 X (원본유지)] model.language_model.layers.24.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
817 | -> [양자화 O] model.language_model.layers.24.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
818 | -> [양자화 O] model.language_model.layers.24.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
819 | -> [양자화 X (원본유지)] model.language_model.layers.25.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
820 | -> [양자화 X (원본유지)] model.language_model.layers.25.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
821 | -> [양자화 X (원본유지)] model.language_model.layers.25.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
822 | -> [양자화 X (원본유지)] model.language_model.layers.25.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
823 | -> [양자화 X (원본유지)] model.language_model.layers.25.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
824 | -> [양자화 X (원본유지)] model.language_model.layers.25.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
825 | -> [양자화 X (원본유지)] model.language_model.layers.25.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
826 | -> [양자화 X (원본유지)] model.language_model.layers.25.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
827 | -> [양자화 X (원본유지)] model.language_model.layers.25.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
828 | -> [양자화 O] model.language_model.layers.25.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
829 | -> [양자화 O] model.language_model.layers.25.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
830 | -> [양자화 O] model.language_model.layers.25.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
831 | -> [양자화 X (원본유지)] model.language_model.layers.25.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
832 | -> [양자화 X (원본유지)] model.language_model.layers.25.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
833 | -> [양자화 X (원본유지)] model.language_model.layers.25.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
834 | -> [양자화 X (원본유지)] model.language_model.layers.25.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
835 | -> [양자화 X (원본유지)] model.language_model.layers.25.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
836 | -> [양자화 X (원본유지)] model.language_model.layers.25.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
837 | -> [양자화 X (원본유지)] model.language_model.layers.25.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
838 | -> [양자화 O] model.language_model.layers.25.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
839 | -> [양자화 O] model.language_model.layers.25.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
840 | -> [양자화 X (원본유지)] model.language_model.layers.25.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
841 | -> [양자화 O] model.language_model.layers.25.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
842 | -> [양자화 O] model.language_model.layers.25.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
843 | -> [양자화 O] model.language_model.layers.26.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
844 | -> [양자화 O] model.language_model.layers.26.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
845 | -> [양자화 X (원본유지)] model.language_model.layers.26.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
846 | -> [양자화 O] model.language_model.layers.26.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
847 | -> [양자화 O] model.language_model.layers.26.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
848 | -> [양자화 X (원본유지)] model.language_model.layers.26.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
849 | -> [양자화 O] model.language_model.layers.26.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
850 | -> [양자화 O] model.language_model.layers.26.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
851 | -> [양자화 X (원본유지)] model.language_model.layers.3.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
852 | -> [양자화 X (원본유지)] model.language_model.layers.3.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
853 | -> [양자화 X (원본유지)] model.language_model.layers.3.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
854 | -> [양자화 X (원본유지)] model.language_model.layers.3.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
855 | -> [양자화 X (원본유지)] model.language_model.layers.3.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
856 | -> [양자화 X (원본유지)] model.language_model.layers.3.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
857 | -> [양자화 X (원본유지)] model.language_model.layers.3.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
858 | -> [양자화 X (원본유지)] model.language_model.layers.3.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
859 | -> [양자화 X (원본유지)] model.language_model.layers.3.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
860 | -> [양자화 O] model.language_model.layers.3.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
861 | -> [양자화 O] model.language_model.layers.3.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
862 | -> [양자화 O] model.language_model.layers.3.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
863 | -> [양자화 X (원본유지)] model.language_model.layers.3.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
864 | -> [양자화 X (원본유지)] model.language_model.layers.3.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
865 | -> [양자화 X (원본유지)] model.language_model.layers.3.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
866 | -> [양자화 X (원본유지)] model.language_model.layers.3.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
867 | -> [양자화 X (원본유지)] model.language_model.layers.3.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
868 | -> [양자화 X (원본유지)] model.language_model.layers.3.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
869 | -> [양자화 X (원본유지)] model.language_model.layers.3.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
870 | -> [양자화 O] model.language_model.layers.3.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
871 | -> [양자화 O] model.language_model.layers.3.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
872 | -> [양자화 X (원본유지)] model.language_model.layers.3.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
873 | -> [양자화 O] model.language_model.layers.3.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
874 | -> [양자화 O] model.language_model.layers.3.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
875 | -> [양자화 X (원본유지)] model.language_model.layers.4.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
876 | -> [양자화 X (원본유지)] model.language_model.layers.4.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
877 | -> [양자화 X (원본유지)] model.language_model.layers.4.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
878 | -> [양자화 X (원본유지)] model.language_model.layers.4.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
879 | -> [양자화 X (원본유지)] model.language_model.layers.4.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
880 | -> [양자화 X (원본유지)] model.language_model.layers.4.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
881 | -> [양자화 X (원본유지)] model.language_model.layers.4.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
882 | -> [양자화 X (원본유지)] model.language_model.layers.4.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
883 | -> [양자화 X (원본유지)] model.language_model.layers.4.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
884 | -> [양자화 O] model.language_model.layers.4.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
885 | -> [양자화 O] model.language_model.layers.4.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
886 | -> [양자화 O] model.language_model.layers.4.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
887 | -> [양자화 X (원본유지)] model.language_model.layers.4.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
888 | -> [양자화 X (원본유지)] model.language_model.layers.4.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
889 | -> [양자화 X (원본유지)] model.language_model.layers.4.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
890 | -> [양자화 X (원본유지)] model.language_model.layers.4.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
891 | -> [양자화 X (원본유지)] model.language_model.layers.4.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
892 | -> [양자화 X (원본유지)] model.language_model.layers.4.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
893 | -> [양자화 X (원본유지)] model.language_model.layers.4.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
894 | -> [양자화 O] model.language_model.layers.4.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
895 | -> [양자화 O] model.language_model.layers.4.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
896 | -> [양자화 X (원본유지)] model.language_model.layers.4.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
897 | -> [양자화 O] model.language_model.layers.4.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
898 | -> [양자화 O] model.language_model.layers.4.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
899 | -> [양자화 X (원본유지)] model.language_model.layers.5.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
900 | -> [양자화 X (원본유지)] model.language_model.layers.5.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
901 | -> [양자화 X (원본유지)] model.language_model.layers.5.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
902 | -> [양자화 X (원본유지)] model.language_model.layers.5.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
903 | -> [양자화 X (원본유지)] model.language_model.layers.5.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
904 | -> [양자화 X (원본유지)] model.language_model.layers.5.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
905 | -> [양자화 X (원본유지)] model.language_model.layers.5.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
906 | -> [양자화 X (원본유지)] model.language_model.layers.5.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
907 | -> [양자화 X (원본유지)] model.language_model.layers.5.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
908 | -> [양자화 O] model.language_model.layers.5.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
909 | -> [양자화 O] model.language_model.layers.5.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
910 | -> [양자화 O] model.language_model.layers.5.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
911 | -> [양자화 X (원본유지)] model.language_model.layers.5.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
912 | -> [양자화 X (원본유지)] model.language_model.layers.5.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
913 | -> [양자화 X (원본유지)] model.language_model.layers.5.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
914 | -> [양자화 X (원본유지)] model.language_model.layers.5.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
915 | -> [양자화 X (원본유지)] model.language_model.layers.5.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
916 | -> [양자화 X (원본유지)] model.language_model.layers.5.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
917 | -> [양자화 X (원본유지)] model.language_model.layers.5.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
918 | -> [양자화 O] model.language_model.layers.5.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
919 | -> [양자화 O] model.language_model.layers.5.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
920 | -> [양자화 X (원본유지)] model.language_model.layers.5.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
921 | -> [양자화 O] model.language_model.layers.5.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
922 | -> [양자화 O] model.language_model.layers.5.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
923 | -> [양자화 X (원본유지)] model.language_model.layers.6.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
924 | -> [양자화 X (원본유지)] model.language_model.layers.6.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
925 | -> [양자화 X (원본유지)] model.language_model.layers.6.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
926 | -> [양자화 X (원본유지)] model.language_model.layers.6.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
927 | -> [양자화 X (원본유지)] model.language_model.layers.6.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
928 | -> [양자화 X (원본유지)] model.language_model.layers.6.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
929 | -> [양자화 X (원본유지)] model.language_model.layers.6.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
930 | -> [양자화 X (원본유지)] model.language_model.layers.6.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
931 | -> [양자화 X (원본유지)] model.language_model.layers.6.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
932 | -> [양자화 O] model.language_model.layers.6.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
933 | -> [양자화 O] model.language_model.layers.6.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
934 | -> [양자화 O] model.language_model.layers.6.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
935 | -> [양자화 X (원본유지)] model.language_model.layers.6.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
936 | -> [양자화 X (원본유지)] model.language_model.layers.6.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
937 | -> [양자화 X (원본유지)] model.language_model.layers.6.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
938 | -> [양자화 X (원본유지)] model.language_model.layers.6.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
939 | -> [양자화 X (원본유지)] model.language_model.layers.6.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
940 | -> [양자화 X (원본유지)] model.language_model.layers.6.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
941 | -> [양자화 X (원본유지)] model.language_model.layers.6.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
942 | -> [양자화 O] model.language_model.layers.6.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
943 | -> [양자화 O] model.language_model.layers.6.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
944 | -> [양자화 X (원본유지)] model.language_model.layers.6.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
945 | -> [양자화 O] model.language_model.layers.6.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
946 | -> [양자화 O] model.language_model.layers.6.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
947 | -> [양자화 X (원본유지)] model.language_model.layers.7.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
948 | -> [양자화 X (원본유지)] model.language_model.layers.7.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
949 | -> [양자화 X (원본유지)] model.language_model.layers.7.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
950 | -> [양자화 X (원본유지)] model.language_model.layers.7.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
951 | -> [양자화 X (원본유지)] model.language_model.layers.7.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
952 | -> [양자화 X (원본유지)] model.language_model.layers.7.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
953 | -> [양자화 X (원본유지)] model.language_model.layers.7.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
954 | -> [양자화 X (원본유지)] model.language_model.layers.7.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
955 | -> [양자화 X (원본유지)] model.language_model.layers.7.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
956 | -> [양자화 O] model.language_model.layers.7.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
957 | -> [양자화 O] model.language_model.layers.7.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
958 | -> [양자화 O] model.language_model.layers.7.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
959 | -> [양자화 X (원본유지)] model.language_model.layers.7.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
960 | -> [양자화 X (원본유지)] model.language_model.layers.7.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
961 | -> [양자화 X (원본유지)] model.language_model.layers.7.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
962 | -> [양자화 X (원본유지)] model.language_model.layers.7.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
963 | -> [양자화 X (원본유지)] model.language_model.layers.7.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
964 | -> [양자화 X (원본유지)] model.language_model.layers.7.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
965 | -> [양자화 X (원본유지)] model.language_model.layers.7.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
966 | -> [양자화 O] model.language_model.layers.7.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
967 | -> [양자화 O] model.language_model.layers.7.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
968 | -> [양자화 X (원본유지)] model.language_model.layers.7.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
969 | -> [양자화 O] model.language_model.layers.7.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
970 | -> [양자화 O] model.language_model.layers.7.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
971 | -> [양자화 X (원본유지)] model.language_model.layers.8.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
972 | -> [양자화 X (원본유지)] model.language_model.layers.8.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
973 | -> [양자화 X (원본유지)] model.language_model.layers.8.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
974 | -> [양자화 X (원본유지)] model.language_model.layers.8.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
975 | -> [양자화 X (원본유지)] model.language_model.layers.8.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
976 | -> [양자화 X (원본유지)] model.language_model.layers.8.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
977 | -> [양자화 X (원본유지)] model.language_model.layers.8.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
978 | -> [양자화 X (원본유지)] model.language_model.layers.8.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
979 | -> [양자화 X (원본유지)] model.language_model.layers.8.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
980 | -> [양자화 O] model.language_model.layers.8.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
981 | -> [양자화 O] model.language_model.layers.8.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
982 | -> [양자화 O] model.language_model.layers.8.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
983 | -> [양자화 X (원본유지)] model.language_model.layers.8.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
984 | -> [양자화 X (원본유지)] model.language_model.layers.8.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
985 | -> [양자화 X (원본유지)] model.language_model.layers.8.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
986 | -> [양자화 X (원본유지)] model.language_model.layers.8.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
987 | -> [양자화 X (원본유지)] model.language_model.layers.8.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
988 | -> [양자화 X (원본유지)] model.language_model.layers.8.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
989 | -> [양자화 X (원본유지)] model.language_model.layers.8.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
990 | -> [양자화 O] model.language_model.layers.8.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
991 | -> [양자화 O] model.language_model.layers.8.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
992 | -> [양자화 X (원본유지)] model.language_model.layers.8.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
993 | -> [양자화 O] model.language_model.layers.8.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
994 | -> [양자화 O] model.language_model.layers.8.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
995 | -> [양자화 X (원본유지)] model.language_model.layers.9.altup.correct_output_scale : 원본 형태 torch.Size([2048]) |
996 | -> [양자화 X (원본유지)] model.language_model.layers.9.altup.correction_coefs.weight : 원본 형태 torch.Size([4, 4]) |
997 | -> [양자화 X (원본유지)] model.language_model.layers.9.altup.modality_router.weight : 원본 형태 torch.Size([4, 2048]) |
998 | -> [양자화 X (원본유지)] model.language_model.layers.9.altup.prediction_coefs.weight : 원본 형태 torch.Size([16, 4]) |
999 | -> [양자화 X (원본유지)] model.language_model.layers.9.altup.router_norm.weight : 원본 형태 torch.Size([2048]) |
1000 | -> [양자화 X (원본유지)] model.language_model.layers.9.input_layernorm.weight : 원본 형태 torch.Size([2048]) |
1001 | -> [양자화 X (원본유지)] model.language_model.layers.9.laurel.linear_left.weight : 원본 형태 torch.Size([64, 2048]) |
1002 | -> [양자화 X (원본유지)] model.language_model.layers.9.laurel.linear_right.weight : 원본 형태 torch.Size([2048, 64]) |
1003 | -> [양자화 X (원본유지)] model.language_model.layers.9.laurel.post_laurel_norm.weight : 원본 형태 torch.Size([2048]) |
1004 | -> [양자화 O] model.language_model.layers.9.mlp.down_proj.weight : 원본 형태 (2048, 8192) |
1005 | -> [양자화 O] model.language_model.layers.9.mlp.gate_proj.weight : 원본 형태 (8192, 2048) |
1006 | -> [양자화 O] model.language_model.layers.9.mlp.up_proj.weight : 원본 형태 (8192, 2048) |
1007 | -> [양자화 X (원본유지)] model.language_model.layers.9.per_layer_input_gate.weight : 원본 형태 torch.Size([256, 2048]) |
1008 | -> [양자화 X (원본유지)] model.language_model.layers.9.per_layer_projection.weight : 원본 형태 torch.Size([2048, 256]) |
1009 | -> [양자화 X (원본유지)] model.language_model.layers.9.post_attention_layernorm.weight : 원본 형태 torch.Size([2048]) |
1010 | -> [양자화 X (원본유지)] model.language_model.layers.9.post_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
1011 | -> [양자화 X (원본유지)] model.language_model.layers.9.post_per_layer_input_norm.weight : 원본 형태 torch.Size([2048]) |
1012 | -> [양자화 X (원본유지)] model.language_model.layers.9.pre_feedforward_layernorm.weight : 원본 형태 torch.Size([2048]) |
1013 | -> [양자화 X (원본유지)] model.language_model.layers.9.self_attn.k_norm.weight : 원본 형태 torch.Size([256]) |
1014 | -> [양자화 O] model.language_model.layers.9.self_attn.k_proj.weight : 원본 형태 (512, 2048) |
1015 | -> [양자화 O] model.language_model.layers.9.self_attn.o_proj.weight : 원본 형태 (2048, 2048) |
1016 | -> [양자화 X (원본유지)] model.language_model.layers.9.self_attn.q_norm.weight : 원본 형태 torch.Size([256]) |
1017 | -> [양자화 O] model.language_model.layers.9.self_attn.q_proj.weight : 원본 형태 (2048, 2048) |
1018 | -> [양자화 O] model.language_model.layers.9.self_attn.v_proj.weight : 원본 형태 (512, 2048) |
1019 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.0.bn1.weight : 원본 형태 torch.Size([256]) |
1020 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.0.bn2.weight : 원본 형태 torch.Size([128]) |
1021 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.0.conv_exp.weight : 원본 형태 torch.Size([256, 64, 3, 3]) |
1022 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.0.conv_pwl.weight : 원본 형태 torch.Size([128, 256, 1, 1]) |
1023 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.1.bn1.weight : 원본 형태 torch.Size([512]) |
1024 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.1.bn2.weight : 원본 형태 torch.Size([128]) |
1025 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.1.conv_exp.weight : 원본 형태 torch.Size([512, 128, 3, 3]) |
1026 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.1.conv_pwl.weight : 원본 형태 torch.Size([128, 512, 1, 1]) |
1027 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.2.bn1.weight : 원본 형태 torch.Size([512]) |
1028 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.2.bn2.weight : 원본 형태 torch.Size([128]) |
1029 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.2.conv_exp.weight : 원본 형태 torch.Size([512, 128, 3, 3]) |
1030 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.0.2.conv_pwl.weight : 원본 형태 torch.Size([128, 512, 1, 1]) |
1031 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.dw_mid.bn.weight : 원본 형태 torch.Size([768]) |
1032 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.dw_mid.conv.weight : 원본 형태 torch.Size([768, 1, 5, 5]) |
1033 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.dw_start.bn.weight : 원본 형태 torch.Size([128]) |
1034 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.dw_start.conv.weight : 원본 형태 torch.Size([128, 1, 3, 3]) |
1035 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.layer_scale.gamma : 원본 형태 torch.Size([256]) |
1036 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.pw_exp.bn.weight : 원본 형태 torch.Size([768]) |
1037 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.pw_exp.conv.weight : 원본 형태 torch.Size([768, 128, 1, 1]) |
1038 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.pw_proj.bn.weight : 원본 형태 torch.Size([256]) |
1039 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.0.pw_proj.conv.weight : 원본 형태 torch.Size([256, 768, 1, 1]) |
1040 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.dw_start.bn.weight : 원본 형태 torch.Size([256]) |
1041 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.dw_start.conv.weight : 원본 형태 torch.Size([256, 1, 5, 5]) |
1042 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.layer_scale.gamma : 원본 형태 torch.Size([256]) |
1043 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.pw_exp.bn.weight : 원본 형태 torch.Size([1024]) |
1044 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.pw_exp.conv.weight : 원본 형태 torch.Size([1024, 256, 1, 1]) |
1045 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.pw_proj.bn.weight : 원본 형태 torch.Size([256]) |
1046 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.1.pw_proj.conv.weight : 원본 형태 torch.Size([256, 1024, 1, 1]) |
1047 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.dw_start.bn.weight : 원본 형태 torch.Size([256]) |
1048 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.dw_start.conv.weight : 원본 형태 torch.Size([256, 1, 3, 3]) |
1049 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.layer_scale.gamma : 원본 형태 torch.Size([256]) |
1050 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.pw_exp.bn.weight : 원본 형태 torch.Size([1024]) |
1051 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.pw_exp.conv.weight : 원본 형태 torch.Size([1024, 256, 1, 1]) |
1052 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.pw_proj.bn.weight : 원본 형태 torch.Size([256]) |
1053 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.2.pw_proj.conv.weight : 원본 형태 torch.Size([256, 1024, 1, 1]) |
1054 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.dw_start.bn.weight : 원본 형태 torch.Size([256]) |
1055 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.dw_start.conv.weight : 원본 형태 torch.Size([256, 1, 5, 5]) |
1056 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.layer_scale.gamma : 원본 형태 torch.Size([256]) |
1057 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.pw_exp.bn.weight : 원본 형태 torch.Size([1024]) |
1058 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.pw_exp.conv.weight : 원본 형태 torch.Size([1024, 256, 1, 1]) |
1059 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.pw_proj.bn.weight : 원본 형태 torch.Size([256]) |
1060 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.3.pw_proj.conv.weight : 원본 형태 torch.Size([256, 1024, 1, 1]) |
1061 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.dw_start.bn.weight : 원본 형태 torch.Size([256]) |
1062 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.dw_start.conv.weight : 원본 형태 torch.Size([256, 1, 3, 3]) |
1063 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.layer_scale.gamma : 원본 형태 torch.Size([256]) |
1064 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.pw_exp.bn.weight : 원본 형태 torch.Size([1024]) |
1065 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.pw_exp.conv.weight : 원본 형태 torch.Size([1024, 256, 1, 1]) |
1066 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.pw_proj.bn.weight : 원본 형태 torch.Size([256]) |
1067 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.1.4.pw_proj.conv.weight : 원본 형태 torch.Size([256, 1024, 1, 1]) |
1068 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.dw_mid.bn.weight : 원본 형태 torch.Size([1536]) |
1069 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.dw_mid.conv.weight : 원본 형태 torch.Size([1536, 1, 5, 5]) |
1070 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.dw_start.bn.weight : 원본 형태 torch.Size([256]) |
1071 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.dw_start.conv.weight : 원본 형태 torch.Size([256, 1, 5, 5]) |
1072 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1073 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.pw_exp.bn.weight : 원본 형태 torch.Size([1536]) |
1074 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.pw_exp.conv.weight : 원본 형태 torch.Size([1536, 256, 1, 1]) |
1075 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1076 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.0.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1536, 1, 1]) |
1077 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1078 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1079 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1080 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1081 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1082 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1083 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.1.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1084 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.10.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1085 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.10.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1086 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.10.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1087 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.10.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1088 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.10.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1089 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1090 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1091 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1092 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1093 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1094 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1095 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1096 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1097 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1098 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.11.norm.weight : 원본 형태 torch.Size([640]) |
1099 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.12.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1100 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.12.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1101 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.12.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1102 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.12.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1103 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.12.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1104 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1105 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1106 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1107 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1108 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1109 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1110 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1111 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1112 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1113 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.13.norm.weight : 원본 형태 torch.Size([640]) |
1114 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.14.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1115 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.14.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1116 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.14.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1117 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.14.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1118 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.14.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1119 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1120 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1121 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1122 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1123 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1124 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1125 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1126 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1127 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1128 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.15.norm.weight : 원본 형태 torch.Size([640]) |
1129 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.16.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1130 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.16.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1131 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.16.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1132 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.16.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1133 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.16.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1134 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1135 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1136 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1137 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1138 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1139 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1140 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1141 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1142 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1143 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.17.norm.weight : 원본 형태 torch.Size([640]) |
1144 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.18.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1145 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.18.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1146 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.18.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1147 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.18.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1148 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.18.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1149 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1150 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1151 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1152 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1153 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1154 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1155 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1156 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1157 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1158 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.19.norm.weight : 원본 형태 torch.Size([640]) |
1159 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1160 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1161 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1162 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1163 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1164 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1165 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.2.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1166 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.20.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1167 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.20.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1168 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.20.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1169 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.20.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1170 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.20.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1171 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1172 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1173 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1174 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1175 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1176 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1177 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1178 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1179 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1180 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.21.norm.weight : 원본 형태 torch.Size([640]) |
1181 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.22.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1182 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.22.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1183 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.22.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1184 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.22.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1185 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.22.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1186 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1187 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1188 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1189 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1190 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1191 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1192 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1193 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1194 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1195 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.23.norm.weight : 원본 형태 torch.Size([640]) |
1196 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.24.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1197 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.24.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1198 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.24.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1199 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.24.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1200 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.24.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1201 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1202 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1203 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1204 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1205 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1206 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1207 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1208 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1209 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1210 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.25.norm.weight : 원본 형태 torch.Size([640]) |
1211 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.26.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1212 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.26.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1213 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.26.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1214 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.26.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1215 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.26.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1216 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1217 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1218 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1219 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1220 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1221 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1222 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1223 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1224 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1225 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.27.norm.weight : 원본 형태 torch.Size([640]) |
1226 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.28.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1227 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.28.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1228 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.28.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1229 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.28.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1230 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.28.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1231 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1232 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1233 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1234 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1235 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1236 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1237 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1238 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1239 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1240 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.29.norm.weight : 원본 형태 torch.Size([640]) |
1241 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1242 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1243 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1244 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1245 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1246 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1247 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.3.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1248 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.30.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1249 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.30.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1250 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.30.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1251 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.30.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1252 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.30.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1253 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1254 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1255 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1256 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1257 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1258 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1259 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1260 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1261 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1262 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.31.norm.weight : 원본 형태 torch.Size([640]) |
1263 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.32.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1264 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.32.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1265 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.32.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1266 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.32.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1267 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.32.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1268 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1269 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1270 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1271 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1272 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1273 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1274 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1275 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1276 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1277 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.33.norm.weight : 원본 형태 torch.Size([640]) |
1278 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.34.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1279 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.34.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1280 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.34.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1281 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.34.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1282 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.34.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1283 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1284 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1285 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1286 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1287 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1288 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1289 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1290 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1291 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1292 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.35.norm.weight : 원본 형태 torch.Size([640]) |
1293 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.36.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1294 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.36.pw_exp.bn.weight : 원본 형태 torch.Size([1280]) |
1295 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.36.pw_exp.conv.weight : 원본 형태 torch.Size([1280, 640, 1, 1]) |
1296 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.36.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1297 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.36.pw_proj.conv.weight : 원본 형태 torch.Size([640, 1280, 1, 1]) |
1298 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1299 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1300 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1301 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1302 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1303 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1304 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.4.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1305 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1306 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1307 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1308 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1309 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1310 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1311 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.5.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1312 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1313 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1314 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1315 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1316 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1317 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1318 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.6.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1319 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1320 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1321 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1322 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1323 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 640, 1, 1]) |
1324 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1325 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.7.pw_proj.conv.weight : 원본 형태 torch.Size([640, 2560, 1, 1]) |
1326 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.8.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1327 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.8.pw_exp.bn.weight : 원본 형태 torch.Size([640]) |
1328 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.8.pw_exp.conv.weight : 원본 형태 torch.Size([640, 640, 1, 1]) |
1329 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.8.pw_proj.bn.weight : 원본 형태 torch.Size([640]) |
1330 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.8.pw_proj.conv.weight : 원본 형태 torch.Size([640, 640, 1, 1]) |
1331 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.key.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1332 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.key.norm.weight : 원본 형태 torch.Size([640]) |
1333 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.key.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1334 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.output.proj.weight : 원본 형태 torch.Size([640, 768, 1, 1]) |
1335 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.query.proj.weight : 원본 형태 torch.Size([768, 640, 1, 1]) |
1336 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.value.down_conv.weight : 원본 형태 torch.Size([640, 1, 3, 3]) |
1337 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.value.norm.weight : 원본 형태 torch.Size([640]) |
1338 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.attn.value.proj.weight : 원본 형태 torch.Size([64, 640, 1, 1]) |
1339 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.layer_scale.gamma : 원본 형태 torch.Size([640]) |
1340 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.2.9.norm.weight : 원본 형태 torch.Size([640]) |
1341 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.dw_mid.bn.weight : 원본 형태 torch.Size([3840]) |
1342 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.dw_mid.conv.weight : 원본 형태 torch.Size([3840, 1, 5, 5]) |
1343 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.dw_start.bn.weight : 원본 형태 torch.Size([640]) |
1344 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.dw_start.conv.weight : 원본 형태 torch.Size([640, 1, 5, 5]) |
1345 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1346 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.pw_exp.bn.weight : 원본 형태 torch.Size([3840]) |
1347 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.pw_exp.conv.weight : 원본 형태 torch.Size([3840, 640, 1, 1]) |
1348 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1349 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.0.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 3840, 1, 1]) |
1350 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1351 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1352 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1353 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1354 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1355 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.1.norm.weight : 원본 형태 torch.Size([1280]) |
1356 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.10.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1357 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.10.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1358 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.10.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1359 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.10.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1360 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.10.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1361 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1362 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1363 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1364 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1365 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1366 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.11.norm.weight : 원본 형태 torch.Size([1280]) |
1367 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.12.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1368 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.12.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1369 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.12.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1370 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.12.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1371 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.12.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1372 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1373 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1374 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1375 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1376 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1377 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.13.norm.weight : 원본 형태 torch.Size([1280]) |
1378 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.14.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1379 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.14.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1380 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.14.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1381 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.14.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1382 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.14.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1383 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1384 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1385 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1386 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1387 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1388 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.15.norm.weight : 원본 형태 torch.Size([1280]) |
1389 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.16.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1390 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.16.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1391 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.16.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1392 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.16.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1393 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.16.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1394 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1395 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1396 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1397 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1398 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1399 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.17.norm.weight : 원본 형태 torch.Size([1280]) |
1400 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.18.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1401 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.18.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1402 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.18.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1403 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.18.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1404 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.18.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1405 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1406 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1407 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1408 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1409 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1410 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.19.norm.weight : 원본 형태 torch.Size([1280]) |
1411 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.2.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1412 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.2.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1413 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.2.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1414 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.2.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1415 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.2.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1416 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.20.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1417 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.20.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1418 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.20.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1419 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.20.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1420 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.20.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1421 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1422 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1423 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1424 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1425 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1426 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.21.norm.weight : 원본 형태 torch.Size([1280]) |
1427 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.22.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1428 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.22.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1429 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.22.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1430 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.22.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1431 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.22.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1432 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1433 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1434 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1435 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1436 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1437 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.23.norm.weight : 원본 형태 torch.Size([1280]) |
1438 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.24.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1439 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.24.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1440 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.24.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1441 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.24.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1442 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.24.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1443 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1444 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1445 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1446 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1447 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1448 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.25.norm.weight : 원본 형태 torch.Size([1280]) |
1449 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.26.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1450 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.26.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1451 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.26.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1452 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.26.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1453 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.26.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1454 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1455 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1456 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1457 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1458 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1459 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.27.norm.weight : 원본 형태 torch.Size([1280]) |
1460 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.28.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1461 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.28.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1462 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.28.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1463 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.28.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1464 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.28.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1465 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1466 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1467 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1468 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1469 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1470 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.29.norm.weight : 원본 형태 torch.Size([1280]) |
1471 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1472 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1473 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1474 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1475 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1476 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.3.norm.weight : 원본 형태 torch.Size([1280]) |
1477 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.30.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1478 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.30.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1479 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.30.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1480 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.30.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1481 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.30.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1482 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1483 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1484 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1485 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1486 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1487 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.31.norm.weight : 원본 형태 torch.Size([1280]) |
1488 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.32.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1489 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.32.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1490 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.32.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1491 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.32.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1492 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.32.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1493 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1494 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1495 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1496 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1497 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1498 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.33.norm.weight : 원본 형태 torch.Size([1280]) |
1499 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.34.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1500 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.34.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1501 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.34.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1502 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.34.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1503 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.34.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1504 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1505 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1506 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1507 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1508 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1509 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.35.norm.weight : 원본 형태 torch.Size([1280]) |
1510 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.36.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1511 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.36.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1512 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.36.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1513 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.36.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1514 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.36.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1515 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1516 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1517 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1518 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1519 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1520 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.37.norm.weight : 원본 형태 torch.Size([1280]) |
1521 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.38.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1522 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.38.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1523 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.38.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1524 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.38.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1525 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.38.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1526 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.4.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1527 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.4.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1528 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.4.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1529 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.4.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1530 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.4.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1531 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1532 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1533 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1534 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1535 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1536 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.5.norm.weight : 원본 형태 torch.Size([1280]) |
1537 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.6.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1538 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.6.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1539 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.6.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1540 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.6.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1541 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.6.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1542 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1543 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1544 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1545 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1546 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1547 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.7.norm.weight : 원본 형태 torch.Size([1280]) |
1548 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.8.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1549 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.8.pw_exp.bn.weight : 원본 형태 torch.Size([2560]) |
1550 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.8.pw_exp.conv.weight : 원본 형태 torch.Size([2560, 1280, 1, 1]) |
1551 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.8.pw_proj.bn.weight : 원본 형태 torch.Size([1280]) |
1552 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.8.pw_proj.conv.weight : 원본 형태 torch.Size([1280, 2560, 1, 1]) |
1553 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.attn.key.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1554 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.attn.output.proj.weight : 원본 형태 torch.Size([1280, 1536, 1, 1]) |
1555 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.attn.query.proj.weight : 원본 형태 torch.Size([1536, 1280, 1, 1]) |
1556 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.attn.value.proj.weight : 원본 형태 torch.Size([96, 1280, 1, 1]) |
1557 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.layer_scale.gamma : 원본 형태 torch.Size([1280]) |
1558 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.blocks.3.9.norm.weight : 원본 형태 torch.Size([1280]) |
1559 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.conv_stem.bn.weight : 원본 형태 torch.Size([64]) |
1560 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.conv_stem.conv.weight : 원본 형태 torch.Size([64, 3, 3, 3]) |
1561 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.msfa.ffn.pw_exp.bn.weight : 원본 형태 torch.Size([3840]) |
1562 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.msfa.ffn.pw_exp.conv.weight : 원본 형태 torch.Size([3840, 1920, 1, 1]) |
1563 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.msfa.ffn.pw_proj.bn.weight : 원본 형태 torch.Size([2048]) |
1564 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.msfa.ffn.pw_proj.conv.weight : 원본 형태 torch.Size([2048, 3840, 1, 1]) |
1565 | -> [양자화 X (원본유지)] model.vision_tower.timm_model.msfa.norm.weight : 원본 형태 torch.Size([2048]) |