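Introduction
I'm running an Intel Arc A770 and compared the speed of llama.cpp's Vulkan and SYCL backends using a 4-bit quantized "Gemma-3-12b-it" (Q4_K_M). For how to use llama.cpp with the Vulkan backend, see
here.
For llama.cpp with the SYCL backend, I used
"llama-cpp-ipex-llm-2.3.0b20250424-ubuntu-core.tgz", downloaded from
here.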
Results
-ngl 99
Vulkan
hoge@kioxia:~/Documents/vulkan-llamacpp/build/bin$ ./llama-bench -m /home/hoge/Documents/models/gemma-3-12b-it-Q4_K_M.gguf -ngl 99
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Arc(tm) A770 Graphics (DG2) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | Vulkan     |  99 |         pp512 |         61.36 ± 0.14 |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | Vulkan     |  99 |         tg128 |         11.59 ± 0.03 |
build: d5fe4e81 (5192)
SYCL
hoge@kioxia:~/Documents/llama-cpp$ ./llama-bench -m /home/hoge/Documents/models/gemma-3-12b-it-Q4_K_M.gguf -ngl 99
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  99 |         pp512 |       1071.56 ± 0.93 |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  99 |         tg128 |         16.36 ± 0.12 |
build: 4e836c04 (2812)
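llama-bench prints each result as one pipe-delimited row, so the numbers can be pulled into a script for comparison. The following is a minimal Python sketch and not part of llama.cpp: `parse_bench_rows` is a hypothetical helper name, and it assumes the seven-column layout shown in this post (model, size, params, backend, ngl, test, t/s).

```python
import re

# Matches a throughput cell such as "1071.56 ± 0.93".
TPS_CELL = re.compile(r"([\d.]+)\s*±\s*([\d.]+)")

def parse_bench_rows(text):
    """Parse llama-bench markdown rows into dicts with mean and stddev t/s."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 7:
            continue  # prompt lines, "build:" lines, and similar
        m = TPS_CELL.fullmatch(cells[6])
        if not m:
            # Header row, separator row, or the ggml_vulkan device line,
            # which also happens to contain pipe characters.
            continue
        rows.append({
            "backend": cells[3],
            "ngl": int(cells[4]),
            "test": cells[5],
            "tps": float(m.group(1)),
            "stddev": float(m.group(2)),
        })
    return rows

# Sample input copied from the SYCL -ngl 99 run in this post.
sample = """\
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  99 |         pp512 |       1071.56 ± 0.93 |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  99 |         tg128 |         16.36 ± 0.12 |
build: 4e836c04 (2812)
"""

for row in parse_bench_rows(sample):
    print(row)
```

Filtering on the last cell rather than on the first keeps the helper tolerant of extra log output mixed into the captured text.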
-ngl 40
Vulkan
hoge@kioxia:~/Documents/vulkan-llamacpp/build/bin$ ./llama-bench -m /home/hoge/Documents/models/gemma-3-12b-it-Q4_K_M.gguf -ngl 40
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Arc(tm) A770 Graphics (DG2) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | Vulkan     |  40 |         pp512 |         59.39 ± 0.33 |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | Vulkan     |  40 |         tg128 |          7.18 ± 0.01 |
build: d5fe4e81 (5192)
SYCL
hoge@kioxia:~/Documents/llama-cpp$ ./llama-bench -m /home/hoge/Documents/models/gemma-3-12b-it-Q4_K_M.gguf -ngl 40
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  40 |         pp512 |       167.31 ± 36.20 |
| gemma3 12B Q4_K - Medium       |   6.79 GiB |    11.77 B | SYCL       |  40 |         tg128 |         11.22 ± 0.07 |
build: 4e836c04 (2812)
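To put the four runs side by side, the mean throughputs can be turned into SYCL-over-Vulkan ratios. This is a small Python sketch using only the mean t/s values reported above; the `results` layout and the `speedup` helper are illustrative names, not anything produced by llama-bench.

```python
# Mean tokens/s copied from the llama-bench tables in this post.
results = {
    (99, "pp512"): {"Vulkan": 61.36, "SYCL": 1071.56},
    (99, "tg128"): {"Vulkan": 11.59, "SYCL": 16.36},
    (40, "pp512"): {"Vulkan": 59.39, "SYCL": 167.31},
    (40, "tg128"): {"Vulkan": 7.18, "SYCL": 11.22},
}

def speedup(entry):
    """Return SYCL throughput divided by Vulkan throughput."""
    return entry["SYCL"] / entry["Vulkan"]

for (ngl, test), entry in results.items():
    print(f"-ngl {ngl} {test}: SYCL is {speedup(entry):.2f}x Vulkan")
```

With these numbers, SYCL comes out at roughly 17.5x Vulkan for prompt processing (pp512) and 1.4x for token generation (tg128) at -ngl 99, and roughly 2.8x and 1.6x at -ngl 40. The SYCL pp512 figure at -ngl 40 has a standard deviation of 36.20 on a mean of 167.31 (about 22%), so that particular ratio should be read as approximate.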