@pzdunows

No description provided.

@pzdunows
Author

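This change improves the performance of LFM2 models. The llama-bench runs below were measured on the ROCm backend with an AMD Radeon RX 7900 XTX (gfx1100), using the same build with the fix disabled and enabled.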

Before:

D:\test>fix_disabled\llama-bench.exe -m LFM2-1.2B-Q8_0.gguf -ngl 100 -t 8
HIP Library Path: D:\test\fix_disabled\amdhip64_7.dll
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 7900 XTX, gfx1100 (0x1100), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| lfm2 1.2B Q8_0                 |   1.16 GiB |     1.17 B | ROCm       | 100 |           pp512 |    13955.14 ± 240.79 |
| lfm2 1.2B Q8_0                 |   1.16 GiB |     1.17 B | ROCm       | 100 |           tg128 |        278.71 ± 5.62 |

build: 017eceed6 (7036)

After:

D:\test>fix_enabled\llama-bench.exe -m LFM2-1.2B-Q8_0.gguf -ngl 100 -t 8
HIP Library Path: D:\test\fix_enabled\amdhip64_7.dll
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 7900 XTX, gfx1100 (0x1100), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| lfm2 1.2B Q8_0                 |   1.16 GiB |     1.17 B | ROCm       | 100 |           pp512 |    14617.30 ± 177.74 |
| lfm2 1.2B Q8_0                 |   1.16 GiB |     1.17 B | ROCm       | 100 |           tg128 |        328.27 ± 7.79 |

build: 017eceed6 (7036)
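
To put the two tables side by side, the relative change can be worked out from the reported t/s values. The following is only a small sketch: the numbers are copied by hand from the llama-bench output above, the helper names (`relative_gain`, `summarize`) are made up for illustration and are not part of llama.cpp, and comparing the difference against the sum of the two reported deviations is a rough sanity check, not a proper statistical test.

```python
# Throughput figures (tokens/s) copied by hand from the llama-bench tables above,
# stored as (mean, reported +/- deviation).
BEFORE = {"pp512": (13955.14, 240.79), "tg128": (278.71, 5.62)}
AFTER = {"pp512": (14617.30, 177.74), "tg128": (328.27, 7.79)}


def relative_gain(old: float, new: float) -> float:
    """Return the percentage change going from old to new."""
    return (new - old) / old * 100.0


def summarize(before: dict, after: dict) -> dict:
    """Compare the two runs test by test (hypothetical helper, illustration only)."""
    rows = {}
    for test, (old_mean, old_dev) in before.items():
        new_mean, new_dev = after[test]
        delta = new_mean - old_mean
        rows[test] = {
            "delta": delta,
            "gain_pct": relative_gain(old_mean, new_mean),
            # Crude check: is the difference larger than both deviations combined?
            "exceeds_noise": delta > (old_dev + new_dev),
        }
    return rows


if __name__ == "__main__":
    for test, row in summarize(BEFORE, AFTER).items():
        print(
            f"{test}: {row['delta']:+.2f} t/s "
            f"({row['gain_pct']:+.1f}%), exceeds noise: {row['exceeds_noise']}"
        )
```

With the figures above this works out to roughly +4.7% for pp512 and +17.8% for tg128, and in both cases the difference is larger than the combined reported deviations. The gain is concentrated in token generation.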
