LLM judge eval error: failed with code 134 #2285
Unanswered
daimeiquan
asked this question in
Q&A
OS:
WSL Ubuntu 24.04
shell:
python run.py --models hf_qwen2_5_0_5b_instruct --datasets PubMedQA_llmjudge_gen
error:
.../opencompass/opencompass/runners/local.py - _launch - 241 - task OpenICLEval[qwen2.5-0.5b-instruct/PubMedQA] fail, see
outputs/default/20251002_215819/logs/eval/qwen2.5-0.5b-instruct/PubMedQA.out
.../opencompass/opencompass/runners/base.py - summarize - 64 - OpenICLEval[qwen2.5-0.5b-instruct/PubMedQA] failed with code 134
logs/eval/qwen2.5-0.5b-instruct/PubMedQA.out:
free(): double free detected in tcache 2
Aborted (core dumped)
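For anyone reading the log above: an exit code above 128 on Linux normally means the process was killed by a signal, with the signal number equal to the code minus 128. So 134 corresponds to signal 6 (SIGABRT), which matches glibc calling `abort()` after printing the `double free` message. That points at native (C/C++) code in the eval process rather than a Python exception. A minimal sketch for decoding such codes; `describe_exit_code` is a hypothetical helper name, not part of OpenCompass:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit code into a readable description.

    Assumes the common Unix shell convention: codes above 128 mean
    "terminated by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(134))
```

Because the abort happens in native code, running the failing eval task with `python -X faulthandler` (or `PYTHONFAULTHANDLER=1`) should print the Python stack at the moment of the abort, which may show which extension module was active.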
.../opencompass/opencompass/configs/datasets/PubMedQA/PubMedQA_llmjudge_gen_f00302.py:
judge_cfg=dict(
    type='opencompass.models.OpenAISDK',
    path='qwen3:8b',
    key='ollama',
    openai_api_base='http://localhost:11434/v1',
    query_per_second=5,
    max_out_len=256,
    batch_size=8,
    meta_template=dict(
        round=[
            dict(role='HUMAN', api_role='HUMAN'),
            dict(role='BOT', api_role='BOT', generate=True),
        ]
    ),
),

pip list:
`
Package Version Editable project location
absl-py 2.3.1
accelerate 1.10.1
addict 2.4.0
aiohappyeyeballs 2.6.1
aiohttp 3.12.15
aiosignal 1.4.0
alpaca_eval 0.6
annotated-types 0.7.0
anthropic 0.69.0
antlr4-python3-runtime 4.11.0
anyio 4.11.0
async-timeout 5.0.1
attrs 25.3.0
beautifulsoup4 4.14.2
boto3 1.28.43
botocore 1.31.43
brotlicffi 1.0.9.2
cachetools 6.2.0
certifi 2025.8.3
cffi 2.0.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.3.0
cn2an 0.5.23
colorama 0.4.6
contourpy 1.3.2
cpm-kernels 1.0.11
cryptography 46.0.1
cycler 0.12.1
dashscope 1.24.6
datasets 3.6.0
decorator 5.2.1
dill 0.3.8
dingo-python 1.5.0
distro 1.9.0
docstring_parser 0.17.0
einops 0.8.1
evaluate 0.4.6
exceptiongroup 1.3.0
faiss-gpu 1.7.2
fastapi 0.117.1
fasttext-wheel 0.9.2
filelock 3.17.0
fire 0.7.1
fonttools 4.60.0
frozenlist 1.7.0
fsspec 2025.3.0
func_timeout 4.3.5
fuzzywuzzy 0.18.0
gmpy2 2.2.1
google 3.0.0
gradio_client 1.13.3
h11 0.16.0
h5py 3.14.0
hanziconv 0.3.2
hf-xet 1.1.10
httpcore 1.0.9
httpx 0.27.2
huggingface-hub 0.35.1
human-eval 1.0.3
idna 3.7
immutabledict 4.2.1
importlib_metadata 8.7.0
jieba 0.42.1
Jinja2 3.1.6
jiter 0.11.0
jmespath 1.0.1
joblib 1.5.2
json5 0.12.1
jsonlines 4.0.0
jsonpatch 1.33
jsonpointer 3.0.0
kiwisolver 1.4.9
langdetect 1.0.9
langid 1.1.6
latex2sympy2_extended 1.10.2
Levenshtein 0.27.1
ltp 4.2.14
ltp-core 0.1.4
ltp-extension 0.1.13
lxml 6.0.2
markdown-it-py 4.0.0
MarkupSafe 3.0.2
math-verify 0.8.0
matplotlib 3.10.6
mdurl 0.1.2
mkl_fft 1.3.11
mkl_random 1.2.8
mkl-service 2.4.0
mmengine 0.10.7
mmengine-lite 0.10.7
mpmath 1.3.0
multidict 6.6.4
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.4.2
nltk 3.9.1
numpy 1.26.4
nvidia-cublas-cu12 12.9.1.4
nvidia-cuda-cupti-cu12 12.9.79
nvidia-cuda-nvrtc-cu12 12.9.86
nvidia-cuda-runtime-cu12 12.9.79
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.4.1.4
nvidia-cufile-cu12 1.14.1.1
nvidia-curand-cu12 10.3.10.19
nvidia-cusolver-cu12 11.7.5.82
nvidia-cusparse-cu12 12.5.10.65
nvidia-cusparselt-cu12 0.7.1
nvidia-nccl-cu12 2.27.3
nvidia-nvjitlink-cu12 12.9.86
nvidia-nvtx-cu12 12.9.79
openai 1.56.2
OpenCC 1.1.9
opencompass 0.5.0
opencv-python 4.11.0.86
opencv-python-headless 4.11.0.86
packaging 25.0
pandas 1.5.3
patsy 1.0.1
Pillow 9.4.0
pip 25.2
platformdirs 4.4.0
portalocker 3.2.0
prettytable 3.16.0
proces 0.1.7
propcache 0.3.2
protobuf 6.32.1
psutil 7.1.0
py 1.11.0
pyahocorasick 2.2.0
pyarrow 21.0.0
pybind11 3.0.1
pycparser 2.23
pycryptodome 3.23.0
pydantic 2.11.9
pydantic_core 2.33.2
pyext 0.7
Pygments 2.19.2
PyJWT 2.8.0
pyparsing 3.2.5
pyphen 0.17.2
pypinyin 0.55.0
PySocks 1.7.1
python-dateutil 2.9.0.post0
python-dotenv 1.1.1
python-Levenshtein 0.27.1
pytorch-triton 3.1.0+cf34004b8a
pytz 2025.2
PyYAML 6.0.2
rank-bm25 0.2.2
RapidFuzz 3.14.1
rdkit 2025.3.6
regex 2025.9.18
requests 2.32.5
retry 0.9.2
retrying 1.4.2
rich 14.1.0
rouge 1.0.1
rouge-chinese 1.0.3
rouge_score 0.1.2
s3transfer 0.6.2
sacrebleu 2.5.1
safetensors 0.6.2
scikit-learn 1.5.0
scipy 1.15.3
seaborn 0.13.2
sentence-transformers 5.1.1
setuptools 78.1.1
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
soupsieve 2.8
spark-ai-python 0.4.5
sseclient-py 1.7.2
starlette 0.48.0
sympy 1.13.3
tabulate 0.9.0
tenacity 9.1.2
tencentcloud-sdk-python 3.0.1470
termcolor 3.1.0
textstat 0.7.10
threadpoolctl 3.6.0
tiktoken 0.11.0
timeout-decorator 0.5.0
tokenizers 0.22.1
toml 0.10.2
tomli 2.2.1
torch 2.8.0+cu129
torchvision 0.23.0+cu129
tqdm 4.67.1
transformers 4.56.2
tree-sitter 0.21.3
tree-sitter-languages 1.10.2
triton 3.4.0
typer 0.19.2
typing_extensions 4.15.0
typing-inspection 0.4.1
urllib3 1.26.20
uvicorn 0.37.0
volcengine 1.0.201
volcengine-python-sdk 4.0.22
wcwidth 0.2.14
websocket-client 1.8.0
websockets 15.0.1
wheel 0.45.1
wonderwords 2.2.0
wordninja 2.0.0
xxhash 3.5.0
yapf 0.43.0
yarl 1.20.1
zhipuai 2.1.5.20250825
zhon 2.1.1
zipp 3.23.0
`
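One thing that stands out in the list above is that several distributions that, as far as I know, install the same import package are present side by side: `opencv-python` and `opencv-python-headless` (both provide `cv2`), `mmengine` and `mmengine-lite` (both provide `mmengine`), and `pytorch-triton` and `triton` (both provide `triton`). Whether any of these is related to the double free is not established here; it is only a candidate to rule out. A rough sketch for flagging such pairs from `pip list` output, where the `OVERLAPPING` table and the helper names are my own assumptions and not taken from any tool:

```python
# Pairs of distributions assumed (not verified here) to ship the same
# top-level import package. Extend or correct as needed.
OVERLAPPING = [
    ("opencv-python", "opencv-python-headless"),
    ("mmengine", "mmengine-lite"),
    ("pytorch-triton", "triton"),
]


def parse_pip_list(text):
    """Parse plain `pip list` output into {lowercased name: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, separator rows, and blank lines:
        # a real entry has a version that starts with a digit.
        if len(parts) >= 2 and parts[1][0].isdigit():
            versions[parts[0].lower()] = parts[1]
    return versions


def find_overlaps(versions):
    """Return the pairs from OVERLAPPING that are both installed."""
    return [(a, b) for a, b in OVERLAPPING if a in versions and b in versions]


# Excerpt of the list posted above
sample = """\
Package Version Editable project location
mmengine 0.10.7
mmengine-lite 0.10.7
opencv-python 4.11.0.86
opencv-python-headless 4.11.0.86
pytorch-triton 3.1.0+cf34004b8a
torch 2.8.0+cu129
triton 3.4.0
"""

installed = parse_pip_list(sample)
for a, b in find_overlaps(installed):
    print(f"{a} {installed[a]} and {b} {installed[b]} are both installed")
```

If a pair is flagged, uninstalling both members and reinstalling only one in a fresh environment is a cheap way to check whether the abort goes away.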