That thing 2025/7/5 0:51:00 uv run gen_judgment.py --bench-name japanese_mt_bench --model-list xxxxxxxxx --judge-file data/judge_prompts.jsonl
『Japanese MT-bench++: A Large-Scale Japanese Benchmark with More Natural Multi-Turn Dialogue Settings』 2025/6/22 22:50:00 https://www.anlp.jp/proceedings/annual_meeting/2025/pdf_dir/D9-1.pdf
『FastChat/fastchat/llm_judge at jp-stable · Stability-AI/FastChat』2025/6/22 22:39:00 https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge