add MMLU benchmark val / test #1

LZHgrla · 2023-07-13T02:55:46Z

Adds MMLU benchmark evaluation on the val and test splits. The implementation follows the approach of the original MMLU implementation and of QLoRA.

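| Model | Split | Method | Original Impl. | Ours |
| --- | --- | --- | --- | --- |
| LLaMA 7B | val | zero-shot | 32.3 | 33.3 |
| LLaMA 7B | val | five-shot | 33.6 | 33.2 |
| LLaMA 7B | test | zero-shot | 32.5 | 32.9 |
| LLaMA 7B | test | five-shot | 35.1 | 35.6 |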

Note: We use the preprocessed MMLU dataset from QLoRA, instead of the original dataset.
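For readers unfamiliar with the zero-shot and five-shot settings in the table, here is a minimal sketch of how an MMLU-style prompt is typically assembled and scored. It is an illustration only, not the code added in this PR: the names `format_example`, `build_prompt`, and `accuracy`, and the `(question, options, answer)` record layout, are hypothetical. The header sentence follows the wording used by the original MMLU evaluation script as far as I recall, and the preprocessed QLoRA data may format things differently.

```python
from typing import Optional, Sequence, Tuple

CHOICES = ("A", "B", "C", "D")

# Hypothetical record layout: (question, four options, answer letter).
Example = Tuple[str, Sequence[str], str]


def format_example(question: str, options: Sequence[str],
                   answer: Optional[str] = None) -> str:
    """Render one question; leave the answer blank for the query item."""
    lines = [question]
    lines += [f"{letter}. {opt}" for letter, opt in zip(CHOICES, options)]
    lines.append("Answer:" if answer is None else f"Answer: {answer}")
    return "\n".join(lines)


def build_prompt(subject: str, shots: Sequence[Example],
                 question: str, options: Sequence[str]) -> str:
    """Zero-shot when `shots` is empty, five-shot when it holds five examples."""
    header = ("The following are multiple choice questions "
              f"(with answers) about {subject}.")
    blocks = [header]
    blocks += [format_example(q, o, a) for q, o, a in shots]
    blocks.append(format_example(question, options))
    return "\n\n".join(blocks)


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of predicted letters that match the reference letters."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)
```

The only difference between the two settings in this sketch is the number of solved examples placed before the query question; the scoring step is the same for both.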

* add dataset pipeline doc * add dataset pipeline doc * fix bugs * fix bugs * refine doc * fix bugs * Update README.md * Update README.md * update docs (#1) * Update README.md * fix pre-commit * rename xTuner to XTuner * Update README.md * Update README.md * Update README.md * Update README.md * fix pre-commit * Update README.md * Update README.md * Update README.md * Update README.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update chat.md * Update finetune.md * Update finetune.md * Update chat.md * fix pre-commit * add zh_cn chat and finetune doc * Update chat.md * Update README.md * del tool_usage * Update README.md * Update chat.md * Update chat.md * Update README.md * Update README.md * Update README_zh-CN.md * Update README.md * Update README_zh-CN.md * fix pre-commit * Update README_zh-CN.md * Update README.md * Update README_zh-CN.md * Update README_zh-CN.md * Update README_zh-CN.md * Update README_zh-CN.md * refactor data pipeline doc * add colorist llama2 * fix incremental pretraining doc --------- Co-authored-by: LZHgrla <[email protected]> Co-authored-by: LZHgrla <[email protected]>

* add mmlu dataset configs * add mmlu metric * fix bugs * implement predict for sft model * remove dummy file * clean code * modify prefix for mmlu test * add test.py * add mmlu val/test for gunaco config * use float16 for gunaco * add METAINFO and add logger

LZHgrla added 10 commits July 12, 2023 17:37

add mmlu dataset configs

cc5a8ad

add mmlu metric

d20bc9d

fix bugs

1a5e29e

implement predict for sft model

1543518

remove dummy file

3ddcd62

clean code

8c5eb18

modify prefix for mmlu test

5b8447e

add test.py

d7ae93b

add mmlu val/test for gunaco config

1c37dbd

use float16 for gunaco

4338c9b

LZHgrla requested a review from pppppM July 13, 2023 03:01

add METAINFO and add logger

2ef387e

LZHgrla merged commit e840b6c into InternLM:main Jul 14, 2023

LZHgrla deleted the lzh/add_mmlu branch July 21, 2023 07:10

apachemycat mentioned this pull request Jun 26, 2024

Single-node multi-GPU training hangs, and the logs do not reveal the problem #792

Open
