Temperature test and results #43

fxlrnrpt · 2025-11-23T13:48:38Z

No description provided.

SemyonEpanov · 2025-11-24T17:41:26Z

src/postprocessing/jsonl_count_accuracy.py

+    parser = argparse.ArgumentParser(description="Count accuracy for JSONL files with gold and answer fields")
+    parser.add_argument("jsonl_path", type=str, help="Path to the JSONL file")
+
+    args = parser.parse_args()


Do we allow CLI use for information/description files?

fxlrnrpt (reply, Nov 26, 2025): It is not really an experiment but a post-processing tool, so it should be fine in my view.
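For context, a post-processing CLI of this kind might look roughly like the sketch below. Only the argparse lines and the `gold` / `answer` field names come from the diff above. The `count_accuracy` helper, the exact-match comparison after whitespace stripping, and the choice to skip rows missing either field are assumptions made for illustration, not the script's actual logic.

```python
import argparse
import json


def count_accuracy(jsonl_path: str) -> tuple[int, int]:
    """Return (correct, total) for rows that carry both `gold` and `answer`.

    Hypothetical helper: the matching rule (exact string match after
    stripping whitespace) is an assumption, not taken from the PR.
    """
    correct = 0
    total = 0
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            row = json.loads(line)
            if "gold" not in row or "answer" not in row:
                continue  # assumption: incomplete rows are skipped
            total += 1
            if str(row["gold"]).strip() == str(row["answer"]).strip():
                correct += 1
    return correct, total


def main(argv=None) -> float:
    parser = argparse.ArgumentParser(
        description="Count accuracy for JSONL files with gold and answer fields"
    )
    parser.add_argument("jsonl_path", type=str, help="Path to the JSONL file")
    args = parser.parse_args(argv)

    correct, total = count_accuracy(args.jsonl_path)
    accuracy = correct / total if total else 0.0
    print(f"{correct}/{total} correct ({accuracy:.2%})")
    return accuracy


if __name__ == "__main__":
    main()
```

Keeping the counting logic in a plain function and the argument parsing in `main` means the tool stays usable from the command line while the function can still be imported by experiment code.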

SemyonEpanov · 2025-11-24T18:27:10Z

src/experiments/convert.py

@@ -0,0 +1,9 @@
+import pandas as pd
+
+df = pd.read_json("data/out/distillation/mmlu_synth_gptoss_a_t0_8.jsonl", lines=True)


Switching to read_ndjson (presumably Polars, since pandas has no function of that name) will save us a couple of minutes :)
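A minimal sketch of what that switch could look like, assuming the comment refers to `polars.read_ndjson` (pandas has no function of that name). The `load_jsonl_rows` helper is hypothetical, and the stdlib fallback is there only so the sketch runs when Polars is not installed. Any speed-up is the reviewer's estimate and would need to be timed on the real file.

```python
import json


def load_jsonl_rows(path: str) -> list[dict]:
    """Load a JSON Lines file as a list of row dicts (hypothetical helper)."""
    try:
        import polars as pl  # assumption: Polars is available in the environment
    except ImportError:
        # Fallback so the sketch still runs without Polars.
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    # read_ndjson parses newline-delimited JSON into a Polars DataFrame;
    # to_dicts() converts it to one dict per row.
    return pl.read_ndjson(path).to_dicts()
```

In convert.py itself the DataFrame would more likely be kept as-is rather than converted to dicts; the conversion here only makes the two code paths return the same type.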

SemyonEpanov · 2025-11-24T20:56:23Z

src/experiments/distill/mmlu_synth_qwen3_a_t0_8.py

Wouldn't it be better to stick to one file per experiment?
Files for all experiments will also be attached.
Alternative: describe key changes in the experiments in plain text (either in .md or as comments in the code itself).
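One possible reading of the "one file per experiment" suggestion is a single script parametrized by temperature, with the output name derived from the parameters. The sketch below is an illustration only: `temperature_tag` and `output_path` are hypothetical helpers, and the naming pattern is inferred from the path in convert.py (`mmlu_synth_gptoss_a_t0_8.jsonl`) and the "t0" / "t0_8" commit messages.

```python
def temperature_tag(temperature: float) -> str:
    """Format a temperature as a file-name tag, e.g. 0.8 -> "t0_8", 0.0 -> "t0"."""
    # ":g" drops trailing zeros, so 0.0 becomes "0" and 0.8 stays "0.8".
    return "t" + f"{temperature:g}".replace(".", "_")


def output_path(model: str, variant: str, temperature: float) -> str:
    """Build an output path following the pattern seen in this PR (assumed)."""
    tag = temperature_tag(temperature)
    return f"data/out/distillation/mmlu_synth_{model}_{variant}_{tag}.jsonl"


if __name__ == "__main__":
    # One experiment file could then loop over the temperatures under test.
    for t in (0.0, 0.8):
        print(output_path("gptoss", "a", t))
```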

fxlrnrpt and others added 10 commits November 17, 2025 15:42

Add t0 results

8dc6691

Add t0_8

269e387

Add temperature analysis

ed8ca7a

Add qwen3 temperature test

e763819

Add interim results for qwen

40fc546

Added error string handler

5372f58

Add qwen temperature test

f9e884d

Fix counting script

8f44781

Fix skipping

3bcfded

Add synth results

b774374

fxlrnrpt requested a review from SemyonEpanov November 23, 2025 13:48

Remove redundant experiments

2fa7863

SemyonEpanov reviewed Nov 24, 2025

Refactor synth_aug_mmlu.py to use Parquet instead of JSONL

61b3d71

fxlrnrpt merged commit 0a56823 into main Nov 26, 2025

fxlrnrpt deleted the temperature-test branch November 26, 2025 16:03
