quinn

jwhe

AI & ML interests

None yet

Recent Activity

new activity 6 days ago

harborframework/parity-experiments:[Parity] CL-bench: codex/gpt-5.2 vs infer_codex.py (50 tasks, 3 trials, MATCHING)

new activity 16 days ago

harborframework/parity-experiments:[Parity] CL-bench: codex/gpt-5.1 vs original pipeline (50 tasks, 3 trials)

authored a paper 2 months ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

New activity in harborframework/parity-experiments 6 days ago

[Parity] CL-bench: codex/gpt-5.2 vs infer_codex.py (50 tasks, 3 trials, MATCHING)

#230 opened 6 days ago by

New activity in harborframework/parity-experiments 16 days ago

[Parity] CL-bench: codex/gpt-5.1 vs original pipeline (50 tasks, 3 trials)

#210 opened 16 days ago by