
> there is no quantitative measure of performance here

Have them do multiplication or other complicated arithmetic. You say that isn't difficult. Then why do they burn 200k tokens in 20 minutes without converging? I wrote up a deep exploration to help myself understand this [0].
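One way to make that quantitative, as a minimal sketch: generate random multi-digit multiplication problems and score the answers against exact integer arithmetic. The names `make_problems`, `score`, and `ask_model` are hypothetical, and `ask_model` is only a placeholder for whatever model call you would actually test. It is not taken from the linked write-up.

```python
import random


def make_problems(n, digits, seed=0):
    """Generate n pairs of random integers, each with exactly `digits` digits."""
    rng = random.Random(seed)  # seeded so the benchmark is reproducible
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    return [(rng.randint(lo, hi), rng.randint(lo, hi)) for _ in range(n)]


def score(problems, answers):
    """Return the fraction of answers that exactly equal the true product."""
    correct = sum(1 for (a, b), ans in zip(problems, answers) if ans == a * b)
    return correct / len(problems)


def ask_model(a, b):
    """Hypothetical placeholder for a model call; here it just multiplies exactly."""
    return a * b


if __name__ == "__main__":
    problems = make_problems(n=20, digits=12)
    answers = [ask_model(a, b) for a, b in problems]
    print(f"exact-match accuracy: {score(problems, answers):.2f}")
```

Python integers are arbitrary precision, so the reference product is exact at any digit count. Sweeping `digits` upward would show where accuracy starts to fall off.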

[0] https://adamsohn.com/reliably-incorrect/
