Self-organizing systems are an area of research to which I think LLMs will contribute immensely.

But as of now, even newer AI models are not particularly insightful. I'm always surprised by how suboptimal near-frontier LLMs are at collaborating in some of the easier cooperative environments on my benchmarking and RL platform. For example, check out a replay of consensus grid here: https://gertlabs.com/spectate


While it's interesting, it's not clear to me from just looking at consensus grid how the models are prompted.

Do you tell them to think and coordinate the next step through some type of sync/talking mechanism, or is it turn by turn?

I suspect it's turn by turn, as in other similar experiments. In that case coordination wouldn't work, because the models wouldn't have any time to think about the next step together?


All of our environments are tick-based (with ticks of varying speeds), and this is explained in the prompt given to the models, along with the latest observation and a history of recent events, conversations, and actions.
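To make the tick-based setup concrete, here is a minimal sketch of how a per-tick prompt could be assembled from the rules, the latest observation, and a bounded history. The class name, field names, and the history length of 5 are all assumptions for illustration; this is not the platform's actual prompt format, which is not public.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Per-agent state carried between ticks (hypothetical structure)."""
    rules: str
    # Only the most recent events are kept; 5 is an arbitrary choice.
    history: deque = field(default_factory=lambda: deque(maxlen=5))

    def record(self, event: str) -> None:
        """Store an event, conversation line, or action for later prompts."""
        self.history.append(event)

    def build_prompt(self, tick: int, observation: str) -> str:
        """Combine rules, current tick, latest observation, and recent history."""
        lines = [
            self.rules,
            f"Tick: {tick}",
            f"Observation: {observation}",
            "Recent events:",
        ]
        lines += [f"- {event}" for event in self.history] or ["- (none)"]
        return "\n".join(lines)
```

Each tick, the environment would call build_prompt once per agent, send the result to the model, and record the chosen action, so older events fall out of the window automatically.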

So that does make the game more challenging, versus some other simulations we have where multiple conversation turns happen before action. But the inefficiencies I'm describing are different; for example, an agent reaches part of the destination area but is clearly blocking another player who needs to pass, and most models will just stay put instead of moving along to another target spot.


So is "Game Overview" the prompt? Because I can't see any indication or hint given to the models that it's a game they should work together on, communicate in, etc.

No, the full prompt is not available in the UI, sorry.

Have you tried recursive self-reflective agents?

The agent makes a copy of itself in /tmp/. Runs. Evaluates. Updates itself. Makes a copy of itself. Runs. Evaluates. Updates itself. Makes a ...... you get the idea.

They will not stop if the recursion is given a hard-to-meet termination condition. Also, if the agent can cheat to satisfy the termination condition, it will.
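A minimal sketch of that copy, update, evaluate loop, with two guards for the failure modes just mentioned. Everything here is an assumption for illustration: the function name, the callables, and the idea of treating the agent as a single file. The evaluator is passed in from outside the agent's working directory so the copy cannot simply rewrite it, and max_generations caps the loop when the target score is never reached.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def self_improve(
    agent_path: Path,
    evaluate: Callable[[Path], float],
    update: Callable[[Path, float], None],
    target_score: float,
    max_generations: int = 10,
) -> tuple[Path, float, int]:
    """Copy the agent, let it update itself, re-evaluate, and repeat.

    Stops when the score reaches target_score OR after max_generations,
    so a hard-to-meet target cannot keep the loop running forever.
    """
    workdir = Path(tempfile.mkdtemp(prefix="agent-"))  # scratch space under the temp dir
    current = agent_path
    score = evaluate(current)
    generation = 0
    while score < target_score and generation < max_generations:
        generation += 1
        candidate = workdir / f"gen{generation}{agent_path.suffix}"
        shutil.copy(current, candidate)   # the agent makes a copy of itself
        update(candidate, score)          # the copy is modified in place
        current = candidate
        score = evaluate(current)         # scored by code the agent does not own
    return current, score, generation
```

The hard cap is the important design choice: without it, termination depends entirely on a condition the agent may never satisfy honestly.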


I have not run one personally, but I love the idea. Reminds me of yoyo-evolve. My friend made this repo: https://github.com/dwolner/cosmic-insight


