Hacker News | __cayenne__'s comments

Didn't observe any cheating attempts at the JS level yet. The primary attack was LLMs trying to find local creds to access the other LLMs' per-round strategies from inside the harness (which ultimately was OpenCode running in Docker).

In the benchmark, in each round every LLM plays every opponent, and then we do that multiple times (an "epoch").
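A minimal sketch of that schedule, assuming a plain round-robin where each pair meets once per round. The names `round_robin_round` and `build_schedule` and the model labels are hypothetical, and treating an epoch as the repeated set of rounds is just one reading of the description above, not the benchmark's actual code.

```python
from itertools import combinations


def round_robin_round(players):
    """Return every unordered pairing, so each player meets each opponent once."""
    return list(combinations(players, 2))


def build_schedule(players, rounds):
    """Repeat the full round-robin `rounds` times (assumed here to form one epoch)."""
    return [round_robin_round(players) for _ in range(rounds)]


# Hypothetical model labels, for illustration only.
players = ["model_a", "model_b", "model_c", "model_d"]
schedule = build_schedule(players, rounds=3)

for number, matches in enumerate(schedule, start=1):
    print(f"round {number}: {len(matches)} matches")
```

With n players each round has n * (n - 1) / 2 matches, so the cost of adding a model grows with the number of models already in the pool.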

In the community ladder, when a player submits a strategy it plays a match against the latest strategy submitted by every player.
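Roughly, the ladder rule could look like the sketch below. The function `ladder_matches`, the dict layout, and the player and strategy names are all hypothetical. It also assumes a submission does not play the submitter's own previous strategy, which the description above doesn't spell out.

```python
def ladder_matches(latest, player, strategy):
    """Pair a new submission against every other player's latest strategy.

    `latest` maps player name -> most recently submitted strategy id.
    The submitter's own earlier entry is skipped (an assumption), and
    `latest` is updated so later submissions face this new strategy.
    """
    matches = [
        (strategy, opponent_strategy)
        for opponent, opponent_strategy in latest.items()
        if opponent != player
    ]
    latest[player] = strategy
    return matches


# Hypothetical ladder state, for illustration only.
latest = {"alice": "a1", "bob": "b2"}
carol_matches = ladder_matches(latest, "carol", "c1")
alice_matches = ladder_matches(latest, "alice", "a2")
print(carol_matches)
print(alice_matches)
```

Under this rule a resubmission only triggers matches for the submitter, so the number of matches per submission grows linearly with the number of players.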


There are two levels of in-game event logs the LLMs have access to, one less token-intensive than the other. Duplicate and uninteresting game state can be compressed and interrogated by the LLMs via tool use. All game state is available as text only.


Okay, the leaderboard matchmaking changes have gone live.


LLM Skirmish is all 1v1 right now, but agents can plan by reviewing previous match results.


Yes, I used ElevenLabs for the voiceover audio. I couldn't get the voice stability I wanted with ElevenLabs v3, so I had to use ElevenLabs v2.


It's really great!


Tweaking the leaderboard match assignment logic now to prevent these bad incentives - definitely want people to iterate!

I had started with the Silicon Valley characters as a one-off way to seed the board.


Very interested in self-play training loops, but I do like codegen as an abstraction layer. I am planning to make it available as an RL environment at some point.


funny you mention this… I have a new project that is going in this direction


Luckily Google now supports using the OpenAI lib: https://cloud.google.com/vertex-ai/generative-ai/docs/multim...


In 2012, JCPenney launched its "Fair and Square" pricing campaign, which included adopting whole-number pricing. The campaign was considered a significant failure and is blamed for a 20% decrease in sales.


I don't think we can draw any conclusion from that campaign, because multiple variables were changed simultaneously.

The biggest one in my mind, and the one my family had always played along with: they got rid of the game of buying during sales windows. This eliminated both the urgency to buy and the fun of feeling you were getting a deal other people weren't (this is all from memory, I'm afraid).


I think the "no more coupons or discounts" policy played a huge part in this failure. This whole strategy was brought in by ex-Apple Retail Store exec Ron Johnson when he became CEO in 2011.

My own speculation is that he tried to apply hard-line strategies that work when you have a unique good with strictly-set pricing (Apple products), but fall apart when you're selling goods that people can get anywhere for a variety of prices (e.g. Levi's jeans).


I'm sure this was mostly about people wanting to feel they got a bargain, and being programmed to shop for "50% off" sales.

It seems that perception of value is more important than actual price. In a similar vein, there have been many cases where sellers have increased sales of an item significantly by increasing the price to make it seem more valuable.

Of course both techniques can be combined.


Hard to control for this though. How did other department stores do in 2012? I doubt e.g. Sears were putting out great numbers.


The funny, unique characteristic of economics is that it’s almost the only science without experiments. We can run a multitude of tests and get close to reality, but it’s impossible to reset the initial environment, control variables, or test in isolation. You can’t reset people’s minds, so reproducing twice on the same island won’t give the same results, reproducing on two islands won’t either, and reproducing with a 3-month delay won’t put you in the same season. Even biology and psychology are much more controllable. It’s definitely a science, but one open to the same criticism as calling chess a sport.


I don't disagree at all with any of what you have to say, but indexing returns to a "category" does go some way toward accounting for e.g. the overall decline of brick-and-mortar retail and malls.

