
Oh, have we reached the end of the pendulum swing, and are we now swinging back to ordinary CSS again?

To be fair, CSS has evolved and improved a lot in recent years, and a big reason why is Tailwind.

I write Rails as well, but comparing Rails with Erlang is kind of weird; they are for completely different domains. Both can co-exist in this world. These are only tools, not a cult or part of a personality.


What's crazier is that Codex is free. I thought I had to pay to even try it out, but nope, you can use the desktop app or CLI for free; it's apparently included in the free plan. You just have to sign in to your ChatGPT account.

Of course I am aware that the caveat here is that all my interaction is part of training, but I’m fine with that. Even Qwen Cli discontinued the free plan.


First hit is free… got to get you hooked.

I think it's free for about 2 useful requests and then you have to upgrade or wait?

Switching to GPT 5.4-mini can increase the number of requests we can use freely.

So basically a $20 Claude plan lmao

I stopped using my Claude subscription because it became so prohibitive. Back to ChatGPT and Codex full time and been pretty happy. I miss the tone/writing style of Claude, but don't miss the frustration of being told I've reached my plan limits in a comically short amount of time.

Using these prompts/steering[0], setting Base style to Friendly, Warm to More, Enthusiastic to Default, Headers, Lists, and Emoji to Less, I have found I can get gpt-5.5 about ... 80% of the way there to writing as non-annoyingly as Claude. And it's so much faster and has such higher limits that that's worth it for me.

I also put together this ridiculous thing[1] because I missed the font and color scheme of Claude.

[0] https://gist.githubusercontent.com/dmd/91e9ca98b2c252a185e8e...

[1] https://github.com/dmd/aimpostor


How do you fit that entire prompt in the customized instructions?

Some of it is in my customized instructions, some of it I fed pieces in at a time saying "remember this please:" so it goes into Memories.

I'm not entirely clear on the mechanism by which memories make it into context, so it's possible some of it isn't all the time, but it does seem to be working reasonably.

Again, it's not as good as Claude when it comes to writing "not like an AI". But it's significantly better than it was.


Thanks, I’ll give those a try!

FYI I'm actively working on aimpostor, so check back in a couple days for some quality improvements. (I'm definitely not going to bother with a Sparkle updater or anything like that.)

On Codex I ran into limits maybe twice in 3 months, after doing several "upgrade this experimental game to my latest shared framework" passes on 5.5 Extra High.

On which plan?

I can go through a 5-hour limit with a $20/mo Plus subscription in a few minutes with 5.5 Extra High. This causes me to reserve the latest/best rev for the harder problems.

5.5 really does seem to be very superior to 5.4, but it's also very expensive to run: The gas gauge moves fast. It's not very clearly defined whether 5.5 will cost less to get a problem solved quickly, or if a bunch of automatic iterations of 5.4 will solve it less-expensively. Both are often frustrating to me on the $20 plan.

(Also: Are you sure you're seeing it right? 5.5 has been in the wild for less than a month, so far. https://openai.com/index/introducing-gpt-5-5/ )


The standard $20 plan, on my existing Godot code: https://github.com/InvadingOctopus/comedot

Most of the commits from the last few months are thanks to Codex reviews (but the code is not AI generated): 5.5 since it came out, and 5.4 etc. before that, almost always on Extra High, because it's for a framework that underlies the other stuff I do, so I want to make sure everything's correct.

Sometimes I have to run multiple passes on the same task: I rarely continue any session beyond 4-5 prompts, to avoid "bloat" or accumulating "stale context", so sometimes Codex finds different stuff in subsequent reviews of the same file/subsystem.

The project is modular enough that each file can be considered standalone with only 1-2 dependencies, and I already used to write a lot of comments everywhere (something some people laughed at), so maybe that helps the AI along?


Thanks. That's good data.

I'm taking this, along with my own experience, to mean that the GPTs are cheaper to use for refactors of an existing body of work than they are for creating a new one.

(And perhaps part of that is in the name? These "LLM" contraptions are very good at translation, after all. And tokens seem to relate more to concepts than to specific phrases or words.)


That's the current state of the $20 Claude plan, despite them stating twice this week that usage is better: first "double 5-hour usage", then 50% more overall usage per week.

Maybe the 50% overall is true, but the doubled usage during a 5-hour window I just don't see at all. I've maxed out three 5-hour windows since this happened, and there is zero chance it was double the normal amount; I ate up about 4-5% of my weekly total each time (this was ~10% each time pre-announcements). I wish I could give token numbers, but it's obscured; I just know it was around 120k on 4.6 with some delegation to Sonnet subagents.

So sure, it's almost certainly more allotted weekly, but if those totals are consistent for 5-hour blocks, you'd have to split your daily usage into at least 3 sessions with 5 hours between them to even hit that weekly limit. It's unreal how much they have burned their good reputation in a 2-month stretch; I am positive it's also being astroturfed with bots more than happy to advance the narrative.

The internet is annoying. These tools are overall cool; I just wish Anthropic would go back to being semi-predictable.


Can’t you just turn off training on your data in the settings?

WDYM it is free? You hit rate limits after about 10 messages.

(Side note: I like that they rate-limit messages and not tokens; at least it doesn't just stop mid-reply.)


How much better is it than Claude? I have both but Claude sucks up so many tokens.

5.5 is absolutely comparable to opus 4.7 (both on highest effort), maybe even better. It generally seems less lazy, faster, and writes code closer to what I'd write. The only downside is that for very very long tasks, it can kind of lose track of the goal. For tasks under ten minutes I'll go with codex every time.

The main difference is in the frontend skills. GPT produces terrible design. What I do these days is ask Opus to produce an HTML mockup, then feed it to Codex.

I have not had problems with long goals. I let it chomp for 40 minutes on a proof in my custom theorem prover (xhigh fast), and it got there. Very happy with Codex, I ditched Claude for it.

They've added a new goal mode that might help with that

I switched some time after Anthropic bricked their models with adaptive thinking. It's a legit mystery to me how people are still using CC professionally.

Codex is far less frustrating and manages context better. It's also costing me about 1/3rd as much as Opus 4.7 on CC.


The only way to keep using CC for me has been to stick to 4.6 1M

Oh I didn't know you could type /model claude-opus-4-6 and still use it.

Thanks!


Yes, and /model claude-opus-4-6[1m] gets you the larger context window. Happy to help :)

Thanks for the hint, but is a large context window actually that useful? I tend to get garbage too often with a normal big context window.

IME, based on an in-house bench it's still good to about 20% on the 1M for 4.6 and 4.7 with a code base >50k loc. The trick I used before switching providers was to have it write a handoff when it hit ~18% of context and reset.

There are also many people running 4.5 with specific parameters that claim to be having luck.


The best way is to code in a way that doesn't break when model instruction nuances change. 4.7 is superior.

I stopped trying to use Claude to do anything with 4.7 because it sucks up so many tokens so quickly. I still use the 4.6 model and have switched to Codex for larger tasks. It also works better than Claude at more complex coding tasks, like web apps that have Python backends and TypeScript front ends.

I've been on the codex train for a few months now for personal stuff, but have Claude at work. I always tell people it's as good if not better than CC, but it has different strengths and weaknesses.

Claude was more autonomous and still is a little, but I think GPT 5.5 closed that gap a lot. Claude is far better at front end design. I think it's still better at big picture planning.

Codex is far better at code review and catching bugs that actually matter. I think it's better at following directions, although I think that regressed a bit with 5.5 (flip side of the autonomy I mentioned earlier). A lot of CC users claim to not like Codex's personality (or lack of), but personally I prefer it.


I just like that OpenAI lets you use your Codex subscription with whatever harness you like. I prefer Pi, so that's what I use. GPT 5.5 xhigh feels equivalent to Opus to me, so there's no reason for me to be locked into the Claude Code CLI. I use it off and on throughout my workday and never even come close to the Pro limits.

I found it actually thinks about architecture and tests rather than just spitting out code with TODOs in it like Claude.

Compaction is basically seamless, which is a major weak point of Claude. At effort=low, Claude is better than Codex but still slower. If you don't mind trading upfront quality of work for additional micromanaging at a faster speed, it is fine. I also think that, for that very reason, you absorb more of the code.

Less gibliterrating and more doing

Very fast


I was really unimpressed by the free Codex (for nodejs/react dev). I think it must be using a less powerful model or they’re limiting it in some other way.

Are you specifically pointing at a different experience between free + paid? Or just that the free version is unimpressive?

I'm using paid on TypeScript and it's genuinely terrific. Subjectively I think it has the edge over Opus.

I'd be surprised if OpenAI is hamstringing the free version. That would seem crazy from a GTM PoV. If anything the labs seem to throttle the heavy paid users.


Yes, the free version doesn't have access to the same models that the paid does.

You have access to 5.5 xhigh on free. Which model is missing, except the 5.3 that runs on Cerebras?

It's only missing the trash models. Likely a user skill issue.

The free version of ChatGPT is definitely worse as well. My SO uses the free version and I can tell it's a significant downgrade.

Post your chat session

Can Codex chats be shared? (This is a genuine question; so far, I've only used Codex in CLI on Linux.)

Via jsonl file
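If you want to turn one of those session files into something readable to share, a few lines of scripting go a long way. This is only a sketch under assumptions: I'm assuming the log is JSON Lines with one event per line and that chat events carry `role` and `content` fields; the real location and schema of Codex's session files may differ, so adjust accordingly.

```python
# Hedged sketch: extract a shareable transcript from a JSONL session log.
# The "role"/"content" field names are assumptions, not a documented schema.

import json

def transcript(jsonl_text):
    """Return (role, content) pairs for chat-like events, skipping the rest."""
    messages = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        if "role" in event and "content" in event:
            messages.append((event["role"], event["content"]))
    return messages

# Tiny inline sample standing in for a real session file:
sample = "\n".join([
    '{"role": "user", "content": "fix the bug"}',
    '{"type": "tool_call", "name": "shell"}',
    '{"role": "assistant", "content": "done"}',
])

for role, content in transcript(sample):
    print(f"{role}: {content}")
```

Pointing `transcript()` at a real session file instead of the inline sample would give a plain-text log you can paste anywhere.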

> I was called in to explain "academic dishonesty" from apparently copying results from a former student who I had never met. I truly had no idea where Qwen got the results, I just cared that it was correct. I told them I found it on StackOverflow, which they believed and let me off with just one failed class to retake.

How did this happen? I assumed Qwen generated all its content, even when it's using web search or sifting through a document.

> So what did I learn? Yes you can use AI, and you don't need to burn thousands of dollars on Claude, when GLM and Qwen are subsidized by the Chinese government and available for free.

This is so true. I've gotten tons of help from Qwen, Kimi and GLM (Z.ai) when I have gotten stuck. All for free! We are at a point where we have access to very powerful models today. Especially the Qwen models, which have very good vision models, truly vision models: you can upload graphs, diagrams and all kinds of images and let it analyze and discuss them. It's not like the other models that just perform OCR on the image; it truly analyzes the image. I am very thankful that I have access to these three sites for free, especially Qwen and Z.ai.


> How did this happen? I assumed Qwen generated all its content? Even when its using web search or sifting through a document.

Qwen does generate everything, so I'm not exactly sure. It wasn't anything too unique, so it could've easily learned and regurgitated it from any one of hundreds of sites.

+1 on Qwen Vision, it's far better than anything else I've seen, and it's open source, woo!

What are Western models for other than being the ethics police when you ask about something controversial, if the Chinese models are superior and lack the moral / ethical BS? Don't trust China? Self-host Chinese models or host in AWS and verify the output!


I assume it was you who wrote the article, well done! It was entertaining to read.

Yep! It's genuine human slop!

The platform it's on is open source, and anyone can write anything anonymously if that's of interest.


What's the use case for Grok over the others? Close integration with Twitter is the only thing I can come up with.

I guess it's cool if you like "white genocide" sprinkled randomly throughout the output.

I've tried Claude Code with another LLM, and it's very good at doing tasks and figuring things out. So this made me wonder: even though we know how good the Claude models are, maybe the true value is in the harness now?

> This PR has been marked as AI slop and the description has been updated to avoid confusion or misleading reviewers

> Seahorse emoji demonstrates this nicely, the LLM internally holds a semantic vector for seahorse+emoji but the output translation layer can't match it.

I am curious about this, how can the LLM hold the embedding for seahorse+emoji if it doesn’t exist? How did it end up like this? Perhaps the dataset had discussions from people about new potential emojis?


Because it's just the embedding for a seahorse plus the embedding for an emoji symbol output.
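The idea can be sketched with a toy example. To be clear, nothing below reflects real model internals: the vectors, the three-token vocabulary, and the simple additive composition are all made up for illustration. It just shows how a composed "seahorse concept + emoji style" vector can decode to a nearby *existing* emoji token when no exact seahorse-emoji token exists.

```python
# Toy illustration (not real model internals): compose a "seahorse" concept
# vector with an "emoji style" vector, then decode to the nearest token by
# cosine similarity. All vectors are invented 3-d examples.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical token embeddings; note there is no seahorse-emoji token.
vocab = {
    "fish emoji":      [0.9, 0.1, 0.8],
    "dragon emoji":    [0.6, 0.4, 0.8],
    "seahorse (word)": [0.8, 0.7, 0.1],
}

seahorse_concept = [0.8, 0.7, 0.2]   # semantic content
emoji_style      = [0.1, -0.2, 0.9]  # "render as an emoji" direction
target = [s + e for s, e in zip(seahorse_concept, emoji_style)]

# The output layer must pick the closest existing token, so it emits a
# related emoji rather than the nonexistent seahorse emoji.
best = max(vocab, key=lambda token: cosine(vocab[token], target))
print(best)
```

The interesting part is that the plain word "seahorse" loses to the emoji tokens here: the added style vector drags the composed embedding toward the emoji region of the (toy) space, which is loosely analogous to the model "wanting" an emoji it doesn't have.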

> When artificial systems produce human-like language, people may draw a reverse inference: if LLMs can speak like humans, perhaps humans think like LLMs.

I think I experienced this when I learned about LLMs, chain of thought, thinking tokens, short-term memory context, and long-term memory context. I began applying these concepts to real life and reasoning about how our brains work as if these concepts described how our brains actually function. But maybe this is more akin to the Tetris effect?


People have been doing this since the invention of clockwork. Analogies are useful, even when they're utterly wrong, since they provide a perspective and that perspective is not necessarily wrong. Who knew?
