If you're not checking citations in the paper you're publishing AND you're trusting a non-SOTA, hallucination-prone AI model to come up with sources for it, it's probably best for everyone that this paper isn't published.
Yes, there will be rare exceptions, but in general I feel like this is a really good addition.
Mistral 7B is quite outdated. On a 12GB 4070 you can run Qwen 3.5 9B at Q4_K_M, or Qwen 3.6 35B; the latter will be a lot smarter but also a lot slower due to RAM offload.
Try both in LM Studio; they really are surprisingly capable.
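If you want to script against a model loaded this way: LM Studio exposes an OpenAI-compatible API on localhost (port 1234 by default). A minimal sketch, using only the standard library; the model name is a placeholder, since LM Studio answers with whichever model you have loaded:

```python
import json
import urllib.request

# LM Studio's local server speaks the OpenAI chat-completions format.
# Default endpoint; change the port if you configured a different one.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt, model="local-model", temperature=0.7):
    """Build the JSON body for an OpenAI-style chat completion call.
    `model` is a placeholder; LM Studio uses the currently loaded model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask_local(prompt):
    """Send the prompt to the local LM Studio server, return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        LMSTUDIO_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

This makes it easy to A/B the two models from the same script: load one in LM Studio, run your prompts, swap, and run them again.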
Gemma 4 26B-A4B might be interesting to try on your machine. The latest optimizations make MoE models work pretty nicely on setups like that, with a decent GPU and lots of slowish RAM. I have a 16GB GPU and 64GB of 3200MHz DDR4 and get 15-20 tokens/sec out of that model with zero finagling or tweaking. I've been very impressed by it, even having run just about every other open-weight model that would fit on my machine over the last few years.
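For anyone wanting to reproduce this split, here's roughly what it looks like with llama.cpp's `llama-server`. The model filename is a placeholder, and the exact flag spellings come from recent llama.cpp builds, so check `llama-server --help` on your version:

```shell
# Sketch: serve a MoE GGUF with attention layers on the GPU and the large
# expert tensors kept in system RAM.
#   --n-gpu-layers 99        offload as many layers as fit in VRAM
#   --override-tensor ...    pin MoE expert weights (ffn_*_exps) to CPU/RAM
llama-server -m gemma-moe-q4_k_m.gguf -c 8192 \
  --n-gpu-layers 99 \
  --override-tensor ".ffn_.*_exps.=CPU"
```

The idea is that the experts are big but only a few are active per token, so keeping them in RAM costs much less than it would for a dense model of the same size.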
Anthropic really needs to build a great, personalized customer support experience: give the support agent a couple of dollars' worth of Opus credits, some actual authority, and the ability to resolve your issue.
"it couldnt be that simple because xyz" why not? I'm yet to see any big ai company actually try this
Great points. Using their printer "rooted" or with custom firmware seems like a decent compromise to me, kind of like what GrapheneOS is doing with Pixels.
If I had an actual need that wasn't being met, I might buy one of their printers just to root and run with custom firmware. I might just do it for the fun of it. Even with tariffs their printers are only running around $220 at Best Buy.
However, even that sounds suspiciously like a project in and of itself. I haven't had time to design and print anything in the last month. So I expect I'll keep rolling along like I am. Things could always change, though.
I wrote up a bit about my workflow here[0][1]. I'm using conductor.build to manage multiple codex sessions at once. When I hit the rate limit, I'm using codex-auth[2] to switch codex accounts.
Gemini feels deep and philosophical, especially for product management. Tell it you're a product manager and that you're a team of two.
But a regular reminder: all LLMs can be wrong all the time. I only work with LLMs in domains I'm an expert in, or where I have other sources to verify their output with utmost certainty.
Or when you don't care whether the results are exactly correct.
When I'm cooking meatballs with sauce and the recipe calls for frying them, I'll have an LLM guesstimate how long to cook them and which program to use on an air fryer to mimic the frying pan, based on a picture of the balls in a Pyrex dish. That way I can just move on with the sauce instead of spending time browsing websites and stressing about getting it perfect.
I used to hate these non-deterministic instructions; now I treat them as their own game. When I publish my first recipe, I'll have an LLM randomize the ingredient amounts, round them to some imprecise units, and randomize the times too. Psychologists say we artists need to participate, and I WILL participate.
LLMs can also be really good in fields where you are not an expert. You just need to be very aware of your limitations, and start a parallel conversation so one agent fact-checks the other.
Seriously, it's not worth reaching for less intelligence. Use Extended Pro 100% of the time for anything you'd spend as much time on as GP spent writing their post.
Agreed, Gemini is clearly a capable model, but its tool use is lagging behind the other two. Ironically, it regularly gets things wrong (e.g. the current version of some software) because of an unwillingness to use web search.
ChatGPT and Gemini are actually fairly comparable.
Claude has been utterly useless with most math problems in my experience because, much like less capable students, it tends to get overly bogged down in tedious details before it gets to the big picture. That's great for programming, not so much for frontier math. If you're giving it little lemmas, then sure it's great, but otherwise you're just burning tokens.
I don't think that's unpopular, it is pretty well written. But the "I believe" section is extraordinarily hard to believe given Altman's history.
> Working towards prosperity for everyone, empowering all people
> We have to get safety right
> AI has to be democratized; power cannot be too concentrated
None of these statements, IMO, reflect his actions over the past 5 years.
> we urgently need a society-wide response to be resilient to new threats. This includes things like new policy to help navigate through a difficult economic transition in order to get to a much better future
I agree with this, but there is a near-zero chance of that happening anytime soon in the US. I think he is probably aware of this.
Just my opinion, but it comes off as very insincere.
To be clear, what happened is still awful and there's absolutely no justification for it.
He doesn't trust it for anything else either as far as I can tell. In an interview he's boasted about how he uses a paper notebook for everything all day.
it's "written well" but not at all a smart piece of writing. leading with a photo of a cute baby before engaging in an extended defense of one's own integrity is so obvious as to be insulting