I know. I know. Part of it is that talking to patients is, on average, useless, but that still can't really be used as an argument against AI.
Still, doctors can form a broader picture of the situation, since they can look at the patient as a whole; something the LLM can't really synthesize in its context.
I think you'd be incredibly surprised how often charts are super, super incomplete or wrong. Like "pt has no pancreas and presented with pain and weeping from a six-year-old pancreatectomy scar," but the chart doesn't mention the surgery or the entire missing organ. Wrong. Like "pt is a twin whose sibling died traumatically of cancer in front of them a year ago and presents with probable hypochondria about cancer," but the chart doesn't mention any family history. Wrong. Like "lifelong history of severe cognitive impairment substantiated by a psych eval; attended an annual physical before being sent to imaging for head trauma because of observed impairment," but the chart mentions neither the cognition (someone was too polite to note it) nor the psych eval (records sharing wasn't allowed). Wrong.
Those are a very few examples off the top of my head. I worked in EMR. I don't know shit about medicine, but man, do I know a lot about the complaints physicians and their staff send when they think it's the records system's fault that the chart was wrong or missing info.
In a big chunk of cases, the MD/NP/whatever's in-person role is figuring out what's *not* on the chart so they can then ask appropriate follow-up questions. Given the massive range of possible dx for a given issue, and how much of getting the right dx doesn't come down to the probabilities/counts of similar patients with the same symptom:dx pairs that would be in the training set, I have major doubts that an LLM can appropriately intuit, or appropriately question, in order to diagnose.
It's not. And LLMs don't do well with YAML either. I've had the agent/model struggle with `sed`, trying multiple times to count how many spaces are in there just to get the file to pass. It's about the worst format you can use with LLMs.
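For context, the "counting spaces" bookkeeping that trips models up looks something like this; a toy sketch (the YAML snippet and the off-by-one indent are made up for illustration):

```python
# Count leading spaces per line of a YAML snippet -- the kind of
# bookkeeping an agent ends up doing with `sed` when indentation is off.
yaml_text = """\
services:
  web:
    image: nginx
   ports:        # <- 3 spaces instead of 4: invalid nesting
      - "80:80"
"""

indents = [len(line) - len(line.lstrip(" ")) for line in yaml_text.splitlines()]
print(indents)  # [0, 2, 4, 3, 6]
```

A one-space slip like the `3` above is invisible in most diffs but changes (or breaks) the document structure, which is exactly why whitespace-significant formats are hostile to token-by-token editing.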
Dear Claude,
I hope this email finds you well.
I am writing to ask if you could please do another task for me.
Start by running `npx @acai.sh/cli skill`.
This will teach you everything you need to know about our process for spec-driven development. Then, proceed to plan and implement the features specified in our spec files.
Love,
[your-name]
Honestly, I can no longer tell parody from reality. Whether in politics or AI.
Not sure why he is being called out for this (or maybe he edited his comment?), but I re-read it a couple of times and he is not saying Iran is an Arab country; he's comparing it to the other Arab countries.
I'm not interested in whatever it was you were debating. You referred to Arab countries, someone said (correctly) Iran isn't Arab, you said "I understand it's not an Arab monarchy", and I was moved to point out it's not an Arab anything. I'm not a party to whatever other debate you believe is happening here.
OpenAI has GPT-5.5 Pro, whose only difference, I think, is the price. Billing is through OpenRouter, but the breakdown is roughly:
- GPT-5.5 Pro: so expensive it makes no sense (around $2 per run)
- Gemini/Opus: $0.2/$0.1. Opus was cheaper because it consumed fewer tokens
- DeepSeek/GLM: $0.019/$0.021, roughly 5-10x cheaper than Gemini and Opus
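For what it's worth, plugging in the per-run prices quoted above (my own arithmetic, not from any official pricing page) gives roughly those ratios:

```python
# Rough cost ratios from the per-run prices quoted above (assumed figures)
gemini, opus = 0.20, 0.10
deepseek, glm = 0.019, 0.021
print(round(gemini / deepseek, 1))  # 10.5
print(round(opus / glm, 1))         # 4.8
```

So "5-10x cheaper" is about right depending on which pair you compare.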
The example Simon generated just shows that larger models don't necessarily produce better results.
I am the founder of https://codeinput.com, a product focused on reducing friction during the development cycle. This means merge conflicts, slow/broken CI pipelines, and branching strategies that don't scale or become too chaotic to manage. I'm taking on consulting engagements covering CI/CD pipelines, code ownership, and branch management. Currently focused on teams using GitHub only. First exploratory hour is free.
That is bad architecture, then; the client should be able to detect on its own that a node is offline and connect to the next available node, chosen by region or speed.
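A minimal sketch of the client-side failover being described, assuming a plain TCP connection and a hypothetical list of nodes already sorted by preference (region or measured latency); none of this comes from any real client library:

```python
import socket

def connect_with_failover(nodes, timeout=2.0):
    """Try each (host, port) in preference order (e.g. pre-sorted by
    region or measured latency), skipping nodes that are offline.
    Hypothetical helper for illustration only."""
    for host, port in nodes:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # node unreachable -- fall through to the next one
    raise ConnectionError("all nodes are offline")
```

In a real deployment you'd pair this with periodic health checks and retry/backoff rather than a single pass over the list.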
Subsidies stop when LLM improvement plateaus (though they still benchmark higher somehow). At some point you have to make money, or at least break even; and I think they've concluded we've reached that point.
Search has become so bad that I also struggled to find Claude Code alternatives, so I made my own tight list (no editors, no plugins, no agents; strictly tools similar to the Claude Code CLI): https://github.com/omarabid/cli-llm-coding
The list is not long but there are quite a few options. Even Grok has its own CLI!
The reality is that even though a CLI prompt looks very simple, it's a very complex piece of software. I personally use Claude Code (with GLM), and everything else I've tried was significantly inferior (with the exception of opencode).