Hacker News | remexre's comments

I can't tell if you're hyperbolizing the idea of community, or describing schooling.

No need to apply critical or creative thought when the downvote button is right there. Go on, just give it a click and carry on with your day so we don’t have to think about the bad things.

If I interpret "a machine superintelligence" as "a classroom of 300-IQ humans," I'm not really sure how this is true? You still have material and energy constraints; you can't think your way out of those.

For the concrete problem we're discussing, you can hack your competitors out of existence, replace all of your knowledge workers to shed costs, hyperoptimise your logistics, etc. It's not just intelligence, it's speed and scale.

Bostrom's Superintelligence (2014) is a bit of a dreary read, and I didn't finish it, but it pulls no punches about the leverage that a superintelligence might have in our highly-connected world.


> For the concrete problem we're discussing, you can hack your competitors out of existence, replace all of your knowledge workers to shed costs, hyperoptimise your logistics, etc. It's not just intelligence, it's speed and scale.

For the concrete problem we're discussing, that hypothetical belongs in a Marvel movie, not reality. In the real world, you can't 'hack your competitors out of existence', and you'll be going to prison very quickly for trying this sort of thing.


I did say

> especially if you're willing to break the law / normal operating decorum

in my original post. If you have a superintelligence, you have something that can find and take advantage of every exploitation vector in parallel - technical, social, bureaucratic - and use that to destroy a company from the inside. A superintelligence that is subservient to its operator is an informational superweapon.

I agree that this sounds fanciful, but you can see what existing cyberattacks can do to organisations; it does not take that much imagination to gauge how much worse it could be when the process can be automated and scaled.


> A superintelligence that is subservient to its operator is an informational superweapon.

The five dollar wrench attack will put an end to that operator's use of an informational superweapon.

> I agree that this sounds fanciful, but you can see what existing cyberattacks can do to organisations

What can it do? Generally, a minor disruption to operations.

It consistently does a lot less than what law enforcement can do to you if you start messing with other rich people's money, while having enough of a presence to own a superintelligence and a trillion-dollar data center.


Within a day - well before any legal or societal force could intervene - a superintelligence could make its way into every part of an organisation's internal network and tear it apart from the inside.

Conventional hackers are limited by the serial nature of their work - finding breaches, exploiting them, conducting further exploration of the network, trying not to get detected - in ways that a superintelligence would not be. The latter could be a hundred times as effective, a hundred times as fast, and a hundred times more parallel.

I agree that this is unlikely to happen because the societal bill would come due in time, but my point is that a month's lead is enough to do significant and lasting damage.


the one where i think of a particular piece of work, and i know who did it, then tell a student "oh, see if $author's group published anything else about this."

i'm not using software for this if this is off the top of my head, and it's the sort of thing that, at scale, hurts the forgotten author and their students


There’s a cute study demonstrating this effect by comparing career success in economics and psychology.

The author lists for economics papers are traditionally alphabetized, so more of your output will be known by your name if it occurs early in the alphabet. Abbie Ableson gets lots of mentions as "Ableson et al." while Zhang Zhu will almost always be relegated to the "et al." If name recognition matters, you'd expect successful academic economists to be clustered at the beginning of the alphabet, and this appears to be true.

In most psychology journals, the author list is instead ordered by contribution/seniority, and this effect disappears. https://www.aeaweb.org/articles?id=10.1257/08953300677652608...


I see. The informal credit assignment process is something that only runs inside of your head.

Right, academics who delegate their entire intellectual life to GPT will be unaffected.

Right, and everyone else is unaware of this made-up "informal credit assignment process".

I don’t know that everyone would label it like that, but it’s inarguably true that success in academia comes from your reputation/name recognition.

Metrics are often attempts to formalize this but they’re not how most people actually make decisions: nobody is inviting seminar speakers or choosing collaborators because they have a high h-index. If anything, it goes the other way: name recognition gets you invited to speak or collaborate, which makes more people aware of your work, which boosts metrics.


That is false. The first thing everyone (at least everyone in CS; I don't know about other fields) looks at are h-indexes, impact factors, number of papers per year, university rankings, and similar metrics. Researchers are most definitely selecting collaborators with a high h-index.


pidfd, eventfd, AF_NETLINK, epoll, memfd, timerfd?


you're thinking of the programs in low-level langs that survived their higher-level-lang competitors; if you plot the programs on your machine by age, how does the lowest quartile compare on reliability between programs written in each group?


Survivorship bias is exactly right.

The C and assembly programs we still use are the ones that were good enough to last. The thousands that weren't are gone.

Nobody counts the programs that were never finished because the language made them too hard to write in the first place.


my impression is that most CL these days is existing large closed-source codebases, hence the price tag for those compilers (you're not trying it out for a bit, you're funding the compiler devs to work full-time on the issues you're actually having), and the relatively little open-source activity for "finished" things -- if you're developing against internal libraries, it's hard to open-source just the part you intend to

(work at a CL shop; mostly SBCL users, but maybe 1/3 of people are die-hard ACL fans)



how does this compare to MoSA (arXiv:2505.00315)? do you require that there's a single contiguous window? and do you literally predict on position, or with a computed feature?


I predict a specific location, then put a window around it. You can also predict a different location per head, or multiple window locations per head. The cost is negligible (a single emb×1 linear), so attention becomes a fixed cost per token, just like traditional windowed attention.

Of course this doesn't solve memory consumption, because you still have a KV cache, unless you only do attention over the initial embeddings, at which point you don't need the cache, just the token history. That's the tack I'm taking now, since I have other ways of providing long context at deeper layers that remain O(1) for token prediction and are parallelizable like standard attention.

I think this kind of architecture is the future: infinite context, fixed-size state, O(1) prediction, and externalized memory are all possible, and together they break the current context, memory, and compute problems. It's clear that token caching will be dead once these kinds of models (mine or someone else's with the same properties) are properly tuned and well trained.
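The idea above can be sketched in a few lines of NumPy. This is a hypothetical single-head toy, not the commenter's actual code: each query predicts a window center via one extra linear map (`w_loc`), and attention is masked to that window, so per-token cost is fixed regardless of sequence length.

```python
import numpy as np

def windowed_attn(q, k, v, w_loc, window=8):
    """Toy attention where each query predicts a window center and
    attends only inside that window. q, k, v: (T, d); w_loc: (d,)."""
    T, d = q.shape
    # Predict a center position per query with a single extra matvec,
    # squashed into [0, T-1] by a sigmoid.
    centers = (T - 1) / (1 + np.exp(-(q @ w_loc)))
    scores = q @ k.T / np.sqrt(d)                       # (T, T)
    # Mask everything outside each query's predicted window.
    pos = np.arange(T)[None, :]
    mask = np.abs(pos - centers[:, None]) <= window / 2
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

A real implementation would compute only the in-window scores (rather than masking a full T×T matrix) to actually realize the fixed cost, and could predict one center per head or several per head as described.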


c++ certainly also has, and needs, a similarly "sufficiently smart compiler" just to be compiled at all…


https://en.wikipedia.org/wiki/Lexer_hack

Make your parser call back into your lexer, so it can pass state to it; make the set of type names available to it.
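A toy illustration of that feedback loop (Python, names are made up): the ambiguity in C is that `T * x;` is a multiplication unless `T` is a known typedef name, so the parser registers typedef names with the lexer as it sees them.

```python
# Sketch of the "lexer hack": the parser feeds declared type names
# back into the lexer, so `T * x;` tokenizes as a declaration rather
# than a multiplication once `typedef int T;` has been parsed.

class Lexer:
    def __init__(self):
        self.type_names = set()   # state shared with the parser

    def classify(self, word):
        return "TYPE_NAME" if word in self.type_names else "IDENTIFIER"

lexer = Lexer()
before = lexer.classify("T")      # "IDENTIFIER": `T * x;` is a multiply

# The parser just consumed `typedef int T;` and calls back in:
lexer.type_names.add("T")
after = lexer.classify("T")       # "TYPE_NAME": `T * x;` is a declaration
```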

