yeah but do we really need some trash reality-TV for a "shared social experience"? most of TV's programming was garbage anyway and contributed to a lot of what was/is wrong with society
this is literally just “leave a child at the work computer with a real doc open playing office”. otoh it is good to design benchmarks to ground these things.
on the flip side if you’re literally just using a bare bones harness on top of a stochastic parrot, of course stochastic errors accumulate.
theres a lot of ways to improve text faithfulness through harness tool design, and my incremental experiments seem promising.
but unless work is gated on shit like “the script used must type-check as ghc haskell or lean4”, unsupervised stuff is gonna decay
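to make that concrete, here's a minimal sketch of that kind of gate: accept a model-generated script only if an external type checker exits cleanly. the `gate_on_typecheck` name is hypothetical (not from any existing harness), and the demo uses python's own byte-compiler as a stand-in checker since ghc or lean4 may not be installed; in practice the checker would be something like `["ghc", "-fno-code"]`.

```python
import os
import subprocess
import sys
import tempfile

def gate_on_typecheck(path: str, checker: list[str]) -> bool:
    """Run an external checker (e.g. ["ghc", "-fno-code"] or a lean4
    build command) on the generated file; accept only on exit code 0.
    `gate_on_typecheck` is a hypothetical hook name for illustration."""
    result = subprocess.run(checker + [path], capture_output=True)
    return result.returncode == 0

# Demo: python's byte-compiler stands in for a real type checker.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("x = 1\n")
    path = f.name
print(gate_on_typecheck(path, [sys.executable, "-m", "py_compile"]))  # True
os.unlink(path)
```

the point of the gate is that unsupervised generation only lands when a checker the model can't talk its way past says yes.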
i mean of course. ive been working on this the past few months and ive got a bunch of tech towards this in flight, including some harness forks to layer my ideas in. eg my oh punkin pi test bed on my github.com/cartazio page, theres some shockingly obvious once-you-see-it tricks that i think i can stack into a really nice harness product for just doing hard real work with these models more easily
the funny thing is, once the llms got mostly good enough for me in november 2025, it was mind boggling how much they helped me get stuff out of my head with ease.
its easier for me to code now, because its like i have a 24/7 insane intern that needs to be supervised via pair programming but also understands most topics enough to be useful/dangerous.
ironically ive been spending much of my time iterating on ways to improve model reasoning and reliability and aside from the challenge of benchmark design, ive had some pretty good success!!
my fork of omp: https://github.com/cartazio/oh-punkin-pi has a bunch of my ideas layered on top. ultimately its just a bridge till i’ve finished the build of the proper 2nd gen harness with some other really cool stuff folded in. not sure if theres a bizop in a hosted version of what ive got planned, but the changes ive done in my forks have made enough difference that i can see the difference in per-model reasoning
im def working on benchmarks for how my own general harness improves task performance vs the same model in a commodity setup. its hard to do!
i will say that my current harness: https://github.com/cartazio/oh-punkin-pi is a testbed for a bunch of 2nd gen harness tech, largely optimized for reasoning llms only. the next one after this harness is gonna be epicccc
That (and oh-my-pi) seem like an excessive swing in the other direction. Im all for the simplicity and minimalism of pi. There are just a few fundamental things that need updating (mainly subagent context and the open-by-default security model).
yup thats mine. :)
i actually had some stuff layered into mono pi, and i frankly hit my limit in terms of architecture issues in monopi. omp aka oh my pi is frankly better architected. if you pared back the feature set to be minimal, you would full stop have a better designed minimal harness.
i do have a proper next gen no-slop harness in the works.
amusingly, dogfooding existing tools with my improvements layered in has repeatedly validated my design choices, and if anything has reduced my tolerance for the errors that seem to happen in vanilla or first party harnesses
more than that, its pretty clear that there is an insane underinvestment in the harness layer. ive been iterating on my own ideas in that area through the lens of increasing reliability. and holy crap is there so much low hanging fruit. i literally can’t figure out a sustainable way to do the work without commercializing at that layer