
poly2it · 2026-06-09T20:36:46 1781037406

I'm not in HFT, but I assume this is also an interesting applicable domain?

ag2718 · 2026-06-09T20:40:44 1781037644

Yes, definitely: this type of work is applicable in domains where software run on general-purpose processors cannot meet latency or power requirements.

UltraSane · 2026-06-09T21:46:34 1781041594

The author actually works at Jane Street.

poly2it · 2026-06-09T13:15:43 1781010943

On the topic of hidden files: wherefrom is the pattern of treating configuration files as hidden? I'm referring to the pattern of `.configfile` -- I mean, for code projects, a local config file is a first-level construct. This leads to hidden files no longer being a viable construct, as there is no longer any consensus on what should be hidden.

andOlga · 2026-06-09T15:51:09 1781020269

I don't know the answer to this, but I have to wonder if, for source files specifically, .git is the culprit here... It's not part of your project, it's part of your repo. Which maybe makes sense if people ever divorced their source code from the repo but that's not a thing anymore. Others probably just copied it.

poly2it · 2026-06-09T12:36:21 1781008581

Is this poor man's Nix/direnv, or does it do anything else?

poly2it · 2026-06-09T12:25:16 1781007916

I'm wondering who's paying for this. Is it becoming a part of the iCloud subscription? A separate billed product?

poly2it · 2026-06-08T10:55:51 1780916151

Why was this posted to HN? What an utter waste of time. Someone's slopwriter writes a slop article about which slopper slops the most slopulicious slop. Comments agree it's a bogus "study". We need some gate on AI-written articles. It's so weird that AI-written comments are not permitted, while the front page can be occupied by stuff like this.

poly2it · 2026-06-07T14:49:04 1780843744

That's precisely the issue, the power dynamic needs to flip. The state of the USA and, more worryingly, the direction it is heading in are horrifying.

poly2it · 2026-06-05T19:47:36 1780688856

What? What LLM were you using a decade ago? Am I misreading you?

utopiah · 2026-06-05T19:50:41 1780689041

You might not be aware of it but GenAI predates OpenAI which was founded more than 10 years ago anyway.

poly2it · 2026-06-05T20:07:23 1780690043

Of course I am aware, but how is this relevant today? How does that prove that the science is irrelevant and wasted?

utopiah · 2026-06-06T06:24:45 1780727085

Did I say that the science is irrelevant and wasted?

HDThoreaun · 2026-06-05T20:38:16 1780691896

No. GenAI means LLMs right now. I agree it didn't in the past, but definitions change.

poly2it · 2026-06-03T18:10:58 1780510258

There's a favourite button for comments if you click the timestamp.

poly2it · 2026-06-01T00:14:30 1780272870

The URL seems to have been messed up on this submission. Here is the corrected:

https://www.yahoo.com/entertainment/tv/articles/harvard-grad...

poly2it · 2026-05-31T12:48:56 1780231736

Sorry if this sounds naive, but does it make sense to write a codec library in C/ASM considering how well Rust is progressing, especially when, as the author puts it, AV2 decoding is roughly five times more complex than AV1 decoding?

Arodex · 2026-05-31T13:02:39 1780232559

The algorithms deployed in these kinds of codecs take into account not only human vision and mathematical laws of information, but also nitty-gritty details of how computers work, which are optimally exploited by directly having humans write detailed assembly rather than a compiler make a best guess and effort.

alkonaut · 2026-06-01T07:20:03 1780298403

Surely 100% of these low level features are available in rust too? I understand it is a massive undertaking and builds off the previous codec(s) but writing these things by hand such as inline assembly seems to be as easy if not easier in Rust?

And as soon as you walk into concurrency territory for a complex codec like this then it seems almost impossible for humans to do correctly while retaining safety.

rafaelmn · 2026-06-01T09:42:59 1780306979

Why ? If it's shared reads and scoped writes (read-only look up, output to a thread owned buffer span) concurrency seems pretty straightforward.
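For readers who want to see the "shared reads, scoped writes" pattern concretely, here is a minimal sketch in Rust. The function name `apply_lut_parallel` and the lookup-table workload are assumptions made up for illustration, not code from dav1d or rav1d. Every thread reads the same table and input, and each one writes only to its own non-overlapping span of the output.

```rust
use std::thread;

/// Hypothetical example: map every input byte through a shared read-only
/// lookup table, writing results into disjoint per-thread output spans.
pub fn apply_lut_parallel(lut: &[u8; 256], input: &[u8], output: &mut [u8], chunk: usize) {
    assert_eq!(input.len(), output.len());
    assert!(chunk > 0);
    thread::scope(|s| {
        // `chunks_mut` hands out non-overlapping mutable spans, so each thread
        // owns its part of the output; `lut` and `input` are only ever read.
        for (src, dst) in input.chunks(chunk).zip(output.chunks_mut(chunk)) {
            s.spawn(move || {
                for (i, o) in src.iter().zip(dst.iter_mut()) {
                    *o = lut[*i as usize];
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

This shape needs no `unsafe` and no locks. Real codec threading (frame or tile dependencies, for instance) is more involved than this sketch.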

Rust can only prove a limited subset of correct programs to be safe, when you're doing bare metal stuff you're often not in that subset and drop down to unsafe. I'm guessing there's always stuff that's not perf critical and can live in Rust sandbox - so not saying no wins - but it doesn't sound like Rust is a no-brainer.

alkonaut · 2026-06-02T09:58:35 1780394315

I mean all else being equal, I'd just take the ergonomics (dependency management, build-system/multi-targeting, modern language features, no h-files, ...) and run. I can always just write at the level I want (safe rust, unsafe rust, C, inline asm...).

I still think no one should in 2026 be writing a nontrivial codec or anything parsing untrusted data, in C. There's just no excuse.

The gains are re-use of skill and code. And I hope that's the reason this is continuing with C, this is basically a v2 of an existing project, not a greenfield codec, even if it's much larger.

jbk · 2026-05-31T12:57:17 1780232237

Because it's 5 times more complex, you need to get the maximum performance available. Therefore more ASM than ever.

Rust does not bring more performance. Just more safety.

alkonaut · 2026-06-01T07:24:08 1780298648

It brings tooling that is a LOT easier. Just things like dependency management, test running and so on is so much better in Rust than in C, even if you happen to write the exact same code because you basically write unsafe code and hand rolled assembly for many things. I think this is people using the tool they know rather than the best tool (And if you know a tool well, it might become the best tool for the job because of that). It could be because a huge chunk of existing code can be re-used. But all else being equal (existing code, existing developers don't exist) I refuse to believe a codec should ever be written in C ever again.

LoganDark · 2026-05-31T14:51:46 1780239106

The safety can be worth it in certain cases. Like when handling untrusted input. And it's not just Rust: look at WUFFS for example. WUFFS can actually rival handwritten implementations in certain cases.

xp84 · 2026-05-31T15:56:32 1780242992

Are video codecs in the present day able to be sandboxed? In my fantasies at least I’d like the worst a malicious video file can do is cause garbage output or cause the codec to crash.

Forgive the ignorance, I have worked entirely in the abstracted layers of the stack, and mostly web.

adgjlsfhk1 · 2026-06-01T00:55:19 1780275319

not really. they're mostly pure assembly and sandboxing assembly isn't really a thing

nullpoint420 · 2026-06-01T05:32:35 1780291955

yes it is. all modern operating systems sandbox assembly. that's how it works.

LoganDark · 2026-06-01T18:29:30 1780338570

Windows may use virtualization-based security by default, but I'm not aware of macOS or Linux doing the same -- Apple builds security directly into the silicon such that no virtualization is required, and Linux just rawdogs everything.

Whether that counts is up to you. I suppose it's still "sandboxed" in that it runs in a less privileged context than the kernel.

throawayonthe · 2026-05-31T15:12:35 1780240355

but not these cases

bigyabai · 2026-05-31T19:11:25 1780254685

It really should be, though: https://en.wikipedia.org/wiki/FORCEDENTRY

IshKebab · 2026-05-31T16:35:53 1780245353

I don't see why not. What makes you think this is unique?

adgjlsfhk1 · 2026-06-01T00:57:14 1780275434

WUFFS like approaches work better for algorithms like lz77 that are substantially bandwidth constrained. for something like a video codec, the computational intensity is much higher so you need better codegen to reach max speed

cesarb · 2026-05-31T19:24:46 1780255486

> Rust does not bring more performance. Just more safety.

Though more safety can in some cases bring a bit more performance. For instance, with Rust you can often avoid "defensive copies" of objects.
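As an illustration of the "defensive copy" point, here is a small sketch of a borrowed view in Rust. The type `RowView` and its methods are hypothetical names invented for this example. The view borrows the frame's bytes instead of copying them, and the borrow checker rejects any caller that tries to mutate the frame while the view is alive.

```rust
/// Hypothetical read-only view of a range of rows in a frame buffer.
/// It borrows the pixel data rather than copying it.
pub struct RowView<'a> {
    rows: &'a [u8],
    stride: usize,
}

impl<'a> RowView<'a> {
    /// Returns `None` if the requested rows do not fit inside `frame`.
    pub fn new(frame: &'a [u8], stride: usize, first_row: usize, row_count: usize) -> Option<Self> {
        let start = first_row.checked_mul(stride)?;
        let end = start.checked_add(row_count.checked_mul(stride)?)?;
        Some(Self { rows: frame.get(start..end)?, stride })
    }

    /// Borrow row `y` of the view, or `None` if it is out of range.
    pub fn row(&self, y: usize) -> Option<&'a [u8]> {
        let start = y.checked_mul(self.stride)?;
        self.rows.get(start..start.checked_add(self.stride)?)
    }
}
```

In C the equivalent struct would hold a raw pointer, and a cautious API might copy the rows to be sure nobody frees or overwrites them in the meantime.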

itishappy · 2026-06-01T02:34:53 1780281293

When writing a high performance video codec avoiding defensive copies of objects is something you want always, not just often.

C makes it easy to be fast but hard to be safe. Rust makes it easy to be safe but hard to be fast.

Also note that video codecs tend to wrap C or Rust around handcrafted ASM. Performance is king.

cogman10 · 2026-05-31T13:22:50 1780233770

Encoder and decoder writers frequently need extremely fine grain control over SIMD instructions in order to get good performance.

The way they weave these instructions can be very hard to express with a high level language.
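To give a feel for what explicit SIMD looks like from a high-level language, here is a minimal sketch using Rust's `std::arch` intrinsics. The function `add_saturating` is a made-up example, not taken from any codec. Intrinsics let you pick the instruction, but register allocation and scheduling are still left to the compiler, which is part of why codec authors often go all the way down to assembly.

```rust
/// Hypothetical example: saturating add of two byte rows.
/// Uses SSE2 on x86_64 and a scalar loop elsewhere (and for the tail).
pub fn add_saturating(a: &[u8], b: &[u8], out: &mut [u8]) {
    assert!(a.len() == b.len() && a.len() == out.len());
    let mut i = 0;
    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{__m128i, _mm_adds_epu8, _mm_loadu_si128, _mm_storeu_si128};
        while i + 16 <= a.len() {
            // SAFETY: SSE2 is part of the x86_64 baseline, and the loop
            // condition keeps every 16-byte access inside the slices.
            unsafe {
                let va = _mm_loadu_si128(a.as_ptr().add(i) as *const __m128i);
                let vb = _mm_loadu_si128(b.as_ptr().add(i) as *const __m128i);
                let sum = _mm_adds_epu8(va, vb);
                _mm_storeu_si128(out.as_mut_ptr().add(i) as *mut __m128i, sum);
            }
            i += 16;
        }
    }
    // Scalar tail, and the whole loop on other architectures.
    for j in i..a.len() {
        out[j] = a[j].saturating_add(b[j]);
    }
}
```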

Further, there's a ton of work with arrays and importantly parts of arrays. They can, for example, need to extract every other element up to 1/2 the array. Unfortunately, rust has runtime array bounds checks which make writing that sort of code slower. The compiler can elide those checks, but usually only in simple cases.
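A rough sketch of the "every other element up to half the array" case, written two ways in Rust. Both function names are invented for illustration. The indexed form carries a bounds check on each `src[i]` unless the optimizer can prove it away; the iterator form does one range check when slicing and then walks the slice without per-element indexing.

```rust
/// Indexed version: each `src[i]` is bounds-checked unless the optimizer
/// can prove the index is in range.
pub fn every_other_indexed(src: &[i16]) -> Vec<i16> {
    let half = src.len() / 2;
    let mut out = Vec::with_capacity((half + 1) / 2);
    let mut i = 0;
    while i < half {
        out.push(src[i]);
        i += 2;
    }
    out
}

/// Iterator version: one range check for the slice, then no indexing.
pub fn every_other_iter(src: &[i16]) -> Vec<i16> {
    src[..src.len() / 2].iter().step_by(2).copied().collect()
}
```

Whether a given check is actually removed depends on the compiler version and the surrounding code, so the generated assembly is the only reliable evidence either way.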

The authors would be writing a bunch of unsafe rust to get the performance they want and rust makes that more painful on purpose.

I like rust, but C/ASM really is the right choice here. This is one of the few cases where rust's safety is a major detriment.

dcsommer · 2026-05-31T18:13:20 1780251200

Performance should not be priority #1. Security should be. Why do we slow down all CPUs to prevent SPECTRE attacks yet continue to write in C? As rav1d shows, the perf loss is far less to migrate from C to Rust than it is to apply SPECTRE mitigations, and adding a sandbox around a memory-unsafe codec is going to be way more expensive again than using Rust code to start.

Const-me · 2026-05-31T20:05:04 1780257904

> Performance should not be priority #1. Security should be.

For a web browser, or a server in a bank, sure. For anything else, questionable.

> adding a sandbox around a memory-unsafe codec is going to be way more expensive

In the modern world, the overhead of strong sandboxes is surprisingly small. The nuclear but most reliable option is a hardware-assisted VM. On modern computers with SLAT and virtualized IO the overhead for most use cases is negligible. If you want something lighter weight, you can use the multi-user nature of all modern OS kernels and isolate the codec into a separate process with restricted permissions. Sandboxing overhead is approximately zero.

cogman10 · 2026-05-31T19:30:03 1780255803

> As rav1d shows

rav1d is not a full rewrite of dav1d to rust. So it really doesn't show that. It's currently C + rust + asm.

I don't think we can say anything about what this does or does not prove about the performance of safe code.

> Performance should not be priority #1. Security should be.

Entirely depends on the application. The reason rust has `unsafe` is because there's some situations where performance needs to preempt potential security problems.

dcsommer · 2026-05-31T20:51:48 1780260708

Codecs are difficult and expensive to develop. Therefore they get reused in many contexts, including security critical ones. Sandboxing is shown over and over to not be a great security solution, so what this means in practice is that security-critical software that needs software decoding gets pwned because software engineers don't care to prioritize it in the first place.

Why shouldn't safety be the default? If you really want to, it wouldn't be too hard to maintain a patch on top of rustc to drop the bounds checks if you want to compile object files without them.

Software decoding has a safety culture problem, and we need to talk about it.

cogman10 · 2026-05-31T21:05:01 1780261501

> Why shouldn't safety be the default?

Because safe code isn't fast enough to decode live video.

> If you really want to, it wouldn't be too hard to maintain a patch on top of rustc to drop the bounds checks if you want to compile object files without them.

Yeah, but then you are undermining safety in a critical way that does lead to security vulnerabilities (buffer overflow). And you are also now maintaining and requiring other devs for a project to use a custom version of rustc. That's certainly part of the reason that's simply not happened.

But another major part of it is that encoders end up with a lot of custom ASM regardless. That custom ASM is going to be where vulnerabilities end up. You don't really escape that by using rust.

If you are already abandoning where you critically need safety the most for performance, then why pick a language that additionally penalizes you for using unsafe constructs?

> Software decoding has a safety culture problem, and we need to talk about it.

Compilers and languages have an optimization problem that we need to talk about. SIMD optimizations remain a very hard thing for compilers to get right. We should talk about what it'd take to make compilers better and the reasons for why codec devs need to drop down to asm instead of using a high level compiler.

There might not be a solution to this problem, there are reasons for it.

Dylan16807 · 2026-06-01T02:33:24 1780281204

> Because safe code isn't fast enough to decode live video.

I strongly doubt that.

And if any implementation of AV2 can be "fast enough", then there should be no question at all that we can write "fast enough" safe decoders for every other codec. Absolutely no way safe code is inherently that much slower.

cogman10 · 2026-06-01T12:13:26 1780316006

Show me the AV1, H.265, or even H.264 decoder that doesn't ultimately rely heavily on hand written assembly to achieve "fast enough".

You can doubt all you like. Ultimately, there's a reason why dav1d includes hand coded SIMD for common platforms.

It's simply impossible to get a compiler to emit something like this [1].

[1] https://github.com/videolan/dav1d/blob/master/src/x86/ipred_...

Dylan16807 · 2026-06-01T13:06:28 1780319188

Is it simply impossible to get compiled code within a factor of five? That claim needs strong evidence.

More importantly, if you can show that your assembly code isn't altering pointers it shouldn't alter, and isn't going out of bounds on its reads, you're most of the way to having assembly in your verified safe code. And rough bounds checking with padding can be as cheap as a bitmask.
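One way to read the "bitmask" idea, sketched in Rust under the assumption that the table is padded to a power-of-two length. `TABLE_LEN`, `MASK` and `gather_masked` are hypothetical names for this example. Masking the index forces it into range with a single AND, so the access is memory-safe even for hostile indices.

```rust
/// Hypothetical table size, padded to a power of two so that any index
/// can be forced into range with a single AND.
pub const TABLE_LEN: usize = 1024;
pub const MASK: usize = TABLE_LEN - 1;

/// Gather `table[idx]` for each index, masking instead of branching.
pub fn gather_masked(table: &[u8; TABLE_LEN], indices: &[u32], out: &mut [u8]) {
    for (o, &idx) in out.iter_mut().zip(indices) {
        // `(idx as usize) & MASK` is always below TABLE_LEN, so the access
        // cannot leave the table.
        *o = table[(idx as usize) & MASK];
    }
}
```

The trade-off is that an out-of-range index silently wraps and produces wrong data instead of an error: garbage output rather than memory corruption.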

cogman10 · 2026-06-01T13:19:51 1780319991

> Is it simply impossible to get compiled code within a factor of five? That claim needs strong evidence.

1. I didn't make that claim.

2. A negative assertion doesn't require evidence. If I say "this is impossible to do" the burden to disprove me is showing it's actually possible. You can't prove a negative. For example, if I say "the tooth fairy doesn't exist" I don't need to provide evidence of the tooth fairy's non-existence. If you disagree, you need to provide evidence to the contrary.

Dylan16807 · 2026-06-01T14:16:13 1780323373

> 1. I didn't make that claim.

Then you didn't read my previous comment correctly. AV2 must be "fast enough" if the designers aren't crazy. And AV2 is 5x slower than AV1. Therefore if compiled code is within a factor of five of hand-written assembly, it's "fast enough" for AV1, and h.264, and probably h.265 too.

You were disagreeing with my claim that other codecs could be "fast enough" with a safe compiler, right? If you weren't disagreeing, I don't know why you challenged me to show you some particular kind of code.

> 2. A negative assertion doesn't require evidence. If I say "this is impossible to do" the burden to disprove me is showing it's actually possible. You can't prove a negative. For example, if I say "the tooth fairy doesn't exist" I don't need to provide evidence of the tooth fairy's non-existence. If you disagree, you need to provide evidence to the contrary.

You're saying it's "simply impossible" for a compiler to optimize instructions to a certain level. But anything one person can code, another person can teach a compiler to do in similar situations. I don't need to show you an example, I just need to point you at the Church-Turing thesis and related documents.

imtringued · 2026-06-01T07:34:03 1780299243

What's supposed to be the big source of unsafety in codecs though? Feels like the problem here is that C developers are ruining the reputation of C with their garbage code.

Bounds checking as a source of slowdown is overrated in a niche where you're working on fixed size blocks. It feels like the C developers are getting the parts outside the ASM kernels wrong.

cogman10 · 2026-06-01T12:28:46 1780316926

> What's supposed to be the big source of unsafety in codecs though?

Hand written assembly. It's quite easy to accidentally start reading or manipulating a block of memory you didn't intend to when doing complex SIMD transformations.

> Bounds checking as a source of slowdown is overrated in a niche where you're working on fixed size blocks.

I think you don't really understand how codecs work. It is not uncommon for a transformation like `a = b[c[i] * 3 + offset];`. There's no way for a compiler to omit the bounds check because it can't prove the contents of `c` aren't going to exceed the bounds of `b`.
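For concreteness, here is how a data-dependent gather like `b[c[i] * 3 + offset]` can be written in Rust. All three function names are invented for this sketch. The first panics on a bad index, the second returns `None`, and the third skips the checks and pushes the proof obligation onto the caller via `unsafe`.

```rust
/// Checked indexing: panics if either index is out of range.
pub fn gather_checked(b: &[u8], c: &[u8], offset: usize, i: usize) -> u8 {
    b[(c[i] as usize) * 3 + offset]
}

/// Fallible version: returns `None` instead of panicking.
pub fn gather_opt(b: &[u8], c: &[u8], offset: usize, i: usize) -> Option<u8> {
    let ci = *c.get(i)? as usize;
    b.get(ci * 3 + offset).copied()
}

/// Unchecked version with no bounds checks at all.
///
/// # Safety
/// The caller must guarantee `i < c.len()` and
/// `c[i] as usize * 3 + offset < b.len()`.
pub unsafe fn gather_unchecked(b: &[u8], c: &[u8], offset: usize, i: usize) -> u8 {
    unsafe {
        let ci = *c.get_unchecked(i) as usize;
        *b.get_unchecked(ci * 3 + offset)
    }
}
```

Because the index into `b` depends on the runtime contents of `c`, the compiler generally has nothing to prove the access safe with, which is the situation the comment describes.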

This isn't a "crappy C developer" problem. This is a "There isn't a language that does a great job at capturing high level SIMD expressions" problem.

muhbaasu · 2026-05-31T13:30:09 1780234209

The ffmpeg devs have said many times in public that they routinely get speedups of 10x or more over C code. I'm not a reputable source on this myself but I highly recommend looking into their channels, mails, or posts.

nmz · 2026-05-31T16:46:14 1780245974

https://youtu.be/nepKKz-MzFM&t=7195

If you can stand Lex Fridman for a bit, the VLC authors talk about why you use ASM for a video decoder instead of pure C or rust.

IshKebab · 2026-05-31T16:44:39 1780245879

I don't know why you've been down-voted. It definitely isn't an optimal decision. A video codec isn't all assembly. There's plenty of plain unsafe C code. E.g. this is the first random file I clicked. It has a ton of raw C pointer stuff just begging to be exploited.

https://code.videolan.org/videolan/dav2d/-/blob/main/src/dat...

There is a project to write an AV1 decoder in Rust: Rav1d (really stretching the name here).

https://github.com/memorysafety/rav1d

They got within 5% of the performance of dav1d and held a contest to close the gap but I think I read somewhere that this wasn't achieved.

https://www.memorysafety.org/blog/rav1d-perf-bounty/

They claimed

> This is enough of a difference to be a problem for potential adopters, and, frankly, it just bothers us.

But in my opinion nobody actually cares about 5% in absolute terms. It's likely just Rust naysayers using that as an excuse.

I think the likely reason for dav2d using C is that they can reuse lots of code and infrastructure from dav1d. But I agree it would be much better if they worked on Rav2d instead (these names!). You can hardly complain about a 5% overhead if you're opting in to 5x more decoding complexity.

skelpmargyar · 2026-05-31T22:14:26 1780265666

Of course any random C file is going to have pointers. Where can anything in the linked code be exploited? It seems like they're testing for bad input data with asserts to catch bugs in some functions, and properly validating bad inputs in others. Just because they're writing C doesn't mean it's vulnerable.

How can you claim nobody cares about 5%? A 5% performance increase is significant. And video decoding is not always for playback, where 5% may not matter as much.

IshKebab · 2026-06-01T18:47:50 1780339670

> Where can anything in the linked code be exploited?

Difficult to tell - that's the point!

throawayonthe · 2026-05-31T15:16:39 1780240599

yes it makes sense to use C/ASM here, but if you're curious, there is a rust port of dav1d named rav1d: https://github.com/memorysafety/rav1d

it's not much slower than the original C/ASM implementation (last i checked ~5%?) but that matters here

nick__m · 2026-05-31T20:29:53 1780259393

It's a Rust/ASM port, look there: https://github.com/memorysafety/rav1d/blob/main/src/ext/x86/...

I am not sure if it is that much safer than the C version when raw assembly is still required.

Thaxll · 2026-05-31T19:34:54 1780256094

It is much slower than 5%, there were other independent tests that put it around 20%.

Telaneo · 2026-05-31T12:57:26 1780232246

Go ask FFmpeg what they're writing their encoders and decoders in.

latexr · 2026-05-31T13:14:03 1780233243

That isn’t particularly helpful to someone asking a question in good faith. What others are using doesn’t clarify why they are using it. Plus, FFmpeg is itself a decade older than Rust. The OP is asking about starting a new project today.

Telaneo · 2026-05-31T13:42:16 1780234936

> What others are using doesn’t clarify why they are using it.

It does if you ask them, or at least research the topic at hand.

latexr · 2026-05-31T16:15:22 1780244122

Isn’t that just the same as answering “Google it”, then? We’re on a discussion forum, where matter experts visit, talking about a specific topic. If one can’t ask their questions in this highly relevant situation, where can they? The point of HN is supposed to be gratifying curiosity.

Gigachad · 2026-06-01T03:34:52 1780284892

Just don't try reporting a security issue to them.

Telaneo · 2026-06-01T05:16:56 1780291016

Is this a reference to this: https://news.ycombinator.com/item?id=45785291 ?

If so, FFmpeg's stance is very understandable in my opinion.

Gigachad · 2026-06-01T05:21:41 1780291301

Somewhat, but somewhat not. Yes it's a very obscure format, and yes it's partially a marketing stunt from Google for their AI tools. But it's also a real bug which is exploitable on ffmpeg. And we have seen in the past that state sponsored hacking groups specifically target media decoders with obscure formats that aren't often tested or known about.

Media decoders are one of the highest risk programs since they deal with untrusted user input and are incredibly complex. So just because a large project like ffmpeg uses C, doesn't mean there isn't very good reason to consider a language like Rust for safety reasons.

Telaneo · 2026-06-01T05:41:02 1780292462

If Google want secure encoders and decoders, then they can donate money or patches. Since they don't, they clearly don't actually care all that much, or are just mooching off volunteers' goodwill.

The disadvantage in speed when using Rust is pretty obvious.[1] When it comes to video encoding and decoding, I and FFmpeg care a lot more about speed than memory safety. So those reasons have been considered and largely discounted.

[1] https://xcancel.com/FFmpeg/status/1924137645988356437 (to be fair, this is only transpiled from C, so it could probably be optimised further, but that apparently needed a 20k USD bounty to then not even happen (as far as I can tell))

[2] https://www.memorysafety.org/blog/rav1d-perf-bounty/

MattRix · 2026-05-31T13:01:31 1780232491

Yes? There is 5x more code to optimize the ASM for.