
I very, very nearly migrated to a full DuckDB solution for customer-facing historical stock data. It would have been magical, and ridiculously, absurdly, ungodly fast. But the cloud costs ended up being close to those of a managed analytics solution, with significantly more moving parts on our end. I think that's just our use case, though; going forward I'd look at DuckDB as an option for any large-scale dataset.

Using ECS/EKS containers reading from a segmented dataset in EFS is a really solid solution: you can get sub-second performance over 6 billion rows / 10,000 columns with proper management and reasonably restrictive queries.
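
The comment doesn't say how the data was segmented, so treat this as a rough sketch of one common reading: one Parquet file per (symbol, year) under the EFS mount, so that a restrictive query only touches a handful of files. The mount path, the `symbol=` / `year=` directory names, and the `segment_paths` helper are all made up for illustration.

```python
from pathlib import PurePosixPath


def segment_paths(root, symbols, years):
    """List only the segment files that a query limited to these
    symbols and years would need to read.

    Assumes a hypothetical layout of
    <root>/symbol=<S>/year=<Y>/data.parquet; the real layout in the
    comment above is not specified.
    """
    base = PurePosixPath(root)
    return [
        str(base / f"symbol={s}" / f"year={y}" / "data.parquet")
        for s in sorted(symbols)
        for y in sorted(years)
    ]


# A query for two tickers over two years reads 4 files, not the whole dataset.
paths = segment_paths("/mnt/efs/prices", {"AAPL", "MSFT"}, range(2020, 2022))
```

A list like `paths` could then be handed to DuckDB's `read_parquet`, which accepts a list of files, so each container scans only the segments the query actually needs.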

Another option is to just deploy a couple of huge EC2 instances that can fully fit the dataset. Costs here were about the same, but with a little more pain in server management. But the speed, man, it's just unbelievable.



Co-founder and head of produck at MotherDuck here - would love to chat. We're running DuckDB in a serverless fashion, so you're only paying for what you consume.

Feel free to reach out to tino@motherduck.com.


To note, we migrated from Redshift, which had 7-30 second query times. Our current managed solution is something like 1-5 seconds. DuckDB just smashes everything else, at least on our data.


Did you check the cost to run it on MotherDuck (https://motherduck.com/)?


What do you mean by segmented dataset in EFS?



