
Actually, you're describing almost exactly how Wimsey works! It uses native dataframe code rather than a UDF of any kind. Under the hood it uses Narwhals, which converts Polars-style expressions into native pandas/Polars/Spark/Dask code with minimal overhead.

If you're using a lazy dataframe (via Polars, Spark, etc.), Wimsey will force collection, which can have speed implications. The reason is that I haven't yet found a cross-language way of embedding assertions that fail later down the line.


