florianleibert's comments

Hi! I open sourced a serving config that takes Kimi K2.6 from 90 tok/s to 508 tok/s on 8xMI300X. Same weights, zero quality loss.

Scaling is linear at 15.8 tok/s per slot, and latency is constant. The repo has a command launcher, Dockerfile, and benchmark tool. Known limitation: BF16 KV only (FP8 crashes due to an AITER 384-expert constraint).
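For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. It assumes "linear" means aggregate throughput is simply slots times 15.8 tok/s; the function names are made up for illustration, and the slot count it derives is my inference from the figures above, not something the repo states.

```python
# Figures quoted in the comment above.
PER_SLOT_TOK_S = 15.8   # per-slot decode rate
BASELINE_TOK_S = 90.0   # aggregate throughput before tuning
TUNED_TOK_S = 508.0     # aggregate throughput after tuning


def aggregate_throughput(slots: int, per_slot: float = PER_SLOT_TOK_S) -> float:
    """Aggregate tok/s under the assumed linear-scaling model."""
    return slots * per_slot


def implied_slots(total_tok_s: float, per_slot: float = PER_SLOT_TOK_S) -> float:
    """Concurrent slots implied by a total throughput (an inference, not a measurement)."""
    return total_tok_s / per_slot


speedup = TUNED_TOK_S / BASELINE_TOK_S

if __name__ == "__main__":
    print(f"speedup over baseline: {speedup:.2f}x")
    print(f"implied concurrent slots: {implied_slots(TUNED_TOK_S):.1f}")
    print(f"model throughput at 32 slots: {aggregate_throughput(32):.1f} tok/s")
```

Under that model the 508 tok/s figure corresponds to roughly 32 concurrent slots, which is only an estimate; the benchmark tool in the repo is the place to get real numbers.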


This is pretty cool.


There is also a tutorial on the Spinnaker website. https://www.spinnaker.io/guides/tutorials/codelabs/dcos-sour...


Here is a tutorial on the easiest way to run TensorFlow: https://dcos.io/blog/2017/tutorial-deep-learning-with-tensor...


Here is another good tutorial about TensorFlow on DC/OS: https://mesosphere.com/blog/2017/05/11/deep-learning-tensorf...


super interesting... I think deep learning infrastructure still needs a lot of improvement. thanks for sharing.


Super impressive performance!


Here is more information about the underlying technology, which also powers e.g. Twitter (Apache Mesos and DC/OS). https://azure.microsoft.com/en-us/blog/microsoft-joins-the-n...


Here is a reference to the original Mesos (core of DC/OS) paper by Ben Hindman, Matei Zaharia and Andy Konwinski: https://www.cs.berkeley.edu/~alig/papers/mesos.pdf


Great job open sourcing this, Florian. You guys have been doing great stuff there at Mesosphere.


Thanks for the link to the paper.

This is really big news. Kudos to the Mesosphere team.


The 0.7X releases contained a number of bugs. 0.8X has been significantly better!


I'd love to hear about the bugs that you encountered. Did you file any GitHub issues for them?

