Hello everyone! I'm writing a blog post about my experience training a minimal DDPM and just want to share what I've learned so far. Feel free to read it and discuss with me.
This question keeps popping up but I don't get it. Everyone and their dog has an OpenAI-compatible API. Why not just serve a local LLM and put 127.0.0.1 api.openai.com in your hosts file?
I mean, why is that even a question? Is there some fundamental difference between the black box that is GPT-* and, say, LLaMA, that I don't grok?
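For what it's worth, most local servers (vLLM, llama.cpp's server, Ollama, etc.) expose the same chat-completions route, so you often don't even need the hosts-file trick: you can just point the client's base URL at the local server. A minimal sketch with the official openai Python client, assuming a local server on port 8000 and a model registered as "local-model" (both are placeholders):

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible server.
# base_url, api_key, and model name are assumptions; use whatever your server exposes.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # placeholder for the model your local server serves
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```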
I think it can't surpass SOTA on some LM evaluation sets, but keep in mind that achieving better results requires a very good training dataset, which not everyone can afford.
On the other hand, the main selling points of Zamba/Mamba are low latency, fast generation, and efficient memory usage. If that holds up, LLMs could become much easier for everyone to run. All we need to do is wait for someone with a good training dataset to train a SOTA Mamba.
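For intuition on the memory claim: a state-space model like Mamba carries a fixed-size hidden state from one token to the next, while a transformer's KV cache grows linearly with the sequence. A toy sketch of that difference (a plain linear recurrence, not Mamba's actual selective-scan kernel; all sizes are made up for illustration):

```python
import numpy as np

d_state, d_model = 16, 64            # illustrative sizes, not a real Mamba config
A = 0.01 * np.random.randn(d_state, d_state)
B = 0.01 * np.random.randn(d_state, d_model)
C = 0.01 * np.random.randn(d_model, d_state)

h = np.zeros(d_state)                # fixed-size state: memory stays constant during decoding
kv_cache = []                        # a transformer would append K/V entries for every token

for t in range(1000):                # pretend we decode 1000 tokens
    x = np.random.randn(d_model)     # stand-in for the current token's embedding
    h = A @ h + B @ x                # recurrent update: same memory cost at every step
    y = C @ h                        # this token's output
    kv_cache.append(x)               # transformer-style cache keeps growing

print("SSM state:", h.size, "floats (constant)")
print("KV cache:", len(kv_cache) * d_model, "floats (grows with sequence length)")
```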
How can you retrieve the latent representations from the candidate LLMs? Some models do not have open weights (GPT-4, for example), which means that, AFAIK, it is impossible to directly access the hidden latent space through their API.
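For open-weight models this part is straightforward: Hugging Face transformers will return the per-layer hidden states if you ask for them. A minimal sketch, using gpt2 purely as a small stand-in for whatever candidate model you have in mind:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# gpt2 is just a placeholder; substitute any open-weight candidate model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Tuple of (num_layers + 1) tensors, each [batch, seq_len, hidden_dim];
# the last entry is the final-layer latent representation of every token.
hidden_states = outputs.hidden_states
print(len(hidden_states), hidden_states[-1].shape)
```

For API-only models like GPT-4 you're right: the API gives you back text (and, on some endpoints, token logprobs or embeddings from a separate model), not the internal activations.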
Does this mean LLMs can generate text from an empty context? How can an LLM choose the first token without any previous tokens? My understanding is that to compute the logits for the next token, an LLM needs all previous tokens as input. Am I correct?
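If it helps, the usual answer is that the context is never truly empty: decoding starts from a special beginning-of-sequence (BOS) token, and the logits for the first "real" token are conditioned on that. A minimal sketch with GPT-2 (chosen only because it's small; its BOS and EOS tokens happen to share the same id):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# The "empty" context is really just the BOS token.
input_ids = torch.tensor([[tokenizer.bos_token_id]])

with torch.no_grad():
    logits = model(input_ids).logits[:, -1, :]   # distribution over the first real token

first_token_id = int(torch.argmax(logits, dim=-1))
print(tokenizer.decode([first_token_id]))
```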