We can describe the architecture, training procedure, and inference flow of an LLM, but what about its hidden/internal state?
What do all these layers do in an LLM?