A brief history of LLM Scaling Laws, from compute-optimal training and inference to scaling test-time compute, and a look at whether Scaling Laws are coming to an end.
An overview of the motivations and techniques used for generating synthetic data for LLM post-training, as seen in the Llama 3.1, AFM, Qwen2 and Hunyuan-Large papers.