Paper notes - Merging LLMs at Pre-training, Considering Token Probabilities at RL

machine-learning, research, deep-learning, transformer, large-language-models, RL

Here are notes on two papers: one on merging LLMs during pre-training, and one on accounting for token probabilities in RL.


Model Merging in Pre-training of Large Language Models

What’s new

How it works

Results

Why it matters


Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs

What’s new

How it works

- Low-Probability Token Isolation (Lopti):
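
One way to read the name is that tokens are separated by the probability the policy assigns them, so that low-probability tokens can be handled apart from the rest. Below is a minimal sketch of that split only. The threshold `eta`, its default value, and the helper `split_by_probability` are hypothetical names chosen for illustration and are not taken from these notes.

```python
import numpy as np


def split_by_probability(token_probs, eta=0.5):
    """Return boolean masks (low, high) over a sequence of token probabilities.

    `eta` is a hypothetical cutoff used only for illustration.
    """
    probs = np.asarray(token_probs, dtype=float)
    low = probs < eta   # tokens the policy considered unlikely
    high = ~low         # all remaining tokens
    return low, high


# Example: probabilities the policy assigned to four sampled tokens.
probs = [0.05, 0.92, 0.40, 0.99]
low_mask, high_mask = split_by_probability(probs)
```

The two masks could then select which tokens contribute to a given loss term or update step. How the groups are actually used is not specified here.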

Results

Why it matters