Google just fused two contrasting training methods, supervised fine-tuning and reinforcement learning, into one framework: Supervised Reinforcement Learning (SRL). It trains small 7B models with dense, step-wise rewards. In experiments on s1K-1.1 and code tasks, SRL beat standard SFT and still provided a learning signal when no rollout was correct. The winning recipe: SRL first, then RLVR. Meanwhile, Gemini 2.0’s multi-agent “AI co-scientist” helped researchers surface drug candidates for liver fibrosis and independently reconstructed a decade-long discovery about phage tail piracy in bacteria, all in days.
Brand Deals & Partnerships: me@faiz.mov
✉ General Inquiries: airevolutionofficial@gmail.com
What You’ll See (sources):
• Google SRL paper (arXiv): From Expert Trajectories to Step-wise Reasoning
https://arxiv.org/pdf/2510.25992
• RLVR background (arXiv): Reinforcement Learning with Verifiable Rewards
https://arxiv.org/abs/2506.14245
• Google Research blog: Accelerating scientific breakthroughs with an AI co-scientist
https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
• Liver fibrosis study (Advanced Science, DOI): AI-Assisted Drug Re-Purposing for Human Liver Fibrosis
https://advanced.onlinelibrary.wiley.com/doi/10.1002/advs.202508751
• PubMed record (Vorinostat details):
https://pubmed.ncbi.nlm.nih.gov/40946179/
• Preprint (liver fibrosis):
https://www.biorxiv.org/content/10.1101/2025.04.29.651320v1.full-text
• Tail piracy explainer (Imperial College news):
https://www.imperial.ac.uk/news/268213/microbial-piracy-uncovers-fight-drug-resistant-infections/
• Coverage (GEN / summary with AI angle):
https://www.genengnews.com/topics/infectious-diseases/tail-swapping-pirate-phages-expose-new-route-for-amr/
Why It Matters:
Small AIs can now reason step by step without giant reward models, and multi-agent systems are doing real science—compressing years of discovery into days. This flips the script on scale vs. smarts.
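For readers who want a feel for what a "dense, step-wise reward" looks like, here is a minimal sketch. It assumes the reward is a simple text-similarity score between each step the model produces and the matching step in an expert solution. The function names and the use of Python's difflib are illustrative choices for this sketch, not a reproduction of the paper's code.

```python
from difflib import SequenceMatcher


def step_reward(predicted_step: str, expert_step: str) -> float:
    """Similarity in [0, 1] between one predicted step and the expert's step."""
    return SequenceMatcher(None, predicted_step, expert_step).ratio()


def trajectory_rewards(predicted_steps, expert_steps):
    """Return one reward per expert step (hypothetical helper).

    A step the model never produced is scored against an empty string,
    which gives 0.0 for any non-empty expert step.
    """
    rewards = []
    for i, expert in enumerate(expert_steps):
        predicted = predicted_steps[i] if i < len(predicted_steps) else ""
        rewards.append(step_reward(predicted, expert))
    return rewards


if __name__ == "__main__":
    expert = ["a = 1", "b = a + 2", "answer = b * 3"]
    model = ["a = 1", "b = a + 3"]  # second step is close, third is missing
    print(trajectory_rewards(model, expert))
```

The point of the sketch: every step gets its own score, so the model receives feedback even when the final answer is wrong, unlike an outcome-only reward that returns nothing useful if no rollout is correct.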
				
- Categories
 - Artificial Intelligence
 - Keywords
 - AI News, AI Updates, AI Revolution
 


						