Posts, articles, and discussions
Community posts
StackLLaMA: A hands-on guide to train LLaMA with RLHF
April 5, 2023
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU
March 9, 2023
Introducing ⚔️ AI vs. AI ⚔️ a deep reinforcement learning multi-agents competition system
February 7, 2023
Illustrating Reinforcement Learning from Human Feedback (RLHF)
December 9, 2022
Train your first Decision Transformer
September 8, 2022
Proximal Policy Optimization (PPO)
August 5, 2022
Advantage Actor Critic (A2C)
July 22, 2022
Policy Gradient with PyTorch
June 30, 2022
Deep Q-Learning with Atari
June 7, 2022
An Introduction to Q-Learning Part 2
May 20, 2022
An Introduction to Q-Learning Part 1
May 18, 2022
An Introduction to Deep Reinforcement Learning
May 4, 2022
Introducing Decision Transformers on Model Database 🤗
March 28, 2022
Welcome Stable-baselines3 to the Model Database Hub 🤗
January 21, 2022