Posts, articles, and discussions
Community posts

StackLLaMA: A hands-on guide to train LLaMA with RLHF
April 5, 2023

Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU
March 9, 2023

Introducing ⚔️ AI vs. AI ⚔️ a deep reinforcement learning multi-agents competition system
February 7, 2023

Illustrating Reinforcement Learning from Human Feedback (RLHF)
December 9, 2022

Train your first Decision Transformer
September 8, 2022

Proximal Policy Optimization (PPO)
August 5, 2022

Advantage Actor Critic (A2C)
July 22, 2022

Policy Gradient with PyTorch
June 30, 2022

Deep Q-Learning with Atari
June 7, 2022

An Introduction to Q-Learning Part 2
May 20, 2022

An Introduction to Q-Learning Part 1
May 18, 2022

An Introduction to Deep Reinforcement Learning
May 4, 2022

Introducing Decision Transformers on Model Database 🤗
March 28, 2022

Welcome Stable-baselines3 to the Model Database Hub 🤗
January 21, 2022