9 articles on reinforcement learning — fundamentals, Q-learning, policy gradients, and human feedback.
Part of the xbe.at knowledge base.
- Fundamentals — RL concepts, rewards, environments, agents, episodes
- Q-Learning — mathematical foundations and Python implementation
- Model-based RL — planning and world models
- Policy methods — sequential decision-making, policy gradient approaches
- RLHF — reinforcement learning from human feedback (applied to LLMs)
- Transfer learning in RL — Low-Rank Adaptation and reward modeling
- Use cases — common industry applications of RL in Python
- Taxonomy — RL in the broader supervised/unsupervised/RL landscape
- Reinforcement Learning and Q-Learning: Mathematical Foundations and Python Implementation
- Reinforcement Learning from Human Feedback in Python
- Model-Based Reinforcement Learning with Python
- Transfer Learning, Low-Rank Adaptation, and Reward Modeling in Python