Notations
This section lists the notation and definitions commonly used in reinforcement learning. The following table outlines the symbols and their meanings.
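| Symbol | Meaning |
|---|---|
| $\mathcal{S}$ | State space. |
| $\mathcal{A}$ | Action space. |
| $\mathcal{R}$ | Reward space, i.e. the set of values the reward function can take. |
| $R(s, a)$ | Reward function, $R: \mathcal{S} \times \mathcal{A} \to \mathcal{R}$. |
| $R(s, a, s')$ | Reward function, $R: \mathcal{S} \times \mathcal{A} \times \mathcal{S} \to \mathcal{R}$. |
| $H(X)$ | Entropy of the source, $H(X) = -\sum_{x} p(x) \log p(x)$. |
| $\mathcal{D}$ | Replay buffer. |
| $S_t, A_t, R_t$ | State, action, and reward at time step $t$. |
| $\gamma$ | Discount factor ($0 \le \gamma \le 1$). |
| $G_t$ | Return ($G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$). |
| $P(s', r \mid s, a)$ | Transition probability of getting to the next state $s'$ with reward $r$ from the current state $s$ after taking action $a$. |
| $\pi(a \mid s)$ | Stochastic policy (agent behavior strategy), $\pi(a \mid s) = \Pr(A_t = a \mid S_t = s)$. |
| $\mu(s)$ | Deterministic policy. |
| $V(s)$ | State-value function of a given state $s$. |
| $V^\pi(s)$ | The value of state $s$ when we follow a policy $\pi$, $V^\pi(s) = \mathbb{E}_\pi[G_t \mid S_t = s]$. |
| $Q(s, a)$ | Action-value function of the given state and action $(s, a)$. |
| $Q^\pi(s, a)$ | Action-value function when we follow a policy $\pi$, $Q^\pi(s, a) = \mathbb{E}_\pi[G_t \mid S_t = s, A_t = a]$. |
| $A(s, a)$ | Advantage function, $A(s, a) = Q(s, a) - V(s)$. |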
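
To make the return and the advantage function concrete, here is a minimal Python sketch. It assumes the standard definitions: the return at step t is the discounted sum of the rewards that follow it, and the advantage is the action value minus the state value. The function names `discounted_returns` and `advantage` and the sample rewards are illustrative assumptions, not part of any library.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return [G_0, ..., G_{T-1}], where rewards[t] holds R_{t+1}.

    Assumes a finite episode, so the infinite sum is truncated at its end.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards with the recursion G_t = R_{t+1} + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantage(q_value: float, v_value: float) -> float:
    """Advantage of one state-action pair, assuming A(s, a) = Q(s, a) - V(s)."""
    return q_value - v_value


rewards = [1.0, 0.0, 2.0]  # hypothetical rewards R_1, R_2, R_3
print(discounted_returns(rewards, gamma=0.5))  # [1.5, 1.0, 2.0]
```

Computing the returns backwards takes one pass over the episode, whereas evaluating the sum separately for every time step would take quadratic time.
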
Note
In this document, we adopt the following conventions:
- Uppercase letters represent random variables or functions, such as $S_t$, $A_t$, $R_t$, $V$, $Q$, etc.
- Calligraphic uppercase letters represent sets, such as $\mathcal{S}$, $\mathcal{A}$, $\mathcal{R}$, etc.
- Lowercase letters represent deterministic values, such as $s$, $a$, $r$, etc.
Bellman Expectation Equation
Important