Added
A page outlining the notation used in Sutton’s RL Book.
Added a dynamic training feature, where the AIAgent:train_long_memory() code is run an increasing number of times at the end of each epoch as the total number of epochs increases.
Modified the train_long_memory() code.
Added a new DAIAgent constants file with MAX_DYNAMIC_TRAINING_LOOPS.
Added a Dynamic Training checkbox to the TUI settings.
Added dynamic_training() and epoch() methods to the AIAgent class.
Added a loops TUI element to show the current number of loops used during the long training phase.
Added a Dynamic Training section to the project’s website.
Added an Epsilon N feature in a new EpsilonAlgoN class.
Added a drop-down menu in the TUI to let the user choose between Epsilon and Epsilon N.
Added an Epsilon N section to the project’s website.
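The dynamic training schedule described above could look something like this minimal sketch. Only train_long_memory() and MAX_DYNAMIC_TRAINING_LOOPS come from the changelog; the ramp rate, default cap value, and helper names are assumptions for illustration:

```python
# Hypothetical sketch of the dynamic training schedule: run
# train_long_memory() more often as the epoch count grows, capped at
# MAX_DYNAMIC_TRAINING_LOOPS. The ramp of one extra loop per 100 epochs
# and the cap of 10 are assumed values, not the project's.
MAX_DYNAMIC_TRAINING_LOOPS = 10

def dynamic_training_loops(epoch: int, ramp: int = 100) -> int:
    """Return how many long-memory training passes to run this epoch."""
    # One extra loop every `ramp` epochs, capped at the maximum.
    return min(1 + epoch // ramp, MAX_DYNAMIC_TRAINING_LOOPS)

def end_of_epoch(agent, epoch: int) -> None:
    for _ in range(dynamic_training_loops(epoch)):
        agent.train_long_memory()
```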
Changed
Changed the default learning rate for the RNN.
Changed the default minimum Epsilon from 0.0 to 0.1.
Fixed
Passed the TUI minimum, initial, and decay Epsilon values from the settings to the EpsilonAlgo and EpsilonAlgoN classes in AISim:update_settings().
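The minimum, initial, and decay values wired through above suggest a schedule along these lines. This is only an illustrative sketch of how the three settings interact; the real EpsilonAlgo and EpsilonAlgoN classes may work differently, and the default values here are assumptions (apart from the 0.1 minimum noted above):

```python
# Hypothetical epsilon-greedy decay sketch using the minimum, initial,
# and decay settings passed in from the TUI. Multiplicative decay is an
# assumption; only the 0.1 default minimum comes from the changelog.
class EpsilonSketch:
    def __init__(self, initial=1.0, minimum=0.1, decay=0.995):
        self.epsilon = initial
        self.minimum = minimum
        self.decay = decay

    def step(self) -> float:
        """Decay epsilon multiplicatively, never dropping below the minimum."""
        self.epsilon = max(self.minimum, self.epsilon * self.decay)
        return self.epsilon
```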
[0.9.0] - 2025-10-13
Changed
Added large-number formatting to the stored games count and the highscores game numbers.
Added a Tabbed Plots widget.
Moved the Game Score plot widget into the new Tabbed Plots widget.
Added a Highscores plot.
Added a Loss plot.
Based on the shiny new Loss plot, tuned the learning rates for the linear and RNN models.
Added a Learning Rate input to the configuration settings.
Wired the sane default learning rate values (based on the model selection) so that the correct one is loaded when the Defaults button is pressed.
Modified the AITrainer to capture the loss at the end of each train step and return an average of these values at the end of an epoch with a new get_epoch_loss() function.
Added additional constants to the DLabels and Dlayouts files to support the new features.
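The per-epoch loss averaging described above could be sketched as follows. The changelog only names get_epoch_loss(); the internal buffer and the reset-on-read behavior are assumptions:

```python
# Hypothetical sketch of the AITrainer loss tracking: capture the loss
# at the end of each train step, then average them at the end of the
# epoch via get_epoch_loss(). Internals here are assumed, not the
# project's actual AITrainer code.
class LossTracker:
    def __init__(self):
        self._losses = []

    def record_step_loss(self, loss: float) -> None:
        self._losses.append(loss)

    def get_epoch_loss(self) -> float:
        """Return the mean loss over the epoch and reset for the next one."""
        if not self._losses:
            return 0.0
        avg = sum(self._losses) / len(self._losses)
        self._losses.clear()
        return avg
```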
Removed
Removed the LabPlot widget. It has been replaced by the new TabbedPlots widget.
[0.8.1] - 2025-10-13
Added
Show the number of frames for the long training phase when mem_type is random frames.
Fixed
Increased width so Random Frames isn’t truncated.
Fixed SQL bug in ReplayMemory:get_random_frames().
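A common way to pull random rows from a SQLite-backed replay memory, roughly what ReplayMemory:get_random_frames() might do, is shown below. The table name, column names, and schema here are assumptions, not the project's actual schema:

```python
import sqlite3

# Hypothetical sketch of fetching a random sample of frames from a
# SQLite table. The `frames` table and its columns are assumed names.
def get_random_frames(conn: sqlite3.Connection, n: int):
    cur = conn.execute(
        "SELECT state, action, reward, next_state, done "
        "FROM frames ORDER BY RANDOM() LIMIT ?",
        (n,),
    )
    return cur.fetchall()
```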
Changed
Changed the highscores layout so there’s a space between the scrollbar and the seconds.
Set the stored games back to zero when the restart button is pressed.
Clear the data from the games and frames SQLite3 tables when the restart button is pressed.
Format the number of stored games (add commas) when they are large.
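Adding thousands separators to large counts, as done here for the stored games number, is a one-liner with Python's format specification (the helper name is mine, for illustration):

```python
# Format large counts with comma thousands separators using Python's
# built-in format spec. `format_count` is an illustrative helper name.
def format_count(n: int) -> str:
    return f"{n:,}"
```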
[0.8.0] - 2025-10-13
Changed
Formatting improvement of the highscores.
Linear model (ModelL) changes:
Increased the learning rate.
Added an additional hidden layer.
Decreased the dropout value from 0.2 to 0.1.
[0.7.0] - 2025-10-13
Added
Added a Time column to the Highscores.
Added a screenshot of the TUI to the website.
Changed
Plotting Widget:
Renamed the plotting widget from Db4EPlot to LabPlot.
Removed the code that averaged results, since I’m using a sliding method instead.
Added a target model that keeps a frozen copy of the main network. This prevents the instability caused by chasing a moving target.
Added a soft update (τ=0.01) to blend weights in slowly. This smooths the learning curve.
Replaced MSE with Huber loss. It’s less sensitive to large TD errors (outliers).
Added gradient clipping to prevent exploding gradients.
Added an update frequency (target_update_freq=100) to sync the target network every ~100 frames.
Vectorized the data coming out of the ReplayMemory, leading to a HUGE improvement in speed.
Adjusted gradients on the batch, not single frames, for better learning and smoother gradients.
Returned the loss for future plotting.
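Three of the stabilization changes above (soft target updates, Huber loss, gradient clipping) can be sketched with plain scalars. The real trainer applies these to full weight tensors, so treat this as illustrative only; τ=0.01 comes from the changelog, while the clipping threshold and Huber delta are assumed values:

```python
# Illustrative scalar sketches of the stabilization tricks above; not
# the project's trainer code. τ=0.01 is from the changelog; CLIP and
# the Huber delta are assumed values.
TAU = 0.01   # soft-update blend factor
CLIP = 1.0   # assumed gradient-clipping threshold

def soft_update(target_w, main_w, tau=TAU):
    """Blend main-network weights into the frozen target network slowly."""
    return [(1 - tau) * t + tau * m for t, m in zip(target_w, main_w)]

def huber_loss(error, delta=1.0):
    """Quadratic near zero, linear for large TD errors (outliers)."""
    a = abs(error)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)

def clip_gradient(g, clip=CLIP):
    """Clamp a gradient component to prevent exploding gradients."""
    return max(-clip, min(clip, g))
```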
Fixed
Cleared the Plot Widget when the Restart button is pressed.
Reset the neural network model’s learned weights when the Restart button is pressed.