Python: How to train an RL agent over multiple episodes

Posted by f4t66c6m on 2022-12-21 in Python


How can I create an RL agent that has to run over 1000 different episodes of 200 timesteps each? I am using gym-anytrading and stable-baselines3.

python

Source: https://stackoverflow.com/questions/74735313/how-to-train-a-rl-agent-in-multiple-episodes

2 answers


isr3a4wc1#

You can also build your maximum step count into the done flag inside the environment's step method.

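# Define your env and model above (model is assumed to be a Stable-Baselines3 model).
# This loop assumes the older Gym API, where env.step returns four values.
episodes = 1000
for ep in range(1, episodes + 1):
    state = env.reset()
    done = False
    score = 0
    step = 0

    # Run until the environment reports done or the 200-step cap is reached.
    while step < 200 and not done:
        # model.predict returns (action, next_state); keep only the action.
        action, _ = model.predict(state)
        state, reward, done, _ = env.step(action)
        score += reward
        step += 1
    print(f"Episode {ep} finished at step {step} with a score of {score}")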
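The first sentence of this answer suggests moving the step cap into the environment's own step method, while the loop above enforces it from outside. A minimal sketch of the in-environment variant follows. `ToyEnv`, `StepLimit`, and `run_episodes` are hypothetical names invented for illustration, not part of gym-anytrading or stable-baselines3, and the sketch assumes the older four-value step API. With a real Gym environment, `gym.wrappers.TimeLimit(env, max_episode_steps=200)` serves the same purpose.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment using the older 4-tuple step API."""

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = random.uniform(-1.0, 1.0)  # dummy reward
        done = False  # this toy environment never ends on its own
        return float(self.t), reward, done, {}


class StepLimit:
    """Wraps an env so that `done` becomes True after `max_steps` steps."""

    def __init__(self, env, max_steps):
        self.env = env
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.steps += 1
        if self.steps >= self.max_steps:
            done = True  # the cap now lives inside step(), not in the loop
        return obs, reward, done, info


def run_episodes(env, episodes):
    """Runs `episodes` episodes with a fixed action and returns their lengths."""
    lengths = []
    for _ in range(episodes):
        env.reset()
        done = False
        steps = 0
        while not done:
            _, _, done, _ = env.step(0)  # 0 is a placeholder action
            steps += 1
        lengths.append(steps)
    return lengths


lengths = run_episodes(StepLimit(ToyEnv(), max_steps=200), episodes=5)
print(lengths)
```

Because the cap is part of the environment, any loop that respects `done` stops at 200 steps without its own counter. For training rather than evaluation, 1000 episodes of 200 steps correspond to a budget of 200,000 timesteps.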

2022-12-21

qpgpyjmq2#

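I can't give an exact code example because I can't see your code, but I can describe the approach I used in my own project. After each step, you can check whether the agent is in a terminal state, meaning it has reached its goal. You can also count the steps and check whether they exceed your threshold. This is how I do it in my code: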

while episode_counter < training_episode:
    # Initialize your agents here.
    while not (agent.is_terminal() or otherAgent.is_terminal() or anotherAgent.is_terminal()):
        # Each agent executes one step here.
        ...
    episode_counter += 1  # move on to the next episode

Hope this helps. My code is for a multi-agent setup, but you can use the same approach for a single-agent environment.
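For the single-agent case, the same idea can be sketched as follows. This is only an illustration under stated assumptions: `CountingAgent`, `execute_step`, and `train` are hypothetical names, and the agent's "goal" is simply a step count so the example runs on its own. The inner loop combines the terminal-state check with the step threshold described above.

```python
class CountingAgent:
    """Hypothetical agent that reaches its goal after `goal` steps."""

    def __init__(self, goal):
        self.goal = goal
        self.position = 0

    def is_terminal(self):
        return self.position >= self.goal

    def execute_step(self):
        self.position += 1


def train(training_episodes, max_steps, goal):
    """Returns the number of steps taken in each episode."""
    steps_per_episode = []
    episode_counter = 0
    while episode_counter < training_episodes:
        agent = CountingAgent(goal)  # re-create the agent for each episode
        step_counter = 0
        # Stop when the agent reaches its goal or the step threshold is hit.
        while not agent.is_terminal() and step_counter < max_steps:
            agent.execute_step()
            step_counter += 1
        steps_per_episode.append(step_counter)
        episode_counter += 1
    return steps_per_episode


capped = train(training_episodes=3, max_steps=200, goal=500)
early = train(training_episodes=3, max_steps=200, goal=50)
print(capped, early)
```

An episode ends on whichever condition is met first, so a goal that is out of reach is cut off at the threshold, while a reachable goal ends the episode early.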

2022-12-21
