// cagent.h
// Copyright (C) 2003
// Gerhard Neumann (gerhard@igi.tu-graz.ac.at)

//                
// This file is part of RL Toolbox.
// http://www.igi.tugraz.at/ril_toolbox
//
// All rights reserved.
// 
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions
// are met:
// 1. Redistributions of source code must retain the above copyright
//    notice, this list of conditions and the following disclaimer.
// 2. Redistributions in binary form must reproduce the above copyright
//    notice, this list of conditions and the following disclaimer in the
//    documentation and/or other materials provided with the distribution.
// 3. The name of the author may not be used to endorse or promote products
//    derived from this software without specific prior written permission.
// 
// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
// IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
// OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
// IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
// INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
// NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
// THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

#ifndef CAGENT_H
#define CAGENT_H

#include "cenvironmentmodel.h"
#include "cagentlistener.h"
#include "cagentcontroller.h"
#include "caction.h"
#include "cstate.h"
#include "chierarchiccontroller.h"
#include "cstatecollection.h"
#include "ril_debug.h"

#include <list>

class CEpisode;

/// Class for sending the State-Action-State Tuple to the Listeners
/** Maintains a List of CSemiMDPListeners. The class provides methods for telling the Listeners that a new
Episode has started, or that a new Step (State-Action-State Tuple) or an intermediate Step has occurred.
@see CSemiMDPListener
*/
class CSemiMDPSender
{
protected:
	std::list<CSemiMDPListener *> *SMDPListeners;
public:
	CSemiMDPSender();
	virtual ~CSemiMDPSender();

/// Add a Listener to the Listener-List
	void addSemiMDPListener(CSemiMDPListener *listener);
/// Remove a Listener from the Listener-List
	void removeSemiMDPListener(CSemiMDPListener *listener);

/// Returns true if the Listener is in the Listener-List
	bool isListenerAdded(CSemiMDPListener *listener);

/// Tells all Listeners that a new Episode has started
	virtual void startNewEpisode();
/// Sends the State-Action-State Tuple to all Listeners
	virtual void sendNextStep(CStateCollection *lastState, CAction *Action,  CStateCollection *currentState);
/// Sends the State-Action-State Tuple to all Listeners, indicating that it was an intermediate step
	virtual void sendIntermediateStep(CStateCollection *lastState, CAction *Action, CStateCollection *currentState);
};
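
/*
The sender/listener relationship described above can be illustrated with a minimal,
self-contained sketch. It does not use the toolbox types: State, Action, Listener, Sender and
Recorder are hypothetical stand-ins (the real CSemiMDPListener interface is declared in
cagentlistener.h), and skipping duplicate registrations is an assumption, not documented
behaviour of addSemiMDPListener.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-ins for CStateCollection and CAction.
struct State { int id; };
struct Action { std::string name; };

// Hypothetical listener interface (not the toolbox's CSemiMDPListener).
class Listener {
public:
	virtual ~Listener() {}
	virtual void nextStep(State *oldState, Action *action, State *newState) = 0;
	virtual void newEpisode() {}
};

// Forwards every event to each registered listener, in registration order.
class Sender {
	std::list<Listener *> listeners;
public:
	bool isListenerAdded(Listener *l) const {
		return std::find(listeners.begin(), listeners.end(), l) != listeners.end();
	}
	// Assumption: a listener that is already registered is not added twice.
	void addListener(Listener *l) { if (!isListenerAdded(l)) listeners.push_back(l); }
	void removeListener(Listener *l) { listeners.remove(l); }
	void startNewEpisode() { for (Listener *l : listeners) l->newEpisode(); }
	void sendNextStep(State *s, Action *a, State *s2) {
		for (Listener *l : listeners) l->nextStep(s, a, s2);
	}
};

// Example listener that records each tuple as "old-action-new".
class Recorder : public Listener {
public:
	std::vector<std::string> log;
	int episodes = 0;
	void nextStep(State *oldState, Action *action, State *newState) override {
		log.push_back(std::to_string(oldState->id) + "-" + action->name + "-"
			+ std::to_string(newState->id));
	}
	void newEpisode() override { ++episodes; }
};
```

A learner would take the place of Recorder: it registers once with the sender and then sees
every transition without the sender knowing anything about the learning algorithm.
*/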

/// Class for providing the general Functions for the learning Environment
/**
CSemiMarkovDecisionProcess is the super class of all "acting" agents. It maintains a list of available Actions for the 
SMDP. It also logs the number of Episodes and the number of Steps done in the current Episode. 
It provides the functionality for sending a Semi-Markov Step, but it is only able to send PrimitiveActions. 
For extended Actions you have to use the hierarchical SMDP (CHierarchicalSemiMarkovDecisionProcess).
\par
The class is a subclass of CDeterministicController in order to make any Controller assigned to the SMDP a deterministic Controller. The CDeterministicController Object is always the first Object to be informed about a new Step, and it is not in the Listener-List (this avoids recursion).
This feature is needed for learning the exact Policy of the agent (see CSarsaLearner), so the agent can be used as estimation policy.
@see CAgent
@see CHierarchicalSemiMarkovDecisionProcess
*/
class CSemiMarkovDecisionProcess : public CDeterministicController, public CSemiMDPSender
{
protected:

	CAction *lastAction;
	
	int currentEpisodeNumber;
	int currentSteps;

	int totalSteps;

	bool isFirstStep;


public:

	CSemiMarkovDecisionProcess();
	~CSemiMarkovDecisionProcess();

/// Sends the next Step to all Listeners, i.e. when the Action is a finished MultiStepAction.
	virtual void sendNextStep(CStateCollection *lastState, CAction *action, CStateCollection *currentState);

/// Returns the last Action sent to all Listeners
	CAction* getLastAction();

/// Tells all Listeners that a new Episode has started and updates currentEpisodeNumber.
	virtual void startNewEpisode();

/// Returns the number of the current Episode.
	int getCurrentEpisodeNumber() {return this->currentEpisodeNumber;};
/// Returns the number of Steps done in the current Episode.
	int getCurrentStep() {return this->currentSteps;};

/// Returns the total number of Steps counted by the SMDP.
	int getTotalSteps() {return this->totalSteps;};

/// Adds an Action to the ActionSet of the SMDP. 
	virtual void addAction(CAction *action);
};
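
/*
The bookkeeping behind currentEpisodeNumber, currentSteps and totalSteps might look like the
sketch below. This is an assumption about the intended behaviour (the per-episode counter is
reset when a new Episode starts, the total keeps accumulating); the actual logic lives in the
toolbox's implementation file, which is not shown here. StepCounter is a hypothetical name.

```cpp
// Hypothetical counter class mirroring the three counters of the SMDP.
class StepCounter {
	int currentEpisodeNumber = 0;
	int currentSteps = 0;
	int totalSteps = 0;
public:
	// Assumption: a new episode increments the episode number and resets the per-episode counter.
	void startNewEpisode() { ++currentEpisodeNumber; currentSteps = 0; }
	// Assumption: every step that is sent counts once per episode and once in total.
	void sendNextStep() { ++currentSteps; ++totalSteps; }

	int getCurrentEpisodeNumber() const { return currentEpisodeNumber; }
	int getCurrentStep() const { return currentSteps; }
	int getTotalSteps() const { return totalSteps; }
};
```

Under these assumptions getCurrentStep() answers "how far into this Episode are we", while
getTotalSteps() is the quantity to use for schedules that span Episodes.
*/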

/// Subclass of CSemiMarkovDecisionProcess, used for hierarchical Learning
/**
This abstract class provides full hierarchical learning functionality. It implements the CHierarchicalStackListener interface, 
so the HierarchicalController can inform the SMDP about a hierarchical Step. Then the SMDP sends the State-Action-State Tuple
with the action done by the specific hierarchical SMDP. \par
In order to provide hierarchical functionality the class also represents an ExtendedAction, so it can be used as an Action for another hierarchical SMDP. 
It can't be used as an action for the agent; you have to use CHierarchicalController to create the hierarchical structure.
Use the hierarchical controller as controller for the agent; the agent itself doesn't know anything about the hierarchical structure of the learning Problem. 
The class is abstract because the isFinished Method from CMultiStepAction remains to be implemented.
@see CHierarchicalController
*/
class CHierarchicalSemiMarkovDecisionProcess : public CSemiMarkovDecisionProcess, public CHierarchicalStackListener, public CExtendedAction, public CStateModifiersObject
{
protected:
/// Returns the action done by the SMDP 
	virtual CAction *getExecutedAction(CHierarchicalStack *actionStack);
	
	/** Pointer to the currentEpisode. The currentEpisode must be updated before the sendNextStep method is called, so currentEpisode 
	has to be the first Element of the agent's Listener-List. Needed for determining the intermediate Steps.
	*/
	CEpisode *currentEpisode;

	CStateCollectionImpl *pastState;
	CStateCollectionImpl *currentState;
public:
/**
Creates a new hierarchical SMDP. The episode is needed for reconstruction of the intermediate and hierarchical steps.
 @param currentEpisode Pointer to the current Episode. It is recommended to use the currentEpisode Object of the Agent.
*/
	CHierarchicalSemiMarkovDecisionProcess(CEpisode *currentEpisode);
	CHierarchicalSemiMarkovDecisionProcess(CStateProperties *modelProperties, std::list<CStateModifier *> *modifiers = NULL);

	~CHierarchicalSemiMarkovDecisionProcess();

	virtual void setLoggedEpisode(CEpisode *loggedEpisode);

	virtual void nextStep(CStateCollection *oldState, CHierarchicalStack *actionStack, CStateCollection *newState);
	virtual void newEpisode();

/// Sends the nextStep to the listeners
/**
If the action is an extended action, all intermediate steps and the real step itself are recovered from the Episode object
and sent to the listeners (intermediate Steps are sent with the "intermediateStep" method). If the action is not an extended action,
the sendNextStep method of the super class gets called.
*/
	virtual void sendNextStep(CAction *action);

	virtual bool isFinished(CStateCollection *oldState, CStateCollection *newState) {return false;};

	virtual CAction *getNextHierarchyLevel(CStateCollection *stateCollection, CActionDataSet *actionDataSet = NULL);

	/// Add a state Modifier to the StateCollections
	virtual void addStateModifier(CStateModifier *modifier);
	/// remove a state Modifier from the StateCollections
	virtual void removeStateModifier(CStateModifier *modifier);

};

/// The class represents the main acting Object of the Learning System, the agent
/** The agent is the object which acts within its environment and sends every step to its 
SemiMDPListeners. It is the only "acting" object, so it is the most important part of the toolbox. The Agent follows the Policy set by setController(CAgentController *). It saves the
currentState, then tells the model which action to execute and then saves the new state. Having done that the agent is able
to send the State-Action-State tuple to all its Listeners. \par
The agent's action set can only contain PrimitiveActions. The agent has an agent controller which can choose from the actions in the agent's action set. A controller is not allowed to return an action which isn't in the agent's action set.
ExtendedActions can only be added to the CHierarchicalSemiMarkovDecisionProcess class.
\par
Another important functionality of the agent are the StateModifiers which can be added to the agent. 
The StateModifier is then added to the state collections (currentState, lastState) of the agent. 
If you add a StateModifier to the agent, the modified state is calculated by the StateModifier after the model state has changed, and is added to the state collection.
The modified State, which can be a discrete, a feature or any other State calculated from the original model state, is then 
available to all Listeners, and it only gets calculated once. So the Listeners have access to several different kinds of states.
\par
For the execution of actions you have several possibilities. The agent can execute:
- a given action (doAction), 
- a single action from the controller (doControllerStep) 
- or one or more Episodes following the Policy from the Controller (doControllerEpisode). You can specify the maximum number of steps per episode.

The agent also logs the current episode. You need the current Episode for the hierarchical SMDPs; they need 
an instance of the Episode to reconstruct the intermediate steps. This feature can be turned off with setLogEpisode(bool) for performance reasons if it isn't needed.
@see CSemiMDPListener
@see CHierarchicalSemiMarkovDecisionProcess
@see CSemiMarkovDecisionProcess
*/
class CAgent : public CSemiMarkovDecisionProcess, public CStateModifiersObject
{
protected:
	CStateCollectionImpl *currentState;
	CStateCollectionImpl *lastState;
	
	int maxEpisodes;
	int maxSteps;

	bool keyboardBreaks;

	CEnvironmentModel *model;

	bool bLogEpisode;

	int doRun(bool bContinue);

	CEpisode *currentEpisode;
public:

	CAgent(CEnvironmentModel *model);
	~CAgent();

/// Execute the action and send the State-Action-State tuple
	virtual void doAction(CAction *action);
/// Add a state Modifier to the StateCollections
	virtual void addStateModifier(CStateModifier *modifier);
/// remove a state Modifier from the StateCollections
	virtual void removeStateModifier(CStateModifier *modifier);



/// Executes maxEpisodes Episodes; if an episode reaches maxSteps, a new episode is started automatically.
/// Returns -1 if training has been paused by a keystroke.
/// Call doResume() to continue training.
	int doControllerEpisode(int maxEpisodes = 1, int maxSteps = 5000);
/// Set the Training Parameters, called by doControllerEpisode	
	void setParameters(int maxEpisodes, int maxSteps);
/// Resume the Training if it was paused (e.g. by a keystroke)	
/// Returns -1 if training has been paused by a keystroke.
/// Call doResume() to continue training.
	int doResume();

/// Tells all Listeners that a new Episode has started and resets the model
	virtual void startNewEpisode();

/// Gets action from the controller and executes it
	/** 
	Be aware that you will get an assertion failure if the agent controller isn't set properly!
	*/
	void doControllerStep();

/// Sets whether the training can be paused by a keystroke. 
	void setKeyboardBreak(bool keyboardBreak);
	bool getKeyboardBreak();

/// Adds a primitive action to the agent's action list. The agent can only choose from these actions.
	virtual void addAction(CPrimitiveAction *action);

/// Sets whether the currentEpisode should be logged or not.
	virtual void setLogEpisode(bool bLogEpisode);

/// Returns the currentEpisode Object (only valid if bLogEpisode = true).
	virtual CEpisode *getCurrentEpisode();

	virtual CStateCollection *getCurrentState();

	CEnvironmentModel *getEnvironmentModel();
};
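
/*
The agent loop described in the class documentation (save the current state, let the model
execute the action, send the State-Action-State tuple, and cut an Episode off after maxSteps)
can be sketched without the toolbox as follows. ChainModel, Transition and SketchAgent are
hypothetical stand-ins; the fixed "step right" action replaces a real CAgentController, and
returning the number of executed steps is a choice made for this sketch only, not the
documented return value of CAgent::doControllerEpisode.

```cpp
#include <vector>

// Hypothetical environment: positions 0..goal on a chain; the episode ends at the goal.
class ChainModel {
	int position = 0;
	int goal;
public:
	explicit ChainModel(int goal) : goal(goal) {}
	int getState() const { return position; }
	void nextState(int action) { position += action; if (position < 0) position = 0; }
	bool isReset() const { return position >= goal; }
	void resetModel() { position = 0; }
};

// One State-Action-State tuple as a listener would receive it.
struct Transition { int lastState; int action; int currentState; };

// Hypothetical agent that records the tuples instead of sending them to listeners.
class SketchAgent {
	ChainModel *model;
public:
	std::vector<Transition> sent;
	int episodesStarted = 0;
	explicit SketchAgent(ChainModel *model) : model(model) {}

	// Resets the model and counts the new episode.
	void startNewEpisode() { model->resetModel(); ++episodesStarted; }

	// Save the current state, let the model execute the action, then emit the tuple.
	void doAction(int action) {
		int last = model->getState();
		model->nextState(action);
		sent.push_back({last, action, model->getState()});
	}

	// Runs maxEpisodes episodes, each cut off after maxSteps steps or at the goal.
	int doControllerEpisode(int maxEpisodes = 1, int maxSteps = 5000) {
		int steps = 0;
		for (int e = 0; e < maxEpisodes; ++e) {
			startNewEpisode();
			for (int s = 0; s < maxSteps && !model->isReset(); ++s) {
				doAction(+1); // stand-in for asking the controller for an action
				++steps;
			}
		}
		return steps;
	}
};
```

With a goal at position 3, doControllerEpisode(2, 2) in this sketch runs two truncated
Episodes of two steps each, which mirrors how maxSteps forces a new Episode to start.
*/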


#endif
