

i_QLearner_id.java

A multi-robot simulation platform
Java
Page 1 of 2
/**
 * i_QLearner_id.java
 */

package EDU.gatech.cc.is.learning;

import java.io.*;
import java.util.*;

/**
 * An object that learns to select from several actions based on
 * a reward.  Uses the Q-learning method as defined by Watkins.
 * <P>
 * The module will learn to select a discrete output based on
 * state and a continuous reinforcement input.  The "i"s in front
 * of and behind the name imply that this class takes integers as
 * input and output.  The "d" indicates a double for the reinforcement
 * input (i.e. a continuous value).
 * <P>
 * Copyright (c)2000 Tucker Balch
 *
 * @author Tucker Balch (tucker@cc.gatech.edu)
 * @version $Revision: 1.1 $
 */
public class i_QLearner_id extends i_ReinforcementLearner_id
    implements Cloneable, Serializable
    {
    /**
     * Used to indicate the learner uses average rewards.
     */
    public static final int AVERAGE = 0;

    /**
     * Used to indicate the learner uses discounted rewards.
     */
    public static final int DISCOUNTED = 1;

    private int     criteria = DISCOUNTED;  // assume discounted rewards
    private double  q[][];                  // the q-values
    private double  p[][];                  // count of times in each
                                            // state/action
    private double  profile[][];            // count of times in each
                                            // state/action for this trial
    private int     last_policy[];          // used to count changes in
                                            // policy
    private int     changes = 0;            // used to count changes
                                            // in policy per trial
    private int     queries = 0;            // queries per trial
    private double  total_reward = 0;       // reward over trial
    private int     first_of_trial = 1;     // indicates if first time
    private double  gamma = 0.8;            // discount rate
    private double  alpha = 0.2;            // learning rate
    private double  randomrate = 0.1;       // frequency of random actions
    private double  randomratedecay = 0.99; // decay rate of random actions
    private Random  rgen;                   // the random number generator
    private int     xn;                     // last state
    private int     an;                     // last action
    private long    seed = 0;               // random number seed
    private static final boolean DEBUG = false;

    /**
     * Instantiate a Q learner using default parameters.
     * Parameters may be adjusted using accessor methods.
     *
     * @param numstates  int, the number of states the system could be in.
     * @param numactions int, the number of actions or outputs to
     *                        select from.
     * @param criteria   int, should be DISCOUNTED or AVERAGE.
     * @param seed       long, the seed.
     */
    public i_QLearner_id(int numstatesin, int numactionsin, int criteriain,
        long seedin)
        {
        super(numstatesin, numactionsin);
        if ((criteriain != DISCOUNTED) && (criteriain != AVERAGE))
            {
            System.out.println("i_QLearner_id: invalid criteria");
            criteria = DISCOUNTED;
            }
        else
            criteria = criteriain;
        if (criteria == DISCOUNTED)
            System.out.println("i_QLearner_id: DISCOUNTED");
        else
            System.out.println("i_QLearner_id: AVERAGE");
        seed = seedin;
        rgen = new Random(seed);
        q = new double[numstates][numactions];
        profile = new double[numstates][numactions];
        p = new double[numstates][numactions];
        last_policy = new int[numstates];
        for (int i = 0; i < numstates; i++)
            {
            for (int j = 0; j < numactions; j++)
                {
                q[i][j] = rgen.nextDouble()*2 - 1;
                p[i][j] = 0;
                profile[i][j] = 0;
                }
            last_policy[i] = 0;
            }
        xn = an = 0;
        }

    /**
     * Instantiate a Q learner using default parameters.
     * This version assumes you will use a seed of 0.
     * Parameters may be adjusted using accessor methods.
     *
     * @param numstates  int, the number of states the system could be in.
     * @param numactions int, the number of actions or outputs to
     *                        select from.
     * @param criteria   int, should be DISCOUNTED or AVERAGE.
     */
    public i_QLearner_id(int numstatesin, int numactionsin, int criteriain)
        {
        super(numstatesin, numactionsin);
        if ((criteriain != DISCOUNTED) && (criteriain != AVERAGE))
            {
            System.out.println("i_QLearner_id: invalid criteria");
            criteria = DISCOUNTED;
            }
        else
            criteria = criteriain;
        if (criteria == DISCOUNTED)
            System.out.println("i_QLearner_id: DISCOUNTED");
        else
            System.out.println("i_QLearner_id: AVERAGE");
        rgen = new Random(seed);
        q = new double[numstates][numactions];
        profile = new double[numstates][numactions];
        p = new double[numstates][numactions];
        last_policy = new int[numstates];
        for (int i = 0; i < numstates; i++)
            {
            for (int j = 0; j < numactions; j++)
                {
                q[i][j] = rgen.nextDouble()*2 - 1;
                p[i][j] = 0;
                profile[i][j] = 0;
                }
            last_policy[i] = 0;
            }
        xn = an = 0;
        }

    /**
     * Instantiate a Q learner using default parameters.
     * This version assumes you will use discounted rewards.
     * Parameters may be adjusted using accessor methods.
     *
     * @param numstates  int, the number of states the system could be in.
     * @param numactions int, the number of actions or outputs to
     *                        select from.
     */
    public i_QLearner_id(int numstatesin, int numactionsin)
        {
        super(numstatesin, numactionsin);
        System.out.println("i_QLearner_id: DISCOUNTED");
        criteria = DISCOUNTED;
        rgen = new Random(seed);
        q = new double[numstates][numactions];
        profile = new double[numstates][numactions];
        p = new double[numstates][numactions];
        last_policy = new int[numstates];
        for (int i = 0; i < numstates; i++)
            {
            for (int j = 0; j < numactions; j++)
                {
                q[i][j] = rgen.nextDouble()*2 - 1;
                p[i][j] = 0;
                profile[i][j] = 0;
                }
            last_policy[i] = 0;
            }
        xn = an = 0;
        }

    /**
     * Set gamma for the Q-learner.
     * This is the discount rate; 0.8 is a typical value.
     * It should be between 0 and 1.
     *
     * @param g double, the new value for gamma (0 < g < 1).
     */
    public void setGamma(double g)
        {
        if ((g < 0) || (g > 1))
            {
            System.out.println("i_QLearner_id.setGamma: illegal value");
            return;
            }
        gamma = g;
        }

    /**
     * Set alpha for the Q-learner.
     * This reflects how quickly it should learn.
     * Alpha should be between 0 and 1.
     *
     * @param a double, the new value for alpha (0 < a < 1).
     */
    public void setAlpha(double a)
        {
        alpha = a;
        }

    /**
     * Set the random rate for the Q-learner.
     * This reflects how frequently it picks a random action.
     * Should be between 0 and 1.
     *
     * @param r double, the new value for random rate (0 < r < 1).
     */
    public void setRandomRate(double r)
        {
        randomrate = r;
        }

    /**
     * Set the random decay for the Q-learner.
     * This reflects how quickly the rate of choosing random actions
     * decays.  1 would never decay; 0 would cause it to immediately
     * quit choosing random values.
     * Should be between 0 and 1.
     *
     * @param r double, the new value for randomdecay (0 < r < 1).
     */
    public void setRandomRateDecay(double r)
        {
        randomratedecay = r;
        }

    /**
     * Generate a String that describes the current state of the
     * learner.
     *
     * @return a String describing the learner.
     */
    public String toString()
        {
        int i, j;
        String retval = super.toString();
        retval = retval + "type = i_QLearner_id alpha = " + alpha
            + " gamma = " + gamma + "\n";
        for (i = 0; i < numstates; i++)
            {
            for (j = 0; j < numactions; j++)
                {
                retval = retval + q[i][j] + "   ";
                }
            if (i < (numstates - 1)) retval += "\n";
            }
        return retval;
        }

    /**
     * Select an output based on the state and reward.
     *
     * @param statein  int,    the current state.
     * @param rewardin double, reward for the last output, positive
     *                         numbers are "good."
     */
    public int query(int yn, double rn)
        {
        //System.out.println("state "+yn+" reward "+rn);
        total_reward += rn;
        queries++;

        // yn is present state, rn is present reward
        double  pick;
        int     action;

        if (yn > (numstates - 1)) // very bad
            {
            System.out.println("i_QLearner_id.query: state " + yn
                + " is out of range.");
            return 0;
            }

        /*
         * Find approximate value of present state, and best action,
         * ie: max q[yn][i] over all i; i is the best action.
         */
        double  Vn = -9999999999f;  // very bad
        action = 0;
        for (int i = 0; i < numactions; i++)
            {
            if (q[yn][i] > Vn)
                {
                Vn = q[yn][i];
                action = i;
                }
            }

        /*
         * Now update according to Watkins' iteration:
         */
        if (first_of_trial != 1)
            {
            if (DEBUG) System.out.println(
                "xn =" + xn + " an =" + an + " rn=" + rn);
            if (criteria == DISCOUNTED)
                {
                // Watkins update rule:
                q[xn][an] = (1 - alpha)*q[xn][an] +
                    alpha*(rn + gamma*Vn);
                }
            else // criteria == AVERAGE
                {
                // Average update rule
                q[xn][an] = (p[xn][an]*q[xn][an] + rn + Vn)/
                    (p[xn][an] + 2);
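The listing stops at the page break, so only the constructors, the parameter setters, toString(), and the first half of query() are visible here. The sketch below is a minimal, hypothetical driver for the class exactly as it appears on this page: it assumes the TeamBots EDU.gatech.cc.is.learning package is on the classpath, and the QLearnerDemo class, the 5-state corridor environment, and all parameter values are illustrative inventions, not part of the original file.

import EDU.gatech.cc.is.learning.i_QLearner_id;

/**
 * Hypothetical driver: a 5-state corridor where action 1 moves right,
 * action 0 moves left, and reaching the rightmost state pays reward 1.
 */
public class QLearnerDemo
    {
    public static void main(String[] args)
        {
        int numstates = 5;
        int numactions = 2;

        // DISCOUNTED criterion with a fixed seed, then tune the
        // public parameters through the setters shown above.
        i_QLearner_id learner = new i_QLearner_id(numstates, numactions,
            i_QLearner_id.DISCOUNTED, 42L);
        learner.setGamma(0.9);             // discount rate
        learner.setAlpha(0.2);             // learning rate
        learner.setRandomRate(0.1);        // exploration frequency
        learner.setRandomRateDecay(0.99);  // exploration decay

        int state = 0;
        double reward = 0;
        for (int step = 0; step < 1000; step++)
            {
            // query() consumes the reward earned by the previous action
            // and returns the action to take in the current state.
            int action = learner.query(state, reward);

            // Toy transition: 1 moves right, 0 moves left.
            if (action == 1)
                state = Math.min(state + 1, numstates - 1);
            else
                state = Math.max(state - 1, 0);

            // Reward only at the goal; restart the walk there.
            if (state == numstates - 1)
                {
                reward = 1.0;
                state = 0;
                }
            else
                reward = 0.0;
            }

        // Dump the learned Q-table via toString().
        System.out.println(learner);
        }
    }

Each query() call passes the reward for the previous action together with the new state, which is the interface the Watkins update in the listing expects. The per-trial fields (queries, total_reward, first_of_trial) suggest trial-boundary handling in the part of the file not shown on this page, so the sketch deliberately relies only on the constructor, setters, and query() visible above.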
