<html><head><title>Generated Documentation</title></head><body> <image src="headerimage.png"> <br><br><table><tr><td><big><big><big style="font-family: arial;"><b>GVSM</b></big></big></big><br><br></td><td> The Vector Space Model for information retrieval represents each term in a text document as a weight value in an orthogonal dimension. This class is designed to be used in two passes: first, build up the vocabulary by feeding it a representative collection of documents; then obtain the vector for each document.</td></tr></table><br><br><big><big><i>Statics (public)</i></big></big><br><div style="margin-left: 40px;">void <big><b>ExtractWords</b></big>(const char* pFile, int nSize, ^R|,*dPV pProcessWordFunc, void* pThis)<br><div style="margin-left: 80px;"><font color=brown> Parses all the words in the file and calls pProcessWordFunc for each one</font></div><br></div><br><big><big><i>Constructors (public)</i></big></big><br><div style="margin-left: 40px;"><big><b>GVSM</b></big>()<br></div><br><big><big><i>Destructors</i></big></big><br><div style="margin-left: 40px;"><big><b>~GVSM</b></big>()<br></div><br><big><big><i>Public</i></big></big><br><div style="margin-left: 40px;">void <big><b>AddDocumentToVocabulary</b></big>(const char* szText, int nLength)<br><div style="margin-left: 80px;"><font color=brown> Extracts all the words from the document and adds all the non-stop-words to the vocabulary</font></div><br>void <big><b>AddWordToVector</b></big>(const char* szWord, int nLen)<br><div style="margin-left: 80px;"><font color=brown> Internal helper method. Do not call it directly.</font></div><br>void <big><b>AddWordToVocabulary</b></big>(const char* szWord, int nLen)<br><div style="margin-left: 80px;"><font color=brown> Internal helper method. Do not call it directly.</font></div><br>int <big><b>FindStemIndex</b></big>(const char* szStem)<br><div style="margin-left: 80px;"><font color=brown> Returns the index of the stem word. Returns -1 if it is not in the vocabulary.</font></div><br>int <big><b>GetMaxWordFrequency</b></big>(int nIndex)<br><div style="margin-left: 80px;"><font color=brown> Returns the maximum number of occurrences of a word in any training document</font></div><br>int <big><b>GetNumberOfDocsContainingWord</b></big>(int nIndex)<br><div style="margin-left: 80px;"><font color=brown> Returns the number of training documents which contain a word</font></div><br>int <big><b>GetTrainingDocumentCount</b></big>()<br><div style="margin-left: 80px;"><font color=brown> Returns the number of documents used to train the vocabulary</font></div><br>void <big><b>GetVector</b></big>(double* pOutVector, const char* szText, int nLength)<br><div style="margin-left: 80px;"><font color=brown> Obtains the vector for the document in szText. pOutVector should be an array of n doubles where n = GetVocabSize().</font></div><br>int <big><b>GetVocabSize</b></big>()<br><div style="margin-left: 80px;"><font color=brown> Returns the number of words in the vocabulary</font></div><br>char* <big><b>GetVocabWord</b></big>(const int* nIndex)<br><div style="margin-left: 80px;"><font color=brown> Returns a word in the vocabulary</font></div><br></div><br></body></html>
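The two-pass workflow described in the class summary (build a vocabulary from training documents, then obtain a vector per document) can be illustrated with a small self-contained sketch. This is not the GVSM implementation: the class name TinyVsm, the use of std::string arguments, the letter-based word splitting, and the weighting (term count times log(N / df)) are all assumptions made for illustration. Stop-word removal and stemming, which the reference above mentions, are omitted.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the two-pass workflow; NOT the real GVSM class.
class TinyVsm {
public:
    // Pass 1: register every distinct word of one training document.
    void AddDocumentToVocabulary(const std::string& text) {
        std::set<std::string> seen;
        for (const std::string& w : Split(text)) seen.insert(w);
        for (const std::string& w : seen) {
            auto it = index_.find(w);
            if (it == index_.end()) {
                // New word: give it the next free dimension.
                index_.emplace(w, static_cast<int>(docCount_.size()));
                docCount_.push_back(1);
            } else {
                ++docCount_[it->second];  // one more document contains it
            }
        }
        ++trainingDocs_;
    }

    int GetVocabSize() const { return static_cast<int>(docCount_.size()); }

    // Pass 2: one weight per vocabulary word.
    // Assumed weighting: term count * log(N / document frequency).
    std::vector<double> GetVector(const std::string& text) const {
        std::vector<double> v(docCount_.size(), 0.0);
        for (const std::string& w : Split(text)) {
            auto it = index_.find(w);
            if (it != index_.end()) v[it->second] += 1.0;  // unknown words are skipped
        }
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] *= std::log(static_cast<double>(trainingDocs_) / docCount_[i]);
        return v;
    }

private:
    // Splits on non-letters and lower-cases each word (an assumed tokenizer).
    static std::vector<std::string> Split(const std::string& text) {
        std::vector<std::string> words;
        std::string cur;
        for (unsigned char c : text) {
            if (std::isalpha(c)) {
                cur += static_cast<char>(std::tolower(c));
            } else if (!cur.empty()) {
                words.push_back(cur);
                cur.clear();
            }
        }
        if (!cur.empty()) words.push_back(cur);
        return words;
    }

    std::map<std::string, int> index_;  // word -> dimension index
    std::vector<int> docCount_;         // documents containing each word
    int trainingDocs_ = 0;              // number of training documents
};
```

In this sketch, all training documents must be added before any call to GetVector, because both the vector length and the weights depend on the final vocabulary. A word that appears in every training document receives a weight of zero, since log(N / N) = 0.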