Windows Help File Format / Annotation File Format / SHG and MRB File Format
This documentation describes the file formats parsed by HELPDECO. Microsoft
did not publish the file formats used by WinHelp and the MultiMedia Viewers
and created by HC30, HC31, HCP, HCRTF, HCW, MVC, MMVC and WMVC, so this is
not an official reference but the result of many weekends of work dumping
500+ help files and trying to understand what all the bytes may mean.
I would like to thank Pete Davis, who first tried to describe 'The Windows
Help File Format' in Dr. Dobb's Journal, Sep/Oct 1993; Holger Haase, who
did a lot of work on picture file formats; and Bent Lynggaard for the
information on free lists in help files and unused bytes in B+ trees.
Revision 1: Fixed hash value calculation and |FONT, minor additions
Revision 2: Transparent bitmaps, {button}, and {mci} commands
Revision 3: Unknown in Paragraphinfo changed, minor additions
Revision 4: CTXOMAP corrected, bitmap dimensions dpi - not PelsPerMeter
Revision 5: MacroData in HotspotInfo added, Annotation file format added
Revision 6: [MACROS] section / internal file |Rose added, MVB font structure
Revision 7: [GROUPS] section *.GRP and [CHARTAB] section *.tbl file format
Revision 8: free list, clarified TOPICPOS/TOPICOFFSET
Revision 9: B+ tree unused bytes and what I found out about GID files
A help file starts with a header, the only structure at a fixed place
long Magic 0x00035F3F
long DirectoryStart offset of FILEHEADER of internal directory
long FirstFreeBlock offset of FREEHEADER or -1L if no free list
long EntireFileSize size of entire help file in bytes
----
char HelpFileContent[EntireFileSize-16] the remainder of the help file
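As an illustration of the header layout above, here is a minimal sketch that reads the four header fields from a byte buffer and checks the magic. The names HelpHeader, rd_le32 and parse_help_header are hypothetical, and little-endian byte order is an assumption (the text does not state it, but these files were written on x86 Windows).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HLP_MAGIC 0x00035F3FL

/* Hypothetical in-memory form of the 16-byte help file header. */
typedef struct {
    int32_t magic;
    int32_t directory_start;   /* offset of FILEHEADER of internal directory */
    int32_t first_free_block;  /* offset of FREEHEADER or -1 */
    int32_t entire_file_size;
} HelpHeader;

/* Read a 32-bit value stored least significant byte first (assumed). */
static int32_t rd_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_help_header(const unsigned char *buf, size_t len, HelpHeader *h)
{
    assert(buf != NULL && h != NULL);
    if (len < 16)
        return -1;
    h->magic = rd_le32(buf);
    h->directory_start = rd_le32(buf + 4);
    h->first_free_block = rd_le32(buf + 8);
    h->entire_file_size = rd_le32(buf + 12);
    return h->magic == HLP_MAGIC ? 0 : -1;
}
```

Reading field by field from a byte buffer avoids depending on the host's struct padding and byte order.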
At offset DirectoryStart the FILEHEADER of the internal directory is located
long ReservedSpace size reserved including FILEHEADER
long UsedSpace size of internal file in bytes
unsigned char FileFlags normally 4
----
char FileContent[UsedSpace] the bytes contained in the internal file
char FreeSpace[ReservedSpace-UsedSpace-9]
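The FILEHEADER can be handled the same way. The following sketch (hypothetical function name, little-endian byte order assumed) returns a pointer to FileContent for an internal file whose FILEHEADER sits at a given offset, after checking the sizes against the layout above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int32_t rd_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Returns a pointer to FileContent and stores UsedSpace in *used_space,
   or returns NULL if the FILEHEADER at `offset` does not fit the file. */
const unsigned char *internal_file(const unsigned char *file, size_t file_len,
                                   size_t offset, int32_t *used_space)
{
    assert(file != NULL && used_space != NULL);
    if (offset > file_len || file_len - offset < 9)
        return NULL;                       /* no room for the 9-byte header */
    int32_t reserved = rd_le32(file + offset);
    int32_t used = rd_le32(file + offset + 4);
    /* file[offset + 8] is FileFlags, normally 4; not needed here */
    if (used < 0 || (int64_t)reserved < (int64_t)used + 9)
        return NULL;                       /* ReservedSpace includes header */
    if ((size_t)used > file_len - offset - 9)
        return NULL;                       /* content runs past end of file */
    *used_space = used;
    return file + offset + 9;
}
```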
The FILEHEADER of the internal directory is followed by UsedSpace bytes
containing the internal directory which is used to associate FileNames and
FileOffsets. The directory is structured as a B+ tree.
A B+ tree is made from leaf-pages and index-pages of fixed size, one of which
is the root-page. All entries are contained in leaf-pages. If more entries
are required than fit into a single leaf-page, index-pages are used to locate
the leaf-page which contains the required entry.
A B+ tree starts with a BTREEHEADER telling you the size of the B+ tree pages,
the root-page, the number of levels, and the number of all entries in this
B+ tree. You must follow (NLevels-1) index-pages before you reach a leaf-page.
unsigned short Magic 0x293B
unsigned short Flags bit 0x0002 always 1, bit 0x0400 1 if directory
unsigned short PageSize 0x0400=1k if directory, 0x0800=2k else, or 4k
char Structure[16] string describing format of data
'L' = long (indexed)
'F' = NUL-terminated string (indexed)
'i' = NUL-terminated string (indexed)
'2' = short
'4' = long
'z' = NUL-terminated string
'!' = long count value, count/8 * record
long filenumber
long TopicOffset
short MustBeZero 0
short PageSplits number of page splits B+ tree has suffered
short RootPage page number of B+ tree root page
short MustBeNegOne 0xFFFF
short TotalPages number of B+ tree pages
short NLevels number of levels of B+ tree
long TotalBtreeEntries number of entries in B+ tree
----
char Page[TotalPages][PageSize] the pages the B+ tree is made of
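Adding up the fields listed above gives a 38-byte BTREEHEADER, with the pages following directly. A minimal sketch of reading it, assuming little-endian byte order; BtreeHeader, parse_btree_header and btree_page_offset are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

static int32_t rd_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

typedef struct {
    uint16_t magic, flags, page_size;
    char structure[17];              /* 16 bytes plus a terminating NUL */
    int16_t page_splits, root_page, total_pages, n_levels;
    int32_t total_entries;
} BtreeHeader;

/* Returns 0 on success, -1 if too short or the magic is not 0x293B. */
int parse_btree_header(const unsigned char *p, size_t len, BtreeHeader *h)
{
    assert(p != NULL && h != NULL);
    if (len < 38 || rd_le16(p) != 0x293B)
        return -1;
    h->magic = rd_le16(p);
    h->flags = rd_le16(p + 2);
    h->page_size = rd_le16(p + 4);
    memcpy(h->structure, p + 6, 16);
    h->structure[16] = '\0';
    /* p + 22 holds MustBeZero, p + 28 holds MustBeNegOne */
    h->page_splits = (int16_t)rd_le16(p + 24);
    h->root_page = (int16_t)rd_le16(p + 26);
    h->total_pages = (int16_t)rd_le16(p + 30);
    h->n_levels = (int16_t)rd_le16(p + 32);
    h->total_entries = rd_le32(p + 34);
    return 0;
}

/* Byte offset of page `page`, relative to the start of the BTREEHEADER. */
size_t btree_page_offset(const BtreeHeader *h, int page)
{
    return 38 + (size_t)page * h->page_size;
}
```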
If NLevels is greater than 1, RootPage is the page number of an index-page.
Index-pages start with a BTREEINDEXHEADER and are followed by an array of
BTREEINDEX structures, in case of the internal directory containing pairs
of FileNames and PageNumbers.
(STRINGZ is a NUL-terminated string, sizeof(STRINGZ) is strlen(string)+1).
PageNumber gets you to the next page containing entries lexically starting
at FileName, but less than the next FileName. PreviousPage gets you to the
next page if the desired FileName is lexically before the first FileName.
unsigned short Unused number of free bytes at end of this page
short NEntries number of entries in this index-page
short PreviousPage page number of previous page
----
struct and this is the structure of directory index-pages
{
STRINGZ FileName varying length NUL-terminated string
short PageNumber page number of page dealing with FileName and above
}
DIRECTORYINDEXENTRY[NEntries]
After following NLevels-1 index-pages you will reach a leaf-page starting
with a BTREENODEHEADER followed by an array of BTREELEAF structures, in case
of the internal directory containing pairs of FileNames and FileOffsets.
You may follow the PreviousPage entry in all NLevels-1 index-pages to reach
the first leaf-page, then iterate through all entries and use NextPage to
follow the doubly linked list of leaf-pages until NextPage is -1 to retrieve
a sorted list of all TotalBtreeEntries entries contained in the B+ tree.
unsigned short Unused number of free bytes at end of this page
short NEntries number of entries in this leaf-page
short PreviousPage page number of previous leaf-page or -1 if first
short NextPage page number of next leaf-page or -1 if last
----
struct and this is the structure of directory leaf-pages
{
STRINGZ FileName varying length NUL-terminated string
long FileOffset offset of FILEHEADER of internal file FileName
relative to beginning of help file
}
DIRECTORYLEAFENTRY[NEntries]
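Putting the index-page and leaf-page descriptions together, a lookup of one FileName in the internal directory might be sketched as follows. This is an illustration, not HELPDECO's code: dir_lookup is a hypothetical name, little-endian byte order is assumed, and names are compared with strcmp on the assumption that the directory is sorted in plain byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

static int32_t rd_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* btree points at the BTREEHEADER of the directory, len is UsedSpace.
   Returns the FileOffset stored for `name`, or -1 if it is not found. */
int32_t dir_lookup(const unsigned char *btree, size_t len, const char *name)
{
    assert(btree != NULL && name != NULL);
    if (len < 38 || rd_le16(btree) != 0x293B)
        return -1;
    size_t page_size = rd_le16(btree + 4);
    int page = (int16_t)rd_le16(btree + 26);    /* RootPage */
    int levels = (int16_t)rd_le16(btree + 32);  /* NLevels */
    if (page_size < 8)
        return -1;

    for (int level = 1; level <= levels; level++) {
        if (page < 0)
            return -1;
        size_t start = 38 + (size_t)page * page_size;
        if (start > len || len - start < page_size)
            return -1;
        const unsigned char *p = btree + start, *end = p + page_size;
        int is_leaf = (level == levels);
        int n = (int16_t)rd_le16(p + 2);        /* NEntries */
        int next = (int16_t)rd_le16(p + 4);     /* PreviousPage */
        size_t tail = is_leaf ? 4 : 2;          /* FileOffset or PageNumber */
        p += is_leaf ? 8 : 6;                   /* skip the page header */
        for (int i = 0; i < n; i++) {
            const unsigned char *nul = memchr(p, 0, (size_t)(end - p));
            if (nul == NULL || (size_t)(end - nul - 1) < tail)
                return -1;                      /* entry runs off the page */
            int cmp = strcmp(name, (const char *)p);
            if (is_leaf) {
                if (cmp == 0)
                    return rd_le32(nul + 1);
            } else {
                if (cmp < 0)
                    break;                      /* keep the page chosen so far */
                next = (int16_t)rd_le16(nul + 1);
            }
            p = nul + 1 + tail;
        }
        if (is_leaf)
            return -1;
        page = next;
    }
    return -1;
}
```

The value returned for a name such as |SYSTEM is the offset of that internal file's FILEHEADER, relative to the beginning of the help file.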
At offset FirstFreeBlock the first FREEHEADER is located. It contains:
long FreeSpace number of bytes unused, including this header
long NextFreeBlock offset of next FREEHEADER or -1L if end of list
----
char Unused[FreeSpace-8] unused bytes
All unused portions of the help file are linked together using FREEHEADERs.
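Walking the free list is a matter of following NextFreeBlock until it is -1. The sketch below sums the unused bytes; total_free_bytes is a hypothetical name, little-endian byte order is assumed, and the step limit is a defensive addition against cyclic chains in damaged files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int32_t rd_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Sums FreeSpace over all FREEHEADERs, starting at FirstFreeBlock.
   Returns -1 if the chain leaves the file or appears to loop. */
long total_free_bytes(const unsigned char *file, size_t len,
                      int32_t first_free_block)
{
    assert(file != NULL);
    long total = 0;
    size_t steps = 0;
    int32_t off = first_free_block;
    while (off != -1) {
        if (off < 0 || (size_t)off > len || len - (size_t)off < 8)
            return -1;                    /* FREEHEADER outside the file */
        if (++steps > len / 8)
            return -1;                    /* more blocks than can fit: cycle */
        int32_t free_space = rd_le32(file + off);
        if (free_space < 8)
            return -1;                    /* FreeSpace includes the header */
        total += free_space;
        off = rd_le32(file + off + 4);    /* NextFreeBlock */
    }
    return total;
}
```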
Now that you are able to locate the position of an internal file in the
help file, let's describe what the internal files contain. Remember that
each FileOffset first takes you to the FILEHEADER of the internal file. The
structures described next are located just behind this FILEHEADER.
|SYSTEM
The first one to start with is the |SYSTEM file. This is the SYSTEMHEADER,
the structure of the first bytes of this internal file:
short Magic 0x036C
short Minor help file format version number
15 = HC30 Windows 3.0 help file
21 = HC31 Windows 3.1 help file
27 = WMVC/MMVC media view file
33 = MVC or HCW 4.00 Windows 95
short Major 1
time_t GenDate help file created seconds after 1.1.1980, or 0
unsigned short Flags see below
Use Minor and Flags to find out how the help file was compressed:
Minor <= 16            not compressed, TopicBlockSize 2k
Minor > 16   Flags=0:  not compressed, TopicBlockSize 4k
             Flags=4:  LZ77 compressed, TopicBlockSize 4k
             Flags=8:  LZ77 compressed, TopicBlockSize 2k
Additionally the help file may use phrase compression (oldstyle or Hall).
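The Minor/Flags rules above translate directly into a small decision function. TopicFormat and topic_format are hypothetical names; combinations of Flags not listed above are reported as unknown rather than guessed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int lz77;              /* 1 if topic blocks are LZ77 compressed */
    int topic_block_size;  /* 2048 or 4096 bytes */
} TopicFormat;

/* Returns 0 and fills *out, or -1 for a Flags value not listed above. */
int topic_format(int minor, unsigned flags, TopicFormat *out)
{
    assert(out != NULL);
    if (minor <= 16) {
        out->lz77 = 0;
        out->topic_block_size = 2048;
        return 0;
    }
    switch (flags) {
    case 0: out->lz77 = 0; out->topic_block_size = 4096; return 0;
    case 4: out->lz77 = 1; out->topic_block_size = 4096; return 0;
    case 8: out->lz77 = 1; out->topic_block_size = 2048; return 0;
    default: return -1;
    }
}
```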
If Minor is 16 or less, the help file title follows the SYSTEMHEADER:
STRINGZ HelpFileTitle
If Minor is above 16, one or more SYSTEMREC records follow instead up to the
internal end of the |SYSTEM file:
struct
{
unsigned short RecordType type of data in record
unsigned short DataSize size of data
----
char Data[DataSize] dependent on RecordType
}
SYSTEMREC[]
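A sketch of scanning the SYSTEMREC records for one RecordType follows. It assumes little-endian byte order and a 4-byte time_t, which makes the SYSTEMHEADER 12 bytes long; find_sysrec is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

/* sys points at the SYSTEMHEADER, len is UsedSpace of |SYSTEM.
   Returns a pointer to Data of the first record of the given type and
   stores DataSize in *size, or returns NULL if there is no such record. */
const unsigned char *find_sysrec(const unsigned char *sys, size_t len,
                                 unsigned type, size_t *size)
{
    assert(sys != NULL && size != NULL);
    if (len < 12 || rd_le16(sys) != 0x036C)
        return NULL;
    if (rd_le16(sys + 2) <= 16)
        return NULL;             /* Minor <= 16: only a title, no records */
    size_t pos = 12;             /* first SYSTEMREC follows the header */
    while (len - pos >= 4) {
        unsigned rec_type = rd_le16(sys + pos);
        size_t data_size = rd_le16(sys + pos + 2);
        pos += 4;
        if (data_size > len - pos)
            return NULL;         /* record truncated */
        if (rec_type == type) {
            *size = data_size;
            return sys + pos;
        }
        pos += data_size;
    }
    return NULL;
}
```

For example, find_sysrec(sys, len, 1, &n) would return the TITLE string of a Windows 3.1 or later help file.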
There are different RecordTypes defined, each storing different Data.
They mainly contain what was specified in the help project file.
RecordType Data
1 TITLE STRINGZ Title help file title
2 COPYRIGHT STRINGZ Copyright copyright notice shown in AboutBox
3 CONTENTS TOPICOFFSET Contents topic offset of starting topic
4 CONFIG STRINGZ Macro all macros executed on opening
5 ICON Windows *.ICO file See WIN31WH on icon file format
6 WINDOW struct Windows defined in the HPJ-file
{
struct
{
unsigned short TypeIsValid:1
unsigned short NameIsValid:1
unsigned short CaptionIsValid:1
unsigned short XIsValid:1
unsigned short YIsValid:1
unsigned short WidthIsValid:1
unsigned short HeightIsValid:1
unsigned short MaximizeWindow:1
unsigned short RGBIsValid:1
unsigned short RGBNSRIsValid:1
unsigned short WindowsAlwaysOnTop:1
unsigned short AutoSizeHeight:1
}
Flags
char Type[10] type of window
char Name[9] window name
char Caption[51] caption of window
short X x coordinate of window (0..1000)
short Y y coordinate of window (0..1000)
short Width width of window (0..1000)
short Height height of window (0..1000)
short Maximize maximize flag and window styles
COLORREF Rgb color of scrollable region
COLORREF RgbNsr color of non scrollable region
}
Window
6 WINDOW typedef struct Viewer 2.0 Windows defined in MVP-file
{
unsigned short Flags
char Type[10] /* type of window */
char Name[9] /* window name */
char Caption[51] /* caption for window */
unsigned char MoreFlags
short X /* x coordinate of window (0..1000) */
short Y /* y coordinate of window (0..1000) */
short Width /* width of window (0..1000) */
short Height /* height of window (0..1000) */
short Maximize /* maximize flag and window styles */
COLORREF Rgb1
char Unknown
COLORREF Rgb2
COLORREF Rgb3
short X2
short Y2
short Width2
short Height2
short X3
short Y3
}
Window;
8 CITATION STRINGZ Citation the Citation printed
9 LCID short LCID[4] language ID, Windows 95 (HCW 4.00)
10 CNT STRINGZ ContentFileName CNT file name, Windows 95 (HCW 4.00)
11 CHARSET unsigned short Charset charset, Windows 95 (HCW 4.00)
12 DEFFONT struct default dialog font, Windows 95 (HCW 4.00)
{
unsigned char HeightInPoints
unsigned char Charset
STRINGZ FontName
}
DefFont
12 FTINDEX STRINGZ dtype Multimedia Help Files dtypes
13 GROUPS STRINGZ Group defined GROUPs, Multimedia Help File
14 INDEX_S. STRINGZ IndexSeparators separators, Windows 95 (HCW 4.00)
14 KEYINDEX struct Multimedia Help Files
{
char btreename[10]; btreename[1] is footnote character
char mapname[10];
char dataname[10];
char title[80];
}
KeyIndex
18 LANGUAGE STRINGZ language defined language, Multimedia Help Files
19 DLLMAPS struct defined DLLMAPS, Multimedia Help Files
{
STRINGZ Win16RetailDLL
STRINGZ Win16DebugDLL
STRINGZ Win32RetailDLL
STRINGZ Win32DebugDLL
}
DLLNames
|Phrases
If the help file is phrase compressed, it contains an internal file named
|Phrases. Windows 3.0 help files generated with HC30 use the following
uncompressed structure to store phrases. A phrase is not NUL-terminated,
instead use the next PhraseOffset to locate the end of the phrase string
(there is one more phrase offset stored than phrases are defined to allow
for this).
unsigned short NumPhrases number of phrases in table
unsigned short OneHundred 0x0100
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]
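Fetching one phrase from this uncompressed table could look like the sketch below. Since PhraseOffset[0] equals the size of the offset array, the offsets are taken to be relative to &PhraseOffset[0] (byte 4 of the file); that reading, the little-endian byte order and the name old_phrase are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

/* ph points at NumPhrases, len is UsedSpace of |Phrases (HC30 layout).
   Returns a pointer to phrase `num` (not NUL-terminated) and stores its
   length in *plen, or returns NULL on any inconsistency. */
const unsigned char *old_phrase(const unsigned char *ph, size_t len,
                                unsigned num, size_t *plen)
{
    assert(ph != NULL && plen != NULL);
    if (len < 4)
        return NULL;
    unsigned count = rd_le16(ph);              /* NumPhrases */
    if (num >= count)
        return NULL;
    if (len - 4 < 2 * ((size_t)count + 1))
        return NULL;                           /* offset array truncated */
    const unsigned char *offs = ph + 4;        /* &PhraseOffset[0] */
    size_t begin = rd_le16(offs + 2 * (size_t)num);
    size_t end = rd_le16(offs + 2 * ((size_t)num + 1));
    if (end < begin || end > len - 4)
        return NULL;
    *plen = end - begin;
    return offs + begin;
}
```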
Windows 3.1 help files generated using HC31 and later always LZ77 compress
the Phrase character array. Read NumPhrases, OneHundred, DecompressedSize,
and NumPhrases+1 PhraseOffset values. Allocate DecompressedSize bytes for
the Phrase character array and decompress the UsedSpace-2*NumPhrases-10
remaining bytes into the allocated space to retrieve the phrase strings.
unsigned short NumPhrases number of phrases in table
unsigned short OneHundred 0x0100
long DecompressedSize
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
---- the remaining part is LZ77 compressed
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]
The LZ77 decompression algorithm can best be described like this:
Take the next byte
  Start at the least significant bit
    If the bit is cleared
      Copy 1 byte from source to destination
    Else
      Get the next WORD into the struct { unsigned pos:12; unsigned len:4; }
      Copy len+3 bytes from destination-pos-1 to destination
  Loop until all bits are done
Loop until all bytes are consumed
See end of this file for a detailed algorithm.
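As a compact illustration of the steps listed above, here is a bounds-checked sketch in C. It assumes the WORD is stored least significant byte first, with pos in its low 12 bits and len in its high 4 bits (the usual bit-field layout on x86 compilers); lz77_decompress is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Decompresses src[0..src_len) into dst, which can hold dst_cap bytes.
   Returns the number of bytes written, or -1 on malformed input or
   if dst is too small. */
long lz77_decompress(const unsigned char *src, size_t src_len,
                     unsigned char *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);
    size_t in = 0, out = 0;
    while (in < src_len) {
        unsigned bits = src[in++];             /* one control byte */
        for (int bit = 0; bit < 8 && in < src_len; bit++, bits >>= 1) {
            if (bits & 1) {                    /* bit set: back reference */
                if (src_len - in < 2)
                    return -1;
                unsigned word = src[in] | (unsigned)src[in + 1] << 8;
                in += 2;
                size_t pos = word & 0x0FFF;
                size_t n = (word >> 12) + 3;
                if (pos + 1 > out || n > dst_cap - out)
                    return -1;
                /* Copy byte by byte: source and target may overlap. */
                for (; n > 0; n--, out++)
                    dst[out] = dst[out - pos - 1];
            } else {                           /* bit clear: literal byte */
                if (out >= dst_cap)
                    return -1;
                dst[out++] = src[in++];
            }
        }
    }
    return (long)out;
}
```

For the |Phrases file, dst_cap would be DecompressedSize and src the bytes following the PhraseOffset array.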
Some MVBs use a slightly different layout of the internal |Phrases file:
unsigned short EightHundred 0x0800
unsigned short NumPhrases number of phrases in table
unsigned short OneHundred 0x0100
long DecompressedSize
char unused[30]
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
---- the remaining part is LZ77 compressed
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]
|PhrIndex
Windows 95 (HCW 4.00) may use Hall compression and the internal files
|PhrIndex and |PhrImage to store phrases. Both must be used to build a
table of phrases and PhraseOffsets. |PhrIndex starts with this header:
long Magic 1L
long NEntries
long CompressedSize
long PhrImageSize
long PhrImageCompressedSize
long Always0 0L
unsigned short BitCount:4
unsigned short UnknownBits:12
unsigned short Always4A00 not really always
The remaining data is bitcompressed. Use this algorithm to build a table
of PhraseOffsets:
/* Assumes a 32-bit long: mask walks from bit 0 up to bit 31 of *ptr,
   then ptr advances to the next long. */
short n,i; long mask=0,*ptr=(long *)(&Always4A00+1);

int GetBit(void)
{
    ptr+=(mask<0);
    mask=mask*2+(mask<=0);
    return (*ptr&mask)!=0;
}

PhraseOffset[0]=0;
for(i=0;i<NEntries;i++)
{
    for(n=1;GetBit();n+=1<<BitCount) ;
    if(GetBit()) n+=1;
    if(BitCount>1) if(GetBit()) n+=2;
    if(BitCount>2) if(GetBit()) n+=4;
    if(BitCount>3) if(GetBit()) n+=8;
    if(BitCount>4) if(GetBit()) n+=16;
    PhraseOffset[i+1]=PhraseOffset[i]+n;
}
Just behind the bitcompressed phrase length information (on a 32-bit boundary,
that's why GetBit consumes longs) follow NEntries bits (one bit for each
phrase). It is assumed that this information is used by the full text search
capability to exclude certain phrases.
|PhrImage
The |PhrImage file stores the phrases. A phrase is not NUL-terminated. Use
PhraseOffset[PhraseNum] and PhraseOffset[PhraseNum+1] to locate the beginning
and end of the phrase string. We generated one more PhraseOffset to allow
for this. |PhrImage is LZ77 compressed if PhrImageCompressedSize is not
equal to PhrImageSize. Otherwise you may take it as stored.
|FONT
The next internal file described is the |FONT file, which uses this header:
unsigned short NumFacenames number of face names
unsigned short NumDescriptors number of font descriptors
unsigned short FacenamesOffset start of array of face names
relative to &NumFacenames
unsigned short DescriptorsOffset start of array of font descriptors
relative to &NumFacenames
--- only if FacenamesOffset >= 12
unsigned short NumStyles number of style descriptors
unsigned short StyleOffset start of array of style descriptors
relative to &NumFacenames
--- only if FacenamesOffset >= 16
unsigned short NumCharMapTables number of character mapping tables
unsigned short CharMapTableOffset start of array of character mapping
table names relative to &NumFacenames
The face name array is located at FacenamesOffset and contains strings, which
are Windows font names or in case of multimedia files a Windows font name
concatenated with ',' and the character mapping table number. Short strings
are NUL-terminated, but a string may use all bytes for characters.
char FaceName[NumFacenames][(DescriptorsOffset-FacenamesOffset)/NumFacenames]
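Because a face name may fill its slot completely and then has no NUL of its own, it is safest to copy it into a buffer one byte larger than the slot. A minimal sketch, assuming little-endian byte order; font_facename is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

/* font points at NumFacenames, len is UsedSpace of |FONT.
   Copies face name `index` into out as a NUL-terminated string.
   Returns 0 on success, -1 on bad input or if out is too small. */
int font_facename(const unsigned char *font, size_t len, unsigned index,
                  char *out, size_t out_cap)
{
    assert(font != NULL && out != NULL);
    if (len < 8)
        return -1;
    unsigned num = rd_le16(font);              /* NumFacenames */
    unsigned names_off = rd_le16(font + 4);    /* FacenamesOffset */
    unsigned desc_off = rd_le16(font + 6);     /* DescriptorsOffset */
    if (num == 0 || index >= num || desc_off < names_off || desc_off > len)
        return -1;
    size_t width = (desc_off - names_off) / num;   /* bytes per face name */
    if (width == 0 || width >= out_cap)
        return -1;
    memcpy(out, font + names_off + (size_t)index * width, width);
    out[width] = '\0';   /* terminates names that use every byte */
    return 0;
}
```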
At DescriptorsOffset an array is located that describes all fonts used in the
help file. If this kind of descriptor appears in a help file, any metric value
is given in HalfPoints.
struct oldfont
{
struct
{
unsigned char Bold:1
unsigned char Italic:1
unsigned char Underline:1
unsigned char StrikeOut:1
unsigned char DoubleUnderline:1
unsigned char SmallCaps:1
}
Attributes
unsigned char HalfPoints PointSize * 2
unsigned char FontFamily font family. See values below
unsigned short FacenameIndex index into FaceName array
unsigned char FGRGB[3] RGB values of foreground
unsigned char BGRGB[3] unused background RGB Values
}
FontDescriptor[NumDescriptors]
#define FAM_MODERN 0x01 This is a different order than