<!-- ##### SECTION Title ##### -->
Unicode Manipulation

<!-- ##### SECTION Short_Description ##### -->
functions operating on Unicode characters and UTF-8 strings

<!-- ##### SECTION Long_Description ##### -->
<para>
This section describes a number of functions for dealing with
Unicode characters and strings. There are analogues of the
traditional <filename>ctype.h</filename> character classification
and case conversion functions, UTF-8 analogues of some string utility
functions, functions to perform normalization, case conversion and
collation on UTF-8 strings and, finally, functions to convert between
the UTF-8, UTF-16 and UCS-4 encodings of Unicode.
</para>
<para>
The implementations of the Unicode functions in GLib are based
on the Unicode Character Data tables, which are available from
<ulink url="http://www.unicode.org">www.unicode.org</ulink>.
GLib 2.8 supports Unicode 4.0, GLib 2.10 supports Unicode 4.1, and
GLib 2.12 supports Unicode 5.0.
</para>

<!-- ##### SECTION See_Also ##### -->
<para>
<variablelist>

<varlistentry>
<term>g_locale_to_utf8(), g_locale_from_utf8()</term>
<listitem><para>
Convenience functions for converting between UTF-8 and the locale encoding.
</para></listitem>
</varlistentry>

</variablelist>
</para>

<!-- ##### SECTION Stability_Level ##### -->


<!-- ##### TYPEDEF gunichar ##### -->
<para>
A type which can hold any UCS-4 character code.
</para>


<!-- ##### TYPEDEF gunichar2 ##### -->
<para>
A type which can hold any UTF-16 code unit<footnote id="utf16_surrogate_pairs">UTF-16 also has so-called <firstterm>surrogate pairs</firstterm> to encode characters beyond the BMP as pairs of 16-bit numbers. Surrogate pairs cannot be stored in a single gunichar2 field, but all GLib functions accepting gunichar2 arrays will correctly interpret surrogate pairs.</footnote>.
</para>


<!-- ##### FUNCTION g_unichar_validate ##### -->
<para>

</para>

@ch: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isalnum ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isalpha ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_iscntrl ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isdefined ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isdigit ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isgraph ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_islower ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_ismark ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isprint ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_ispunct ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isspace ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_istitle ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isupper ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_isxdigit ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_iswide ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_iswide_cjk ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_iszerowidth ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_toupper ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_tolower ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_totitle ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_digit_value ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_xdigit_value ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### ENUM GUnicodeType ##### -->
<para>
These are the possible character classifications from the
Unicode specification.
See <ulink url="http://www.unicode.org/Public/UNIDATA/UnicodeData.html">http://www.unicode.org/Public/UNIDATA/UnicodeData.html</ulink>.
</para>

@G_UNICODE_CONTROL: General category "Other, Control" (Cc)
@G_UNICODE_FORMAT: General category "Other, Format" (Cf)
@G_UNICODE_UNASSIGNED: General category "Other, Not Assigned" (Cn)
@G_UNICODE_PRIVATE_USE: General category "Other, Private Use" (Co)
@G_UNICODE_SURROGATE: General category "Other, Surrogate" (Cs)
@G_UNICODE_LOWERCASE_LETTER: General category "Letter, Lowercase" (Ll)
@G_UNICODE_MODIFIER_LETTER: General category "Letter, Modifier" (Lm)
@G_UNICODE_OTHER_LETTER: General category "Letter, Other" (Lo)
@G_UNICODE_TITLECASE_LETTER: General category "Letter, Titlecase" (Lt)
@G_UNICODE_UPPERCASE_LETTER: General category "Letter, Uppercase" (Lu)
@G_UNICODE_COMBINING_MARK: General category "Mark, Spacing Combining" (Mc)
@G_UNICODE_ENCLOSING_MARK: General category "Mark, Enclosing" (Me)
@G_UNICODE_NON_SPACING_MARK: General category "Mark, Nonspacing" (Mn)
@G_UNICODE_DECIMAL_NUMBER: General category "Number, Decimal Digit" (Nd)
@G_UNICODE_LETTER_NUMBER: General category "Number, Letter" (Nl)
@G_UNICODE_OTHER_NUMBER: General category "Number, Other" (No)
@G_UNICODE_CONNECT_PUNCTUATION: General category "Punctuation, Connector" (Pc)
@G_UNICODE_DASH_PUNCTUATION: General category "Punctuation, Dash" (Pd)
@G_UNICODE_CLOSE_PUNCTUATION: General category "Punctuation, Close" (Pe)
@G_UNICODE_FINAL_PUNCTUATION: General category "Punctuation, Final quote" (Pf)
@G_UNICODE_INITIAL_PUNCTUATION: General category "Punctuation, Initial quote" (Pi)
@G_UNICODE_OTHER_PUNCTUATION: General category "Punctuation, Other" (Po)
@G_UNICODE_OPEN_PUNCTUATION: General category "Punctuation, Open" (Ps)
@G_UNICODE_CURRENCY_SYMBOL: General category "Symbol, Currency" (Sc)
@G_UNICODE_MODIFIER_SYMBOL: General category "Symbol, Modifier" (Sk)
@G_UNICODE_MATH_SYMBOL: General category "Symbol, Math" (Sm)
@G_UNICODE_OTHER_SYMBOL: General category "Symbol, Other" (So)
@G_UNICODE_LINE_SEPARATOR: General category "Separator, Line" (Zl)
@G_UNICODE_PARAGRAPH_SEPARATOR: General category "Separator, Paragraph" (Zp)
@G_UNICODE_SPACE_SEPARATOR: General category "Separator, Space" (Zs)


<!-- ##### FUNCTION g_unichar_type ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### ENUM GUnicodeBreakType ##### -->
<para>
These are the possible line break classifications.
The five Hangul types were added in Unicode 4.1 and were therefore
introduced in GLib 2.10. Note that new types may be added in the future.
Applications should be ready to handle unknown values;
these may be treated as %G_UNICODE_BREAK_UNKNOWN.
See <ulink url="http://www.unicode.org/unicode/reports/tr14/">http://www.unicode.org/unicode/reports/tr14/</ulink>.
</para>

@G_UNICODE_BREAK_MANDATORY: 
@G_UNICODE_BREAK_CARRIAGE_RETURN: 
@G_UNICODE_BREAK_LINE_FEED: 
@G_UNICODE_BREAK_COMBINING_MARK: 
@G_UNICODE_BREAK_SURROGATE: 
@G_UNICODE_BREAK_ZERO_WIDTH_SPACE: 
@G_UNICODE_BREAK_INSEPARABLE: 
@G_UNICODE_BREAK_NON_BREAKING_GLUE: 
@G_UNICODE_BREAK_CONTINGENT: 
@G_UNICODE_BREAK_SPACE: 
@G_UNICODE_BREAK_AFTER: 
@G_UNICODE_BREAK_BEFORE: 
@G_UNICODE_BREAK_BEFORE_AND_AFTER: 
@G_UNICODE_BREAK_HYPHEN: 
@G_UNICODE_BREAK_NON_STARTER: 
@G_UNICODE_BREAK_OPEN_PUNCTUATION: 
@G_UNICODE_BREAK_CLOSE_PUNCTUATION: 
@G_UNICODE_BREAK_QUOTATION: 
@G_UNICODE_BREAK_EXCLAMATION: 
@G_UNICODE_BREAK_IDEOGRAPHIC: 
@G_UNICODE_BREAK_NUMERIC: 
@G_UNICODE_BREAK_INFIX_SEPARATOR: 
@G_UNICODE_BREAK_SYMBOL: 
@G_UNICODE_BREAK_ALPHABETIC: 
@G_UNICODE_BREAK_PREFIX: 
@G_UNICODE_BREAK_POSTFIX: 
@G_UNICODE_BREAK_COMPLEX_CONTEXT: 
@G_UNICODE_BREAK_AMBIGUOUS: 
@G_UNICODE_BREAK_UNKNOWN: 
@G_UNICODE_BREAK_NEXT_LINE: 
@G_UNICODE_BREAK_WORD_JOINER: 
@G_UNICODE_BREAK_HANGUL_L_JAMO: 
@G_UNICODE_BREAK_HANGUL_V_JAMO: 
@G_UNICODE_BREAK_HANGUL_T_JAMO: 
@G_UNICODE_BREAK_HANGUL_LV_SYLLABLE: 
@G_UNICODE_BREAK_HANGUL_LVT_SYLLABLE: 


<!-- ##### FUNCTION g_unichar_break_type ##### -->
<para>

</para>

@c: 
@Returns: 


<!-- ##### FUNCTION g_unichar_combining_class ##### -->
<para>

</para>

@uc: 
@Returns: 


<!-- ##### FUNCTION g_unicode_canonical_ordering ##### -->
<para>

</para>

@string: 
@len: 


<!-- ##### FUNCTION g_unicode_canonical_decomposition ##### -->
<para>

</para>

@ch: 
@result_len: 
@Returns: 


<!-- ##### FUNCTION g_unichar_get_mirror_char ##### -->
<para>

</para>

@ch: 
@mirrored_ch: 
@Returns: 


<!-- ##### ENUM GUnicodeScript ##### -->
<para>
The #GUnicodeScript enumeration identifies different writing
systems. The values correspond to the names as defined in the
Unicode standard. The enumeration was added in GLib 2.14
and is interchangeable with #PangoScript.
Note that new types may be added in the future. Applications
should be ready to handle unknown values.
See <ulink url="http://www.unicode.org/reports/tr24/">Unicode Standard Annex
#24: Script names</ulink>.
</para>
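<para>
As a side illustration of the surrogate pairs mentioned in the gunichar2
description, the following is a minimal sketch in plain C of how two 16-bit
values map to one UCS-4 character code and back. It is not GLib's
implementation and does not use GLib. The type names
<literal>utf16_unit</literal> and <literal>ucs4_char</literal> and both
function names are hypothetical stand-ins chosen for this example; the
arithmetic follows the UTF-16 encoding form defined by the Unicode standard.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for gunichar2 and gunichar (assumed 16 and 32 bits). */
typedef uint16_t utf16_unit;
typedef uint32_t ucs4_char;

/* A high (leading) surrogate lies in the range 0xD800..0xDBFF. */
static bool is_high_surrogate(utf16_unit u)
{
    return u >= 0xD800 && u <= 0xDBFF;
}

/* A low (trailing) surrogate lies in the range 0xDC00..0xDFFF. */
static bool is_low_surrogate(utf16_unit u)
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

/* Combine a surrogate pair into a single character code beyond the BMP.
 * Returns false and leaves *out untouched if the two units are not a
 * valid high/low pair. */
static bool surrogate_pair_to_ucs4(utf16_unit high, utf16_unit low,
                                   ucs4_char *out)
{
    if (!is_high_surrogate(high) || !is_low_surrogate(low))
        return false;
    *out = 0x10000
         + (((ucs4_char)high - 0xD800) << 10)
         + ((ucs4_char)low - 0xDC00);
    return true;
}

/* Split a character code in the range 0x10000..0x10FFFF into a surrogate
 * pair. Codes inside the BMP need no pair and are rejected by the assert. */
static void ucs4_to_surrogate_pair(ucs4_char cp, utf16_unit *high,
                                   utf16_unit *low)
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    cp -= 0x10000;                              /* 20 significant bits remain */
    *high = (utf16_unit)(0xD800 + (cp >> 10));  /* upper 10 bits */
    *low  = (utf16_unit)(0xDC00 + (cp & 0x3FF)); /* lower 10 bits */
}
```

<para>
For example, the pair 0xD83D, 0xDE00 combines to the character code 0x1F600,
and splitting 0x1F600 yields the same pair again. This is why a single
gunichar2 field cannot hold such a character while a gunichar can.
</para>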