beginning at <VAR>string</VAR> to its corresponding wide character code.  It
stores the result in <CODE>*<VAR>result</VAR></CODE>.
<P>
<CODE>mbtowc</CODE> never examines more than <VAR>size</VAR> bytes.  (The idea is
to supply for <VAR>size</VAR> the number of bytes of data you have in hand.)
<P>
<CODE>mbtowc</CODE> with non-null <VAR>string</VAR> distinguishes three
possibilities: the first <VAR>size</VAR> bytes at <VAR>string</VAR> start with
a valid multibyte character, they start with an invalid byte sequence or
just part of a character, or <VAR>string</VAR> points to an empty string (a
null character).
<P>
For a valid multibyte character, <CODE>mbtowc</CODE> converts it to a wide
character and stores that in <CODE>*<VAR>result</VAR></CODE>, and returns the
number of bytes in that character (always at least <CODE>1</CODE>, and never
more than <VAR>size</VAR>).
<P>
For an invalid byte sequence, <CODE>mbtowc</CODE> returns <CODE>-1</CODE>.  For an
empty string, it returns <CODE>0</CODE>, also storing <CODE>0</CODE> in
<CODE>*<VAR>result</VAR></CODE>.
<P>
If the multibyte character code uses shift characters, then
<CODE>mbtowc</CODE> maintains and updates a shift state as it scans.  If you
call <CODE>mbtowc</CODE> with a null pointer for <VAR>string</VAR>, that
initializes the shift state to its standard initial value.  It also
returns nonzero if the multibyte character code in use actually has a
shift state.  See section <A HREF="library_6.html#SEC75" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_6.html#SEC75">Multibyte Codes Using Shift Sequences</A>.
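<P>
The three outcomes described above can be told apart with a small wrapper.
This is only a sketch: the names <CODE>mb_kind</CODE> and
<CODE>classify_first</CODE> are hypothetical and not part of the library, and
the wrapper assumes it is acceptable to reset the shift state before every
call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical classification of what mbtowc found at the start
   of a buffer.  */
enum mb_kind { MB_VALID, MB_INVALID, MB_EMPTY };

/* Convert the first multibyte character at S, examining at most
   SIZE bytes, and store the wide character in *RESULT.  */
enum mb_kind
classify_first (const char *s, size_t size, wchar_t *result)
{
  int len;

  assert (s != NULL && result != NULL);

  /* Start from the standard initial shift state.  */
  mbtowc (NULL, NULL, 0);

  len = mbtowc (result, s, size);
  if (len == -1)
    return MB_INVALID;  /* Invalid bytes, or only part of a character.  */
  if (len == 0)
    return MB_EMPTY;    /* S points at a null character; *RESULT is 0.  */
  return MB_VALID;      /* LEN bytes were converted into *RESULT.  */
}
```

A caller that needs the byte count as well would keep the return value of
<CODE>mbtowc</CODE> instead of collapsing it into an enumeration.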
<P>
<A NAME="IDX355"></A>
<U>Function:</U> int <B>wctomb</B> <I>(char *<VAR>string</VAR>, wchar_t <VAR>wchar</VAR>)</I><P>
The <CODE>wctomb</CODE> ("wide character to multibyte") function converts
the wide character code <VAR>wchar</VAR> to its corresponding multibyte
character sequence, and stores the result in bytes starting at
<VAR>string</VAR>.  At most <CODE>MB_CUR_MAX</CODE> bytes are stored.
<P>
<CODE>wctomb</CODE> with non-null <VAR>string</VAR> distinguishes three
possibilities for <VAR>wchar</VAR>: a valid wide character code (one that can
be translated to a multibyte character), an invalid code, and <CODE>0</CODE>.
<P>
Given a valid code, <CODE>wctomb</CODE> converts it to a multibyte character,
storing the bytes starting at <VAR>string</VAR>.  Then it returns the number
of bytes in that character (always at least <CODE>1</CODE>, and never more
than <CODE>MB_CUR_MAX</CODE>).
<P>
If <VAR>wchar</VAR> is an invalid wide character code, <CODE>wctomb</CODE> returns
<CODE>-1</CODE>.  If <VAR>wchar</VAR> is <CODE>0</CODE>, it stores the null
character <CODE>0</CODE> at <VAR>string</VAR>, preceded by any shift sequence
needed to return to the initial shift state, and returns the number of
bytes stored (<CODE>1</CODE> when no shift sequence is needed).
<P>
If the multibyte character code uses shift characters, then
<CODE>wctomb</CODE> maintains and updates a shift state as it converts.  If you
call <CODE>wctomb</CODE> with a null pointer for <VAR>string</VAR>, that
initializes the shift state to its standard initial value.  It also
returns nonzero if the multibyte character code in use actually has a
shift state.  See section <A HREF="library_6.html#SEC75" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_6.html#SEC75">Multibyte Codes Using Shift Sequences</A>.
<P>
Calling this function with a <VAR>wchar</VAR> argument of zero when
<VAR>string</VAR> is not null has the side-effect of reinitializing the
stored shift state <EM>as well as</EM> storing the null character
<CODE>0</CODE> and returning the number of bytes stored.
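<P>
As an illustration of calling <CODE>wctomb</CODE> once per character, here is
a minimal sketch that converts a null-terminated wide string into a
caller-supplied byte buffer.  The function name
<CODE>wide_to_multibyte</CODE> and its error convention are hypothetical.
The sketch also assumes a code without shift states, so it appends the
terminating null byte itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Convert the wide string WS into OUT, which has room for OUTSIZE
   bytes including the terminating null.  Return the number of bytes
   stored, not counting the null, or -1 if a character is invalid or
   OUT is too small.  */
long
wide_to_multibyte (const wchar_t *ws, char *out, size_t outsize)
{
  char tmp[MB_LEN_MAX];   /* Large enough for any one character.  */
  size_t used = 0;

  assert (ws != NULL && out != NULL);

  /* Start from the standard initial shift state.  */
  wctomb (NULL, 0);

  for (; *ws != 0; ws++)
    {
      int len = wctomb (tmp, *ws);
      if (len == -1)
        return -1;                        /* Invalid wide character.  */
      if (used + (size_t) len >= outsize)
        return -1;                        /* No room left for the null.  */
      memcpy (out + used, tmp, (size_t) len);
      used += (size_t) len;
    }

  if (used >= outsize)
    return -1;
  out[used] = '\0';   /* Assumes no closing shift sequence is needed.  */
  return (long) used;
}
```

Converting into a temporary of <CODE>MB_LEN_MAX</CODE> bytes first keeps
<CODE>wctomb</CODE> from writing past the end of a nearly full output buffer.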
<P>
<H2><A NAME="SEC74" HREF="library_toc.html#SEC74" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC74">Example of Character-by-Character Conversion</A></H2>
<P>
Here is an example that reads multibyte character text from descriptor
<CODE>input</CODE> and writes the corresponding wide characters to descriptor
<CODE>output</CODE>.  We need to convert characters one by one for this
example because <CODE>mbstowcs</CODE> is unable to continue past a null
character, and cannot cope with an apparently invalid partial character
by reading more input.
<P>
<PRE>
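#include &#60;limits.h&#62;
#include &#60;stdio.h&#62;
#include &#60;stdlib.h&#62;
#include &#60;string.h&#62;
#include &#60;unistd.h&#62;

int
file_mbstowcs (int input, int output)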
{
  char buffer[BUFSIZ + MB_LEN_MAX];
  int filled = 0;
  int eof = 0;

  while (!eof)
    {
      int nread;
      int nwrite;
      char *inp = buffer;
      /* Every byte in <CODE>buffer</CODE> may turn into one wide character. */
      wchar_t outbuf[BUFSIZ + MB_LEN_MAX];
      wchar_t *outp = outbuf;

      /* Fill up the buffer from the input file.  */
      nread = read (input, buffer + filled, BUFSIZ);
      if (nread &#60; 0) {
        perror ("read");
        return 0;
      }
      /* If we reach end of file, make a note to read no more. */
      if (nread == 0)
        eof = 1;

      /* <CODE>filled</CODE> is now the number of bytes in <CODE>buffer</CODE>. */
      filled += nread;

      /* Convert those bytes to wide characters--as many as we can. */
      while (filled &#62; 0)
        {
          int thislen = mbtowc (outp, inp, filled);
          /* Stop converting at invalid character;
             this can mean we have read just the first part
             of a valid character.  */
          if (thislen == -1)
            break;
          /* Treat null character like any other,
             but also reset shift state. */
          if (thislen == 0) {
            thislen = 1;
            mbtowc (NULL, NULL, 0);
          }
          /* Advance past this character. */
          inp += thislen;
          filled -= thislen;
          outp++;
        }

      /* Write the wide characters we just made.  */
      nwrite = write (output, outbuf,
                      (outp - outbuf) * sizeof (wchar_t));
      if (nwrite &#60; 0)
        {
          perror ("write");
          return 0;
        }

      /* See if we have a <EM>real</EM> invalid character. */
      if ((eof &#38;&#38; filled &#62; 0) || filled &#62;= MB_CUR_MAX)
        {
          fprintf (stderr, "invalid multibyte character\n");
          return 0;
        }

      /* If any characters must be carried forward,
         put them at the beginning of <CODE>buffer</CODE>. */
      if (filled &#62; 0)
        memmove (buffer, inp, filled);
    }

  return 1;
}
</PRE>
<P>
<H2><A NAME="SEC75" HREF="library_toc.html#SEC75" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC75">Multibyte Codes Using Shift Sequences</A></H2>
<P>
In some multibyte character codes, the <EM>meaning</EM> of any particular
byte sequence is not fixed; it depends on what other sequences have come
earlier in the same string.  Typically there are just a few sequences
that can change the meaning of other sequences; these few are called
<DFN>shift sequences</DFN> and we say that they set the <DFN>shift state</DFN> for
other sequences that follow.
<P>
To illustrate shift state and shift sequences, suppose we decide that
the sequence <CODE>0200</CODE> (just one byte) enters Japanese mode, in which
pairs of bytes in the range from <CODE>0240</CODE> to <CODE>0377</CODE> are single
characters, while <CODE>0201</CODE> enters Latin-1 mode, in which single bytes
in the range from <CODE>0240</CODE> to <CODE>0377</CODE> are characters, and
interpreted according to the ISO Latin-1 character set.  This is a
multibyte code which has two alternative shift states ("Japanese mode"
and "Latin-1 mode"), and two shift sequences that specify particular
shift states.
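<P>
To make the illustration concrete, the following sketch counts the
characters in a byte sequence written in this hypothetical code.  Several
details are assumptions that the description above does not settle: the
initial state is taken to be Latin-1 mode, bytes below <CODE>0200</CODE> are
treated as single-byte characters in either mode, and bytes from
<CODE>0202</CODE> to <CODE>0237</CODE> are rejected.  The name
<CODE>toy_count_chars</CODE> is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum shift_state { LATIN1_MODE, JAPANESE_MODE };

/* Count the characters in the N bytes at S under the toy code.
   Return -1 for an invalid or truncated sequence.  */
long
toy_count_chars (const unsigned char *s, size_t n)
{
  enum shift_state state = LATIN1_MODE;   /* Assumed initial state.  */
  long count = 0;
  size_t i = 0;

  assert (s != NULL || n == 0);

  while (i < n)
    {
      unsigned char c = s[i];

      if (c == 0200)
        {
          state = JAPANESE_MODE;   /* Shift sequence, not a character.  */
          i++;
        }
      else if (c == 0201)
        {
          state = LATIN1_MODE;     /* Shift sequence, not a character.  */
          i++;
        }
      else if (c < 0200)
        {
          count++;                 /* Assumed: one byte in either mode.  */
          i++;
        }
      else if (c < 0240)
        return -1;                 /* Unassigned in this sketch.  */
      else if (state == LATIN1_MODE)
        {
          count++;                 /* One byte is one character.  */
          i++;
        }
      else
        {
          /* Japanese mode: a pair of bytes is one character.  */
          if (i + 1 >= n || s[i + 1] < 0240)
            return -1;
          count++;
          i += 2;
        }
    }
  return count;
}
```

Under these assumptions the seven bytes <CODE>'a'</CODE>, <CODE>0240</CODE>,
<CODE>0200</CODE>, <CODE>0240</CODE>, <CODE>0241</CODE>, <CODE>0201</CODE>,
<CODE>0300</CODE> count as four characters.  The byte <CODE>0240</CODE> is a
whole character the first time it appears and half of one the second time,
which is exactly what the shift state decides.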
<P>
When the multibyte character code in use has shift states, then
<CODE>mblen</CODE>, <CODE>mbtowc</CODE> and <CODE>wctomb</CODE> must maintain and update
the current shift state as they scan the string.  To make this work
properly, you must follow these rules:
<P>
<UL>
<LI>
Before starting to scan a string, call the function with a null pointer
for the multibyte character address--for example, <CODE>mblen (NULL,
0)</CODE>.  This initializes the shift state to its standard initial value.
<P>
<LI>
Scan the string one character at a time, in order.  Do not "back up"
and rescan characters already scanned, and do not intersperse the
processing of different strings.
</UL>
<P>
Here is an example of using <CODE>mblen</CODE> following these rules:
<P>
<PRE>
#include &#60;stdio.h&#62;
#include &#60;stdlib.h&#62;
#include &#60;string.h&#62;

void
scan_string (const char *s)
{
  int length = strlen (s);

  /* Initialize shift state. */
  mblen (NULL, 0);

  while (1)
    {
      int thischar = mblen (s, length);
      /* Deal with end of string and invalid characters. */
      if (thischar == 0)
        break;
      if (thischar == -1)
        {
          fprintf (stderr, "invalid multibyte character\n");
          break;
        }
      /* Advance past this character. */
      s += thischar;
      length -= thischar;
    }
}
</PRE>
<P>
The functions <CODE>mblen</CODE>, <CODE>mbtowc</CODE> and <CODE>wctomb</CODE> are not
reentrant when using a multibyte code that uses a shift state.  However,
no other library functions call these functions, so you don't have to
worry that the shift state will be changed mysteriously.
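<P>
Where reentrancy matters, one possible workaround is to keep the shift
state in an object owned by the caller.  The sketch below assumes a library
that provides the restartable function <CODE>mbrlen</CODE> and the type
<CODE>mbstate_t</CODE> from <CODE>wchar.h</CODE>, which this section does not
describe.  The name <CODE>count_chars_r</CODE> is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count the multibyte characters in S, keeping the shift state in a
   local object instead of the library's hidden state.  Return -1 for
   an invalid or incomplete character.  */
long
count_chars_r (const char *s)
{
  mbstate_t state;
  size_t left;
  long count = 0;

  assert (s != NULL);

  /* An all-zero mbstate_t describes the initial shift state.  */
  memset (&state, 0, sizeof state);
  left = strlen (s);

  while (left > 0)
    {
      size_t len = mbrlen (s, left, &state);
      if (len == (size_t) -1 || len == (size_t) -2)
        return -1;      /* Invalid, or only part of a character.  */
      if (len == 0)
        break;          /* Null character; not reached for LEFT > 0.  */
      s += len;
      left -= len;
      count++;
    }
  return count;
}
```

Because each call has its own <CODE>state</CODE>, two strings can be scanned
in an interleaved fashion without disturbing one another.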
<P>Go to the <A HREF="library_5.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_5.html">previous</A>, <A HREF="library_7.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_7.html">next</A> section.<P>