The function type is integer.
The returned value is the number of UTF-8 or UTF-16 characters in argument-1.
If argument-1 is a national data item that contains UTF-16 data and argument-1 contains surrogate pairs, each pair of high and low surrogates is counted as one UTF-16 character.
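The surrogate-pair rule can be illustrated outside COBOL with a minimal Python sketch. The helper name `utf16_char_count` and the sample string are hypothetical and chosen only for illustration. The sketch assumes big-endian UTF-16 input and is not the function's actual implementation.

```python
def utf16_char_count(data: bytes) -> int:
    """Count UTF-16 characters, treating each surrogate pair as one character.

    Assumes big-endian UTF-16 input (an assumption made for this sketch).
    """
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    count = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        # Skip low (trailing) surrogates so that each pair is counted once.
        if not 0xDC00 <= unit <= 0xDFFF:
            count += 1
    return count


# 'a' followed by U+1D11E (musical symbol G clef), which needs a surrogate pair.
sample = "a\U0001D11E".encode("utf-16-be")
print(len(sample) // 2)          # 3 UTF-16 code units
print(utf16_char_count(sample))  # 2 characters
```

The sample occupies three 16-bit code units, but the pair x'D834DD1E' contributes only one to the count.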
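If the UTF-8 argument contains composed characters (for example, ä, ê, and ü) that are encoded as a base character followed by combining characters, the base character and each combining character are counted individually in determining the length. The returned value for the same composed character therefore depends on how it is encoded, as the following example shows: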
Character | Unicode encoding | UTF-8 encoding | Returned value |
---|---|---|---|
ä | U+00E4 (precomposed form, Latin small letter a with diaeresis) | x'C3A4' | 1 |
ä | U+0061 + U+0308 (canonical decomposition, Latin small letter a + combining diaeresis) | x'61CC88' | 2 |
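The two returned values in the table can be reproduced with a short Python sketch that counts the characters in a UTF-8 byte string. The helper name `utf8_char_count` is hypothetical. The sketch assumes well-formed UTF-8 input and only illustrates the counting rule; it is not the function's actual implementation.

```python
def utf8_char_count(data: bytes) -> int:
    """Count UTF-8 characters by counting bytes that start a character.

    Continuation bytes have the bit pattern 10xxxxxx and are skipped.
    Assumes the input is well-formed UTF-8.
    """
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


precomposed = bytes.fromhex("C3A4")    # U+00E4
decomposed = bytes.fromhex("61CC88")   # U+0061 + U+0308

print(utf8_char_count(precomposed))  # 1
print(utf8_char_count(decomposed))   # 2
```

Both byte strings display as ä, but the decomposed form contains two characters, so its count is 2.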