UTF16,string,,UCS4,delphi

Omar 10/12/2016 0

This example demonstrates a routine which can be used to convert an UTF16 string to a UCS4 string. This routine uses the functionality exposed by the Character unit.

Delphi
 function ConvertStringToUTF32(const S: String): UCS4String;
var
  I, Next: Integer;
  C4: UCS4Char;
begin
  { Set the result at length of S   (one #0 terminator) }
  SetLength(Result, Length(S)   1);
 
  if Length(S) = 0 then
    Exit;
 
  { Start the conversion }
  I := 1;
  Next := 0;
  while I <= Length(S) do
  begin
    if IsSurrogate(S, I) then
    begin
      {
        The character at position I is a surrogate, this
        means that I and I 1 must be a surrogate pair.
      }
      if not IsSurrogatePair(S, I) then
        raise EConvertError.Create('Bad UTF16 input string!');
 
      { S[I] -> high surrogate and S[I 1] -> low surrogate }
      if (IsHighSurrogate(S, I)) and (IsLowSurrogate(S, I   1)) then
      begin
        { Create the UCS4 chacarter from the pair }
        C4 := ConvertToUtf32(S, I);
        { Skip one more char }
        Inc(I);
      end
      else
        raise EConvertError.Create('Bad UTF16 input string!');
    end else
      C4 := ConvertToUtf32(S, I); { Create the UCS4 chacarter from the UTF16 char }
 
    { Add the character }
    Result[Next] := C4;
 
    { Increase positions }
    Inc(Next);
    Inc(I);
  end;
 
  { Adjust size }
  if Length(Result) <> (Next   1) then
    SetLength(Result, Next   1);
 
  { Add a trailing \0 }
  Result[Next] := 0;
end; 

Report Bug

Please Login to Report Bug

Reported Bugs

Comments

Please Login to Comment

Comments