normalizeText

Typefunction
DictionaryLCS
LibraryLiveCode Script
Syntax
normalizeText(<text>, <normalForm>)
Summary

Converts a text string into a specific 'normal form'.

Introduced7.0
OSmac, windows, linux, ios, android
Platformsdesktop, server, mobile
Parameters
NameTypeDescription
text
normalForm"NFC": precomposed
"NFD": decomposed
"NFKC": compatibility precomposed
"NFKD": compatibility decomposed
Example
local tExample
set the formSensitive to true
put "e" & numToCodepoint("0x301") into tExample    -- Acute accent
put tExample is "é"                          -- Returns false
put normalizeText(tExample, "NFC") is "é"    -- Returns true
RelatedProperty: formSensitive
Function: textDecode, textEncode
Glossary: property, Unicode
Description

Use the normalizeText function when you require a specific normal form of text.

Normalization is a process defined by the Unicode standard for removing minor encoding differences for a small set of characters. In Unicode text, the same visual string can be represented by different character sequences. A prime example of this is precomposed characters and decomposed characters: an "e" followed by a combining acute character is visually indistinguishable from a precombined "Ž" character. Because of the confusion that can result, Unicode defined a number of "normal forms" that ensure that character representations are consistent. The "compatibility" normal forms are designed by the Unicode Consortium for dealing with certain legacy encodings and are not generally useful otherwise.

Note: Normalization does not avoid all problems with visually-identical characters; Unicode contains a number of characters that will (in the majority of fonts) be indistinguishable but are nonetheless completely different characters (a prime example of this is M and U+2164 ROMAN NUMERAL ONE THOUSAND).

Unless the formSensitive property is set to true, LiveCode ignores text normalization when performing comparisons.

Tagstext processing