Monday, 6 April 2015

List of Unicode code points that receive special treatment


Disclaimer: a gross oversimplification follows.

"Normally" (coming from a Latin-alphabet background), a glyph in a font has one or more paths that describe the shape of the letter, plus an advance width that specifies how far the cursor must move to reach the starting point of the next letter (ignoring kerning, ligatures, composite glyphs, ...).

There are several Unicode code points, though, that seem to get special treatment: when I simply place a "normal" glyph at one of them, the rendering doesn't look as expected. For example, if I misuse the Combining Diacritical Marks block (U+0300 to U+036F), Chrome ignores the advance widths associated with the glyphs in the font file and renders them all on top of each other. Windows Notepad does not. For some Hebrew code points (e.g. U+05C6), however, Windows (Notepad, charmap, ...) ignores the glyph path completely and renders something else entirely. And all hell breaks loose if I mix Hebrew code points with other ones (probably because Hebrew is written right to left?).

Is there a comprehensive list of Unicode code points that might get some sort of special treatment from different font renderers (right-to-left rendering, ignored advance widths, or whatever else there may be)? Is this perhaps something that is standardized in the Unicode specification?

Aside: I know that I can just stick to the Private Use Areas and be safe. This question is mostly out of general interest.
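
As a rough illustration of where such a list might come from, the sketch below queries the Unicode Character Database through Python's standard `unicodedata` module. It assumes (no renderer's documentation confirms this) that non-spacing and enclosing marks, format characters, and right-to-left bidi classes are the properties most likely to trigger special handling. `likely_special` and `describe` are hypothetical helper names, and the chosen property sets are a guess, not an exhaustive rule.

```python
import unicodedata

# Assumption: these general categories tend to be positioned or hidden
# by the text shaper rather than laid out like ordinary glyphs.
SPECIAL_CATEGORIES = {"Mn", "Me", "Cf"}   # non-spacing mark, enclosing mark, format

# Assumption: these bidi classes cause right-to-left reordering.
RTL_BIDI_CLASSES = {"R", "AL"}            # Hebrew-style RTL, Arabic letter


def describe(cp: int) -> dict:
    """Return the Unicode properties of a code point that this sketch inspects."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "category": unicodedata.category(ch),
        "combining_class": unicodedata.combining(ch),
        "bidi_class": unicodedata.bidirectional(ch),
    }


def likely_special(cp: int) -> bool:
    """Heuristic guess: might a renderer treat this code point specially?"""
    ch = chr(cp)
    return (
        unicodedata.category(ch) in SPECIAL_CATEGORIES
        or unicodedata.bidirectional(ch) in RTL_BIDI_CLASSES
    )


if __name__ == "__main__":
    # The code points mentioned in the question, plus two controls.
    for cp in (0x0301, 0x05C6, 0x0041, 0xE000):
        print(describe(cp), likely_special(cp))
```

Under these assumptions, U+0301 and U+05C6 are flagged, while U+0041 and the Private Use code point U+E000 are not. Actual renderer behaviour may still differ from what these properties suggest.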

asked 47 secs ago

Markus A.

