John D. Cook at The Endeavour wrote a piece about how to shorten URLs by using Unicode characters. In one particular example, the code point happened to be U+38F8, and this is what appeared:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=38F8
Although the Unihan database has cataloged this character, it offers no additional linguistic information about it.
Characters like these are known as "ghost characters": they exist only on the basis of the Unicode algorithm and are not used in any linguistic sense.
As a matter of fact, the Unihan Grid Index has many of these:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=3403
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=351B
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=360F
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=208D5
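To look at these characters locally, here is a minimal Python sketch that turns each hexadecimal code point above into the character itself and rebuilds its lookup URL. The helper name `describe` and the constant `UNIHAN_LOOKUP` are my own, not part of any Unihan tooling, and the name comes from whatever Unicode tables ship with your Python's `unicodedata` module.

```python
import unicodedata

# Base of the lookup URLs quoted above (copied from the post, not an official API).
UNIHAN_LOOKUP = "http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint="


def describe(hex_codepoint):
    """Return the character, its Unicode name, and its lookup URL.

    `describe` is a hypothetical helper written for this post.
    """
    value = int(hex_codepoint, 16)       # "38F8" -> 14584
    char = chr(value)                    # the actual character
    name = unicodedata.name(char, "")    # empty string if the name is unknown
    return {
        "char": char,
        "name": name,
        "url": UNIHAN_LOOKUP + hex_codepoint.upper(),
    }


# The code points mentioned in this post.
for cp in ["38F8", "3403", "351B", "360F", "208D5"]:
    info = describe(cp)
    print(f"U+{cp} {info['char']} {info['name']} {info['url']}")
```

Whether the glyphs actually show up depends on your fonts; many systems will render a placeholder box for some of them, especially U+208D5, which lies outside the Basic Multilingual Plane.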