Test, Agent 1.93, Western with Unicode
Ruud Harmsen
2018-11-03 06:37:58 UTC
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóú
Accent circonflex: âêôîû
Diaeresis: äëïöü
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúâêôîûäëïöüãõç
Adding double dotted y: ÿ, euro €.
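A rough way to check which of these test characters fit in each encoding, as a sketch only: the sample string and the helper name `encodable` are illustrative choices, not something Agent itself does. It suggests why the euro sign is the interesting case here: it has no slot in ISO 8859-1, while Windows-1252 puts it at byte 0x80.

```python
import unicodedata

# Illustrative sample, normalized to precomposed (NFC) form so that each
# accented letter is a single code point.
samples = unicodedata.normalize("NFC", "àéîöãçÿ€")

def encodable(text, encoding):
    """Return the characters of `text` that `encoding` can represent."""
    ok = []
    for ch in text:
        try:
            ch.encode(encoding)
            ok.append(ch)
        except UnicodeEncodeError:
            pass  # this character has no byte in the given encoding
    return "".join(ok)

latin1_ok = encodable(samples, "iso-8859-1")  # everything except the euro sign
cp1252_ok = encodable(samples, "cp1252")      # the euro sign fits as well

euro_cp1252 = "€".encode("cp1252")  # a single byte
euro_utf8 = "€".encode("utf-8")     # three bytes

print(latin1_ok, cp1252_ok, euro_cp1252, euro_utf8)
```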
Ruud Harmsen
2018-11-03 06:38:54 UTC
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóú
Accent circonflex: âêôîû
Diaeresis: äëïöü
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúâêôîûäëïöüãõç
Adding double dotted y: ÿ, euro €.
Agent 1.93 has switched now, so, yes, I can send UTF-8 too!
Ruud Harmsen
2018-11-03 06:41:40 UTC
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç

ACCENT GRAVE: ÀÈÌÒÙ
ACCENT AIGU: ÁÉÍÓÚÝ
ACCENT CIRCONFLEX: ÂÊÔÎÛ
DIAERESIS: ÄËÏÖÜŸ
TILDE: ÃÕ
CEDILLA: Ç

àèìòùáéíóúâêôîûäëïöüãõç
Dirk T. Verbeek
2018-11-03 08:45:52 UTC
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Those are tremas (diaereses); do you also have the umlaut?
Ruud Harmsen
2018-11-03 09:05:09 UTC
Post by Dirk T. Verbeek
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Those are tremas (diaereses); do you also have the umlaut?
Unicode does not make that distinction (wrongly, really).
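A small sketch of what that looks like in practice, using Python's standard `unicodedata` module. The helper name `marks` and the two example letters are illustrative assumptions. Both the German-style ä and the French-style ï decompose to the same combining mark, U+0308.

```python
import unicodedata

def marks(ch):
    """Names of the combining marks left after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", ch)
    return [unicodedata.name(c) for c in decomposed[1:]]

umlaut_marks = marks("\u00e4")  # ä, as in German "Bär" (umlaut)
trema_marks = marks("\u00ef")   # ï, as in French "naïve" (trema)

# Both lists hold the same single entry: Unicode has one mark for both uses.
print(umlaut_marks, trema_marks)
```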
Chris Jacobs
2018-11-03 12:39:44 UTC
Post by Ruud Harmsen
Post by Dirk T. Verbeek
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Those are tremas (diaereses); do you also have the umlaut?
Unicode does not make that distinction (wrongly, really).
But there is a hack to make that distinction anyway if needed:

While recognizing the drawbacks to all of the alternatives to encoding a
new COMBINING UMLAUT character outlined in WG2 N2766, we believe that
there is a workable alternative solution which has, to date, been
overlooked. The solution consists, essentially, of using U+034F
COMBINING GRAPHEME JOINER (CGJ), in its intended semantics in
10646/Unicode, to make the relevant sorting, searching, and data mapping
distinctions required for umlaut versus tréma. In particular, the
distinction we propose is:

U+0308 → umlaut
<CGJ U+0308> → tréma
<a U+0308> → a umlaut
<a CGJ U+0308> → a tréma
http://archives.miloush.net/michkap/archive/2006/09/04/738263.html
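To make the quoted proposal concrete, here is a minimal sketch with Python's standard `unicodedata` module. The variable names are illustrative. It only shows how the two sequences behave under normalization; whether a given sorting or searching tool treats them differently is a separate question. Because CGJ has combining class 0, it sits between the base letter and U+0308 and keeps them from composing, so the "tréma" sequence survives NFC unchanged.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

umlaut_a = "a" + DIAERESIS       # <a U+0308>     -> "a umlaut" in the proposal
trema_a = "a" + CGJ + DIAERESIS  # <a CGJ U+0308> -> "a tréma" in the proposal

nfc_umlaut = unicodedata.normalize("NFC", umlaut_a)
nfc_trema = unicodedata.normalize("NFC", trema_a)

# The plain sequence composes to the single code point U+00E4;
# the CGJ sequence stays as three code points.
print([hex(ord(c)) for c in nfc_umlaut])
print([hex(ord(c)) for c in nfc_trema])
```

Both forms should render alike, since CGJ is invisible, but they remain distinct strings after normalization.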
s|b
2018-11-03 21:55:15 UTC
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóú
Accent circonflex: âêôîû
Diaeresis: äëïöü
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúâêôîûäëïöüãõç
Here it turns out to be set up just slightly differently...
--
test