How to support German, French and other characters.
Andrei Bintintan
2004-11-10 08:59:41 UTC
Hi to all,

We are using pgsql as a "multilanguage" database, but yesterday I noticed a strange problem with the German umlaut characters: I cannot convert them from upper case to lower case. There are probably other things that do not work correctly as well.

I searched the internet for answers, but they were not precise enough, so I couldn't find a solution.

We use ASCII encoding for the database, but I also tried the encodings from Latin1 through Latin10 and the behavior is the same. Some forums mention a "locale" setting, among other things.

What settings do I have to make so that we won't have these problems in the future? We also intend to work with French characters, and later with Hungarian.

The system is pgsql v7.4 running on a Suse 9.1 machine. The locale command gives me:

linz:/var/lib/pgsql/data # locale
LANG=
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=


Thank you in advance.
Andy.
Andrei Bintintan
2004-11-11 08:22:34 UTC
Hi Ivo,

If I use UNICODE encoding for the DB I get some errors.
For example, the following query: select lower('MöBÜEL')
returns ERROR: Unicode characters greater than or equal to 0x10000 are not supported.

I found this on a forum:
----------------------------------------------------------------
postgres 7.4 on linux, glibc 2.2.4-6
I have a table containing unicode data, and the lower() function does not
work properly. While it lowers standard letters like A->a, B->b ..., it
fails on special letters like German umlauts (Ä, Ö ...), which are simply
left untouched.

upper() and lower() didn't support multibyte character sets before 8.0.

regards, tom lane
----------------------------------------------------------------
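
To illustrate the forum reply quoted above, here is a minimal Python sketch of why a case conversion that works byte by byte leaves umlauts untouched in a multibyte encoding such as UTF-8. The helper bytewise_lower is hypothetical and only simulates a byte-oriented lower(); it is not PostgreSQL's actual code.

```python
# Illustration only: simulates a byte-oriented lower(), not PostgreSQL internals.

def bytewise_lower(data: bytes) -> bytes:
    """Lowercase each byte independently, touching only ASCII A-Z."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)

word = "MÖBEL"
utf8 = word.encode("utf-8")

# In UTF-8, Ö is the two bytes 0xC3 0x96. Neither byte is an ASCII letter,
# so a byte-wise conversion passes them through unchanged.
bytewise = bytewise_lower(utf8).decode("utf-8")

# A character-aware conversion knows that Ö maps to ö.
charwise = word.lower()

print(bytewise)  # mÖbel
print(charwise)  # möbel
```

The plain ASCII letters are converted in both cases; only the character-aware version handles the umlaut, which matches the behavior described in the quoted report.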

We use these kinds of comparisons and character conversions in our DB.

I really cannot figure out what solution to use.

Best regards,
Andy.



----- Original Message -----
From: "Ivo Rossacher" <***@bluewin.ch>
To: "Andrei Bintintan" <***@ar-sd.net>
Sent: Wednesday, November 10, 2004 9:57 PM
Subject: Re: [ADMIN] How to support German, French and other characters.
Dear Andy,
ASCII encoding means that the database does not know or care about the
encoding, so the client is fully in charge of dealing with it. This is
very inconvenient in a multilanguage environment. Without the encoding,
the database itself cannot determine the case of characters correctly.
Suse 9.1 uses Unicode as the default encoding for the whole desktop. For
my multilanguage projects I therefore use UNICODE as the encoding for the
database. (createdb -EUNICODE dbname will create a Unicode database, more
precisely a UTF8 database.)
Most Microsoft clients are Unicode internally anyway and can deal with
this setting. Older Unix or Linux installations probably need some
tweaking.
Best regards
Ivo Rossacher
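
As a small illustration of the point that a database without a declared encoding leaves everything to the client, the following Python sketch shows the same stored bytes being read under two different assumptions. The variable names are made up for the example, and nothing here talks to PostgreSQL.

```python
# Illustration only: identical bytes, two different client-side interpretations.

stored = "ö".encode("utf-8")          # bytes a UTF-8 client would send: b'\xc3\xb6'

as_utf8 = stored.decode("utf-8")      # a UTF-8 client reads one character back
as_latin1 = stored.decode("latin-1")  # a Latin-1 client sees two characters

print(len(stored))  # 2
print(as_utf8)      # ö
print(as_latin1)    # Ã¶
```

If the server does not know which interpretation is intended, it has no basis for deciding which byte sequences are letters, let alone what their upper or lower case forms are.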