Has anyone had success using UTF-8 character maps on Solaris 8 for templatized content delivery to Oracle 8?
Notes:
Background:
As part of a globalization project currently in progress, we will use TeamSite and a translation management system to translate US English content into 8 different languages:
French, Korean, Portuguese, Simplified Chinese, Japanese, German, Spanish, and Italian.
In order to support all the different languages from content creation to delivery through a browser, Unicode UTF-8 is needed.
Process:
Content is entered via the TeamSite GUI with encoding=UTF-8.
The information is saved in an XML file which resides in the UFS filesystem.
Perl and Java scripts are used to parse the contents and deliver them to the Oracle database.
The content is served via jhtml to the client's browser.
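One way to check the first two steps is to look at the encoding declared in the saved XML file. The following is only a sketch: the helper name xml_encoding is hypothetical, and it assumes the XML declaration sits on the first line with a double-quoted encoding attribute. It reports what the file claims, not what the bytes actually are.

```shell
# xml_encoding: print the encoding named in an XML declaration read from stdin.
# Hypothetical helper; assumes the declaration is on line 1 and uses double quotes.
xml_encoding() {
    sed -n '1s/.*encoding="\([^"]*\)".*/\1/p'
}

# Example with an inline document instead of a real TeamSite record.
printf '<?xml version="1.0" encoding="UTF-8"?>\n<record/>\n' | xml_encoding
```

If nothing is printed, the first line carries no encoding attribute in that form.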
Issues:
Although UTF-8 is specified at every stage of the data capture, parsing, and delivery process, ISO 8859-1 characters are appearing in the database.
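To tell whether a given value is really UTF-8 or ISO 8859-1, it can help to look at the raw bytes. This is a minimal sketch and not part of our setup; hex_bytes is a hypothetical helper name. It relies on the fact that e-acute (U+00E9) is the single byte e9 in ISO 8859-1 and the two bytes c3 a9 in UTF-8.

```shell
# hex_bytes: print stdin as one continuous string of hex digits.
# Hypothetical helper built on the standard od and tr utilities.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The same character, e-acute, in each encoding (octal escapes for printf).
printf '\351' | hex_bytes        # ISO 8859-1 form, one byte
printf '\303\251' | hex_bytes    # UTF-8 form, two bytes
```

Running the same dump on a record before parsing and after parsing should show at which stage the bytes change.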
The parsing of records, prior to delivery to an Oracle database, occurs at the Unix OS level.
From the command line on svleitq1 we can set the en_US.UTF-8 locale through the LANG environment variable, but locale -m still shows only the ISO 8859-1 character map.
ex.
svleitq1: [~]
# export LANG=en_US.UTF-8
svleitq1: [~]
# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
svleitq1: [~]
# locale -m
iso_8859_1/charmap.src
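For what it is worth, POSIX describes locale -m as listing the available charmaps, locale -a as listing the installed locales, and locale charmap as printing the codeset of the locale currently in effect. A small sketch of checking for the locale follows; has_locale is a hypothetical helper name, and the exact spelling of locale names in the locale -a output can differ between systems.

```shell
# has_locale NAME: succeed if NAME is one of the locale names read from stdin,
# one per line, as printed by `locale -a` (hypothetical helper).
has_locale() {
    while IFS= read -r name; do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

if locale -a 2>/dev/null | has_locale en_US.UTF-8; then
    echo "en_US.UTF-8 is installed"
else
    echo "en_US.UTF-8 is not installed"
fi

# Codeset of the locale currently in effect.
locale charmap 2>/dev/null || echo "charmap unknown"
```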
Resolution:
Install the en_US.UTF-8 character map (see localedef, locale, charset) on svleitq1 for testing.
Install on svleitp1 if it works.
Other info:
Solaris UTF-8 support
http://wwws.sun.com/software/whitepapers/wp-unicode/
Solaris International Language Environments guide
http://docs.sun.com/db/doc/806-0169
Unicode HOWTO for Linux
http://en.tldp.org/HOWTO/Unicode-HOWTO-3.html
Excerpt from a newsgroup post:
Locales are compiled files residing usually in a system-wide directory
/usr/share/locale, but you can also easily compile your own locale
definition file using the localedef tool. I did this for en_GB.UTF-8
with
localedef -v -c -i en_GB -f UTF-8 $HOME/local/locale/en_GB.UTF-8
and then in order to inform libc about where it can find my new
home-made locale definition files, I had to use
export LOCPATH=$HOME/local/locale
You can also use iconv to convert ISO 8859-1 to UTF-8. iconv must be upgraded to do this; GNU iconv v1.8 supports it.
ex.
iconv -f ISO-8859-1 -t UTF-8 file...
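A sketch of wrapping that conversion so the result can be verified follows. The helper name latin1_to_utf8 and the temporary file names are hypothetical, and the encoding names ISO-8859-1 and UTF-8 are the ones GNU iconv accepts; other iconv implementations may spell them differently.

```shell
# latin1_to_utf8 SRC DST: convert SRC from ISO 8859-1 to UTF-8 and write DST.
# Hypothetical helper; assumes an iconv that knows these encoding names.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Demo on a throwaway file holding the single ISO 8859-1 byte for e-acute.
tmp=${TMPDIR:-/tmp}/iconv_demo.$$
printf '\351\n' > "$tmp.in"
latin1_to_utf8 "$tmp.in" "$tmp.out"
od -An -tx1 "$tmp.out"    # the dump should show the bytes c3 a9 0a
rm -f "$tmp.in" "$tmp.out"
```

This only helps if the input really is ISO 8859-1. Running it on data that is already UTF-8 would encode it a second time.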
System Info:
# iwserver -V
[Fri Feb 14 09:42:34 2003]
*** IWServer Locale is --> English_UnitedStates.UTF-8@Binary
iwserver: 5.5.2 Build 9036 Interwoven 20020608
# uname -a
SunOS svleitq1 5.8 Generic_108528-14 sun4u sparc SUNW,Sun-Fire-280R
svleitq1: [/logs]
# env
PWD=/logs
TZ=US/Pacific
WR=/default/main/www-redesign/WORKAREA/products/
HOSTNAME=svleitq1
IW_OPENAPI_HOME=/apps/iw-home/iwopenapi
LD_LIBRARY_PATH=/apps/oracle/u001/product/8.1.7.1/lib
CLASSPATH=/apps/iw-home/lib/tst-xerces.jar:/apps/iw-home/iwopenapi/openapi_client.jar:/apps/iw-home/iwturbo/lib/IWATGTurbo.jar:/apps/iw-home/iwturbo/lib/ejbclient_ejbs.jar:/apps/iw-home/iwturbo/lib/classes.jar
PS1=\h: [\w] \n \$
SUDO_GID=1066
IWTURBOHOME=/apps/iw-home/iwturbo
USER=root
MACHTYPE=sparc-sun-solaris
SSH2_SFTP_LOG_FACILITY=-1
IWHOME=/apps/iw-home
MAIL=/var/mail/jlawn
SSH2_CLIENT=10.137.11.55 2282 10.137.57.59 22
OLDPWD=/home1/jlawn
SUDO_UID=1108
LANG=en_US.UTF-8
ORACLE_HOME=/apps/oracle/u001/product/8.1.7.1/
LOGNAME=root
SHLVL=2
SUDO_COMMAND=/bin/su
NLS_33=/apps/oracle/u001/product/8.1.7.1/ocommon/nls/admin/data
SHELL=/bin/ksh
HOSTTYPE=sparc
OSTYPE=solaris
HOME=/home1/jlawn
TERM=xterm
NLS_LANG=AMERICAN_AMERICA.UTF-8
PATH=/usr/local/bin:/usr/sbin:/apps/iw-home/bin:/apps/iw-home/iw-perl/bin:/apps/iw-home/bin/install:/usr/local/sbin:/apps/oracle/u001/product/8.1.7.1/bin:/usr/sbin:/usr/bin
TNS_ADMIN=/apps/oracle/u001/product/8.1.7.1/network/admin
SUDO_USER=jlawn
_=/usr/bin/env
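One detail in the environment above may be worth checking. NLS_LANG has the form LANGUAGE_TERRITORY.CHARSET, and Oracle's own name for its UTF-8 character set is UTF8, without a hyphen. Whether that is related to the problem here is an assumption and has not been confirmed. The sketch below only splits the value so the charset part can be compared; nls_charset is a hypothetical helper name.

```shell
# nls_charset VALUE: print the part of an NLS_LANG value after the last dot.
# Hypothetical helper; uses ${var##pattern}, which ksh supports.
# If VALUE contains no dot, the whole value is printed unchanged.
nls_charset() {
    echo "${1##*.}"
}

cs=$(nls_charset "AMERICAN_AMERICA.UTF-8")   # value copied from the env output
if [ "$cs" = "UTF8" ]; then
    echo "charset is UTF8"
else
    echo "charset is '$cs', which is not Oracle's UTF8 name"
fi
```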