Reading UTF8 data from a PERL hash file
dpmv
Hi,
I have a Perl hash file which looks like this:
%breadcrumbs = (
    "Home" => {
        "bodyStyle" => "home",
        "order" => ["Network Operators", "Consumers"],
        "landing" => "/about/enabling/index.html",
        "access" => "ENABLING WIRELESS",
        "display" => "Home",
        "Network Operators" => {
            "landing" => "/about/enabling/operators.html",
            "display" => "\303\211quipement du secteur des t\303\251l\303\251communications sans fil"
        },
        "Consumers" => {
            "landing" => "/about/enabling/consumers.html",
            "display" => "Consumers"
        }
    }
);
I am including this file inside a tpl using:
do "navhash.pl";
and retrieving the navigation display value using the following code:
my $navBlock = "Home";
my $navOrder = "Network Operators";
my $navName = $breadcrumbs{$navBlock}{$navOrder}{display};
The problem is that when the hash contains UTF-8 characters, the script cannot read the value and displays an empty string instead.
Example value: "\303\211quipement du secteur des t\303\251l\303\251communications sans fil"
How should I read the UTF-8 data from the hash file?
TIA
Comments
Adam Stoller
Is that structure hard-coded or is it being built up while reading a data file or some other form of data processing?
If you're using TS 6.x which uses Perl 5.8.2 - when opening a file for reading - use the following format:
...
if (open(IN, '<:utf8', $infile)) {
    # read data from IN ...
    close(IN);
}
else {
    # handle error
}
...
Similarly, when writing data out to a file:
...
if (open(OUT, '>:utf8', $ofile)) {
    # write data to OUT ...
    close(OUT);
}
else {
    # handle error
}
...
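As a minimal sketch of the two forms above working together, the following writes a string through a '>:utf8' handle and reads it back through a '<:utf8' handle. The sample text, the scratch file created with File::Temp, and the lexical handle names are assumptions for illustration only; they are not part of the original setup.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample text; "\x{C9}" is the character U+00C9 (E with acute).
my $text = "\x{C9}quipement";

# Scratch file (an assumption for this sketch) so nothing is left behind.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# Write: the :utf8 layer encodes characters to UTF-8 bytes on output.
open(my $out, '>:utf8', $path) or die "Cannot write $path: $!";
print {$out} $text;
close($out) or die "Cannot close $path: $!";

# Read: the same layer decodes the bytes back into characters.
open(my $in, '<:utf8', $path) or die "Cannot read $path: $!";
my $read_back = <$in>;
close($in);

# -s reports the size on disk in bytes, which is larger than the
# character count because U+00C9 takes two bytes in UTF-8.
my $size_on_disk = -s $path;

printf "characters read: %d, bytes on disk: %d\n",
    length($read_back), $size_on_disk;
```

Lexical handles (my $in, my $out) are used here instead of the bareword IN and OUT handles; both styles accept the same layer syntax.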
Is that what you were looking for?
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com
brandon1
Do you have the use utf8; pragma at the top of your program?
Is this a typo? You wrote "am including this file inside a tpl using:". Did you really mean ipl?
Current Project: TS 6.1
Environment: Windows
dpmv
This structure is hard-coded in a .pl file.
dpmv
No, it's not a typo. I did mean tpl. I am including this Perl hash inside a presentation template and writing out the left navigation from it.
Adam Stoller
Well - the way you're doing things is, IMO, rather strange and non-standard - but perhaps if you put the
use utf8;
pragma in your PT (within the iw_perl section, before calling your hash script) - it *might* resolve the problem for you.
dpmv
I tried the use utf8 pragma in my PT, but it made no difference: I still get a blank space for any string that contains UTF-8. I also tried adding the pragma to the .pl file that contains the hash. In that case, only the non-UTF-8 parts of the string are displayed.
All I am trying to do is store a UTF-8 string in a Perl hash and then read it back. I did not think it would be this difficult.
Thanks for your help!
Adam Stoller
The storing and accessing of the UTF-8 string is probably not the issue. The manner in which this hash is getting assimilated into your PT is more likely the issue - or at least that's my guess. Maybe someone else has some ideas or sees something I've missed in all this.
Johnny
This assumes you're running Perl 5.8 (TS 6.x).
If this code is running inside a .tpl file, then you should have no problems reading or writing UTF-8.
If you are running a standalone script (for example, you are executing another Perl script from within your .tpl), then you need to tell Perl to read and write UTF-8. Perl does not treat file handles, STDIN, STDOUT, or STDERR as UTF-8 by default, but most Interwoven files and command-line tools use UTF-8 (DCRs, for example).
We have a common module that we include in all of our scripts. It contains the following for processing UTF-8:
use open ':utf8';
# TeamSite files are mostly UTF-8, so we default our file handles to UTF-8.
# Note the use of "use open ':utf8'" above...
binmode (STDOUT, ":utf8");
binmode (STDIN, ":utf8");
binmode (STDERR, ":utf8");
Then you can read/write files as you normally would and perl will assume UTF-8 by default.
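A small sketch of what that default means in practice, assuming a scratch file created with File::Temp and hypothetical sample bytes: the file is written as raw UTF-8 bytes, then read back with a plain open() that names no layer, so the layer comes from the pragma.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use open ':utf8';   # default layer for open() calls in this lexical scope

# Write three characters as raw UTF-8 bytes (5 bytes plus a newline).
# The octal escapes are hypothetical sample data for this sketch.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode($tmp_fh, ':raw');
print {$tmp_fh} "\303\211t\303\251\n";
close($tmp_fh) or die "Cannot close $path: $!";

# No layer is named in the mode, so the pragma's default applies
# and the bytes are decoded into characters while reading.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my $line = <$in>;
close($in);
chomp($line);

printf "characters: %d, first code point: U+%04X\n",
    length($line), ord($line);
```

The pragma is lexically scoped, so it only affects open() calls written in the file or block where it appears.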
See how you go with that.
FYI: use utf8; tells Perl to interpret the source code as UTF-8 encoded. It has nothing to do with how Perl interprets files, strings, and so on at run time.
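To illustrate the difference between bytes and characters, here is a hedged sketch using the display value from the original hash. The octal escapes produce raw bytes, and the core Encode module can turn them into a character string. Whether this resolves the blank output inside the presentation template is not established in this thread; the variable names are for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The octal escapes from the hash file yield raw UTF-8 bytes, not characters.
my $bytes = "\303\211quipement du secteur des t\303\251l\303\251communications sans fil";

# decode() interprets those bytes as UTF-8 and returns a character string.
my $chars = decode('UTF-8', $bytes);

# Three characters in this string take two bytes each,
# so the byte count exceeds the character count.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# When printing to a handle without a :utf8 layer, encode explicitly.
binmode(STDOUT, ':raw');
print encode('UTF-8', $chars), "\n";
```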
If you are writing files within a .tpl you really should be using the <iw_ostream> tag as it will manage all these issues for you as well as include the files in the iwpt_compile.ipl manifest if you choose to process it.
John Cuiuli