OK,
This one is a bit of a noodle scratcher. I suspect the root of the problem is that when you pull string data into an Oscript string it doesn't interpret the character set correctly.
What I'm trying to do is consume a REST call using the Apache HTTP client (a Java library) from within Oscript. With JSON responses it works fine. My Oscript looks like this:
String body = .Handler().InvokeMethod( 'handleResponse', { response } )
The Handler() function gets me an instance of
org.apache.http.impl.client.BasicResponseHandler
and the Response object is the result from executing
org.apache.http.client.methods.HttpPost
What I noticed, when I cast the body Oscript string to a byte array, is that any UTF-8 encoded characters in the response are carried over as literal bytes. So if the response contained, say, _Table des matières_, it comes through as _Table des matiÃ¨res_.
In the Oscript string, the two-byte UTF-8 encoding of the è gets interpreted as the literal ANSI string Ã¨.
I tried saving this content to a file using both ASCII mode and BIN mode, and both give me the same result. The above sequence is written out as 4 bytes, c3 83 c2 a8, rather than the two UTF-8 bytes (c3 a8) I was expecting.
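For what it's worth, the four bytes can be reproduced in plain Java. This is only a sketch of one possible cause: it assumes the UTF-8 bytes on the wire are decoded as ISO-8859-1 somewhere along the way and then written back out as UTF-8. I have not confirmed whether that happens in the Java handler or on the Oscript side. The class and method names (MojibakeDemo, hex, doubleEncode) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Render a byte array as space-separated lowercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Assumed failure mode: UTF-8 bytes are read with the wrong charset
    // (ISO-8859-1), and the resulting string is then saved as UTF-8.
    static byte[] doubleEncode(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(wire, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e8 is the letter e with a grave accent.
        System.out.println(hex("\u00e8".getBytes(StandardCharsets.UTF_8))); // c3 a8
        System.out.println(hex(doubleEncode("\u00e8")));                    // c3 83 c2 a8
    }
}
```

If that assumption holds, the damage is done before the string is ever saved, which would explain why ASCII mode and BIN mode produce the same file.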
I suspect I've hit upon an Oscript limitation with the handling of JavaObjects and bringing Java strings over to UTF-8. Is there another approach I should consider? I think there may be a way in Java to directly stream the response entity (when the entire response is in fact a file to be downloaded), but I was hoping to only use available methods in the Apache HTTP client library which already ships with Content Server - I'm using CS 22.1 BTW.
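To make the streaming idea concrete, here is a rough sketch of what I have in mind, using only the JDK. The body is copied to disk byte for byte, so no string (and no charset) is ever involved. The ByteArrayInputStream is just a stand-in for the real response stream, which I assume would come from something like HttpEntity.getContent(); RawBodyToFile and save are hypothetical names, and I have not tried driving this from Oscript.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RawBodyToFile {
    // Copy the body stream to the target file unchanged and return the byte count.
    // The stream is closed when the copy finishes.
    static long save(InputStream body, Path target) throws IOException {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real response body (assumption: UTF-8 on the wire).
        byte[] wire = "Table des mati\u00e8res".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("response", ".bin");
        long written = save(new ByteArrayInputStream(wire), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```

The catch is that this needs a small Java helper of my own rather than only the classes that ship with Content Server, which is what I was hoping to avoid.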
-Hugh