MS Word's special characters
JackB
TeamSite 6.0, Windows
We copy and paste a lot of MS Word documents. Our templates are already set up with a script that translates French characters properly (without the need for Visual Format), but with MS Word's curly quotes, curly apostrophes, or em dashes, no luck. The generated output HTML file doesn't recognize them at all and drops them, so a word like "I'm" becomes "Im", without the apostrophe. This only happens in the title field of the DCR, which doesn't use Visual Format the way the body field does. Should I just use Visual Format for the title field as well, or is there another way to do this (other than typing in the ISO or HTML extended character code for each character)?
Thanks,
...Jack
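For the title field, one alternative to switching to Visual Format is to map Word's punctuation to numeric character references before the page is generated. Below is a minimal sketch in Python, for illustration only (a TeamSite TPL would do the equivalent in Perl). It assumes the title has already been decoded to Unicode text, and the table and function names are hypothetical.

```python
# Hypothetical mapping: extend it with whatever characters appear in your Word documents.
WORD_PUNCTUATION = {
    "\u2018": "&#8216;",  # left single curly quote
    "\u2019": "&#8217;",  # right single curly quote / apostrophe
    "\u201c": "&#8220;",  # left double curly quote
    "\u201d": "&#8221;",  # right double curly quote
    "\u2013": "&#8211;",  # en dash
    "\u2014": "&#8212;",  # em dash
    "\u2026": "&#8230;",  # ellipsis
}

# Build the translation table once; str.translate accepts string replacements.
_TABLE = str.maketrans(WORD_PUNCTUATION)


def encode_word_punctuation(text: str) -> str:
    """Replace Word's typographic punctuation with numeric character references."""
    return text.translate(_TABLE)
```

If plain ASCII is preferred in titles, the same table could map the curly apostrophe to `'` and the dashes to `-` instead.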
gzevin
VF in 6 uses only one instance of ActiveX, so if I were you, I'd turn the title field into VF. You already have a process in place, that works, so I'd stick to it.
Greg Zevin, Ph.D. Comp. Sc.
Independent Interwoven Consultant/Architect
Sydney, AU
JackB
Thanks for the advice.
Browsing yesterday, I was glad to see that there was something in the Tech Library about removing unwanted paragraph tags in the textarea content of a DCR, because that's been an issue for me.
Also, a quick question about another tweak to the visualformatconfig.xml file that I read about. Changing preservewordstyles to "false" disables MS Word styles. Will this disable everything, including bold and hyperlink tags? The problem I have with MS Word is the span styles that override the font and font-size style sheets in my tpl. I'd like to keep any bolds or hyperlinks from the MS Word document, but not the junky span styles that come with it. Is there a way around this, or will setting preservewordstyles to false only affect the span tags (he says hopefully, knowing it probably affects everything)?
...Jack
gzevin
I am not sure what VF does in terms of cleaning code, etc. I trust only my own code that cleans HTML in a TPL. This ensures that I am in control and can do whatever I want.
JackB
I agree. Any suggestions you have would be greatly appreciated. Our company uses MS Word a lot so this will be an issue as more and more people use our templates.
VF does some cleaning of code, but not enough. I just need some way to get rid of the span styles only.
Thanks,
...Jack
gzevin
Yes, this is exactly my point: why bother with VF's cleaning at all if it's not sufficient?
I have a very simple module that uses regexes. It's not perfect, but it does the trick. Johny has written a devnet article about using HTML::Parser to do the same trick.
Also, in 6.X there is a change in string handling. I had to rewrite the code a bit, with a tip from Johny actually, to take care of the UTF-8 representation of some codes that VF tends to put in the textarea.
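To illustrate the regex approach, here is a minimal sketch that removes only the span tags and leaves bold, links, and the text itself untouched. It is written in Python for illustration (a TPL would use Perl regexes), it is not the module mentioned above, and the function name is hypothetical. It assumes no span attribute value contains a literal ">".

```python
import re

# Matches opening and closing span tags, with or without attributes.
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)


def strip_span_tags(html: str) -> str:
    """Remove <span ...> and </span> tags but keep everything between them."""
    return SPAN_TAG.sub("", html)
```

Because the pattern is anchored on the tag name `span` followed by a word boundary, other tags such as `<b>` and `<a href="...">` pass through unchanged.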
JackB
Looks like the example in the HTML::Parser article replaces span tags with an italics tag. Can I just leave the replacement blank? I don't have to replace it with another tag, do I?
...Jack
gzevin
You don't have to. It was just an example. Feel free to do whatever your requirements call for.
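As a sketch of the parser-based variant with no replacement tag, the example below uses Python's standard `html.parser` module rather than Perl's HTML::Parser, so it is an analogue of the article's approach and not a copy of it. The class and function names are hypothetical. Span start and end tags are simply not written to the output, while every other tag, entity, and piece of text is passed through.

```python
from html.parser import HTMLParser


class SpanDropper(HTMLParser):
    """Re-emit the parsed HTML, skipping span start and end tags."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, as written.
        if tag != "span":
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "span":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")


def drop_spans(html: str) -> str:
    parser = SpanDropper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Compared with a regex, a parser copes with attribute values that contain ">" characters, at the cost of a little more code.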