Web CMS (TeamSite)
TeamSite Training Solutions
khalid1
I have taken the TeamSite 5.0 Template Development class. One of the exercises in the book, "Code-Generated Content", shows how to dynamically generate a "news article table of contents" page from the news article pages (.html files) located in a particular folder on the server (see page 92, end of Module 4). Does anybody have the solution to this exercise? If you do, could you please post it here on the forum? Specifically, I am looking for the Perl script that scans the .html pages, picks up the content between the <title> tags, and generates the table of contents page.
Thanks in advance.
Comments
akshathp
I do not have the script, but if you know Perl programming, it should be a pretty straightforward script to develop.
All you need to do is use File::Find and call its find routine with a subroutine of your own. You set the directory path, and File::Find walks through it and calls your subroutine for each entry, so you can pick out the .html files. Within your custom subroutine you can read each file and extract the text inside the <title> tag using a basic regex.
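A minimal sketch of that approach, not the solution from the course book. The names `collect_titles` and `toc_html` are made up for illustration, and the regex assumes each page has a single, reasonably well-formed <title> element:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return [path, title] pairs for every .html file under $dir.
sub collect_titles {
    my ($dir) = @_;
    my @entries;
    find(
        {
            no_chdir => 1,    # keep paths relative to where the script was started
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path && $path =~ /\.html\z/i;
                open my $fh, '<', $path
                    or do { warn "Cannot read $path: $!\n"; return };
                local $/;     # slurp the whole file in one read
                my $html = <$fh>;
                close $fh;
                # Basic regex: first <title>...</title>, any case, may span lines.
                if (defined $html && $html =~ m{<title[^>]*>(.*?)</title>}is) {
                    my $title = $1;
                    $title =~ s/\s+/ /g;         # collapse runs of whitespace
                    $title =~ s/^\s+|\s+$//g;    # trim both ends
                    push @entries, [ $path, $title ];
                }
            },
        },
        $dir,
    );
    my @sorted = sort { $a->[0] cmp $b->[0] } @entries;
    return @sorted;
}

# Hypothetical helper: render the pairs as a plain HTML list.
# Paths and titles are inserted as-is, with no escaping.
sub toc_html {
    my @entries = @_;
    my $out = "<ul>\n";
    $out .= qq{  <li><a href="$_->[0]">$_->[1]</a></li>\n} for @entries;
    return $out . "</ul>\n";
}

# Run as: perl toc.pl /path/to/news
if (!caller && @ARGV) {
    print toc_html(collect_titles($ARGV[0]));
}
```

The script name `toc.pl` and the directory argument are placeholders. The links use the paths exactly as File::Find reports them, so they would need adjusting to match how the pages are served.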
Hope this helps!
Akshat Pramod Sharma
Interwoven Inc.
Adam Stoller
I'd recommend using XML::Parser or HTML::Parser routines rather than "simple regexes", because people (and authoring applications) tend to do all sorts of odd things when generating the HTML section of a document. You're better off using a real parser than spending a lot of time trying to write a regex that handles all the cases you might come up against.
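As a rough sketch of the parser-based alternative, assuming the HTML::Parser module from CPAN is installed (it is not part of core Perl). The function name `title_of` is made up for illustration; the handlers use the module's version 3 event API:

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return the text of the first <title> element in $html,
# or an empty string if there is none.
sub title_of {
    my ($html) = @_;
    my $state = 0;     # 0 = before the title, 1 = inside it, 2 = after it
    my $title = '';
    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'tagname' is passed lower-cased, so <TITLE> and <title> both match.
        start_h => [ sub { $state = 1 if $state == 0 && $_[0] eq 'title' }, 'tagname' ],
        end_h   => [ sub { $state = 2 if $state == 1 && $_[0] eq 'title' }, 'tagname' ],
        # 'dtext' is the text with entities such as &amp; already decoded.
        text_h  => [ sub { $title .= $_[0] if $state == 1 }, 'dtext' ],
    );
    $parser->parse($html);
    $parser->eof;
    $title =~ s/\s+/ /g;         # collapse runs of whitespace
    $title =~ s/^\s+|\s+$//g;    # trim both ends
    return $title;
}
```

A function like this could stand in for the regex step inside a File::Find callback. Because the parser decodes entities, the returned text would need to be escaped again before being written into a generated page.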
--fish
(Interwoven Senior Technical Consultant)
akshathp
Certainly, using HTML::Parser would be the better way to fetch your information. I agree with Ghoti.
I was thinking that since you are looking for just the <title>...</title> tag, a simple regex would do, but Ghoti's point is well taken: HTML constructs are not always well formed, and a regex can sometimes fetch undesired information from badly constructed HTML.
And yeah, why take the trouble of writing and perfecting something when it is already available?
Akshat Pramod Sharma
Interwoven Inc.