Using Filter Regexes to choose content processors
Aryan
Hello All,
The mt.411.user.pdf document gives the following as a reference solution on page 30:
"Create two content processors associated with the PDF file type. The first processor removes unwanted bits from memos (for example, the “To:” and “From:” lines), and the second processor removes tables of numbers. Create two preprocessor groups, associating each content processor with a different processor group. Add a filter regular expression (regex) field that matches on the path to the memos to the content processor for handling memos. Next, add a filter regex field that matches on the path to the papers with tables to remove to the content processor for handling those papers. Both content processors share the same set of fields, but each provides optimal results because the documents are properly preprocessed."
I do not understand how filter regexes can be used to choose content processors.
Could somebody please clarify or elaborate on the approach, i.e. how to go about doing this?
Thanks,
Aryan.
Comments
Migrateduser
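Consider two different document types - memo and report - with different prefixes in the file name (memo-23.doc and report-41.doc). Putting a filter regex in each content processor definition (.*memo-.*\.doc for one and .*report-.*\.doc for the other, with the dot before "doc" escaped so that it matches a literal period) means each processor only handles the files whose path matches its regex. That lets you use different models to analyze the different types of files.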
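To illustrate the idea outside the product, here is a minimal Python sketch of how a filter regex can pick a processor for a given path. It is not the actual configuration syntax: the processor names and the `choose_processor` function are hypothetical, and it assumes the regex has to match the whole path, which may differ from how the product applies its filters.

```python
import re

# Hypothetical processor names paired with their filter regexes.
# The patterns follow the reply above, with the dot before "doc" escaped.
FILTERS = [
    ("memo_processor", re.compile(r".*memo-.*\.doc")),
    ("report_processor", re.compile(r".*report-.*\.doc")),
]


def choose_processor(path):
    """Return the name of the first processor whose regex matches the whole path."""
    for name, pattern in FILTERS:
        if pattern.fullmatch(path):
            return name
    return None  # no filter matched, so no processor is selected


if __name__ == "__main__":
    for sample in ("/docs/memo-23.doc", "/docs/report-41.doc", "/docs/notes.txt"):
        print(sample, "->", choose_processor(sample))
```

Each path is tested against the filters in order, so a memo path only ever reaches the memo processor and a report path only the report processor. A file that matches neither filter is left alone.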
Aryan
Thanks a lot.
Aryan