Can anyone help me create a phrase index for a summary?
I have set up the following two files:
phrasetrain.cfg
<index>
<lexicon>notes</lexicon>
<script>tokenize -html; splitter; index -create; index_phrases -create -ignorepunctuation -ignorenumbers -ignoretags</script>
<extension>xml</extension>
<path>$DB_HOME/contents</path>
</index>
and
phraseindex.cfg
<index>
<lexicon>notes</lexicon>
<script>tokenize -html; splitter; phrase; tagger; index -create -indexphrases</script>
<extension>xml</extension>
<path>$DB_HOME/contents</path>
</index>
When I run
iwgenindex -config phrasetrain.cfg -verbose
I get:
Extract candidate phrases...
Sorting ...
Sorting ...
Compute global term weights...
Compute local term weights...
Compute document vectors...
Compute document lengths...
Commit changes.
ERROR: lexicon is empty
When I run
iwgenindex -config phraseindex.cfg -verbose
I get:
ERROR: Failed to open index 'notes_phrases': Db::open: No such file or directory
Usage: iwgenindex -config fn.cfg [f1...fn] [-help] [-noconvert] [-v] [-verbose]
My file type for xml is set up in metatagger.cfg as follows:
<fileType>
<label>XML Document</label>
<extension>xml</extension>
<converter>iwhtml2txt -all -converter "iwmtconverter -metadata &quot;%FILE%"</converter>
<preScanner>HTMLGeneric.ipr</preScanner>
<preprocessor>discoverPreProcessor.ipr</preprocessor>
<htmlMode>false</htmlMode>
</fileType>
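One sanity check I can run on my side, as a rough sketch only: count the files under the configured <path> that match the configured <extension>. An empty input set is one possible (unconfirmed) reason for the "lexicon is empty" error. The helper name count_inputs is made up for illustration; it uses only standard shell tools and does not call iwgenindex.

```shell
#!/bin/sh
# Hypothetical helper (not part of the indexing tools):
# print how many regular files under DIR have the extension EXT.
# Usage: count_inputs DIR EXT
count_inputs() {
    dir=$1
    ext=$2
    # A missing directory counts as zero inputs and returns non-zero.
    if [ ! -d "$dir" ]; then
        echo 0
        return 1
    fi
    # find lists matching files one per line; wc -l counts them;
    # tr strips the padding some wc implementations add.
    find "$dir" -type f -name "*.$ext" | wc -l | tr -d ' '
}
```

With the configs above this would be called as count_inputs "$DB_HOME/contents" xml. A result of 0 would suggest the indexer never sees any documents; a non-zero result would point back at the converter or the script line instead.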
Any ideas would be greatly appreciated.
Thanks,
cjk