How to Remove Bad Document from Operation?

Asif2012
Asif2012 Member
edited January 14, 2013 in Documentum #1

Hello, friends:

A couple of weeks ago I posted a question on how to import multiple documents to a repository, and got great help.  Today we ran into another issue, and would greatly appreciate your help.

The migration program, which I developed here at a Federal agency in Washington DC, works fine except for one situation.  When it finds a bad document, it still loads the document in my home cabinet (from where it would be auto-promoted to the final state) but does not set attributes on it, and then proceeds to process the next document(s).  The problem is that even though the next document is good, the program does not set attributes on it either.  It then goes on skipping the attributes for all subsequent documents.  As a result, even though all those subsequent documents make it to Documentum, they do not have any attributes set on them.  Here are the entries from the log, which show the names of the culprit files (the ones that fail to import).  The log shows that the first bad file somehow remained attached to the operation.  When the program found the next bad file, that file also got attached to the operation, and finally the third bad file got attached:


2013-01-12 01:24:17,692 ERROR [ DocumentLoader.java:843 ] - Import operation failed.
2013-01-12 01:24:17,692 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I38947_1_FR Doc. 2011-30045_2011-30045.pdf' as the content for document 'I38947_1_FR Doc. 2011-30045_2011-30045.pdf'.

The above set of 2 lines kept appearing in the log until the program found the 2nd bad file, and then the log entries became this:

2013-01-12 01:24:39,865 ERROR [ DocumentLoader.java:843 ] - Import operation failed.
2013-01-12 01:24:39,865 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I38947_1_FR Doc. 2011-30045_2011-30045.pdf' as the content for document 'I38947_1_FR Doc. 2011-30045_2011-30045.pdf'.
2013-01-12 01:24:39,865 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I39085_1_Information_IRAC_Schedule - 02-22-2012.doc' as the content for document 'I39085_1_Information_IRAC_Schedule - 02-22-2012.doc'.

The above set of 3 lines then kept appearing in the log until the program found the 3rd and last bad file, and then the log entries became this:

2013-01-12 01:25:00,162 ERROR [ DocumentLoader.java:843 ] - Import operation failed.
2013-01-12 01:25:00,162 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I38947_1_FR Doc. 2011-30045_2011-30045.pdf' as the content for document 'I38947_1_FR Doc. 2011-30045_2011-30045.pdf'.
2013-01-12 01:25:00,162 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I39085_1_Information_IRAC_Schedule - 02-22-2012.doc' as the content for document 'I39085_1_Information_IRAC_Schedule - 02-22-2012.doc'.
2013-01-12 01:25:00,162 ERROR [ DocumentLoader.java:852 ] - Could not set file 'C:\temp2\doc_migration\TYPE2_IRAC\I39175_1_Intelsat-22 GRANT SAT-LOA-20110929-00193 120315.pdf' as the content for document 'I39175_1_Intelsat-22 GRANT SAT-LOA-20110929-00193 120315.pdf'.

I am curious to know whether I should somehow clear the operation object of any previously attached documents or nodes in each iteration of the loop over the file list.  If yes, could you please show how to do it?  I do remove the node from the operation at the end of each iteration, as the code below shows:

IDfSession sess = null;

try {
    // Get session for the repository
    sess = sessMgr.getSession( REPO_NAME );

    // Create client
    IDfClientX clientX = new DfClientX( );

    // Get import operation from this client
    IDfImportOperation operation = clientX.getImportOperation( );

    // Set the session for the import operation
    operation.setSession( sess );

    LOGGER.info( "Got the session to the repo: " + sess.getDocbaseName( ) );

    String homeFolder = "/" + sess.getLoginUserName( );

    // Get the user's home folder by path
    IDfFolder folder = sess.getFolderByPath( homeFolder );

    // Folder could not be found, so bail out of here
    if( folder == null ) {
        LOGGER.error( "Could not obtain path to the home folder: " + homeFolder );
        throw new DfException( "Could not obtain path to the home folder: " +
                homeFolder );
    }

    LOGGER.info( "Obtained the folder with folder ID: " +
            folder.getObjectId( ).toString( ) );

    // Now set the destination folder ID on this operation
    operation.setDestinationFolderId( folder.getObjectId( ) );

    // Create image file reader
    FileReader imgFileReader = new FileReader( imageFile );

    // Create a CSV file reader to read ZYIMAGE file
    CSVReader csvReader = new CSVReader( imgFileReader );

    // Read the first line and ignore it because it contains headers.  We
    // do not need headers.
    String[] nextLine = csvReader.readNext( );

    int lineCount = 1;

    // Read ZYIMAGE file until the end of it
    while( ( nextLine = csvReader.readNext( ) ) != null ) {
        lineCount++;

        // Read first field, which is record number.
        String recordNum = trimString( nextLine[ 0 ] );

        // Record number is null, so ignore this line and continue
        if( recordNum == null ) {
            LOGGER.error( "No record number was found on line " + lineCount +
                    " of file " + imageFile.getPath( ) + ", so ignoring this line." );
            continue;
        }

        // Read file name and docket number
        String fileName = trimString( nextLine[ 1 ] );
        String docketNumStr = trimString( nextLine[ 2 ] );

        // Docket number is null, so ignore this line and continue
        if( docketNumStr == null ) {
            LOGGER.error( "No docket number was found on line " + lineCount +
                    " of file " + imageFile.getPath( ) + ", so ignoring this line." );
            continue;
        }

        // To convert docket number string to an integer
        int docketNum = -1;

        // Convert docket number string to an integer.  If conversion fails,
        // then ignore this line and continue.
        try {
            docketNum = Integer.parseInt( docketNumStr );
        } catch( NumberFormatException nfe ) {
            LOGGER.error( "Invalid docket number was found on line " + lineCount +
                    " of file " + imageFile.getPath( ) + ", so ignoring this line and document." );
            continue;
        }

        String declassDateStr = trimString( nextLine[ 8 ] );

        // To convert declassification date string to a date
        IDfTime declassDate = null;

        // Declassification date string not null and not blank, so convert it to date
        if( declassDateStr != null && !"".equals( declassDateStr ) ) {
            declassDate = new DfTime( declassDateStr );

            // This date is not valid, so ignore this line
            if( !declassDate.isValid( ) ) {
                LOGGER.error( "Declassification date " + declassDateStr + " on line " +
                        lineCount + " of file " + imageFile.getPath( ) + " is invalid, so " +
                        "ignoring this line." );
                continue;
            }
        }

        IDfSysObject busPolicyObj
            = (IDfSysObject) sess.getObject( new DfId( targetLifecycleId ) );

        // Create data file
        String dataFilePath = DATA_DIR + File.separator + inputFolder +
                File.separator + fileName;

        // Create a local file from this client
        IDfFile localFile = clientX.getFile( dataFilePath );

        // Add this file to this operation and get the file node from the operation
        IDfImportNode impNode
            = (IDfImportNode) operation.add( localFile );

        // Set custom object type.
        impNode.setDocbaseObjectType( targetDocType );

        // Set custom object name
        impNode.setNewObjectName( localFile.getName( ) );

        // Now execute the operation.
        // It succeeds, so now set attributes on all documents.
        if( operation.execute( ) ) {
            LOGGER.info( "Import operation succeeded.  " + localFile.getName( ) +
                    " was imported successfully.  Now setting attributes on it." );

            // Get all newly imported documents from the import operation
            IDfList newObjLst = operation.getNewObjects( );

            // Loop through all documents that this operation just imported
            for( int i = 0; i < newObjLst.getCount( ); i++ ) {
                // Get the document object from the list
                IDfDocument newObj = (IDfDocument) newObjLst.get( i );

                // Set docket number on this document
                newObj.setInt( "docket", docketNum );

                // Set declassification date
                newObj.setTime( "decladte", declassDate );

                // Now save the document in Documentum
                newObj.save( );

                // Attach the lifecycle policy
                newObj.attachPolicy( busPolicyObj.getObjectId( ), "", "" );

                // Now promote this document all the way to the final state
                newObj.promote( "", true, false );
                newObj.promote( "", true, false );

                LOGGER.info( "Created object: " + newObj.getObjectId( ) );
            } // End for loop for all documents that were just imported
        } // End if operation succeeds

        // Operation fails
        else {
            LOGGER.error( "Import operation failed." );

            // Get list of all errors from this operation
            IDfList errList = operation.getErrors( );

            // Loop through all errors and log them
            for( int i = 0; i < errList.getCount( ); i++ ) {
                IDfOperationError err = (IDfOperationError) errList.get( i );

                LOGGER.error( err.getMessage( ) );
            }
        } // End else the operation failed

        // Now remove the node for this file
        operation.removeNode( impNode );
    } // End while ZYIMAGE file still has lines to read

    // Close image file reader
    imgFileReader.close( );

    // Close the CSV file reader
    csvReader.close( );

    LOGGER.info( "" + ( lineCount - 1 ) + " document records were processed from " +
            imageFile.getPath( ) );
} catch( Exception ex ) {
    LOGGER.error( ex.getMessage( ) );
} finally {
    // Session is not null, so release it.
    if( sess != null ) {
        sessMgr.release( sess );
        LOGGER.info( "Session is closed." );
    }
}

And this is the utility method used in the above code:

private String trimString( final String input ) {
    String output = null;

    if( input != null ) {
        output = input.trim( );
    }

    return output;
}

Am I missing any cleanup work in each iteration (or outside it) in the above code?

Also, what could be wrong with the file names shown in the above log?  The program appears to have loaded those files into Documentum (despite the import failure), but when I click on those files in Documentum, I get an error saying there is no content.  Thanks in advance.

Asif

Comments

  • sanjeev6282
    sanjeev6282 Member
    edited January 14, 2013 #2

    I see that some of your files (possibly the ones you are calling bad files) have a " . " in their names. Is it possible to rename those files and then call the import operation on them? I am not very sure, but you may try the below as well:

    {
        // Importing the documents which are inside the folders.
        IDfImportOperation importOperation = new DfImportOperation();    // Getting the import operation.
        importOperation.setSession(session);           // Setting the session for the import.
        importOperation.setDestinationFolderId(id);    // Adding the destination folder.

        IDfImportNode node = (IDfImportNode) importOperation.add(localpath);

        // Check the node before using it, otherwise a null node fails on the next call.
        if (node == null)
        {
            DfLogger.debug(this, "Node is empty", null, null);
        }
        else
        {
            node.setDocbaseObjectType("<Object_Type>");

            if (!importOperation.execute())       // Checking if the import has been unsuccessful.
            {
                // Log the errors for this file here.
            }
            else
            {
                // All your "set" code goes here.
            }
        }
    }

  • Asif2012
    Asif2012 Member
    edited January 14, 2013 #3

    Those 3 files that I listed from log entries are all bad.  I have to test them further to figure out what makes them bad.  There are hundreds of other files that have multiple dots in their names, but they all made it to Documentum successfully.  So the dot does not seem to be an issue.

    Yes, we will probably rename these bad files before we try to import them again.  But we have millions of documents waiting to be imported, and we cannot possibly fix all the bad files among them in advance.  That's why I am looking for a method in the DFC API that can remove a bad file when the error occurs, so that the next good file can be imported successfully.  Ultimately, I hope that only the bad files will be left behind and we will identify them from the logs.
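For picking the bad files out of the logs afterwards, a minimal sketch might look like the following. It assumes the error text keeps the exact wording shown in the log entries above ("Could not set file '...' as the content for document"). The class name BadFileLogScanner and the method extractBadFiles are made up for illustration and are not part of DFC or of the migration program. Because the same bad file is reported again on every later failure, the paths are collected into a set so each one is listed only once.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the unique file paths reported as failed in the log.
class BadFileLogScanner {
    // Assumes the message wording shown in the log excerpts of this thread.
    private static final Pattern BAD_FILE = Pattern.compile(
            "Could not set file '(.+?)' as the content for document");

    // Returns each failed file path once, in order of first appearance.
    static Set<String> extractBadFiles(List<String> logLines) {
        Set<String> badFiles = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher m = BAD_FILE.matcher(line);
            // A single line may hold several messages, so keep searching.
            while (m.find()) {
                badFiles.add(m.group(1));
            }
        }
        return badFiles;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java BadFileLogScanner <path-to-log-file>
        if (args.length == 1) {
            List<String> lines = Files.readAllLines(Paths.get(args[0]));
            for (String path : extractBadFiles(lines)) {
                System.out.println(path);
            }
        }
    }
}
```

The resulting list could then be fed back into a second, smaller run once the files have been inspected or renamed.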

    I was reading the API documentation, and noticed that IDfOperation has a method called abort() that supposedly clears all file contents.  If I call it in the else part (when import fails), will this clear the bad file for the next operation?  What about iterating through the list of all nodes of this operation and then removing all IDfDocument objects?  Thanks.

    Asif
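
One way to picture the difference between reusing a single operation for the whole run and creating a new one per file, as the reply in #2 does, is the small model below. It is only a sketch and not DFC code: FakeImportOperation is a hypothetical stand-in that assumes, based purely on the log entries in this thread, that errors recorded on an operation stay with it across later execute() calls. Whether the real IDfImportOperation behaves this way internally is not confirmed here. In the posted program, the per-file variant would presumably mean moving the clientX.getImportOperation( ), setSession( ) and setDestinationFolderId( ) calls inside the while loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an import operation. NOT a DFC class.
class FakeImportOperation {
    // Assumption: errors are never cleared on a given instance.
    private final List<String> errors = new ArrayList<>();
    private String pending;

    void add(String fileName) {
        pending = fileName;
    }

    // Succeeds only if no error has ever been recorded on this instance.
    boolean execute() {
        if (pending != null && pending.startsWith("bad")) {
            errors.add("Could not set file '" + pending + "'");
        }
        pending = null;
        return errors.isEmpty();
    }
}

class PerFileOperationDemo {
    // One shared operation for the whole run, as in the posted loop.
    static List<String> importReusingOperation(List<String> files) {
        List<String> withAttributes = new ArrayList<>();
        FakeImportOperation op = new FakeImportOperation();
        for (String f : files) {
            op.add(f);
            if (op.execute()) {
                withAttributes.add(f); // attributes would be set here
            }
        }
        return withAttributes;
    }

    // A fresh operation per file, so an earlier failure cannot leak forward.
    static List<String> importWithFreshOperation(List<String> files) {
        List<String> withAttributes = new ArrayList<>();
        for (String f : files) {
            FakeImportOperation op = new FakeImportOperation();
            op.add(f);
            if (op.execute()) {
                withAttributes.add(f); // attributes would be set here
            }
        }
        return withAttributes;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("good1.pdf", "bad1.pdf", "good2.pdf", "good3.pdf");
        System.out.println(importReusingOperation(files));   // prints [good1.pdf]
        System.out.println(importWithFreshOperation(files)); // prints [good1.pdf, good2.pdf, good3.pdf]
    }
}
```

In the model, the shared operation stops reporting success after the first bad file, which matches the symptom described above, while the per-file variant only skips the bad file itself. The cost of creating an operation per document against a real repository is not measured here.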