Hi all,
I want to delete duplicate documents within a folder in Documentum.
The documents have the same object_name.
Does anyone have Java code I can use to do this?
Thank you
Here you go:
String docbase = PluginState.getDocbase();
IDfClientX clientx = PluginState.getClientX();
IDfClient client = PluginState.getLocalClient();
IDfSessionManager sMgr = PluginState.getSessionManager();
IDfSession session = sMgr.getSession(docbase);
IDfLoginInfo li = sMgr.getIdentity(docbase);

System.out.println("DFC Version: " + DfClient.getDFCVersion());
System.out.println("docbase=" + docbase);
System.out.println("logged in as " + li.getUser());

// folder id from which you want to delete duplicates
String folderId = "0b0f42e8801c7a86";

IDfQuery query = clientx.getQuery();
query.setDQL("SELECT object_name, count(*) as doc_count FROM dm_document"
        + " where any i_folder_id = '" + folderId + "'"
        + " GROUP BY object_name HAVING count(*) > 1");
IDfCollection col = null;
try {
    col = query.execute(session, IDfQuery.DF_QUERY);
    while (col.next()) {
        String docName = col.getString("object_name");
        int docCount = col.getInt("doc_count");
        System.out.println("Document Found: " + docName + " with count: " + docCount);

        // this query lists documents that would be deleted
        // comment block below if you just want to delete
        {
            IDfQuery queryDoc = clientx.getQuery();
            queryDoc.setDQL("SELECT r_object_id, r_modify_date, object_name FROM dm_document"
                    + " where any i_folder_id = '" + folderId + "'"
                    + " and object_name='" + docName + "'"
                    + " and r_modify_date < (select max(r_modify_date) FROM dm_document"
                    + " where any i_folder_id = '" + folderId + "'"
                    + " and object_name='" + docName + "')");
            IDfCollection colDoc = null;
            try {
                colDoc = queryDoc.execute(session, IDfQuery.DF_QUERY);
                while (colDoc.next()) {
                    System.out.println("\tShould delete document: " + colDoc.getString("r_object_id")
                            + ", last modified: " + colDoc.getString("r_modify_date"));
                }
            } catch (DfException dfEx) {
                System.out.println("ERROR Executing DQL: " + dfEx.getMessage());
            } finally {
                if (colDoc != null) {
                    try {
                        colDoc.close();
                    } catch (DfException closeEx) {
                    }
                }
            }
        }

        // this query deletes the documents
        // uncomment block below if you want to delete
        // {
        //     IDfQuery deleteDoc = clientx.getQuery();
        //     deleteDoc.setDQL("DELETE dm_document OBJECTS"
        //             + " where any i_folder_id = '" + folderId + "'"
        //             + " and object_name='" + docName + "'"
        //             + " and r_modify_date < (select max(r_modify_date) FROM dm_document"
        //             + " where any i_folder_id = '" + folderId + "'"
        //             + " and object_name='" + docName + "')");
        //     try {
        //         deleteDoc.execute(session, IDfQuery.DF_QUERY);
        //     }
        //     catch (DfException dfEx) {
        //         System.out.println("ERROR Executing DQL: " + dfEx.getMessage());
        //     }
        // }
    }
} catch (DfException dfEx) {
    System.out.println("ERROR Executing DQL: " + dfEx.getMessage());
} finally {
    if (col != null) {
        try {
            col.close();
        } catch (DfException closeEx) {
        }
    }
    if (session != null)
        sMgr.release(session);
}
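This code removes duplicates by object_name within the given folder. For every object_name that occurs more than once, it deletes each document whose r_modify_date is older than the latest r_modify_date for that name, so the most recently modified copy is not deleted. Because the comparison is a strict less-than, copies that share the latest r_modify_date are all kept.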
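If it helps to check the selection rule without touching a repository, here is a minimal plain-Java sketch of the same logic. It does not use DFC at all: DuplicateSelector, Doc, and idsToDelete are made-up names for illustration, and the sample ids and dates are invented. It only mirrors the predicate "r_modify_date < max(r_modify_date) for the same object_name".

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the "keep the newest copy per object_name" rule (no DFC involved). */
class DuplicateSelector {

    /** Hypothetical stand-in for one document row in the folder. */
    static final class Doc {
        final String objectId;
        final String objectName;
        final Instant modifyDate;

        Doc(String objectId, String objectName, Instant modifyDate) {
            this.objectId = objectId;
            this.objectName = objectName;
            this.modifyDate = modifyDate;
        }
    }

    /** Returns the ids of documents that are strictly older than the newest copy of their name. */
    static List<String> idsToDelete(List<Doc> docsInFolder) {
        // First pass: latest modify date per object_name (the role of the max() subquery).
        Map<String, Instant> latest = new HashMap<>();
        for (Doc d : docsInFolder) {
            latest.merge(d.objectName, d.modifyDate, (a, b) -> a.isAfter(b) ? a : b);
        }
        // Second pass: anything strictly older than that date is a deletion candidate.
        List<String> result = new ArrayList<>();
        for (Doc d : docsInFolder) {
            if (d.modifyDate.isBefore(latest.get(d.objectName))) {
                result.add(d.objectId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("id1", "report.doc", Instant.parse("2020-01-01T00:00:00Z")),
                new Doc("id2", "report.doc", Instant.parse("2021-06-01T00:00:00Z")),
                new Doc("id3", "report.doc", Instant.parse("2021-06-01T00:00:00Z")), // ties with id2
                new Doc("id4", "notes.txt", Instant.parse("2019-03-15T00:00:00Z"))); // unique name
        System.out.println(idsToDelete(docs)); // prints [id1]
    }
}
```

Note that id2 and id3 both survive in the sample run: they tie on the latest date, so neither is strictly older. If you need exactly one copy left per name, you would need an extra tie-breaker (for example on r_object_id), which the posted queries do not have.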
Before running the code provided by KJurkowski, first generate a report of how many documents will be affected by this operation. Here is how:
select object_name, count(*) as MYCOUNT from <your-object-type> group by object_name having count(*)>1
-Sarma
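Two small additions to the above, offered as a sketch rather than tested production code. First, the report query as written counts duplicates across the whole object type, while the delete code only looks at one folder, so the numbers may not match. Second, the posted code concatenates object_name straight into the DQL, which breaks for names containing a single quote. DQL string literals escape a single quote by doubling it. The helper below builds folder-scoped report and delete statements with that escaping applied. DuplicateDql and its method names are my own for illustration; the resulting strings would be passed to IDfQuery.setDQL as in the code earlier in the thread.

```java
/** Builds the DQL strings used in this thread, with single quotes in values escaped. */
class DuplicateDql {

    /** Wraps a value in single quotes and doubles any embedded single quote. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Report of duplicate names, limited to one folder (same shape as the first query in the posted code). */
    static String reportQuery(String folderId) {
        return "SELECT object_name, count(*) AS doc_count FROM dm_document"
                + " WHERE ANY i_folder_id = " + quote(folderId)
                + " GROUP BY object_name HAVING count(*) > 1";
    }

    /** Delete statement for one duplicated name; keeps the copies with the latest r_modify_date. */
    static String deleteQuery(String folderId, String docName) {
        String scope = "ANY i_folder_id = " + quote(folderId)
                + " AND object_name = " + quote(docName);
        return "DELETE dm_document OBJECTS WHERE " + scope
                + " AND r_modify_date < (SELECT max(r_modify_date) FROM dm_document WHERE " + scope + ")";
    }

    public static void main(String[] args) {
        String folderId = "0b0f42e8801c7a86"; // folder id taken from the code earlier in the thread
        System.out.println(reportQuery(folderId));
        System.out.println(deleteQuery(folderId, "Bob's report.doc")); // sample name, invented
    }
}
```

Run the report query first and compare its counts with what the "Should delete document" listing prints before you uncomment the delete block.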