dump-n-load fixer

jpantona
jpantona Member
edited December 18, 2014 in Documentum #1

I've done a lot of Documentum migrations over the years. I've used the various best practices: upgrade migration, writing custom code to manage the migration, Bulldozer, McLaren DocLoader, etc. I've even used dump-n-load on occasion for small jobs like migrating data from PROD to UAT. Dump-n-load comes with a lot of warnings and disclaimers, which I've found to be justified. All that said, I happened upon a circumstance where dump-n-load was the most practical approach for an 800 GB migration task. One thing dump-n-load does really well is handle custom object types and inter-related documents/folders, which frankly saves a lot of custom code from being written.

So I decided to take my chances with dump-n-load, knowing full well that there would be data and content problems to fix on the back end of the project. I spoke to EMC support before proceeding. After discouraging me, they conceded one piece of advice: do each dump-n-load pass at the cabinet or folder level. For instance, if my docbase had 5 cabinets, I would execute 5 separate dump-n-load instances, one per cabinet. In one case, where a single cabinet contained 400 GB of data, I broke the dump-n-load into smaller instances by subfolder.
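
For the curious, each pass boils down to saving a dm_dump_record whose predicate scopes the dump to one cabinet; the load side mirrors it with a dm_load_record pointing at the same file. Here is a rough DFC equivalent of the usual IAPI recipe, a minimal sketch in which the cabinet path and dump file are placeholders:

    import com.documentum.fc.client.IDfPersistentObject;
    import com.documentum.fc.client.IDfSession;
    import com.documentum.fc.common.DfException;

    public class DumpByCabinet {

        /** Queue a dump of everything under one cabinet.
         *  cabinetPath and dumpFile are placeholders. */
        public static void dumpCabinet(IDfSession session, String cabinetPath,
                                       String dumpFile) throws DfException {
            IDfPersistentObject rec = session.newObject("dm_dump_record");
            rec.setString("file_name", dumpFile);
            // type/predicate are paired repeating attributes that scope the dump
            rec.appendString("type", "dm_sysobject");
            rec.appendString("predicate", "FOLDER('" + cabinetPath + "', DESCEND)");
            rec.save(); // saving the record is what kicks off the dump
        }
    }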

When all was said and done, the migration was fast and effective. For the most part, everything worked in terms of features and functions. But I identified the following problems:

  1. the r_ancestor_id repeating values held references to object IDs in the old system; this broke the breadcrumb in the Webtop interface
  2. the r_link_cnt values were incorrect or negative, which made it impossible to descend into the affected folders
  3. the i_folder_id values referenced old/incorrect object IDs, so many folders and documents migrated successfully but were invisible because they weren't properly linked to a folder in the new system (a relinking sketch for problems 1-3 appears further down)
  4. Finally, in many cases the dmr_content object associated with a document was missing, or the content itself was corrupted or incomplete (a content-repair sketch also appears below)

Sounds like a mess, right? No worries; I had a plan.

I wrote code to systematically fix all these problems (attached). The key aspects of the architecture: the code runs in an Eclipse environment, with dfc.properties entries for both the old and new Content Servers, plus MS SQL ODBC connections to the old and new databases. I offer this code to the universe as a token of gratitude for all the problems I've solved with the help of information found in these and other forums. I'm sure you can find a way to adapt this code to your purposes.
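
The attached code is more involved than what follows, but the gist is easy to show in a few stripped-down sketches. First, getting sessions to both repositories; this assumes both docbrokers are reachable from your dfc.properties, and the docbase names and credentials are placeholders:

    import com.documentum.com.DfClientX;
    import com.documentum.fc.client.IDfClient;
    import com.documentum.fc.client.IDfSession;
    import com.documentum.fc.client.IDfSessionManager;
    import com.documentum.fc.common.DfException;
    import com.documentum.fc.common.DfLoginInfo;
    import com.documentum.fc.common.IDfLoginInfo;

    public class Sessions {

        /** Obtain a session for one repository. */
        public static IDfSession connect(String docbase, String user,
                                         String password) throws DfException {
            IDfClient client = new DfClientX().getLocalClient();
            IDfSessionManager sm = client.newSessionManager();
            IDfLoginInfo login = new DfLoginInfo();
            login.setUser(user);
            login.setPassword(password);
            sm.setIdentity(docbase, login);
            return sm.getSession(docbase);
        }
    }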
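
Next, the relinking pass for problems 1-3. It assumes an idMap (old r_object_id to new r_object_id) built beforehand; I derived mine through the database connections, but any key that survives the migration, such as folder path or object_name, can work. Going through unlink()/link() lets the Content Server maintain r_link_cnt itself. The r_ancestor_id values on folders live in dm_folder_r, and one practical route there is a direct SQL UPDATE driven by the same map.

    import com.documentum.fc.client.IDfSession;
    import com.documentum.fc.client.IDfSysObject;
    import com.documentum.fc.common.DfException;
    import com.documentum.fc.common.DfId;

    import java.util.Map;

    public class RelinkFixer {

        /** Re-link one migrated document whose i_folder_id values still
         *  point at folder IDs from the old docbase. idMap is hypothetical:
         *  old r_object_id mapped to the corresponding new r_object_id. */
        public static void fix(IDfSession session, String docId,
                               Map<String, String> idMap) throws DfException {
            IDfSysObject doc = (IDfSysObject) session.getObject(new DfId(docId));
            boolean changed = false;
            // walk the repeating attribute backwards so unlink() doesn't
            // shift the remaining indexes
            for (int i = doc.getValueCount("i_folder_id") - 1; i >= 0; i--) {
                String folderId = doc.getRepeatingString("i_folder_id", i);
                String mapped = idMap.get(folderId);
                if (mapped == null) {
                    continue; // already a valid ID in the new docbase
                }
                doc.unlink(folderId); // drop the stale link
                doc.link(mapped);     // server recomputes r_link_cnt on save
                changed = true;
            }
            if (changed) {
                doc.save();
            }
        }
    }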
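
Finally, problem 4: compare content between the two repositories and re-copy whatever is missing or short. Comparing sizes is the crude version; a checksum is better if you can afford the I/O. Again, a minimal sketch rather than the attached code verbatim:

    import com.documentum.fc.client.IDfSysObject;
    import com.documentum.fc.common.DfException;

    public class ContentCheck {

        /** Compare primary-content sizes and re-copy from source to target
         *  when they disagree. The scratch path is a placeholder. */
        public static void repair(IDfSysObject oldDoc, IDfSysObject newDoc)
                throws DfException {
            // if the dmr_content is missing entirely, expect 0 or an
            // exception here; handle according to your data
            long oldSize = oldDoc.getContentSize();
            long newSize = newDoc.getContentSize();
            if (oldSize == newSize) {
                return; // sizes agree; treat the content as intact
            }
            // export from the source docbase, then re-import into the target
            String scratch = "/tmp/" + oldDoc.getObjectId().getId();
            oldDoc.getFile(scratch);
            newDoc.setFile(scratch);
            newDoc.save();
        }
    }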

Comments

  • jsilver
    jsilver Member
    edited December 18, 2014 #2

    Thanks for the contribution, and glad to see that you're finding this community helpful.