Hello OTCS Community,
I'm looking for a way to determine whether there are any duplicate documents within a Content Server system. I would define a duplicate as: Same Name and Same File Size.
Looking through old forum posts, I see that I am not the first person to ask this question. Unfortunately, reading through those posts, it seems that no satisfactory answer has been posted.
My DB platform is MS SQL.
The ideal report would show:
Name, ObjectID1, ObjectID2, ... ObjectIDn (where n is the number of duplicates)
The document name and the object ID of each of the offending instances.
Unfortunately, I cannot figure out the relationship between DTree and DVersData well enough to write a SQL statement that produces a meaningful result.
I was disappointed to see that this report is not part of the standard reports that ship with Content Server v10.5.
The best I have been able to come up with is:
SELECT Name, COUNT(Name) AS NumberOfTimes
FROM DTree
WHERE SubType = 144
GROUP BY Name HAVING (COUNT(Name)>1)
ORDER BY Name DESC
This report returns each document name that is used more than once, along with the number of times that name is used. It does NOT compare the file size at all, nor does it look at old versions of documents, etc.
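For reference, here is an untested sketch of the direction I think the query needs to go in order to also compare file size. It assumes that DVersData has one row per document version, that DVersData.DocID matches DTree.DataID, that DVersData.Version matches DTree.VersionNum for the current version, and that DVersData.DataSize holds the file size in bytes. I have not confirmed any of these column relationships, so please treat them as assumptions. If DVersData holds more than one row per version, the counts would be inflated.

```sql
-- Sketch only: the join columns below are assumptions about the schema.
--   DTree.DataID, DTree.Name, DTree.SubType, DTree.VersionNum
--   DVersData.DocID, DVersData.Version, DVersData.DataSize
WITH CurrentDocs AS (
    -- One row per document, carrying the size of its current version
    SELECT d.DataID, d.Name, v.DataSize
    FROM DTree d
    JOIN DVersData v
      ON v.DocID = d.DataID
     AND v.Version = d.VersionNum
    WHERE d.SubType = 144
),
Flagged AS (
    -- Count how many documents share the same Name and DataSize
    SELECT DataID, Name, DataSize,
           COUNT(*) OVER (PARTITION BY Name, DataSize) AS NumberOfTimes
    FROM CurrentDocs
)
SELECT Name, DataSize, NumberOfTimes, DataID
FROM Flagged
WHERE NumberOfTimes > 1
ORDER BY Name, DataSize, DataID;
```

This would return one row per offending document rather than a single row listing ObjectID1 ... ObjectIDn, and it only looks at the current version of each document. Rolling the IDs up into one row per name would need something like FOR XML PATH, or STRING_AGG on SQL Server 2017 and later.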
Has anyone found a better solution to this expected problem with a Document Management system?
Regards,
-MC