Hi – I need to split one multi-page PDF file into multiple single-page PDFs. I would prefer to do it inside Content Server, using native tools. Does anybody have experience with this?
Hi Marian,
I haven't done it within Content Server itself (maybe some of our partners have, with OScript?), but I used CWS (Content Web Services) to download specific PDFs, and then used the PDFsharp .NET library in my C# application to merge, print, extract, etc., whatever I needed to do.
It's open-source:
http://www.pdfsharp.net/
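For anyone who wants to see the shape of the split step, here is a minimal sketch. It uses Python and the third-party pypdf package rather than PDFsharp (a swap made only for brevity; the same read-a-page, write-a-page loop applies in C#). The function names `page_file_name` and `split_pdf` and the output naming scheme are assumptions made for illustration, not anything provided by Content Server.

```python
from pathlib import Path


def page_file_name(source_name: str, page_number: int, page_count: int) -> str:
    """Build an output name such as 'invoice_p03.pdf' (hypothetical naming scheme)."""
    width = max(2, len(str(page_count)))  # zero-pad so files sort in page order
    return f"{Path(source_name).stem}_p{page_number:0{width}d}.pdf"


def split_pdf(source: Path, out_dir: Path) -> list[Path]:
    """Write every page of `source` to its own single-page PDF in `out_dir`."""
    # Third-party dependency (pip install pypdf); imported here so the
    # naming helper above stays usable without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(source))
    page_count = len(reader.pages)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()          # a fresh writer per page gives one-page files
        writer.add_page(page)
        target = out_dir / page_file_name(source.name, number, page_count)
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Calling `split_pdf(Path("invoice.pdf"), Path("out"))` would leave one file per page in `out`; downloading the source from Content Server and uploading the results is a separate step.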
Thanks,
Nizar
Thank you Nizar, this is good. Can you give me a quick idea of how to create an automated solution, starting from a user dropping a PDF in a dedicated CS folder?
In my mind, an automated solution that would be easy to implement would be something on a timer.
In Windows, this would be Task Scheduler running a specific application. In *nix land, a cron job, I'd imagine.
Have your .NET/Java application run every X minutes, check a specific parentID for any subtype 144s with .PDF extension or whatever, check the content/metadata.
If you need to do work on it, download the file, split/merge/extract, and then upload it wherever.
Set some sort of category data or something as a flag. This is the flag you will check, as metadata, to recognize a PDF that has been "completed", so you can skip it every time.
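The steps above can be sketched as a single polling pass. This is only an outline under stated assumptions: the `Node` class, the `processed` field (standing in for the category flag), and the five callbacks passed to `run_once` are all hypothetical placeholders for whatever CWS or REST calls your application actually makes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

DOCUMENT_SUBTYPE = 144  # the document subtype mentioned above


@dataclass
class Node:
    """Hypothetical stand-in for a Content Server node returned by your client."""
    node_id: int
    name: str
    subtype: int
    processed: bool = False  # stands in for the category "flag"


def needs_work(node: Node) -> bool:
    """True for unflagged documents whose name ends in .pdf."""
    return (
        node.subtype == DOCUMENT_SUBTYPE
        and node.name.lower().endswith(".pdf")
        and not node.processed
    )


def run_once(
    list_children: Callable[[], Iterable[Node]],
    download: Callable[[Node], bytes],
    split: Callable[[str, bytes], Iterable[Tuple[str, bytes]]],
    upload: Callable[[str, bytes], None],
    mark_processed: Callable[[Node], None],
) -> List[int]:
    """One timer tick: process every pending PDF and return the handled node IDs."""
    handled = []
    for node in list_children():
        if not needs_work(node):
            continue  # wrong type, wrong extension, or already flagged
        data = download(node)
        for part_name, part_bytes in split(node.name, data):
            upload(part_name, part_bytes)
        mark_processed(node)  # set the flag only after all uploads succeeded
        handled.append(node.node_id)
    return handled
```

Task Scheduler or cron would then invoke a small script that calls `run_once` with real implementations of the callbacks. Setting the flag last means a failed run gets retried on the next tick instead of being skipped.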
Does that help?
Was thinking about this and if you wanted to get really fancy you could do a web application of some sort (.NET or Tomcat hosted) that sits listening and eats PDFs, does the transform, and uploads the results to Content Server. In my mental image of this you could send it PDFs via WebReport trigger – PDF gets uploaded to a particular folder, the WR can send it out using the web app as a destination or through its REST client (probably a few ways to do it, really). I would imagine you could even set arguments for things like destination folders, etc. through the WR too.
Just some food for thought that popped in my head reading this thread.
AK
From: eLink Entry: Content Server Development Forum <development@elinkkc.opentext.com>
Sent: Tuesday, October 9, 2018 10:41 AM
To: eLink Recipient <devnull@elinkkc.opentext.com>
Subject: Need to split one pdf on multiple single-page pdfs
Posted by nghazal@opentext.com (Ghazal, Nizar) On 10/09/2018 10:35 AM
Thank you Nizar and Alex. If possible, I’d like to have the entire solution inside OpenText Content Server:
Can step 3.a be done in CS, e.g. can I call C# code from a WR? Or can I use JavaScript inside a WR to do this?
You can't call C# directly from a WebReport, but the WebReport can POST to a given address using the RESTClient, for processing. In the @Body you could conceivably provide the binary data, which a WebReport trigger would certainly give you when someone adds something to that folder. That would require a web app like the one Alex suggested.
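To make the web app idea a little more concrete, here is a minimal sketch of the receiving side, using only the Python standard library. Everything about it is an assumption for illustration: the port, the status codes, the `%PDF-` signature check, and the `process_pdf` callback (which is where the split and the upload back to Content Server would go). How the WebReport builds the request body is not shown.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_handler(process_pdf: Callable[[bytes], None]):
    """Build a request handler that passes each POSTed PDF body to `process_pdf`."""

    class PdfHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            if not body.startswith(b"%PDF-"):
                # Reject anything that does not start with the PDF signature.
                self.send_response(400)
                self.end_headers()
                return
            process_pdf(body)  # hypothetical hook: split, then upload the pages
            self.send_response(202)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; use real logging in practice

    return PdfHandler


def serve(process_pdf: Callable[[bytes], None], host: str = "127.0.0.1", port: int = 8080):
    """Listen until interrupted, handing every accepted PDF to `process_pdf`."""
    server = HTTPServer((host, port), make_handler(process_pdf))
    server.serve_forever()
```

A real deployment would also need authentication and a way to pass arguments such as the destination folder, for example as query parameters or headers set by the WebReport.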
You could certainly import and use any JavaScript libraries you want, but that would require some sort of web browser to run the JavaScript itself.
That basically means you would need to run the WR manually for the JavaScript to do anything.
Sorry, I can’t think of a way to do step 3 without leaving CS in some fashion. You could probably hit CWS through Javascript, but I’m not sure you can use that library Nizar mentioned that way, and it could end up being over-complicated and/or slow. Not to mention that throws out scheduling/triggers since JS in a WR only runs in the browser.
There might be some sort of 3rd party thing… maybe Adlib or Blazon (technically OT now, but it’s still new to me)? Not sure what the integration would be like for either. I know we typically use Adlib for the Renditions module, but I don’t think renditioning fits the use case here, at least.
Posted by marian.farkas@orthoclinicaldiagnostics.com (Farkas, Marian) On 10/09/2018 12:45 PM
Thank you both. I think I will go the C# way: I will create a shared network folder on the CS server where users can drop files, have the results posted to another folder visible to CS, and process them with a WR.
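The drop-folder approach can be sketched roughly as follows. This is Python rather than C#, just to show the shape. The folder layout, the `process_drop_folder` name, and the `handle_pdf` callback (where the actual split and the write to the results folder would happen) are assumptions for illustration.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def process_drop_folder(
    drop_dir: Path,
    done_dir: Path,
    handle_pdf: Callable[[Path], None],
) -> List[str]:
    """Handle every PDF in `drop_dir`, then move it to `done_dir` so it is not redone."""
    done_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    # sorted() materializes the listing first, so moving files mid-loop is safe.
    for path in sorted(drop_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue  # ignore subfolders and non-PDF files
        handle_pdf(path)  # hypothetical hook: split into single pages
        shutil.move(str(path), str(done_dir / path.name))
        processed.append(path.name)
    return processed
```

Moving the original out of the drop folder plays the same role as the metadata flag in the polling approach: a file that has been handled is never picked up again.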