Extracting OCR content using REST API



Having only a node ID, I am trying to get a document's indexed OCR content, the text that is displayed in search results in the browser's Classic View. I wasn't able to find any REST call that does this. Is anyone aware of a way to do it? A solution outside of REST would be fine too.



  • Are you after the rendition or the content that is added to the search index?

  • I am after the search index content, e.g. when the uploaded PDF is an image and the text in the image is OCR'd by Content Server. I am aware of the /content REST call that returns the document in binary form, but that's not what I am looking for.

  • appuq

    That text comes from the search summary, so the search API may return that info.
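
To illustrate the search API suggestion above, here is a rough Python sketch. It is not a confirmed recipe. The `/api/v2/search` path, the `where` form parameter, the `OTCSTicket` header and the `OTDataID` region name are all assumptions to verify against your Content Server version, and the base URL, ticket and node id are placeholders. Because the response layout is not known here, the helper simply walks the returned JSON and collects any field whose name contains "summary".

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url, ticket, node_id):
    """Build (but do not send) a search request restricted to one node."""
    # Assumption: the search endpoint is /api/v2/search and takes a
    # form-encoded 'where' query.
    url = base_url.rstrip("/") + "/api/v2/search"
    # Assumption: OTDataID is the index region that holds the node id.
    body = urllib.parse.urlencode({"where": f"OTDataID:{node_id}"}).encode("ascii")
    req = urllib.request.Request(url, data=body, method="POST")
    # Assumption: the authentication ticket is passed in an OTCSTicket header.
    req.add_header("OTCSTicket", ticket)
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


def find_summaries(payload):
    """Recursively collect string/list values stored under keys containing 'summary'."""
    found = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if "summary" in key.lower() and isinstance(value, (str, list)):
                found.append(value)
            else:
                found.extend(find_summaries(value))
    elif isinstance(payload, list):
        for item in payload:
            found.extend(find_summaries(item))
    return found


def fetch_summaries(base_url, ticket, node_id):
    """Send the search request and return whatever summary fields come back."""
    req = build_search_request(base_url, ticket, node_id)
    with urllib.request.urlopen(req) as resp:
        return find_summaries(json.load(resp))
```

Calling `fetch_summaries("https://myserver/otcs/cs.exe", ticket, 12345)` (hypothetical values) would then return the summary fields, if any. Note that a search summary is probably only an excerpt of the indexed text, so this may not yield the full OCR content.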