XpdfAnalyze is an affordable developer library (SDK) that determines the object types and colors used on one or more pages of a PDF file. It reports four object types: images, text strings, strokes (lines), and fills (filled polygons).
Object-type information can be used to categorize PDF files as image-only, text-only or image-and-text.
Color information includes color spaces (DeviceRGB, DeviceCMYK, Separation, etc.), as well as information on which process colors (CMYK) and/or custom colors (spot colors) are used.
XpdfAnalyze is easy to use:
#include "XpdfAnalyze.h"
PDFHandle pdf;
int err, n;
err = pdfLoadFile(&pdf, "c:/test/file.pdf");
if (err != pdfOk) {
    /* handle the error */
}
/* analyze pages 1-3 */
pdfAnalyzePages(pdf, 1, 3);
/* get the number of images on pages 1-3 */
n = pdfGetNumImages(pdf);
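The image-only, text-only, and image-and-text categories mentioned above could be derived from object counts roughly as follows. This is a minimal sketch, not part of XpdfAnalyze: the `PdfCategory` enum and the `classifyPages` and `categoryName` helpers are hypothetical names, and the function takes plain integer counts so that it does not depend on any library call. The image count could come from `pdfGetNumImages`; how the text-string count is obtained is left to the SDK documentation. Strokes and fills are ignored here for simplicity.

```c
#include <stdio.h>

/* Hypothetical category labels for illustration; not defined by XpdfAnalyze. */
typedef enum {
    PDF_CATEGORY_EMPTY,          /* neither images nor text found */
    PDF_CATEGORY_IMAGE_ONLY,
    PDF_CATEGORY_TEXT_ONLY,
    PDF_CATEGORY_IMAGE_AND_TEXT
} PdfCategory;

/* Map object counts for the analyzed pages to a category.
 * numImages and numTextStrings are assumed to be non-negative counts. */
PdfCategory classifyPages(int numImages, int numTextStrings) {
    if (numImages > 0 && numTextStrings > 0) {
        return PDF_CATEGORY_IMAGE_AND_TEXT;
    }
    if (numImages > 0) {
        return PDF_CATEGORY_IMAGE_ONLY;
    }
    if (numTextStrings > 0) {
        return PDF_CATEGORY_TEXT_ONLY;
    }
    return PDF_CATEGORY_EMPTY;
}

/* Return a printable name for a category. */
const char *categoryName(PdfCategory category) {
    switch (category) {
    case PDF_CATEGORY_IMAGE_ONLY:     return "image-only";
    case PDF_CATEGORY_TEXT_ONLY:      return "text-only";
    case PDF_CATEGORY_IMAGE_AND_TEXT: return "image-and-text";
    default:                          return "empty";
    }
}
```

For example, a scanned document with 12 page images and no text strings would be classified by `classifyPages(12, 0)` as image-only, which is often a sign that the file needs OCR before it can be searched.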
This particular product runs on Windows computers. However, Mac and Linux shared libraries are also available, as is portable C++ source code. 32-bit and 64-bit versions are available for all platforms. Contact us for details.