Article

Keywords:
OCR; Google
Summary:
We discuss in detail several drawbacks of PDF files produced from mathematical papers prepared in TeX, particularly with respect to indexing, copy/paste, and OCR.
References:
1. Mijajlović, Ž., Ognjanović, Z.: , In: Sojka, P. (ed.) Proceedings of DML 2008, Birmingham, UK, July 27th.
2. Mijajlović, Ž., Ognjanović, Z., Pejović, A.: , NCD Review vol. 12, 2008, 43–48, http://elib.mi.sanu.ac.rs/
3. Google Webmaster Tools. http://www.google.com/webmasters/tools/
4. Baker, J.B., Sexton, A.P., Sorge, V.: , ibid, 75–79. Zbl 1176.68080
5. Adobe CMap and CIDFont Files Specification. http://www.adobe.com/devnet/font/pdfs/5014.CIDFont_Spec.pdf
7. PDF Reference 1.7. 6th Ed., , http://www.adobe.com/devnet/pdf/
8. The Comprehensive TeX Archive Network. http://www.ctan.org/
9. The cmap package developed by V. Volovich. http://tug.ctan.org/tex-archive/macros/latex/contrib/cmap/
10. PDF as a standard for archiving. White paper, http://www.adobe.com/enterprise/pdfs/pdfarchiving.pdf