Article

Title: PDF Enhancements Tools for a Digital Library (English)
Author: Hatlapatka, Radim
Author: Sojka, Petr
Language: English
Journal: Towards a Digital Mathematics Library. Paris, France, July 7-8th, 2010
Volume:
Issue: 2010
Year:
Pages: 45-55
Category: math
Summary: This paper describes several innovative PDF document enhancements and tools that can be used when building a digital library. The main result is pdfJbIm, a PDF re-compression tool built on the jbig2enc encoder. It reduces the size of the original bitonal PDFs by one third on average. Some modifications to the jbig2enc encoder that increase the compression ratio even further are also described. Together with another program, pdfsizeopt.py by Péter Szabó, we have decreased PDF storage size to such an extent that the transmission needs of a digital library were significantly reduced. We report the storage savings achieved on the Czech Digital Mathematics Library (DML-CZ), where we have downsized the PDF corpus to 43% of its original size. We also describe the pdfsign tool for batch digital signature stamping of PDF documents. (English)
Keyword: jbig2enc
Keyword: JBIG2
Keyword: PDF size optimization
Keyword: compression
Keyword: DML
Keyword: digital signature
Keyword: JB2
Keyword: DjVu
Keyword: pdfsign
Keyword: DML-CZ
Keyword: EuDML
Keyword: pdfsizeopt.py
Keyword: Google
Keyword: JB2 algorithm
MSC: 68-06
MSC: 68U10
MSC: 68U15
MSC: 68U99
Date available: 2011-07-18T09:44:46Z
Last updated: 2012-08-27
Stable URL: http://hdl.handle.net/10338.dmlcz/702572
Reference: 1. Bartošek, M., Lhoták, M., Rákosník, J., Sojka, P., Šárfy, M.: DML-CZ: The Objectives and the First Steps. In: Borwein, J., Rocha, E.M., Rodrigues, J.F. (eds.) CMDE 2006: Communicating Mathematics in the Digital Era, pp. 69–79. A. K. Peters, MA, USA (2008) MR 2590568
Reference: 2. Bloomberg, D.: Leptonica. [online] (2010), [cit. 2010-04-25], http://www.leptonica.com/jbig2.html
Reference: 3. Bočák, P.: Digitáne podpisované PDF dokumenty (Bachelor thesis written in Czech, Digital signatures of PDF documents). Masaryk University, Faculty of Informatics (advisor Petr Sojka), Brno, Czech Republic (2008)
Reference: 4. Bottou, L., Haffner, P., Howard, P.G., Simard, P., Bengio, Y., Le Cun, Y.: High Quality Document Image Compression with DjVu. Journal of Electronic Imaging 7(3), 410–425 (1998), http://leon.bottou.org/papers/bottou-98
Reference: 5. Lowagie, B.: IText PDF. [online] (2009), http://www.itextpdf.com/
Reference: 6. JBIG Committee: 14492 FCD. ISO/IEC JTC 1/SC 29/WG 1 (1999), http://www.jpeg.org/public/fcd14492.pdf
Reference: 7. The Apache Software Foundation: Apache PDFBox – Java PDF Library. [online] (2010), http://pdfbox.apache.org/
Reference: 8. Hatlapatka, R.: JBIG2 komprese (Bachelor thesis written in Czech, JBIG2 compression). Masaryk University, Faculty of Informatics (advisor Petr Sojka), Brno, Czech Republic (2010)
Reference: 9. Hatlapatka, R.: PDF Recompression using JBIG2. [online] (2010), http://nlp.fi.muni.cz/projekty/eudml/pdfRecompression/
Reference: 10. Hatlapatka, R.: Source codes of pdfJbIm. [online] (2010), http://code.google.com/p/pdfrecompressor/
Reference: 11. Howard, P.: Text image compression using soft pattern matching. Computer Journal 40(2/3), 146–156 (1997)
Reference: 12. ISO/IEC JTC1/SC29/WG1: JBIG Maui Meeting Press Release. (December 1999), http://www.jpeg.org/public/mauijbig.pdf
Reference: 13. Langley, A.: Homepage of jbig2enc encoder. [online], http://github.com/agl/jbig2enc
Reference: 14. Sylwestrzak, W., Borbinha, J., Bouche, T., Nowiński, A., Sojka, P.: EuDML—Towards the European Digital Mathematics Library. In: Sojka, P. (ed.) Proceedings of DML 2010. Masaryk University Press, Paris, France (Jul 2010)
Reference: 15. Adobe Systems Incorporated: PDF Reference. pp. 90–100. Adobe Systems Incorporated, sixth edn. (2006), http://www.adobe.com/devnet/acrobat/pdfs/pdf_reference_1-7.pdf
Reference: 16. Szabó, P.: Optimizing PDF output size of TeX documents. TUGboat 30(3), 112–130 (2009), [cit. 2010-04-26], http://code.google.com/p/pdfsizeopt/
Reference: 17. International Telecommunication Union: ITU-T Recommendation T.88. (2000), http://www.itu.int/rec/T-REC-T.88-200002-I/en

Files

File: DML_003-2010-1_8.pdf
Size: 373.7 Kb
Format: application/pdf