How can I merge PDF files without duplicating fonts?
I need to merge about 100 PDF files into one, where each file uses more or less the same unsubsetted fonts. All the options I have tried so far (pdfunite, gs, etc.) are not intelligent about font duplication: the merged PDF ends up with 100 copies of the same font and is therefore much larger than it needs to be.
Is there a way to do any one of the following:
- Merge the PDFs without duplicating fonts?
- De-duplicate the fonts in the PDF later?
- Remove fonts from the PDF entirely?
The ideal solution will have a commercial-friendly open source license (e.g. not AGPL).
pdf ghostscript poppler
2
stackoverflow.com/questions/21979200/…
– Tom Brossman
Nov 2 '18 at 19:24
@TomBrossman iText's PdfSmartCopy, which the solution you linked to relies on, would have been an option, except for the AGPL license.
– user2771609
Nov 2 '18 at 20:23
@TomBrossman You are not wrong, but please don't make askubuntu toxic and be polite, you are violating the code of conduct.
– user2771609
Nov 3 '18 at 15:38
1
Thank you for identifying this 'toxic' matter, I suggest you flag any code of conduct breaches you identify to the moderators of this site so they can take a look at them.
– Tom Brossman
Nov 3 '18 at 17:33
edited Dec 31 '18 at 19:53 by Kurt Pfeifle
asked Nov 1 '18 at 21:32 by user2771609
1 Answer
Contrary to what you say, recent versions of Ghostscript have become quite efficient at merging multiple PDFs into a single one while avoiding embedding an identical font multiple times.
Inputs
Here are the details about 3 input PDFs, which I'll merge into a single output:
for i in {1..3}; do pdffonts ${i}.pdf ; echo ; done
name                       type              encoding         emb sub uni object ID
-------------------------- ----------------- ---------------- --- --- --- ---------
Helvetica                  Type 1C           WinAnsi          yes no  no       8  0

name                       type              encoding         emb sub uni object ID
-------------------------- ----------------- ---------------- --- --- --- ---------
Helvetica                  Type 1C           WinAnsi          yes no  no       8  0

name                       type              encoding         emb sub uni object ID
-------------------------- ----------------- ---------------- --- --- --- ---------
Helvetica                  Type 1C           WinAnsi          yes no  no       8  0
Merging
Now merge these three PDF input files with the help of pdftk:
pdftk 1.pdf 2.pdf 3.pdf cat output merged.pdf
Output
Now check the font status of the output merged.pdf:
pdffonts merged.pdf
name                       type              encoding         emb sub uni object ID
-------------------------- ----------------- ---------------- --- --- --- ---------
Helvetica                  Type 1C           WinAnsi          yes no  no       5  0
Helvetica                  Type 1C           WinAnsi          yes no  no      14  0
Helvetica                  Type 1C           WinAnsi          yes no  no      23  0
Ok, not yet there...
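With 100 input files, reading the pdffonts listing by eye gets tedious. As a rough sketch, the listing could be summarized by counting how many entries share a font name. The helper name count_fonts_by_name is hypothetical (it is not part of pdffonts), and it assumes font names contain no spaces, so that the name is the first whitespace-separated field.

```shell
# Hypothetical helper (not part of the original answer): reads pdffonts
# output on stdin, skips the two header lines, and prints how many font
# entries share each font name, most frequent first.
# Assumption: font names contain no spaces, so the name is field 1.
count_fonts_by_name() {
  awk 'NR > 2 && NF > 0 { count[$1]++ }
       END { for (name in count) print count[name], name }' | sort -rn
}

# Usage: pdffonts merged.pdf | count_fonts_by_name
```

A count above 1 for a font name suggests that the font is present more than once. The same pipe can be pointed at any later output file to see whether the count has dropped.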
Optimize with Ghostscript
gs -o optim.pdf -sDEVICE=pdfwrite merged.pdf
GPL Ghostscript GIT PRERELEASE 9.27 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 3.
Page 1
Page 2
Page 3
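Ghostscript also accepts several input files on one command line, so the pdftk step may not be strictly necessary. The wrapper below is only a sketch of that one-step variant: the function name gs_merge is made up for illustration, and I have not shown here that the one-step route de-duplicates fonts in the same way as the two-step route, so the result should be checked with pdffonts.

```shell
# Hypothetical wrapper (an assumption, not from the original answer):
# merge any number of PDFs straight into one file with Ghostscript's
# pdfwrite device. First argument is the output file, the rest are inputs.
gs_merge() {
  out=$1
  shift
  # -q suppresses the startup banner; -o names the output file.
  gs -q -o "$out" -sDEVICE=pdfwrite "$@"
}

# Usage: gs_merge optim.pdf 1.pdf 2.pdf 3.pdf
```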
Check font statuses and file sizes
ls -lh {1..3}.pdf merged.pdf optim.pdf
-rw-r--r-- 1 kurtpfeifle staff 51K Dec 31 20:25 1.pdf
-rw-r--r-- 1 kurtpfeifle staff 51K Dec 31 20:25 2.pdf
-rw-r--r-- 1 kurtpfeifle staff 51K Dec 31 20:25 3.pdf
-rw-r--r-- 1 kurtpfeifle staff 147K Dec 31 20:32 merged.pdf
-rw-r--r-- 1 kurtpfeifle staff 7.5K Dec 31 20:34 optim.pdf
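To put a number on the difference, the byte sizes of two files can be compared directly. This is a minimal sketch; percent_saved is a hypothetical helper name, and it assumes the first file is not empty.

```shell
# Hypothetical helper (not from the original answer): prints by how many
# percent the second file is smaller than the first, rounded down.
# Assumption: the first file is non-empty (otherwise this divides by zero).
percent_saved() {
  before=$(wc -c < "$1")
  after=$(wc -c < "$2")
  echo $(( (before - after) * 100 / before ))
}

# Usage: percent_saved merged.pdf optim.pdf
```

For the sizes listed above (147K down to 7.5K) this works out to a reduction of roughly 95%.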
Conclusion
I tested this with Ghostscript v9.25.
If this doesn't work for you, you'll need to...
- ...tell us the version of Ghostscript you are using;
- ...provide a link to (some of) your input PDFs for more detailed analysis.
I'm aware that this answer does not provide you with a solution that exactly meets your license requirements. But your false statement about Ghostscript prompted me to give this answer anyway, so that other people interested in this topic can still benefit from it.
answered Dec 31 '18 at 19:43 by Kurt Pfeifle