
tabula — Extract tables from PDF files. More information: <https://github.com/tabulapdf/tabula-java#commandli

Submitted by: @import:tldr-pages

Problem

How to use the tabula command to extract tables from PDF files. More information: <https://github.com/tabulapdf/tabula-java#commandline-usage-examples>.

Solution

tabula — Extract tables from PDF files. More information: <https://github.com/tabulapdf/tabula-java#commandline-usage-examples>.

Extract tables from all pages of a PDF to a CSV file (without --pages, only page 1 is read):
tabula {{[-p|--pages]}} all {{[-o|--outfile]}} {{file.csv}} {{file.pdf}}


Extract tables from all pages of a PDF to a JSON file:
tabula {{[-p|--pages]}} all {{[-f|--format]}} JSON {{[-o|--outfile]}} {{file.json}} {{file.pdf}}


Extract tables from pages 1, 2, 3, and 6 of a PDF:
tabula {{[-p|--pages]}} 1-3,6 {{file.pdf}}
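The page specification combines ranges and single pages separated by commas. As a rough illustration of how 1-3,6 maps to pages 1, 2, 3, and 6, here is a minimal sketch in POSIX shell. The expand_pages function is a hypothetical helper written for this note and is not part of tabula; it only handles simple numeric ranges and passes anything else (such as all) through unchanged.

```shell
#!/bin/sh
# expand_pages: hypothetical helper (not part of tabula) that expands a
# page spec such as "1-3,6" into the individual page numbers it covers.
# The body runs in a subshell "( ... )" so its variables do not leak.
expand_pages() (
    result=""
    # Split the spec on commas.
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case $part in
            *-*)
                # A range: take the numbers on either side of the dash.
                start=${part%-*}
                end=${part#*-}
                page=$start
                while [ "$page" -le "$end" ]; do
                    result="$result $page"
                    page=$((page + 1))
                done
                ;;
            *)
                # A single page (or a word such as "all"): keep as-is.
                result="$result $part"
                ;;
        esac
    done
    # Drop the leading space before printing.
    printf '%s\n' "${result# }"
)

expand_pages "1-3,6"   # prints: 1 2 3 6
```

This can be handy for checking a long page list before handing it to --pages; tabula itself does the expansion internally.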


Extract tables from page 1 of a PDF, guessing which portion of the page to examine:
tabula {{[-g|--guess]}} {{[-p|--pages]}} 1 {{file.pdf}}


Extract tables from all pages of a PDF, using ruling lines to determine cell boundaries:
tabula {{[-p|--pages]}} all {{[-r|--spreadsheet]}} {{file.pdf}}


Extract tables from all pages of a PDF, using blank space to determine cell boundaries:
tabula {{[-p|--pages]}} all {{[-n|--no-spreadsheet]}} {{file.pdf}}
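The two boundary-detection modes can be wrapped in a small dry-run helper that prints the tabula command for a given PDF instead of running it, so the flags can be reviewed first. This is only a sketch: tabula_command and the method labels lattice and stream are hypothetical names chosen here, the output file name is derived by swapping the .pdf extension for .csv, and --pages all is included on the assumption that every page should be processed (tabula-java documents page 1 as the default).

```shell
#!/bin/sh
# tabula_command: hypothetical helper that prints (does not run) a tabula
# command line for one PDF.
#   $1 = method label: "lattice" (ruling lines) or "stream" (blank space)
#   $2 = input PDF path (assumed to end in .pdf and contain no spaces)
tabula_command() {
    case $1 in
        lattice) flag="--spreadsheet" ;;
        stream)  flag="--no-spreadsheet" ;;
        *)
            echo "unknown method: $1" >&2
            return 1
            ;;
    esac
    # Write the CSV next to the PDF: report.pdf -> report.csv
    printf 'tabula %s --pages all --outfile %s %s\n' \
        "$flag" "${2%.pdf}.csv" "$2"
}

tabula_command lattice report.pdf
```

Once the printed command looks right, it can be pasted into a terminal or piped to sh. Paths containing spaces would need quoting that this sketch does not add.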

Context

tldr-pages: common/tabula
