Merge pull request #713 from oxguy3/tabula

Add tabula-extractor
2024-11-01 11:17:56 +01:00 · 2016-01-29 15:21:35 +02:00 · b48c6b94f5
commit b48c6b94f5
parent 787919b264 5a9da6c4d6
1 changed file with 27 additions and 0 deletions
--- a/pages/common/tabula.md
+++ b/pages/common/tabula.md
@@ -0,0 +1,27 @@
+# tabula
+
+> Extract tables from PDF files.
+
+- Extract all tables from a PDF to a CSV file:
+
+`tabula -o {{file.csv}} {{file.pdf}}`
+
+- Extract all tables from a PDF to a JSON file:
+
+`tabula --format JSON -o {{file.json}} {{file.pdf}}`
+
+- Extract tables from pages 1, 2, 3, and 6 of a PDF:
+
+`tabula --pages {{1-3,6}} {{file.pdf}}`
+
+- Extract tables from page 1 of a PDF, guessing which portion of the page to examine:
+
+`tabula --guess --pages {{1}} {{file.pdf}}`
+
+- Extract all tables from a PDF, using ruling lines to determine cell boundaries:
+
+`tabula --spreadsheet {{file.pdf}}`
+
+- Extract all tables from a PDF, using blank space to determine cell boundaries:
+
+`tabula --no-spreadsheet {{file.pdf}}`
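
The examples in this page combine a small set of flags, so a script that runs several extractions could assemble the command line programmatically. Below is a minimal sketch, assuming only the flags shown above (`-o`, `--format`, `--pages`, `--guess`, `--spreadsheet`, `--no-spreadsheet`). The helper name `build_tabula_args` and its parameters are hypothetical and not part of tabula itself. It only builds the argument list and does not run the tool.

```python
import shlex
from typing import List, Optional


def build_tabula_args(
    pdf: str,
    output: Optional[str] = None,
    fmt: Optional[str] = None,
    pages: Optional[str] = None,
    guess: bool = False,
    spreadsheet: Optional[bool] = None,
) -> List[str]:
    """Assemble an argv list using only the flags shown in this page.

    Hypothetical helper: the flag names come from the examples above,
    everything else (function name, parameters) is an assumption.
    """
    args = ["tabula"]
    if fmt is not None:
        args += ["--format", fmt]
    if guess:
        args.append("--guess")
    # None leaves the cell-boundary method unspecified (tool default).
    if spreadsheet is True:
        args.append("--spreadsheet")
    elif spreadsheet is False:
        args.append("--no-spreadsheet")
    if pages is not None:
        args += ["--pages", pages]
    if output is not None:
        args += ["-o", output]
    args.append(pdf)
    return args


if __name__ == "__main__":
    # Print the command as a shell-quoted string instead of executing it.
    print(shlex.join(build_tabula_args("file.pdf", output="file.json", fmt="JSON")))
```

The resulting list could be passed to `subprocess.run(args, check=True)` on a machine where `tabula` is installed and on the `PATH`. Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.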