Friday, January 23, 2026

How to Extract PDF Tables in Python?

This article shows how to extract tables from a PDF file in Python. First, let's look at what a PDF file is.

PDF (Portable Document Format) is a file format that preserves the layout and content of a printed document, so that you can view, navigate, print, or forward it to someone else. PDF files are commonly created using Adobe Acrobat, among other tools.

Example:

Suppose a PDF file contains the following table:

User_ID  Name   Occupation
1        David  Product Manager
2        Leo    IT Administrator
3        John   Lawyer

We want to read this table into a Python program. There are several ways to do this. Let's discuss them one by one.

Method 1: Using tabula-py

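tabula-py is a simple Python wrapper around tabula-java, which can read tables in a PDF. Because it wraps a Java program, it needs a Java runtime installed on the machine. You can install tabula-py, together with tabulate for printing, using the commands: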

pip install tabula-py
pip install tabulate

The functions used in the example are:

read_pdf(): reads the tables from the PDF file at the given path and returns them as a list of pandas DataFrames

tabulate(): formats tabular data as a plain-text table

The PDF file used in this example is named abc.pdf.

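Python3

from tabula import read_pdf
from tabulate import tabulate

# read_pdf returns a list of DataFrames, one per detected table
tables = read_pdf("abc.pdf", pages="all")  # path to the PDF file

# print each extracted table as plain text, using column names as headers
for table in tables:
    print(tabulate(table, headers="keys", showindex=False))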
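
read_pdf() also accepts lattice and stream keyword arguments, which select between tabula-java's two detection modes: lattice is meant for tables with ruled cell borders, stream for tables whose columns are separated only by whitespace. The sketch below shows one way to switch between them. The helper names build_read_options and extract_tables are hypothetical, and the import sits inside the function on the assumption that tabula-py and Java may not be available everywhere.

```python
def build_read_options(pages="all", ruled=True):
    """Return keyword arguments for tabula.read_pdf (hypothetical helper)."""
    options = {"pages": pages, "multiple_tables": True}
    # lattice: tables with drawn cell borders
    # stream: tables whose columns are separated by whitespace only
    options["lattice" if ruled else "stream"] = True
    return options


def extract_tables(pdf_path, pages="all", ruled=True):
    """Read tables from pdf_path and return a list of DataFrames."""
    # Imported lazily so build_read_options can be used without Java installed.
    from tabula import read_pdf

    return read_pdf(pdf_path, **build_read_options(pages, ruled))
```

For example, extract_tables("abc.pdf", pages="1", ruled=False) would read only the first page in stream mode. Which mode works better depends on how the table is drawn in the PDF, so it can be worth trying both.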



Method 2: Using Camelot

Camelot is a Python library for extracting tables from PDF files. You can install the camelot-py library using the command:

pip install camelot-py

The methods used in the example are:

read_pdf(): reads the tables from the PDF file at the given path

tables[index].df: returns the table at the given index as a pandas DataFrame

The PDF file used in this example is named test.pdf.

Python3

import camelot

# extract the tables from every page of the PDF file
# (without pages="all", Camelot reads only the first page)
tables = camelot.read_pdf("test.pdf", pages="all")  # path to the PDF file

# print the first table as a pandas DataFrame
print(tables[0].df)
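
Each Camelot table also carries a parsing_report dictionary (with accuracy, whitespace, order and page entries) and a to_csv() method. The following is a minimal sketch of using both to check extraction quality and save the results. The helper names summarize_report and export_tables are hypothetical, and the output file naming is an assumption made for illustration.

```python
def summarize_report(report):
    """Format a Camelot parsing_report dict as a one-line summary."""
    return (
        "page {page}, table {order}: "
        "accuracy {accuracy:.1f}%, whitespace {whitespace:.1f}%"
    ).format(**report)


def export_tables(pdf_path, csv_prefix="table"):
    """Save every table found in pdf_path as a CSV file; return the count."""
    # Imported lazily so summarize_report can be used without Camelot installed.
    import camelot

    count = 0
    for table in camelot.read_pdf(pdf_path, pages="all"):
        print(summarize_report(table.parsing_report))
        table.to_csv(f"{csv_prefix}_{count}.csv")  # hypothetical naming scheme
        count += 1
    return count
```

Calling export_tables("test.pdf") would write table_0.csv, table_1.csv, and so on. A low accuracy value in the report is a hint that the table may need a different flavor ("lattice" or "stream") passed to camelot.read_pdf().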
