BeautifulSoup: Your Secret Weapon for Web Table Scraping Excellence!

Scraping a table from the web.

Let's continue our web scraping journey with another practical task: scraping an HTML table from a web page and saving it to an Excel file.

Web table scraping with BeautifulSoup.

- Updated: 2024-07-23 by Andrey BRATUS, Senior Data Analyst.

This time we will use BeautifulSoup, another useful Python library for scraping tasks.

Let's start by importing requests, bs4 and pandas. You will most likely need to install them in your environment first, along with lxml, which the parser used below depends on.

import requests
from bs4 import BeautifulSoup
import pandas as pd

Then we define the URL of the target page and request the page with requests.

url = ''
page = requests.get(url)
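
The request above does not check whether the page actually came back. Below is a minimal sketch of a more defensive fetch. The helper name fetch_html and the 10-second timeout are assumptions made for illustration, not part of the original code.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the HTML text of a page, raising on network or HTTP errors.

    fetch_html and the default timeout are hypothetical choices for this sketch.
    """
    response = requests.get(url, timeout=timeout)
    # Turn 4xx/5xx status codes into an exception instead of parsing an error page
    response.raise_for_status()
    return response.text
```

With this helper, an empty or malformed URL fails early with a requests exception, instead of surfacing later as a missing table.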

Now we parse the page's HTML into a BeautifulSoup object using the lxml parser.

soup = BeautifulSoup(page.text, 'lxml')

Now we obtain the table by finding the first table HTML tag on the page.

table1 = soup.find('table')
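
Keep in mind that find returns only the first matching tag, and returns None when nothing matches. The sketch below shows both cases on a small inline page. The HTML, the table ids and the variable names are made up for illustration, and the built-in html.parser is used so the example runs without lxml.

```python
from bs4 import BeautifulSoup

# Small inline page standing in for a real response (hypothetical data)
html = """
<html><body>
  <table id="layout"><tr><td>menu</td></tr></table>
  <table id="prices">
    <tr><th>Item</th><th>Price</th></tr>
    <tr><td>Tea</td><td>3</td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

first_table = soup.find("table")                 # first table in document order
prices_table = soup.find("table", id="prices")   # table selected by its id attribute
missing_table = soup.find("table", id="absent")  # no match, so find() returns None

if missing_table is None:
    print("No table with that id was found")
print(first_table["id"], prices_table["id"])
```

If the page holds several tables, selecting by an attribute such as id is usually safer than relying on the first match.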

Then we collect the title of every column from the th HTML tags.

headers = []
for i in table1.find_all('th'):
    title = i.text
    # Store each column title so it can be used for the dataframe
    headers.append(title)
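
Header cells on real pages often carry stray spaces and line breaks. One possible way to clean them, sketched here on a made-up table, is get_text(strip=True) instead of .text. The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical table whose header cells contain extra whitespace
html = """
<table>
  <tr>
    <th> Country </th>
    <th>
      Population
    </th>
  </tr>
</table>
"""
table = BeautifulSoup(html, "html.parser").find("table")

# .text keeps the surrounding whitespace exactly as it appears in the HTML
raw_headers = [th.text for th in table.find_all("th")]

# get_text(strip=True) trims leading and trailing whitespace from each cell
clean_headers = [th.get_text(strip=True) for th in table.find_all("th")]
print(clean_headers)
```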

Now we are ready to create a dataframe with the collected headers.

mydata = pd.DataFrame(columns = headers)

It's time to fill in the data using a for loop.

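# Skip the first row, which holds the column titles
for j in table1.find_all('tr')[1:]:
    row_data = j.find_all('td')
    row = [i.text for i in row_data]
    # Append the row at the next free index of the dataframe
    length = len(mydata)
    mydata.loc[length] = row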
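
As an alternative to adding rows one by one, the rows can be collected in a plain list and turned into a dataframe in a single step, which tends to be faster on large tables. This is a sketch on made-up inline HTML, not the original method. The check that skips rows whose cell count differs from the header count is an added assumption.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical table standing in for a scraped page
html = """
<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Ann</td><td>90</td></tr>
  <tr><td>Bob</td><td>85</td></tr>
</table>
"""
table = BeautifulSoup(html, "html.parser").find("table")

headers = [th.get_text(strip=True) for th in table.find_all("th")]

rows = []
for tr in table.find_all("tr")[1:]:
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    # Keep only rows whose cell count matches the number of headers
    if len(cells) == len(headers):
        rows.append(cells)

# Build the dataframe once from all collected rows
df = pd.DataFrame(rows, columns=headers)
print(df)
```

Note that every scraped value is a string at this point. Numeric columns would still need converting, for example with pd.to_numeric.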

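And now it's the end of our magic: saving the table as an Excel file. Recent pandas versions no longer write the legacy .xls format, so we save as .xlsx, which requires the openpyxl package.

mydata.to_excel("python-pro.xlsx", index=False)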
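
A quick way to confirm the export worked is to read the file back and compare. The sketch below uses a hypothetical dataframe and a temporary .xlsx file, and assumes the openpyxl package is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for the scraped table
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Score": ["90", "85"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "python-pro-check.xlsx")
    df.to_excel(path, index=False)
    # dtype=str keeps the cell values as text, the way they were scraped
    restored = pd.read_excel(path, dtype=str)

# Compare the values and column names of the original and the restored table
same_values = restored.values.tolist() == df.values.tolist()
same_columns = list(restored.columns) == list(df.columns)
print(same_values, same_columns)
```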

