Checking the website.
If you are a webmaster, you may need to check the online availability of your website's URLs on a regular basis, and Python can help. The script below uses your site's sitemap to get the list of available URLs.
Updated: 2024-12-20 by Andrey BRATUS, Senior Data Analyst.
The logic of the Python code below is really simple:
- First, install advertools (pip install advertools), which converts your sitemap to a pandas DataFrame.
- Then import the necessary libraries.
- Convert the sitemap to a DataFrame and then extract the URLs as a list.
- Slice the list if you want to check just part of your website, e.g. only the first 100 URLs.
- Using a for loop, check the status code of each URL; a response of 200 means OK.
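The slicing step works on any Python list, so it can be tried on its own before touching a real sitemap. The sketch below is only an illustration: the helper name take_slice and the example.com URLs are hypothetical and not part of the original script.

```python
def take_slice(urls, start=0, count=100):
    """Return up to `count` URLs beginning at index `start`."""
    return urls[start:start + count]


# Hypothetical URLs, for illustration only.
urls = [f"https://example.com/page-{n}/" for n in range(250)]

first_100 = take_slice(urls)               # URLs with index 0..99
last_batch = take_slice(urls, start=200)   # only 50 URLs remain from index 200
```

Slicing past the end of a list does not raise an error in Python, so the last batch simply comes back shorter.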
Python code for checking the status of a website's URLs:
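import advertools as adv
import urllib.request
import urllib.error
import time

# Download the sitemap and convert it to a DataFrame; the "loc" column holds the URLs.
sitemap_urls = adv.sitemap_to_df("https://python-code.pro/static/img/sitemap.xml")
urls = sitemap_urls["loc"].to_list()
urls = urls[:5]  # check only the first 5 URLs

for url in urls:
    try:
        status_code = urllib.request.urlopen(url).getcode()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; report the code instead of crashing.
        status_code = err.code
    print(url)
    print(status_code)
    time.sleep(3)  # pause between requests to avoid overloading the server

print(f'Totally {len(urls)} URLs checked, status code 200 is oK.')
print('Thats all, folks !!!')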
OUT:
INFO:root:Getting https://python-code.pro/static/img/sitemap.xml
https://python-code.pro/
200
https://python-code.pro/robots.txt
200
https://python-code.pro/regression-models-python-cheatsheets/
200
https://python-code.pro/regression-models-r-cheatsheets/
200
https://python-code.pro/classification-models-python-r-cheatsheets/
200
Totally 5 URLs checked, status code 200 is oK.
Thats all, folks !!!
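For a larger site it may be more convenient to collect the results than to print them one by one. The following is a minimal sketch under stated assumptions, not the author's method: the function names fetch_status, check_urls and summarize are hypothetical, the 10-second timeout is an arbitrary choice, and the demo uses made-up example.com URLs with a fake lookup so that it runs without network access.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server could not be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions but still carry a status code.
        return err.code
    except OSError:
        # Covers URLError (DNS failure, refused connection) and socket timeouts.
        return None


def check_urls(urls, fetch=fetch_status):
    """Map each URL to its status code using the given fetch function."""
    return {url: fetch(url) for url in urls}


def summarize(results):
    """Count how many URLs returned each status code."""
    return Counter(results.values())


if __name__ == "__main__":
    # Hypothetical stand-in for real requests so the demo runs offline.
    fake_codes = {
        "https://example.com/": 200,
        "https://example.com/missing": 404,
    }
    results = check_urls(list(fake_codes), fetch=fake_codes.get)
    print(summarize(results))
```

To check a real site, call check_urls with the list obtained from the sitemap and leave the fetch argument at its default. The HTTPError handler has to come before the OSError handler, because HTTPError is a subclass of URLError, which in turn is a subclass of OSError.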