Hello programmers, today we are going to perform web scraping using Python. In this post we will extract the text of every heading tag (h1 to h6) and the title tag from a web page. To do this we need the following libraries.
Requirements:
BeautifulSoup
Requests
If you have already installed them, skip the installation part and start coding. If not, install them using the following commands.
Installation:
Install the required libraries with pip:
pip install beautifulsoup4
pip install requests
After you run the above commands one by one, both libraries will be installed on your PC along with their dependencies.
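If you want to confirm that the installation worked before moving on, a small check like the one below can help. This is only a sketch: the helper name missing_packages is made up for this post, and it relies on the fact that beautifulsoup4 is imported under the name bs4.

```python
import importlib.util


def missing_packages(names):
    """Return the import names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # These are import names, not pip names: beautifulsoup4 is imported as bs4.
    missing = missing_packages(["bs4", "requests"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

If anything is reported as not installed, run the matching pip command again.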
When done, we can move on to the coding section.
Copy or type the code below into a .py file, or simply download the .py file by clicking on the download button, and then run it from your terminal.
Code:
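import requests
from bs4 import BeautifulSoup
# Sample website; change it to the page you want to scrape
link = 'http://www.bitsolve.in/'
response = requests.get(link)
# 'html.parser' ships with Python; 'lxml' also works if you install it separately
soup = BeautifulSoup(response.text, 'html.parser')
print("List of all header tags:")
# find_all accepts a list of tag names and returns matches in page order
for tag in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6", "title"]):
    print(tag.name + ' ' + tag.text.strip())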
The code above extracts the text of every heading tag, as well as the title, from the given web page. Here we have taken www.bitsolve.in as a sample website, but you can change it to any website you like.
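If you plan to reuse this on several websites, the same idea can be wrapped in functions. The version below is a minimal sketch, not the only way to do it: the names extract_headings, fetch_headings and SAMPLE_HTML are invented for this post, the 10 second timeout is an arbitrary choice, and the demo parses a small inline HTML string so that it runs without a network connection.

```python
import requests
from bs4 import BeautifulSoup

HEADING_TAGS = ["h1", "h2", "h3", "h4", "h5", "h6", "title"]

# Made-up HTML used only for the offline demo below.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Main heading</h1>
    <p>Some paragraph text.</p>
    <h2>  Sub heading  </h2>
  </body>
</html>
"""


def extract_headings(html):
    """Return (tag name, text) pairs for every heading and title tag in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    return [(tag.name, tag.get_text(strip=True)) for tag in soup.find_all(HEADING_TAGS)]


def fetch_headings(url, timeout=10):
    """Download `url` and return its headings; raises on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    return extract_headings(response.text)


if __name__ == "__main__":
    for name, text in extract_headings(SAMPLE_HTML):
        print(name, text)
```

To scrape a live page instead of the sample string, call fetch_headings with the address of the website, for example fetch_headings('http://www.bitsolve.in/'). The timeout and the raise_for_status call keep the script from hanging or silently parsing an error page.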
After you run the above code in a terminal, it prints the name and text of each matching tag, one per line.
Result:
That’s it. We have now done the web scraping. Try it yourself. If any problem persists, feel free to comment below. We will try to solve it as soon as possible.
Happy Coding…!!