What is urllib?
urllib is a Python standard-library module for opening URLs. It defines functions and classes for working with URLs, such as opening them and reading the data they return.
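As a small aside on the URL helpers, the sketch below splits a URL into its parts with urllib.parse.urlparse from the Python 3 standard library. It is only an illustration and not part of the walkthrough that follows; the variable names are made up here, and nothing is downloaded.

```python
from urllib.parse import urlparse

# The URL used in this tutorial; parsing happens locally, no request is sent.
url = "https://www.youtube.com/user/guru99com"
parts = urlparse(url)

print(parts.scheme)  # https
print(parts.netloc)  # www.youtube.com
print(parts.path)    # /user/guru99com
```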
With Python you can retrieve data from the internet, such as XML, HTML, and JSON, and work with that data directly. In this tutorial we are going to see how to retrieve data from the web. As an example, we use a guru99 YouTube URL: we open this URL using Python and print the HTML returned for it.
Before we run the code to connect to Internet data, we need to add an import statement for the URL library module, “urllib”.
- Import urllib (urllib2 in Python 2, urllib.request in Python 3)
- Define your main function
- Declare the variable webUrl
- Then call the urlopen function from the urllib library
- The URL we are opening is the guru99 page on YouTube
- Next, we are going to print the result code
- The result code is retrieved by calling the getcode function on the webUrl variable we have created
- We convert it to a string so that it can be concatenated with our string “result code”
- This will be a regular HTTP code “200”, indicating that the HTTP request was processed successfully
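One detail the steps above do not cover: for error responses such as 404, urlopen raises urllib.error.HTTPError instead of returning a response, so getcode is never reached. Below is a minimal Python 3 sketch of one way to handle that, assuming you want the status code either way. The helper name fetch_status and the timeout value are illustrative choices, not part of the original example.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    fetch_status is a hypothetical helper name used only for this sketch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as web_url:
            return web_url.getcode()
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions; the code is on exc.code
        return exc.code
    except urllib.error.URLError:
        # DNS failure, refused connection, and similar: no status code exists
        return None
```

Calling fetch_status("https://www.youtube.com/user/guru99com") would be expected to return 200 when the page loads normally. HTTPError is caught before URLError because it is a subclass of URLError.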
You can also read the HTML of the page by using the read function in Python; when you run the code, the HTML appears in the console.
- Call the read function on the webUrl variable
- The read function returns the contents of the response
- Read the entire content of the URL into a variable called data
- Run the code: it prints the data, which is the HTML of the page
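In Python 3, read returns bytes rather than text, which is why the printed output starts with b'...'. The following is a hedged sketch of decoding it to a string, assuming the server declares its charset in the Content-Type header and falling back to UTF-8 otherwise. The helper name fetch_text is hypothetical.

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Return the body of url decoded to str (fetch_text is a hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as web_url:
        data = web_url.read()  # bytes in Python 3
        # charset from the Content-Type header; UTF-8 is an assumed fallback
        charset = web_url.headers.get_content_charset() or "utf-8"
    # errors="replace" keeps going if a byte does not fit the charset
    return data.decode(charset, errors="replace")
```

With this helper, print(fetch_text("https://www.youtube.com/user/guru99com")) would print readable HTML instead of a bytes literal.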
Here is the complete code
Python 2 Example
    #
    # read the data from the URL and print it
    #
    import urllib2

    def main():
        # open a connection to a URL using urllib2
        webUrl = urllib2.urlopen("https://www.youtube.com/user/guru99com")

        # get the result code and print it
        print "result code: " + str(webUrl.getcode())

        # read the data from the URL and print it
        data = webUrl.read()
        print data

    if __name__ == "__main__":
        main()
Python 3 Example
    #
    # read the data from the URL and print it
    #
    import urllib.request

    # open a connection to a URL using urllib
    webUrl = urllib.request.urlopen('https://www.youtube.com/user/guru99com')

    # get the result code and print it
    print("result code: " + str(webUrl.getcode()))

    # read the data from the URL and print it
    data = webUrl.read()
    print(data)