Python Internet Access using urllib.request and urlopen()

What is urllib?

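urllib is Python's standard library package for working with URLs. It defines functions and classes for opening URLs and handling the responses; in Python 3 these live in the urllib.request module, while Python 2 provided them in urllib2.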

With Python you can retrieve data from the internet in formats such as XML, HTML, and JSON, and then work with that data directly. This tutorial shows how to retrieve data from the web. As an example, we use a guru99 YouTube URL: we open it with Python and print the HTML that it returns.

How to Open URL using Urllib

Before we can run code that connects to the internet, we need an import statement for the URL library module, “urllib”.

  • Import urllib (urllib2 in Python 2, urllib.request in Python 3)
  • Define your main function
  • Declare the variable webUrl
  • Assign it the result of calling the urlopen function from the urllib library
  • The URL we are opening is the guru99 page on YouTube
  • Next, we print the result code
  • The result code is retrieved by calling the getcode function on the webUrl variable we created
  • We convert it to a string so that it can be concatenated with our string “result code: ”
  • This will normally be the HTTP code “200”, indicating that the HTTP request was processed successfully
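
The steps above can be sketched in Python 3 roughly as follows. The helper name result_code_line and its opener parameter are illustrative assumptions, not part of urllib; the parameter exists only so the function can be tried out without a network connection.

```python
import urllib.request


def result_code_line(url, opener=urllib.request.urlopen):
    """Open `url` and return the text 'result code: <status>'.

    `result_code_line` and `opener` are hypothetical names for this sketch;
    `opener` defaults to the real urllib.request.urlopen.
    """
    # urlopen returns a response object; the tutorial calls it webUrl
    webUrl = opener(url)
    try:
        # getcode() returns the HTTP status as an int, e.g. 200 on success
        code = webUrl.getcode()
    finally:
        # release the underlying connection once we are done with it
        webUrl.close()
    # str() converts the int so it can be concatenated with the label
    return "result code: " + str(code)
```

Calling print(result_code_line("https://www.youtube.com/user/guru99com")) would be expected to print "result code: 200" when the request succeeds; the actual value depends on how the server responds.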

How to get HTML file from URL in Python

You can also read the HTML by calling the “read” function on the opened URL. When you run the code, the HTML appears in the console.

  • Call the read function on the webUrl variable
  • The read function returns the contents of the response, much like reading a data file
  • Read the entire content of the URL into a variable called data
  • Run the code: it prints the HTML held in data
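
In Python 3, read() returns a bytes object rather than text, so printing it directly shows a b'...' literal. Below is a minimal sketch of turning the bytes into a string, assuming the server declares its character set in the Content-Type header and falling back to UTF-8 when it does not. The function name read_html is hypothetical.

```python
import urllib.request


def read_html(url):
    """Return the body at `url` as text (hypothetical helper for this sketch)."""
    # the with-block closes the response once the body has been read
    with urllib.request.urlopen(url) as webUrl:
        # read() pulls the entire response body into memory as bytes
        data = webUrl.read()
        # the charset comes from the Content-Type header, if the server sent one
        charset = webUrl.headers.get_content_charset() or "utf-8"
    # errors="replace" is an assumption: undecodable bytes become U+FFFD
    return data.decode(charset, errors="replace")
```

For the tutorial's URL this could be used as print(read_html("https://www.youtube.com/user/guru99com")). Reading everything at once is fine for a single page, but may not suit very large downloads.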

Here is the complete code

Python 2 Example

#
# read the data from the URL and print it
#
import urllib2

def main():
    # open a connection to a URL using urllib2
    webUrl = urllib2.urlopen("https://www.youtube.com/user/guru99com")

    # get the result code and print it
    print "result code: " + str(webUrl.getcode())

    # read the data from the URL and print it
    data = webUrl.read()
    print data

if __name__ == "__main__":
    main()

Python 3 Example

#
# read the data from the URL and print it
#
import urllib.request

# open a connection to a URL using urllib
webUrl = urllib.request.urlopen('https://www.youtube.com/user/guru99com')

# get the result code and print it
print("result code: " + str(webUrl.getcode()))

# read the data from the URL and print it
# (in Python 3, read() returns bytes, so the output starts with b'...')
data = webUrl.read()
print(data)
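
The examples above assume the request succeeds. As a hedged sketch of a more defensive variant: urlopen raises urllib.error.HTTPError when the server answers with an error status such as 404, and urllib.error.URLError when the server cannot be reached at all. The function name fetch and the 10-second default timeout are assumptions chosen for illustration.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (status_code, body_bytes); status_code is None on connection failure.

    `fetch` and the default timeout are hypothetical choices for this sketch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as webUrl:
            return webUrl.getcode(), webUrl.read()
    except urllib.error.HTTPError as err:
        # the server answered, but with an error status such as 404 or 500;
        # HTTPError must be caught first because it subclasses URLError
        return err.code, err.read()
    except urllib.error.URLError as err:
        # no HTTP response at all (DNS failure, refused connection, ...)
        print("could not reach server: " + str(err.reason))
        return None, b""
```

A caller can then check the first element of the returned tuple before using the body, for example treating anything other than 200 as a failure.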