Thursday, March 26, 2015

Python Module – URLLIB2 – Rest Calls

A scripting language should provide, by default, the things that programmers use every day, and fetching content from the Internet is one of them.

urllib2 is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen() function, which can fetch URLs using a variety of different protocols.

The module also offers ways to handle basic authentication, cookies, proxies and so on. These are provided by objects called handlers and openers.
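
As a rough sketch of how handlers and openers fit together, the snippet below builds an opener that knows about basic authentication. The URL, user name and password are made up for illustration, and the import fallback is only there so the same code also runs on Python 3, where this API lives in urllib.request. Nothing is sent over the network here.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 home of the same API

# Hypothetical URL and credentials, for illustration only
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://localhost:8080/', 'demo_user', 'demo_pass')

# A handler takes care of one concern (here: HTTP basic authentication)
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)

# An opener chains handlers together; opener.open(url) would use all of them
opener = urllib2.build_opener(auth_handler)

print(password_mgr.find_user_password(None, 'http://localhost:8080/'))
```

Calling opener.open() instead of urllib2.urlopen() would then route the request through the configured handlers.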

Case 1 – Make a call
In order to make a call to a URL we can use,

import urllib2

url = 'http://www.nove.org'  # any reachable URL
response = urllib2.urlopen(url)
print response.read()

This is the simplest way to call a URL and get the response.

HTTP Requests – As we know, HTTP is based on requests and responses. The client makes a request and the server sends back a response. urllib2 lets us create a Request object that represents the HTTP request; when it is sent to the server, a response object is returned.

Once created, the request is sent using the same urlopen() function. The response is a file-like object, which means we can call .read() on it to get the content.

req = urllib2.Request('http://www.nove.org')
response = urllib2.urlopen(req)
data_page = response.read()
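
A Request object can also be inspected before it is sent, which is handy when debugging. The sketch below only builds requests and never calls urlopen(); the payload in the second request is a made-up value, and the import fallback is an assumption so that the code also runs on Python 3.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 home of the same API

req = urllib2.Request('http://www.nove.org')
print(req.get_full_url())   # the URL the request will be sent to
print(req.get_method())     # GET, because no data is attached

# Hypothetical payload: attaching data switches the method to POST
req_with_data = urllib2.Request('http://www.nove.org', data=b'key=value')
print(req_with_data.get_method())
```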

urllib2 uses the same Request interface to handle other URL schemes such as FTP, for example,
req = urllib2.Request('ftp://example.com/')

Case 2 – Post requests
urllib2 can also be used to post data. As with HTML forms, the data needs to be encoded first and then passed to the Request object as the data argument. The encoding is done using the urllib module rather than urllib2.

This can be done as,

import urllib
import urllib2

url = 'http://www.nova.com/SameServlet'
values = {'name' : 'Nova', 'location' : 'Hyderbad', 'language' : 'Python Call' }

data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
data_page = response.read()

Once we have the values, we encode them with urllib.urlencode() before attaching them to the request. With the encoded data we create a Request object, passing the URL and the data. urlopen() is then called on the request object to get the response. Because data is supplied, the request is sent as a POST instead of a GET.
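
To see what the encoding step actually produces, here is a small sketch that encodes the same values and builds the request without sending it. A list of pairs is used instead of a dict only to keep the output order predictable; the import fallback and the .encode('ascii') call are assumptions made so the snippet also runs on Python 3, where the request body must be bytes.

```python
try:
    from urllib import urlencode          # Python 2
    import urllib2
except ImportError:
    from urllib.parse import urlencode    # Python 3
    import urllib.request as urllib2

url = 'http://www.nova.com/SameServlet'
values = [('name', 'Nova'), ('location', 'Hyderbad'), ('language', 'Python Call')]

encoded = urlencode(values)
print(encoded)  # name=Nova&location=Hyderbad&language=Python+Call

# Build the request only; urllib2.urlopen(req) would actually send it
req = urllib2.Request(url, encoded.encode('ascii'))
print(req.get_method())  # POST
```

Note how the space in 'Python Call' becomes a plus sign, which is how form data is normally encoded.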

Case 3 – Delete Requests
There will be cases where we need other methods such as PUT or DELETE. urllib2 chooses between GET and POST on its own, so we override get_method() on the request to return the method we want. A delete operation can be done as,

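# assetID is assumed to hold the ID (as a string) of the resource to delete
url = "http://nova.com/" + assetID
req = urllib2.Request(url, data='1121')
# override get_method() so the request is sent as DELETE instead of POST
req.get_method = lambda: 'DELETE'
urllib2.urlopen(req).read()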
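
If several methods are needed, the same trick can be wrapped in a small subclass instead of patching each request. This is only a sketch: MethodRequest is a hypothetical helper name, not part of urllib2, the asset ID and payload are made up, and no request is actually sent.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 home of the same API


class MethodRequest(urllib2.Request):
    """Hypothetical helper: a Request that reports a fixed HTTP method."""

    def __init__(self, url, method, data=None, headers=None):
        urllib2.Request.__init__(self, url, data, headers or {})
        self._method = method

    def get_method(self):
        # urllib2 asks the request which HTTP verb to use
        return self._method


# Hypothetical asset ID and payload, for illustration only
delete_req = MethodRequest('http://nova.com/1121', 'DELETE')
put_req = MethodRequest('http://nova.com/1121', 'PUT', data=b'name=Nova')

print(delete_req.get_method())
print(put_req.get_method())
```

Passing either object to urlopen() would send it with the chosen method.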

Case 4 – Headers
Headers play an important role when making calls to external web resources. The User-Agent header is one important piece of information, as it identifies the source of the request.
To add a header to the request we can use

request = urllib2.Request('http://localhost:8080/')
request.add_header('User-agent', 'www.nove.com')
response = urllib2.urlopen(request)
data = response.read()

After creating a Request object, use add_header() to set the user agent value before opening the request.
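
Headers set on a request can be read back before it is sent, as the sketch below shows. The Accept header is an extra value added for illustration, and the import fallback is an assumption so the code also runs on Python 3. No network call is made.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 home of the same API

request = urllib2.Request('http://localhost:8080/')
request.add_header('User-agent', 'www.nove.com')
request.add_header('Accept', 'application/json')  # extra header, illustration only

print(request.has_header('User-agent'))
print(request.get_header('User-agent'))
print(sorted(request.header_items()))
```

One thing to watch for: add_header() stores the name with only its first letter capitalized, so the lookup should use 'User-agent' rather than 'User-Agent'.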

I hope this article on the basics of urllib2 helps people dig deeper into the library.

Python Module – Minidom - Parsing XML

This is the first in a series of articles on Python modules. In each article I will explain the basics of a Python module and how to use it.

Parsing XML is one of the important features that every programming language has to provide. We often need to parse XML, whether it comes in the response to a REST call or from locally stored XML files.

xml.dom.minidom is a minimal implementation of the Document Object Model interface. It is intended to be simpler than the full DOM and also significantly smaller. As the documentation says, users who are not already proficient with the DOM should consider the “xml.etree.ElementTree” module for their XML processing instead.
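
For comparison, here is a minimal sketch of what reading staff records looks like with xml.etree.ElementTree. The inline XML is a shortened, assumed version of the sample used in the rest of this article.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the sample document used in this article
sample = (
    '<company>'
    '<staff id="1"><nickname>Rats</nickname><salary>100,000</salary></staff>'
    '<staff id="2"><nickname>Dogs</nickname><salary>200,000</salary></staff>'
    '</company>'
)

root = ET.fromstring(sample)
rows = [(s.get('id'), s.findtext('nickname'), s.findtext('salary'))
        for s in root.findall('staff')]
print(rows)
```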

In this article we will see how to process an XML file using the xml.dom.minidom module.
Consider the following sample XML content, saved as dot.xml:

<?xml version="1.0"?>
<company>
          <name>Animal Care Enterprise</name>
          <staff id="1">
                   <nickname>Rats</nickname>
                   <salary>100,000</salary>
          </staff>
          <staff id="2">
                   <nickname>Dogs</nickname>
                   <salary>200,000</salary>
          </staff>
          <staff id="3">
                   <nickname>Cats</nickname>
                   <salary>20,000</salary>
          </staff>
</company>

Case 1 – Printing values
In order to process this we write the code as,

from xml.dom import minidom
from xml.dom.minidom import parse, parseString
from xml.dom.minidom import Document

dot = minidom.parse('dot.xml')
staffs = dot.getElementsByTagName('staff')

for staff in staffs:
    sid = staff.getAttribute("id")
    nickname = staff.getElementsByTagName("nickname")[0]
    salary = staff.getElementsByTagName("salary")[0]
    print("id:%s, nickname:%s, salary:%s" % (sid, nickname.firstChild.data, salary.firstChild.data))

We need to import the minidom module to process XML files. To read a file we use,

dot = minidom.parse('dot.xml')
Then we get the staff elements using the getElementsByTagName() method, passing the element name.

staffs = dot.getElementsByTagName('staff')

This gives us a list of all the staff elements, which we can loop over to read their details. After executing the code we can see

id:1, nickname:Rats, salary:100,000
id:2, nickname:Dogs, salary:200,000
id:3, nickname:Cats, salary:20,000

If we need to parse XML obtained from a REST response, we can use
dom = parseString(assetXML)

This will parse the string as XML. What the parse() and parseString() functions do is connect an XML parser with a “DOM builder” that can accept parse events from any SAX parser and convert them into a DOM tree.
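
A self-contained sketch of parseString() is shown below. The assetXML string is a made-up stand-in for the body of a REST response, with the same shape as the sample document.

```python
from xml.dom.minidom import parseString

# Stand-in for the body of a REST response (assumed content)
assetXML = (
    '<company>'
    '<staff id="1"><nickname>Rats</nickname></staff>'
    '<staff id="2"><nickname>Dogs</nickname></staff>'
    '</company>'
)

dom = parseString(assetXML)

records = []
for staff in dom.getElementsByTagName('staff'):
    nickname = staff.getElementsByTagName('nickname')[0]
    records.append((staff.getAttribute('id'), nickname.firstChild.data))

print(records)
```

Once the string is parsed, the resulting document is used exactly like the one returned by parse().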

Case 2 – Adding an Element
Once we are able to parse the XML document and read its details, we can also add a new element to it:

dot = minidom.parse('dot.xml')
element=dot.createElement("Staff")
dot.childNodes[0].appendChild(element)
print(dot.toxml())

We can see that the <Staff> element was added at the end of the root node, as below

<?xml version="1.0" ?>
<company>
    <name>Animal Care Enterprise</name>
    <staff id="1">
        <nickname>Rats</nickname>
        <salary>100,000</salary>
    </staff>
    <staff id="2">
        <nickname>Dogs</nickname>
        <salary>200,000</salary>
    </staff>
    <staff id="3">
        <nickname>Cats</nickname>
        <salary>20,000</salary>
    </staff>
<Staff/></company>
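
The example above reaches the root element through childNodes[0], which works for dot.xml because nothing comes before <company>. As a hedged alternative, documentElement always points at the root element, even when a comment precedes it. The inline XML below is an assumed, shortened variant of the sample.

```python
from xml.dom.minidom import parseString

# Assumed variant of the sample with a comment before the root element
dot = parseString('<!-- staff list --><company><name>Animal Care Enterprise</name></company>')

# childNodes[0] is the comment here, not <company>
print(dot.childNodes[0].nodeType == dot.COMMENT_NODE)

# documentElement is the root element regardless of what precedes it
root = dot.documentElement
root.appendChild(dot.createElement("Staff"))
print(root.toxml())
```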

Case 3 – Adding a Text Node
As we know, the text of an element node is stored in a text node. To create a text node we can use

dot = minidom.parse('dot.xml')
element=dot.createElement("Staff")
txt = dot.createTextNode("hello, world!")
element.appendChild(txt)
dot.childNodes[0].appendChild(element)
print(dot.toxml())

and we can see the output as,

<?xml version="1.0" ?>
<company>
    <name>Animal Care Enterprise</name>
    <staff id="1">
        <nickname>Rats</nickname>
        <salary>100,000</salary>
    </staff>
    <staff id="2">
        <nickname>Dogs</nickname>
        <salary>200,000</salary>
    </staff>
    <staff id="3">
        <nickname>Cats</nickname>
        <salary>20,000</salary>
    </staff>
<Staff>hello, world!</Staff></company>
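
Combining createElement(), createTextNode() and setAttribute(), a complete staff record can be added in one go. The sketch below is only an illustration: add_staff is a hypothetical helper, and the id, nickname and salary values are made up.

```python
from xml.dom.minidom import parseString


def add_staff(doc, staff_id, nickname, salary):
    """Hypothetical helper: append a <staff> record to the root element."""
    staff = doc.createElement('staff')
    staff.setAttribute('id', staff_id)
    for tag, text in (('nickname', nickname), ('salary', salary)):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(text))  # text lives in a text node
        staff.appendChild(child)
    doc.documentElement.appendChild(staff)
    return staff


# Shortened stand-in for dot.xml; the new record's values are made up
dot = parseString('<company><name>Animal Care Enterprise</name></company>')
add_staff(dot, '4', 'Birds', '50,000')
print(dot.documentElement.toxml())
```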

Case 4 - Node Import
Nodes can be imported using minidom. We can use this import feature to copy nodes between multiple XML files. This can be done as

dom1 = parse("foo.xml")
dom2 = parse("bar.xml")
# take the 2nd node in "bar.xml" and make a deep copy owned by dom1
element = dom1.importNode(dom2.childNodes[1], True)
# append the copy to the children of the 2nd node in "foo.xml"
dom1.childNodes[1].appendChild(element)
print(dom1.toxml())
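
Since foo.xml and bar.xml are not shown here, the following sketch uses two small made-up documents to show what importNode() does. The imported node is a deep copy that belongs to the first document, while the second document is left unchanged.

```python
from xml.dom.minidom import parseString

# Made-up stand-ins for foo.xml and bar.xml
dom1 = parseString('<foo><items/></foo>')
dom2 = parseString('<bar><item>copied</item></bar>')

# Deep-copy <item> from dom2 into a node owned by dom1
source = dom2.documentElement.firstChild
imported = dom1.importNode(source, True)

# Attach the copy under <items> in dom1
dom1.documentElement.firstChild.appendChild(imported)

print(dom1.documentElement.toxml())
print(dom2.documentElement.toxml())
```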

Using the above examples we can start working with minidom, an XML processing module available in Python.