Web Automation using PyQt4 and jQuery

I would like to show how a web-related task can be automated using PyQt4 and jQuery. My original intention was to automate checking my broadband usage on my ISP's portal, but here I am going to show how to fetch Google search results for a given keyword. I know there are better ways to do this, but I want to explain the technique with a simple example.

Qt4 provides a widget called QWebView that can load and render web pages with the help of the WebKit browser engine. However, there is no simple way to manipulate the elements of a web page, such as clicking a link or button or entering values into input elements. The QWebFrame class does provide the function "evaluateJavaScript", which lets us execute arbitrary JavaScript code within the current web page. By executing the jQuery source within the current page, we get a complete jQuery environment in which we can easily manipulate HTML elements. Below is the Python script, which receives a search keyword as its command-line argument and prints the Google search results for that keyword.
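Before the full script, here is a small sketch of how the JavaScript snippets passed to "evaluateJavaScript" can be built. The script builds them by plain string concatenation, which is fine for simple keywords but breaks if the keyword contains a double quote. One possible way around that, assuming Python's json module is available, is to let json.dumps produce the quoted string literals. The helper names js_set_value and js_submit_parent_form are made up for this illustration and are not part of PyQt4 or jQuery.

```python
import json


def js_set_value(selector, value):
    """Return jQuery code that sets the value of the matched input elements."""
    # json.dumps returns a double-quoted, escaped string literal, so quotes
    # or backslashes in the value cannot terminate the JavaScript string early.
    return "$(%s).val(%s);" % (json.dumps(selector), json.dumps(value))


def js_submit_parent_form(selector):
    """Return jQuery code that submits the form containing the matched element."""
    return "$(%s).parents('form').submit();" % json.dumps(selector)


if __name__ == "__main__":
    # Hypothetical selector and keyword, used only to show the generated code.
    print(js_set_value("#q", 'say "hello"'))
    print(js_submit_parent_form("#q"))
```

The returned strings can be handed to evaluateJavaScript in the same way as the hand-written snippets in the script.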
 

#!/usr/bin/env python
import sys

from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *

class GoogleSearchBot(QApplication):
	ACTION_NONE = 0
	ACTION_SEARCH_KEYWORD = 1
	ACTION_FETCH_RESULTS = 2
	
	def __init__(self, argv):
		super(GoogleSearchBot, self).__init__(argv)
		jqueryFile = open("jquery-1.3.2.min.js") # make sure you have jquery source placed in the same directory
		self.__jquery = jqueryFile.read()
		jqueryFile.close()
		self.__webView = QWebView()
		self.__webView.show() # comment this line if you don't want to show the browser window
		self.connect(self.__webView, SIGNAL("loadFinished(bool)"), self.loadFinished)
		
	def search(self, keyword):
		self.__keyword = keyword
		self.__nextAction = self.ACTION_SEARCH_KEYWORD
		self.__webView.load(QUrl('http://www.google.com'))
		
	def loadFinished(self, ok):
		page = self.__webView.page()
		currentFrame = page.currentFrame()
		currentFrame.evaluateJavaScript(self.__jquery)
		
		if self.__nextAction == self.ACTION_SEARCH_KEYWORD:
			currentFrame.evaluateJavaScript('$("input[title=Google Search]").val("' + self.__keyword + '");')
			currentFrame.evaluateJavaScript('$("input[value=Google Search]").parents("form").submit();')
			self.__nextAction = self.ACTION_FETCH_RESULTS
		elif self.__nextAction == self.ACTION_FETCH_RESULTS:
			results = currentFrame.evaluateJavaScript('var results = ""; $("h3[class=r]").each(function(i) { results += $(this).text() + "\\n"; }); results')
			resultList = str(results.toString().toAscii()).splitlines()
			sno = 1
			print('Google search result\n====================')
			for result in resultList:
				print(str(sno) + ". " + result)
				sno += 1
			
			self.__webView.close()
			self.__nextAction = self.ACTION_NONE
			
if __name__ == '__main__':
	if len(sys.argv) != 2:
		print("Usage: GoogleSearchBot.py <keyword>")
		sys.exit(1) # non-zero exit status signals incorrect usage
		
	googleSearchBot = GoogleSearchBot(sys.argv)
	googleSearchBot.search(sys.argv[1])
	sys.exit(googleSearchBot.exec_())

So at line 15 I read the jQuery source and keep it for later use. Then I create a QWebView and connect its "loadFinished" signal. When we call the search function, we first load www.google.com and wait for the "loadFinished" signal. Once the page is loaded, we inject the jQuery source into the current page. With the help of a jQuery selector we find the text input element and enter the keyword. Then we submit the form and again wait for the "loadFinished" signal. This time, when the loadFinished function is called, we are presented with the Google search results for the keyword. Again we inject jQuery, collect the results (just the titles, not the corresponding URLs) and return them as a single string. On the Python side, we split the string into a list of lines and print it on the console.
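The load, inject, act, wait cycle described here can also be written as a queue of steps instead of action constants. The following is only a sketch of that idea and not the script's actual structure: StepRunner and FakeFrame are hypothetical names, and FakeFrame is a stand-in that merely records the scripts it receives so the example runs without Qt. In a real program, load_finished would be connected to the "loadFinished" signal and would receive the frame from the web view's page.

```python
from collections import deque


class StepRunner(object):
    """Runs one queued step each time a page finishes loading."""

    def __init__(self, jquery_source):
        self._jquery = jquery_source
        self._steps = deque()

    def add_step(self, step):
        # A step is any callable that takes the frame as its only argument.
        self._steps.append(step)

    def load_finished(self, frame):
        # Every freshly loaded page starts without jQuery, so inject it first.
        frame.evaluateJavaScript(self._jquery)
        if not self._steps:
            return None
        step = self._steps.popleft()
        return step(frame)


class FakeFrame(object):
    """Hypothetical stand-in for QWebFrame; it only records what it is given."""

    def __init__(self):
        self.scripts = []

    def evaluateJavaScript(self, source):
        self.scripts.append(source)
        return None


if __name__ == "__main__":
    runner = StepRunner("/* jQuery source */")
    runner.add_step(lambda f: f.evaluateJavaScript("/* fill in and submit the form */"))
    runner.add_step(lambda f: f.evaluateJavaScript("/* collect the result titles */"))

    frame = FakeFrame()
    runner.load_finished(frame)  # first page: the search form
    runner.load_finished(frame)  # second page: the results
    print(frame.scripts)
```

Adding a further page to the workflow then means adding one more step rather than another constant and another branch.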

Here is how it looks when we run it:

[Image: GoogleSearchBot Demo]

Actually, I started with mechanize, but it does not work in all cases. Moreover, debugging with mechanize is more difficult since there is no visible browser window, so we cannot see what is happening behind the scenes.

Comments

flipthefrog (+3)
This doesn't necessarily work for non-US users, as Google by default serves a localized version of the page. Changing the URL in line 25 to http://www.google.com/ncr makes it work outside the US by loading the same page that US users get.
website design (0)
This is another way to implement it; we can also implement it in Java.
College Essay (0)
Thank you for posting these step-by-step instructions. They are very useful for a newbie like me.
Huge thanks for posting this article. It's a shame I had never thought about automating website tasks with jQuery and PyQt4. I had been writing those functions in plain PHP and thought there was no other way to do it. Thanks also for pointing out the QWebView widget; I wouldn't have found it on my own. The code was useful too: I tried it on my own machine and it works like a charm. Please keep writing such useful and interesting articles. I will be waiting for them.

Last Updated (Monday, 28 September 2009 21:27)