A handy tool for our college staff

Hello people!! It's been a really long while since I posted anything here. It's been a really good 6 or 7 months for me, both academically and personally. I've been through a lot of changes, learnt new stuff, decided on my main field of interest, etc.! Well… the main thing is, I'm back! I've jumped over to the Ruby on Rails platform now, and I think I love it. It's a joy to write code in Ruby and an extra joy to use the goodies provided by Rails.

I have also undertaken a project for my college to manage marks and attendance, which is almost complete. I will write an exclusive post about that in the coming days.

Ok. Before I begin this post, I can't help mentioning how well Chelsea are performing this season! 😀 We're on course for a title win! 🙂

Ok okay… My college's management is very demanding. When our university results come out, the staff are required to fetch them from the results websites, analyse them and generate reports on the very same day, which can be a really challenging and tiresome task given the large number of students in each batch.

I always wanted to help them out by giving them an easy way around this cumbersome job. I've tried finding loopholes and vulnerabilities in the University server but hit dead ends every time. Finally I decided that the only way (at least for me) to get the results was to parse them directly from the website's HTML pages. I had this idea in mind but did not implement it until the results came out for my 6th semester!

That night I was reading stuff when one of my friends texted me saying that the results had come out. I opened the results website without any sort of excitement or expectations 😉 because I already knew that Anna University (the uni to which my college is affiliated) is known for throwing big surprises. And they did not disappoint! An average result once again (for me). Well, I had to do something to divert my mind. So I thought of writing the app to download the results! 😀 And my word! It was a brilliant idea 🙂

So I set off! I started writing the code at about 23:30 that night. I used the Nokogiri gem for Ruby to parse the HTML. Nokogiri accepts CSS selectors or XPath expressions to select the HTML elements to be parsed, and it returns the matching elements as an array-like NodeSet. So it's really handy for small projects like this one. You can read more about Nokogiri here.

This is the code that I finally arrived at 😀

require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'

puts "Anna University Results Downloader for schools9.com"
puts "Author: Steve Robinson, Panimalar Institute of Technology"

puts "\n\n"
puts "Please provide the URL of the results"
puts "Follow Below steps to obtain URL"
puts "1. Open the schools9 website and browse to the required results page"
puts "2. Copy the url"
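puts "3. Paste URL here and press 'RETURN' key"
# Assumed: the line that reads the URL is missing from this listing.
url = gets.chomp
puts "4. Enter the extension of the results page:"
puts "For example if the url is http://www.schools9.com/annagrade1808.htm"
puts "Then the extension is 'htm'. Please Enter your extension now"
# Assumed: read the extension, tolerating a leading dot.
ext = gets.chomp.sub(/\A\./, "")

# URL processing: swap the page's own extension for "aspx".
# (Reconstructed from the description further down; the original lines are missing.)
url = url.sub(/\.#{Regexp.escape(ext)}\z/, "") + ".aspx"

puts "The new URL is #{url}"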

puts "================================================"
puts "Note: please provide ONLY CONSECUTIVE register numbers!!!"
puts "================================================"

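puts "Enter starting reg no"
reg = gets.to_i      # assumed: input line missing from this listing
puts "Enter ending reg no"
ending = gets.to_i   # assumed: input line missing from this listing

# Collect every register number from the starting one to the ending one.
reg_nos = []
while reg != ending+1
  reg_nos<< reg
  reg += 1
end

puts "Enter file name to store results"
filename = gets.chomp   # assumed: input line missing from this listing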
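count = 0   # assumed: running total used in the summary message
CSV.open("#{filename}.csv", "wb") do |csv|

  reg_nos.each  do |reg_no|
    # Assumed (missing from this listing): fetch the .aspx page for this
    # register number and collect every table cell on it.
    doc = Nokogiri::HTML(URI.open("#{url}?htno=#{reg_no}"))
    cells = doc.css("td")
    len = cells.length

    len.times do |i|

      # Branches without a body skip cells that are labels, headings or
      # values already written as part of another row.
      if cells[i].text==" Hall Ticket No "
      elsif cells[i-1].text==" Hall Ticket No "
      elsif cells[i-1].text==" Course "
      elsif cells[i].text==" Name "
        # cells[i-1] is the register number, cells[i+1] the student's name
        csv << ["#{cells[i-1].text}","'#{cells[i+1].text}'"]
      elsif cells[i].text==" Course " || cells[i].text=~ /B.E.(.*)/  || cells[i].text=="Marks Details"  || cells[i].text=="Subject" || cells[i].text=="Grade" || cells[i].text=="Status"
      elsif cells[i].text=="PASS" ||   cells[i].text=="RA" ||   cells[i].text=="A"||   cells[i].text=="B"||   cells[i].text=="C"||   cells[i].text=="D"||   cells[i].text=="E"||   cells[i].text=="S"||   cells[i].text=="U"||   cells[i].text=~ /WH(.*)/ ||   cells[i].text=="AB"||   cells[i].text=="W"||   cells[i].text=="SA" ||   cells[i].text=="SE"||   cells[i].text=="A.B"||   cells[i].text=="I"||   cells[i].text=="WD"||   cells[i].text==" "
      elsif cells[i+1]&.text==" Course "
        # the student's name, already written above
      elsif cells[i+2]
        # Assumed branch: any remaining cell is a subject code, followed
        # by its grade and its status.
        csv<< ["'#{cells[i].text}'","'#{cells[i+1].text}'","'#{cells[i+2].text}'"]
      end
    end

    csv<< [",",",",","]   # separator row between students
    count += 1
  end
end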
puts "#{count} Results Scraped!"
puts "Results saved in '#{filename}.csv'. You can import this file into Excel or any other spreadsheet software"
puts "================================================"
puts "Thanks for using!"
puts "For more of my stuff goto github.com/steverob"
puts "Follow me on twitter @stev4nity"

I have written this code specifically for schools9.com, a very popular results website that runs on a faster server than Anna University's own website. The parsed results are stored in a comma-separated values (CSV) file, which can be easily opened in any spreadsheet application. This website provides results as follows.

First, an HTML page is shown where we are supposed to enter our register number. It contains a div into which the result is loaded via an AJAX request when the submit button is clicked. The page responding to the AJAX request is an ASP (.aspx) page. What I did was try accessing the .aspx page directly, and I got a blank page with only the requested result in a single table. Now I had to make sure that all results provided by schools9 are in the same format. I checked other results pages and everything was implemented in the same way. So all I had to do was get the URL of the page, strip the ".htm" or ".html" extension, append the ".aspx" extension, and then parse that page to get the results.

The register number is passed as a parameter with the name "htno". So what are we waiting for? Let's jump into the code!
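As a rough sketch of the URL juggling described above: the helper below swaps the page extension for .aspx and tacks on the register number as the htno parameter. The name results_url is made up for this illustration (it is not in my script), and the register number is invented.

```ruby
require 'uri'

# Hypothetical helper, not part of the original script: replaces the
# .htm/.html extension with .aspx and adds the register number as "htno".
def results_url(page_url, reg_no)
  base = page_url.sub(/\.html?\z/i, "")
  "#{base}.aspx?" + URI.encode_www_form(htno: reg_no)
end

puts results_url("http://www.schools9.com/annagrade1808.htm", 211410104001)
```

Using URI.encode_www_form keeps the query string properly escaped even if a register number ever contains unusual characters.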

First off, we get the URL of the page and its actual extension from the user. Then we remove that extension from the URL and append aspx in its place. Thus we have got our URL ready. Then we get the starting and ending register numbers from the user and load all the register numbers from start to end into an array. Then we read the name of the file in which the results should be written.
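The register number list can also be built with a Range instead of a loop. This is only a sketch of an alternative, not what the script does; register_numbers is a hypothetical name and the numbers are invented.

```ruby
# Hypothetical helper: lists every register number from first to last,
# inclusive, and rejects a reversed pair instead of looping forever.
def register_numbers(first, last)
  raise ArgumentError, "ending number is smaller than starting number" if last < first
  (first..last).to_a
end

p register_numbers(211410104001, 211410104004)
```

The explicit check matters because a loop of the form "count up until reg equals ending + 1" never stops if the starting number is already past that point.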

Now we open a CSV file using the CSV class. Inside the CSV block we parse the results one by one, repeating the parsing operation for every register number in the reg_nos array.
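To show the CSV side on its own, here is a small sketch that writes a couple of already-parsed results with Ruby's CSV class. It uses CSV.generate (which builds a string) rather than CSV.open so it can run without touching the disk. The student data, subject codes and the results_to_csv name are all invented for the example.

```ruby
require 'csv'

# Hypothetical helper: one row with register number and name, one row per
# subject, then an empty row separating students.
def results_to_csv(students)
  CSV.generate do |csv|
    students.each do |student|
      csv << [student[:reg_no], student[:name]]
      student[:grades].each { |row| csv << row }
      csv << []
    end
  end
end

# Made-up sample data standing in for scraped results.
students = [
  { reg_no: 211410104001, name: "STUDENT ONE",
    grades: [["CS2351", "A", "PASS"], ["CS2352", "U", "RA"]] }
]

puts results_to_csv(students)
```

Swapping CSV.generate for CSV.open("results.csv", "wb") would write the same rows to a file instead.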

To parse an HTML page, we need to open it first. This is done using Nokogiri::HTML(open("URL")), with open provided by open-uri (on Ruby 3.0 and later this has to be written as URI.open("URL")). This loads the HTML document. Now we can select elements by passing a CSS selector to the css method of the object returned by the above call. This returns an array-like collection of objects, one for each matching element. Each element can then be iterated over, and its content obtained using its text method.

Since the results sit in the cells of a table, we pass the selector td to the css method to get the list of cells. Now we read the required information from those cells.
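The cell-by-cell reading can also be sketched as a function that turns the list of cell texts into one record per student. This is an alternative to the long if/elsif chain, not the code I actually wrote. It assumes the layout suggested by the labels in my script (label cells followed by their values, then subject, grade and status triples after the "Status" heading); parse_result, value_after and the sample data are hypothetical.

```ruby
# Returns the cell that follows the given label, or nil if the label is absent.
def value_after(cells, label)
  idx = cells.index(label)
  idx && cells[idx + 1]
end

# Assumed layout: everything after the "Status" heading comes in groups of
# three cells (subject, grade, status). Incomplete groups are dropped.
def parse_result(texts)
  cells = texts.map(&:strip)
  header_end = cells.index("Status")
  grade_cells = header_end ? cells.drop(header_end + 1) : []
  {
    reg_no: value_after(cells, "Hall Ticket No"),
    name:   value_after(cells, "Name"),
    grades: grade_cells.each_slice(3).select { |row| row.size == 3 }
  }
end

# Invented cell texts in page order.
sample = [" Hall Ticket No ", "211410104001", " Name ", "STUDENT ONE",
          " Course ", "B.E. Computer Science", "Marks Details",
          "Subject", "Grade", "Status",
          "CS2351", "A", "PASS", "CS2352", "U", "RA"]

p parse_result(sample)
```

Keeping the parsing separate from the downloading makes it easy to try out on saved cell lists when the results site changes its layout.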

Finally the results are written to the CSV file. This is how we parse results from schools9.com 🙂

This tool can be used to get the results of any department of any college. I hope it will be useful for many colleges and students.

As always your comments and suggestions are welcome.

GitHub: http://github.com/steverob/annauni-results-schools9

I am planning to write a series of posts on that Mark analysis project of mine. You can check it out at http://github.com/steverob/students-marks.

