Monday, January 17, 2011

Monitor Comcast Usage Data

The good folks at Comcast have decided to put a 250GB cap on monthly usage as a direct assault against my beloved Roku.



I decided to put a script together to notify me in case I start approaching the monthly limit. My first thought was that this was a perfect task for mechanize, a Ruby library for interacting with web sites. I put that aside, however, to make an attempt to script the scraping with curl.

Those folks at Comcast are whack. Logging in at the home page sends back a redirect, and no, not an HTTP redirect. You get a page containing a form with a "cima.ticket" field, and the form submits itself from the body onload event. Here is the somewhat functional script to pull your Comcast bandwidth usage:

#!/usr/bin/env bash
USER=$1
PASSWD=$2
AGENT="Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.45 Safari/534.13"
# Print an error message to stderr and exit with the given status.
function bail {
echo "$1" >&2
exit "$2"
}
ticket=$(curl -s \
--user-agent "${AGENT}" \
--cookie-jar cjar \
--location \
--referer ";auto" \
-d "user=${USER}&passwd=${PASSWD}&rm=2&deviceAuthn=false" \
-d "forceAuthn=true&s=ccentral-cima&r=comcast.net" \
-d "continue=https://customer.comcast.com/Secure/Home.aspx" \
https://login.comcast.net/login | tee one.html | \
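grep 'cima\.ticket' | \
sed -e 's/.*cima\.ticket" value="\([^"]*\).*$/\1/')
# $? here only reflects sed, which succeeds even with no match, so test for an empty ticket.
[ -n "${ticket}" ] || bail "could not login and grab the ticket" 1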
curl -s \
--user-agent "${AGENT}" \
--cookie-jar cjar \
--cookie cjar \
--location \
--referer "https://customer.comcast.com/Secure/Home.aspx;auto" \
--data-urlencode "cima.ticket=${ticket}" \
"https://customer.comcast.com/Secure/Home.aspx" > /dev/null || \
bail "could not post the ticket" $?
usage=$(curl -s \
--user-agent "${AGENT}" \
--cookie-jar cjar \
--cookie cjar \
--location \
--referer "https://customer.comcast.com/Secure/Home.aspx;auto" \
https://customer.comcast.com/Secure/Users.aspx | \
grep "GB of" | sed 's/^[^>]*>\([^<]*\)<.*$/\1/')
# As with the ticket, $? only reflects sed, so test for an empty result.
[ -n "${usage}" ] || bail "could not find the usage" 1
echo "usage: ${usage}"
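
The script only prints the usage line; the notification part still needs a comparison against the cap. Here is a rough sketch of how that might look. It assumes the scraped text looks like "53GB of 250GB" with whole numbers, which I have not verified against the real page, and the names usage_percent and check_usage are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: assumes the scraped text looks like "53GB of 250GB" (whole numbers).
# usage_percent and check_usage are hypothetical helper names.

# Print the whole-number percentage of the cap used so far.
usage_percent() {
  local text="$1" used cap
  # Pull out the integers on either side of "GB of".
  used=$(printf '%s\n' "$text" | sed -n 's/^[^0-9]*\([0-9][0-9]*\) *GB of.*$/\1/p')
  cap=$(printf '%s\n' "$text" | sed -n 's/^.*GB of *\([0-9][0-9]*\) *GB.*$/\1/p')
  [ -n "$used" ] && [ -n "$cap" ] && [ "$cap" -gt 0 ] || return 1
  echo $(( used * 100 / cap ))
}

# Report usage; return 1 at or above the threshold (default 80 percent),
# 2 if the text could not be parsed, 0 otherwise.
check_usage() {
  local text="$1" threshold="${2:-80}" pct
  pct=$(usage_percent "$text") || { echo "could not parse: ${text}" >&2; return 2; }
  if [ "$pct" -ge "$threshold" ]; then
    echo "WARNING: ${pct}% of the monthly cap used"
    return 1
  fi
  echo "OK: ${pct}% of the monthly cap used"
}
```

Calling check_usage "${usage}" 80 at the end of the script would then give a non-zero exit status to hang a notification on, such as a cron mail.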


I say somewhat functional because it doesn't work every time: every now and then it fails to pull a result. There is a lot of wonkiness in the way the Comcast login works, and it appears this results in some kind of timing issue. Hopefully I will have a chance to investigate a solution in the future.
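
Until the real cause is found, one blunt workaround for an intermittent failure is simply to try again. The sketch below is a generic retry helper, not a fix for the timing issue itself; retry is a made-up name, and it only helps if the wrapped command exits non-zero when it comes back empty-handed.

```bash
#!/usr/bin/env bash
# Sketch only: retry is a hypothetical helper; retrying is a workaround, not a diagnosis.

# retry MAX DELAY CMD [ARGS...]
# Run CMD until it succeeds, at most MAX times, sleeping DELAY seconds between attempts.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

Assuming the script above were saved as comcast-usage.sh (a name picked for this example), something like retry 3 10 ./comcast-usage.sh user password would make up to three attempts, ten seconds apart.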

Oh, back to mechanize. I googled for another solution to this problem and found this. It shows off the elegance and ease of use of mechanize; however, it seems to fail intermittently, just like my script.