All posts by Dan Milne

Better World Books joins the list of sites Booko searches.

Thanks to Tyrone Butter and Andrew Wilson for suggesting another online bookstore – Better World Books. This site has a flat $3.97 USD delivery fee to anywhere in the world. Even better, they have an easy-to-use API (no screen scraping!) and an Affiliate program – which I’m yet to sign up for (requires American IRS paperwork!).

They also have a large selection of second hand books – I don’t include them in the results at the moment because it doesn’t seem fair to compare new pricing with old.  Any opinions out there on the value of including (somehow) second hand books?

In more geeky news, I’ve updated some of the testing code for Booko – it now tests the sites I scrape pricing information from to check if they have changed their HTML. Should make life easier.

Bug Fixes for IE

Well, I think I’ve found the bug that makes IE not work so well with Booko. The section of the page which displays the prices is updated via AJAX (with a GET request), but the first chunk of text for that section (with the spinning icons) gets cached by IE. The solution is to add an Expires header to the response. The actions which respond to GET requests got the following code:

headers["Expires"] = "Mon, 26 Jul 1997 05:00:00 GMT"

I should probably put a more correct value in there – but it’s always going to be set in the past at some point.
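
A minimal sketch of one way to produce a "more correct" value: compute an HTTP-date slightly in the past at response time instead of hard-coding one. The helper name `no_cache_headers` is hypothetical, and the extra `Cache-Control` header is an addition that the fix above does not use.

```ruby
require 'time'  # adds Time#httpdate

# Hypothetical helper: build headers that tell a browser not to reuse a cached copy.
def no_cache_headers(now = Time.now)
  {
    "Expires"       => (now - 3600).httpdate,  # one hour ago, in HTTP-date form (GMT)
    "Cache-Control" => "no-cache, no-store, must-revalidate"  # assumption: not part of the original fix
  }
end

no_cache_headers.each { |name, value| puts "#{name}: #{value}" }
```

In a controller action, each pair could be copied into `headers` the same way the single line above sets `headers["Expires"]`.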

A Mass of Booko related things.

It’s been a busy week for Booko. I sent out some email to almost everybody in my address book, advertising Booko. If that’s why you’re here then, “Hello!”. Hello if you’re a friend of a friend.  But that’s it.

Thanks to Wendy, who recommended a new store, which I’ve just added to Booko – The Book Depository. Looks like it’s pretty competitive. Based in the UK, but with free World Wide Shipping it’s a good addition to the lineup. This brings the total stores searched to 16 – a nice round, base 2 number.

Angus & Robertson have updated their site. Looks pretty nice. One big improvement is that it’s now possible to link directly to book pages. For some reason, the URLs for all book pages begin with “fiction.angusrobertson” – I’ll have to keep an eye on that. Performance seems way better than their old site.

Thanks to everyone who found bugs. The biggest downside of including your boss in your mass mail out is if he manages to find a bug. And naturally he did (Rob also found this bug). Turns out IE 6 sucks. If you search for a book, you only ever get spinning icons indicating that it’s looking up the price. It actually caches that page and never displays the prices. For this, I’m sorry. I’m not sure if I’ll fix it. I’ll have to check the stats and see how many of you are still on IE6.

Other bugs reported include another IE bug related to JavaScript (Thanks Gilly!) and Paul pointed out that Booko degrades poorly when JavaScript has been disabled.

Feel free to send in any errors / omissions / faults / bugs to bugz@booko.com.au

Random Booko updates

Well, I’ve added a blurb to Booko’s front page. Should have done this a while ago – makes it feel more like a proper website. It’s probably a bit too much text, but I’ll work on it.

Fixed a bug reported by Timo and Dan B (and probably others) – clicking “Add to Cart” before the prices were fully loaded made the cart behave pretty strangely – usually you’d get several items in your cart or some such. The problem was that the cart price can’t be calculated until all the prices are available. I’ve hidden the link to add items to the cart until all the prices have loaded.

Figured out how to set a timeout for getting prices from stores – and set it to 15 seconds.  May need to tweak this a bit. Some shops were taking too long to respond.  (Thanks Dan B for the suggestion)
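
A minimal sketch of a per-store timeout using Ruby's standard `Timeout` module. The helper name `fetch_with_timeout` and the stand-in lookups are hypothetical; this is not necessarily how Booko does it.

```ruby
require 'timeout'

PRICE_TIMEOUT = 15  # seconds, the value mentioned above

# Hypothetical helper: run a price lookup, giving up after `limit` seconds.
# Returns nil when the store does not answer in time.
def fetch_with_timeout(limit = PRICE_TIMEOUT)
  Timeout.timeout(limit) { yield }
rescue Timeout::Error
  nil
end

# Stand-in lookups; a real one would make an HTTP request to a store.
fast = fetch_with_timeout(1)   { 24.95 }
slow = fetch_with_timeout(0.1) { sleep 1; 19.95 }
puts fast.inspect
puts slow.inspect
```

Returning `nil` for a slow store lets the caller show the prices it has and treat the missing one as "not available yet".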

Tweaked the HTML for the cart, adding another separator. This site is now, without doubt, very separated.

More Cover Art

I’ve updated Booko to now check multiple sites for cover art.  All the sites I use for data will return “No Image” style images for books which they don’t have covers for.

Such as this:

There’s no way to tell from the URL of the image if it’s one of these pretend covers, so I had to be a bit more clever about it. Since the image itself is always the same, I calculate the md5 sum of the image – if it matches a known “No Image” image, I move on to the next site.

Given an image URL, I calculate the md5 sum like this:

require 'digest/md5'; require 'open-uri'
digest = Digest::MD5.hexdigest(URI.open(url).read)

Which downloads the image and calculates the hash in one nice easy step. So, from now on, books will be far more likely to have correct cover art images.
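
The "move on to the next site" step can be sketched roughly as below. Everything named here is hypothetical: `PLACEHOLDER_DIGESTS`, `first_real_cover`, and the stand-in fetchers, which return fixed strings instead of downloading real images so the example runs without a network.

```ruby
require 'digest/md5'

# Hypothetical list of MD5 sums of known "No Image" placeholders.
# Here it holds just the digest of the stand-in bytes used below.
PLACEHOLDER_DIGESTS = [Digest::MD5.hexdigest("no-image-bytes")]

# Returns the first image whose digest is not a known placeholder, or nil.
# `fetchers` is a list of callables, one per site, each returning image bytes
# (or nil when the site has nothing at all).
def first_real_cover(fetchers)
  fetchers.each do |fetch|
    data = fetch.call
    next if data.nil?
    next if PLACEHOLDER_DIGESTS.include?(Digest::MD5.hexdigest(data))
    return data
  end
  nil
end

sites = [
  -> { "no-image-bytes" },   # first site only has a placeholder
  -> { "real-cover-bytes" }  # second site has an actual cover
]
puts first_real_cover(sites)
```

With real sites, each callable would download the image body from that site's cover URL.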

Want to know the Format of a book?

Done! Well, where possible. I requested the guys at “The Nile” add the “Format” of a book to their API on Sunday. Literally 8am on Monday I got the response that they’d implemented it! That’s super fast! Amazon already has the feature and I’ve added it to the Fishpond scraper, so we should be set. Unfortunately, not all books actually list the format of the book.

I can see a common problem I have with getting book data from multiple sites – not all book “records” at each site contain all the information. Sometimes one site will have the format, but maybe not a good image of the cover. Sometimes a site will have neither. But generally, I just pick one site’s data and go with that. It’s fast and easy. To get complete data, it looks like I’ll have to start making book records composites from multiple sites.
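
One possible shape for such a composite, as a sketch only: take the records in priority order and keep the first non-missing value for each field. The field names, the sample records, and `composite_record` are all made up for illustration.

```ruby
# Hypothetical records for the same book from two sites; nil marks a missing field.
site_a = { title: "Example Book", format: "Paperback", cover_url: nil }
site_b = { title: "Example Book", format: nil, cover_url: "http://example.com/cover.jpg" }

# Merge records in priority order: keep the first non-nil value seen for each field.
def composite_record(records)
  records.reduce({}) do |merged, record|
    merged.merge(record) { |_field, kept, candidate| kept.nil? ? candidate : kept }
  end
end

book = composite_record([site_a, site_b])
puts book.inspect
```

Earlier records win, so the list order doubles as a ranking of which site's data is trusted most.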

More little updates for Booko.

Some more small changes this arvo:

  • Changed the separator from solid to dotted, moved it around a bit, and added separators between the recent searches / your searches sections.
  • The “Last Updated” section now tells you how long ago the book was updated in relative terms.
  • The Cart has been shrunk and now has a dotted line to clearly separate it from the other content.

Another quick feature for Booko

I’ve added another feature to Booko – Booko now remembers books you’ve recently viewed. This should save you from having to search again, or add them to your cart to keep track of books you’re interested in. Of course, it’s only session based – lists will be different on different computers / browsers.

New Price grabber

I’ve updated the program which runs off and finds the prices on the 15 shops Booko scans. Previously, it would search for a single book at a time, but would search all shops at the same time.  Now it will check *all* books which are waiting for prices at the same time.  

One of the problems with the old approach is that some stores seem to be down as much as up – the price grabbing code will patiently wait until all shops have answered or the connection times out. When a site is down, other users may be waiting for over a minute for their books to be updated. The new approach means everyone should be getting prices as soon as they’re available.  

So, if you’re the kind of person who does a search, then opens multiple tabs from the search result, you should find the results are populated much, much faster.
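
A rough sketch of the "all books, all shops, at the same time" idea using one thread per lookup, with each price reported as soon as it arrives. Threads are an assumption here, as are the shop names, ISBNs, and the `grab_prices` helper; the real price grabber may work quite differently.

```ruby
require 'timeout'

# Hypothetical stand-ins: each shop is a callable taking an ISBN and returning a price.
SHOPS = {
  "fast-shop" => ->(_isbn) { 24.95 },
  "down-shop" => ->(_isbn) { sleep 2; 19.95 }  # simulates a store that is not answering
}

# Look up every waiting book at every shop at once, one thread per lookup.
# `on_price` is called as soon as each individual lookup finishes or times out.
def grab_prices(isbns, shops, limit, &on_price)
  threads = isbns.product(shops.to_a).map do |isbn, (name, lookup)|
    Thread.new do
      begin
        price = Timeout.timeout(limit) { lookup.call(isbn) }
        on_price.call(isbn, name, price)
      rescue Timeout::Error
        on_price.call(isbn, name, nil)  # nil: the shop did not answer in time
      end
    end
  end
  threads.each(&:join)
end

results = Queue.new  # thread-safe, so any thread can push into it
grab_prices(["9780000000001", "9780000000002"], SHOPS, 0.5) do |isbn, shop, price|
  results << [isbn, shop, price]
end
puts results.size
```

Because every lookup runs independently, a slow or dead shop only delays its own result rather than holding up prices for other books.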