Summer fixes
A couple of little fixes, the largest of which is that the search functionality is no longer powered by AJAX. This means the Back button will work correctly after you’ve searched for a book, viewed a book, then clicked back. Surprisingly, Safari actually did clever stuff to make this work – clicking back in Safari would take you back to the search results – while in every other browser you’d be taken to an unexpected page, usually the front page.
Secondly, I’ve added HTML5 attributes to various fields as described by Mark Pilgrim. The most obvious change will be for Safari and Chrome users (at least on the Mac), where the search box will have round corners and stuff. Other fields, such as the email entry on login and register and the OpenID field, will now be easier to use on the iPhone, with an alternative keyboard making it easier to enter that data.
Let me know what you think.
Google / Yahoo user?
Logging into Booko just got easier. If you have a Google or Yahoo account, just hit the appropriate button and you’re in, registration included.
Turns out, this was super easy to add to Booko since it already does vanilla OpenID logins. Basically, Booko just fills in the OpenID URL with “https://www.google.com/accounts/o8/id” or “http://yahoo.com/” and OpenID Directed Identity does the rest. You could also just type those URLs in yourself and it’ll work just the same. Sweet!
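A rough sketch of that idea in Ruby. The constant and method names here are made up for illustration; only the two URLs come from the description above, and this is not Booko's actual code:

```ruby
# Hypothetical mapping from a login button to the OpenID identifier that
# gets submitted on the user's behalf. Only the two URLs come from the post.
PROVIDER_OPENID_URLS = {
  'google' => 'https://www.google.com/accounts/o8/id',
  'yahoo'  => 'http://yahoo.com/'
}.freeze

# Returns the identifier to hand to the normal OpenID login flow: the
# provider's URL if a known button was pressed, otherwise whatever the
# user typed into the OpenID field.
def openid_identifier(provider, typed_url = nil)
  PROVIDER_OPENID_URLS.fetch(provider.to_s.downcase, typed_url)
end

puts openid_identifier('google')
puts openid_identifier(nil, 'http://example.com/me')
```

Everything after this point is the plain OpenID login that already existed, which is why the buttons are so cheap to add.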
New Booko features
Hey everyone, Booko has some new features.
User accounts – you can now create an account to save your cart. Booko accepts OpenID also – so you can use your OpenID provider to log in to Booko.
If you have a Booko account, you can now create additional lists for keeping track of books. For example, you could create a wish list, or a scifi list, or a kids’ book list or a list of all Tintin books. Just go to the “Manage Lists” page and go to town.
Finally, you can make lists public and sharable. Once you’ve created a list, just tick the “Public?” checkbox. Booko will then create a public URL for you to use. For example, here’s a list I made of all Tintin comics:
http://www.booko.com.au/lists/view/nx9deGGd9Qgy2SE8
You can copy all the items from any public list into your own cart or into any of your lists.
Let me know what you think.
Brief outage for server upgrade.
Shouldn’t be long.
Update: And we’re back. Took a bit longer than expected though.
Updated cart implementation
Tonight I introduced a new shopping cart implementation to Booko. This new cart code is in preparation for allowing Booko users to create lists of books and to enable people to save these lists in their user accounts (coming soon). The shopping cart will be one of these lists.
The new cart includes the ability to increase the number of any particular title without needing to visit that book’s page.
I wasn’t able to migrate existing carts across to the new system, but any book you’ve viewed recently will show up in the “Your Recently Viewed” section. Enjoy!
Playing on the Master branch
I’m working on adding a new feature to Booko, but I accidentally started working on the Git Master branch, which I like to keep sync’d with the production version of Booko. So after a few commits I want to be able to add some fixes to Booko, but I’ve polluted my master branch with untested, unfinished changes. What to do?
After reading up on Stack Overflow, I decided to fix things. What I want to do is reset my master branch to the version in production, then take all the subsequent commits and put them on a new branch. Turns out, it’s easy.
1. Find the commit you want the master branch to be at. You can find its SHA-1 name with “git log”.
2. Create a branch from that commit with: git checkout -b new_master <SHA-1 commit name> (hint: this will be the new master)
3. Rename your current master branch to the new name of the feature: git branch -m master branchname
4. Rename the new_master to master: git branch -m new_master master
And, job done. You’re no longer messing up your master.
So awesome.
There are times I really enjoy using Ruby on Rails. Recently, Fishpond started 403’ing HTTP requests for cover images if the referrer isn’t fishpond.com.au. Sites do this so that other sites don’t steal their bandwidth. Really, Booko should be downloading the images and serving them itself (It’s on the todo list BTW). Since Booko had been using Fishpond image URLs to display covers, you may have noticed a bunch of missing cover images – some of them are caused by Fishpond’s new (completely reasonable) policy.
So I’ve updated the code so I don’t link to Fishpond images, but now I need to go through every product Booko’s ever seen and update those with a Fishpond image URL. This is laughably easy with Ruby on Rails. Just fire up the console and run this:
Product.find_each do |p|
  if p.image_url =~ /fishpond/
    puts "updating details for #{p.gtin}"
    p.image_url = nil
    p.get_detail
    p.save
  end
end
The Rails console gives you access to all the data and models of your application – and this code, just pasted in, will find every product with a Fishpond image URL, then either find a replacement image or leave the URL set to nil. Point of interest – Booko has 396,456 products in its database. Iterating with Product.all.each would load every product into memory before hitting the each – that would probably never return. On the other hand, Product.find_each loads records in batches of 1000 by default. Pretty cool.
* Thanks to http://ryandaigle.com/ for posting about this feature.
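To make the batching idea concrete, here is a plain-Ruby illustration that runs without Rails. FakeProduct, TABLE, fetch_batch and each_in_batches are all stand-ins invented for this sketch. It only mimics roughly what find_each does (walk the table in primary-key order, one batch at a time) and is not the Rails implementation:

```ruby
# Stand-ins for the real model and database table.
FakeProduct = Struct.new(:id, :image_url)

TABLE = (1..2500).map do |id|
  FakeProduct.new(id, id % 3 == 0 ? "http://www.fishpond.com.au/#{id}.jpg" : nil)
end

# Fetch at most batch_size rows with id > last_id, in id order
# (roughly "WHERE id > ? ORDER BY id LIMIT ?").
def fetch_batch(last_id, batch_size)
  TABLE.select { |row| row.id > last_id }.first(batch_size)
end

# Yield every row while only holding one batch at a time.
def each_in_batches(batch_size = 1000)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }
    last_id = batch.last.id
  end
end

updated = 0
each_in_batches do |product|
  next unless product.image_url =~ /fishpond/
  product.image_url = nil
  updated += 1
end
puts "cleared #{updated} image URLs"
```

The point is the shape of the loop: memory use is bounded by the batch size rather than by the size of the table.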
Fun with git post-checkout
While developing new features or bug fixes for Booko, I usually work in branches. This keeps things separate, and means I can easily keep the current production version clean and easy to find. But when changing branches I often have to restart the Rails server and the price grabber to pick up any changes. For example, if I’m adding a new shop in a branch, when I switch branches I want the price grabber to restart.
Turns out git makes this super easy. You just create a shell script: .git/hooks/post-checkout
That script gets called after checkout. So, mine is pretty simple:
#!/bin/sh
./bin/fetch_price.rb 0 restart; thin restart
There’s probably a better way to get Thin to reload itself, but this works nicely.
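A hook can be any executable, so the same thing could be written in Ruby. The sketch below is hypothetical (the method and constant names are made up). It relies on git passing post-checkout three arguments: the previous HEAD, the new HEAD, and a flag that is "1" for a branch checkout and "0" when only files were checked out. That lets it skip the restart when nothing relevant changed:

```ruby
#!/usr/bin/env ruby
# Hypothetical .git/hooks/post-checkout written in Ruby.
# Git calls it with: previous HEAD, new HEAD, branch-checkout flag.

# Restart only for a branch checkout that actually moved HEAD. Switching
# between two branches that point at the same commit changes no code.
def restart_needed?(args)
  prev_head, new_head, branch_flag = args
  branch_flag == '1' && prev_head != new_head
end

# The same two commands as the shell version of the hook.
RESTART_COMMANDS = ['./bin/fetch_price.rb 0 restart', 'thin restart'].freeze

if __FILE__ == $PROGRAM_NAME && restart_needed?(ARGV)
  RESTART_COMMANDS.each { |command| system(command) }
end
```

As with the shell version, the file needs to be executable for git to run it.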
You can check out all the hooks here: http://www.kernel.org/pub/software/scm/git/docs/v1.5.5.4/hooks.html
Now with all new REE + Phusion.
The excellent people at Phusion have released a 1.8.7 based version of their fantastic Ruby Enterprise Edition. I’ve just updated to it and Booko sure feels snappier. I’ve also upgraded to the latest mod_rails (aka, phusion-passenger) so we’re all up-to-date on my medium-ticket sysadmin work.
On Users and Passwords
Update: thanks to the commenters for pointing out some flaws in the logic of the previous version of this page. I’ve updated the page to incorporate their feedback.
I’ve been thinking about adding wish lists to Booko. Wishlists require an implementation of Users – after all, what’s the point of having Wishlists if you can’t change them or publish them? Booko’s built on Ruby on Rails, so I had a look around for plugins, but, truth be told, I’m too much of a Ruby n00b to trust other people’s plugins. I’m sure they’re easy to install, but how do you keep them up to date? Finding any kind of bug will mean reading and understanding the code and seriously, that’s as much work as implementing Users on my own. Plus, I have concerns about their implementation which I’ll discuss more later.
So, I figure having users requires two bits of data:
- Email address
- Password
Having the email address as the login name makes sense to me – it’s unique and if someone forgets their password I can email them a password reset link. No need to remember a separate username. One day I’ll add OpenID because I’m a freetard and like the concept.
Now, passwords are valuable bits of information and they need to be protected. This may sound obvious, but they need to be protected for a couple of reasons:
- only the actual user (or owner of the email address) can log in and manipulate Booko Wishlists
- many users use the same email address and password on other sites (PayPal & Amazon for example).
Maybe you don’t do this, but if you have a separate password for every site requiring login, you’re a better person than me. Someone getting hold of your email and password can lead to some … difficulties on other sites where you’ve used the same email address and password.
So, we need to protect passwords from prying eyes. There are two main ways for your password to be discovered:
- Database compromise
- Web sniffing proxies.
So, what to do? What we do is turn to hashing functions. Hashing functions (like MD5 & SHA*) can take information like a password and send it on a one-way trip. In effect, it is impractical to work backwards from the hash to the password. However, some clever people have calculated the hashes of hundreds of millions of passwords and stored them – then they can simply look up a hash stored in Booko in their pre-calculated table and find the password. This is known as a dictionary attack.
To thwart this attack, we introduce a “salt”. The salt is a random string of characters which is combined with the password before it is hashed. This means that the pre-calculated dictionary is now useless – they would need an entire dictionary which included your salt combined with all those guessed passwords. A sound way of combining a password with a salt is to use an HMAC function. In Ruby you can do it like this:
require 'hmac-sha1'

def self.do_hashing(password, salt)
  # The salt is the HMAC key; the password is the message being hashed.
  passwd_hmac = HMAC::SHA1.new(salt)
  passwd_hmac << password
  passwd_hmac.hexdigest
end
Ideally, each User has their own salt. This means an attacker would need to generate an entire dictionary attack per user.
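A minimal sketch of per-user salts, using OpenSSL and SecureRandom from Ruby's standard library rather than the hmac-sha1 gem used above. The helper names hash_password and new_user_salt are assumptions made for this example:

```ruby
require 'openssl'
require 'securerandom'

# Same idea as do_hashing above: the salt is the HMAC key and the
# password is the message.
def hash_password(password, salt)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA1'), salt, password)
end

# Hypothetical helper: a fresh random salt, generated once per user and
# stored alongside that user's password hash.
def new_user_salt
  SecureRandom.hex(16)
end

alice_salt = new_user_salt
bob_salt   = new_user_salt

# Two users with the same password end up with different stored hashes,
# so one pre-calculated dictionary cannot cover both of them.
puts hash_password('MyPassword', alice_salt)
puts hash_password('MyPassword', bob_salt)
```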
In any case, this is where the other implementations of Users seem to stop. When you type your password into a form on a web page, it gets sent to the server – the server hashes your password (along with the salt) and checks if that hash matches the stored hash. If they match, you’re in.
But there is still the matter of your password being sent over the internets, possibly via proxy servers which can listen in on the traffic. There are two ways to stop the password being sent in the clear: either hash the password first, or use SSL. If you’re using SSL to send the passwords, you could stop here. Booko currently doesn’t have SSL certificates, so we need another way to stop passwords travelling over the internet in the clear. How do we do that? Easy, we hash the password before sending it over the internet. We’ll create a single salt for this hash and call it the transport salt.
On the server side, we’ll also hash the password with the transport salt, prior to hashing it with the per-user salt.
This means we send the browser the transport salt, and the browser calculates the hash. This hash is sent to the server. The server can validate the password as shown below:
require 'hmac-sha1'

def self.do_hashing(password, salt)
  passwd_hmac = HMAC::SHA1.new(salt)
  passwd_hmac << (password)
  passwd_hmac.hexdigest
end

final_hash_to_check = do_hashing(password_hash_from_browser, user_salt)
if final_hash_to_check == stored_password_hash
  # User provided correct password.
  session[:user_id] = user.id
end
In the code above, when the response comes back from the client, the server calculates the final hash by using the user_salt. This final hash is compared with the stored_password_hash – if they’re the same the client provided the correct password.
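To see both stages together, here is a self-contained sketch of registration and login. It uses the standard library's OpenSSL in place of the hmac-sha1 gem, and the browser side is simulated in Ruby (in practice it would be JavaScript). The names hmac_sha1, browser_hash and server_hash are made up for this example:

```ruby
require 'openssl'
require 'securerandom'

def hmac_sha1(key, data)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA1'), key, data)
end

TRANSPORT_SALT = SecureRandom.hex(16) # sent to every browser
user_salt      = SecureRandom.hex(16) # never leaves the server

# Browser side: only this hash travels over the wire.
def browser_hash(password, transport_salt)
  hmac_sha1(transport_salt, password)
end

# Server side: hash whatever arrived with the per-user salt.
def server_hash(hash_from_browser, user_salt)
  hmac_sha1(user_salt, hash_from_browser)
end

# Registration: store the result of both stages.
stored_password_hash = server_hash(browser_hash('MyPassword', TRANSPORT_SALT), user_salt)

# Login with the right password, then with a wrong one.
good = server_hash(browser_hash('MyPassword', TRANSPORT_SALT), user_salt)
bad  = server_hash(browser_hash('WrongPassword', TRANSPORT_SALT), user_salt)
puts good == stored_password_hash
puts bad == stored_password_hash

# Someone who steals the stored hash and submits it as if it came from a
# browser gets it hashed again, so it no longer matches.
replayed = server_hash(stored_password_hash, user_salt)
puts replayed == stored_password_hash
```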
So, where does this leave us?
- At no point is the password sent in the clear, nor stored in the clear.
- The client only sees the transport salt, not the per-user salt and can’t precalculate a dictionary attack
- Each user has a separate salt, making it far more difficult for an attacker to perform a dictionary attack
- Compromising the database and retrieving the hashes doesn’t allow you log in with that hash
- Compromising the database and retrieving the hashes doesn’t allow you to log onto any other sites
- Sniffing the transport hash will allow an attacker to access your account
So, we’ve achieved some of our goals. The password is never sent in the clear. However, if an attacker snoops traffic between the client and server and gets a copy of the hashed password, that hash can be used to log on to the service.
If that’s unacceptable, then moving to SSL is the next step. You can use SSL to protect the hash as it moves between the client and server. However, you can also use SSL to protect the plain password from being discovered. Is there any point in doing both? Hashing the password prior to SSL is slightly more secure. If you hash the password before it leaves the client, there’s no danger of the password appearing in the log files of your web server or application server. And if you have a dedicated host decrypting the SSL prior to passing it to your web server, a plain password could be sniffed between those servers.
After all that, I’ve decided that Booko will use both hashing the password before transmission and, eventually, SSL.
Notes on SHA1 and extra security:
SHA1 hashes are very secure and can be calculated very fast. The faster the hash, the faster you can build a dictionary attack. Ideally for this scenario, we want a slow hashing function, or, more correctly, we want the method that generates our hashes to be slow. This can easily be achieved by simply hashing the password multiple times. Here are some timings:
>> require 'hmac-sha1'
>> require 'benchmark'
>> salt = (0..256).map { ((0..9).to_a + ('a'..'z').to_a + ('A'..'Z').to_a).sample }.join
>> hash = nil
>> Benchmark.realtime { hash = HMAC::SHA1.new(salt)<< "MyPassword" }
=> 6.60419464111328e-05
>> Benchmark.realtime { 100.times { hash = HMAC::SHA1.new(salt) << hash.hexdigest } }
=> 0.00437498092651367
>> Benchmark.realtime { 1000.times { hash = HMAC::SHA1.new(salt) << hash.hexdigest } }
=> 0.0426349639892578
>> Benchmark.realtime { 10000.times { hash = HMAC::SHA1.new(salt) << hash.hexdigest } }
=> 0.463771820068359
>> Benchmark.realtime { 100000.times { hash = HMAC::SHA1.new(salt) << hash.hexdigest } }
=> 4.64294099807739
So, 10,000 hashes took around 1/2 second on my 2.8GHz Core 2 Duo MacBook Pro. You can see that performing 10,000 hashes will seriously slow any attempt at creating a dictionary attack on your passwords. I haven’t timed how long it takes in JavaScript, but you’d want to keep it under a second so that logging in isn’t too painful.
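Wrapped up as a function, repeated hashing might look like the sketch below. The name stretched_hash and the default of 10,000 rounds are assumptions for illustration, and it uses the standard library's OpenSSL rather than the hmac-sha1 gem from the timings above:

```ruby
require 'openssl'
require 'benchmark'

def hmac_sha1(key, data)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA1'), key, data)
end

# Hypothetical helper: HMAC the password once, then keep re-hashing the
# hex digest until `rounds` HMACs have been computed in total.
def stretched_hash(password, salt, rounds = 10_000)
  hash = hmac_sha1(salt, password)
  (rounds - 1).times { hash = hmac_sha1(salt, hash) }
  hash
end

seconds = Benchmark.realtime { stretched_hash('MyPassword', 'some salt') }
puts "10,000 rounds took #{seconds} seconds"
```

The number of rounds has to be the same wherever the hash is calculated, so it is worth treating it as part of the stored format. Timings will differ from the ones above depending on the machine and the HMAC library.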
Links:
You can find JavaScript libraries for doing HMAC on PAJ’s Homepage or jsSSH on SourceForge.