Alex Pooley's Blog

Hello there, my name is Alex Pooley and I'm a freelance web developer residing in Perth, Western Australia. My passion is in the development of web sites that solve everyday problems. Here's a gallery of some of my notable work. If you need a web site designer or developer, contact me with further details. Lastly, you can read more about me.

Fuzzy Searching With Ferret

September 20th, 2008


Searching With Fuzz


I’ve spent the better part of this afternoon trying to suss out a fuzzy searching system for my Ruby on Rails application. What I want to do is return results that include slightly misspelled words. I started playing with Sphinx, but eventually realised that “fuzzy” in the land of Sphinx really just means wildcards. So I settled on Ferret with the acts_as_ferret (AAF) Rails plugin.

It was a bit of a battle to work out how to trigger a fuzzy search through AAF, and then a complete guess to work out how to change the minimum similarity score. So for your reference and mine, when making a multi-word fuzzy search using acts_as_ferret:

  • Suffix the search terms with a tilde (~) to indicate a fuzzy search
  • Suffix the tilde with a minimum similarity threshold in the range [0,1] to override the default threshold
  • Replace spaces with + signs. I’m not 100% sure on this one, as I would have thought that surrounding the terms with quotes (") would turn the query into a fuzzy phrase search, but the results with quotes don’t match my thinking

E.g.

  • Company.find_with_ferret('Sandalfr+Wine~0.7')
  • Company.find_with_ferret('name:Sandalfr+Wine~0.7') # Search a column
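
To save typing the tildes and plus signs by hand, the rules above can be wrapped in a small helper. This is only a sketch: fuzzy_query and its similarity and field parameters are names I made up for illustration, they are not part of Ferret or acts_as_ferret, and the + joining carries the same uncertainty noted in the list.

```ruby
# Hypothetical helper (not part of Ferret or acts_as_ferret): builds a
# query string in the format described in the list above.
def fuzzy_query(phrase, similarity: nil, field: nil)
  if similarity && !(0.0..1.0).cover?(similarity)
    raise ArgumentError, 'similarity must be between 0 and 1'
  end

  query = phrase.strip.split(/\s+/).join('+') # spaces become + signs
  query += '~'                                # tilde marks a fuzzy search
  query += similarity.to_s if similarity      # optional minimum similarity
  field ? "#{field}:#{query}" : query         # optional column prefix
end

puts fuzzy_query('Sandalfr Wine', similarity: 0.7)
puts fuzzy_query('Sandalfr Wine', similarity: 0.7, field: :name)
```

The resulting string would then be passed to find_with_ferret exactly as in the two examples.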

Note that I have found Ferret to be unstable, and I have read that others have too. Fortunately I only need Ferret for offline processing. Sphinx looks really good for every type of textual query other than fuzzy searches.

Two Coding Gotchas: JavaScript & Ruby

May 24th, 2008

Nailed by two coding gotchas in two days. Argh!

Here’s the one in JavaScript.

js> var a = (100).toFixed(2);
js> a
100.00
js> var b = (20).toFixed(2);
js> b
20.00
js> a > b
false

a is 100.00, and b is 20.00. 100.00 is greater than 20.00, right? Yes, except that they’re strings! toFixed returns a string, so comparing a with b is a string comparison and not a numeric one: “100.00” sorts before “20.00” because “1” comes before “2”.
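
The trap is not specific to JavaScript. Here is a minimal sketch of the same comparison in Ruby (my choice of language for the illustration, not the original session), with the numeric conversion that fixes it. In JavaScript the equivalent fix is to convert the strings back with parseFloat or Number before comparing.

```ruby
a = format('%.2f', 100) # "100.00", a String
b = format('%.2f', 20)  # "20.00", a String

string_result  = a > b           # character by character: "1" sorts before "2"
numeric_result = a.to_f > b.to_f # converted back to numbers first

puts string_result  # false
puts numeric_result # true
```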

Now, here’s the Ruby one that got me. If you’re not a Ruby person then this one might be a little harder to pick.

irb(main):001:0> class Klass
irb(main):002:1>   def meth=(value)
irb(main):003:2>     return 'return this please'
irb(main):004:2>   end
irb(main):005:1> end
=> nil
irb(main):006:0>
irb(main):007:0* obj = Klass.new
=> #<Klass:0x83180>
irb(main):008:0> obj.meth = "Do not return this please"
=> "Do not return this please"

Well, I wanted to return “return this please” on the assignment, but Ruby ignored me and returned the assigned value instead. What if I want to indicate that the assignment failed? My only option is to raise an exception.

I can appreciate that Ruby wants to keep things consistent and always return the assignment value, but can Ruby appreciate my view of consistency and return when I say to return?
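
For what it’s worth, the method’s own return value is not lost. It is only the assignment syntax that discards it. A minimal sketch, reusing the Klass from the session above with an emptiness check that I have added purely for illustration: calling the setter through send yields the return value, and raising is the way to signal a failed assignment.

```ruby
class Klass
  def meth=(value)
    # Illustrative validation (an assumption, not in the original class).
    raise ArgumentError, 'value must not be empty' if value.empty?

    @meth = value
    'return this please'
  end
end

obj = Klass.new

# Assignment syntax always evaluates to the right-hand side.
assigned = (obj.meth = 'Do not return this please')

# An ordinary method call via send yields the method's return value.
sent = obj.send(:meth=, 'Do not return this please')

# A failed assignment is signalled by raising.
failed = begin
  obj.meth = ''
  false
rescue ArgumentError
  true
end

puts assigned
puts sent
puts failed
```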

If you can tell me why Ruby behaves this way I would appreciate it if you could leave your mark in my comments section.



Ruby Browser Hooks & Facebook Integration Testing

May 18th, 2008

Recently I had to write some integration tests to ensure that my Facebook application was communicating with Facebook correctly. I started out trying to use plain old Net::HTTP, which was painful because Facebook appears to actively block programmatic access through their login page. Just as I was about to give up, I discovered a suite of libraries that let you drive your browser from Ruby! It is fantastic, as I can very simply control the forms and content in my browser straight from my Ruby code. Nice!

The library is called Watir. Watir hooks Ruby into Internet Explorer, but there are ports for Safari and Firefox. I ended up going with the Safari port as the Firefox port looked a bit tricky to install.

Now, for the pièce de résistance …

require File.dirname(__FILE__) + '/../spec_helper'
require 'rubygems'
require 'safariwatir'

describe UserController do
  FB_EMAIL = 'xxx@xxxxx.com'
  FB_PASS = 'xxxxx'

  FB_URL = 'http://www.facebook.com'
  CALLBACK_URL = 'http://dev.xxxxxxxxx.com'
  FB_APP_URL = 'http://apps.facebook.com/xxxxx/'
  FB_LOGOUT_URL = 'http://www.facebook.com/logout.php'

  LOGOUT_REGEX = /http:\/\/www\.facebook\.com\/logout\.php\?.+/

  it "should use UserController" do
    controller.should be_an_instance_of(UserController)
  end

  describe "POST 'index'" do
    before(:each) do
      @browser = Watir::Safari.new
      @browser.set_fast_speed
      @browser.goto(FB_APP_URL)
      @browser.link(:url, LOGOUT_REGEX).click rescue nil
    end

    it "should find authentication prompt" do
      @browser.goto(FB_APP_URL)
      # exist? is a boolean, so "should_not == nil" would pass even when the form is missing
      @browser.form(:id, 'loginform').exist?.should be_true
      expected = 'Login to Facebook to enjoy the full functionality of'
      @browser.contains_text(expected).should > 0
    end

    it "should authenticate" do
      authenticate
      expected = /Login to the .* application?/
      @browser.contains_text(expected).should_not be_nil
    end

    it "should log in to the application" do
      authenticate
      @browser.form(:index, 1).submit
    end

  end

  private

  def authenticate
    @browser.goto(FB_APP_URL)
    @browser.text_field(:id, 'email').set(FB_EMAIL)
    @browser.password(:name, 'pass').set(FB_PASS)
    @browser.form(:index, 1).submit
  end

end

The code above is a work in progress, but it is still fully functional RSpec code. Feel free to use it.

A note if you do end up using the Safari Watir port: I had to make a change to the core libraries as there seemed to be a race condition. If you are affected, you will notice an exception thrown part way through entering data in a form. My quick fix/hack is to extend the sleep time from 1 second to something larger. I had no problems after increasing the sleep time to 4 seconds.

# safariwatir/scripter.rb
def page_load
	yield
	#sleep 1
	sleep 4
	....
end
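
A fixed sleep slows every page load, so a gentler alternative might be to poll for a condition and give up after a deadline. This is only a sketch: wait_until is a helper I made up, it is not part of SafariWatir, and I have not tried wiring it into scripter.rb. The counter in the block below is a stand-in for a real check such as whether a form field exists yet.

```ruby
# Hypothetical helper (not part of SafariWatir): poll the block until it
# returns a truthy value, or return false once the timeout has passed.
def wait_until(timeout: 4, interval: 0.1)
  deadline = Time.now + timeout
  until yield
    return false if Time.now >= deadline

    sleep interval
  end
  true
end

# Stand-in condition for illustration: becomes true on the third check.
attempts = 0
ready = wait_until(timeout: 1, interval: 0.01) { (attempts += 1) >= 3 }
puts ready
```

The default timeout of 4 seconds mirrors the sleep value that worked for me, but pages that load sooner would no longer pay the full wait.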