Alex Pooley's Blog

Hello there, my name is Alex Pooley and I'm a freelance web developer residing in Perth, Western Australia. My passion is in the development of web sites that solve everyday problems. Here's a gallery of some of my notable work. If you need a web site designer or developer, contact me with further details. Lastly, you can read more about me.

Fuzzy Searching With Ferret

September 20th, 2008

Searching With Fuzz

I’ve spent the better part of this afternoon trying to suss out a fuzzy searching system for my Ruby on Rails application. What I want to do is return results that include slightly misspelled words. I started playing with Sphinx, but eventually realised that “fuzzy” in the land of Sphinx really just means wildcards. So I settled on Ferret with the acts_as_ferret (AAF) Rails plugin.

It was a bit of a battle to work out how to trigger a fuzzy search through AAF, and then a complete guess to work out how to change the minimum similarity score. So for your reference and mine, when making a multi-word fuzzy search using acts_as_ferret:

  • Suffix the search terms with a tilde (~) to indicate a fuzzy search
  • Suffix the tilde with a minimum similarity threshold in the range [0,1] to override the default threshold
  • Replace spaces with + signs. I’m not 100% sure on this one, as I would have thought that surrounding the terms with quotes (") would turn the query into a fuzzy phrase search, but the results with quotes don’t match my thinking

E.g.

  • Company.find_with_ferret('Sandalfr+Wine~0.7')
  • Company.find_with_ferret('name:Sandalfr+Wine~0.7') # Search a column
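
The three rules above can be wrapped in a small helper so the query string doesn't have to be assembled by hand each time. This is only a sketch: fuzzy_query and its min_similarity and field parameters are hypothetical names of my own, not part of acts_as_ferret. It does nothing more than reproduce the string format of the two examples above, so it inherits the same uncertainty about + signs versus quotes.

```ruby
# Hypothetical helper (NOT part of acts_as_ferret): builds a query string in
# the same shape as the examples in this post, e.g. "name:Sandalfr+Wine~0.7".
def fuzzy_query(phrase, min_similarity: nil, field: nil)
  # The threshold is optional; when given it must lie in [0, 1].
  unless min_similarity.nil? || (0.0..1.0).cover?(min_similarity)
    raise ArgumentError, "min_similarity must be between 0 and 1"
  end

  terms = phrase.split # split the phrase on whitespace
  raise ArgumentError, "phrase must contain at least one term" if terms.empty?

  query = terms.join("+")               # replace spaces with + signs
  query = "#{field}:#{query}" if field  # optionally restrict to one column
  "#{query}~#{min_similarity}"          # tilde, then the optional threshold
end

puts fuzzy_query("Sandalfr Wine", min_similarity: 0.7)
puts fuzzy_query("Sandalfr Wine", min_similarity: 0.7, field: "name")
```

Inside a Rails app with AAF set up, the result would then be passed straight through, e.g. Company.find_with_ferret(fuzzy_query('Sandalfr Wine', min_similarity: 0.7)).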

Note that I have found Ferret to be unstable, and I have read that others have too. Fortunately I only need Ferret for offline processing. Sphinx looks really good for all other types of textual queries except for fuzzy searches.

Sergey Brin Parkinson’s Predisposition

September 20th, 2008
Scientist at work in the lab

Sergey Brin has started a personal blog, and in his first post he discusses the discovery of his greater-than-average likelihood of developing Parkinson’s disease. Despite the frank and sober tone of his post, I couldn’t help but giggle when I came across this paragraph towards the end…

This leaves me in a rather unique position. I know early in my life something I am substantially predisposed to. I now have the opportunity to adjust my life to reduce those odds (e.g. there is evidence that exercise may be protective against Parkinson’s). I also have the opportunity to perform and support research into this disease long before it may affect me. And, regardless of my own health it can help my family members as well as others.

I wish I had enough cash to fund my own research into whatever genetic diseases I may have a predisposition to. Alas, I do not. For now I’ll just keep my head in the sand and avoid handing over a sample of my DNA to a scientific clairvoyant.

New Blog Performance

September 17th, 2008

Over a year ago I started a web site to collect e-mail leads. The site was really just an experiment, but over time it started showing some promise. I’ve since re-launched the site and turned it into a WordPress-powered blog. The blog has the usual news section, but with original content re-purposed from other news sources, all manually powered. The blog also has a static content section that I update with content recycled from posts made to the blog section. This has enabled me to leverage my time and effort while still producing something of value.

It’s been interesting watching the transition in Google Analytics. The site’s traffic has started to climb, beginning pretty much at the time I swapped from a one-page lead-generating site to a dynamic and “content packed” blog. The first week of going live produced the most hits on the site so far, with roughly twice as many hits as in a typical week.
 


New Blog Graph


As can be seen from the graph, the numbers themselves are barely worth writing about. However, there is a clear break in the pattern, which is intriguing. Hopefully this trend continues and my new approach of slow and steady content generation proves successful. I’ll keep you updated.

I should note that the combination of a static “education” section with a current news section appears quite powerful so far. If you are familiar with TechCrunch, then this arrangement is a little like what CrunchBase is to TechCrunch.
