05-14-2008, 07:33 PM   #4
drewbee

Here is how I would do it.

Step 1) Strip out common words that don't help identify the article, e.g. and, but, I, we, are, etc.
Step 2) For each existing article, build an index of its words and how many times each one appears in the article.
Step 3) Do the same for the new article being listed (build an index of words and counts).
Step 4) Find all keywords in the new article that have a count greater than 3, then query against all of our previously submitted articles that contain all of those keywords.
Step 5) Sort by relevance: the old articles that match the most keywords from the new article come first.
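
Here is a rough Python sketch of the five steps, just to make the idea concrete. Everything named in it (STOP_WORDS, MIN_COUNT, word_counts, keywords, related_articles) is made up for illustration, the stop-word list is a tiny placeholder, and articles are assumed to be plain strings keyed by id. It also assumes an old article qualifies if it shares at least one keyword with the new one, so that the sort in step 5 has something to rank; tighten that to "all keywords" if that is what you want.

```python
import re
from collections import Counter

# Placeholder stop-word list (assumption); a real one would be much longer.
STOP_WORDS = {"and", "but", "i", "we", "are", "the", "a", "an", "of", "to", "in", "is", "it"}
MIN_COUNT = 4  # "count greater than 3" from step 4


def word_counts(text):
    """Steps 1-3: lowercase, drop stop words, count how often each word appears."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def keywords(text, min_count=MIN_COUNT):
    """Step 4 (first half): words that appear at least min_count times."""
    return {w for w, n in word_counts(text).items() if n >= min_count}


def related_articles(new_text, old_articles):
    """Steps 4-5: score each old article by shared keywords, best match first.

    old_articles is a dict of article id -> article text.
    Returns a list of (article_id, matched_keyword_count) pairs.
    """
    new_keys = keywords(new_text)
    scored = []
    for article_id, text in old_articles.items():
        matched = new_keys & set(word_counts(text))
        if matched:  # assumption: any shared keyword qualifies
            scored.append((article_id, len(matched)))
    # Stable sort, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice you would precompute and store the word index for the old articles (step 2) rather than recounting them on every call as this sketch does.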

I personally would never do this with AJAX, though; it's way too resource intensive. I would run this type of algorithm only when a new article is added, and cache the results in the database in a new table holding a one-to-many relationship (article > related articles).
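
A minimal sketch of that cache table, using Python's built-in sqlite3 module only so the example is self-contained. The table name, column names, and helper functions are all assumptions for illustration; the list of (related_id, matches) pairs is whatever your matching routine produces when the article is added.

```python
import sqlite3


def create_schema(conn):
    # One row per (article, related article) pair: the one-to-many table.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS related_articles (
               article_id INTEGER NOT NULL,
               related_id INTEGER NOT NULL,
               matches    INTEGER NOT NULL,
               PRIMARY KEY (article_id, related_id)
           )"""
    )


def cache_related(conn, article_id, scored):
    """Run once when an article is added; scored is a list of (related_id, matches)."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM related_articles WHERE article_id = ?", (article_id,))
        conn.executemany(
            "INSERT INTO related_articles (article_id, related_id, matches) VALUES (?, ?, ?)",
            [(article_id, related_id, matches) for related_id, matches in scored],
        )


def get_related(conn, article_id, limit=5):
    """Cheap lookup at page-view time: no word counting, just a read of cached rows."""
    rows = conn.execute(
        "SELECT related_id FROM related_articles "
        "WHERE article_id = ? ORDER BY matches DESC, related_id LIMIT ?",
        (article_id, limit),
    )
    return [row[0] for row in rows]
```

The expensive comparison happens once per new article, and every page view afterwards is a single indexed SELECT.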