I'm trying to find the best way to show a user a list of suggested answers when they enter a question, the way Kayako does when a user enters text for a new support ticket.
Is there some magical way of matching accurately, or is it just a case of going through the user's text word by word, matching articles against each word, seeing which articles match the most, sorting in descending order of matches and then displaying them?
Yes, but the user is going to be entering a paragraph of text for their question and then it needs to display matching articles from the KB using an Ajax search. Surely searching for articles with every word wouldn't be accurate - how do we rate how much an article matches the question entered - like a relevance score with more matches = higher relevance?
Step 1) Strip out common words that don't help identify the article, i.e. and, but, I, we, are, etc.
Step 2) For each existing article, build an index of its words and how many times each word appears in the article.
Step 3) Do the same for the new article being listed (build an index of words and counts).
Step 4) Find all keywords in the new article that have a count greater than 3, then query for the previously submitted articles that contain those keywords.
Step 5) Sort by relevance: the old articles that matched the most keywords from the new article come first.
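Here is a rough Python sketch of steps 1 to 5, just to make the idea concrete. The stop word list, the function names and the `old_articles` dict (article id mapped to article text) are all made up for illustration. Note one assumption: it keeps any old article that matches at least one keyword instead of requiring all of them, otherwise there would be nothing left to rank in step 5.

```python
import re
from collections import Counter

# Step 1: a tiny, illustrative stop word list (a real one would be much longer).
STOP_WORDS = {"and", "but", "i", "we", "are", "the", "a", "an", "is", "to", "of", "in", "it"}


def word_counts(text):
    """Steps 2 and 3: index of word -> number of occurrences, stop words removed."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def keywords(text, min_count=3):
    """Step 4: words that appear more than min_count times in the text."""
    return {word for word, count in word_counts(text).items() if count > min_count}


def related_articles(new_text, old_articles, min_count=3):
    """Steps 4 and 5: return (article_id, matched_keywords) pairs, best match first.

    old_articles is assumed to be a dict of article id -> article text.
    """
    keys = keywords(new_text, min_count)
    scored = []
    for article_id, text in old_articles.items():
        index = word_counts(text)
        matched = sum(1 for key in keys if key in index)
        if matched:  # assumption: one matching keyword is enough to be listed
            scored.append((article_id, matched))
    # More matched keywords = higher relevance, so sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

In a real system you would build the index for the old articles once and store it, rather than recounting every article on each call as this sketch does.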
I personally would never do this with Ajax though... way too resource intensive. I would run this type of algorithm only when a new article is added, and cache the results in the database, in a new table containing a one-to-many relationship (article > related articles).
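Something like this is what I mean by caching, sketched with Python's built-in `sqlite3` module. The table name `related_articles`, its columns and the helper names are hypothetical, and `scored` is assumed to be a list of (related article id, score) pairs produced by whatever matching you run when the article is saved.

```python
import sqlite3


def create_schema(conn):
    # One row per (article, related article) pair: the one-to-many relationship.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS related_articles (
               article_id INTEGER NOT NULL,
               related_id INTEGER NOT NULL,
               score      INTEGER NOT NULL,
               PRIMARY KEY (article_id, related_id)
           )"""
    )


def cache_related(conn, article_id, scored):
    """Replace the cached matches for one article; run this when it is added or edited."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM related_articles WHERE article_id = ?", (article_id,))
        conn.executemany(
            "INSERT INTO related_articles (article_id, related_id, score) VALUES (?, ?, ?)",
            [(article_id, related_id, score) for related_id, score in scored],
        )


def cached_related(conn, article_id, limit=5):
    """Cheap lookup at display time: no word matching, just a read from the cache."""
    rows = conn.execute(
        "SELECT related_id, score FROM related_articles "
        "WHERE article_id = ? ORDER BY score DESC, related_id LIMIT ?",
        (article_id, limit),
    )
    return rows.fetchall()
```

The expensive matching then happens once per new article, and every page view only does the indexed read in `cached_related`.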