Nitrospirae is a bot that I wrote to crawl more than 150 RSS feeds and collect them on a single site.
Those feeds are mostly IT news, PHP information, and other news that I personally favor, but the list can of course be expanded to grab much more information from the internet.
Put simply, the purpose of this site is to gather the best of the best in one place and make it searchable. It can also act as a sort of IT "newspaper" that you can read every day to get a grasp of what's going on. You may find several articles on the same news item, written by several different sites.
Just a note: the comparison algorithm is very slow, O(n**3), where n := str.length. This means that a string 5 characters long takes about 125 [units], and the cost grows cubically from there. For comparison, a bubble sort has a complexity of O(n**2), and a quick sort has O(n * log(n)) [best and average case; its worst case is O(n**2)].
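To make the growth concrete, here is a small sketch that tabulates the three growth rates mentioned above for a few input lengths. The function name `estimated_units` and the sample lengths are made up for illustration, and constant factors are ignored, so the numbers are only relative "units", not timings.

```python
import math

def estimated_units(n):
    """Rough operation counts for an input of length n (constant factors ignored)."""
    return {
        "comparison, n**3": n ** 3,
        "bubble sort, n**2": n ** 2,
        "quick sort, n*log2(n)": n * math.log2(n),
    }

# Hypothetical lengths: a 5-character string, a headline, a short article.
for n in (5, 100, 5000):
    print(n, estimated_units(n))
```

For a 5,000-character article the cubic term is already 1.25 * 10**11 units, which is why running the comparison while a user waits is not practical.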
What I'm saying is that the comparison will take a _long_ time, so making users wait for an entire article to be scanned and then compared would be silly. I would suggest a cron job that runs every 1, 2, 6, 12, or 24 hours (depending on server load and processing power) to check for similar articles and flag or delete them.
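A minimal sketch of what such a batch job could look like, under several assumptions: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the comparison (this is not the bot's own algorithm), the 0.8 threshold is an arbitrary starting point, and the script name `dedupe.py` in the crontab comment is hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Example crontab entry (hypothetical script name), running every 6 hours:
#   0 */6 * * * python3 dedupe.py

SIMILARITY_THRESHOLD = 0.8  # assumed cut-off; tune against real articles

def find_similar(articles, threshold=SIMILARITY_THRESHOLD):
    """Return (id_a, id_b, ratio) for each pair of articles with similar text.

    `articles` maps an article id to its text.
    """
    flagged = []
    # Compare every unordered pair of articles exactly once.
    for (id_a, text_a), (id_b, text_b) in combinations(articles.items(), 2):
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            flagged.append((id_a, id_b, ratio))
    return flagged
```

The job only returns the flagged pairs; whether to mark them as duplicates or delete one of them is left to the caller, so the slow work happens off the request path and users never wait on it.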