TalkPHP

TalkPHP (http://www.talkphp.com/forums.php)
-   Advanced PHP Programming (http://www.talkphp.com/advanced-php-programming/)
-   -   Designing a tagging system (http://www.talkphp.com/advanced-php-programming/2444-designing-tagging-system.html)

Alan @ CIT 03-09-2008 09:11 PM

Designing a tagging system
 
Hi all,

I need some advice :-)

I want to implement a tagging system to an article script instead of the traditional category-based system.

For example, user writes their article then enters one or more tags "php, database, zend" for it. These tags are then used when searching to bring up relavent articles.

I have come up with three ideas for the design of the tagging system:

Idea 1

The articles table holds an additional field called tags where the comma-seperated list of tags are stored.

When saving tags we just update that articles tags column.

When searching for tags, a simple WHERE tags LIKE %$tag% query is used to get matching articles

When generating tag clouds, a query is run to fetch everything in the tags column then PHP tags over by splitting/merging/removing dupes from the tags array until we have our final list of tags.

Idea 2

Our articles table remains un-touched and we add a second table called tags This table has a structure similar to the following:

tag_id - An auto_increment id for this tag
tag - The tag itself (eg, "php")
articles - a space-seperated list of article_id's that use this tag (eg, "1 4 15 18 19 76")

When saving tags we would have to fetch the article_ids column and add/remove numbers as needed then do an SQL update on it.

When searching for tags we just do a simple search of the tags table to return article_id's - we then do another query to fetch those articles.

When generating tag clouds we just select the tags column and a count(article_ids) to generate the weighting, etc.

Idea 3

This one has three tables:

articles
tags
article_tags


Articles is just the regular un-touched articles table.

Tags would have the following structure:

tag_id - An auto_incrment id for this tag
tag - the tag itself

article_tags would have the following structure:

tag_id - links to the tag in the tags table
article_id - links to the article in the [inline]articlesp/inline] table

When saving tags, you just add a new tag to the tags/article_tags tables. When removing a tag you would just remove the row from the articles_tags table.

When searching for tags you would do a simple join query since all fields have a single value it would be simple.

When generating tag clouds you would just do a join on the tags table with a count(tag_id) for weighting

--------

Those are the options so far and I am leaning towards number 3 as it seems to be the most logical.

What do you guys thing? Anything I'm missing? Any other options you can think of?

Thanks,
Alan

ReSpawN 03-09-2008 09:23 PM

At first I'm leaning towards idea #1, but on second read, I'm gonna go with idea #3. The most logical one to me that is. If you search, you search on articles, but through the usage of tags. You can either search on the title (perhaps split by the whitespace) or on tags. Of course, a lot of other options can be implemented like date or author, but non of that now.

I think the most wise way to approach this is to indeed link them together.
Code:

---- TAGS Table
tag_id    |    int(15)      |    primairy, auto_increment, unique
article_id |    int(15)      |    unique    |    linked to an article
tags      |    varchar(255) |

---- ARTICLE Table
article_id |    int(15)      |    primairy, auto_increment, unique    |    used to link a tagtable
article_.. |    .......      |    ....

Since not every post has a tag, I think it is more logical to (on display) first retrieve the post, then check if there are tags (no? display a message perhaps) and of course, show the contents (and tags if available).
The process of searching is then rather simple. If the searchquery matches a tag from the tags table, it then shows every article and retrieves the information. Of course, linking them by ID to simply "SELECT article_id FROM articles WHERE article_id = $tagsQueryID".

I hope this gives you a bit of perspective.

dylanfm 03-09-2008 10:48 PM

Whenever I have built a tagging system I have gone with #3.

DeMo 03-10-2008 01:12 AM

Number 3 is indeed the most logical one and will make it easier to count tags and select what articles belong to what tags.

Inserting tags will be easy, you'll probably do an explode in the data, remove spaces and unwanted characters and then add each one to the DB.

What you really have to think about is tag maintenance. If somebody wants to edit the tags of his/her article, are you going to display each tag in a different form field and let the user edit/delete them individually? You could implode the tags back into a comma-seperated list and then you first delete all tags for that article and add them again. What I'm trying to point out is that #3 has it's advantages but it also makes tag maintenance a little more complicated than #1 for example. :-)

ReSpawN 03-10-2008 03:25 PM

The best way, like DeMo suggested, is to make it comma seperated , or perhaps with a more common way. Although I would suggest that, for the users sake, a comma and a space like in meta tags is the best way.

alan, cit, alancit, talkphp, respawn

First you would make the user enter their information. Then you would split it on the space-comma value and make the array filter out all sorts of invalid chars, like quotes and slashes. Do not add slashes, since I had difficulty with the search when I did.


All times are GMT. The time now is 01:28 AM.

Powered by vBulletin® Version 3.6.8
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.1.0