TalkPHP


eHobayyeb 08-18-2008 07:56 PM

Remove duplicates from text array?
 
Hi guys,

How can I remove duplicates from a text (.txt) file that contains a list of Arabic words? The words are written in the Arabic alphabet, not the Latin alphabet, and the file is UTF-8 encoded.

Please help me. I am working on a project to provide an Arabic spell checker for the hunspell project. It is under the GPL too.

Thanks,
Mohammad Alhobayyeb

delayedinsanity 08-18-2008 08:09 PM

Duplicate words?

PHP Code:

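// Read the file into an array, one line (word) per element, without trailing newlines.
$arr = file('file.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// array_unique() returns a new array; it does not modify $arr in place.
$arr = array_unique($arr);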

Then write it back to the file, or put it wherever suits you (in a database, in your pocket, in your car's cup holder, or in your wife's bra for safekeeping).
-m

Edit: By the way, this method assumes each word is on its own line. You may have to get a little more complex if they're all on the same line or delimited in a different way.
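
If the words are not one per line, one option might be to split the text on its delimiters before removing duplicates. The following is only a rough sketch: the function name `uniqueWordsFromText` is made up for illustration, and it assumes the words are separated by whitespace, Latin commas, or Arabic commas (U+060C). The `u` modifier tells PCRE to treat the text as UTF-8, which matters for Arabic.

```php
<?php
// Hypothetical helper (not from the thread): split UTF-8 text into words
// and drop duplicates. Assumes whitespace and/or commas as delimiters.
function uniqueWordsFromText(string $text): array
{
    // "u" = treat pattern and subject as UTF-8; \x{060C} is the Arabic comma.
    $words = preg_split('/[\s,\x{060C}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        // preg_split() returns false on error, e.g. invalid UTF-8 input.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // array_unique() keeps the first occurrence of each word;
    // array_values() reindexes the result from 0.
    return array_values(array_unique($words));
}

// Example: two of the five words are repeats.
$words = uniqueWordsFromText("كتاب قلم كتاب\nباب، قلم");
echo implode("\n", $words), "\n";
```

This still loads everything into memory at once, so it only suits inputs that fit comfortably within PHP's memory limit.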

eHobayyeb 08-19-2008 09:29 AM

Hi,

It works well, but here is the error I ran into:

Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 4097 bytes) in /var/www/test/test.php on line 8

Is there any way to check for duplicates in a file or in a MySQL table instead of in an array?
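
One way to stay closer to the memory limit might be to read the file line by line rather than loading it with file() and copying it with array_unique(). The sketch below is an assumption-laden illustration, not a tested fix for this exact file: the function name `dedupeLines` and the output path are hypothetical, and it assumes one word per line. It still remembers every distinct word once, so a very large word list may still need a higher `memory_limit` or the database route.

```php
<?php
// Hypothetical helper (not from the thread): copy $inPath to $outPath,
// writing each distinct line only the first time it is seen.
// Returns the number of lines written.
function dedupeLines(string $inPath, string $outPath): int
{
    $in = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file');
    }

    $seen = [];  // words already written, stored as array keys for fast lookup
    $kept = 0;

    // fgets() reads one line at a time, so the whole file is never in memory.
    while (($line = fgets($in)) !== false) {
        $word = rtrim($line, "\r\n");
        if ($word === '' || isset($seen[$word])) {
            continue;  // skip blank lines and words seen before
        }
        $seen[$word] = true;
        fwrite($out, $word . "\n");
        $kept++;
    }

    fclose($in);
    fclose($out);
    return $kept;
}

// Usage (paths are placeholders):
// $count = dedupeLines('file.txt', 'file.unique.txt');
```

Comparison here is byte-for-byte, so two spellings that look alike but use different Unicode code points would be treated as different words.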

eHobayyeb 08-20-2008 05:27 AM

Up!up!up!up!

buggabill 08-20-2008 12:41 PM

In MySQL you could write a query like the following:

sql Code:
SELECT fieldwithdupes, COUNT(fieldwithdupes) AS NumOccurrences
FROM atable
WHERE fieldwithdupes IN (
    SELECT fieldwithdupes
    FROM atable AS tmp
    GROUP BY fieldwithdupes
    HAVING COUNT(*) > 1
)
GROUP BY fieldwithdupes;

This will get you a list of the `fieldwithdupes` values in `atable` that appear more than once, along with the number of times each one is repeated. You could adapt this query to your needs.

xenon 08-20-2008 05:43 PM

I think he wants to select only the distinct words. In which case, this query would work perfectly:

Code:

SELECT DISTINCT word FROM table_name;
...assuming that you have each word on its own row.


All times are GMT.