TalkPHP


oMIKEo 06-03-2008 07:58 PM

Importing text
 
Hi,

I'm importing a lot of text that contains characters such as:

“ and ” and ’

These are not understood in IE/FF and show up as either squares or diamond question marks.

Any ideas how I can dynamically remove them just before echoing out the text?

Regards,
Michael

xenon 06-03-2008 09:56 PM

You could call htmlentities on that piece of text and see if that works. Otherwise, use a multibyte encoding.
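
A minimal sketch of the htmlentities suggestion, assuming the imported text is UTF-8 (the thread never says which encoding it is in). The sample string and variable names are made up for illustration. The third argument names the input encoding; if it does not match the actual bytes, htmlentities may leave the quotes untouched or return an empty string, which can look like it did nothing.

```php
<?php
// Assumption: the text is UTF-8. These bytes spell “Hello” and it’s.
$text = "\xE2\x80\x9CHello\xE2\x80\x9D and it\xE2\x80\x99s";

// Pass the input encoding explicitly so the curly quotes are recognised.
$encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n"; // &ldquo;Hello&rdquo; and it&rsquo;s
```

If the text turns out to be Windows-1252 instead, the same call with 'Windows-1252' as the third argument is the variant to try.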

Jim 06-04-2008 07:28 AM

htmlentities works, I thought, but (in my case) it gives errors in the W3C validator. I recommend replacing them with " and ', which is simply what I do :>

Folio 06-04-2008 08:17 AM

I often encounter this problem when clients post a chunk of text from Word.
If you do a find and replace in Word, it will often put back what was originally there.
What I've started doing is a find and replace in Notepad. Simple, but it works better.
Also, htmlentities shouldn't hinder your validation; it simply alters text. Maybe you have HTML tags in the text?
nl2br also helps by replacing "new lines" with "break tags".

hope that helps
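
A short sketch of combining escaping with nl2br, assuming UTF-8 input. The sample text is made up. Unescaped characters such as & and < are a common cause of validation errors, so the text is escaped first and the break tags are added afterwards.

```php
<?php
// Sample text, made up for illustration.
$text = "Tom & Jerry\nSecond line";

// Escape first, then add break tags; the other order would escape the <br /> tags too.
$html = nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));

echo $html, "\n";
```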

oMIKEo 06-09-2008 08:37 PM

I have tried the following but it doesn't seem to do anything at all:

PHP Code:

$biog = str_replace("’", "'", $biog);
$biog = str_replace("‘", "'", $biog);
$biog = str_replace("“", "\"", $biog);
$biog = str_replace("”", "\"", $biog);

I couldn't work out what options to use with htmlentities, as it always seemed to do nothing.

There is too much text to go through and correct by hand.

Thanks for any more advice.
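
One possible reason the str_replace calls appear to do nothing, an assumption since the thread does not show the data: the curly quotes typed into the PHP file are stored in that file's encoding, while the imported text may use a different one, so the byte sequences never match. A sketch that spells the bytes out for both UTF-8 and Windows-1252 (the helper name straighten_quotes is hypothetical):

```php
<?php
// Hypothetical helper, not from the thread: turns curly quotes into plain ASCII quotes.
function straighten_quotes(string $text): string
{
    // preg_match() with the /u flag only succeeds on valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        $map = [
            "\xE2\x80\x98" => "'",  // ‘ U+2018
            "\xE2\x80\x99" => "'",  // ’ U+2019
            "\xE2\x80\x9C" => '"',  // “ U+201C
            "\xE2\x80\x9D" => '"',  // ” U+201D
        ];
    } else {
        // Assumption: anything that is not valid UTF-8 is Windows-1252.
        $map = [
            "\x91" => "'",
            "\x92" => "'",
            "\x93" => '"',
            "\x94" => '"',
        ];
    }

    return strtr($text, $map);
}

echo straighten_quotes("\x93Quoted\x94 and it\x92s"), "\n";
```

The two maps are kept separate because the single bytes 0x91 to 0x94 also occur inside multi-byte UTF-8 characters, and replacing them there would corrupt the text.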

Village Idiot 06-09-2008 09:34 PM

You could probably fix this error by setting your HTML encoding type to ISO-8859-1 (western). You are most likely on UTF-8, because when I change the page encoding to that, those question marks pop up.

oMIKEo 06-09-2008 10:09 PM

I changed it to this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

But they still come up as squares?
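
A sketch of another route, assuming the imported text is Windows-1252, as text pasted from Word often is. Curly quotes have no code points in ISO-8859-1 itself; Windows-1252 places them at bytes 0x91 to 0x94. An HTTP Content-Type header also takes precedence over a META tag, so the tag alone may not change what the browser uses. The helper name cp1252_to_utf8 and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: convert Windows-1252 bytes to UTF-8.
function cp1252_to_utf8(string $raw): string
{
    $converted = iconv('Windows-1252', 'UTF-8', $raw);
    // iconv() returns false on failure; fall back to the input unchanged.
    return $converted === false ? $raw : $converted;
}

$biog = "\x93Quoted\x94 and it\x92s"; // sample Windows-1252 bytes
$utf8 = cp1252_to_utf8($biog);

// Declare the encoding in the HTTP header, before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n";
```

Converting once at import time and serving everything as UTF-8 keeps the stored text and the declared charset in agreement.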


All times are GMT.