TalkPHP
 
 
02-22-2008, 09:26 AM   #1
Devels
Encoding (utf8) problems

I always have trouble with UTF-8.

The scenario: multiple sources (XML, CSV, our own CMS) and one database, MySQL. The information needs to be displayed on a website or output as XML to another program.

Should I store the text as it is? Are special settings (collation?) needed in the MySQL database, and which PHP functions should be used?
To display the text on a web page I send a UTF-8 header from PHP or use a meta tag that declares UTF-8. Still, special characters are displayed incorrectly, or the XML is broken because of an invalid character.

What is the best consistent way to avoid these problems, or does someone know some really good reading on this topic?
02-22-2008, 06:53 PM   #2
bluesaga

Hi Devels,

UTF-8 problems are very common, and there are a few things you can do to get around the issues and irregularities.

1) Use the PHP DOM for XML. XML is treated as UTF-8 unless it declares another encoding, and the DOM extension works with UTF-8 strings, so as long as you feed it valid UTF-8 the XML side is taken care of.
2) In your PHP settings, make sure input and output are set to use UTF-8 (for example default_charset and the mbstring encoding settings).
3) In MySQL, set your database to use a UTF-8 collation such as utf8_general_ci.
4) If you are scraping data from the web, i.e. from other websites, you will need to do some detection and conversion. iconv and mbstring are the two extensions to look into for that. First check the HTTP header for the charset (from the source server), then the HTML for meta tags or some other indicator. Failing that, there is mb_detect_encoding(), which can be used but is slow and pretty crummy at detection!
5) Install mbstring; functions like strlen() can then be replaced with mb_strlen() and friends to be UTF-8 aware (until PHP 6 arrives!)
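
Points 1, 4 and 5 can be sketched in PHP roughly as follows. This is only a minimal sketch, not code from the thread: toUtf8() is a hypothetical helper, the Latin-1 fallback is an assumption about the incoming data, and the mbstring and DOM extensions are assumed to be installed.

```php
<?php
// Sketch only: toUtf8() is a hypothetical helper name, not from the thread.

// Point 4: convert text from an external source to UTF-8.
// $declared is the charset taken from the HTTP header or meta tag, if any.
function toUtf8(string $text, ?string $declared = null): string
{
    if ($declared !== null) {
        // Trust the charset the source declared.
        return mb_convert_encoding($text, 'UTF-8', $declared);
    }
    if (mb_check_encoding($text, 'UTF-8')) {
        // Already valid UTF-8 (plain ASCII also passes this check).
        return $text;
    }
    // Last resort: ASSUME Latin-1. This is a guess, not real detection.
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// Point 5: strlen() counts bytes, mb_strlen() counts characters.
$word  = "caf\xC3\xA9";               // "café" in UTF-8; the é takes two bytes
$bytes = strlen($word);               // byte count
$chars = mb_strlen($word, 'UTF-8');   // character count

// Point 1: build XML through the DOM instead of concatenating strings.
// The text node is escaped by the DOM; the input must already be UTF-8.
$doc  = new DOMDocument('1.0', 'UTF-8');
$item = $doc->createElement('item');
$item->appendChild($doc->createTextNode(toUtf8("caf\xE9 & bar", 'ISO-8859-1')));
$doc->appendChild($item);
$xml = $doc->saveXML();
```

Converting once, at the point where data enters the system, means everything stored and output afterwards can be treated as UTF-8 without further guessing.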

Hope that helps you!
02-22-2008, 10:24 PM   #3
Devels

Great info!
I found a great article that explains exactly my problem: "Turning MySQL data in latin1 to utf8 utf-8" (O'Reilly ONLamp Blog).
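
The article itself is not reproduced here, but the kind of latin1/utf8 mix-up its title refers to can be illustrated with a small sketch. It assumes plain ISO-8859-1 (MySQL's latin1 is closer to Windows-1252, so real data may need that charset name instead), and useUtf8Connection() is a hypothetical helper that is defined but not called.

```php
<?php
// Sketch only: shows how "é" can turn into "Ã©" and how to undo it once.

$original = "caf\xC3\xA9";   // "café" in UTF-8

// Mistake: the UTF-8 bytes are treated as Latin-1 and converted to UTF-8 again.
$doubled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');
// $doubled now displays as "cafÃ©" (bytes 63 61 66 C3 83 C2 A9).

// Repair: convert back from UTF-8 to Latin-1 exactly once.
$repaired = mb_convert_encoding($doubled, 'ISO-8859-1', 'UTF-8');

// Hypothetical helper: tell MySQL that this connection speaks utf8, so the
// server does not reinterpret the bytes on the way in or out.
function useUtf8Connection(mysqli $db): bool
{
    return $db->set_charset('utf8');
}
```

The repair step is only safe on data that is known to be double-encoded; running it on correct UTF-8 would damage it.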
02-24-2008, 03:19 PM   #4
Morishani

Thanks bluesaga.
All times are GMT.

 
     
