I often encounter this problem when clients post a chunk of text from Word.
If you do a find and replace in Word, it will often re-replace the characters with what was originally there.
What I've started doing is a find and replace within Notepad... simple, but it works better.
Also, htmlentities shouldn't hinder your validation... it simply alters text. Maybe you have HTML tags in the text?
nl2br also helps: it inserts a "break tag" before each "new line" in the text (the new lines themselves are kept).
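A minimal sketch of how the two functions might be combined. The sample string and the variable names are made up for illustration, and the 'UTF-8' argument is an assumption about your page encoding:

```php
<?php
// Hypothetical input standing in for text pasted by a client.
$text = "Tom & Jerry <b>bold</b>\nSecond line";

// Escape first, so any tags in the text become harmless entities,
// then let nl2br() insert <br /> before each new line.
$safe = nl2br(htmlentities($text, ENT_QUOTES, 'UTF-8'));

echo $safe;
```

Doing it in this order means the only real tags left in the output are the break tags nl2br adds, which should keep the markup easier to validate.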
You could probably fix this error by setting your page's character encoding to ISO-8859-1 (Western). You are most likely on UTF-8, because when I change the page encoding to that, those question marks pop up.
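Here is a rough sketch of both ways to go about it. It assumes the pasted text is single-byte Windows-1252 (which Word text often is, but check yours) and that the mbstring extension is available; the sample string is made up:

```php
<?php
// Assumption: the text is Windows-1252, where \x92 is a curly apostrophe
// and \x93 / \x94 are curly double quotes.
$raw = "It\x92s a \x93quoted\x94 word";

// Option A: leave the bytes alone and declare a matching single-byte charset.
// header('Content-Type: text/html; charset=ISO-8859-1');

// Option B: keep the page on UTF-8 and convert the text instead.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

echo $utf8;
```

Option A is the change suggested above. Option B is an alternative if you would rather not move the whole page off UTF-8.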