all about parsing
Hi, where can I learn all about webpage parsing?
For example, suppose I had a script that went to dictionary.com, grabbed a word and its definition, and stored them in a database. Something like that. |
Something I've been wanting to look into myself.
|
Look up the cURL extension (libcurl) at php.net/curl and look into regex; that should get you on your way.
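
To make that concrete, here is a rough sketch of the cURL plus regex approach. The `<span class="def">` markup and both function names are made up for illustration; the real page will use different markup, so the pattern would need adjusting after looking at its source.

```php
<?php
// Fetch a page body with cURL; returns null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 10,   // give up after 10 seconds
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Pull the first definition out of the HTML with a regular expression.
// ASSUMPTION: the definition sits in <span class="def">...</span>.
function extract_definition_regex(string $html): ?string
{
    // ~...~s : the "s" flag lets "." match newlines; ".*?" is non-greedy.
    if (preg_match('~<span class="def">(.*?)</span>~s', $html, $m)) {
        return trim(html_entity_decode(strip_tags($m[1])));
    }
    return null;
}
```

You would call `fetch_page()` with the URL, pass the result to `extract_definition_regex()`, and then store whatever comes back in your database.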
|
Using regular expressions isn't necessarily the best, or easiest, method of grabbing content from HTML pages. Another alternative might be to use the DOM classes, perhaps in conjunction with XPath, to grab the required parts of the HTML documents.
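
As a minimal sketch of the DOM and XPath route, assuming (hypothetically) that the definition lives in a `<span class="def">` element; the function name is also just for illustration:

```php
<?php
// Extract the first definition using DOMDocument and DOMXPath.
// ASSUMPTION: the definition sits in <span class="def">...</span>.
function extract_definition_xpath(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//span[@class="def"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // textContent returns the text with tags removed and entities decoded.
    return trim($nodes->item(0)->textContent);
}
```

Because `loadHTML()` tolerates sloppy markup, this tends to survive small layout changes better than a regex tied to exact tag text.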
|
I find SimpleXML very easy.
|
I wouldn't use XML for parsing webpages to fetch a word like that: if there's an XML parse error, the script fails and your problem goes unsolved.
Personally, I would use the string functions or regular expressions to match the markup where the data I want lives. I've done this quite a few times when I needed to fetch my online status from variously skinned forums. But in the end it depends on what you want to grab ;) |
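
The string-function approach mentioned above might look something like this. The helper name `text_between` and the marker strings in the usage note are hypothetical; you would pick markers from the actual page source.

```php
<?php
// Return the text between the first $start marker and the next $end marker,
// or null if either marker is missing.
function text_between(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start); // move past the start marker

    $to = strpos($html, $end, $from); // search only after the start marker
    if ($to === false) {
        return null;
    }
    return trim(substr($html, $from, $to - $from));
}
```

For example, `text_between($html, '<td class="status">', '</td>')` would grab the contents of a status cell, as long as the markers match the page exactly.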