Parsing HTML
Hi guys,
I don't even know where to start with this one. I have about a hundred HTML directory pages, and I want to convert them into a MySQL database. All of the listings are enclosed in <li> and <hr> tags, so I think this should be fairly easy. The hard part will probably be extracting the information from each listing. I found lots of resources for scraping links, but I'm not experienced enough to adapt them to my application. This is where I'm at so far:
Code:
$url = "#";
Take a look at the PHP Simple HTML DOM Parser:
http://simplehtmldom.sourceforge.net/
It makes extracting information from HTML pages very easy.
Thanks, I already looked into that, but I still don't know how to make it work with what I have. How can I get it to start parsing at the <li> tag, stop at the <hr> tag, and then repeat?
So would it work to just get the node values, for example with DOMDocument?
PHP Code:
Cheers
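For reference, a minimal sketch of reading node values with PHP's built-in DOMDocument. The sample markup and the $values variable are made up for illustration, since the real pages were not posted; treat it as a starting point rather than a finished scraper.

```php
<?php
// Hypothetical sample markup; the real directory pages were not posted.
$html = '<ul><li>First listing</li><li>Second listing</li></ul>';

$doc = new DOMDocument();
// Real-world pages are rarely valid HTML, so collect parser
// warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// nodeValue returns the text of the element and everything inside it.
$values = [];
foreach ($doc->getElementsByTagName('li') as $li) {
    $values[] = trim($li->nodeValue);
}

print_r($values);
```

This returns the whole text of each <li> in one string, so it works best when every listing sits in its own properly closed <li> element.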
Thank you very much guys, but I still don't know how to get it to stop at the <hr> and start again at the next <li>.
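One possible way to stop at each <hr> and start again at the next <li> is to split the raw page text on the <hr> tags before touching the contents. With unclosed <li> tags a DOM parser may nest the <hr> inside the <li>, so plain string splitting can be more predictable here. The sketch below assumes fields are separated by <br>; the sample markup and the extractListings() helper are hypothetical.

```php
<?php
// Hypothetical sample markup; the real listing structure was not shown.
$html = <<<HTML
<ul>
<li><b>Acme Plumbing</b><br>123 Main St<br>Phone: 555-0100
<hr>
<li><b>Bob's Bakery</b><br>Phone: 555-0199
<hr>
</ul>
HTML;

// Hypothetical helper: returns one array of text lines per listing.
function extractListings(string $html): array
{
    $listings = [];
    // Every <hr> ends a listing, so split the page there.
    foreach (preg_split('/<hr\b[^>]*>/i', $html) as $chunk) {
        // Find the <li> that opens the listing; skip chunks without one.
        if (!preg_match('/<li\b/i', $chunk, $m, PREG_OFFSET_CAPTURE)) {
            continue;
        }
        $chunk = substr($chunk, $m[0][1]);
        // Turn <br> into newlines so each field lands on its own line.
        $chunk = preg_replace('/<br\s*\/?>/i', "\n", $chunk);
        $text  = html_entity_decode(strip_tags($chunk), ENT_QUOTES, 'UTF-8');
        // Keep only trimmed, non-empty lines.
        $lines = array_values(array_filter(
            array_map('trim', explode("\n", $text)),
            fn($line) => $line !== ''
        ));
        if ($lines) {
            $listings[] = $lines;
        }
    }
    return $listings;
}

print_r(extractListings($html));
```

Each listing comes back as a list of lines, which can then be mapped to database columns one listing at a time.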
Okay, this is how they're all structured. This one has all of the information; some of the others are missing parts.
Code:
If it is impossible, then I guess I could just copy each listing's entire HTML into the database as-is, and not give the old users an edit feature.
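Missing parts do not have to rule out structured import. A rough sketch, assuming each listing has already been reduced to an array of text lines and that some fields carry a label such as "Phone:" or "Email:". Those labels, the column names, and the listingToRow() helper are all guesses for illustration, since the actual listing markup is not visible here.

```php
<?php
// Hypothetical helper: maps the text lines of one listing to a row.
// The "Phone:" and "Email:" labels are assumptions, not taken from
// the real pages; adjust the pattern to the labels actually used.
function listingToRow(array $lines): array
{
    // Start with every column empty so missing parts simply stay null.
    $row = ['name' => null, 'phone' => null, 'email' => null, 'other' => []];

    foreach ($lines as $i => $line) {
        if (preg_match('/^(Phone|Email):\s*(.+)$/i', $line, $m)) {
            // Labelled field: store it under the matching column.
            $row[strtolower($m[1])] = $m[2];
        } elseif ($i === 0) {
            // Assume an unlabelled first line is the listing name.
            $row['name'] = $line;
        } else {
            // Keep unrecognised lines rather than lose them.
            $row['other'][] = $line;
        }
    }
    return $row;
}

$row = listingToRow(["Bob's Bakery", '12 High St', 'Phone: 555-0199']);
print_r($row);
```

Columns that a listing lacks stay null, so they can be inserted into nullable MySQL columns, and anything unrecognised is kept in 'other' for manual review.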