08-08-2009, 08:55 AM
Hi guys, I have the following code working:
Code:
<?php
$site = "http://www.techcrunch.com/2009/08/07/geopolitical-attacks-on-twitter-intensified-almost-tenfold-last-night/";
$html = file_get_contents($site);
$dom = new DOMDocument();
libxml_use_internal_errors(TRUE); // Shhhut up!
$dom->loadHTML($html);
libxml_use_internal_errors(FALSE); // Ok, you can complain now.
$xpath = new DOMXPath($dom);
$as = $xpath->evaluate("/html/body//a");
foreach ($as as $a)
{
    echo '<br />' . $a->getAttribute('href');
}
?>
As you will see when you run this script, there are links like #comments.
What I would like to do is DELETE any links that come out starting with #.
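One possible way to skip those links, as a rough sketch only: check each href before printing it. The helper name keepHref() and the inline sample HTML are made up for illustration; in the real script $html would still come from file_get_contents($site).

```php
<?php
// Hypothetical helper (not part of the original script): returns true for
// hrefs worth keeping, i.e. non-empty and not starting with '#'.
function keepHref($href)
{
    $href = trim($href);
    return $href !== '' && $href[0] !== '#';
}

// Small inline sample so the sketch runs without fetching a page.
$html = '<html><body>'
      . '<a href="#comments">comments</a>'
      . '<a href="http://example.com/">example</a>'
      . '<a href="/about">about</a>'
      . '</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // suppress warnings about sloppy markup
$dom->loadHTML($html);
libxml_use_internal_errors(false);

$xpath = new DOMXPath($dom);
$links = array();
foreach ($xpath->query('/html/body//a') as $a) {
    // getAttribute() returns '' when there is no href, so those are skipped too
    $href = $a->getAttribute('href');
    if (keepHref($href)) {
        $links[] = $href;
    }
}

foreach ($links as $href) {
    echo '<br />' . $href;
}
```

The same filtering could probably be pushed into the XPath expression itself with something like //a[not(starts-with(@href, '#'))], but the PHP-side check is easier to extend if other kinds of links need dropping later.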
Also, is there any way to send data identifying my script to a site? When Google spiders your site it shows up as GoogleSpider; is there any way to show mySpider?
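For the spider name, what a site normally sees is the User-Agent request header, and file_get_contents() accepts a stream context that can set it. A minimal sketch, assuming the plain http wrapper is used; the "mySpider/1.0" string, the info URL, and the fetchPage() wrapper are placeholders, not real identifiers:

```php
<?php
// Placeholder identification string; the URL in it is only an example.
$userAgent = 'mySpider/1.0 (+http://www.example.com/spider-info)';

// HTTP context options are honoured by file_get_contents() for
// http:// and https:// URLs.
$context = stream_context_create(array(
    'http' => array(
        'user_agent' => $userAgent,
        'timeout'    => 10, // seconds to wait before giving up
    ),
));

// Hypothetical wrapper: fetches a page using the custom User-Agent and
// returns the body as a string, or false on failure.
function fetchPage($url, $context)
{
    return @file_get_contents($url, false, $context);
}

// Usage (needs network access, so it is left commented out here):
// $html = fetchPage('http://www.techcrunch.com/', $context);
```

Whether the name actually shows up anywhere depends on the remote site logging the User-Agent header, which most web servers do by default. Calling ini_set('user_agent', ...) before the request should have a similar effect for the whole script.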