Re: Keith Devens - Weblog: RSS auto-discovery with PHP - June 03, 2002

A sample of RSS auto-discovery in PHP.



I'm in the process of writing my own RSS aggregator.

Keith Devens - Weblog: RSS auto-discovery with PHP - June 03, 2002





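However, when I tested it, the way the path returned by parse_url was split looked suspect (when the URI ends in a directory name), so I reworked it as shown below.

(The code below is already in use on the weblog update ping server for iblog.)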
// RSS auto-discovery
function getRSSLocation($html, $location){
    if(!$html or !$location){
        return false;
    }else{
        #search through the HTML, save all <link> tags
        # and store each link's attributes in an associative array
        preg_match_all('/<link\s+(.*?)\s*\/?>/si', $html, $matches);
        $links = $matches[1];
        $final_links = array();
        $link_count = count($links);
        for($n=0; $n<$link_count; $n++){
            $final_link = array(); #start fresh so attributes don't carry over from the previous <link>
            $attributes = preg_split('/\s+/s', $links[$n]);
            foreach($attributes as $attribute){
                $att = preg_split('/\s*=\s*/s', $attribute, 2);
                if(isset($att[1])){
                    $att[1] = preg_replace('/([`\'"]?)(.*)\1/', '$2', $att[1]);
                    $final_link[strtolower($att[0])] = $att[1];
                }
            }
            $final_links[$n] = $final_link;
        }
        #now figure out which one points to the RSS file
        for($n=0; $n<$link_count; $n++){
            $link = $final_links[$n];
            #missing attributes are treated as empty strings
            $rel  = isset($link['rel'])  ? strtolower($link['rel'])  : '';
            $type = isset($link['type']) ? strtolower($link['type']) : '';
            $href = false;
            if($rel == 'alternate' and isset($link['href'])){
                if($type == 'application/rss+xml'){
                    $href = $link['href'];
                }
                if(!$href and $type == 'text/xml'){
                    #kludge to make the first version of this still work
                    $href = $link['href'];
                }
                if($href){
                    if(preg_match('#^https?://#i', $href)){ #if it's absolute
                        $full_url = $href;
                    }else{ #otherwise, 'absolutize' it
                        $url_parts = parse_url($location);
                        #reuse the page's scheme, falling back to http
                        $scheme = isset($url_parts['scheme']) ? $url_parts['scheme'] : 'http';
                        $full_url = $scheme . '://' . $url_parts['host'];
                        if(isset($url_parts['port'])){
                            $full_url .= ':' . $url_parts['port'];
                        }
                        if($href[0] != '/'){ #it's a relative link on the domain
                            #keep every path segment except the last one, so a location
                            # that ends in a directory name (trailing slash) resolves correctly
                            $path = isset($url_parts['path']) ? $url_parts['path'] : '/';
                            $dir = explode('/', $path);
                            $count = count($dir);
                            for($i=0; $i<$count-1; $i++){
                                $full_url .= $dir[$i] . '/';
                            }
                        }
                        $full_url .= $href;
                    }
                    return $full_url;
                }
            }
        }
        return false;
    }
}
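
To make the directory-name case mentioned above a little more concrete, here is a minimal sketch of how the path part of the page URL can be reduced to its directory before a relative href is appended. The helper name baseDirectory and the example.com URLs are my own placeholders for illustration; they are not part of the original function.

```php
<?php
// Hypothetical helper (not in the original code): return the directory
// portion of a URL path, always ending in "/".
function baseDirectory($path)
{
    if ($path === null || $path === '') {
        return '/';              // URL had no path at all, e.g. http://example.com
    }
    $dir = explode('/', $path);
    array_pop($dir);             // drop the last segment (a file name, or "" after a trailing slash)
    return implode('/', $dir) . '/';
}

// example.com is a placeholder host used only for this demonstration.
$samples = array(
    'http://example.com/blog/index.html',
    'http://example.com/blog/',
    'http://example.com/blog',
    'http://example.com',
);
foreach ($samples as $url) {
    $parts = parse_url($url);
    // parse_url() omits the 'path' key when the URL has no path
    $path = isset($parts['path']) ? $parts['path'] : '';
    echo $url, ' => ', baseDirectory($path), "feed.xml\n";
}
```

The first two URLs both give /blog/feed.xml, while the last two give /feed.xml. A URL such as http://example.com/blog with no trailing slash is treated as a file named blog at the root, which I believe matches how browsers resolve relative links, so the page location passed in should be the final URL after any redirect to the trailing-slash form.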
