Web Link List Example In Ruby
WebLinkListExample in RubyLanguage, using RegularExpression matching (because I'm not aware of anything in the standard library for parsing HTML):
#!/usr/bin/env ruby
require 'net/http'
require 'uri'
# usage: linklister.rb url
# example: linklister.rb http://www.google.com/
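url = ARGV[0] || 'http://www.google.com/'
parsed_url = URI.parse(url)
# Pass the whole URI so that an empty path or a query string is handled.
page = Net::HTTP.get(parsed_url)
# Three alternatives: double-quoted, single-quoted, and unquoted href values.
nested_hrefs = page.scan(/href\s*=\s*"([^"]*)|href\s*=\s*'([^']*)|href\s*=\s*([^\s"'>]+)/i)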
puts "*** hrefs in #{url} ***"
nested_hrefs.flatten.each do |href|
  puts href if href
end
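To see why the scan result has to be flattened and stripped of nils, here is a minimal offline sketch that runs the same pattern over a literal string instead of a fetched page. The names HREF_PATTERN, extract_hrefs and sample are made up for this illustration and are not part of the script above.

```ruby
# Hypothetical helper: the script's pattern, wrapped in a method.
# Alternatives, in order: double-quoted, single-quoted, unquoted values.
HREF_PATTERN = /href\s*=\s*"([^"]*)|href\s*=\s*'([^']*)|href\s*=\s*([^\s"'>]+)/i

def extract_hrefs(html)
  # scan yields one [double, single, bare] triple per match;
  # only the slot for the alternative that matched is non-nil.
  html.scan(HREF_PATTERN).flatten.compact
end

# Made-up sample covering all three quoting styles.
sample = %q{<a href="/a">A</a> <a HREF='b.html'>B</a> <a href=c.html>C</a>}
puts extract_hrefs(sample)
```

Using compact does the same job as the "if href" guard in the loop. Being a regular expression, the pattern will also pick up href-like text inside comments or scripts, so treat the result as approximate.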
Output on 2005-12-30:
*** hrefs in http://www.google.com/ ***
/url?sa=p&pref=ig&pval=2&q=http://www.google.com/ig%3Fhl%3Den
https://www.google.com/accounts/Login?continue=http://www.google.com/&hl=en
/imghp?hl=en&tab=wi&ie=UTF-8
http://groups.google.com/grphp?hl=en&tab=wg&ie=UTF-8
http://news.google.com/nwshp?hl=en&tab=wn&ie=UTF-8
http://froogle.google.com/frghp?hl=en&tab=wf&ie=UTF-8
/lochp?hl=en&tab=wl&ie=UTF-8
/intl/en/options/
/advanced_search?hl=en
/preferences?hl=en
/language_tools?hl=en
/ads/
/services/
/intl/en/about.html
-- ElizabethWiethoff
(last edited December 30, 2005)