Issue with the gsub method in my Ruby code when trying to replace HTML anchor tags with the URL they contain


Tags: html, ruby, html-parsing, nokogiri, gsub


Because a Nokogiri::XML::Element is neither a String nor a Regexp, and gsub expects one of those as its pattern. Calling .to_s on the element works:

puts message.gsub(

However, you are going to all the trouble of parsing the HTML only to search the raw string again, as if you knew nothing about its structure. This also gives a wrong result if a message contains multiple links, or if an anchor tag is not formatted canonically, e.g. with an extra space like this: <a href="https://www.google.com" >https://www.google.com</a>
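To make both points concrete without depending on the exact code from the question, here is a minimal sketch. `FakeAnchor` is a hypothetical stand-in, not a Nokogiri class: like a parsed element, it is neither a String nor a Regexp but does respond to `to_s`. The sample messages are also assumptions.

```ruby
# Hypothetical stand-in for a parsed element: it is not a String or a Regexp,
# but to_s serializes it to canonical markup.
FakeAnchor = Struct.new(:href) do
  def to_s
    %(<a href="#{href}">#{href}</a>)
  end
end

link = FakeAnchor.new("https://www.google.com")
message = %(See <a href="https://www.google.com">https://www.google.com</a> now)

# Passing the object itself fails: gsub wants a String or Regexp pattern.
type_error = nil
begin
  message.gsub(link, link.href)
rescue TypeError => e
  type_error = e
end

# Serializing first gives gsub a plain String, which it matches literally.
replaced = message.gsub(link.to_s, link.href)

# The literal match is fragile: one extra space in the tag and nothing matches.
spaced = %(See <a href="https://www.google.com" >https://www.google.com</a> now)
unchanged = spaced.gsub(link.to_s, link.href)

puts replaced
puts unchanged
```

The second `puts` prints the message untouched, because the serialized form no longer matches the source text character for character.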

Why not let Nokogiri work?

puts Nokogiri::HTML.fragment(message).tap { |doc|
  doc.css("a").each { |node| node.replace(node["href"]) }
}

Note that I switched to Nokogiri::HTML.fragment, since this is not a full HTML document (with doctype and all), which Nokogiri would otherwise feel obligated to add. Then each anchor node is replaced with the value of its href attribute.
