class Hpricot::Elements
Once you've matched a list of elements, you will often need to handle them as a group. Or you may want to perform the same action on each of them. Hpricot::Elements is an extension of Ruby's Array class, with some methods added for altering the elements contained in the array.
If you need to create an element array from regular elements:
Hpricot::Elements[ele1, ele2, ele3]
This assumes that ele1, ele2 and ele3 contain element objects (Hpricot::Elem, Hpricot::Doc, etc.).
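Because Hpricot::Elements extends Array, the bracket constructor is the one inherited from Array. The following is a minimal sketch of that mechanism in plain Ruby. NodeList and its names method are hypothetical stand-ins, not part of Hpricot, and plain strings stand in for element objects.

```ruby
# NodeList is a hypothetical stand-in for Hpricot::Elements.
class NodeList < Array
  # An extra method layered on top of Array, in the spirit of Elements.
  def names
    map { |node| node.to_s }.join(", ")
  end
end

# Array.[] is inherited, so it builds an instance of the subclass.
list = NodeList["ele1", "ele2", "ele3"]
puts list.class  # NodeList
puts list.names  # ele1, ele2, ele3
```

The ordinary Array methods (each, map, first and so on) remain available on the subclass alongside the added ones.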
Continuing Searches
Usually the Hpricot::Elements you're working on comes from a search you've done. Well, you can continue searching the list by using the same at and search methods you can use on plain elements.
  elements = doc.search("/div/p")
  elements = elements.search("/a[@href='http://hoodwink.d/']")
  elements = elements.at("img")
Altering Elements
When you're altering elements in the list, your changes will be reflected in the document you started searching from.
  doc = Hpricot("That's my <b>spoon</b>, Tyler.")
  doc.at("b").swap("<i>fork</i>")
  doc.to_html #=> "That's my <i>fork</i>, Tyler."
Getting More Detailed
If you can't find a method here that does what you need, you may need to loop through the elements and find a method in Hpricot::Container::Trav which can do what you need.
For example, you may want to search for all the H3 header tags in a document and grab all the tags underneath the header, but not inside the header. A good method for this is next_sibling:
  ary = []
  doc.search("h3").each do |h3|
    ele = h3
    # advance from the header itself so the loop terminates
    while ele = ele.next_sibling
      ary << ele # stuff away all the elements under the h3
    end
  end
Most of the useful element methods are in the mixins Hpricot::Traverse and Hpricot::Container::Trav.
Constants
- ATTR_RE
- BRACK_RE
- CATCH_RE
- CUST_RE
- FUNC_RE
Public Class Methods
Given two elements, attempt to gather an Elements array of everything between (and including) those two elements.
  # File lib/hpricot/elements.rb, line 319
  def self.expand(ele1, ele2, excl=false)
    ary = []
    offset = excl ? -1 : 0
    if ele1 and ele2
      # let's quickly take care of siblings
      if ele1.parent == ele2.parent
        ary = ele1.parent.children[ele1.node_position..(ele2.node_position+offset)]
      else
        # find common parent
        p, ele1_p = ele1, [ele1]
        ele1_p.unshift p while p.respond_to?(:parent) and p = p.parent
        p, ele2_p = ele2, [ele2]
        ele2_p.unshift p while p.respond_to?(:parent) and p = p.parent
        common_parent = ele1_p.zip(ele2_p).select { |p1, p2| p1 == p2 }.flatten.last
        child = nil
        if ele1 == common_parent
          child = ele2
        elsif ele2 == common_parent
          child = ele1
        end
        if child
          ary = common_parent.children[0..(child.node_position+offset)]
        end
      end
    end
    return Elements[*ary]
  end
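The sibling branch of the listing above comes down to an array slice whose end index shifts by one when excl is set. A minimal sketch of just that slice follows, assuming the siblings are plain strings and the positions are ordinary array indexes; expand_siblings is a hypothetical helper, not an Hpricot method.

```ruby
# Hypothetical helper mirroring only the sibling case of Elements.expand.
# children:  the shared parent's children (plain strings here)
# first_pos: index of the first element
# last_pos:  index of the second element
# excl:      when true, the second element is left out of the result
def expand_siblings(children, first_pos, last_pos, excl = false)
  offset = excl ? -1 : 0
  children[first_pos..(last_pos + offset)]
end

siblings = %w[h3 p ul p h3]
p expand_siblings(siblings, 0, 3)        # => ["h3", "p", "ul", "p"]
p expand_siblings(siblings, 0, 3, true)  # => ["h3", "p", "ul"]
```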
  # File lib/hpricot/elements.rb, line 274
  def self.filter(nodes, expr, truth = true)
    until expr.empty?
      _, *m = *expr.match(/^(?:#{ATTR_RE}|#{BRACK_RE}|#{FUNC_RE}|#{CUST_RE}|#{CATCH_RE})/)
      break unless _
      expr = $'
      m.compact!
      if m[0] == '@'
        m[0] = "@#{m.slice!(2,1).join}"
      end
      if m[0] == '[' && m[1] =~ /^\d+$/
        m = [":", "nth", m[1].to_i-1]
      end
      if m[0] == ":" && m[1] == "not"
        nodes, = Elements.filter(nodes, m[2], false)
      elsif "#{m[0]}#{m[1]}" =~ /^(:even|:odd)$/
        new_nodes = []
        nodes.each_with_index {|n,i| new_nodes.push(n) if (i % 2 == (m[1] == "even" ? 0 : 1)) }
        nodes = new_nodes
      elsif "#{m[0]}#{m[1]}" =~ /^(:first|:last)$/
        nodes = [nodes.send(m[1])]
      else
        meth = "filter[#{m[0]}#{m[1]}]" unless m[0].empty?
        if meth and Traverse.method_defined? meth
          args = m[2..-1]
        else
          meth = "filter[#{m[0]}]"
          if Traverse.method_defined? meth
            args = m[1..-1]
          end
        end
        args << -1
        nodes = Elements[*nodes.find_all do |x|
          args[-1] += 1
          x.send(meth, *args) ? truth : !truth
        end]
      end
    end
    [nodes, expr]
  end
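In the listing above, the :even and :odd branch selects nodes by their zero-based position in the list, so :even keeps the first, third, fifth (and so on) nodes. A small sketch of that selection in plain Ruby follows; pick_parity is a hypothetical name and strings stand in for nodes.

```ruby
# Hypothetical helper mirroring the :even / :odd branch of Elements.filter.
# parity is the string "even" or "odd"; positions are zero-based.
def pick_parity(nodes, parity)
  wanted = parity == "even" ? 0 : 1
  nodes.select.with_index { |_node, i| i % 2 == wanted }
end

rows = %w[a b c d e]
p pick_parity(rows, "even")  # => ["a", "c", "e"]
p pick_parity(rows, "odd")   # => ["b", "d"]
```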
Public Instance Methods
Adds the class to all matched elements.
(doc/"p").add_class("bacon")
Now all paragraphs will have class="bacon".
  # File lib/hpricot/elements.rb, line 226
  def add_class class_name
    each do |el|
      next unless el.respond_to? :get_attribute
      classes = el.get_attribute('class').to_s.split(" ")
      el.set_attribute('class', classes.push(class_name).uniq.join(" "))
    end
    self
  end
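The string handling inside add_class can be looked at on its own: the existing class attribute is split on spaces, the new name is appended, and duplicates are dropped. The sketch below isolates that step; add_class_value is a hypothetical helper that works on the attribute string rather than on an element.

```ruby
# Hypothetical helper mirroring the class-string handling in add_class.
# attr_value is the current value of the class attribute, or nil if unset.
def add_class_value(attr_value, class_name)
  classes = attr_value.to_s.split(" ")
  classes.push(class_name).uniq.join(" ")
end

p add_class_value(nil, "bacon")           # => "bacon"
p add_class_value("lead", "bacon")        # => "lead bacon"
p add_class_value("lead bacon", "bacon")  # => "lead bacon"
```

Because of the uniq call, adding a class that is already present leaves the attribute unchanged.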
Just after each element in this list, add some HTML. Pass in an HTML str, which is turned into Hpricot elements.
  # File lib/hpricot/elements.rb, line 154
  def after(str = nil, &blk)
    each { |x| x.parent.insert_after x.make(str, &blk), x }
  end
Add to the end of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
  # File lib/hpricot/elements.rb, line 136
  def append(str = nil, &blk)
    each { |x| x.html(x.children + x.make(str, &blk)) }
  end
Searches this list for the first element (or child of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Traverse#at for more.
  # File lib/hpricot/elements.rb, line 67
  def at(expr, &blk)
    if expr.kind_of? Fixnum
      super
    else
      search(expr, &blk)[0]
    end
  end
Gets and sets attributes on all matched elements.
Pass in a key on its own and this method will return the string value assigned to that attribute for the first element, or nil if the attribute isn't found.
doc.search("a").attr("href") #=> "http://hacketyhack.net/"
Or, pass in a key and value. This will set an attribute for all matched elements.
doc.search("p").attr("class", "basic")
You may also use a Hash to set a series of attributes:
(doc/"a").attr(:class => "basic", :href => "http://hackety.org/")
Lastly, a block can be used to rewrite an attribute based on the element it belongs to. The block is passed each element in turn, and its return value becomes the new value of the attribute.
records.attr("href") { |e| e['href'] + "#top" }
This example adds a #top anchor to each link.
  # File lib/hpricot/elements.rb, line 205
  def attr key, value = nil, &blk
    if value or blk
      each do |el|
        el.set_attribute(key, value || blk[el])
      end
      return self
    end
    if key.is_a? Hash
      key.each { |k,v| self.attr(k,v) }
      return self
    else
      return self[0].get_attribute(key)
    end
  end
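The four calling styles of attr (get, set, set from a Hash, set from a block) all go through one dispatch. The sketch below reproduces that dispatch on a simplified model so it can be run without Hpricot. AttrList is a hypothetical stand-in, and each element is assumed to be a plain Hash of attribute names to values rather than an Hpricot::Elem.

```ruby
# AttrList is a hypothetical stand-in for Hpricot::Elements.
# Each "element" is a plain Hash of attribute name => value.
class AttrList < Array
  def attr(key, value = nil, &blk)
    # Setter: a value or a block was given, so update every element.
    if value || blk
      each { |el| el[key] = value || blk.call(el) }
      return self
    end
    if key.is_a?(Hash)
      # Hash form: set each key/value pair on every element.
      key.each { |k, v| attr(k, v) }
      self
    else
      # Getter: read the attribute from the first element only.
      self[0][key]
    end
  end
end

links = AttrList[{ "href" => "/a" }, { "href" => "/b" }]
p links.attr("href")                            # => "/a"
links.attr("href") { |e| e["href"] + "#top" }
p links.map { |e| e["href"] }                   # => ["/a#top", "/b#top"]
links.attr("class" => "basic")
p links.last["class"]                           # => "basic"
```

As in the listing above, the getter reads from the first element only, so it fails on an empty list.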
Add some HTML just before each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
  # File lib/hpricot/elements.rb, line 148
  def before(str = nil, &blk)
    each { |x| x.parent.insert_before x.make(str, &blk), x }
  end
Empty the elements in this list, by removing their insides.
  doc = Hpricot("<p> We have <i>so much</i> to say.</p>")
  doc.search("i").empty
  doc.to_html
  => "<p> We have <i></i> to say.</p>"
  # File lib/hpricot/elements.rb, line 130
  def empty
    each { |x| x.inner_html = nil }
  end
  # File lib/hpricot/elements.rb, line 351
  def filter(expr)
    nodes, = Elements.filter(self, expr)
    nodes
  end
Returns an HTML fragment built of the contents of each element in this list.
If an HTML string is supplied, this method acts like #inner_html=.
  # File lib/hpricot/elements.rb, line 86
  def inner_html(*string)
    if string.empty?
      map { |x| x.inner_html }.join
    else
      x = self.inner_html = string.pop || x
    end
  end
Replaces the contents of each element in this list. Supply an HTML string, which is loaded into Hpricot objects and inserted into every element in this list.
  # File lib/hpricot/elements.rb, line 99
  def inner_html=(string)
    each { |x| x.inner_html = string }
  end
Returns a string containing the text contents of each element in this list. All HTML tags are removed.
  # File lib/hpricot/elements.rb, line 107
  def inner_text
    map { |x| x.inner_text }.join
  end
  # File lib/hpricot/elements.rb, line 356
  def not(expr)
    if expr.is_a? Traverse
      nodes = self - [expr]
    else
      nodes, = Elements.filter(self, expr, false)
    end
    nodes
  end
Add to the start of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
  # File lib/hpricot/elements.rb, line 142
  def prepend(str = nil, &blk)
    each { |x| x.html(x.make(str, &blk) + x.children) }
  end
Remove all elements in this list from the document which contains them.
  doc = Hpricot("<html>Remove this: <b>here</b></html>")
  doc.search("b").remove
  doc.to_html
  => "<html>Remove this: </html>"
  # File lib/hpricot/elements.rb, line 119
  def remove
    each { |x| x.parent.children.delete(x) }
  end
Remove an attribute from each of the matched elements.
(doc/"input").remove_attr("disabled")
  # File lib/hpricot/elements.rb, line 239
  def remove_attr name
    each do |el|
      next unless el.respond_to? :remove_attribute
      el.remove_attribute(name)
    end
    self
  end
Removes a class from all matched elements.
(doc/"span").remove_class("lightgrey")
Or, to remove all classes:
(doc/"span").remove_class
  # File lib/hpricot/elements.rb, line 255
  def remove_class name = nil
    each do |el|
      next unless el.respond_to? :get_attribute
      if name
        classes = el.get_attribute('class').to_s.split(" ")
        el.set_attribute('class', (classes - [name]).uniq.join(" "))
      else
        el.remove_attribute("class")
      end
    end
    self
  end
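The two modes of remove_class differ in what they leave behind: with a name, the class attribute is rewritten without that name; with no name, the attribute is removed entirely. The sketch below models this on the attribute string alone. remove_class_value is a hypothetical helper, and returning nil is this sketch's way of representing a removed attribute.

```ruby
# Hypothetical helper mirroring the class-string handling in remove_class.
# Returns the new attribute value, or nil when the attribute would be removed.
def remove_class_value(attr_value, name = nil)
  return nil if name.nil?
  classes = attr_value.to_s.split(" ")
  (classes - [name]).uniq.join(" ")
end

p remove_class_value("note lightgrey", "lightgrey")  # => "note"
p remove_class_value("lightgrey", "lightgrey")       # => ""
p remove_class_value("note lightgrey")               # => nil
```

Going by the listing above, removing the only class by name leaves an empty class attribute in place rather than deleting it.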
Searches this list for any elements (or children of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Traverse#search for more.
  # File lib/hpricot/elements.rb, line 58
  def search(*expr,&blk)
    Elements[*map { |x| x.search(*expr,&blk) }.flatten.uniq]
  end
Convert this group of elements into a complete HTML fragment, returned as a string.
  # File lib/hpricot/elements.rb, line 78
  def to_html
    map { |x| x.output("") }.join
  end
Wraps each element in the list inside the element created by HTML str. If more than one element is found in the string, Hpricot locates the deepest spot inside the first element.
  doc.search("a[@href]").
    wrap(%Q{<div class="link"><div class="link_inner"></div></div>})
This code wraps every link on the page inside a div.link and a div.link_inner nest.
  # File lib/hpricot/elements.rb, line 166
  def wrap(str = nil, &blk)
    each do |x|
      wrap = x.make(str, &blk)
      nest = wrap.detect { |w| w.respond_to? :children }
      unless nest
        raise "No wrapping element found."
      end
      x.parent.replace_child(x, wrap)
      nest = nest.children.first until nest.empty?
      nest.html([x])
    end
  end
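The "deepest spot" that wrap looks for is found by following the first child repeatedly until a node with no children is reached. The sketch below shows that descent on a simplified tree. WrapNode and deepest_first are hypothetical names, and a Struct with a children array stands in for an Hpricot element.

```ruby
# WrapNode is a hypothetical stand-in for an element with children.
WrapNode = Struct.new(:name, :children)

# Follow the first child at each level until a childless node is reached.
def deepest_first(node)
  node = node.children.first until node.children.empty?
  node
end

inner   = WrapNode.new("div.link_inner", [])
wrapper = WrapNode.new("div.link", [inner])
puts deepest_first(wrapper).name  # div.link_inner
```

Only the first child is followed at each level, so later branches of the wrapping markup are never chosen as the insertion point.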
Private Instance Methods
  # File lib/hpricot/elements.rb, line 366
  def copy_node(node, l)
    l.instance_variables.each do |iv|
      node.instance_variable_set(iv, l.instance_variable_get(iv))
    end
  end