Package CHEM :: Package DB :: Package rdb :: Module BeautifulSoup :: Class BeautifulSOAP

Class BeautifulSOAP



          PageElement --+        
                        |        
                      Tag --+    
                            |    
markupbase.ParserBase --+   |    
                        |   |    
       sgmllib.SGMLParser --+    
                            |    
           BeautifulStoneSoup --+
                                |
                               BeautifulSOAP
Known Subclasses:
SimplifyingSOAPParser

This class will push a tag with only a single string child into
the tag's parent as an attribute. The attribute's name is the tag
name, and the value is the string child. An example should give
the flavor of the change:

<foo><bar>baz</bar></foo>
 =>
<foo bar="baz"><bar>baz</bar></foo>

You can then access fooTag['bar'] instead of fooTag.barTag.string.

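This is useful for scraping structures that tend to use subelements
instead of attributes, such as SOAP messages. Note that the parsed
tree is modified: the child's text is duplicated as an attribute on
the parent, so the tree no longer matches the original document.
Don't print the modified version out as if it were the input.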
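
The transformation described above can be imitated with the standard
library alone. The following is only a sketch: it uses
xml.etree.ElementTree rather than BeautifulSOAP itself, the helper name
promote_string_children is hypothetical, and the choice to leave an
existing attribute untouched is an assumption, not something this page
states.

```python
import xml.etree.ElementTree as ET


def promote_string_children(root):
    """Copy each text-only child into its parent's attributes, in place.

    Hypothetical stand-in for the behaviour described above; it does not
    use or reproduce BeautifulSOAP.
    """
    for parent in root.iter():
        for child in list(parent):
            # A "single string child" here means: no sub-elements, some text.
            has_only_text = len(child) == 0 and bool(child.text)
            # Assumption: an attribute that already exists is not overwritten.
            if has_only_text and child.tag not in parent.attrib:
                parent.set(child.tag, child.text)
    return root


foo = promote_string_children(ET.fromstring("<foo><bar>baz</bar></foo>"))
print(foo.get("bar"))                        # baz
print(ET.tostring(foo, encoding="unicode"))  # <foo bar="baz"><bar>baz</bar></foo>
```

As in the example above, the <bar> element stays in the tree, so the
same text is now reachable both as an attribute and as a child.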

I'm not sure how many people really want to use this class; let me
know if you do. Mainly I like the name.



Instance Methods
 
popTag(self)

Inherited from BeautifulStoneSoup: __getattr__, __init__, endData, handle_charref, handle_comment, handle_data, handle_decl, handle_entityref, handle_pi, isSelfClosingTag, parse_declaration, pushTag, reset, unknown_endtag, unknown_starttag

Inherited from Tag: __call__, __contains__, __delitem__, __eq__, __getitem__, __iter__, __len__, __ne__, __nonzero__, __repr__, __setitem__, __str__, __unicode__, append, childGenerator, fetch, fetchText, find, findAll, findChild, findChildren, first, firstText, get, has_key, prettify, recursiveChildGenerator, renderContents

Inherited from Tag (private): _getAttrMap

Inherited from PageElement: extract, fetchNextSiblings, fetchParents, fetchPrevious, fetchPreviousSiblings, findAllNext, findAllPrevious, findNext, findNextSibling, findNextSiblings, findParent, findParents, findPrevious, findPreviousSibling, findPreviousSiblings, insert, nextGenerator, nextSiblingGenerator, parentGenerator, previousGenerator, previousSiblingGenerator, replaceWith, setup, substituteEncoding, toEncoding

Inherited from PageElement (private): _findAll, _findOne, _lastRecursiveChild

Inherited from sgmllib.SGMLParser: close, convert_charref, convert_codepoint, convert_entityref, error, feed, finish_endtag, finish_shorttag, finish_starttag, get_starttag_text, goahead, handle_endtag, handle_starttag, parse_endtag, parse_pi, parse_starttag, report_unbalanced, setliteral, setnomoretags, unknown_charref, unknown_entityref

Inherited from sgmllib.SGMLParser (private): _convert_ref

Inherited from markupbase.ParserBase: getpos, parse_comment, parse_marked_section, unknown_decl, updatepos

Inherited from markupbase.ParserBase (private): _parse_doctype_attlist, _parse_doctype_element, _parse_doctype_entity, _parse_doctype_notation, _parse_doctype_subset, _scan_name

Class Variables

Inherited from BeautifulStoneSoup: HTML_ENTITIES, MARKUP_MASSAGE, NESTABLE_TAGS, QUOTE_TAGS, RESET_NESTING_TAGS, ROOT_TAG_NAME, SELF_CLOSING_TAGS, XML_ENTITIES, XML_ENTITY_LIST, i

Inherited from Tag: XML_SPECIAL_CHARS_TO_ENTITIES

Inherited from sgmllib.SGMLParser: entity_or_charref, entitydefs

Inherited from sgmllib.SGMLParser (private): _decl_otherchars

Method Details

popTag(self)

 
Overrides: BeautifulStoneSoup.popTag
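
Since popTag is the only method this class overrides, the promotion
presumably happens at the moment a tag is closed and popped off the
parser's tag stack. The sketch below illustrates that general idea with
the standard library's html.parser. It is not the BeautifulSoup source:
Node, PromotingParser and pop_tag are hypothetical names, and the rule
that an existing attribute is kept is an assumption.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical minimal tree node: a name, attributes, and children."""

    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []  # child Nodes and text strings, in document order


class PromotingParser(HTMLParser):
    """Stack-based parser that promotes text-only tags when they are popped."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_data(self, data):
        self.stack[-1].children.append(data)

    def handle_endtag(self, tag):
        # Only pop when the end tag matches the innermost open tag.
        if len(self.stack) > 1 and self.stack[-1].name == tag:
            self.pop_tag()

    def pop_tag(self):
        node = self.stack.pop()
        parent = self.stack[-1]
        only_child_is_text = (
            len(node.children) == 1 and isinstance(node.children[0], str)
        )
        # Assumption: an attribute already present on the parent is kept.
        if only_child_is_text and node.name not in parent.attrs:
            parent.attrs[node.name] = node.children[0]


parser = PromotingParser()
parser.feed("<foo><bar>baz</bar></foo>")
parser.close()
foo = parser.root.children[0]
print(foo.attrs)  # {'bar': 'baz'}
```

Doing the work at pop time means the tag's contents are complete when
they are inspected, so the "exactly one string child" test is reliable.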