goog.string.html.HtmlParser
An HTML parser: parse takes a string and calls methods on a goog.string.html.HtmlSaxHandler as it visits the markup.

Constructor

goog.string.html.HtmlParser()

Instance Methods

lookupEntity_(name) string
Decodes an HTML entity.
Arguments:
name : string
The content between the '&' and the ';'.
Returns: string  A single Unicode code point as a string.
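The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the entity table, the two regular expressions, and the function name lookupEntity are assumptions made for the example, and unknown names are mapped to an empty string purely as a choice of this sketch.

```javascript
// Hypothetical entity table; the real goog.string.html.HtmlParser.Entities
// map is not reproduced here.
const ENTITIES = {lt: '<', gt: '>', amp: '&', nbsp: '\u00a0', quot: '"', apos: "'"};

// Assumed shapes for numeric escapes, e.g. "#65" and "#x41".
const DECIMAL_ESCAPE = /^#(\d+)$/;
const HEX_ESCAPE = /^#x([0-9a-f]+)$/;

// name is the text between '&' and ';', for example "amp" or "#x41".
function lookupEntity(name) {
  const key = name.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(ENTITIES, key)) {
    return ENTITIES[key];
  }
  let m = key.match(DECIMAL_ESCAPE);
  if (m) {
    return String.fromCharCode(parseInt(m[1], 10));
  }
  m = key.match(HEX_ESCAPE);
  if (m) {
    return String.fromCharCode(parseInt(m[1], 16));
  }
  return '';  // Unknown entity: this sketch returns an empty string.
}
```

Note that String.fromCharCode keeps only the low 16 bits of its argument, so this sketch does not handle code points above U+FFFF.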
normalizeRCData_(rcdata) string
Escape entities in RCDATA that can be escaped without changing the meaning.
Arguments:
rcdata : string
The RCDATA string we want to normalize.
Returns: string  A normalized version of RCDATA.
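One way to picture this normalization is the sketch below. It assumes that "can be escaped without changing the meaning" covers ampersands that do not begin an entity, plus bare < and > characters; the pattern LOOSE_AMP and the function name normalizeRCData are hypothetical and simpler than whatever the library actually uses.

```javascript
// Assumed pattern: an '&' that is not followed by a letter or '#',
// so it cannot be the start of an entity.
const LOOSE_AMP = /&(?![a-zA-Z#])/g;

function normalizeRCData(rcdata) {
  return rcdata
      .replace(LOOSE_AMP, '&amp;')  // escape loose ampersands first
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
}
```

Under these assumptions, normalizeRCData('a & b <i> &amp;') leaves the existing &amp; entity alone and escapes only the loose ampersand and the angle brackets.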
parse(handler, htmlText)
Given a SAX-like goog.string.html.HtmlSaxHandler, parses htmlText and notifies the handler of the document structure as it visits each node.
Arguments:
handler : goog.string.html.HtmlSaxHandler
The HtmlSaxHandler that will receive the events.
htmlText : string
The HTML text to parse.
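The SAX-style contract between parser and handler can be illustrated with a self-contained toy. Nothing here is the library's code: the handler method names (startDoc, startTag, pcdata, endTag, endDoc) are assumptions for the example, and toyParse is a hypothetical stand-in that understands only attribute-free tags and text.

```javascript
// Hypothetical handler that records each event it receives.
class LoggingHandler {
  constructor() { this.events = []; }
  startDoc() { this.events.push('startDoc'); }
  startTag(name, attrs) { this.events.push('startTag:' + name); }
  pcdata(text) { this.events.push('pcdata:' + text); }
  endTag(name) { this.events.push('endTag:' + name); }
  endDoc() { this.events.push('endDoc'); }
}

// Toy stand-in for parse(handler, htmlText). It walks the input once and
// reports each token to the handler instead of building a tree.
function toyParse(handler, htmlText) {
  // Either a simple open/close tag, or a run of text up to the next '<'.
  const token = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)/g;
  handler.startDoc();
  let m;
  while ((m = token.exec(htmlText)) !== null) {
    if (m[3] !== undefined) {
      handler.pcdata(m[3]);
    } else if (m[1]) {
      handler.endTag(m[2].toLowerCase());
    } else {
      handler.startTag(m[2].toLowerCase(), []);  // attributes not parsed here
    }
  }
  handler.endDoc();
}

const handler = new LoggingHandler();
toyParse(handler, '<p>Hi <b>there</b></p>');
console.log(handler.events.join('\n'));
```

The point of the design is that the caller supplies all behaviour through the handler, so the same traversal can drive a sanitizer, a text extractor, or a logger like the one above.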
stripNULs_(s) string
Removes null characters from the string.
Arguments:
s : string
The string to have the null characters removed.
Returns: string  A string without null characters.
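A minimal sketch of the same idea, assuming a global regular expression for the NUL character (the names NULL_RE and stripNULs are hypothetical):

```javascript
// Matches every NUL (U+0000) character in the input.
const NULL_RE = /\0/g;

function stripNULs(s) {
  return s.replace(NULL_RE, '');
}
```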
unescapeEntities_(s) string
Returns the plain text of a chunk of HTML CDATA that possibly contains entities. TODO(goto): use goog.string.unescapeEntities instead?
Arguments:
s : string
A chunk of HTML CDATA. It must not start or end inside an HTML entity.
Returns: string  The text with its entities unescaped.
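As a rough illustration, entity unescaping can be done in a single pass with a replace callback. The pattern ENTITY, the small NAMED table, and the helper decode below are assumptions for this sketch; leaving unknown entities untouched is also a choice made here, not documented behaviour.

```javascript
// Assumed entity shape: &name; or &#123; or &#x7b;
const ENTITY = /&(#\d+|#x[0-9A-Fa-f]+|\w+);/g;

// Hypothetical named-entity table, intentionally tiny.
const NAMED = {lt: '<', gt: '>', amp: '&', quot: '"'};

// Returns the decoded character, or null when the name is not recognized.
function decode(name) {
  const key = name.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(NAMED, key)) {
    return NAMED[key];
  }
  if (key[0] === '#') {
    const code = key[1] === 'x'
        ? parseInt(key.slice(2), 16)
        : parseInt(key.slice(1), 10);
    return String.fromCharCode(code);
  }
  return null;
}

function unescapeEntities(s) {
  return s.replace(ENTITY, (match, name) => {
    const decoded = decode(name);
    return decoded === null ? match : decoded;  // keep unknown entities as-is
  });
}
```

Because the replacement runs once over the original string, doubly escaped input such as &amp;lt; decodes to &lt; rather than to <.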

Static Properties

goog.string.html.HtmlParser.AMP_RE_ :
Regular expression that matches ampersands (&).
goog.string.html.HtmlParser.DECIMAL_ESCAPE_RE_ :
Regular expression that matches decimal numbers.
goog.string.html.HtmlParser.ENTITY_RE_ :
Regular expression that matches entities.
goog.string.html.HtmlParser.EQUALS_RE_ :
Regular expression that matches the equals sign (=).
goog.string.html.HtmlParser.Elements :
A map from element name to a bitmap of the flags it has, used internally by the parser.
goog.string.html.HtmlParser.GT_RE_ :
Regular expression that matches the greater-than sign (>).
goog.string.html.HtmlParser.HEX_ESCAPE_RE_ :
Regular expression that matches hexadecimal numbers.
goog.string.html.HtmlParser.INSIDE_TAG_TOKEN_ :
Regular expression that matches the next token to be processed when we are inside a tag.
goog.string.html.HtmlParser.LOOSE_AMP_RE_ :
Regular expression that matches loose ampersands (& characters that are not part of an entity).
goog.string.html.HtmlParser.LT_RE_ :
Regular expression that matches the less-than sign (<).
goog.string.html.HtmlParser.NULL_RE_ :
Regular expression that matches null characters.
goog.string.html.HtmlParser.OUTSIDE_TAG_TOKEN_ :
Regular expression that matches the next token to be processed when we are outside a tag.
goog.string.html.HtmlParser.QUOTE_RE_ :
Regular expression that matches the double-quote character (").

Enumerations

goog.string.html.HtmlParser.EFlags :
The HTML eflags (element flags), used internally by the parser.
Constants:
CDATA
No description.
EMPTY
No description.
FOLDABLE
No description.
OPTIONAL_ENDTAG
No description.
RCDATA
No description.
UNSAFE
No description.
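Since the Elements map is described as a bitmap of these flags, the usual pattern is one bit per flag, combined with bitwise OR and tested with bitwise AND. The sketch below assumes the bit values and the per-element entries; only the flag names come from this reference, and hasFlag is a hypothetical helper.

```javascript
// Assumed bit values: one distinct power of two per flag.
const EFlags = {
  OPTIONAL_ENDTAG: 1,
  EMPTY: 2,
  CDATA: 4,
  RCDATA: 8,
  UNSAFE: 16,
  FOLDABLE: 32
};

// Hypothetical excerpt of an element-to-flags map.
const Elements = {
  br: EFlags.EMPTY,
  p: EFlags.OPTIONAL_ENDTAG,
  script: EFlags.CDATA | EFlags.UNSAFE,  // two flags packed into one number
  textarea: EFlags.RCDATA,
  b: 0
};

// True when the element is known and has the given flag set.
function hasFlag(tagName, flag) {
  const flags = Object.prototype.hasOwnProperty.call(Elements, tagName)
      ? Elements[tagName]
      : 0;
  return (flags & flag) !== 0;
}
```

Packing flags into a single number keeps the per-element lookup to one map access and one AND, which suits a tokenizer that checks flags on every tag.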
goog.string.html.HtmlParser.Entities :
HTML entities that are encoded/decoded. TODO(user): use goog.string.htmlEncode instead.
Constants:
