Markup Parser Logic

Soup Parser

This module provides a very lenient HTML/XML lexer. The SoupLexer class is initialized with a listener object, which receives all low-level events (such as starttag, endtag, and text). Listeners must implement the ListenerInterface.

On top of the lexer sits the SoupParser class, which implements the ListenerInterface itself (the parser listens to the lexer). The parser adds HTML semantics to the lexed data and passes the events on to a building listener (BuildingListenerInterface). In addition to the events sent by the lexer, the SoupParser class generates endtag events (with empty data arguments) for implicitly closed elements. It also knows about CDATA elements such as <script> or <style> and modifies the lexer state accordingly.
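
The layering described above can be illustrated with a toy sketch. It does not use the tdi API: the class names (RecordingBuilder, ToyParser), the handle_* method names and their signatures, and the hard-coded rule that <p> and <li> cannot nest inside themselves are all assumptions made for illustration. The events are fed in by hand, as a lexer might emit them for the input "<p>one<p>two</p>".

```python
class RecordingBuilder(object):
    """Hypothetical building listener: records the events it receives."""

    def __init__(self):
        self.events = []

    def handle_starttag(self, name, data):
        self.events.append(('starttag', name, data))

    def handle_endtag(self, name, data):
        self.events.append(('endtag', name, data))

    def handle_text(self, data):
        self.events.append(('text', data))


class ToyParser(object):
    """Hypothetical parser: listens to lexer events, forwards to a builder."""

    # Assumed rule, for illustration only: these cannot nest in themselves.
    NOT_SELF_NESTABLE = frozenset(['p', 'li'])

    def __init__(self, listener):
        self.listener = listener
        self._stack = []  # names of currently open elements

    def handle_starttag(self, name, data):
        # Implicitly close an open element that cannot contain the new one.
        if (self._stack and self._stack[-1] == name
                and name in self.NOT_SELF_NESTABLE):
            self._stack.pop()
            self.listener.handle_endtag(name, '')  # empty data: implicit
        self._stack.append(name)
        self.listener.handle_starttag(name, data)

    def handle_endtag(self, name, data):
        if name not in self._stack:
            return  # stray end tag; this sketch simply drops it
        # Implicitly close everything opened after the matching element.
        while self._stack:
            top = self._stack.pop()
            if top == name:
                break
            self.listener.handle_endtag(top, '')
        self.listener.handle_endtag(name, data)

    def handle_text(self, data):
        self.listener.handle_text(data)

    def finalize(self):
        # Close whatever is still open at the end of the input.
        while self._stack:
            self.listener.handle_endtag(self._stack.pop(), '')


builder = RecordingBuilder()
parser = ToyParser(builder)
parser.handle_starttag('p', '<p>')
parser.handle_text('one')
parser.handle_starttag('p', '<p>')  # triggers an implicit endtag for the first <p>
parser.handle_text('two')
parser.handle_endtag('p', '</p>')
parser.finalize()
```

In this sketch the builder can tell implicit end tags from explicit ones by the data argument alone: the first endtag event carries an empty string, the second carries the original '</p>' source text.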

The actual semantics are provided by a DTD query class (implementing DTDInterface).
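
As a rough illustration of what such a DTD query object might look like, the sketch below answers three kinds of question a tagsoup parser typically needs answered. The class ToyDTD, its method names (cdata, empty, nestable), and the element tables are hypothetical; they are not taken from the DTDInterface definition and cover only a handful of HTML elements.

```python
class ToyDTD(object):
    """Hypothetical DTD query object (not the real DTDInterface)."""

    # Elements whose content is raw character data.
    _CDATA = frozenset(['script', 'style'])

    # Elements that never have content or an end tag.
    _EMPTY = frozenset(['br', 'hr', 'img', 'input', 'meta', 'link'])

    # outer element -> inner elements that force the outer one to close
    _CLOSED_BY = {
        'p': frozenset(['p', 'div', 'ul', 'ol', 'table']),
        'li': frozenset(['li']),
        'td': frozenset(['td', 'th', 'tr']),
    }

    def cdata(self, name):
        """Is `name` a CDATA element?"""
        return name.lower() in self._CDATA

    def empty(self, name):
        """Is `name` an empty element?"""
        return name.lower() in self._EMPTY

    def nestable(self, outer, inner):
        """May `inner` appear inside `outer` without closing it?"""
        return inner.lower() not in self._CLOSED_BY.get(outer.lower(), ())


dtd = ToyDTD()
```

Keeping these rules behind a small query object means the parser itself carries no element tables, so a different DTD can be swapped in without changing the parsing logic.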

License:

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Author: André Malo

Classes
	SoupLexer (X)HTML Tagsoup Lexer
	DEFAULT_LEXER (X)HTML Tagsoup Lexer
	SoupParser The parser is actually a tagsoup parser by design in order to process most of the "HTML" that can be found out there.
	DEFAULT_PARSER The parser is actually a tagsoup parser by design in order to process most of the "HTML" that can be found out there.

Module parser

Variables
	__doc__ = `__doc__.encode('ascii').decode('unicode_escape')`
	__package__ = `'tdi.markup.soup'`
