
Module _htmldecode

HTML Decoder.


Copyright: Copyright 2006 - 2015 André Malo or his licensors, as applicable

License:

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Author: André Malo

Functions
unicode
decode(value, encoding='latin-1', errors='strict', entities=None)
Decode HTML encoded text
Variables
  __doc__ = __doc__.encode('ascii').decode('unicode_escape')
  __package__ = 'tdi'
Function Details

decode(value, encoding='latin-1', errors='strict', entities=None)

Decode HTML encoded text
Parameters:
  • value (basestring) - HTML content to decode
  • encoding (str) - Character encoding used to decode value to unicode before it is processed further. If value is already a unicode instance, the encoding is ignored. If omitted, 'latin-1' is applied (it cannot fail, because it maps every byte 1:1 to a unicode code point).
  • errors (str) - Error handling. The value is passed to .decode() and is also evaluated when resolving entities. If an entity name or character code point cannot be found or cannot be parsed, the error handler has the following semantics:

    strict (or any value other than the tokens below)
    A ValueError is raised.
    ignore
    The original entity is passed through unchanged.
    replace
    The entity is replaced by the replacement character (U+FFFD).
  • entities (dict) - Entity name mapping (unicode(name) -> unicode(value)). If omitted or None, the HTML5 entity list is applied.
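
The remark that 'latin-1' cannot fail can be checked with the standard codecs alone. The snippet below is a small illustration written for Python 3 (where the byte type is bytes and the text type is str); the variable names are chosen for this example only and are not part of tdi.

```python
# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# latin-1 decodes each byte to the code point with the same number,
# so no input can raise a decoding error.
as_text = all_bytes.decode('latin-1')
code_points = [ord(char) for char in as_text]

# By contrast, a lone 0xE9 byte is not valid UTF-8.
try:
    b'\xe9'.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```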

Returns: unicode
The decoded content
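
The documented semantics can be sketched in a few lines of standalone Python 3. This is not tdi's implementation: decode_sketch, _ENTITY_RE and the regular expression are hypothetical names and choices made for this illustration, and the standard library's html.entities.html5 table stands in for "the HTML5 entity list". It only mirrors what the parameter descriptions above state.

```python
import re
from html.entities import html5

# Assumption for this sketch: only ';'-terminated references are handled,
# in named (&amp;), decimal (&#38;) and hexadecimal (&#x26;) form.
_ENTITY_RE = re.compile(
    r'&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));'
)


def decode_sketch(value, encoding='latin-1', errors='strict', entities=None):
    """Illustrative stand-in for the documented decode() behaviour."""
    # Byte input is decoded first; text input ignores `encoding`.
    if isinstance(value, bytes):
        value = value.decode(encoding, errors)
    if entities is None:
        # The stdlib table stores terminated names with a trailing ';'.
        entities = {
            name[:-1]: char
            for name, char in html5.items()
            if name.endswith(';')
        }

    def resolve(match):
        dec, hexa, name = match.groups()
        try:
            if name is not None:
                return entities[name]
            return chr(int(dec) if dec is not None else int(hexa, 16))
        except (KeyError, ValueError, OverflowError):
            if errors == 'ignore':
                return match.group(0)   # pass the original entity through
            if errors == 'replace':
                return '\ufffd'         # U+FFFD replacement character
            raise ValueError('Unresolvable entity: %r' % match.group(0))

    return _ENTITY_RE.sub(resolve, value)


print(decode_sketch(b'caf\xe9 &amp; &lt;b&gt;'))
print(decode_sketch('&bogus;', errors='replace'))
```

Passing a custom mapping, for example entities={'foo': 'bar'}, replaces the default table entirely in this sketch, in line with the description of the entities parameter.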

Variables Details

__doc__

Value:
__doc__.encode('ascii').decode('unicode_escape')
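
The expression above round-trips a string through the standard 'ascii' and 'unicode_escape' codecs, which turns backslash escapes written in plain ASCII into the characters they denote. A plausible reason (an assumption, not stated on this page) is to let the module docstring carry non-ASCII text such as the author's name while the source stays ASCII. The sample string below is made up for the illustration and written for Python 3.

```python
# A raw string, so the backslash escape is kept literally as 4 characters.
escaped = r"Copyright 2006 - 2015 Andr\xe9 Malo"

# Same round trip as the __doc__ assignment: ASCII bytes, then escapes resolved.
decoded = escaped.encode('ascii').decode('unicode_escape')

print(decoded)
```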

__package__

Value:
'tdi'