utils
This module contains a number of functions that are used in multiple places in autoparser.
- harvester.autoparser.utils.handle_encodnig(html)
Look for the encoding declared in the given HTML and try to convert the HTML to UTF-8.
Parameters: html (str) – HTML code as a string.
Returns: HTML code encoded in UTF-8.
Return type: str
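The idea can be illustrated with a small standalone sketch that reads the charset declared in a `<meta>` tag and re-encodes the document as UTF-8. This is not the library's implementation: the name `to_utf8`, the regular expression, and the decision to work on bytes and to return the input unchanged on failure are all assumptions made for the example.

```python
import re

# Hypothetical pattern: captures the value of charset=... inside a <meta> tag.
_META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([\w-]+)""",
    re.IGNORECASE,
)


def to_utf8(html: bytes) -> bytes:
    """Re-encode `html` as UTF-8 using the charset declared in its <meta> tag."""
    match = _META_CHARSET.search(html)
    if match is None:
        return html  # nothing declared, leave the input untouched

    declared = match.group(1).decode("ascii")
    try:
        return html.decode(declared).encode("utf-8")
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a declaration that does not fit the bytes.
        return html
```

Note that the sketch leaves the `<meta>` declaration itself unchanged, so the returned bytes still name the original charset.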
- harvester.autoparser.utils.content_matchs(tag_content, content_transformer=None)
Generate a function that checks whether the content of a tag matches tag_content.
Parameters: - tag_content (str) – Content of the tag, which will be matched throughout the whole DOM.
- content_transformer (fn, default None) – Function used to transform all tags before matching.
Returns: True for every matching tag.
Return type: bool
Note
This function can be used as parameter for .find() method in HTMLElement.
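A minimal sketch of this predicate-factory pattern might look as follows. The names `make_content_matcher` and `StubElement` are hypothetical, and the stub models only a `getContent()` method. Applying the transformer to the tag's content, rather than to the expected value, is an assumption of the example.

```python
class StubElement:
    """Hypothetical stand-in for an HTMLElement; only getContent() is modelled."""

    def __init__(self, content):
        self._content = content

    def getContent(self):
        return self._content


def make_content_matcher(tag_content, content_transformer=None):
    """Return a predicate that is True for elements whose content equals `tag_content`."""

    def matcher(element):
        content = element.getContent()
        # Optionally normalise the content (e.g. strip whitespace) before comparing.
        if content_transformer is not None:
            content = content_transformer(content)
        return content == tag_content

    return matcher
```

For instance, `make_content_matcher("Title", content_transformer=str.strip)` yields a predicate that accepts an element whose content is `"  Title\n"` and rejects one whose content is `"Other"`.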
- harvester.autoparser.utils.is_equal_tag(element, tag_name, params, content)
Check whether the element object matches the rest of the parameters.
Each check is performed only if the corresponding attribute is set in the HTMLElement.
Returns: True if everything matches, False otherwise.
Return type: bool
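The comparison can be sketched roughly as below. This is an illustration under stated assumptions, not the library's code: `StubElement` and `tags_equal` are hypothetical names, a criterion passed as `None` is treated as "do not check", `params` is treated as a required subset of the element's attributes, and content is compared after stripping surrounding whitespace.

```python
from dataclasses import dataclass, field


@dataclass
class StubElement:
    """Hypothetical stand-in for an HTMLElement."""

    tag_name: str
    params: dict = field(default_factory=dict)
    content: str = ""


def tags_equal(element, tag_name=None, params=None, content=None):
    """Compare `element` against every criterion that was actually supplied."""
    if tag_name is not None and element.tag_name != tag_name:
        return False

    # Every requested attribute must be present with the same value.
    if params is not None:
        for key, value in params.items():
            if key not in element.params or element.params[key] != value:
                return False

    if content is not None and element.content.strip() != content.strip():
        return False

    return True
```

With `link = StubElement("a", {"href": "/x", "class": "ext"}, " Home ")`, the call `tags_equal(link, "a", {"href": "/x"}, "Home")` succeeds, while `tags_equal(link, params={"href": "/y"})` does not.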
- harvester.autoparser.utils.has_neigh(tag_name, params=None, content=None, left=True)
Generate a function that matches all tags with neighbours defined by the parameters.
Returns: True for every matching tag.
Return type: bool
Note
This function can be used as parameter for .find() method in HTMLElement.
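One way to picture a neighbour-based predicate is the sketch below. It is a simplified illustration, not the library's implementation: `StubNode` and `make_neighbour_matcher` are hypothetical names, only the immediately adjacent sibling is inspected, and the `params` criterion is omitted for brevity.

```python
class StubNode:
    """Hypothetical stand-in for an HTMLElement with parent/child links."""

    def __init__(self, tag_name, content=""):
        self.tag_name = tag_name
        self.content = content
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def make_neighbour_matcher(tag_name, content=None, left=True):
    """Return a predicate that is True for nodes whose adjacent sibling matches."""

    def matcher(node):
        if node.parent is None:
            return False  # the root has no siblings

        siblings = node.parent.children
        # Locate the node by identity, then step one position left or right.
        position = next(i for i, sibling in enumerate(siblings) if sibling is node)
        neighbour_position = position - 1 if left else position + 1
        if not 0 <= neighbour_position < len(siblings):
            return False

        neighbour = siblings[neighbour_position]
        if neighbour.tag_name != tag_name:
            return False
        return content is None or neighbour.content.strip() == content.strip()

    return matcher
```

In a table row holding the cells `Author` and `Jane`, `make_neighbour_matcher("td", content="Author", left=True)` accepts the `Jane` cell, because its left neighbour is the `Author` cell, and rejects the `Author` cell, which has no left neighbour.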