path_patterns

This module defines path-constructor functions and containers for data.

The containers are later used to validate the paths in other examples and by the generator, which creates the parser.

class harvester.autoparser.path_patterns.NeighCall(tag_name, params, fn_params)

Class used to store information about neighbour calls generated by _neighbour_to_path_call().

tag_name

str – Name of the container for the data.

params

dict – Parameters for the container.

fn_params

list – Parameters for the function that finds the neighbour (see has_neigh()).

class harvester.autoparser.path_patterns.PathCall(call_type, index, params)

Container holding the data used as parameters when calling search functions in the DOM.

Parameters:
  • call_type (str) – Determines type of the call to the HTMLElement method.
  • index (int) – Index of the item to pick after the call_type function is called.
  • params (dict) – Additional parameters for the call_type function.
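
The three fields can be pictured with a minimal sketch. The namedtuple below is a hypothetical stand-in, not harvester's actual class, and the "find" call type, the "tag_name" key and the describe() helper are assumptions made only for illustration:

```python
from collections import namedtuple

# Hypothetical stand-in with the documented fields; the real PathCall may differ.
PathCall = namedtuple("PathCall", ["call_type", "index", "params"])

# "find" and the "tag_name" key are assumed values, not confirmed by the docs.
call = PathCall(call_type="find", index=0, params={"tag_name": "h1"})


def describe(path_call):
    """Render the stored call as text: method name, keyword params, picked index."""
    return "{}(**{!r})[{}]".format(
        path_call.call_type, path_call.params, path_call.index
    )


description = describe(call)
```

Here description reads as "call the method named by call_type with params, then take the item at index", which is the role the parameter list above gives to each field.
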
class harvester.autoparser.path_patterns.Chained(chain)

Container to hold parameters of the chained calls.

Parameters:chain (list) – List of PathCall instances.
call_type

Property added so that Chained is interchangeable with PathCall.
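
A rough sketch of why such a property helps: code that inspects call_type can treat both containers the same way. Both classes below are hypothetical stand-ins, and the "chained" and "find" values as well as the call_types() helper are assumptions, not taken from the harvester source:

```python
from collections import namedtuple

# Hypothetical stand-in with the documented fields.
PathCall = namedtuple("PathCall", ["call_type", "index", "params"])


class Chained(object):
    """Hypothetical stand-in holding a list of PathCall objects."""

    def __init__(self, chain):
        self.chain = chain

    @property
    def call_type(self):
        # Assumed marker value; the real property may return something else.
        return "chained"


def call_types(paths):
    """Read .call_type from every item without checking its class."""
    return [path.call_type for path in paths]


paths = [
    PathCall("find", 0, {"tag_name": "div"}),
    Chained([
        PathCall("find", 0, {"tag_name": "div"}),
        PathCall("find", 1, {"tag_name": "span"}),
    ]),
]
types = call_types(paths)
```

Because both objects expose call_type, the consumer needs no isinstance() checks to decide how to handle each item.
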

harvester.autoparser.path_patterns.neighbours_pattern(element)

Look for neighbours of the element and return the matching PathCall instances.

Parameters:element (obj) – HTMLElement instance of the object you are looking for.
Returns:List of PathCall instances.
Return type:list
harvester.autoparser.path_patterns.predecesors_pattern(element, root)

Look for the element by its predecessors.

Parameters:
  • element (obj) – HTMLElement instance of the object you are looking for.
  • root (obj) – Root of the DOM.
Returns:

[PathCall()] - list with one PathCall object (to allow use with .extend(predecesors_pattern())).

Return type:

list
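
The one-item list return value makes sense when results from several pattern functions are collected into a single list. The sketch below illustrates only that calling convention; single_path_pattern() is a hypothetical stand-in that does no DOM lookup, and the PathCall namedtuple and its field values are assumptions:

```python
from collections import namedtuple

# Hypothetical stand-in with the documented fields.
PathCall = namedtuple("PathCall", ["call_type", "index", "params"])


def single_path_pattern(tag_name):
    """Hypothetical stand-in: return one PathCall wrapped in a list."""
    return [PathCall("find", 0, {"tag_name": tag_name})]


paths = []
# A list return value lets every pattern function be combined the same way,
# whether it yields one PathCall or many.
paths.extend(single_path_pattern("h1"))
paths.extend(single_path_pattern("p"))
```

After both calls, paths is a flat list of PathCall objects rather than a mix of single objects and nested lists.
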