This module collects public metadata about new books published by selected Czech publishers.


User guide / Uživatelská příručka


The whole module is divided into the following parts:


Scrappers download metadata from the publishers’ webpages.


Filters are then applied to the data from the Scrappers before it is returned. This behavior can be turned off with the USE_DUP_FILTER and USE_ALEPH_FILTER properties of the settings submodule.
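
The effect of the two switches can be illustrated with a minimal, self-contained sketch. Only the names USE_DUP_FILTER and USE_ALEPH_FILTER come from the text above. The functions dup_filter, aleph_filter and apply_filters, the record layout, and the assumption that the Aleph filter drops records already known to the catalogue are hypothetical and are not the module's actual API.

```python
# Hypothetical stand-ins for the two flags named in the text; in the real
# project they are properties of the settings submodule.
USE_DUP_FILTER = True
USE_ALEPH_FILTER = False


def dup_filter(records, seen):
    """Drop records whose (title, isbn) pair was already seen (assumed behavior)."""
    fresh = []
    for record in records:
        key = (record["title"], record["isbn"])
        if key not in seen:
            seen.add(key)
            fresh.append(record)
    return fresh


def aleph_filter(records, known_isbns):
    """Drop records whose ISBN is in a simulated catalogue (assumed behavior)."""
    return [r for r in records if r["isbn"] not in known_isbns]


def apply_filters(records, seen, known_isbns):
    """Run only the filters that are switched on."""
    if USE_DUP_FILTER:
        records = dup_filter(records, seen)
    if USE_ALEPH_FILTER:
        records = aleph_filter(records, known_isbns)
    return records


# Made-up sample data standing in for scraper output.
scraped = [
    {"title": "Book A", "isbn": "111"},
    {"title": "Book A", "isbn": "111"},
    {"title": "Book B", "isbn": "222"},
]
result = apply_filters(scraped, seen=set(), known_isbns={"222"})
```

With the duplicate filter on and the Aleph filter off, the repeated "Book A" record is dropped, while "Book B" is kept even though its ISBN is in the simulated catalogue.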

Other parts

There are also other, unrelated parts of this module, which are used to configure its behavior or to define representations of the data.


The last submodule is Autoparser, which makes creating new parsers easier.

AMQP connection

AMQP communication is handled by the edeposit.amqp module, specifically by the edeposit_amqp_harvester.py script.

Source code

This project is released as open source (GPL), and the source code can be found on GitHub:


The module is hosted on PyPI and can be easily installed using pip:

sudo pip install edeposit.amqp.harvester


Almost every feature of the project is covered by unit/integration tests. You can run these tests using the provided run_tests.sh script, which can be found in the root of the project.


This script expects pytest to be installed. If you don’t have it yet, it can be easily installed using the following command:

pip install --user pytest

or for all users:

sudo pip install pytest
