Changelog

All notable changes will be listed in this file. This project adheres to Semantic Versioning.

If you want to see a change, we advise you to first read Contributing.

Unreleased

Below is a list of unreleased changes. Pay special attention to anything marked [DEPRECATION], as it will be deprecated and removed in future versions.

  • No current unreleased changes

0.19.2

2022-12-22

Fixed

  • Fixed bug where bots attempting to delete an SQS message would fail on jobs longer than 12 hours due to expired credentials.

0.19.1

2022-12-22

Updated

  • Updated tests for aggregation of data as file items.

0.19.0

2022-11-28

Added

  • Added functionality to allow user to add item with data stored as a file in S3.

0.18.6

2022-11-01

Fixed

  • Fixed a bug with deploying bots that are not defined in the workflow for the stage.

0.18.5

2022-09-21

Fixed

  • Update bot state to idle in the processing logic.

0.18.4

2022-09-15

Fixed

  • Minor bugfix where worker state didn’t revert to idle after finishing processing.

0.18.3

2022-09-13

Fixed

  • Minor bot and worker state updating bug fix.

0.18.2

2022-08-19

Fixed

  • Minor CLI tabulation bug fix.

0.18.1

2022-08-08

Changed

  • Enforce row content wrapping on CLI tabulation.

0.18.0

2022-07-26

Added

  • Added state property to Bot and Worker entities. Indicates whether a Bot/Worker is idle, active, stopped or terminated.

Changed

  • Updated CLI to show Bot state.

  • Updated CLI to show number of active workers out of total.

0.17.0

2022-07-06

Changed

  • Dropped Python 3.6 support.

  • Updated to use click ^8.0 and pre-commit ^2.9.

  • Pinned jinja2 to <3.1 due to incompatibility with sphinx ^2.4.5.

0.16.1

2022-07-05

Fixed

  • Fixed docs deployment issues

0.16.0

Changed

  • Updated all code, tests and docs to use dot notation for entity subclasses (item.data.width) rather than dict notation (item["data"]["width"]). Both versions will continue to work but the dot notation is the recommended way of doing things.

  • Fixed a bug where a missing dependency would raise an error in factories during bot download. Missing dependencies are now ignored at that step and are dealt with during factory build.

  • Changed the exception raised when a file read from S3 fails from QalxError to QalxFileRetrievalError

Added

  • Added a property to Set to allow nicer lookups of items. Rather than having to do myset["items"] you can now do myset.items_

  • Added a property to ItemAddManyEntity to allow nicer lookups of items. Rather than having to do resp["items"] you can now do resp.items_ when using qalx.item.add_many()

  • Added an alias to queue on QalxJob. Now in a step function you can do job.queue rather than job.session.queue.get_by_name(job._bot_entity.config.queue_name)

  • If an error occurs when reading a file from S3, qalx will sleep for 2 seconds before attempting the retrieval again. If the file still fails to download, a QalxFileRetrievalError is raised. The user can optionally disable the retry by doing item.read_file(retry=False)

  • Added extra logging when stopping, resuming or terminating workers. These extra logs will appear in the specific worker log file.
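
The retry-once behaviour described above for S3 file reads can be sketched in plain Python. This is a minimal illustration, not the pyqalx implementation: read_with_retry, FileRetrievalError and the fetch callable are hypothetical names, and the sleep function is injectable only so the example runs instantly.

```python
import time


class FileRetrievalError(Exception):
    """Hypothetical stand-in for QalxFileRetrievalError."""


def read_with_retry(fetch, retry=True, delay=2.0, sleep=time.sleep):
    """Call fetch(); on failure optionally wait `delay` seconds and try once more."""
    try:
        return fetch()
    except Exception as exc:
        if not retry:
            # The caller opted out of retrying, so fail straight away.
            raise FileRetrievalError("file could not be retrieved") from exc
        sleep(delay)  # pause before the single retry
    try:
        return fetch()
    except Exception as exc:
        raise FileRetrievalError("file could not be retrieved after retry") from exc


# Usage: a fetch function that fails once and then succeeds.
attempts = []


def flaky_fetch():
    attempts.append(1)
    if len(attempts) == 1:
        raise IOError("transient failure")
    return b"file contents"


data = read_with_retry(flaky_fetch, sleep=lambda seconds: None)
```

Passing retry=False mirrors the opt-out mentioned in the entry: the first failure is reported immediately instead of after a second attempt.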

0.15.2

2021-12-03

Fixed

  • Upgraded the bson_extra dependency which has pymongo pinned to 3.12.2.

0.15.1

2021-11-09

Fixed

  • Upgraded the cryptography dependency to 35.0.0. The previous version prevented pyqalx from being installed on Windows when using Python >=3.9

0.15.0

2021-07-02

Fixed

  • Fixed a bug where accessing a missing key (i.e. entity['metadata']['MISSING'] or entity.metadata.MISSING) would not raise a KeyError/AttributeError as expected. As a consequence, nested assignments through a missing key no longer work: entity.meta.something.nested = 5 will raise an AttributeError if the something key doesn’t exist

  • Fixed a bug which prevented usage of pyqalx if git wasn’t installed on the system. git is now only required if using a GitBotSource in factories.

  • Fixed a bug when stopping bots or demolishing factories via the CLI if there was more than a single page returned and the user wanted to stop a bot/demolish a factory on a page other than the last page.
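
The missing-key behaviour in the first fix above can be illustrated with a small dictionary subclass. StrictDotDict is a hypothetical class written for this example; it only mimics the described behaviour and is not how pyqalx entities are implemented.

```python
class StrictDotDict(dict):
    """Hypothetical dict with dot access that never creates missing keys."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Wrap nested dicts so chained dot access works on them too.
        for key, value in list(self.items()):
            if isinstance(value, dict) and not isinstance(value, StrictDotDict):
                self[key] = StrictDotDict(value)

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # A missing key surfaces as AttributeError for dot access.
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


entity = StrictDotDict({"metadata": {"width": 3}})
entity.metadata.height = 4  # the parent key exists, so assignment works

try:
    entity.metadata.something.nested = 5  # "something" does not exist
except AttributeError as exc:
    nested_error = exc
```

To assign through a new intermediate key under this scheme, the intermediate key has to be created first, for example entity.metadata.something = StrictDotDict().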

Added

  • WORKER_LOG_FILE_DIR now defaults to appdirs.user_log_dir(APPNAME, APPAUTHOR) to bring it in line with LOG_FILE_PATH

  • Added a new REGION config option. This can be set as part of qalx configure and will allow users to specify which region should be used when dealing with external services.

  • Queue entities now use the REGION defined on the user config for the session that got the queue from the API when building the broker client

  • Added a “debug” section to factories. Initially populated with “demolish_on_failure” which, if set to false, will not demolish any resources should a factory build command error

Changed

  • If there is an issue with creating an item when calling qalx.item.add_many, a logging.ERROR is now logged rather than a logging.WARNING

  • Improved API Docs

  • When installing bots from setup.py/requirements.txt in a factory the build command will now look in the lowest level directory, and work its way up to the bot path root directory. This resolves issues with setup.py/requirements.txt files not being found.

Removed

  • Removed QalxReturnedMultipleError. It was never being raised anywhere. Use QalxMultipleEntityReturned instead

0.14.4

2021-04-28

Fixed

  • Fixed a bug with configs when the values were read from environment variables. In that case, the values were not validated/converted to the correct type. This has now been fixed.
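
As an illustration of the kind of validation and conversion this fix refers to, the sketch below converts string values from environment variables to typed config values. The option names, the EXAMPLE_ prefix and the helper functions are all hypothetical and are not taken from pyqalx.

```python
import os


def parse_bool(raw):
    """Convert common textual booleans; reject anything else."""
    lowered = raw.strip().lower()
    if lowered in ("1", "true", "yes"):
        return True
    if lowered in ("0", "false", "no"):
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Hypothetical option names mapped to the converter for their expected type.
CONVERTERS = {"MAX_RETRIES": int, "VERBOSE": parse_bool, "REGION": str}


def load_from_env(environ=None, prefix="EXAMPLE_"):
    """Read known options from the environment and convert each to its type."""
    environ = os.environ if environ is None else environ
    config = {}
    for name, convert in CONVERTERS.items():
        raw = environ.get(prefix + name)
        if raw is None:
            continue  # option not set in the environment
        try:
            config[name] = convert(raw)
        except ValueError as exc:
            raise ValueError(f"invalid value for {name}: {raw!r}") from exc
    return config


# Environment variables are always strings, so conversion is required.
config = load_from_env({"EXAMPLE_MAX_RETRIES": "3", "EXAMPLE_VERBOSE": "false"})
```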

0.14.3

2021-04-26

Fixed

  • Fixed a bug with remote factories where if there was an error in the start command of the bot, this error wouldn’t be reported to the user. Instead the user would just see 1 None or 1 b''. The error message now correctly gets reported to the user.

0.14.2

2020-11-10

Changed

  • Upgraded the version of dill to 0.3.3 to resolve issues when running on python>=3.8

0.14.1

2020-10-19

Fixed

  • Fixed a bug which prevented a user from using the same alarm on multiple sectors on a factory.

0.14.0

2020-10-08

Added

  • Added a new config option DOWNLOAD_DIR that specifies the default location for download files from qalx

  • Added a new CLI command qalx download that can be used to download files from the CLI directly

  • Added the ability for users to add Cloudwatch alarms to AWS sectors on a factory. See the factory docs for more information

Changed

  • The find method of QalxAdapter now accepts an optional include_session_tags parameter which defaults to True. If it is True, the session tags are added to the query.

  • The get, find_one and reload methods for subclasses of QalxUnpackableAdapter now accept a keyword boolean argument unpack that indicates whether to unpack the child entities or not. This defaults to True for Sets and Blueprints and to False for Groups

  • Replaced ObjectDict with addict. Dot notation works exactly as before and there is no API change for users, but addict is more stable and does not cause threading bugs in the way that ObjectDict does

  • Blueprints can now use dot notation through the whole schema.

  • All tabulated output from factory CLI commands now includes the stage for each factory

  • Updated the CLI docs to include all the CLI commands - not just bots

  • Changed pypi and pyproject.toml/setup.py installs to do the equivalent of pip install --upgrade. This enables a user to have an old version of their dependencies in the environment but guarantee they get upgraded to the latest version based on their bot source.

Removed

  • Removed the UNPACK_SET, UNPACK_GROUP and UNPACK_BLUEPRINT config options. Configuration for packing is now done at each method that returns an entity that can be unpacked (find_one, get, reload)

Fixed

  • Fixed a bug where the get method for an entity with tags would not return the kids of the entity in the case that they did not have matching tags. Now all kids are returned irrespective of their tags.

  • Fixed a bug where if running in multiprocessing the log rotation would fail with a PermissionError

  • Fixed a bug with entities not being instantiated with the correct type due to a threading issue in ObjectDict.

  • Fixed a bug where a large factory would cause an error with CloudFormation due to the size of the template. The generated CloudFormation template can now be up to 460KB (previously 51KB)

  • Fixed a bug whereby dict builtins on QalxEntity were overridden if the key from the API matched the builtin name. To access keys that match the builtin name use key lookup (myset['items']) rather than dot notation lookup (myset.items).
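
The difference between key lookup and dot notation for a key named items can be reproduced with any dictionary subclass that adds attribute access as a fallback. DotDict below is a hypothetical class used only to show the Python behaviour; it is not the addict-based entity class.

```python
class DotDict(dict):
    """Hypothetical dict subclass with dot notation as a fallback."""

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so real dict attributes such as .items are found first and are
        # never shadowed by keys of the same name.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


myset = DotDict({"items": {"first": "item-guid"}, "guid": "set-guid"})

by_key = myset["items"]  # the value stored under the "items" key
by_attr = myset.items    # the built-in dict.items method
by_guid = myset.guid     # no builtin called "guid", so the key is returned
```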

0.13.0

2020-08-12

Changed

  • If there are tags configured on the QalxSession any call to find or find_one will use those tags as part of the query.

  • Removed the retry limit for throttled requests. These now always retry, but on a random increasing time period.

  • Moved aggregation to its own endpoint. Use qalx.<entity>.aggregate([pipeline]) now. Removed the aggregate argument from find
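
Retrying without a limit on "a random increasing time period" is commonly done with exponential backoff plus jitter. The sketch below shows that general technique under assumed parameters (base, cap, the jitter formula); the changelog does not state which scheme pyqalx uses, and Throttled, backoff_delays and call_until_not_throttled are hypothetical names.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical marker for a throttled (rate limited) request."""


def backoff_delays(base=1.0, cap=60.0, rng=random.random):
    """Yield an endless series of randomised wait times with a growing ceiling."""
    attempt = 0
    while True:
        upper = min(cap, base * 2 ** attempt)
        # "Equal jitter": half the ceiling is fixed, the other half is random.
        yield upper / 2 + rng() * upper / 2
        attempt += 1


def call_until_not_throttled(request, sleep=time.sleep, delays=None):
    """Keep calling request() until it stops raising Throttled."""
    delays = backoff_delays() if delays is None else delays
    while True:
        try:
            return request()
        except Throttled:
            sleep(next(delays))


# Usage: the request is throttled twice before it succeeds.
outcomes = iter([Throttled(), Throttled(), "ok"])


def request():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


waits = []  # records the delays instead of really sleeping
result = call_until_not_throttled(
    request, sleep=waits.append, delays=backoff_delays(rng=lambda: 0.5)
)
```

The cap keeps individual waits bounded even though the number of retries is not.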

Fixed

  • Fixed a bug where items added via add_many weren’t getting tagged

  • Improved the speed with which factories warm terminate bots.

0.12.1

2020-07-20

Fixed

  • Fixed a bug with factories where a Throttling error could be raised for remote AWS factory sectors.

  • Fixed a bug with factories where remote sectors would fail due to not having the log path created.

0.12

2020-07-16

Added

  • Added a delete method to QalxItem, QalxFactory and QalxQueue adapters. This is syntactic sugar for self.session.rest_api.delete("/<entity>/<guid>")

  • Added a default class attribute on Config that is used for all definitions of “default” within a Config context

  • Rate limiting is now in place on the API. If a 429 response is received, pyqalx will sleep for 5 seconds before trying again, up to a maximum of two attempts. A DEBUG log is written whenever a 429 response is encountered

  • QalxJob now has a context dict attribute that can be used to store intermediate data between step functions

  • object type is now fully respected. All data given to and received from pyqalx will maintain the same type. See searching using mongo for more information. Note: Any existing data stored will maintain the original type.

  • All default config options in Config and its subclasses are now parsed and validated against the expected type for each value, when the config is loaded from a file

  • It is now possible to query for subsets of fields using dot notation

  • It is now possible to return a subset of fields for child instances (i.e. items on a set, and sets and items on a group).

  • Added purge method on Queue. This deletes all messages from the queue.

  • Added a generate-encryption cli command that handles generating a new encryption key file.

  • File data is now optionally encrypted, before uploading, if the KEYFILE parameter is defined in the UserConfig. KEYFILE should be a path to a compatible encryption key file that can be used for encryption

  • File data can be decrypted when reading from qalx, by passing a key_file argument to Item.read_file or Item.save_file_to_disk. key_file should be the path to the key file that was used for the encryption of that file

  • Added a QalxNotification adapter that can be used to send email notifications. It accepts a subject, message, recipient list and optional cc and bcc lists

  • Added an AggregationResult entity. This is the type returned from the find method of all adapters when the optional aggregate argument is provided. aggregate is a mongodb aggregation pipeline definition that can be used to perform aggregation
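
To make "subsets of fields using dot notation" concrete, the sketch below applies dotted paths such as data.width to a plain dictionary. It is only a local illustration of the idea: select_fields is a hypothetical helper, and how pyqalx passes field names to the API is not shown in this changelog.

```python
_MISSING = object()


def _lookup(document, keys):
    """Follow keys through nested dicts; return _MISSING if any step fails."""
    current = document
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def select_fields(document, fields):
    """Return a new dict holding only the dotted paths listed in fields."""
    result = {}
    for path in fields:
        keys = path.split(".")
        value = _lookup(document, keys)
        if value is _MISSING:
            continue  # skip paths the document does not contain
        target = result
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return result


item = {
    "guid": "abc",
    "data": {"width": 3, "height": 4},
    "meta": {"owner": "someone"},
}
subset = select_fields(item, ["guid", "data.width", "data.depth"])
```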

Changed

  • The input_file argument is now called source when adding and saving Items and Factories

  • Entities should now be submitted to queues via the add_to_queue class method on Item, Set and Group.

  • Packable entities (QalxSet, QalxGroup, QalxBlueprint and QalxBot) are now returned in an unpacked state after add is called

  • Item#add_many now uploads files in parallel rather than sequentially. This should provide a significant performance improvement.

  • If a single file fails to upload an exception will not be thrown. Instead this failure is logged to the errors response. The item will still have been created - but the file will not have been uploaded. Previously, if a single file failed to upload all items were still created, but any subsequent files would not have attempted upload.

  • The external IP address is now available when doing QalxSession()._host_info. This means that bots will now have the IP address stored on them when creating

  • The key for specifying a qalx user profile has been changed from qalx_profile to user_profile for consistency

  • Requests to the API now share connections which should lead to a significant performance improvement.

  • Changed the default path for LOG_FILE_PATH to be a system specific directory.

  • CLI now supports displaying more than 25 entities with pagination.

  • All occurrences of logger and _logger as arguments, variables or attributes have been renamed to log and _log respectively

  • QalxAPIResponseError for QalxAdapter is now raised from a static method for consistent formatting of the error message

  • Workers now get returned with bots - this removes the need to do qalx.bot.find() followed by qalx.bot.reload(bot) for each bot to get the workers

  • All entities now include the last modified datetime and the last user who modified the entity in the info key - all existing entities have a modified timestamp the same as the created and a modified user the same as the creating user

  • Exceptions that are subclasses of QalxError are no longer displayed as a full stacktrace in the CLI

  • Step results on QalxJob are now stored in an OrderedDict object in the form of <step>: <success_flag>

  • QalxJob.add_step_result is now only used to flag success or failure of a step function and not to add data to be used for other step functions. Sharing data between step functions can be achieved with the new context attribute

  • The configure cli command now also handles creating an encryption key file or specifying an existing one.
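
The split between step results (success flags only) and the context attribute (shared data) can be pictured with a simplified stand-in for a job. ExampleJob, its add_step_result signature and the step names load_data and summarise are hypothetical; they are not the QalxJob API.

```python
from collections import OrderedDict


class ExampleJob:
    """Hypothetical, heavily simplified stand-in for QalxJob."""

    def __init__(self):
        self.context = {}                  # data shared between steps
        self.step_results = OrderedDict()  # <step>: <success_flag>

    def add_step_result(self, step_name, success=True):
        # Only records whether the step succeeded, never its data.
        self.step_results[step_name] = success


def load_data(job):
    job.context["rows"] = [1, 2, 3]
    job.add_step_result("load_data", True)


def summarise(job):
    rows = job.context.get("rows", [])
    job.context["total"] = sum(rows)
    job.add_step_result("summarise", bool(rows))


job = ExampleJob()
for step in (load_data, summarise):
    step(job)
```

Because the results are kept in an OrderedDict, they can be read back in the order the steps ran.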

Removed

  • Removed the add_file method of QalxItem. File items can be added using the add method

  • Removed the add_queue argument from bot.start. Bots will always add a queue if it does not exist

  • Removed the internal __str__ method from QalxEntity and QalxListEntity

  • Removed status_dict as an argument from Job.publish_status

  • Removed status_str as an argument from Job.publish_status

  • Some fields on a worker have been removed - these are info, meta

  • It is no longer possible to filter specific workers on a bot via the API.

Fixed

  • Fixed a bug that occurred when an item was pushed onto a queue and then deleted manually before the message was processed. This is now handled: the error is logged for the user and the message is taken off the queue

  • Fixed a bug in delete_message method of QalxJob, where the wrong worker attribute was used to delete a message from the queue

  • Fixed a bug with f-strings missing the f token at the beginning of the string

  • Fixed a bug where a race condition between concurrent bots could cause the worker logging directory creation to error

  • Fixed unhandled exception where a user could provide a file path instead of a directory for the worker logging directory. This exception is now caught and a more meaningful exception is raised

  • Fixed a bug where a bot which didn’t need to be installed would cause a factory to not be built.

  • Fixed a bug whereby a file uploaded as a stream would be downloaded as an empty file.

  • Fixed a bug where a Group that has the same Set on it multiple times would not have the items unpacked properly.
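
The two worker logging directory fixes above concern a creation race and a file path given where a directory was expected. One common way to handle both with the standard library is sketched below; ensure_log_dir and LogDirectoryError are hypothetical names, and this is not necessarily how pyqalx does it.

```python
import os
import tempfile


class LogDirectoryError(Exception):
    """Hypothetical error type used for this illustration."""


def ensure_log_dir(path):
    """Create path if needed; tolerate it already existing as a directory."""
    try:
        # exist_ok=True makes the call safe when another process creates
        # the same directory at the same moment.
        os.makedirs(path, exist_ok=True)
    except OSError as exc:
        # Raised, for example, when path exists but is a regular file.
        raise LogDirectoryError(f"cannot use {path!r} as a log directory") from exc
    return path


with tempfile.TemporaryDirectory() as root:
    log_dir = os.path.join(root, "workers", "logs")
    ensure_log_dir(log_dir)
    ensure_log_dir(log_dir)  # a second caller finds the directory already there
    created = os.path.isdir(log_dir)

    file_path = os.path.join(root, "not_a_dir.log")
    open(file_path, "w").close()
    try:
        ensure_log_dir(file_path)
        rejected = False
    except LogDirectoryError:
        rejected = True
```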

0.11

2020-06-29

Added

  • Added factories. See factories for more info

  • API methods now follow the form qalx <noun>-<verb>, e.g. qalx bot-start, qalx bot-terminate

Fixed

  • Fixed a bug preventing the same file being uploaded to multiple items when add_many was called

0.10

2020-03-02

Added

  • Added a validate() method to Blueprint that allows an entity to be validated against the blueprint schema.

  • Added an UNPACK_BLUEPRINT config option. This is true by default and is used to allow a user to specify whether they want to unpack the Item blueprints that are nested in a Set blueprint

Changed

  • Move all initialisation arguments for Bot to the start() method. This will allow easier sharing of bots as it is up to the implementer to worry about implementation details (such as the queue the bot is looking at)

  • The bot-start cli command now takes extra arguments which get passed to start. Run qalx bot-start --help to view further information

  • Modified the way that QalxBlueprint retrieves Set blueprints. These are now unpacked based on the value of UNPACK_BLUEPRINT at the point of retrieval. This brings them more into line with how a Set works.

  • Moved query to be the first argument for both find and find_one.

  • Made it clearer that query is a required argument for find_one and an optional argument for find

  • Bot names no longer have to be unique

Removed

  • Removed the UNPACK_BOT config option. Workers from bots are now always unpacked

Fixed

  • Fixed issues with docs regarding find_one and find. They now correctly use mongo syntax

  • Fixed bug with bot-terminate CLI command - this wasn’t correctly finding the bot.

  • Fixed bug with qalx bot-start --stop which was causing the bot to exit early.

0.9

2020-01-07

Hotfixes

  • 0.9.1 (2020-01-13) - Fixed a bug that occurred when saving a packable entity (Set, Group, Bot) when UNPACK_* was set to False.

Added

  • Added a --version command to the cli. qalx --version will now give the current version number.

  • A QalxSession running using a BotConfig now attempts to create the WORKER_LOG_FILE_DIR if it doesn’t exist - raising a QalxError if it is unable to.

  • Added details in the docs to promote use of meta._class as a system of classification.

Fixed

  • Fixed unhandled exception with sending non-JSON serializable data to find when doing a mongo query.

  • Fixed an unhandled exception with the save method on QalxWorker

  • Fixed error where the put_url was not removed when the file was changed in QalxFileAdapter

  • Fixed an unhandled exception with the reload method on QalxWorker

  • Fixed error raised when file with no data is read in the Item.read_file() method

  • Fixed unhandled exception when saving a bot that had workers

  • Fixed an unhandled exception that occurred if a user tried to save a Set with a blueprint but no metadata

  • Fixed unhandled exceptions in QalxAdapter subclasses when the user tried to save an entity without all the required fields

Removed

  • Remove the ability to query via RQL syntax. Use mongo syntax from now on.

0.8

2019-11-26

Changed

  • Added the status argument to Job.publish_status for consistency with Bot status updates. This will replace status_str and should be used from now on.

  • Modified QalxWorker.update_status method to ensure that the bot_entity is provided as a keyword argument

Fixed

  • Ensure that we pass a user session through to the initialisation step function - not a bot session.

  • The begin step function now takes a job as its only argument instead of a session

0.7

2019-10-09

Added

  • Added a new command line argument qalx configure to make it easier for users to initially set up qalx

  • Tags now get automatically validated against the API when being added to a session.

  • pyqalx now automatically adds the queue_name to the config of a bot. This will make it easier for users to find bots running against a specific queue.

  • Users can now use dot notation to navigate through qalx entities rather than having to use dictionary lookups

  • Added a performance improvement when starting bots. Previously qalx did a request for each Worker that was started to get the queue. Now the queue is obtained before the workers are created meaning that only a single query is made for all Workers

  • Added documentation for Permissions

  • Allow users to query via a mongodb query.

Changed

  • Updated the Bots documentation to make it clear that a user needs to manually delete a message if there is a chance it can be processing for more than 12 hours

Fixed

  • Fixed a broken link on Quickstart

  • Fixed a potential bug which could result in a worker crashing due to a message being deleted and having its heartbeat extended at the same time.