Entities

The qalx platform is formed of a number of core entities. Their purpose and relationships are described below.

Note

These examples use attribute-style lookup (e.g. item.data.height) to get and set values on an entity. You can use key lookup instead if you prefer (e.g. item['data']['height']).

Warning

All entities are subclasses of dict, so certain built-in dict methods (such as items) can clash with attribute lookup. In that case, use a key lookup or a slightly modified attribute lookup.

>>> from pyqalx import Item, Set

>>> item = Item({"data": {"some": "data"}})
>>> myset = Set({"items": {"item1": item}})
>>> # This won't return the items on the set; it returns the built-in `dict.items` method instead
>>> print(myset.items)
<built-in method items of Set object at 0x7fab8f731d00>

>>> # But this will return the items on a set
>>> myset['items']
{'item1': {'data': {'some': 'data'}}}

>>> # We've added a helper, `items_`, to allow you to do an attribute lookup for items.
>>> # It is functionally equivalent to doing `myset["items"]`
>>> myset.items_
{'item1': {'data': {'some': 'data'}}}
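
The clash comes from how Python resolves attributes rather than from anything specific to qalx. The sketch below uses a hypothetical AttrDict class (not part of pyqalx, and not necessarily how pyqalx implements its entities) to show why a key named items is shadowed by the built-in method, while key lookup still reaches the stored value.

```python
class AttrDict(dict):
    """Hypothetical dict subclass with attribute-style lookup (illustration only)."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so built-in dict methods such as `items` always take precedence.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


entity = AttrDict({"items": {"item1": 1}, "height": 5})

print(entity.height)           # no clash: resolved through __getattr__
print(callable(entity.items))  # the built-in dict.items method, not the key
print(entity["items"])         # key lookup always reaches the stored value
```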

qalx maintains the type of all data passed to it. This means you don’t have to worry about casting your data to specific types - it’s all handled for you.

from pyqalx import QalxSession
from datetime import datetime

qalx = QalxSession()
time_run = datetime(2020, 6, 25, 12, 53, 23)
item = qalx.item.add(data={"time_run": time_run})

assert item.data.time_run == time_run

ITEM

Adding

Items are the core of qalx. An item is structured data, a file, or some combination of the two. For example:

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> dims = qalx.item.add(data={"height":5, "width":5}, meta={"shape":"square"})
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
...                       data={"rho":8000, "E":193e9},
...                       meta={"library":"materials", "family":"steel", "grade":"316"})

If you want to add multiple items in a single request, you can do that too. qalx validates each item before adding it to ensure that it has the correct structure. Any errors encountered after validation (e.g. because a database constraint was violated) are logged for you to investigate further.

Note

Although the qalx entities are created in a single request, be aware that files are uploaded one at a time after the entities are created.

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> #  Items take the same structure as if you were adding them individually
>>> items = [
...         {"data": {"height": 5, "width": 5}, "meta": {"shape": "square"}},
...         {"data": {"rho": 8000, "E": 193e9},
...          "input_file": "path/to/316_datasheet.pdf",
...          "meta": {"library":"materials", "family":"steel", "grade":"316"}},
...    ]
>>> qalx.item.add_many(items=items)
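
When many similar items are needed, the list passed to add_many can be built programmatically. This is a minimal sketch: build_square_items is a hypothetical helper, not part of pyqalx, and the final pyqalx call is left as a comment because it needs a configured session.

```python
def build_square_items(sizes):
    """Return one item dict per size, using the same structure as a single add."""
    return [
        {"data": {"height": size, "width": size}, "meta": {"shape": "square"}}
        for size in sizes
    ]


items = build_square_items(range(2, 5))
print(len(items))
# qalx.item.add_many(items=items)  # requires a configured QalxSession
```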

Storing data as file

You might have a use case where individual items have a large amount of data. qalx allows you to add these as pseudo file items by using the as_file flag, with the data being stored as a file in AWS S3:

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> large_item = qalx.item.add(
...     data={'lots': 'of', 'data': 'here'},
...     meta={'_class': 'large.item'},
...     as_file=True
... )

The principal benefit of doing this is that it allows you to save items with a large amount of data without incurring the cost of storing that data in the qalx database. In addition, once an item has been saved in this mode, the other adapter methods such as save, find and get work on it in the same way as for a normal item: qalx handles saving to and loading from the file.

Adding an item with the as_file flag does have some constraints:

  • Because the data is saved as a file, you cannot execute queries to match, filter etc. on anything stored in the data dictionary. Therefore, the meta must be sensibly populated to allow you to query on its content.

  • Multiple items cannot be added this way using add_many.

  • Executing find, find_one or get methods is slightly slower due to the fact that qalx has to load the data from file.
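
One way to decide whether as_file is worth these trade-offs is to estimate the serialised size of the data first. The sketch below rests on assumptions: the helper should_store_as_file and the 1 MB threshold are hypothetical and not part of pyqalx, and JSON is used only as a rough size estimate.

```python
import json

# Hypothetical threshold; pick one that suits your own data and costs.
SIZE_THRESHOLD_BYTES = 1_000_000


def should_store_as_file(data, threshold=SIZE_THRESHOLD_BYTES):
    """Estimate the size of `data` as JSON and compare it to a threshold."""
    # default=str keeps the estimate working for values such as datetimes.
    size = len(json.dumps(data, default=str).encode("utf-8"))
    return size > threshold


small = {"height": 5, "width": 5}
large = {"samples": list(range(500_000))}

print(should_store_as_file(small))
print(should_store_as_file(large))
```

The result could then be passed as the as_file argument when adding a single item, keeping in mind that the meta still needs to carry everything you want to query on.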

Finding

We can then use the find_one and find methods to search for items:

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> steel_316_item = qalx.item.find_one(query={"metadata.data.library": "materials",
...                                            "metadata.data.family": "steel",
...                                            "metadata.data.grade": "316"})
>>> steels = qalx.item.find(query={"metadata.data.family": "steel"})
>>> squares = qalx.item.find(query={"metadata.data.shape": "square"})
>>> quads = qalx.item.find(query={"$or": [{"metadata.data.shape": "square"},
...                                       {"metadata.data.shape": "rectangle"}]})
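
The queries above repeat the metadata.data. prefix for every meta field. As a convenience, the query dictionaries can be built with small helpers. This is a sketch only: meta_query and any_of are hypothetical names, not pyqalx functions, and they only construct the plain dicts shown above.

```python
def meta_query(**fields):
    """Prefix each keyword with `metadata.data.`, as in the queries above."""
    return {f"metadata.data.{key}": value for key, value in fields.items()}


def any_of(*queries):
    """Combine several query dicts with the `$or` operator."""
    return {"$or": list(queries)}


steel_query = meta_query(family="steel")
quads_query = any_of(meta_query(shape="square"), meta_query(shape="rectangle"))

print(steel_query)
print(quads_query)
# steels = qalx.item.find(query=steel_query)  # requires a configured QalxSession
```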

We can edit an item once we have retrieved it and save it back to qalx:

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> my_shape = qalx.item.find_one(query={"data.height": 5, "data.width": 5})
>>> my_shape.data.height = 10
>>> my_shape.meta.shape = 'rectangle'
>>> qalx.item.save(my_shape)
>>> # If we think that someone else might have edited my_shape we can reload it:
>>> my_shape = qalx.item.reload(my_shape)

SET

A set is simply a collection of items. The items are mapped to keys so that you can retrieve specific items from the set later.

Adding

>>> from pyqalx import QalxSession
>>>
>>> qalx = QalxSession()
>>> dims = qalx.item.add(data={"height":5, "width":5}, meta={"shape":"square"})
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
...                       data={"rho":8000, "E":193e9},
...                       meta={"library":"materials", "family":"steel", "grade":"316"})
>>> steel_square_set = qalx.set.add(items={"shape":dims, "material":steel}, meta={"profile":"square_steel"})

Finding

As with items we can then use the find_one and find methods to search for sets and easily get the item data:

>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> my_set = qalx.set.find_one(query={"metadata.data.profile": "square_steel"})
>>> youngs_mod = my_set.get_item_data("material").E

Note

Sets store a reference to an item rather than a copy of the item. Changes to the item will be reflected in all sets containing that item.
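
The difference between a reference and a copy can be illustrated with plain Python objects. This is only an analogy using in-memory dicts, not pyqalx code: in qalx the shared item lives on the platform, and how quickly a saved change shows up in an already-loaded set is not covered here.

```python
import copy

item = {"data": {"E": 193e9}}

set_a = {"material": item}                    # holds a reference to the item
set_b = {"material": item}                    # holds the same reference
snapshot = {"material": copy.deepcopy(item)}  # holds an independent copy

# Changing the item is visible through every reference, but not the copy.
item["data"]["E"] = 200e9

print(set_a["material"]["data"]["E"])
print(set_b["material"]["data"]["E"])
print(snapshot["material"]["data"]["E"])
```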

GROUP

A group is a collection of sets. These are useful for sending a lot of sets to a queue and being able to track when they have all been processed.

Adding

>>> from pyqalx import QalxSession
>>>
>>> qalx = QalxSession()
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
...                       data={"rho":8000, "E":193e9},
...                       meta={"library":"materials", "family":"steel", "grade":"316"})
>>> steel_squares = {}
>>> for size in range(2, 500):
...     dims = qalx.item.add(data={"height":size, "width":size}, meta={"shape":"square"})
...     steel_square_set = qalx.set.add(items={"shape":dims, "material":steel}, meta={"profile":"square_steel"})
...     steel_squares[size] = steel_square_set
>>> all_squares = qalx.group.add(sets=steel_squares)

CUSTOM ENTITIES

qalx allows you to register your own subclasses of an entity with an instance of QalxSession or Bot. This enables you to reuse logic easily throughout qalx, even in your bot's step functions:

>>> from pyqalx import QalxSession, Item
>>>
>>> class AreaItem(Item):
...     """An item that has a custom method on it"""
...     def area(self):
...         return self.data.width * self.data.height
>>>
>>> qalx = QalxSession()
>>> assert qalx.item.entity_class == Item # We haven't registered our custom entity yet
>>>
>>> qalx.register(AreaItem)
>>>
>>> assert qalx.item.entity_class == AreaItem # We have registered our custom entity

Doing the above means that the area method will now be available whenever qalx returns an Item:

>>> item = qalx.item.add(data={'width': 50, 'height': 20})
>>> assert item.area() == 1000

If you want to use your custom entity in a Bot, you need to register the entity class with the Bot directly. Your custom entity class will then be available in any step function that exposes an entity:

>>> from pyqalx import Bot
>>>
>>> bot = Bot(bot_name='area-bot')
>>>
>>> @bot.process
... def process_func(job):
...     job.e.data.area = job.e.area() # The `area` method is available on the entity
...     job.save_entity()
>>> bot.start(queue_name='area-queue', entity_classes=[AreaItem, ])