Entities¶
The qalx platform is formed of a number of core entities. Their purpose and relationships are described below.
Note
These examples use attribute-style lookup (e.g. item.data.height) for getting and setting values on an entity. You can use key lookup instead if you prefer (e.g. item['data']['height']).
Warning
All entities are subclasses of dict, so certain built-in methods can clash with attribute lookups. When that happens you will have to use a key lookup or a slightly modified attribute lookup.
>>> from pyqalx import Item, Set
>>> item = Item({"data": {"some": "data"}})
>>> myset = Set({"items": {"item1": item}})
>>> # This won't return the items on the set - instead returning the same as `dict().items`
>>> print(myset.items)
<built-in method items of Set object at 0x7fab8f731d00>
>>> # But this will return the items on a set
>>> myset['items']
{'item1': {'data': {'some': 'data'}}}
>>> # We've added a helper so that you can still do an attribute lookup for items.
>>> # It is functionally equivalent to doing `myset["items"]`
>>> myset.items_
{'item1': {'data': {'some': 'data'}}}
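The clash can be reproduced without qalx. The sketch below uses a hypothetical `AttrDict` class (not part of pyqalx) that falls back to key lookup only when normal attribute lookup fails, which is one plausible reason why `items` resolves to the built-in method. It is a minimal illustration under that assumption, not the pyqalx implementation.

```python
class AttrDict(dict):
    """Hypothetical dict subclass with attribute-style access (not pyqalx code)."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so real dict methods such as `items` always win over stored keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


record = AttrDict({"height": 5, "items": {"item1": "first"}})

height = record.height    # no clash: falls through to the "height" key
clashing = record.items   # clash: this is the bound dict.items method
by_key = record["items"]  # key lookup always reaches the stored value
```

Any key that shares a name with a dict method (`items`, `keys`, `values`, `update`, and so on) behaves the same way, so key lookup is the safe choice for those names.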
qalx maintains the type of all data passed to it. This means you don’t have to worry about casting your data to specific types - it’s all handled for you.
from pyqalx import QalxSession
from datetime import datetime
qalx = QalxSession()
time_run = datetime(2020, 6, 25, 12, 53, 23)
item = qalx.item.add(data={"time_run": time_run})
assert item.data.time_run == time_run
ITEM¶
Adding¶
An item is the core of qalx. It is structured data, a file, or some combination of the two. For example:
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> dims = qalx.item.add(data={"height":5, "width":5}, meta={"shape":"square"})
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
... data={"rho":8000, "E":193e9},
... meta={"library":"materials", "family":"steel", "grade":"316"})
If you want to add multiple items in a single request you can also do that. qalx will validate each item before adding it to ensure that it has the correct structure. Any errors encountered after validation (e.g. because a database constraint was encountered) will be logged for you to investigate further.
Note
Although the qalx entities are created in a single request, be aware that files are uploaded one at a time after the entities are created.
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> # Items take the same structure as if you were adding them individually
>>> items = [
... {"data": {"height": 5, "width": 5}, "meta": {"shape": "square"}},
... {"data": {"rho": 8000, "E": 193e9},
... "input_file": "path/to/316_datasheet.pdf",
... "meta": {"library":"materials", "family":"steel", "grade":"316"}},
... ]
>>> qalx.item.add_many(items=items)
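Before sending a large batch it can be useful to catch obviously malformed items locally. The helper below, `check_items`, is hypothetical and not part of pyqalx, and its rules are assumptions inferred only from the examples in this section (an item is a dict with `data` and/or `input_file`, and `data` and `meta` are dicts). The validation that qalx itself performs may differ.

```python
def check_items(items):
    """Return a list of (index, message) pairs for items that look malformed.

    The rules here are assumptions for illustration, not qalx's own validation.
    """
    problems = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append((index, "item must be a dict"))
            continue
        if "data" not in item and "input_file" not in item:
            problems.append((index, "item needs 'data' or 'input_file'"))
        for key in ("data", "meta"):
            if key in item and not isinstance(item[key], dict):
                problems.append((index, f"'{key}' must be a dict"))
    return problems


items = [
    {"data": {"height": 5, "width": 5}, "meta": {"shape": "square"}},
    {"meta": {"shape": "circle"}},  # neither data nor input_file
    {"data": [1, 2, 3]},            # data is not a dict
]
problems = check_items(items)
```

An empty `problems` list means the batch passed these local checks and could then be handed to `add_many`.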
Storing data as file¶
You might have a use case where individual items have a large amount of data. qalx allows you to add these as pseudo file items by using the as_file flag, with the data being stored as a file in AWS S3:
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> large_item = qalx.item.add(
... data={'lots': 'of', 'data': 'here'},
... meta={'_class': 'large.item'},
... as_file=True
... )
The principal benefit of doing this is that it allows you to save items with a large amount of data without incurring the costs of storing the data in the qalx database. In addition, once an item has been saved in this mode, the other adapter methods such as save, find and get work on it exactly as they do for a normal item - qalx will handle the saving to and loading from file.
Adding an item with the as_file flag does have some constraints:
- Because the data is saved as a file, you cannot execute queries to match, filter etc. on anything stored in the data dictionary. Therefore, the meta must be sensibly populated to allow you to query on its content.
- Multiple items cannot be added this way using add_many.
- Executing the find, find_one or get methods is slightly slower because qalx has to load the data from file.
Finding¶
We can then use the find_one and find methods to search for items:
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> steel_316_item = qalx.item.find_one(query={"metadata.data.library": "materials",
... "metadata.data.family": "steel",
... "metadata.data.grade": "316"})
>>> steels = qalx.item.find(query={"metadata.data.family": "steel"})
>>> squares = qalx.item.find(query={"metadata.data.shape": "square"})
>>> quads = qalx.item.find(query={"$or": [{"metadata.data.shape": "square"},
... {"metadata.data.shape": "rectangle"}]})
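To make the query semantics concrete, here is a small pure-Python sketch of how dotted keys and `$or` can be evaluated against a nested document. The `get_path` and `matches` helpers are hypothetical and not part of pyqalx. The sample documents are nested purely to mirror the dotted keys used in the queries above, and the way qalx actually stores and evaluates queries is not shown in this document.

```python
def get_path(document, dotted_key):
    """Follow a dotted key such as "metadata.data.shape" through nested dicts."""
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for key, expected in query.items():
        if key == "$or":
            # At least one of the alternative sub-queries must match.
            if not any(matches(document, sub) for sub in expected):
                return False
        elif get_path(document, key) != expected:
            return False
    return True


# Illustrative documents only; not real qalx entities.
steel_square = {"metadata": {"data": {"family": "steel", "shape": "square"}}}
timber_rect = {"metadata": {"data": {"family": "timber", "shape": "rectangle"}}}
steel_circle = {"metadata": {"data": {"family": "steel", "shape": "circle"}}}
docs = [steel_square, timber_rect, steel_circle]

quads_query = {"$or": [{"metadata.data.shape": "square"},
                       {"metadata.data.shape": "rectangle"}]}

steels = [d for d in docs if matches(d, {"metadata.data.family": "steel"})]
quads = [d for d in docs if matches(d, quads_query)]
```

Conditions listed side by side in one query dict must all hold, while `$or` accepts a document as soon as any one of its sub-queries holds.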
We can edit an item once we have retrieved it and save it back to qalx:
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> my_shape = qalx.item.find_one(query={"data.height": 5, "data.width": 5})
>>> my_shape.data.height = 10
>>> my_shape.meta.shape = 'rectangle'
>>> qalx.item.save(my_shape)
>>> # If we think that someone else might have edited my_shape we can reload it:
>>> my_shape = qalx.item.reload(my_shape)
SET¶
A set is simply a collection of items. The items are mapped to keys so that you can retrieve specific items from the set later.
Adding¶
>>> from pyqalx import QalxSession
>>>
>>> qalx = QalxSession()
>>> dims = qalx.item.add(data={"height":5, "width":5}, meta={"shape":"square"})
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
... data={"rho":8000, "E":193e9},
... meta={"library":"materials", "family":"steel", "grade":"316"})
>>> steel_square_set = qalx.set.add(items={"shape":dims, "material":steel}, meta={"profile":"square_steel"})
Finding¶
As with items, we can then use the find_one and find methods to search for sets and easily get the item data:
>>> from pyqalx import QalxSession
>>> qalx = QalxSession()
>>> my_set = qalx.set.find_one(meta="profile=square_steel")
>>> youngs_mod = my_set.get_item_data("material").E
Note
Sets store a reference to an item rather than a copy of the item. Changes to the item will be reflected in all sets containing that item.
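The difference between holding a reference and holding a copy can be illustrated with plain Python dicts. This is an analogy only: the dicts below stand in for a qalx item and two sets, and nothing here calls pyqalx or shows how qalx resolves references.

```python
# Plain dicts stand in for a qalx item and two sets (illustration only).
steel = {"data": {"rho": 8000, "E": 193e9}}

# Both sets hold a reference to the same item object, not a copy.
square_set = {"items": {"material": steel}}
beam_set = {"items": {"material": steel}}

# Updating the item once is visible through every set that references it.
steel["data"]["rho"] = 7990

rho_via_square = square_set["items"]["material"]["data"]["rho"]
rho_via_beam = beam_set["items"]["material"]["data"]["rho"]
```

If the sets had stored copies instead, each copy would need to be updated separately and the two sets could drift apart.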
GROUP¶
A group is a collection of sets. These are useful for sending a lot of sets to a queue and being able to track when they have all been processed.
Adding¶
>>> from pyqalx import QalxSession
>>>
>>> qalx = QalxSession()
>>> steel = qalx.item.add(input_file="path/to/316_datasheet.pdf",
... data={"rho":8000, "E":193e9},
... meta={"library":"materials", "family":"steel", "grade":"316"})
>>> steel_squares = {}
>>> for size in range(2, 500):
...     dims = qalx.item.add(data={"height":size, "width":size}, meta={"shape":"square"})
...     steel_square_set = qalx.set.add(items={"shape":dims, "material":steel}, meta={"profile":"square_steel"})
...     steel_squares[size] = steel_square_set
>>> all_squares = qalx.group.add(sets=steel_squares)
CUSTOM ENTITIES¶
qalx provides the ability for you to register your own subclasses of an entity with an instance of QalxSession or Bot. This will enable you to reuse logic easily throughout qalx, even in your bot's step functions:
>>> from pyqalx import QalxSession, Item
>>>
>>> class AreaItem(Item):
...     """An item that has a custom method on it"""
...     def area(self):
...         return self.data.width * self.data.height
>>>
>>> qalx = QalxSession()
>>> assert qalx.item.entity_class == Item # We haven't registered our custom entity yet
>>>
>>> qalx.register(AreaItem)
>>>
>>> assert qalx.item.entity_class == AreaItem # We have registered our custom entity
Doing the above means that the area method will now be available whenever qalx returns an Item:
>>> item = qalx.item.add(data={'width': 50, 'height': 20})
>>> assert item.area() == 1000
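One way to picture registration is as a lookup table from entity type to the class used to wrap returned data. The sketch below is a guess at the general pattern, written without pyqalx: the `Registry` class, its `build` method and the `entity_type` attribute are all hypothetical names, and the `Item` here is a local stand-in rather than `pyqalx.Item`.

```python
class Item(dict):
    """Stand-in for an entity class (hypothetical, not pyqalx.Item)."""
    entity_type = "item"


class AreaItem(Item):
    """Custom entity that adds a method on top of the stand-in Item."""

    def area(self):
        return self["data"]["width"] * self["data"]["height"]


class Registry:
    """Hypothetical registry mapping an entity type to the class that wraps it."""

    def __init__(self):
        self._classes = {"item": Item}

    def register(self, cls):
        # Subclasses inherit `entity_type`, so AreaItem replaces Item here.
        self._classes[cls.entity_type] = cls

    def build(self, entity_type, payload):
        # Every entity of this type is wrapped in the registered class.
        return self._classes[entity_type](payload)


registry = Registry()
plain = registry.build("item", {"data": {"width": 50, "height": 20}})

registry.register(AreaItem)
item = registry.build("item", {"data": {"width": 50, "height": 20}})
area = item.area()
```

Under this pattern, entities built before registration keep the default class, which is why registering the custom class early in a script is the safer habit.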
If you want to use your custom entity in a Bot, you need to register the entity class with the Bot directly. Your custom entity class will then be available in any step function that exposes an entity:
>>> from pyqalx import Bot
>>>
>>> bot = Bot(bot_name='area-bot')
>>>
>>> @bot.process
... def process_func(job):
...     job.e.data.area = job.e.area()  # The `area` method is available on the entity
...     job.save_entity()
>>> bot.start(queue_name='area-queue', entity_classes=[AreaItem, ])