Lookup submision IDs by strategic metadata#112
Conversation
Add rust ProducerClient.lookup_submission_ids_by_strategic_metadata
Add test for empty query
Add ORDER BY clause + PR cleanup
3b0ca52 to
6f9e1e3
Compare
|
Open question shouldn't an empty |
It depends on the API contract we want to provide. An argument in favour of throwing an error is that if a caller is using an API to "search for submissions matching some strategic metadata" but they haven't provided any strategic metadata, then they possibly have an issue in their code that they should be alerted to. Also, the SQLite query will need a special case to handle this, more code for something we don't have evidence for that it is a usecase we want to support. |
ReinierMaas
left a comment
There was a problem hiding this comment.
I have a few comment and would like to point out that for the other queries we also assert the query plan. This allows us to spot potentially very bad execution behaviour.
| Raises: | ||
| - `LookupIdsWithEmptyStrategicMetadataError` if the provided | ||
| 'strategic_metadata' contained no key-value pairs to look for. |
There was a problem hiding this comment.
Can this potentially also raise:
- `InternalProducerClientError` if there is a low-level internal error.
| def lookup_submission_ids_by_strategic_metadata( | ||
| self, strategic_metadata: dict[str, int] | ||
| ) -> list[SubmissionId]: |
There was a problem hiding this comment.
I checked the other APIs on the producer and they either provide a single element or an iterator so that it can be lazily evaluated and doesn't need to be materialized in memory all at once.
| async fn lookup_submission_ids_by_strategic_metadata( | ||
| State(state): State<ServerState>, | ||
| extract::Json(strategic_metadata): extract::Json<StrategicMetadataMap>, | ||
| ) -> Result<Json<Vec<SubmissionId>>, ServerError> { |
There was a problem hiding this comment.
Our SubmissionIds are close to u64::max in serialized size, which is reasonable for our random (snowflake) identifiers. This issues the following serialized to JSON response size:
- 1 (
[) + elements * 20 + elements (,|]) characters
For ~1 million this is ~20MB. We can put that as an configurable upper bound to return to the client with an explicit error if it goes over that so that people can update the settings and return the larger response knowingly.
| Raises: | ||
| - `LookupIdsWithEmptyStrategicMetadataError` if the provided | ||
| 'strategic_metadata' contained no key-value pairs to look for. |
There was a problem hiding this comment.
I would change the interface to raise an explicit error if there are too many matching Submissions, we need that for DoS protection anyway, but allow an empty strategic_metadata set to be processed.
This PR adds the
lookup_submission_ids_by_strategic_metadatamethod toProducerClient, which will return the submission IDs of in-progress subsmissions that match ALL of the provided strategic metadata key-value pairs.