Delete by Query
Delete by query provides a powerful method for removing a large number of documents in a single operation.
The operation removes an entire subject where there is a match on the query. That is, if even a single entry for a subject matches the query, then all entries, as well as the subject itself, will be removed from the index.
This is similar to the result of performing a query: the set of subjects deleted will exactly match the set of subjects returned by the query used for delete by query.
You may want to test delete by query operations by first executing the query as a search.
Due to the broad capability of delete by query to change the state of the index, it can only be executed by a user with 'admin' permissions on the index.
Delete by query is submitted as an asynchronous Task, which can then be monitored using the Get Task API. Once the task is complete, the data is removed from the search index and no longer appears in query results.
Method | POST |
URL | /v1/index/<index_id>/delete_by_query |
Authentication required? | Yes |
Required Roles | You must have either |
Request Body | A GSearchRequest. See note below. |
Response Body |
Delete by query uses the same input structure as a normal query, GSearchRequest, but only makes use of the q, advanced_query, filters and query_template fields.
All other fields, including those related to pagination, do not change the full set of subjects which match a query, and are therefore ignored by the delete by query operation.
This allows the same GSearchRequest to be used with delete by query as may have been used for a testing or regular query request without need for alteration.
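As a rough illustration of the point above, the sketch below reduces a GSearchRequest to the fields that delete by query is said to honor. The field names are copied verbatim from the preceding paragraph; the helper name `effective_delete_request` and the pagination keys `limit` and `offset` in the usage note are hypothetical and only serve the example.

```python
# Fields the text above says delete by query makes use of.
# Every other key in the request is ignored by the operation.
DELETE_BY_QUERY_FIELDS = ("q", "advanced_query", "filters", "query_template")


def effective_delete_request(search_request):
    """Return only the parts of a GSearchRequest that affect delete by query.

    Hypothetical helper: the service does this filtering itself, so this is
    only useful for inspecting what a request will actually match on.
    """
    return {
        key: value
        for key, value in search_request.items()
        if key in DELETE_BY_QUERY_FIELDS
    }
```

For instance, passing `{"q": "climate", "limit": 10, "offset": 20}` would yield `{"q": "climate"}`, which shows why a request used for a paginated test search can be reused unchanged.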
Authentication & Authorization
Tokens for this call must have one of these scopes.
urn:globus:scopes:search.api.globus.org:all
urn:globus:scopes:search.api.globus.org:ingest
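The method, URL, and token requirement can be combined into a request roughly as follows. This is a minimal sketch using only the Python standard library. The host in `BASE_URL` is inferred from the scope names, and passing the token as a Bearer `Authorization` header is an assumption; neither is stated in this section. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: host inferred from the scope names above; confirm before use.
BASE_URL = "https://search.api.globus.org"


def build_delete_by_query_request(index_id, search_request, access_token):
    """Build (but do not send) the POST request for delete by query."""
    url = f"{BASE_URL}/v1/index/{index_id}/delete_by_query"
    return urllib.request.Request(
        url,
        data=json.dumps(search_request).encode("utf-8"),
        headers={
            # Assumption: the token is sent as a Bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect the URL and body first, which matters for an operation that cannot be undone. The shape of the response is not described here, so the sketch returns it as parsed JSON without interpreting it.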
Request Schemas
GSearchRequest
This is the main document type for encoding a complex Search query.
Field Name | Type | Description |
---|---|---|
q | String | User-supplied query, conforming to the query syntax. Required if there are no filters. |
advanced | Boolean | Defaults to False. When true, interpret q with the advanced query syntax. |
filters | Array | Optional. An array of GFilter documents: the filters to apply to the search. |
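The table's constraints can be checked before a request is sent. The sketch below is one possible way to assemble a GSearchRequest as a plain dictionary; the function name `make_search_request` is hypothetical, and the only rule it enforces is the one stated in the table, that q is required when there are no filters.

```python
def make_search_request(q=None, advanced=False, filters=None):
    """Assemble a GSearchRequest dict from the fields in the table above."""
    if not q and not filters:
        # Per the table: q is required if there are no filters.
        raise ValueError("q is required when no filters are given")
    request = {"advanced": bool(advanced)}
    if q:
        request["q"] = q
    if filters:
        request["filters"] = list(filters)
    return request
```

Failing early on an empty request is a deliberate choice here, since the same document may later be submitted to delete by query.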
GFilter
Field Name | Type | Description |
---|---|---|
type | String | One of {"match_any", "match_all", "range"} |
field_name | String | The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character. |
values | Array of Strings or Objects | The values to evaluate against the field_name. If type is "match_any" or "match_all", this must be a list of Strings. If type is "range", this must be a list of Objects, each with the fields from and to. |
"match_any" and "match_all" refer to the different possible behaviors of the filter values. As their names imply, if "match_any" is specified, the filter will match results for which any of the filter values match, while "match_all" requires that all of the values match on every result.
As far as filtering is concerned, "match_any" and "match_all" are equivalent when there is only one value, but they may differ in how facets are interpreted.
values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.
Examples
{
  "type": "match_any",
  "field_name": "https://schema\\.labs\\.datacite\\.org/meta/kernel-4\\.0/metadata\\.xsd#resourceTypeGeneral",
  "values": ["Globus Endpoint"]
}

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "*",
      "to": "2014-11-07"
    }
  ]
}

{
  "type": "match_all",
  "field_name": "https://transfer\\.api\\.globus\\.org/endpoint#keywords",
  "values": ["hpc", "internet2", "uchicago"]
}
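Filters like the ones above can also be built programmatically, which helps avoid mistakes in dot escaping. The following is a small sketch; the helper names (`escape_field_name`, `match_filter`, `range_filter`) are hypothetical and not part of any client library.

```python
def escape_field_name(name):
    """Prefix every literal dot with a backslash, as field_name requires.

    Only use this for names whose dots are literal (such as URLs); dots
    that separate path components, as in "path.to.date", stay unescaped.
    """
    return name.replace(".", "\\.")


def match_filter(field_name, values, match_all=False):
    """Build a "match_any" (default) or "match_all" GFilter."""
    return {
        "type": "match_all" if match_all else "match_any",
        "field_name": field_name,
        "values": list(values),
    }


def range_filter(field_name, start="*", end="*"):
    """Build a "range" GFilter; "*" leaves that end unbounded."""
    return {
        "type": "range",
        "field_name": field_name,
        "values": [{"from": start, "to": end}],
    }
```

With these helpers, `range_filter("path.to.date", end="2014-11-07")` reproduces the third example, and `match_filter(escape_field_name("https://transfer.api.globus.org/endpoint#keywords"), ["hpc", "internet2", "uchicago"], match_all=True)` reproduces the last one. Serializing the result with `json.dumps` doubles each backslash, which matches the `\\.` sequences shown in the JSON.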