Query - POST

This API provides a query interface via a POST request with a body. The body can encode complex data, including filters, sorting criteria, and requests for sideloaded facet data (i.e. aggregations).

At most 10,000 results can be fetched via a POST Query. For more results, use Scrolling Queries.

Method	POST
URL	/v1/index/<index_id>/search
Authentication required?	Only for non-public data
Required Roles	None
Request Body	a GSearchRequest document
Response Body	a GSearchResult document

Authentication & Authorization

Tokens for this call must have one of these scopes.

urn:globus:auth:scope:search.api.globus.org:all
urn:globus:auth:scope:search.api.globus.org:search

Examples

Query via curl

To run a query, we send it via a POST to the API, e.g.

curl -XPOST \
    -H 'Content-Type: application/json' \
    'https://search.api.globus.org/v1/index/4de0e89e-a395-11e7-bc54-8c705ad34f60/search' \
    --data '
{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}'
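
The same request can be assembled with Python's standard library. This is a minimal sketch rather than an official client: the helper name `build_search_request` is hypothetical, the Bearer header format is an assumption (authentication is only needed for non-public data), and the request is built but not sent.

```python
import json
import urllib.request

# Host and index ID are copied from the curl example above.
SEARCH_BASE = "https://search.api.globus.org"
INDEX_ID = "4de0e89e-a395-11e7-bc54-8c705ad34f60"


def build_search_request(index_id, body, token=None):
    """Build (but do not send) a POST Query request. Hypothetical helper."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: the token is passed as an OAuth2 Bearer header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{SEARCH_BASE}/v1/index/{index_id}/search",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


query = {
    "q": "a search with filtering and faceting",
    "filters": [
        {
            "type": "range",
            "field_name": "path.to.date",
            "values": [{"from": "*", "to": "2014-11-07"}],
        }
    ],
    "facets": [
        {
            "name": "Publication Date",
            "field_name": "path.to.date",
            "type": "date_histogram",
            "date_interval": "year",
        }
    ],
    "sort": [{"field_name": "path.to.date", "order": "asc"}],
}

request = build_search_request(INDEX_ID, query)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```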

Request Schemas

GSearchRequest

This is the main document type for encoding a complex Search query.

A GSearchRequest document is versioned with the @version field as either 2017-09-01 or query#1.0.0.

When omitted, @version defaults to the current service default version, which is query#1.0.0.

version query#1.0.0
version 2017-09-01 (legacy)

This is the newer version of a Search request and is the default when no @version field is specified.

Field Name	Type	Description
@version	String	Must be `"query#1.0.0"`
q	String	User-supplied query, conforming to the query syntax. Required if there are no filters.
advanced	Boolean	Optional. When true, interpret q with the advanced query syntax. Defaults to False. Mutually exclusive with `q_settings`.
q_settings	Object	Optional. A q_settings object, with settings indicating how to interpret q. Mutually exclusive with `advanced`.
limit	Integer	Optional. The maximum number of results to return. Defaults to 10.
offset	Integer	Optional. The position of the first result to return. Combined with limit, this allows paging through results. Defaults to 0.
bypass_visible_to	Boolean	Optional. Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query. Defaults to False.
filter_principal_sets	List of Strings	Optional. A list of `principal_set` names. The caller’s identity set will be matched against any `principal_sets` assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.
filters	Array	An array of GFilter Documents. Filters to apply to the search. Required if `q` is not provided.
facets	Array	Optional. An array of GFacet Documents. Facets to count on the search.
post_facet_filters	Array	Optional. An array of GFilter Documents. Filters to apply to the search after facets have been counted. These filters therefore only apply to the returned array of results.
boosts	Array	Optional. An array of GBoost Documents. Fields whose matches are weighted more heavily in unsorted searches.
sort	Array	Optional. An array of GSort Documents. Fields on which to sort returned values.

Note

If sort is specified, boosts is ignored, because results are ordered by the sort criteria rather than by the relevance calculation that boosts influence.
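
The constraints in the field table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (`check_search_request` is not part of any Globus library) and covers only the rules stated on this page; the service performs its own validation.

```python
def check_search_request(body):
    """Return a list of problems found in a query#1.0.0 request body."""
    problems = []
    # q is required when there are no filters, and vice versa.
    if "q" not in body and not body.get("filters"):
        problems.append("either q or filters is required")
    # advanced and q_settings are documented as mutually exclusive.
    if "advanced" in body and "q_settings" in body:
        problems.append("advanced and q_settings are mutually exclusive")
    # Not an error, but worth flagging: boosts has no effect alongside sort.
    if body.get("sort") and body.get("boosts"):
        problems.append("boosts is ignored because sort is specified")
    return problems


ok = check_search_request({"q": "the quick brown fox jumps"})
empty = check_search_request({})
conflict = check_search_request(
    {
        "q": "author: \"John Doe\"",
        "advanced": True,
        "q_settings": {"mode": "advanced_query_string"},
    }
)
```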

Examples

{
  "q": "the quick brown fox jumps"
}

{
  "q": "a search with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}

{
  "q": "author: \"John Doe\"",
  "q_settings": {
    "mode": "advanced_query_string",
    "default_operator": "or"
  },
  "limit": 5
}

{
  "q": "a search with paging",
  "offset": 100,
  "limit": 100
}

{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}

{
  "q": "(queries can be fancy AND cool) OR (NOT extravagant)",
  "q_settings": {
    "mode": "advanced_query_string",
    "default_operator": "or"
  }
}
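
Paging with offset and limit can be sketched as a generator of request bodies. This is an illustration under assumptions: it treats the 10,000-result cap as applying to offset plus limit, it does not know of any per-request maximum for limit, and `page_requests` is a hypothetical name.

```python
MAX_RESULTS = 10_000  # cap stated at the top of this page


def page_requests(base_query, page_size=100, max_results=MAX_RESULTS):
    """Yield copies of base_query with offset/limit set for successive pages."""
    for offset in range(0, max_results, page_size):
        # Shrink the final page so offset + limit never exceeds the cap.
        limit = min(page_size, max_results - offset)
        yield {**base_query, "offset": offset, "limit": limit}


pages = list(page_requests({"q": "a search with paging"}, page_size=4000))
```

A caller would normally stop early once a page comes back with fewer results than requested. For more than 10,000 results, Scrolling Queries are the documented alternative.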

This is the legacy version of a Search request. To use this version, you must request it explicitly with the @version field.

Field Name	Type	Description
@version	String	Must be `"2017-09-01"`
q	String	User-supplied query, conforming to the query syntax. Required if there are no filters.
advanced	Boolean	Optional. When true, interpret q with the advanced query syntax. Defaults to False.
limit	Integer	Optional. The maximum number of results to return. Defaults to 10.
offset	Integer	Optional. The position of the first result to return. Combined with limit, this allows paging through results. Defaults to 0.
bypass_visible_to	Boolean	Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query. Defaults to False.
filter_principal_sets	List of Strings	Optional. A list of `principal_set` names. The caller’s identity set will be matched against any `principal_sets` assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.
filters	Array	An array of GFilter Documents on version `2017`. Filters to apply to the search. Required if `q` is not provided.
facets	Array	Optional. An array of GFacet Documents on version `2017`. Facets to count on the search.
boosts	Array	Optional. An array of GBoost Documents on version `2017`. Fields whose matches are weighted more heavily in unsorted searches.
sort	Array	Optional. An array of GSort Documents on version `2017`. Fields on which to sort returned values.

Note

If sort is specified, boosts is ignored, because results are ordered by the sort criteria rather than by the relevance calculation that boosts influence.

Examples

{
  "q": "the quick brown fox jumps"
}

{
  "q": "a search with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}

{
  "q": "author: \"John Doe\"",
  "advanced": true,
  "limit": 5
}

{
  "q": "a search with paging",
  "offset": 100,
  "limit": 100
}

{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}

{
  "q": "(queries can be fancy AND cool) OR (NOT extravagant)",
  "advanced": true
}

GFilter

A GFilter document is one of several document types which encode a filter. The type of filter is identified by the type field. See the table below for the various filter types.

version 1.0.0
version 2017 (legacy)

These filter documents are defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Type	Schema
`match_all`	GFilterMatch
`match_any`	GFilterMatch
`range`	GFilterRange
`geo_bounding_box`	GFilterGeoBoundingBox
`geo_shape`	GFilterGeoShape
`exists`	GFilterExists
`like`	GFilterLike
`not`	GFilterNot
`and`	GFilterAnd
`or`	GFilterOr

These filter documents are defined for documents on version 2017-09-01.

Type	Schema
`match_all`	GFilterMatch
`match_any`	GFilterMatch
`range`	GFilterRange
`geo_bounding_box`	GFilterGeoBoundingBox
`geo_shape`	GFilterGeoShape
`exists`	GFilterExists
`like`	GFilterLike
`not`	GFilterNot
`and`	GFilterAnd
`or`	GFilterOr

Note

All filters on document version 2017-09-01 support a post_filter field. Note that post_filter is only valid on filters when they are in the top level filters array of a request.

When filters are nested under and, or, or not filters, post_filter is no longer valid.
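
A legacy request can be scanned for post_filter fields that appear below the top level. The following is a rough sketch; the function name and the `children` path notation are made up for illustration.

```python
def find_nested_post_filters(filters, path="filters", nested=False):
    """Return paths of filters that set post_filter below the top level."""
    found = []
    for i, flt in enumerate(filters):
        here = f"{path}[{i}]"
        if nested and "post_filter" in flt:
            found.append(here)
        # "and"/"or" hold a list under "filters"; "not" holds one under "filter".
        children = list(flt.get("filters", []))
        if "filter" in flt:
            children.append(flt["filter"])
        found += find_nested_post_filters(children, here + ".children", nested=True)
    return found


legacy_filters = [
    {
        "type": "range",
        "field_name": "path.to.date",
        "values": [{"from": "*", "to": "2014-11-07"}],
        "post_filter": False,  # valid: top level of the filters array
    },
    {
        "type": "or",
        "filters": [
            {"type": "exists", "field_name": "foo", "post_filter": True},  # invalid
            {"type": "not", "filter": {"type": "exists", "field_name": "bar"}},
        ],
    },
]
invalid_paths = find_nested_post_filters(legacy_filters)
```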

GFilterMatch

A matching filter for finding results which match some set of text terms.

"match_any" and "match_all" refer to the different possible behaviors of the filter values. As their names imply, "match_any" matches results for which any of the filter values match, while "match_all" requires that all of the values match on every result.

version 1.0.0
version 2017 (legacy)

This is the version of a "match" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	One of `{"match_any", "match_all"}`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values	Array of Strings or Booleans	The values to evaluate against the field_name. If the field is a boolean field, this must be an array of booleans only. For string fields, it may be a mixture of strings or booleans.

Note

"match_any" and "match_all" are the same when there’s only one value as far as filtering is concerned, but they may have different impact on the way that facets are interpreted.

Examples

{
  "type": "match_any",
  "field_name": "globus_metadata.resource_type",
  "values": [
    "Globus Endpoint"
  ]
}

{
  "type": "match_all",
  "field_name": "globus_metadata.keywords",
  "values": [
    "hpc",
    "internet2",
    "uchicago"
  ]
}

{
  "type": "match_any",
  "field_name": "globus_metadata.snorkels",
  "values": [
    "few",
    "many",
    true
  ]
}

Note

This filter is only valid if globus_metadata.snorkels is a string field because string fields can contain boolean values.

If it is a boolean field (which cannot contain string values), the query will fail with an error regarding the improper mapping of "few" and "many" onto a boolean field.
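
The difference between "match_any" and "match_all" can be modelled locally. This sketch assumes plain equality between filter values and a document's field values; how the service actually compares values is not described here, and `matches` is a hypothetical function.

```python
def matches(flt, field_values):
    """Local model of match_any / match_all against one document's values."""
    present = list(field_values)
    wanted = flt["values"]
    if flt["type"] == "match_any":
        # At least one requested value must be present.
        return any(value in present for value in wanted)
    if flt["type"] == "match_all":
        # Every requested value must be present.
        return all(value in present for value in wanted)
    raise ValueError(f"not a match filter: {flt['type']!r}")


doc_keywords = ["hpc", "uchicago"]
any_filter = {
    "type": "match_any",
    "field_name": "globus_metadata.keywords",
    "values": ["hpc", "internet2", "uchicago"],
}
all_filter = {**any_filter, "type": "match_all"}

any_result = matches(any_filter, doc_keywords)
all_result = matches(all_filter, doc_keywords)
```

With a single value the two filter types give the same answer in this model, which mirrors the note above about filtering.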

This is the version of a "match" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	One of `{"match_any", "match_all"}`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values	Array of Strings or Booleans	The values to evaluate against the field_name. If the field is a boolean field, this must be an array of booleans only. For string fields, it may be a mixture of strings or booleans.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True for `match_any` and False for `match_all`.

Note

"match_any" and "match_all" are the same when there’s only one value as far as filtering is concerned, but they may have different impact on the way that facets are interpreted.

Examples

{
  "type": "match_any",
  "field_name": "globus_metadata.resource_type",
  "values": [
    "Globus Endpoint"
  ]
}

{
  "type": "match_all",
  "field_name": "globus_metadata.keywords",
  "values": [
    "hpc",
    "internet2",
    "uchicago"
  ]
}

{
  "type": "match_any",
  "field_name": "globus_metadata.snorkels",
  "values": [
    "few",
    "many",
    true
  ]
}

Note

This filter is only valid if globus_metadata.snorkels is a string field because string fields can contain boolean values.

If it is a boolean field (which cannot contain string values), the query will fail with an error regarding the improper mapping of "few" and "many" onto a boolean field.

GFilterRange

A range filter for finding results which have numeric or date values within a specified range.

version 1.0.0
version 2017 (legacy)

This is the version of a "range" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"range"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values	Array of Objects	The values to evaluate against the `field_name`. Each object has the fields `from` and `to` OR each object has exactly one of `gte` or `gt` and exactly one of `lt` or `lte`.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.

Examples

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "10",
      "to": "50"
    }
  ]
}

This example filter has multiple clauses. The combination is implicitly joined with "or" semantics. This means that we allow values from 0 to 5, and greater than or equal to 10.

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "0",
      "to": "5"
    },
    {
      "from": "10",
      "to": "*"
    }
  ]
}

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    },
    {
      "gte": "2015-01-01",
      "lte": "2016-01-01"
    },
    {
      "gt": "2016-01-15",
      "lt": "*"
    }
  ]
}
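
The rule for the objects in values (either from and to, or exactly one lower bound and exactly one upper bound) can be checked with a small helper. This is a sketch with a hypothetical function name; it validates only the key combinations, not the bound values themselves.

```python
def check_range_value(value):
    """Return True if a range clause uses an allowed combination of keys."""
    keys = set(value)
    if keys == {"from", "to"}:
        return True
    lower = keys & {"gte", "gt"}
    upper = keys & {"lte", "lt"}
    # Exactly one lower bound, exactly one upper bound, and nothing else.
    return len(lower) == 1 and len(upper) == 1 and keys == lower | upper


clauses = [
    {"from": "1970-01-01", "to": "2015-01-01"},
    {"gte": "2015-01-01", "lte": "2016-01-01"},
    {"gt": "2016-01-15", "lt": "*"},
]
all_valid = all(check_range_value(clause) for clause in clauses)
```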

This is the version of a "range" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"range"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values	Array of Objects	The values to evaluate against the `field_name`. Each object has the fields `from` and `to` OR each object has exactly one of `gte` or `gt` and exactly one of `lt` or `lte`.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.

Examples

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "10",
      "to": "50"
    }
  ]
}

This example filter has multiple clauses. The combination is implicitly joined with "or" semantics. This means that we allow values from 0 to 5, and greater than or equal to 10.

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "0",
      "to": "5"
    },
    {
      "from": "10",
      "to": "*"
    }
  ]
}

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    },
    {
      "gte": "2015-01-01",
      "lte": "2016-01-01"
    },
    {
      "gt": "2016-01-15",
      "lt": "*"
    }
  ]
}

GFilterGeoBoundingBox

A bounding box filter for finding geo_shape and geo_point values which intersect with a specified bounding box.

version 1.0.0
version 2017 (legacy)

This is the version of a "geo bounding box" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"geo_bounding_box"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
top_left	Object	An object describing a coordinate pair. It must contain the keys `lat` and `lon`, each of which must have a numeric value.
bottom_right	Object	An object describing a coordinate pair. It must contain the keys `lat` and `lon`, each of which must have a numeric value.

Note

top_left is required to be northwest of bottom_right.

Examples

{
  "type": "geo_bounding_box",
  "field_name": "country.center",
  "top_left": {
    "lat": 49.1,
    "lon": -124.9
  },
  "bottom_right": {
    "lat": 24.9,
    "lon": -67.1
  }
}
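
The requirement that top_left be northwest of bottom_right can be checked before sending a filter. This minimal sketch uses a hypothetical helper name, assumes strict inequalities, and does not handle boxes that cross the antimeridian.

```python
def is_northwest(top_left, bottom_right):
    """Return True if top_left lies north and west of bottom_right."""
    north = top_left["lat"] > bottom_right["lat"]
    west = top_left["lon"] < bottom_right["lon"]
    return north and west


box = {
    "type": "geo_bounding_box",
    "field_name": "country.center",
    "top_left": {"lat": 49.1, "lon": -124.9},
    "bottom_right": {"lat": 24.9, "lon": -67.1},
}
box_ok = is_northwest(box["top_left"], box["bottom_right"])
```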

This is the version of a "geo bounding box" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"geo_bounding_box"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
top_left	Object	An object describing a coordinate pair. It must contain the keys `lat` and `lon`, each of which must have a numeric value.
bottom_right	Object	An object describing a coordinate pair. It must contain the keys `lat` and `lon`, each of which must have a numeric value.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Note

top_left is required to be northwest of bottom_right.

Examples

{
  "type": "geo_bounding_box",
  "field_name": "country.center",
  "top_left": {
    "lat": 49.1,
    "lon": -124.9
  },
  "bottom_right": {
    "lat": 24.9,
    "lon": -67.1
  }
}

GFilterGeoShape

A geo filter for finding geo_shape and geo_point values which intersect with or are contained within a given shape.

version 1.0.0
version 2017 (legacy)

This is the version of a "geo shape" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"geo_shape"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
shape	Object	A GeoJSON formatted Geometry. See note below on supported geometries.
relation	String	The shape relationship to test. One of `{"intersects", "within"}`. Defaults to `"intersects"`.

Examples

{
  "type": "geo_shape",
  "field_name": "city.boundary",
  "shape": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -5.8,
          51.5
        ],
        [
          10.0,
          51.5
        ],
        [
          10.0,
          41.0
        ],
        [
          -5.8,
          41.0
        ],
        [
          -5.8,
          51.5
        ]
      ]
    ]
  }
}

This is the version of a "geo shape" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"geo_shape"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
shape	Object	A GeoJSON formatted Geometry. See note below on supported geometries.
relation	String	The shape relationship to test. One of `{"intersects", "within"}`. Defaults to `"intersects"`.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

{
  "type": "geo_shape",
  "field_name": "city.boundary",
  "shape": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -5.8,
          51.5
        ],
        [
          10.0,
          51.5
        ],
        [
          10.0,
          41.0
        ],
        [
          -5.8,
          41.0
        ],
        [
          -5.8,
          51.5
        ]
      ]
    ]
  }
}

Supported Geometries

Only two-dimensional GeoJSON data are allowed in geo_shape filters. That means that coordinates should be encoded as JSON arrays of length 2.

Globus Search only supports filters using GeoJSON Polygons. Furthermore, Polygons are restricted to simple polygons, consisting of only one coordinate ring. This means that polygons with internal cut-outs are forbidden.
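
These restrictions can be checked on the client. The sketch below uses a hypothetical function name; the closed-ring rule (first and last positions equal, at least four positions) comes from the GeoJSON specification rather than from this page.

```python
def check_polygon(shape):
    """Return a list of problems with a GeoJSON shape used in a geo_shape filter."""
    if shape.get("type") != "Polygon":
        return ["only GeoJSON Polygons are supported"]
    rings = shape.get("coordinates", [])
    if len(rings) != 1:
        # More than one ring would mean internal cut-outs.
        return ["polygon must consist of exactly one coordinate ring"]
    ring = rings[0]
    problems = []
    if any(len(position) != 2 for position in ring):
        problems.append("every coordinate must be an array of length 2")
    if len(ring) < 4 or ring[0] != ring[-1]:
        problems.append("ring must be closed and have at least 4 positions")
    return problems


shape = {
    "type": "Polygon",
    "coordinates": [
        [[-5.8, 51.5], [10.0, 51.5], [10.0, 41.0], [-5.8, 41.0], [-5.8, 51.5]]
    ],
}
shape_problems = check_polygon(shape)
```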

GFilterExists

An "existence" filter which checks if a field is present in a document with a non-null value. Note that a field being present but with a value of null is considered the same, under exists filters, as the field being absent from the document.

version 1.0.0
version 2017 (legacy)

This is the version of an "exists" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"exists"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

Examples

The following filter finds documents where the field foo exists:

{
  "type": "exists",
  "field_name": "foo"
}

This is the version of an "exists" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"exists"`
field_name	String	The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

The following filter finds documents where the field foo exists:

{
  "type": "exists",
  "field_name": "foo"
}

GFilterLike

A "like" filter which checks if a field matches a "like-expression". Like expressions are matching strings containing the wildcard characters:

* matches any number of characters
? matches any one character

version 1.0.0
version 2017 (legacy)

This is the version of a "like" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"like"`.
field_name	String	The field to which the filter refers. It must be a text field.
value	String	The filter expression to apply as a match.

Examples

The following filter finds documents where the field filename contains a string ending in .csv.

{
  "type": "like",
  "field_name": "filename",
  "value": "*.csv"
}

Note that this does not technically guarantee that the filename ends with .csv. For example, it is possible for the filter to match on a value like "filename": "foo.csv bar".
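
The wildcard behaviour can be approximated locally. This sketch translates a like-expression into a regular expression and, as an assumption that would explain the "foo.csv bar" case, tests it against each whitespace-separated term of the field value. The function names are hypothetical and the service's real text analysis may differ.

```python
import re


def like_to_regex(expression):
    """Translate a like-expression into a compiled regular expression."""
    parts = []
    for char in expression:
        if char == "*":
            parts.append(".*")  # any number of characters
        elif char == "?":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts))


def like_matches(expression, text):
    """Return True if any whitespace-separated term fully matches."""
    pattern = like_to_regex(expression)
    # Assumption: matching happens per term, not against the whole string.
    return any(pattern.fullmatch(term) for term in text.split())


plain = like_matches("*.csv", "data.csv")
with_trailing_word = like_matches("*.csv", "foo.csv bar")
```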

This is the version of a "like" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"like"`.
field_name	String	The field to which the filter refers. It must be a text field.
value	String	The filter expression to apply as a match.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

The following filter finds documents where the field filename contains a string ending in .csv.

{
  "type": "like",
  "field_name": "filename",
  "value": "*.csv"
}

Note that this does not technically guarantee that the filename ends with .csv. For example, it is possible for the filter to match on a value like "filename": "foo.csv bar".

GFilterNot

A "not" filter for inverting any other valid filter.

version 1.0.0
version 2017 (legacy)

This is the version of a "not" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"not"`
filter	Object	Any valid GFilter object.

Examples

The following filter finds documents where the field foo does not exist:

{
  "type": "not",
  "filter": {
    "type": "exists",
    "field_name": "foo"
  }
}

This is the version of a "not" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"not"`
filter	Object	Any valid GFilter object.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

The following filter finds documents where the field foo does not exist:

{
  "type": "not",
  "filter": {
    "type": "exists",
    "field_name": "foo"
  }
}

GFilterAnd

An "and" filter for joining any other valid filters. In order for an "and" filter to match on documents, all of the filters it contains must match.

version 1.0.0
version 2017 (legacy)

This is the version of an "and" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"and"`
filters	Array of Object	An array of GFilter objects.

Examples

The following filter finds documents where the field title exists and keywords contains hpc:

{
  "type": "and",
  "filters": [
    {
      "type": "exists",
      "field_name": "title"
    },
    {
      "type": "match_any",
      "field_name": "keywords",
      "values": [
        "hpc"
      ]
    }
  ]
}

This is the version of an "and" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"and"`
filters	Array of Object	An array of GFilter objects.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

The following filter finds documents where the field title exists and keywords contains hpc:

{
  "type": "and",
  "filters": [
    {
      "type": "exists",
      "field_name": "title"
    },
    {
      "type": "match_any",
      "field_name": "keywords",
      "values": [
        "hpc"
      ]
    }
  ]
}

GFilterOr

An "or" filter for joining any other valid filters. In order for an "or" filter to match on documents, at least one of the filters it contains must match.

version 1.0.0
version 2017 (legacy)

This is the version of an "or" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
type	String	Must have the value `"or"`
filters	Array of Object	An array of GFilter objects.

Examples

The following filter finds documents where either the author.institution or the dataset.institution is uchicago.edu. One or both can be a match:

{
  "type": "or",
  "filters": [
    {
      "type": "match_any",
      "field_name": "author.institution",
      "values": [
        "uchicago.edu"
      ]
    },
    {
      "type": "match_any",
      "field_name": "dataset.institution",
      "values": [
        "uchicago.edu"
      ]
    }
  ]
}
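
The "not", "and", and "or" filters nest freely, so small builder functions can keep compound filters readable. This is a sketch with hypothetical helper names; it uses the `filters` key from the field tables for "and" and "or", and the `filter` key for "not".

```python
import json


def exists(field_name):
    return {"type": "exists", "field_name": field_name}


def match_any(field_name, *values):
    return {"type": "match_any", "field_name": field_name, "values": list(values)}


def f_not(flt):
    return {"type": "not", "filter": flt}


def f_and(*filters):
    return {"type": "and", "filters": list(filters)}


def f_or(*filters):
    return {"type": "or", "filters": list(filters)}


# title exists AND (either institution is uchicago.edu) AND foo does not exist
combined = f_and(
    exists("title"),
    f_or(
        match_any("author.institution", "uchicago.edu"),
        match_any("dataset.institution", "uchicago.edu"),
    ),
    f_not(exists("foo")),
)
request_body = json.dumps({"filters": [combined]}, indent=2)
```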

This is the version of an "or" filter defined for legacy document versions (2017-09-01).

Field Name	Type	Description
type	String	Must have the value `"or"`
filters	Array of Object	An array of GFilter objects.
post_filter	Boolean	Control whether or not this filter should be applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results. Defaults to True.

Examples

The following filter finds documents where either the author.institution or the dataset.institution is uchicago.edu. One or both can be a match:

{
  "type": "or",
  "filters": [
    {
      "type": "match_any",
      "field_name": "author.institution",
      "values": [
        "uchicago.edu"
      ]
    },
    {
      "type": "match_any",
      "field_name": "dataset.institution",
      "values": [
        "uchicago.edu"
      ]
    }
  ]
}

GFacet

This is the definition of facet documents for query#1.0.0.

Field Name	Type	Description
name	String	A name for this facet which is referenced in the results. If name is omitted, it will default to the value of the `field_name` property. If more than one facet in a single search request references the same field, a name must be provided.
type	String	One of `terms`, `date_histogram`, `numeric_histogram`, `sum`, `avg`
field_name	String	The field to which the facet refers. Any dots (`.`) must be escaped with a preceding backslash (`\`) character.
size	Integer	The number of distinct facet values (buckets) to return. For terms, `size=N` limits results to the `N` most common values (buckets with highest count). For numeric_histograms, this is the number of intervals between low and high of the `histogram_range` to be created. Required if `type=numeric_histogram`. Optional if `type=terms`. Forbidden otherwise.
missing	Float	The value to use for entries that do not contain the field named by the value of `field_name`. By default, missing values will be ignored and do not count towards sums and averages. Optional if `type=sum` or `type=avg`. Forbidden otherwise.
histogram_range	Object	An object containing the following fields. `low`: a numeric or date-formatted String containing the value at the low end of the histogram range. `high`: a numeric or date-formatted String containing the value at the high end of the histogram range. Required if `type=numeric_histogram`. Optional if `type=date_histogram`. Forbidden otherwise.
date_interval	String	Indicates the unit for the buckets returned within the `histogram_range`. Must be one of: `year`, `quarter`, `month`, `week`, `day`, `hour`, `minute`, `second`. Required when `type=date_histogram`. Forbidden otherwise.
additional_filters	Array of Objects	Optional. An array of GFilter documents which should apply only to this specific facet. When multiple facets are specified in a query, the `additional_filters` allow per-facet refinement of the results.

Note

For a terms facet, any values containing more than 10,000 characters will not be tabulated into the results and no buckets containing a value with more than 10,000 characters will be created.

date_histogram faceting requires that the field was detected as a date type. See the Globus Search supported Date Formats to see how data is detected as being a date. The histogram also requires that low and high are both in one of the supported date formats.
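
The required, optional, and forbidden rules in the table can be encoded as data and checked per facet. This is a sketch with hypothetical names that covers only the rules stated in the table; it does not check date formats or field types.

```python
FACET_TYPES = {"terms", "date_histogram", "numeric_histogram", "sum", "avg"}
DATE_INTERVALS = {"year", "quarter", "month", "week", "day", "hour", "minute", "second"}

# field: (facet types where it is required, facet types where it is optional)
FACET_RULES = {
    "size": ({"numeric_histogram"}, {"terms"}),
    "missing": (set(), {"sum", "avg"}),
    "histogram_range": ({"numeric_histogram"}, {"date_histogram"}),
    "date_interval": ({"date_histogram"}, set()),
}


def check_facet(facet):
    """Return a list of rule violations for one facet document."""
    facet_type = facet.get("type")
    if facet_type not in FACET_TYPES:
        return [f"unknown facet type: {facet_type!r}"]
    problems = []
    for field, (required, optional) in FACET_RULES.items():
        if field in facet and facet_type not in required | optional:
            problems.append(f"{field} is forbidden for type={facet_type}")
        elif field not in facet and facet_type in required:
            problems.append(f"{field} is required for type={facet_type}")
    if "date_interval" in facet and facet["date_interval"] not in DATE_INTERVALS:
        problems.append(f"unsupported date_interval: {facet['date_interval']!r}")
    return problems


terms_problems = check_facet(
    {"name": "File Extension", "type": "terms", "field_name": "extension", "size": 10}
)
histogram_problems = check_facet(
    {
        "type": "numeric_histogram",
        "field_name": "price",
        "histogram_range": {"low": 0, "high": 10},
    }
)
```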

{
  "name": "File Extension",
  "type": "terms",
  "field_name": "extension",
  "size": 10
}

{
  "name": "pub_date",
  "type": "date_histogram",
  "field_name": "http://dublincore\\.org/schemas/xmls/qdc/2008/02/11/dcterms\\.xsd#created",
  "histogram_range": {
    "low": "2000-01-01",
    "high": "2010-01-01"
  },
  "date_interval": "year"
}

{
  "name": "file size",
  "type": "numeric_histogram",
  "field_name": "https://transfer\\.api\\.globus\\.org/file#size",
  "size": 100,
  "histogram_range": {
    "low": 0,
    "high": 100000000
  }
}

{
  "name": "calculate total cost",
  "type": "sum",
  "field_name": "price"
}

{
  "name": "calculate average cost per item",
  "type": "avg",
  "missing": 1.2,
  "field_name": "price"
}

This is the definition of facet documents for version 2017-09-01 queries.

Field Name	Type	Description
name	String	A name for this facet which is referenced in the results. If name is omitted, it will default to the value of the `field_name` property. If more than one facet in a single search request references the same field, a name must be provided.
type	String	One of `terms`, `date_histogram`, `numeric_histogram`, `sum`, `avg`
field_name	String	The field to which the facet refers. Any dots (`.`) must be escaped with a preceding backslash (`\`) character.
size	Integer	The number of distinct facet values (buckets) to return. For terms, `size=N` limits results to the `N` most common values (buckets with highest count). For numeric_histograms, this is the number of intervals between low and high of the `histogram_range` to be created. Required if `type=numeric_histogram`. Optional if `type=terms`. Forbidden otherwise.
missing	Float	The value to use for entries that do not contain the field named by the value of `field_name`. By default, missing values will be ignored and do not count towards sums and averages. Optional if `type=sum` or `type=avg`. Forbidden otherwise.
histogram_range	Object	An object containing the following fields: `low`: Numeric or date-formatted String containing the value at the low end of the histogram range; `high`: Numeric or date-formatted String containing the value at the high end of the histogram range. Required if `type=numeric_histogram`. Optional if `type=date_histogram`. Forbidden otherwise.
date_interval	String	Indicates the unit for the buckets returned within the `histogram_range`. Must be one of: `year`, `quarter`, `month`, `week`, `day`, `hour`, `minute`, `second`. Required if `type=date_histogram`. Forbidden otherwise.

Note

For a terms facet, any values containing more than 10,000 characters will not be tabulated into the results and no buckets containing a value with more than 10,000 characters will be created.
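The Required, Optional, and Forbidden rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the names `REQUIRED`, `OPTIONAL`, and `check_facet` are hypothetical, and the service may apply additional validation that is not modeled here.

```python
# Keys that each facet type requires or permits, transcribed from the table above.
REQUIRED = {
    "terms": set(),
    "date_histogram": {"date_interval"},
    "numeric_histogram": {"size", "histogram_range"},
    "sum": set(),
    "avg": set(),
}
OPTIONAL = {
    "terms": {"size"},
    "date_histogram": {"histogram_range"},
    "numeric_histogram": set(),
    "sum": {"missing"},
    "avg": {"missing"},
}
# Keys whose presence depends on the facet type.
CONDITIONAL_KEYS = {"size", "missing", "histogram_range", "date_interval"}


def check_facet(facet):
    """Return a list of problems found in a facet document (empty if none)."""
    facet_type = facet.get("type")
    if facet_type not in REQUIRED:
        return [f"unknown facet type: {facet_type!r}"]
    problems = []
    present = CONDITIONAL_KEYS.intersection(facet)
    for key in sorted(REQUIRED[facet_type] - present):
        problems.append(f"{key} is required for type={facet_type}")
    allowed = REQUIRED[facet_type] | OPTIONAL[facet_type]
    for key in sorted(present - allowed):
        problems.append(f"{key} is forbidden for type={facet_type}")
    return problems
```

For example, `check_facet({"type": "sum", "field_name": "price", "size": 5})` reports that `size` is forbidden for a `sum` facet.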

{
  "name": "File Extension",
  "type": "terms",
  "field_name": "extension",
  "size": 10
}

{
  "name": "pub_date",
  "type": "date_histogram",
  "field_name": "http://dublincore\\.org/schemas/xmls/qdc/2008/02/11/dcterms\\.xsd#created",
  "histogram_range": {
    "low": "2000-01-01",
    "high": "2010-01-01"
  },
  "date_interval": "year"
}

{
  "name": "file size",
  "type": "numeric_histogram",
  "field_name": "https://transfer\\.api\\.globus\\.org/file#size",
  "size": 100,
  "histogram_range": {
    "low": 0,
    "high": 100000000
  }
}

{
  "name": "calculate total cost",
  "type": "sum",
  "field_name": "price"
}

{
  "name": "calculate average cost per item",
  "type": "avg",
  "missing": 1.2,
  "field_name": "price"
}

GBoost

version 1.0.0
version 2017 (legacy)

This is the definition of boost documents for query#1.0.0.

Field Name	Type	Description
field_name	String	Field to rank higher in results. Any dots (".") must be escaped with a preceding backslash ("\") character or they will be treated as paths to a field and not part of a field name
factor	Floating Point	Factor for weighting results for query matches on the field_name. Values greater than 1 rank matches higher; values less than 1 rank them lower (negative boosting). Minimum of 0, maximum of 10

Examples

{
  "field_name": "author",
  "factor": 5
}

This is the definition of boost documents for version 2017-09-01 queries.

Field Name	Type	Description
field_name	String	Field to rank higher in results. Any dots (".") must be escaped with a preceding backslash ("\") character or they will be treated as paths to a field and not part of a field name
factor	Floating Point	Factor for weighting results for query matches on the field_name. Values greater than 1 rank matches higher; values less than 1 rank them lower (negative boosting). Minimum of 0, maximum of 10

Examples

{
  "field_name": "author",
  "factor": 5
}
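Field names that contain literal dots must be escaped before they are used in a boost, sort, or facet document. The sketch below shows one way to do this on the client; `escape_field_name` and `make_boost` are hypothetical helper names, and the range check simply mirrors the minimum and maximum stated in the table above.

```python
import json


def escape_field_name(name):
    """Escape literal dots so they are not treated as path separators."""
    return name.replace(".", "\\.")


def make_boost(field_name, factor):
    """Build a boost document, rejecting factors outside the documented 0..10 range."""
    if not 0 <= factor <= 10:
        raise ValueError("factor must be between 0 and 10")
    return {"field_name": field_name, "factor": factor}


# A field whose name contains dots, as in the facet examples.
boost = make_boost(escape_field_name("https://transfer.api.globus.org/file#size"), 2)

# json.dumps doubles each backslash, which yields the "\\." form used
# in the JSON examples on this page.
print(json.dumps(boost))
```

Do not escape dots that are meant as path separators, such as those in `path.to.date`.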

GSort

version 1.0.0
version 2017 (legacy)

This is the definition of sort documents for query#1.0.0.

Field Name	Type	Description
field_name	String	The field to which the sort refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
order	String	Must be one of "asc" or "desc" indicating the ordering of the sort: ascending ("asc") or descending ("desc"). Also, see note on sorting when multiple values are present for a particular field.

Examples

{
  "field_name": "author",
  "order": "asc"
}

{
  "field_name": "path.to.date",
  "order": "desc"
}

This is the definition of sort documents for version 2017-09-01 queries.

Field Name	Type	Description
field_name	String	The field to which the sort refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
order	String	Must be one of "asc" or "desc" indicating the ordering of the sort: ascending ("asc") or descending ("desc"). Also, see note on sorting when multiple values are present for a particular field.

Examples

{
  "field_name": "author",
  "order": "asc"
}

{
  "field_name": "path.to.date",
  "order": "desc"
}

Note

For purposes of sorting, a text field containing more than 10,000 characters will be considered missing, and will thus be sorted to the end of the list.

Sorting on Multiple Values

When the field used for sorting is an array, or when there are multiple entries under a single subject, sorting must consider multiple values as sort criteria.

Consider the following two partial documents:

document15.json

{
  "an_integer": [
    1,
    5
  ]
}

document24.json

{
  "an_integer": [
    2,
    4
  ]
}

If we sort on an_integer, which should be sorted first?

In such situations, the value used for sorting is the smallest value for an ascending sort and the largest value for a descending sort. An ascending sort compares 1 with 2, and a descending sort compares 5 with 4, so document15.json sorts before document24.json in both directions.
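The rule can be reproduced locally with a short sketch. This is an illustration of the described behavior, not the service's implementation; the function names and the `id` keys added to the two documents are hypothetical.

```python
def sort_value(values, order):
    """Pick the value that represents a multi-valued field for a given sort order."""
    return min(values) if order == "asc" else max(values)


def sort_documents(docs, field, order="asc"):
    """Sort documents on a multi-valued field using the min/max rule."""
    return sorted(
        docs,
        key=lambda doc: sort_value(doc[field], order),
        reverse=(order == "desc"),
    )


# The two partial documents from above, with an "id" added for readability.
document15 = {"id": 15, "an_integer": [1, 5]}
document24 = {"id": 24, "an_integer": [2, 4]}
docs = [document24, document15]

print([d["id"] for d in sort_documents(docs, "an_integer", "asc")])   # [15, 24]
print([d["id"] for d in sort_documents(docs, "an_integer", "desc")])  # [15, 24]
```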

Missing Fields and Sort Order

Any record that does not contain a value for the field being sorted on will appear at the end of the sorted list, regardless of whether the sort is ascending or descending.

If more than one record does not contain a value, the ordering among those records is undefined.
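As a rough local model of this behavior, the sketch below sorts single-valued records and always places records without the field last. The function name is hypothetical, and the order it gives to the records with missing values (input order) is arbitrary, since the service leaves that order undefined.

```python
def sort_with_missing_last(records, field, order="asc"):
    """Sort records on a field, placing records without a value at the end."""
    present = [r for r in records if r.get(field) is not None]
    missing = [r for r in records if r.get(field) is None]
    present.sort(key=lambda r: r[field], reverse=(order == "desc"))
    # Missing records go last for both ascending and descending sorts.
    return present + missing


records = [{"id": 1, "n": 3}, {"id": 2}, {"id": 3, "n": 1}]
print([r["id"] for r in sort_with_missing_last(records, "n", "asc")])   # [3, 1, 2]
print([r["id"] for r in sort_with_missing_last(records, "n", "desc")])  # [1, 3, 2]
```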

q_settings

version 1.0.0

This is the definition of q_settings documents for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name	Type	Description
mode	String	`query_string`: Evaluate `q` using the normal query mode. `advanced_query_string`: Evaluate `q` using the advanced query mode.
default_operator	String	`or`: Return results that match any of the query terms. `and`: Return results that match all of the query terms.

Examples

{
  "mode": "query_string",
  "default_operator": "or"
}
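The difference between the two `default_operator` values can be illustrated with a toy matcher. This is a deliberate simplification that assumes whitespace-separated terms and exact word matches; it ignores the real query syntax, and `matches` is a hypothetical name.

```python
def matches(text, q, default_operator):
    """Toy model: does `text` match query `q` under the given default operator?"""
    words = set(text.lower().split())
    terms = q.lower().split()
    # "and" requires every term to be present; "or" requires at least one.
    combine = all if default_operator == "and" else any
    return combine(term in words for term in terms)


print(matches("salsa verde recipe", "salsa tomatillo", "or"))   # True
print(matches("salsa verde recipe", "salsa tomatillo", "and"))  # False
```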

Response Schemas

GSearchResult

This is the document type for all results from Search queries.

Field Name	Type	Description
gmeta	Array	An array of GMetaResult documents, the main body of the result
facet_results	Array	Optional. An array of GFacetResult documents with counts for all facets requested on the search request
offset	Integer	The offset provided on the input search request
count	Integer	The number of results returned; i.e. the size of the gmeta array. May be 0
total	Integer	The total number of matches for the search. May be 0 if no matches are found
has_next_page	Boolean	True if there’s another page of results available, False otherwise

Examples

This result is in the 2019-08-27 format for GMetaResult documents.

{
  "@datatype": "GSearchResult",
  "@version": "2017-09-01",
  "count": 1,
  "gmeta": [
    {
      "@datatype": "GMetaResult",
      "@version": "2019-08-27",
      "entries": [
        {
          "content": {
            "cuisine": [
              "mexican"
            ],
            "handle": "salsa-verde",
            "ingredients": [
              {
                "amount": {
                  "number": 10
                },
                "default": "tomatillo",
                "preparation": "simmer 20 minutes",
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2
                },
                "default": "serrano pepper",
                "preparation": "seeded",
                "substitutes": [
                  "jalapeno",
                  "thai bird chili"
                ],
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "clove"
                },
                "default": "garlic",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 0.5
                },
                "default": "yellow onion",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tsp"
                },
                "default": "salt",
                "type": "spice"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tbsp"
                },
                "default": "coriander",
                "preparation": "ground",
                "substitutes": [
                  "cumin"
                ],
                "type": "spice"
              }
            ],
            "keywords": [
              "salsa",
              "tomatillo",
              "coriander",
              "serrano pepper"
            ],
            "origin": {
              "author": "Diana Kennedy",
              "title": "Regional Mexican Cooking",
              "type": "book"
            }
          },
          "entry_id": null
        }
      ],
      "subject": "https://en.wikipedia.org/wiki/Salsa_verde"
    }
  ],
  "offset": 0,
  "total": 1
}

This result is in the 2017-09-01 format for GMetaResult documents.

{
  "count": 1,
  "offset": 0,
  "total": 1,
  "gmeta": [
    {
      "content": [
        {
          "alpha": {
            "beta": "gamma"
          }
        }
      ],
      "entry_ids": [
        null
      ],
      "subject": "http://example.com"
    }
  ]
}
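The `offset`, `count`, `total`, and `has_next_page` fields are enough to page through results. The sketch below assumes a caller-supplied `fetch_page(offset)` callable (hypothetical) that sends the query with the given offset and returns the parsed GSearchResult as a dict. Because the examples above do not include `has_next_page`, the sketch falls back to comparing the next offset with `total` when the field is absent.

```python
def iter_results(fetch_page, max_results=10_000):
    """Yield GMetaResult documents page by page.

    max_results defaults to the 10,000-result ceiling of POST queries.
    """
    offset = 0
    while offset < max_results:
        page = fetch_page(offset)
        yield from page["gmeta"]
        # The next page starts right after the results just received.
        offset = page["offset"] + page["count"]
        has_next = page.get("has_next_page", offset < page["total"])
        if not has_next or page["count"] == 0:
            break
```

For result sets beyond the ceiling, use Scrolling Queries instead of increasing `max_results`.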

GMetaResult

These are components in a search result.

A GMetaResult is a structure similar to a GMetaEntry from the Ingest API, with the following significant differences:

visibility information is not exposed; i.e. visible_to is not included
metadata for any subject may be an aggregate of multiple documents with different visibility rules or sources. Thus, the result is always returned as an array in which each element represents data provided by a different source or with different visibility

Field Name	Type	Description
subject	String	the resource described by this metadata, often a URI
entries	Array	An array of objects containing the data pertaining to the subject. Each object has the fields `content`, `entry_id`, and `matched_principal_sets`. The `content` is an object with the entry data which was sent to Search, and the `entry_id` is its ID. If there are any assigned `principal_sets` for the entry which match the current caller, they will be returned as an array of strings in `matched_principal_sets`.

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}
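Client code often needs a flat list of entries regardless of the GMetaResult format. The sketch below handles both formats shown on this page. It assumes that, in the 2017-09-01 format, the `content` and `entry_ids` arrays are parallel (the i-th ID belongs to the i-th content object), which the example suggests but does not state. `iter_entries` is a hypothetical name.

```python
def iter_entries(result):
    """Yield (subject, entry_id, content) tuples from one GMetaResult dict."""
    subject = result["subject"]
    if "entries" in result:
        # 2019-08-27 format: one object per entry.
        for entry in result["entries"]:
            yield subject, entry["entry_id"], entry["content"]
    else:
        # 2017-09-01 format: assumed to be parallel arrays.
        for entry_id, content in zip(result["entry_ids"], result["content"]):
            yield subject, entry_id, content


result = {
    "entries": [
        {"content": {"alpha": {"beta": "gamma"}}, "matched_principal_sets": [], "entry_id": None},
        {"content": {"alpha": {"beta": "delta"}}, "matched_principal_sets": [], "entry_id": "with_delta"},
    ],
    "subject": "http://example.com",
}
for subject, entry_id, content in iter_entries(result):
    print(subject, entry_id, content["alpha"]["beta"])
```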

GBucket

Field Name	Type	Description
value	String or Object	If the bucket represents a single value (e.g. in a "terms" `GFacet`), the value is provided. If the bucket represents a range of values, then this is an object with "from" and "to" as in a `GFilter` document. This range is assumed to be closed for the "from" value and open on the "to" value, as in [from, to)
count	Integer	The number of results in this bucket

{
  "value": ".docx",
  "count": 1234
}

{
  "value": {
    "from": "0",
    "to": "10"
  },
  "count": 0
}

{
  "value": {
    "from": "2011-01-01",
    "to": "2012-01-01"
  },
  "count": 17
}
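The half-open [from, to) convention means a value equal to "to" belongs to the next bucket. The sketch below tests membership for a numeric range bucket and, as an assumption that this page does not confirm, models a numeric_histogram as `size` equal-width intervals between `low` and `high`. Both function names are hypothetical.

```python
def bucket_contains(bucket_value, x):
    """Check whether x falls in a numeric range bucket, treated as [from, to)."""
    low, high = float(bucket_value["from"]), float(bucket_value["to"])
    return low <= x < high


def numeric_buckets(low, high, size):
    """Split [low, high) into `size` equal-width ranges (assumed layout)."""
    width = (high - low) / size
    return [
        {"from": low + i * width, "to": low + (i + 1) * width}
        for i in range(size)
    ]


print(bucket_contains({"from": "0", "to": "10"}, 0))   # True: "from" is inclusive
print(bucket_contains({"from": "0", "to": "10"}, 10))  # False: "to" is exclusive
print(len(numeric_buckets(0, 100000000, 100)))         # 100
```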

GFacetResult

Field Name	Type	Description
name	String	Name of the `GFacet` in the search request
value	Float	Result of the `GFacet` if it was a sum or avg facet
buckets	Array	An array of GBucket documents if it was a terms, numeric_histogram or date_histogram facet

{
  "name": "extensions",
  "buckets": [
    {
      "@version": "2017-09-01",
      "value": ".docx",
      "count": 1234
    },
    {
      "@version": "2017-09-01",
      "value": ".png",
      "count": 12
    }
  ]
}

{
  "name": "calculations",
  "value": 24.5
}
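Since a GFacetResult carries either `value` or `buckets`, client code can branch on which key is present. The following sketch collects the `facet_results` array into a dict keyed by facet name; `summarize_facets` is a hypothetical name.

```python
def summarize_facets(facet_results):
    """Map each facet name to its value (sum/avg) or its (value, count) pairs."""
    summary = {}
    for facet in facet_results:
        if "buckets" in facet:
            # terms, numeric_histogram, or date_histogram facet
            summary[facet["name"]] = [
                (bucket["value"], bucket["count"]) for bucket in facet["buckets"]
            ]
        else:
            # sum or avg facet
            summary[facet["name"]] = facet["value"]
    return summary


facet_results = [
    {
        "name": "extensions",
        "buckets": [
            {"@version": "2017-09-01", "value": ".docx", "count": 1234},
            {"@version": "2017-09-01", "value": ".png", "count": 12},
        ],
    },
    {"name": "calculations", "value": 24.5},
]
print(summarize_facets(facet_results))
```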