Query - POST

This API provides a query interface via a POST request with a body. The body can encode complex data, including filters, sorting criteria, and requests for sideloaded facet data (i.e. aggregations).

At most 10,000 results can be fetched via a POST Query. For more results, use Scrolling Queries.

Method

POST

URL

/v1/index/<index_id>/search

Authentication required?

Only for non-public data

Required Roles

None

Request Body

a GSearchRequest document

Response Body

a GSearchResult document

Authentication & Authorization

Tokens for this call must have one of these scopes.

urn:globus:auth:scope:search.api.globus.org:all
urn:globus:auth:scope:search.api.globus.org:search

Examples

Query via curl

To run a query, send it to the API in the body of a POST request, for example:

curl -XPOST \
    -H 'Content-Type: application/json' \
    'https://search.api.globus.org/v1/index/4de0e89e-a395-11e7-bc54-8c705ad34f60/search' \
    --data '
{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}'
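The same request can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name `build_search_request` is hypothetical, and the `Authorization: Bearer` header is the usual OAuth2 convention rather than something shown on this page. The request is built but not sent, so the sketch runs without network access.

```python
import json
import urllib.request

SEARCH_BASE_URL = "https://search.api.globus.org"


def build_search_request(index_id, body, token=None):
    """Build (but do not send) a POST request for /v1/index/<index_id>/search."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: tokens are passed as OAuth2 bearer tokens.
        # A token is only needed when querying non-public data.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{SEARCH_BASE_URL}/v1/index/{index_id}/search",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_search_request(
    "4de0e89e-a395-11e7-bc54-8c705ad34f60",
    {"q": "a search with filtering and faceting", "limit": 5},
)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```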

Request Schemas

GSearchRequest

This is the main document type for encoding a complex Search query.

A GSearchRequest document is versioned with the @version field as either 2017-09-01 or query#1.0.0.

When omitted, @version defaults to the current service default version, which is 2017-09-01 (the legacy version).

  • version query#1.0.0
  • version 2017-09-01 (legacy)

This is the newer version of a Search request. It will become the default in a future release of Globus Search. Until then, users must request it explicitly with the @version field.

Field Name Type Description

@version

String

Must be "query#1.0.0"

q

String

User-supplied query, conforming to the query syntax.

Required if there are no filters.

advanced

Boolean

Optional. When true, interpret q with the advanced query syntax.

Defaults to False. Mutually exclusive with q_settings.

q_settings

Object

Optional. A q_settings object, with settings indicating how to interpret q.

Mutually exclusive with advanced.

limit

Integer

Optional. The maximum number of results to return.

Defaults to 10.

offset

Integer

Optional. Start at the result numbered offset. In conjunction with limit, this allows result paging.

Defaults to 0.

bypass_visible_to

Boolean

Optional. Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query.

Defaults to False.

filter_principal_sets

List of Strings

Optional. A list of principal_set names.

The caller’s identity set will be matched against any principal_sets assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.

filters

Array

An array of GFilter Documents. Filters to apply to the search.

Required if q is not provided.

facets

Array

Optional. An array of GFacet Documents. Facets to count on the search.

post_facet_filters

Array

Optional. An array of GFilter Documents. Filters to apply to the search after facets have been counted. These filters therefore only apply to the returned array of results.

boosts

Array

Optional. An array of GBoost Documents. Fields to weight more heavily in the relevance ranking of unsorted searches.

sort

Array

Optional. An array of GSort Documents. Fields on which to sort returned values.

Note

If sort is specified, boosts is ignored: results are ordered by the sort criteria rather than by the relevance calculation that boosts influence.
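The field rules above can be checked client-side before a request is sent. This is a rough sketch, not the service's own validation; the function name `validate_search_request` is hypothetical, and treating an empty filters array as "no filters" is an assumption.

```python
def validate_search_request(body):
    """Return a list of problems found in a query#1.0.0 request body."""
    problems = []
    if body.get("@version") != "query#1.0.0":
        problems.append('@version must be "query#1.0.0"')
    # q is required when there are no filters, and vice versa.
    # Assumption: an empty filters array counts as "no filters".
    if "q" not in body and not body.get("filters"):
        problems.append("either q or filters is required")
    if "advanced" in body and "q_settings" in body:
        problems.append("advanced and q_settings are mutually exclusive")
    # Not an error, but worth flagging: sorting disables boosts.
    if body.get("sort") and body.get("boosts"):
        problems.append("boosts is ignored because sort is specified")
    return problems
```

An empty list means none of these particular rules were violated; the service may still reject the request for other reasons.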

Examples
{
  "q": "the quick brown fox jumps"
}
{
  "q": "a search with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}
{
  "q": "author: \"John Doe\"",
  "q_settings": {
    "mode": "advanced_query_string",
    "default_operator": "or"
  },
  "limit": 5
}
{
  "q": "a search with paging",
  "offset": 100,
  "limit": 100
}
{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}
{
  "q": "(queries can be fancy AND cool) OR (NOT extravagant)",
  "q_settings": {
    "mode": "advanced_query_string",
    "default_operator": "or"
  }
}
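The paging example above generalizes to a loop over offset and limit. The sketch below generates the request bodies for successive pages; `page_bodies` is a hypothetical helper, and it assumes that the 10,000-result cap for POST queries applies to offset plus limit, which this page does not spell out.

```python
MAX_POST_RESULTS = 10_000  # cap stated for POST queries


def page_bodies(base_body, page_size=100, max_results=MAX_POST_RESULTS):
    """Yield copies of base_body with offset/limit set for successive pages."""
    offset = 0
    while offset < max_results:
        # Shrink the final page so offset + limit never exceeds the cap.
        limit = min(page_size, max_results - offset)
        yield {**base_body, "offset": offset, "limit": limit}
        offset += limit


for body in page_bodies({"q": "a search with paging"}, page_size=100, max_results=250):
    print(body["offset"], body["limit"])
```

A caller would normally stop early once a page comes back with fewer results than requested. For result sets beyond the cap, Scroll Query is the documented alternative.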

This is the legacy version of a Search request. It is the default when @version is omitted for compatibility while users migrate to the query#1.0.0 version.

Field Name Type Description

@version

String

Must be "2017-09-01"

q

String

User-supplied query, conforming to the query syntax.

Required if there are no filters.

advanced

Boolean

Optional. When true, interpret q with the advanced query syntax.

Defaults to False.

limit

Integer

Optional. The maximum number of results to return.

Defaults to 10.

offset

Integer

Optional. Start at the result numbered offset. In conjunction with limit, this allows result paging.

Defaults to 0.

bypass_visible_to

Boolean

Optional. Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query.

Defaults to False.

filter_principal_sets

List of Strings

Optional. A list of principal_set names.

The caller’s identity set will be matched against any principal_sets assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.

filters

Array

An array of GFilter Documents on version 2017. Filters to apply to the search.

Required if q is not provided.

facets

Array

Optional. An array of GFacet Documents on version 2017. Facets to count on the search.

boosts

Array

Optional. An array of GBoost Documents on version 2017. Fields to weight more heavily in the relevance ranking of unsorted searches.

sort

Array

Optional. An array of GSort Documents on version 2017. Fields on which to sort returned values.

Note

If sort is specified, boosts is ignored: results are ordered by the sort criteria rather than by the relevance calculation that boosts influence.

Examples
{
  "q": "the quick brown fox jumps"
}
{
  "q": "a search with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}
{
  "q": "author: \"John Doe\"",
  "advanced": true,
  "limit": 5
}
{
  "q": "a search with paging",
  "offset": 100,
  "limit": 100
}
{
  "q": "a search with filtering and faceting",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ],
  "facets": [
    {
      "name": "Publication Date",
      "field_name": "path.to.date",
      "type": "date_histogram",
      "date_interval": "year"
    }
  ],
  "sort": [
    {
      "field_name": "path.to.date",
      "order": "asc"
    }
  ]
}
{
  "q": "(queries can be fancy AND cool) OR (NOT extravagant)",
  "advanced": true
}

GFilter

A GFilter document is one of several document types which encode a filter. The type of filter is identified by the type field. See the table below for the various filter types.

  • version 1.0.0
  • version 2017 (legacy)

These filter documents are defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Type Schema

match_all

GFilterMatch

match_any

GFilterMatch

range

GFilterRange

geo_bounding_box

GFilterGeoBoundingBox

geo_shape

GFilterGeoShape

exists

GFilterExists

like

GFilterLike

not

GFilterNot

and

GFilterAnd

or

GFilterOr

These filter documents are defined for documents on version 2017-09-01.

Type Schema

match_all

GFilterMatch

match_any

GFilterMatch

range

GFilterRange

geo_bounding_box

GFilterGeoBoundingBox

geo_shape

GFilterGeoShape

exists

GFilterExists

like

GFilterLike

not

GFilterNot

and

GFilterAnd

or

GFilterOr

Note

All filters on document version 2017-09-01 support a post_filter field. Note that post_filter is only valid on filters when they are in the top level filters array of a request.

When filters are nested under and, or, or not filters, post_filter is no longer valid.
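When moving a legacy request to query#1.0.0, one plausible approach is to split the top-level filters by their effective post_filter value. The sketch below assumes that a legacy filter with post_filter true corresponds to an entry in post_facet_filters, and one with post_filter false to an entry in filters. That mapping is an inference from the field descriptions in this reference, not something this page states. The helper name `split_legacy_filters` is hypothetical; the defaults are the per-type defaults documented for each legacy filter.

```python
# Documented legacy defaults: match_all defaults to False,
# every other filter type defaults to True.
LEGACY_POST_FILTER_DEFAULTS = {"match_all": False}


def split_legacy_filters(legacy_filters):
    """Partition top-level legacy filters into (pre-facet, post-facet) lists."""
    pre_facet, post_facet = [], []
    for legacy in legacy_filters:
        flt = dict(legacy)  # shallow copy so the input is not mutated
        default = LEGACY_POST_FILTER_DEFAULTS.get(flt.get("type"), True)
        # Remove post_filter from the copy; fall back to the type's default.
        is_post = flt.pop("post_filter", default)
        (post_facet if is_post else pre_facet).append(flt)
    return pre_facet, post_facet
```

Only top-level filters are examined, since post_filter is not valid on filters nested under and, or, or not.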

GFilterMatch

A matching filter for finding results which match some set of text terms.

"match_any" and "match_all" refer to the different possible behaviors of the filter values. As their names imply, if "match_any" is specified, the filter will match results for which any of the filter values match, while "match_all" requires that all of the values match on every result.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "match" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

One of {"match_any", "match_all"}

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

values

Array of Strings or Booleans

The values to evaluate against the field_name.

If the field is a boolean field, this must be an array of booleans only. For string fields, it may be a mixture of strings or booleans.

Note

As far as filtering is concerned, "match_any" and "match_all" behave identically when there is only one value, but they may differ in how they affect the interpretation of facets.

Examples
{
  "type": "match_any",
  "field_name": "globus_metadata.resource_type",
  "values": [
    "Globus Endpoint"
  ]
}
{
  "type": "match_all",
  "field_name": "globus_metadata.keywords",
  "values": [
    "hpc",
    "internet2",
    "uchicago"
  ]
}
{
  "type": "match_any",
  "field_name": "globus_metadata.snorkels",
  "values": [
    "few",
    "many",
    true
  ]
}
Note

This filter is only valid if globus_metadata.snorkels is a string field because string fields can contain boolean values.

If it is a boolean field (which cannot contain string values), the query will fail with an error regarding the improper mapping of "few" and "many" onto a boolean field.

This is the version of a "match" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

One of {"match_any", "match_all"}

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

values

Array of Strings or Booleans

The values to evaluate against the field_name.

If the field is a boolean field, this must be an array of booleans only. For string fields, it may be a mixture of strings or booleans.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True for match_any and False for match_all.

Note

As far as filtering is concerned, "match_any" and "match_all" behave identically when there is only one value, but they may differ in how they affect the interpretation of facets.

Examples
{
  "type": "match_any",
  "field_name": "globus_metadata.resource_type",
  "values": [
    "Globus Endpoint"
  ]
}
{
  "type": "match_all",
  "field_name": "globus_metadata.keywords",
  "values": [
    "hpc",
    "internet2",
    "uchicago"
  ]
}
{
  "type": "match_any",
  "field_name": "globus_metadata.snorkels",
  "values": [
    "few",
    "many",
    true
  ]
}
Note

This filter is only valid if globus_metadata.snorkels is a string field because string fields can contain boolean values.

If it is a boolean field (which cannot contain string values), the query will fail with an error regarding the improper mapping of "few" and "many" onto a boolean field.
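The difference between the two match types can be illustrated with a small local model. This sketch compares values by exact equality against the values of a single document field, which is a simplification: the service may analyze text before matching. The function name `match_filter_accepts` is hypothetical.

```python
def match_filter_accepts(filter_doc, field_values):
    """Evaluate a match_any / match_all filter against one document's values."""
    wanted = filter_doc["values"]
    present = set(field_values)
    if filter_doc["type"] == "match_any":
        # At least one filter value must be present on the document.
        return any(value in present for value in wanted)
    if filter_doc["type"] == "match_all":
        # Every filter value must be present on the document.
        return all(value in present for value in wanted)
    raise ValueError(f"not a match filter: {filter_doc['type']!r}")


keywords_any = {
    "type": "match_any",
    "field_name": "globus_metadata.keywords",
    "values": ["hpc", "internet2", "uchicago"],
}
keywords_all = dict(keywords_any, type="match_all")
```

A document tagged only with "hpc" passes `keywords_any` but not `keywords_all`.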

GFilterRange

A range filter for finding results which have numeric or date values within a specified range.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "range" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "range"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

values

Array of Objects

The values to evaluate against the field_name.

Each object must contain either both from and to, or exactly one of gte or gt together with exactly one of lt or lte.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.

Examples
{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}
{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "10",
      "to": "50"
    }
  ]
}

This example filter has multiple clauses. The clauses are implicitly joined with "or" semantics, so the filter accepts values from 0 to 5 as well as values greater than or equal to 10.

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "0",
      "to": "5"
    },
    {
      "from": "10",
      "to": "*"
    }
  ]
}
{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    },
    {
      "gte": "2015-01-01",
      "lte": "2016-01-01"
    },
    {
      "gt": "2016-01-15",
      "lt": "*"
    }
  ]
}

This is the version of a "range" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "range"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

values

Array of Objects

The values to evaluate against the field_name.

Each object must contain either both from and to, or exactly one of gte or gt together with exactly one of lt or lte.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.

Examples
{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}
{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "10",
      "to": "50"
    }
  ]
}

This example filter has multiple clauses. The clauses are implicitly joined with "or" semantics, so the filter accepts values from 0 to 5 as well as values greater than or equal to 10.

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "0",
      "to": "5"
    },
    {
      "from": "10",
      "to": "*"
    }
  ]
}
{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    },
    {
      "gte": "2015-01-01",
      "lte": "2016-01-01"
    },
    {
      "gt": "2016-01-15",
      "lt": "*"
    }
  ]
}
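The "or" semantics of multi-clause range filters can be modeled locally. The sketch below handles numeric fields only and treats from and to as inclusive bounds; the inclusiveness of to is an assumption, as this page only implies it for from. The names `range_filter_accepts` and `_clause_accepts` are hypothetical.

```python
import operator

# Bound keys mapped to comparisons. Treating "from"/"to" as inclusive
# is an assumption made for this illustration.
_LOWER = {"from": operator.ge, "gte": operator.ge, "gt": operator.gt}
_UPPER = {"to": operator.le, "lte": operator.le, "lt": operator.lt}


def _clause_accepts(clause, value):
    """True if value satisfies every bound in a single range clause."""
    for key, bound in clause.items():
        if bound == "*":
            continue  # unbounded on this end
        if key in _LOWER:
            ok = _LOWER[key](value, float(bound))
        elif key in _UPPER:
            ok = _UPPER[key](value, float(bound))
        else:
            raise ValueError(f"unknown range key: {key!r}")
        if not ok:
            return False
    return True


def range_filter_accepts(filter_doc, value):
    """True if value falls inside at least one clause ("or" semantics)."""
    return any(_clause_accepts(clause, value) for clause in filter_doc["values"])


cardinality_filter = {
    "type": "range",
    "field_name": "cardinality_of_foobar",
    "values": [{"from": "0", "to": "5"}, {"from": "10", "to": "*"}],
}
```

With `cardinality_filter`, the values 3 and 10 are accepted while 7 falls in the gap between the two clauses.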

GFilterGeoBoundingBox

A bounding box filter for finding geo_shape and geo_point values which intersect with a specified bounding box.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "geo bounding box" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "geo_bounding_box"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

top_left

Object

An object describing a coordinate pair.

It must contain the keys lat and lon, each of which must have a numeric value.

bottom_right

Object

An object describing a coordinate pair.

It must contain the keys lat and lon, each of which must have a numeric value.

Note

top_left is required to be northwest of bottom_right.

Examples
{
  "type": "geo_bounding_box",
  "field_name": "country.center",
  "top_left": {
    "lat": 49.1,
    "lon": -124.9
  },
  "bottom_right": {
    "lat": 24.9,
    "lon": -67.1
  }
}

This is the version of a "geo bounding box" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "geo_bounding_box"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

top_left

Object

An object describing a coordinate pair.

It must contain the keys lat and lon, each of which must have a numeric value.

bottom_right

Object

An object describing a coordinate pair.

It must contain the keys lat and lon, each of which must have a numeric value.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Note

top_left is required to be northwest of bottom_right.

Examples
{
  "type": "geo_bounding_box",
  "field_name": "country.center",
  "top_left": {
    "lat": 49.1,
    "lon": -124.9
  },
  "bottom_right": {
    "lat": 24.9,
    "lon": -67.1
  }
}
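The corner-ordering requirement is easy to get wrong, so a small builder that checks it can help. This is a sketch with a hypothetical helper name, `geo_bounding_box`. It uses a plain numeric comparison of latitudes and longitudes, so it would reject a box that crosses the antimeridian; whether the service accepts such boxes is not covered here.

```python
def geo_bounding_box(field_name, top_left, bottom_right):
    """Build a geo_bounding_box filter, checking the corner ordering."""
    for corner in (top_left, bottom_right):
        if not {"lat", "lon"} <= set(corner):
            raise ValueError("each corner must contain the keys lat and lon")
    # "Northwest" means a larger latitude and a smaller longitude.
    north_of = top_left["lat"] > bottom_right["lat"]
    west_of = top_left["lon"] < bottom_right["lon"]
    if not (north_of and west_of):
        raise ValueError("top_left must be northwest of bottom_right")
    return {
        "type": "geo_bounding_box",
        "field_name": field_name,
        "top_left": top_left,
        "bottom_right": bottom_right,
    }


box = geo_bounding_box(
    "country.center",
    {"lat": 49.1, "lon": -124.9},
    {"lat": 24.9, "lon": -67.1},
)
```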

GFilterGeoShape

A geo filter for finding geo_shape and geo_point values which intersect with or are contained within a given shape.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "geo shape" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "geo_shape"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

shape

Object

A GeoJSON formatted Geometry.

See note below on supported geometries.

relation

String

The shape relationship to test.

One of {"intersects", "within"}. Defaults to "intersects".

Examples
{
  "type": "geo_shape",
  "field_name": "city.boundary",
  "shape": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -5.8,
          51.5
        ],
        [
          10.0,
          51.5
        ],
        [
          10.0,
          41.0
        ],
        [
          -5.8,
          41.0
        ],
        [
          -5.8,
          51.5
        ]
      ]
    ]
  }
}

This is the version of a "geo shape" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "geo_shape"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

shape

Object

A GeoJSON formatted Geometry.

See note below on supported geometries.

relation

String

The shape relationship to test.

One of {"intersects", "within"}. Defaults to "intersects".

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples
{
  "type": "geo_shape",
  "field_name": "city.boundary",
  "shape": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -5.8,
          51.5
        ],
        [
          10.0,
          51.5
        ],
        [
          10.0,
          41.0
        ],
        [
          -5.8,
          41.0
        ],
        [
          -5.8,
          51.5
        ]
      ]
    ]
  }
}
Supported Geometries

Only two-dimensional GeoJSON data are allowed in geo_shape filters. That means that coordinates should be encoded as JSON arrays of length 2.

Globus Search only supports filters using GeoJSON Polygons. Furthermore, Polygons are restricted to simple polygons, consisting of only one coordinate ring. This means that polygons with internal cut-outs are forbidden.
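These geometry restrictions can be checked before a filter is submitted. The following sketch uses a hypothetical function name, `check_filter_polygon`. In addition to the rules stated above, it applies the general GeoJSON requirement that a polygon ring is closed and has at least four positions.

```python
def check_filter_polygon(shape):
    """Return a list of reasons why shape is not usable in a geo_shape filter."""
    if shape.get("type") != "Polygon":
        return ["only GeoJSON Polygons are supported"]
    rings = shape.get("coordinates", [])
    if len(rings) != 1:
        # More than one ring would describe internal cut-outs.
        return ["polygon must consist of exactly one coordinate ring"]
    problems = []
    ring = rings[0]
    if any(len(position) != 2 for position in ring):
        problems.append("coordinates must be arrays of length 2")
    # GeoJSON linear rings are closed and have at least 4 positions.
    if len(ring) < 4 or ring[0] != ring[-1]:
        problems.append("ring must be closed with at least 4 positions")
    return problems


example_shape = {
    "type": "Polygon",
    "coordinates": [
        [[-5.8, 51.5], [10.0, 51.5], [10.0, 41.0], [-5.8, 41.0], [-5.8, 51.5]]
    ],
}
```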

GFilterExists

An "existence" filter which checks if a field is present in a document with a non-null value. Note that a field being present but with a value of null is considered the same, under exists filters, as the field being absent from the document.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of an "exists" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "exists"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

Examples

The following filter finds documents where the field foo exists:

{
  "type": "exists",
  "field_name": "foo"
}

This is the version of an "exists" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "exists"

field_name

String

The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples

The following filter finds documents where the field foo exists:

{
  "type": "exists",
  "field_name": "foo"
}

GFilterLike

A "like" filter which checks if a field matches a "like-expression". Like expressions are matching strings containing the wildcard characters:

  • * matches any number of characters

  • ? matches any one character

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "like" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "like".

field_name

String

The field to which the filter refers. It must be a text field.

value

String

The filter expression to apply as a match.

Examples

The following filter finds documents where the field filename contains a string ending in .csv.

{
  "type": "like",
  "field_name": "filename",
  "value": "*.csv"
}

Note that this does not technically guarantee that the filename ends with .csv. For example, it is possible for the filter to match on a value like "filename": "foo.csv bar".

This is the version of a "like" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "like".

field_name

String

The field to which the filter refers. It must be a text field.

value

String

The filter expression to apply as a match.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples

The following filter finds documents where the field filename contains a string ending in .csv.

{
  "type": "like",
  "field_name": "filename",
  "value": "*.csv"
}

Note that this does not technically guarantee that the filename ends with .csv. For example, it is possible for the filter to match on a value like "filename": "foo.csv bar".
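The wildcard semantics, and the "foo.csv bar" caveat, can be illustrated with a local model. This sketch translates a like-expression into a regular expression and tests it against each whitespace-separated token of the field value. Token-wise matching is an assumption made here to reproduce the caveat; the service's actual text analysis may differ. Both function names are hypothetical.

```python
import re


def like_to_regex(expression):
    """Translate a like-expression into a compiled regex (use with fullmatch)."""
    parts = []
    for char in expression:
        if char == "*":
            parts.append(".*")  # any number of characters
        elif char == "?":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def like_accepts(expression, text):
    """Assumed behavior: accept if any whitespace-separated token matches."""
    pattern = like_to_regex(expression)
    return any(pattern.fullmatch(token) for token in text.split())
```

Under this model, `like_accepts("*.csv", "foo.csv bar")` is true because the token "foo.csv" matches, even though the whole value does not end in .csv.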

GFilterNot

A "not" filter for inverting any other valid filter.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of a "not" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "not"

filter

Object

Any valid GFilter object.

Examples

The following filter finds documents where the field foo does not exist:

{
  "type": "not",
  "filter": {
    "type": "exists",
    "field_name": "foo"
  }
}

This is the version of a "not" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "not"

filter

Object

Any valid GFilter object.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples

The following filter finds documents where the field foo does not exist:

{
  "type": "not",
  "filter": {
    "type": "exists",
    "field_name": "foo"
  }
}

GFilterAnd

An "and" filter for joining any other valid filters. In order for an "and" filter to match on documents, all of the filters it contains must match.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of an "and" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "and"

filters

Array of Objects

An array of GFilter objects.

Examples

The following filter finds documents where the field title exists and keywords contains hpc:

{
  "type": "and",
  "filters": [
    {
      "type": "exists",
      "field_name": "title"
    },
    {
      "type": "match_any",
      "field_name": "keywords",
      "values": [
        "hpc"
      ]
    }
  ]
}

This is the version of an "and" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "and"

filters

Array of Objects

An array of GFilter objects.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples

The following filter finds documents where the field title exists and keywords contains hpc:

{
  "type": "and",
  "filters": [
    {
      "type": "exists",
      "field_name": "title"
    },
    {
      "type": "match_any",
      "field_name": "keywords",
      "values": [
        "hpc"
      ]
    }
  ]
}

GFilterOr

An "or" filter for joining any other valid filters. In order for an "or" filter to match on documents, at least one of the filters it contains must match.

  • version 1.0.0
  • version 2017 (legacy)

This is the version of an "or" filter defined for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

type

String

Must have the value "or"

filters

Array of Objects

An array of GFilter objects.

Examples

The following filter finds documents where either the author.institution or the dataset.institution is uchicago.edu. One or both can be a match:

{
  "type": "or",
  "filters": [
    {
      "type": "match_any",
      "field_name": "author.institution",
      "values": [
        "uchicago.edu"
      ]
    },
    {
      "type": "match_any",
      "field_name": "dataset.institution",
      "values": [
        "uchicago.edu"
      ]
    }
  ]
}

This is the version of an "or" filter defined for legacy document versions (2017-09-01).

Field Name Type Description

type

String

Must have the value "or"

filters

Array of Objects

An array of GFilter objects.

post_filter

Boolean

Controls whether this filter is applied before or after facets are calculated. If True, the filter will not impact facet results, but will filter the query results.

Defaults to True.

Examples

The following filter finds documents where either the author.institution or the dataset.institution is uchicago.edu. One or both can be a match:

{
  "type": "or",
  "filters": [
    {
      "type": "match_any",
      "field_name": "author.institution",
      "values": [
        "uchicago.edu"
      ]
    },
    {
      "type": "match_any",
      "field_name": "dataset.institution",
      "values": [
        "uchicago.edu"
      ]
    }
  ]
}
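Nested boolean filters are tedious to write by hand, so a few small constructors can make them easier to compose. The helpers below are hypothetical and simply produce plain dictionaries. They follow the field tables in this reference, which name the key filter for "not" and filters for "and" and "or".

```python
def exists(field_name):
    return {"type": "exists", "field_name": field_name}


def match_any(field_name, *values):
    return {"type": "match_any", "field_name": field_name, "values": list(values)}


def not_(inner):
    # A "not" filter wraps exactly one filter under the key "filter".
    return {"type": "not", "filter": inner}


def and_(*inner):
    # "and" and "or" take an array of filters under the key "filters".
    return {"type": "and", "filters": list(inner)}


def or_(*inner):
    return {"type": "or", "filters": list(inner)}


# Documents from uchicago.edu (author or dataset) that have no "foo" field.
combined = and_(
    or_(
        match_any("author.institution", "uchicago.edu"),
        match_any("dataset.institution", "uchicago.edu"),
    ),
    not_(exists("foo")),
)
```

The resulting dictionary can be placed directly in the filters array of a request body and serialized with `json.dumps`.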

GFacet

  • version 1.0.0
  • version 2017 (legacy)

This is the definition of facet documents for query#1.0.0.

Field Name Type Description

name

String

A name for this facet which is referenced in the results.

If name is omitted, it will default to the value of the field_name property. If more than one facet in a single search request references the same field, a name must be provided.

type

String

One of terms, date_histogram, numeric_histogram, sum, avg

field_name

String

The field to which the facet refers.

Any dots (.) must be escaped with a preceding backslash (\) character.

size

Integer

The number of distinct facet values (buckets) to return.

For terms, size=N limits results to the N most common values (buckets with highest count). For numeric_histogram, this is the number of intervals to create between low and high of the histogram_range.

Required if type=numeric_histogram. Optional if type=terms. Forbidden otherwise.

missing

Float

The value to use for entries that do not contain the field named by the value of field_name.

By default, missing values are ignored and do not count towards sums and averages.

Optional if type=sum or type=avg. Forbidden otherwise.

histogram_range

Object

An object containing the following fields:

low: Numeric or date formatted String containing the value at the low end of the histogram range

high: Numeric or date formatted String containing the value at the high end of the histogram range

Required if type=numeric_histogram. Optional if type=date_histogram. Forbidden otherwise.

date_interval

String

Indicates the unit for the buckets returned within the histogram_range

Must be one of: year, quarter, month, week, day, hour, minute, second

Required when type=date_histogram. Forbidden otherwise.

additional_filters

Array of Objects

Optional. An array of GFilter documents which should apply only to this specific facet. When multiple facets are specified in a query, the additional_filters allow per-facet refinement of the results.

Note

For a terms facet, any values containing more than 10,000 characters will not be tabulated into the results and no buckets containing a value with more than 10,000 characters will be created.

date_histogram faceting requires that the field was detected as a date type. See the Globus Search supported Date Formats to see how data is detected as being a date. The histogram also requires that low and high are both in one of the supported date formats.

{
  "name": "File Extension",
  "type": "terms",
  "field_name": "extension",
  "size": 10
}
{
  "name": "pub_date",
  "type": "date_histogram",
  "field_name": "http://dublincore\\.org/schemas/xmls/qdc/2008/02/11/dcterms\\.xsd#created",
  "histogram_range": {
    "low": "2000-01-01",
    "high": "2010-01-01"
  },
  "date_interval": "year"
}
{
  "name": "file size",
  "type": "numeric_histogram",
  "field_name": "https://transfer\\.api\\.globus\\.org/file#size",
  "size": 100,
  "histogram_range": {
    "low": 0,
    "high": 100000000
  }
}
{
  "name": "calculate total cost",
  "type": "sum",
  "field_name": "price"
}
{
  "name": "calculate average cost per item",
  "type": "avg",
  "missing": 1.2,
  "field_name": "price"
}
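The field rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Globus Search API: `escape_field_name` and `check_facet` are hypothetical helper names, and the required/optional/forbidden sets are transcribed from the table for query#1.0.0. Note that the escaped name contains a single backslash before each dot; JSON encoding doubles it, which is why the examples above show `\\.`.

```python
def escape_field_name(name: str) -> str:
    """Escape literal dots so they are not read as path separators."""
    return name.replace(".", "\\.")


# Per-type rules transcribed from the facet table (hypothetical client-side check).
REQUIRED = {
    "terms": set(),
    "sum": set(),
    "avg": set(),
    "numeric_histogram": {"size", "histogram_range"},
    "date_histogram": {"date_interval"},
}
OPTIONAL = {
    "terms": {"size"},
    "sum": {"missing"},
    "avg": {"missing"},
    "numeric_histogram": set(),
    "date_histogram": {"histogram_range"},
}
# Keys that the table does not restrict by facet type.
UNRESTRICTED = {"name", "type", "field_name", "additional_filters"}


def check_facet(facet: dict) -> list[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    facet_type = facet.get("type")
    if facet_type not in REQUIRED:
        return [f"unknown facet type: {facet_type!r}"]
    keys = set(facet) - UNRESTRICTED
    problems = []
    for key in sorted(REQUIRED[facet_type] - keys):
        problems.append(f"{key} is required for type={facet_type}")
    for key in sorted(keys - REQUIRED[facet_type] - OPTIONAL[facet_type]):
        problems.append(f"{key} is forbidden for type={facet_type}")
    return problems
```

A check like this only mirrors the documented rules; the service remains the authority on what it accepts.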

This is the definition of facet documents for version 2017-09-01 queries.

Field Name Type Description

name

String

A name for this facet which is referenced in the results.

If name is omitted, it will default to the value of the field_name property. If more than one facet in a single search request references the same field, a name must be provided.

type

String

One of terms, date_histogram, numeric_histogram, sum, avg

field_name

String

The field to which the facet refers.

Any dots (.) must be escaped with a preceding backslash (\) character.

size

Integer

The number of distinct facet values (buckets) to return.

For terms, size=N limits results to the N most common values (the buckets with the highest counts). For numeric_histogram, size is the number of intervals to create between the low and high values of the histogram_range.

Required if type=numeric_histogram. Optional if type=terms. Forbidden otherwise.

missing

Float

The value to use for entries that do not contain the field named by the value of field_name.

By default, missing values are ignored and do not count towards sums and averages.

Optional if type=sum or type=avg. Forbidden otherwise.

histogram_range

Object

An object containing the following fields:

low: Numeric or date formatted String containing the value at the low end of the histogram range

high: Numeric or date formatted String containing the value at the high end of the histogram range

Required if type=numeric_histogram. Optional if type=date_histogram. Forbidden otherwise.

date_interval

String

Indicates the unit for the buckets returned within the histogram_range

Must be one of: year, quarter, month, week, day, hour, minute, second

Required when type=date_histogram. Forbidden otherwise.

Note

For a terms facet, any values containing more than 10,000 characters will not be tabulated into the results and no buckets containing a value with more than 10,000 characters will be created.

date_histogram faceting requires that the field was detected as a date type. See the Globus Search supported Date Formats to see how data is detected as being a date. The histogram also requires that low and high are both in one of the supported date formats.

{
  "name": "File Extension",
  "type": "terms",
  "field_name": "extension",
  "size": 10
}
{
  "name": "pub_date",
  "type": "date_histogram",
  "field_name": "http://dublincore\\.org/schemas/xmls/qdc/2008/02/11/dcterms\\.xsd#created",
  "histogram_range": {
    "low": "2000-01-01",
    "high": "2010-01-01"
  },
  "date_interval": "year"
}
{
  "name": "file size",
  "type": "numeric_histogram",
  "field_name": "https://transfer\\.api\\.globus\\.org/file#size",
  "size": 100,
  "histogram_range": {
    "low": 0,
    "high": 100000000
  }
}
{
  "name": "calculate total cost",
  "type": "sum",
  "field_name": "price"
}
{
  "name": "calculate average cost per item",
  "type": "avg",
  "missing": 1.2,
  "field_name": "price"
}
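To make the meaning of size for a numeric_histogram concrete, the sketch below computes the interval boundaries for the "file size" example. It assumes the intervals are of equal width, which the description of size suggests but does not state outright; `histogram_edges` is a hypothetical helper, not an API call.

```python
def histogram_edges(low: float, high: float, size: int) -> list[tuple[float, float]]:
    """Split [low, high) into `size` intervals, assuming equal widths."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    width = (high - low) / size
    return [(low + i * width, low + (i + 1) * width) for i in range(size)]


# The "file size" example: 100 intervals between 0 and 100000000.
edges = histogram_edges(0, 100000000, 100)
```

Under that assumption each of the 100 intervals in the example spans 1,000,000 units of the size field.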

GBoost

  • version 1.0.0
  • version 2017 (legacy)

This is the definition of boost documents for query#1.0.0.

Field Name Type Description

field_name

String

Field to rank higher in results. Any dots (".") must be escaped with a preceding backslash ("\") character; otherwise they are treated as path separators leading to a nested field rather than as part of the field name.

factor

Floating Point

Factor for weighting results for query matches on the field_name. A factor greater than 1 ranks matches higher; a factor less than 1 ranks them lower (negative boosting). Minimum of 0, maximum of 10.

Examples
{
  "field_name": "author",
  "factor": 5
}

This is the definition of boost documents for version 2017-09-01 queries.

Field Name Type Description

field_name

String

Field to rank higher in results. Any dots (".") must be escaped with a preceding backslash ("\") character; otherwise they are treated as path separators leading to a nested field rather than as part of the field name.

factor

Floating Point

Factor for weighting results for query matches on the field_name. A factor greater than 1 ranks matches higher; a factor less than 1 ranks them lower (negative boosting). Minimum of 0, maximum of 10.

Examples
{
  "field_name": "author",
  "factor": 5
}
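A boost document is small enough to build by hand, but a helper can catch an out-of-range factor early. This is a minimal sketch: `make_boost` is a hypothetical name, and treating the bounds 0 and 10 as inclusive is an assumption based on the "maximum" and "minimum" wording above.

```python
def make_boost(field_name: str, factor: float) -> dict:
    """Build a boost document, rejecting factors outside the documented range."""
    # Assumption: the documented minimum (0) and maximum (10) are both allowed.
    if not 0 <= factor <= 10:
        raise ValueError(f"factor must be between 0 and 10, got {factor}")
    return {"field_name": field_name, "factor": factor}


boost = make_boost("author", 5)
```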

GSort

  • version 1.0.0
  • version 2017 (legacy)

This is the definition of sort documents for query#1.0.0.

Field Name Type Description

field_name

String

The field on which to sort. Any dots (".") must be escaped with a preceding backslash ("\") character.

order

String

Must be one of "asc" or "desc" indicating the ordering of the sort: ascending ("asc") or descending ("desc"). Also, see note on sorting when multiple values are present for a particular field.

Examples
{
  "field_name": "author",
  "order": "asc"
}
{
  "field_name": "path.to.date",
  "order": "desc"
}

This is the definition of sort documents for version 2017-09-01 queries.

Field Name Type Description

field_name

String

The field on which to sort. Any dots (".") must be escaped with a preceding backslash ("\") character.

order

String

Must be one of "asc" or "desc" indicating the ordering of the sort: ascending ("asc") or descending ("desc"). Also, see note on sorting when multiple values are present for a particular field.

Examples
{
  "field_name": "author",
  "order": "asc"
}
{
  "field_name": "path.to.date",
  "order": "desc"
}
Note

For purposes of sorting, a text field containing more than 10,000 characters will be considered missing, and will thus be sorted to the end of the list.

Sorting on Multiple Values

When the field used for sorting is an array, or when there are multiple entries under a single subject, sorting must consider multiple values as sort criteria.

Consider the following two partial documents:

document15.json
{
  "an_integer": [
    1,
    5
  ]
}
document24.json
{
  "an_integer": [
    2,
    4
  ]
}

If we sort on an_integer, which should be sorted first?

In such situations, the value used for sorting is the smallest value when sorting in ascending order and the largest value when sorting in descending order. An ascending sort therefore compares 1 against 2, and a descending sort compares 5 against 4, so document15.json sorts before document24.json in both directions.
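The rule can be reproduced locally to see why the order is the same in both directions. This sketch only imitates the documented behaviour with Python's built-in sorted; `sort_docs` is a hypothetical helper, and it is not how the service is implemented.

```python
doc15 = {"an_integer": [1, 5]}
doc24 = {"an_integer": [2, 4]}


def sort_docs(docs: list[dict], field: str, order: str) -> list[dict]:
    """Sort on a multi-valued field: smallest value for asc, largest for desc."""
    pick = min if order == "asc" else max
    return sorted(docs, key=lambda d: pick(d[field]), reverse=(order == "desc"))


ascending = sort_docs([doc24, doc15], "an_integer", "asc")    # keys: 2 and 1
descending = sort_docs([doc24, doc15], "an_integer", "desc")  # keys: 4 and 5
```

In both results doc15 comes first: its smallest value (1) is below doc24's smallest (2), and its largest value (5) is above doc24's largest (4).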

Missing Fields and Sort Order

Any record which does not contain a value for the field being sorted upon will appear at the end of the sorted list, regardless of whether the sort is ascending or descending.

If more than one record does not contain a value, the ordering among those records is undefined.
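A local imitation of this behaviour, useful when re-sorting results on the client, might look like the sketch below. `sort_missing_last` and the sample records are hypothetical. The sketch keeps records without a value in their input order, which is an artifact of this code; the service makes no such promise.

```python
def sort_missing_last(records: list[dict], field: str, order: str = "asc") -> list[dict]:
    """Sort records on `field`, placing records without a value at the end."""
    present = [r for r in records if r.get(field) is not None]
    missing = [r for r in records if r.get(field) is None]
    present.sort(key=lambda r: r[field], reverse=(order == "desc"))
    # Records without a value go last for both sort directions.
    return present + missing


records = [{"id": "a", "n": 3}, {"id": "b"}, {"id": "c", "n": 1}]
asc_ids = [r["id"] for r in sort_missing_last(records, "n", "asc")]
desc_ids = [r["id"] for r in sort_missing_last(records, "n", "desc")]
```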

q_settings

  • version 1.0.0

This is the definition of q_settings documents for query#1.0.0, delete_by_query#1.0.0, and scroll#1.0.0.

Field Name Type Description

mode

String

query_string: Evaluate q using the normal query mode.

advanced_query_string: Evaluate q using the advanced query mode.

default_operator

String

or: Return results that match any of the query terms.

and: Return results that match all of the query terms.

Examples
{
  "mode": "query_string",
  "default_operator": "or"
}

Response Schemas

GSearchResult

This is the document type for all results from Search queries.

Field Name Type Description

gmeta

Array

An array of GMetaResult documents, the main body of the result

facet_results

Array

Optional. An array of GFacetResult documents with counts for all facets requested on the search request

offset

Integer

The offset provided on the input search request

count

Integer

The number of results returned; i.e. the size of the gmeta array. May be 0

total

Integer

The total number of matches for the search. May be 0 if no matches are found

has_next_page

Boolean

True if there’s another page of results available, False otherwise

Examples

This result is in the 2019-08-27 format for GMetaResult documents.

{
  "@datatype": "GSearchResult",
  "@version": "2017-09-01",
  "count": 1,
  "gmeta": [
    {
      "@datatype": "GMetaResult",
      "@version": "2019-08-27",
      "entries": [
        {
          "content": {
            "cuisine": [
              "mexican"
            ],
            "handle": "salsa-verde",
            "ingredients": [
              {
                "amount": {
                  "number": 10
                },
                "default": "tomatillo",
                "preparation": "simmer 20 minutes",
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2
                },
                "default": "serrano pepper",
                "preparation": "seeded",
                "substitutes": [
                  "jalapeno",
                  "thai bird chili"
                ],
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "clove"
                },
                "default": "garlic",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 0.5
                },
                "default": "yellow onion",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tsp"
                },
                "default": "salt",
                "type": "spice"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tbsp"
                },
                "default": "coriander",
                "preparation": "ground",
                "substitutes": [
                  "cumin"
                ],
                "type": "spice"
              }
            ],
            "keywords": [
              "salsa",
              "tomatillo",
              "coriander",
              "serrano pepper"
            ],
            "origin": {
              "author": "Diana Kennedy",
              "title": "Regional Mexican Cooking",
              "type": "book"
            }
          },
          "entry_id": null
        }
      ],
      "subject": "https://en.wikipedia.org/wiki/Salsa_verde"
    }
  ],
  "offset": 0,
  "total": 1
}

This result is in the 2017-09-01 format for GMetaResult documents.

{
  "count": 1,
  "offset": 0,
  "total": 1,
  "gmeta": [
    {
      "content": [
        {
          "alpha": {
            "beta": "gamma"
          }
        }
      ],
      "entry_ids": [
        null
      ],
      "subject": "http://example.com"
    }
  ]
}

GMetaResult

These are components in a search result.

A GMetaResult is a structure similar to a GMetaEntry from the Ingest API, with the following significant differences:

  • visibility information is not exposed; i.e. visible_to is not included

  • metadata for any subject may be an aggregate of multiple documents with different visibility rules or sources. Thus, the result is always returned as an array in which each element represents data provided by a different source or with different visibility

GMetaResult

Field Name Type Description

subject

String

the resource described by this metadata, often a URI

entries

Array

An array of objects containing the data pertaining to the subject.

Each object has the fields content, entry_id, and matched_principal_sets. The content is an object with the entry data which was sent to Search, and the entry_id is its ID. If there are any assigned principal_sets for the entry which match the current caller, they will be returned as an array of strings in matched_principal_sets.

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}
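Client code usually wants one record per entry rather than one per subject. The sketch below flattens a GMetaResult into (subject, entry_id, content) tuples. `iter_entries` is a hypothetical helper. Its handling of the 2017-09-01 format assumes that the content and entry_ids arrays are paired by position, which the example above suggests but the text does not state.

```python
from typing import Iterator


def iter_entries(gmeta_result: dict) -> Iterator[tuple]:
    """Yield (subject, entry_id, content) for each entry of a GMetaResult."""
    subject = gmeta_result["subject"]
    if "entries" in gmeta_result:
        # 2019-08-27 format: each entry carries its own content and entry_id.
        for entry in gmeta_result["entries"]:
            yield subject, entry.get("entry_id"), entry["content"]
    else:
        # 2017-09-01 format: parallel arrays, assumed to be paired by position.
        pairs = zip(gmeta_result["entry_ids"], gmeta_result["content"])
        for entry_id, content in pairs:
            yield subject, entry_id, content
```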

GBucket

Field Name Type Description

value

String or Object

If the bucket represents a single value (e.g. in a "terms" GFacet), the value is provided. If the bucket represents a range of values, then this is an object with "from" and "to" as in a GFilter document. This range is closed on the "from" value and open on the "to" value, as in [from, to).

count

Integer

The number of results in this bucket

{
  "value": ".docx",
  "count": 1234
}
{
  "value": {
    "from": "0",
    "to": "10"
  },
  "count": 0
}
{
  "value": {
    "from": "2011-01-01",
    "to": "2012-01-01"
  },
  "count": 17
}
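The half-open range [from, to) matters at bucket boundaries: a value equal to "to" belongs to the next bucket. The sketch below illustrates this for numeric range buckets only. `bucket_contains` is a hypothetical helper, and converting "from" and "to" with float assumes they arrive as numeric strings, as in the second example above.

```python
def bucket_contains(bucket: dict, x: float) -> bool:
    """Check whether x falls in a numeric range bucket, using [from, to)."""
    value = bucket["value"]
    if not isinstance(value, dict):
        raise TypeError("bucket holds a single value, not a range")
    # Closed on "from", open on "to".
    return float(value["from"]) <= x < float(value["to"])


bucket = {"value": {"from": "0", "to": "10"}, "count": 0}
```

With this bucket, 0 is inside the range and 10 is not.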

GFacetResult

Field Name Type Description

name

String

Name of the GFacet in the search request

value

Float

Result of the GFacet if it was a sum or avg facet

buckets

Array

An array of GBucket documents if it was a terms, numeric_histogram or date_histogram facet

{
  "name": "extensions",
  "buckets": [
    {
      "@version": "2017-09-01",
      "value": ".docx",
      "count": 1234
    },
    {
      "@version": "2017-09-01",
      "value": ".png",
      "count": 12
    }
  ]
}
{
  "name": "calculations",
  "value": 24.5
}
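Since a GFacetResult carries either a value or a list of buckets, client code typically branches on which one is present. The following sketch collects a facet_results array into a dictionary keyed by facet name. `summarize_facets` is a hypothetical helper, and buckets are kept as (value, count) pairs because a range bucket's value is an object.

```python
def summarize_facets(facet_results: list[dict]) -> dict:
    """Map each facet name to its value (sum/avg) or its (value, count) pairs."""
    summary = {}
    for facet in facet_results:
        if "buckets" in facet:
            summary[facet["name"]] = [(b["value"], b["count"]) for b in facet["buckets"]]
        else:
            summary[facet["name"]] = facet["value"]
    return summary


facet_results = [
    {"name": "extensions", "buckets": [
        {"@version": "2017-09-01", "value": ".docx", "count": 1234},
        {"@version": "2017-09-01", "value": ".png", "count": 12},
    ]},
    {"name": "calculations", "value": 24.5},
]
summary = summarize_facets(facet_results)
```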
© 2010- The University of Chicago