
Scroll Query

Method | POST
URL | /v1/index/<index_id>/scroll
Authentication required? | Only for non-public data
Required Roles | None
Request Body | a GScrollRequest document
Response Body | a GScrollResult document

Authentication & Authorization

Tokens for this call must have one of these scopes.

urn:globus:scopes:search.api.globus.org:all
urn:globus:scopes:search.api.globus.org:search

Examples

Scroll via curl

To run a scroll query, we send it via a POST to the API, e.g.

curl -XPOST \
    -H 'Content-Type: application/json' \
    'https://search.api.globus.org/v1/index/4de0e89e-a395-11e7-bc54-8c705ad34f60/scroll' \
    --data '
{
  "q": "a scroll request with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}'
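
The same request can be assembled in Python with only the standard library. This is a minimal sketch, not an official client: the helper name build_scroll_request is hypothetical, and the request is only constructed here, not sent (passing it to urllib.request.urlopen would perform the call). For non-public data a token is needed; the sketch assumes it is supplied as a Bearer Authorization header.

```python
import json
import urllib.request

SEARCH_API = "https://search.api.globus.org"  # base URL taken from the curl example


def build_scroll_request(index_id, body, token=None):
    """Build (but do not send) a POST to /v1/index/<index_id>/scroll."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: the token is passed as a Bearer Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{SEARCH_API}/v1/index/{index_id}/scroll",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Same body as the curl example: a query plus one range filter.
body = {
    "q": "a scroll request with filtering",
    "filters": [
        {
            "type": "range",
            "field_name": "path.to.date",
            "values": [{"from": "*", "to": "2014-11-07"}],
        }
    ],
}
request = build_scroll_request("4de0e89e-a395-11e7-bc54-8c705ad34f60", body)
```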

Request Schemas

GScrollRequest

This is the main document type for encoding a scrolling query.

Field Name | Type | Description
q | String | User-supplied query, conforming to the query syntax. Required if there are no filters.
advanced | Boolean | Defaults to False. When true, interpret q with the advanced query syntax.
limit | Integer | Optional. Return at most limit results. Defaults to 10.
bypass_visible_to | Boolean | Defaults to False. Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query.
filter_principal_sets | List of Strings | Optional. A list of principal_set names. The caller’s identity set will be matched against any principal_sets assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.
filters | Array | Optional. An array of GFilter documents: the filters to apply to the search.
marker | String | Optional. An opaque token from a previous scroll result document, used to request the next page of results.

Examples

{
  "q": "the quick brown fox jumps"
}
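
A request document like the one above can be assembled programmatically. The sketch below is illustrative only: make_scroll_body is a hypothetical helper, not part of any Globus library. It enforces the rule from the field table that q is required when there are no filters, and it assumes that a follow-up request repeats the original query fields alongside marker, which the text above does not confirm.

```python
def make_scroll_body(q=None, filters=None, limit=10, advanced=False, marker=None):
    """Assemble a GScrollRequest document as a plain dict."""
    if not q and not filters:
        # The field table says q is required if there are no filters.
        raise ValueError("q is required when no filters are given")
    body = {"limit": limit, "advanced": advanced}
    if q:
        body["q"] = q
    if filters:
        body["filters"] = list(filters)
    if marker is not None:
        # Opaque token copied from the previous GScrollResult document.
        body["marker"] = marker
    return body


first_page_body = make_scroll_body(q="the quick brown fox jumps")
next_page_body = make_scroll_body(
    q="the quick brown fox jumps",
    marker="3d34900e3e4211ebb0a806b2af333354",
)
```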

GFilter

A GFilter document is one of several document types which encode a filter. The type of filter is identified by the type field. See the table below for the various filter types.

Type | Schema
match_all | GFilterMatch
match_any | GFilterMatch
range | GFilterRange
geo_bounding_box | GFilterGeoBoundingBox

GFilterMatch

A matching filter for finding results which match some set of text terms.

"match_any" and "match_all" refer to the different possible behaviors of the filter values. As their names imply, "match_any" matches results for which any of the filter values match, while "match_all" requires that all of the values match on every result.

Field Name | Type | Description
type | String | One of {"match_any", "match_all"}
field_name | String | The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values | Array of Strings | The values to evaluate against the field_name. If type is "match_any" or "match_all" this must be a list of Strings.

Note

"match_any" and "match_all" are the same when there’s only one value as far as filtering is concerned, but they may have different impact on the way that facets are interpreted.

Examples
{
  "type": "match_any",
  "field_name": "globus_metadata.resource_type",
  "values": ["Globus Endpoint"]
}
{
  "type": "match_all",
  "field_name": "globus_metadata.keywords",
  "values": ["hpc", "internet2", "uchicago"]
}
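
The difference between the two behaviors can be modelled locally. The following is a sketch of the semantics described above, not the service's implementation: matches is a hypothetical function, and it assumes exact string comparison against a list of field values, whereas the service's own text matching may behave differently.

```python
def matches(filter_doc, field_values):
    """Return True if a list of field values satisfies a GFilterMatch document."""
    wanted = set(filter_doc["values"])
    present = set(field_values)
    if filter_doc["type"] == "match_any":
        # At least one filter value must be present on the result.
        return bool(wanted & present)
    if filter_doc["type"] == "match_all":
        # Every filter value must be present on the result.
        return wanted <= present
    raise ValueError(f"not a match filter: {filter_doc['type']!r}")


keywords = ["hpc", "internet2"]  # hypothetical field values on one result
all_filter = {
    "type": "match_all",
    "field_name": "globus_metadata.keywords",
    "values": ["hpc", "internet2", "uchicago"],
}
any_filter = dict(all_filter, type="match_any")

all_hit = matches(all_filter, keywords)  # "uchicago" is missing from keywords
any_hit = matches(any_filter, keywords)  # "hpc" alone is enough
```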

GFilterRange

A range filter for finding results which have numeric or date values within a specified range.

Field Name | Type | Description
type | String | Must have the value "range"
field_name | String | The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
values | Array of Objects | The values to evaluate against the field_name. This must be an Array of Objects each with the fields from and to.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.

Examples
{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}
{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "10",
      "to": "50"
    }
  ]
}

This example filter has multiple clauses, which are implicitly joined with "or" semantics. A value matches if it is from 0 to 5, or if it is greater than or equal to 10.

{
  "type": "range",
  "field_name": "cardinality_of_foobar",
  "values": [
    {
      "from": "0",
      "to": "5"
    },
    {
      "from": "10",
      "to": "*"
    }
  ]
}
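
The "or" semantics and the "*" wildcard can be illustrated with a small local model. This is only a sketch under stated assumptions: in_range_filter is a hypothetical function, it handles numeric values only, and it treats both from and to as inclusive bounds, which the text above does not spell out for the upper end.

```python
def in_range_filter(filter_doc, value):
    """Return True if a numeric value falls inside any clause of a range filter."""

    def in_clause(clause):
        low, high = clause["from"], clause["to"]
        # "*" means the range is unbounded on that end.
        above_low = low == "*" or float(low) <= value
        below_high = high == "*" or value <= float(high)  # assumed inclusive
        return above_low and below_high

    # Clauses are joined with "or": one matching clause is enough.
    return any(in_clause(clause) for clause in filter_doc["values"])


range_filter = {
    "type": "range",
    "field_name": "cardinality_of_foobar",
    "values": [{"from": "0", "to": "5"}, {"from": "10", "to": "*"}],
}
results = {n: in_range_filter(range_filter, n) for n in (3, 7, 10, 1000)}
```

Under these assumptions, 3, 10, and 1000 are accepted and 7 is rejected, because it falls in the gap between the two clauses.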

GFilterGeoBoundingBox

A bounding box filter for finding geo_shape and geo_point values which intersect with a specified bounding box.

Field Name | Type | Description
type | String | Must have the value "geo_bounding_box"
field_name | String | The field to which the filter refers. Any dots (".") must be escaped with a preceding backslash ("\") character.
top_left | Object | An object describing a coordinate pair. It must contain the keys lat and lon, each of which must have a numeric value.
bottom_right | Object | An object describing a coordinate pair. It must contain the keys lat and lon, each of which must have a numeric value.

Note

top_left is required to be northwest of bottom_right.

Examples
{
  "type": "geo_bounding_box",
  "field_name": "country.center",
  "top_left": {
    "lat": 49.1,
    "lon": -124.9
  },
  "bottom_right": {
    "lat": 24.9,
    "lon": -67.1
  }
}
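
The northwest requirement can be checked before a filter is sent. The helper below is a hypothetical sketch, not part of the API: it assumes "northwest" means a strictly greater latitude and a strictly smaller longitude, and it does not handle boxes that cross the antimeridian.

```python
def make_bounding_box_filter(field_name, top_left, bottom_right):
    """Build a geo_bounding_box filter after checking the corner ordering."""
    is_north = top_left["lat"] > bottom_right["lat"]
    is_west = top_left["lon"] < bottom_right["lon"]  # assumes no antimeridian crossing
    if not (is_north and is_west):
        raise ValueError("top_left must be northwest of bottom_right")
    return {
        "type": "geo_bounding_box",
        "field_name": field_name,
        "top_left": dict(top_left),
        "bottom_right": dict(bottom_right),
    }


# Same corners as the example above.
box = make_bounding_box_filter(
    "country.center",
    top_left={"lat": 49.1, "lon": -124.9},
    bottom_right={"lat": 24.9, "lon": -67.1},
)
```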

Response Schemas

GScrollResult

This is the document type for all results from scrolling queries.

Field Name | Type | Description
gmeta | Array | An array of GMetaResult documents, the main body of the result
count | Integer | The number of results returned; i.e. the size of the gmeta array. May be 0
total | Integer | The total number of matches for the search. May be 0 if no matches are found
has_next_page | Boolean | True if there’s another page of results available, False otherwise
marker | String | An opaque marker used to request the next page of results

Examples

{
  "count": 1,
  "gmeta": [
    {
      "@datatype": "GMetaResult",
      "@version": "2019-08-27",
      "entries": [
        {
          "content": {
            "cuisine": [
              "mexican"
            ],
            "handle": "salsa-verde",
            "ingredients": [
              {
                "amount": {
                  "number": 10
                },
                "default": "tomatillo",
                "preparation": "simmer 20 minutes",
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2
                },
                "default": "serrano pepper",
                "preparation": "seeded",
                "substitutes": [
                  "jalapeno",
                  "thai bird chili"
                ],
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "clove"
                },
                "default": "garlic",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 0.5
                },
                "default": "yellow onion",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tsp"
                },
                "default": "salt",
                "type": "spice"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tbsp"
                },
                "default": "coriander",
                "preparation": "ground",
                "substitutes": [
                  "cumin"
                ],
                "type": "spice"
              }
            ],
            "keywords": [
              "salsa",
              "tomatillo",
              "coriander",
              "serrano pepper"
            ],
            "origin": {
              "author": "Diana Kennedy",
              "title": "Regional Mexican Cooking",
              "type": "book"
            }
          },
          "entry_id": null
        }
      ],
      "subject": "https://en.wikipedia.org/wiki/Salsa_verde"
    }
  ],
  "total": 1,
  "has_next_page": true,
  "marker": "3d34900e3e4211ebb0a806b2af333354"
}
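
The has_next_page and marker fields together drive pagination. The loop below is a minimal sketch of that pattern: scroll_all and fake_post_page are hypothetical names, the stub stands in for a real POST to the scroll URL, and the sketch assumes each follow-up request repeats the original body with marker added.

```python
def scroll_all(post_page, body):
    """Yield every GMetaResult, following markers until has_next_page is false."""
    request_body = dict(body)
    while True:
        page = post_page(request_body)
        yield from page["gmeta"]
        if not page.get("has_next_page"):
            break
        # Pass the opaque marker back to request the next page.
        request_body["marker"] = page["marker"]


def fake_post_page(request_body):
    """Stand-in for the HTTP call: serves two canned pages keyed by marker."""
    pages = {
        None: {"gmeta": [{"subject": "a"}], "count": 1, "total": 2,
               "has_next_page": True, "marker": "m1"},
        "m1": {"gmeta": [{"subject": "b"}], "count": 1, "total": 2,
               "has_next_page": False, "marker": None},
    }
    return pages[request_body.get("marker")]


subjects = [result["subject"] for result in scroll_all(fake_post_page, {"q": "fox"})]
```

Taking the page-fetching function as a parameter keeps the loop independent of any particular HTTP client.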

GMetaResult

GMetaResult documents are the elements of the gmeta array in a search result.

A GMetaResult is a structure similar to a GMetaEntry from the Ingest API, with the following significant differences:

  • visibility information is not exposed; i.e. visible_to is not included

  • metadata for any subject may be an aggregate of multiple documents with different visibility rules or sources. Thus, the result is always returned as an array in which each element represents data provided by a different source or with different visibility

GMetaResult version 2019-08-27

Field Name | Type | Description
subject | String | the resource described by this metadata, often a URI
entries | Array | An array of objects containing the data pertaining to the subject. Each object has the fields content, entry_id, and matched_principal_sets. The content is an object with the entry data which was sent to Search, and the entry_id is its ID. If there are any assigned principal_sets for the entry which match the current caller, they will be returned as an array of strings in matched_principal_sets.

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}
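
Because each entry is self-contained, a result in this format can be walked with a single loop. The generator below is a small illustrative sketch; iter_entries is a hypothetical name, not part of any Globus library.

```python
def iter_entries(gmeta_result):
    """Yield (subject, entry_id, content) for each entry of a GMetaResult."""
    for entry in gmeta_result["entries"]:
        yield gmeta_result["subject"], entry["entry_id"], entry["content"]


# Same document as the example above.
gmeta_result = {
    "entries": [
        {"content": {"alpha": {"beta": "gamma"}},
         "matched_principal_sets": [], "entry_id": None},
        {"content": {"alpha": {"beta": "delta"}},
         "matched_principal_sets": [], "entry_id": "with_delta"},
    ],
    "subject": "http://example.com",
}
rows = list(iter_entries(gmeta_result))
```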

GMetaResult version 2017-09-01 (legacy format)

Field Name | Type | Description
subject | String | the resource described by this metadata, often a URI
content | Array | an array of objects containing the metadata pertaining to the subject
entry_ids | Array | an array of Entry IDs matching the content such that the entry ID at index i has content found in content[i]. See note below

Note

entry_ids and content are kept separate to maintain backwards compatibility. They can easily be unified with a zip operation in many programming languages.

{
  "content": [
    {
      "alpha": {
        "beta": "gamma"
      }
    }
  ],
  "entry_ids": [
    null
  ],
  "subject": "http://example.com"
}
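
The zip operation mentioned in the note can be sketched in Python as follows. The function name legacy_to_entries is hypothetical. The legacy format carries no matched_principal_sets, so the converted entries contain only content and entry_id.

```python
def legacy_to_entries(legacy):
    """Pair content[i] with entry_ids[i] to mimic the 2019-08-27 layout."""
    return {
        "subject": legacy["subject"],
        "entries": [
            {"content": content, "entry_id": entry_id}
            for content, entry_id in zip(legacy["content"], legacy["entry_ids"])
        ],
    }


legacy_result = {
    "content": [{"alpha": {"beta": "gamma"}}, {"alpha": {"beta": "delta"}}],
    "entry_ids": [None, "with_delta"],
    "subject": "http://example.com",
}
converted = legacy_to_entries(legacy_result)
```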

Note how, in the example below, the new format makes it easier to associate entry_id values with content blobs. Additionally, this new format is more extensible — if new fields are needed in the new format, they can be added as siblings of the content and entry_id fields.

Table 1. Format Comparison

New Format (2019-08-27): each entry is a complete, standalone subdocument

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}

Old Format (2017-09-01): entry_ids needs to be zipped with content to make sense of the structure

{
  "content": [
    {
      "alpha": {
        "beta": "gamma"
      }
    },
    {
      "alpha": {
        "beta": "delta"
      }
    }
  ],
  "entry_ids": [
    null,
    "with_delta"
  ],
  "subject": "http://example.com"
}

© 2010- The University of Chicago