Scroll Query

Method | POST
URL | /v1/index/<index_id>/scroll
Authentication required? | Only for non-public data
Required Roles | None
Request Body | a GScrollRequest document
Response Body | a GScrollResult document

Authentication & Authorization

Tokens for this call must have one of these scopes.

urn:globus:scopes:search.api.globus.org:all
urn:globus:scopes:search.api.globus.org:search

Examples

Scroll via curl

To run a scroll query, we send it via a POST to the API, e.g.

curl -XPOST \
    -H 'Content-Type: application/json' \
    'https://search.api.globus.org/v1/index/4de0e89e-a395-11e7-bc54-8c705ad34f60/scroll' \
    --data '
{
  "q": "a scroll request with filtering",
  "filters": [
    {
      "type": "range",
      "field_name": "path.to.date",
      "values": [
        {
          "from": "*",
          "to": "2014-11-07"
        }
      ]
    }
  ]
}'
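
The same request can be prepared from Python. The following is a minimal sketch using only the standard library, not an official client. The helper name build_scroll_http_request and its optional token argument are assumptions made for illustration; the Authorization header is only needed when querying non-public data. The sketch builds the request without sending it.

```python
import json
import urllib.request

SEARCH_BASE_URL = "https://search.api.globus.org"


def build_scroll_http_request(index_id, body, token=None):
    """Build (but do not send) a POST request for the scroll endpoint.

    `token` is a hypothetical optional access token carrying one of the
    scopes listed above; leave it as None for public data.
    """
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{SEARCH_BASE_URL}/v1/index/{index_id}/scroll",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Same index ID and body as the curl example.
request = build_scroll_http_request(
    "4de0e89e-a395-11e7-bc54-8c705ad34f60",
    {
        "q": "a scroll request with filtering",
        "filters": [
            {
                "type": "range",
                "field_name": "path.to.date",
                "values": [{"from": "*", "to": "2014-11-07"}],
            }
        ],
    },
)
```

Passing the result to urllib.request.urlopen and decoding the response with json.load would perform the call.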

Request Schemas

GScrollRequest

This is the main document type for encoding a scrolling query.

Field Name | Type | Description
q | String | User-supplied query, conforming to the query syntax. Required if there are no filters.
advanced | Boolean | Defaults to False. When true, interpret q with the advanced query syntax.
limit | Integer | Optional. Return at most limit results. Defaults to 10.
bypass_visible_to | Boolean | Defaults to False. Allowed for Index Admins only. When true, visible_to restrictions will be ignored for this search query.
filter_principal_sets | List of Strings | Optional. A list of principal_set names. The caller’s identity set will be matched against any principal_sets assigned to entry documents, and filtered to matches for any of these strings. If this parameter is provided, at least one match must be present.
filters | Array | Optional. An array of GFilter documents: the filters to apply to the search.
marker | String | Optional. An opaque token from a previous scroll result document, used to request the next page of results.

Examples

Simple Query

{
  "q": "the quick brown fox jumps"
}
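
A request body can also be assembled programmatically. The sketch below is one possible helper, not part of any Globus library: the function name build_scroll_request and its keyword arguments are assumptions that mirror a subset of the GScrollRequest fields. It enforces the one constraint stated in the table, that q is required when there are no filters, and leaves optional fields out so the server defaults apply.

```python
def build_scroll_request(q=None, filters=None, limit=None, marker=None, advanced=None):
    """Return a dict shaped like a GScrollRequest document.

    Hypothetical helper: only fields the caller sets are included.
    """
    if not q and not filters:
        raise ValueError("q is required when no filters are given")
    body = {}
    if q:
        body["q"] = q
    if filters:
        body["filters"] = list(filters)
    if limit is not None:
        body["limit"] = limit
    if marker is not None:
        body["marker"] = marker
    if advanced is not None:
        body["advanced"] = advanced
    return body


simple = build_scroll_request(q="the quick brown fox jumps")
```

Here simple has the same content as the Simple Query example.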

GFilter

Field Name | Type | Description
type | String | One of {"match_any", "match_all", "range"}
field_name | String | The field to which the filter refers. Unescaped dots separate levels of nesting (as in "path.to.date"); any dot (".") that is part of a key name must be escaped with a preceding backslash ("\") character.
values | Array of Strings or Objects | The values to evaluate against the field_name. If type is "match_any" or "match_all", this must be a list of Strings. If type is "range", this must be a list of Objects, each with the fields from and to.

"match_any" and "match_all" refer to the different possible behaviors of the filter values. As their names imply, if "match_any" is specified, the filter will match results for which any of filter values match, while "match_all" requires that all of the values match on every result.

Note

"match_any" and "match_all" are the same when there’s only one value as far as filtering is concerned, but they may have different impact on the way that facets are interpreted.

Note

values.from and values.to may be the special string "*" indicating that the range is unbounded on this end. An example is given below.
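
The difference between "match_any" and "match_all" can be illustrated locally. The sketch below is not how the service evaluates filters; it is a hypothetical function, filter_matches, that applies the behavior described above to a single document's list of field values.

```python
def filter_matches(filter_type, filter_values, field_values):
    """Illustrate match_any / match_all against one document's field values."""
    present = set(field_values)
    if filter_type == "match_any":
        # At least one filter value must appear in the document.
        return any(value in present for value in filter_values)
    if filter_type == "match_all":
        # Every filter value must appear in the document.
        return all(value in present for value in filter_values)
    raise ValueError(f"unsupported filter type: {filter_type!r}")


document_keywords = ["hpc", "uchicago"]
any_result = filter_matches("match_any", ["hpc", "internet2"], document_keywords)
all_result = filter_matches("match_all", ["hpc", "internet2"], document_keywords)
```

With these inputs any_result is True and all_result is False, because "internet2" is missing from the document. With a single filter value the two types give the same answer, as the first note says.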

Examples

Example 1

{
  "type": "match_any",
  "field_name": "https://schema\\.labs\\.datacite\\.org/meta/kernel-4\\.0/metadata\\.xsd#resourceTypeGeneral",
  "values": ["Globus Endpoint"]
}

Example 2

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "1970-01-01",
      "to": "2015-01-01"
    }
  ]
}

Example 3

{
  "type": "range",
  "field_name": "path.to.date",
  "values": [
    {
      "from": "*",
      "to": "2014-11-07"
    }
  ]
}

Example 4

{
  "type": "match_all",
  "field_name": "https://transfer\\.api\\.globus\\.org/endpoint#keywords",
  "values": ["hpc", "internet2", "uchicago"]
}
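
The filter documents above can be generated with a few small helpers. This is a sketch under stated assumptions: escape_field_name, match_filter, and range_filter are hypothetical names, not functions of a Globus library. Note that escape_field_name escapes every dot, so it is only appropriate for key names whose dots are literal, as in Examples 1 and 4.

```python
def escape_field_name(name):
    """Escape each dot in a key name with a preceding backslash."""
    return name.replace(".", "\\.")


def match_filter(field_name, values, require_all=False):
    """Build a match_any filter, or match_all when require_all is True."""
    return {
        "type": "match_all" if require_all else "match_any",
        "field_name": field_name,
        "values": list(values),
    }


def range_filter(field_name, start=None, end=None):
    """Build a range filter; a bound left as None becomes "*" (unbounded)."""
    return {
        "type": "range",
        "field_name": field_name,
        "values": [
            {
                "from": "*" if start is None else start,
                "to": "*" if end is None else end,
            }
        ],
    }


# Reproduces Example 3: everything up to 2014-11-07.
example_3 = range_filter("path.to.date", end="2014-11-07")

# Reproduces Example 4: the dots in the URL are part of the key name.
example_4 = match_filter(
    escape_field_name("https://transfer.api.globus.org/endpoint#keywords"),
    ["hpc", "internet2", "uchicago"],
    require_all=True,
)
```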

Response Schemas

GScrollResult

This is the document type for all results from scrolling queries.

Field Name | Type | Description
gmeta | Array | An array of GMetaResult documents, the main body of the result
count | Integer | The number of results returned; i.e. the size of the gmeta array. May be 0
total | Integer | The total number of matches for the search. May be 0 if no matches are found
has_next_page | Boolean | True if there’s another page of results available, False otherwise
marker | String | An opaque marker used to request the next page of results

Examples

Example 1

{
  "count": 1,
  "gmeta": [
    {
      "@datatype": "GMetaResult",
      "@version": "2019-08-27",
      "entries": [
        {
          "content": {
            "cuisine": [
              "mexican"
            ],
            "handle": "salsa-verde",
            "ingredients": [
              {
                "amount": {
                  "number": 10
                },
                "default": "tomatillo",
                "preparation": "simmer 20 minutes",
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2
                },
                "default": "serrano pepper",
                "preparation": "seeded",
                "substitutes": [
                  "jalapeno",
                  "thai bird chili"
                ],
                "type": "fruit"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "clove"
                },
                "default": "garlic",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 0.5
                },
                "default": "yellow onion",
                "type": "vegetable"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tsp"
                },
                "default": "salt",
                "type": "spice"
              },
              {
                "amount": {
                  "number": 2,
                  "unit": "tbsp"
                },
                "default": "coriander",
                "preparation": "ground",
                "substitutes": [
                  "cumin"
                ],
                "type": "spice"
              }
            ],
            "keywords": [
              "salsa",
              "tomatillo",
              "coriander",
              "serrano pepper"
            ],
            "origin": {
              "author": "Diana Kennedy",
              "title": "Regional Mexican Cooking",
              "type": "book"
            }
          },
          "entry_id": null
        }
      ],
      "subject": "https://en.wikipedia.org/wiki/Salsa_verde"
    }
  ],
  "total": 1,
  "has_next_page": true,
  "marker": "3d34900e3e4211ebb0a806b2af333354"
}
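
Paging through a scroll query might look like the following minimal sketch. It relies on the has_next_page and marker fields described above. The name scroll_all is hypothetical, and send stands for any caller-supplied function that posts a GScrollRequest dict to the scroll endpoint and returns the decoded GScrollResult. Resending the original query fields together with the marker on each follow-up request is an assumption of this sketch.

```python
def scroll_all(send, request):
    """Yield every GMetaResult across all pages of a scroll query.

    `send` is a caller-supplied callable: request dict -> result dict.
    """
    body = dict(request)
    while True:
        page = send(body)
        yield from page["gmeta"]
        if not page.get("has_next_page"):
            break
        # Assumption: repeat the original query and add the opaque marker.
        body = {**request, "marker": page["marker"]}


# Usage with canned pages standing in for real API responses.
canned_pages = [
    {"gmeta": [{"subject": "a"}], "count": 1, "total": 2,
     "has_next_page": True, "marker": "m1"},
    {"gmeta": [{"subject": "b"}], "count": 1, "total": 2,
     "has_next_page": False, "marker": None},
]
sent_bodies = []


def fake_send(body):
    sent_bodies.append(body)
    return canned_pages[len(sent_bodies) - 1]


subjects = [result["subject"] for result in scroll_all(fake_send, {"q": "x"})]
```

Because scroll_all is a generator, a caller can stop iterating early without requesting the remaining pages.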

GMetaResult

GMetaResult documents are the elements of the gmeta array in a search result.

A GMetaResult is a structure similar to a GMetaEntry from the Ingest API, with the following significant differences:

  • visibility information is not exposed; i.e. visible_to is not included

  • metadata for any subject may be an aggregate of multiple documents with different visibility rules or sources. Thus, the result is always returned as an array in which each element represents data provided by a different source or with different visibility

GMetaResult version 2019-08-27

Field Name | Type | Description
subject | String | The resource described by this metadata, often a URI
entries | Array | An array of objects containing the data pertaining to the subject. Each object has the fields content, entry_id, and matched_principal_sets. The content is an object with the entry data which was sent to Search, and the entry_id is its ID. If there are any assigned principal_sets for the entry which match the current caller, they will be returned as an array of strings in matched_principal_sets.

Example 1

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}
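
Consuming this structure usually means walking each result's entries array. The sketch below shows one way to do that; iter_entries is a hypothetical helper name, and the sample data is copied from the example above.

```python
def iter_entries(gmeta):
    """Yield (subject, entry_id, content) for every entry in a gmeta array."""
    for result in gmeta:
        for entry in result.get("entries", []):
            yield result["subject"], entry.get("entry_id"), entry["content"]


gmeta = [
    {
        "entries": [
            {
                "content": {"alpha": {"beta": "gamma"}},
                "matched_principal_sets": [],
                "entry_id": None,
            },
            {
                "content": {"alpha": {"beta": "delta"}},
                "matched_principal_sets": [],
                "entry_id": "with_delta",
            },
        ],
        "subject": "http://example.com",
    }
]

rows = list(iter_entries(gmeta))
```

Each row pairs the subject with one entry, so an entry whose entry_id is null (None in Python) is still kept distinct from the others.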

GMetaResult version 2017-09-01 (legacy format)

Field Name | Type | Description
subject | String | The resource described by this metadata, often a URI
content | Array | An array of objects containing the metadata pertaining to the subject
entry_ids | Array | An array of Entry IDs matching the content such that the entry ID at index i has content found in content[i]. See note below

Note

entry_ids and content are kept separate to maintain backwards compatibility. They can easily be unified with a zip operation in many programming languages.

Example 1

{
  "content": [
    {
      "alpha": {
        "beta": "gamma"
      }
    }
  ],
  "entry_ids": [
    null
  ],
  "subject": "http://example.com"
}
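
The zip operation mentioned in the note can be sketched as follows. The function name legacy_to_entries is hypothetical. It pairs content[i] with entry_ids[i] to produce the entries layout of the 2019-08-27 format; matched_principal_sets is left out because the legacy format does not carry it.

```python
def legacy_to_entries(legacy):
    """Convert a 2017-09-01 GMetaResult into the entries-based layout."""
    return {
        "subject": legacy["subject"],
        "entries": [
            {"content": content, "entry_id": entry_id}
            for content, entry_id in zip(legacy["content"], legacy["entry_ids"])
        ],
    }


legacy_result = {
    "content": [
        {"alpha": {"beta": "gamma"}},
        {"alpha": {"beta": "delta"}},
    ],
    "entry_ids": [None, "with_delta"],
    "subject": "http://example.com",
}

converted = legacy_to_entries(legacy_result)
```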

Side-by-side comparison with the new format

Note how, in the example below, the new format makes it easier to associate entry_id values with content blobs. The new format is also more extensible: if new fields are needed, they can be added as siblings of the content and entry_id fields.

Table 1. Format Comparison

New Format (2019-08-27)

{
  "entries": [
    {
      "content": {
        "alpha": {
          "beta": "gamma"
        }
      },
      "matched_principal_sets": [],
      "entry_id": null
    },
    {
      "content": {
        "alpha": {
          "beta": "delta"
        }
      },
      "matched_principal_sets": [],
      "entry_id": "with_delta"
    }
  ],
  "subject": "http://example.com"
}

each entry is a complete, standalone subdocument

Old Format (2017-09-01)

{
  "content": [
    {
      "alpha": {
        "beta": "gamma"
      }
    },
    {
      "alpha": {
        "beta": "delta"
      }
    }
  ],
  "entry_ids": [
    null,
    "with_delta"
  ],
  "subject": "http://example.com"
}

entry_ids needs to be zipped with content to make sense of the structure

© 2010- The University of Chicago Legal Accessibility