Geospatial Search
Globus Search provides features in support of geospatial searches.
Data being indexed may be described as geo_point or geo_shape data.
These allow users to ingest data in the form of specific coordinates, or to
describe shapes like GeoJSON Polygons.
When querying data, use the geo_bounding_box filter type to select data which lies within a specific geographic area. A bounding box is a rectangular map area described by its corner coordinates, and a document matches if its geospatial data intersects the box (e.g. if a point is inside the box).
You must have an index where you have write permissions. At least a passing familiarity with GeoJSON is also recommended.
API Methods
We will leverage these API methods:
-
Submit an Ingest Task
-
Perform a complex query
It is assumed that you have a client (e.g. globus_sdk.SearchClient in Python) which can be used to call these APIs.
Defining Geo Data in Ingest
Normally, Globus Search automatically deduces the types of fields based on the data seen for those fields. However, in the case of geo data, this is not feasible.
For this reason, the Ingest API supports additional data regarding geo_point and geo_shape fields. Use the field_mapping field of the GIngest document to specify that a field is a geo_point or geo_shape.
field_mapping is a JSON object whose keys are field names, and whose values are either geo_point or geo_shape.
Define a geo_point
The following GIngest
document defines a single point, the approximate
coordinates of the city of Chicago.
Chicago has a latitude of 41.9° N and a longitude of 87.6° W. Therefore, the document reads:
{
"ingest_type": "GMetaEntry",
"ingest_data": {
"subject": "https://example.com/cities/chicago",
"visible_to": [
"public"
],
"content": {
"city": {
"name": "Chicago",
"state_or_province": "Illinois",
"coordinates": "41.9, -87.6"
}
}
},
"field_mapping": {
"city.coordinates": "geo_point"
}
}
There are two important notes here:
-
The field name city.coordinates is dotted, to indicate the path to the field. If a field contains dots in one of the components in its path, they must be backslash-escaped.
-
The coordinates are given as latitude, longitude. This applies whenever a geo_point is specified as a string.
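As a rough illustration of the two notes above, the following Python sketch rebuilds the Chicago document. The helper names (escape_component, field_path, point_as_string, chicago_document) are hypothetical and not part of any Globus library, and the escaping rule shown (a single backslash before each literal dot) is one reading of "backslash-escaped" rather than a confirmed specification.

```python
import json


def escape_component(component: str) -> str:
    """Backslash-escape literal dots inside one path component (assumed rule)."""
    return component.replace(".", "\\.")


def field_path(*components: str) -> str:
    """Join path components with dots, escaping dots inside each component."""
    return ".".join(escape_component(c) for c in components)


def point_as_string(lat: float, lon: float) -> str:
    """Encode a point in the string form, which is ordered latitude, longitude."""
    return f"{lat}, {lon}"


def chicago_document() -> dict:
    """Rebuild the Chicago GIngest document from the text."""
    return {
        "ingest_type": "GMetaEntry",
        "ingest_data": {
            "subject": "https://example.com/cities/chicago",
            "visible_to": ["public"],
            "content": {
                "city": {
                    "name": "Chicago",
                    "state_or_province": "Illinois",
                    "coordinates": point_as_string(41.9, -87.6),
                }
            },
        },
        # The key is the dotted path to the field inside "content".
        "field_mapping": {field_path("city", "coordinates"): "geo_point"},
    }


if __name__ == "__main__":
    print(json.dumps(chicago_document(), indent=2))
    # A made-up field whose last component itself contains a dot.
    print(field_path("city", "coords.v2"))
```

Building the field_mapping key from components, rather than typing the dotted string by hand, keeps the escaping in one place.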
Alternate geo_point Formats
Globus Search supports several alternative formats for describing points. The documents below are the same as the one above, except that each uses a different supported encoding of Chicago’s coordinates.
All of these documents are equally valid and encode the same data.
In GeoJSON, points are specified as two-element arrays, of the form [longitude, latitude]. Note that this is the reverse of the order used in the string form. Therefore:
{
"ingest_type": "GMetaEntry",
"ingest_data": {
"subject": "https://example.com/cities/chicago",
"visible_to": [
"public"
],
"content": {
"city": {
"name": "Chicago",
"state_or_province": "Illinois",
"coordinates": [
-87.6,
41.9
]
}
}
},
"field_mapping": {
"city.coordinates": "geo_point"
}
}
Search also supports points specified as two-element objects, of the form {"lon": longitude, "lat": latitude}. Therefore:
{
"ingest_type": "GMetaEntry",
"ingest_data": {
"subject": "https://example.com/cities/chicago",
"visible_to": [
"public"
],
"content": {
"city": {
"name": "Chicago",
"state_or_province": "Illinois",
"coordinates": {
"lon": -87.6,
"lat": 41.9
}
}
}
},
"field_mapping": {
"city.coordinates": "geo_point"
}
}
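Because the string form and the array form use opposite coordinate orders, it can help to normalize values before ingesting them. The sketch below is a hypothetical client-side helper (to_lat_lon is not a Globus function) that converts each of the three encodings shown above to a (latitude, longitude) tuple.

```python
from typing import Tuple, Union

PointValue = Union[str, list, tuple, dict]


def to_lat_lon(value: PointValue) -> Tuple[float, float]:
    """Return (latitude, longitude) for any of the three encodings above."""
    if isinstance(value, str):
        # String form: "latitude, longitude"
        lat_text, lon_text = value.split(",")
        return float(lat_text), float(lon_text)
    if isinstance(value, dict):
        # Object form: {"lon": longitude, "lat": latitude}
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, (list, tuple)) and len(value) == 2:
        # GeoJSON array form: [longitude, latitude]
        lon, lat = value
        return float(lat), float(lon)
    raise ValueError(f"unrecognized geo_point encoding: {value!r}")


if __name__ == "__main__":
    for encoded in ("41.9, -87.6", [-87.6, 41.9], {"lon": -87.6, "lat": 41.9}):
        print(to_lat_lon(encoded))
```

All three inputs describe the same point, so each call returns the same tuple.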
Define a geo_shape
Globus Search supports a wide range of shapes, as specified in GeoJSON. The following shapes are supported:
-
Point
-
LineString
-
Polygon
-
MultiPoint
-
MultiLineString
-
MultiPolygon
-
GeometryCollection
For simplicity in this demonstration, we will define a rectangular polygon for the area of the continental United States, excluding Alaska. National borders are not rectangular, so this polygon also covers some areas outside of the United States. The box corresponds to a viewport in a map interface which displays most of the US.
The following values will be used:
-
West (longitude): -124.9
-
South (latitude): 24.9
-
East (longitude): -67.1
-
North (latitude): 49.1
This area, expressed as a GeoJSON polygon, is as follows:
{
"type": "Polygon",
"coordinates": [
[
[
-124.9,
24.9
],
[
-67.1,
24.9
],
[
-67.1,
49.1
],
[
-124.9,
49.1
],
[
-124.9,
24.9
]
]
]
}
In Globus Search, polygons must be closed: the last point of a coordinate ring must repeat the first. If the last point had been omitted from the coordinates above, the polygon would be treated as invalid.
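The rectangle above can also be generated from its four edge values, which makes it harder to forget the closing point. This is a minimal sketch; rectangle_polygon and rings_are_closed are hypothetical helper names, and the minimum of four positions per ring follows the GeoJSON definition of a linear ring.

```python
def rectangle_polygon(west: float, south: float, east: float, north: float) -> dict:
    """Build a closed, single-ring GeoJSON Polygon for a rectangular area."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # repeat the first point to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def rings_are_closed(polygon: dict) -> bool:
    """Check that every ring has at least four positions and ends where it starts."""
    return all(
        len(ring) >= 4 and ring[0] == ring[-1]
        for ring in polygon["coordinates"]
    )


if __name__ == "__main__":
    us_viewport = rectangle_polygon(west=-124.9, south=24.9, east=-67.1, north=49.1)
    print(us_viewport)
    print(rings_are_closed(us_viewport))
```

Running a check like rings_are_closed before ingest catches an open ring locally, instead of waiting for the document to be rejected.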
We could store this data in a Search index by ingesting the following GIngest
document:
{
"ingest_type": "GMetaEntry",
"ingest_data": {
"subject": "https://example.com/countries/united-states",
"visible_to": [
"public"
],
"content": {
"country": {
"name": "United States",
"country_code": "us",
"default_viewport": {
"type": "Polygon",
"coordinates": [
[
[
-124.9,
24.9
],
[
-67.1,
24.9
],
[
-67.1,
49.1
],
[
-124.9,
49.1
],
[
-124.9,
24.9
]
]
]
}
}
}
},
"field_mapping": {
"country.default_viewport": "geo_shape"
}
}
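Submitting a document like the one above from Python might look roughly as follows. This sketch assumes that client is an already-authenticated globus_sdk.SearchClient and relies on its ingest(index_id, data) method; the index ID is a placeholder, and check_field_mapping and submit_ingest are hypothetical helpers, not SDK functions.

```python
from typing import Any

ALLOWED_GEO_TYPES = {"geo_point", "geo_shape"}


def check_field_mapping(document: dict) -> None:
    """Reject field_mapping values other than geo_point or geo_shape."""
    for field, geo_type in document.get("field_mapping", {}).items():
        if geo_type not in ALLOWED_GEO_TYPES:
            raise ValueError(f"{field!r} has unsupported mapping {geo_type!r}")


def submit_ingest(client: Any, index_id: str, document: dict) -> Any:
    """Validate the mapping locally, then hand the GIngest document to the client.

    `client` is assumed to behave like globus_sdk.SearchClient, whose
    ingest(index_id, data) method submits an ingest task.
    """
    check_field_mapping(document)
    return client.ingest(index_id, document)


# Example (not executed here, since it needs credentials and a real index):
#   response = submit_ingest(client, "<your-index-id>", us_document)
```

The local check is only a convenience. The service remains the authority on whether a document is valid.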
Querying for Geo Data
Globus Search supports querying geo data by use of two filter types: geo_bounding_box and geo_shape.
geo_bounding_box filters
A bounding box is a rectangle defined by its top_left corner and its bottom_right corner. It is specified like so:
{
"type": "geo_bounding_box",
"field_name": "country.default_viewport",
"top_left": {
"lat": 1.0,
"lon": 100.0
},
"bottom_right": {
"lat": 0.0,
"lon": 101.0
}
}
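Note that the top edge must be north of the bottom edge, and the right edge must be east of the left edge. In other words, top_left carries the greater latitude and bottom_right carries the greater longitude.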
A geo_bounding_box is a type of GFilter document, which is documented as part of the POST Query documentation.
Given the coordinates above used for a "United States" view, the following query searches for cities within that area:
{
"q": "*",
"filters": [
{
"type": "geo_bounding_box",
"field_name": "city.coordinates",
"top_left": {
"lat": 49.1,
"lon": -124.9
},
"bottom_right": {
"lat": 24.9,
"lon": -67.1
}
}
]
}
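A query like the one above can be assembled with a small builder that enforces the edge-ordering rule before the request is sent. The helpers below are hypothetical, the latitude and longitude range checks are standard geographic limits rather than something stated by the API, and box_contains is only a local illustration of "a point is inside the box" (it treats edges as inclusive and ignores boxes crossing the antimeridian), not a description of how the service evaluates filters.

```python
def bounding_box_filter(field_name: str, north: float, west: float,
                        south: float, east: float) -> dict:
    """Build a geo_bounding_box filter, checking ranges and edge order first."""
    for lat in (north, south):
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
    for lon in (west, east):
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"longitude out of range: {lon}")
    if north <= south:
        raise ValueError("top edge must be north of the bottom edge")
    if east <= west:
        raise ValueError("right edge must be east of the left edge")
    return {
        "type": "geo_bounding_box",
        "field_name": field_name,
        "top_left": {"lat": north, "lon": west},
        "bottom_right": {"lat": south, "lon": east},
    }


def search_document(filters: list, q: str = "*") -> dict:
    """Wrap filters in a query body of the shape used in the text."""
    return {"q": q, "filters": list(filters)}


def box_contains(box_filter: dict, lat: float, lon: float) -> bool:
    """Local illustration only: is the point inside the box (edges inclusive)?"""
    top_left, bottom_right = box_filter["top_left"], box_filter["bottom_right"]
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


if __name__ == "__main__":
    us_box = bounding_box_filter("city.coordinates",
                                 north=49.1, west=-124.9, south=24.9, east=-67.1)
    print(search_document([us_box]))
    print(box_contains(us_box, 41.9, -87.6))  # Chicago's coordinates from the text
```

With the Python SDK, the resulting dictionary could then be passed to the client's post_search(index_id, data) method, assuming an authenticated globus_sdk.SearchClient.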
geo_shape filters
In a geo_shape filter, the "shape" is a GeoJSON Polygon. Shape filters match points and shapes which intersect with, or are fully contained within, that polygon. The relation field of the filter selects which of these tests is applied.
For example, here’s a query which finds cities whose boundaries are fully contained within a viewport over the United States (using the same bounding box coordinates from before):
{
"q": "*",
"filters": [
{
"type": "geo_shape",
"field_name": "city.boundary",
"shape": {
"type": "Polygon",
"coordinates": [
[
[
-124.9,
24.9
],
[
-67.1,
24.9
],
[
-67.1,
49.1
],
[
-124.9,
49.1
],
[
-124.9,
24.9
]
]
]
},
"relation": "within"
}
]
}
Like geo_bounding_box, geo_shape is a filter type which is documented as part of the POST Query documentation.
The shape given in a geo_shape filter is restricted to two-dimensional Polygons containing only one coordinate ring.
Please contact support@globus.org to inquire about support for other GeoJSON geometries.
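The restriction described above can be checked on the client before a query is sent. The following is a minimal sketch, with shape_filter as a hypothetical helper. The only relation value taken from this guide is "within", so other values are passed through unchecked and should be confirmed against the POST Query documentation.

```python
def shape_filter(field_name: str, polygon: dict, relation: str = "within") -> dict:
    """Build a geo_shape filter after checking the polygon restrictions above."""
    if polygon.get("type") != "Polygon":
        raise ValueError("shape must be a GeoJSON Polygon")
    rings = polygon.get("coordinates", [])
    if len(rings) != 1:
        raise ValueError("shape must contain exactly one coordinate ring")
    ring = rings[0]
    if any(len(position) != 2 for position in ring):
        raise ValueError("shape must be two-dimensional: [longitude, latitude]")
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("ring must be closed (last point repeats the first)")
    return {
        "type": "geo_shape",
        "field_name": field_name,
        "shape": polygon,
        "relation": relation,
    }


if __name__ == "__main__":
    us_viewport = {
        "type": "Polygon",
        "coordinates": [[
            [-124.9, 24.9], [-67.1, 24.9], [-67.1, 49.1],
            [-124.9, 49.1], [-124.9, 24.9],
        ]],
    }
    print({"q": "*", "filters": [shape_filter("city.boundary", us_viewport)]})
```

A polygon with a hole (a second ring) or with altitude values (three-element positions) is rejected locally by this helper.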