1. Overview

This document describes operations that can be performed on a remote endpoint’s filesystem. For manipulating the endpoint definition itself, see Endpoint Management.

All operations in this document require that the endpoint is "activated" with user credentials, so the transfer service can perform operations on their behalf. See Endpoint Activation for details.

The operations described in this document are short foreground operations, that don’t return data until completed or an error is encountered. The API resources for these operations have prefix /operation/. This is used to indicate that they involve communication with the external server described by the endpoint, and could raise errors related to network communication or authentication failures.

Long running operations, including delete and transfer, are documented elsewhere, and result in the creation of a task to track progress. See Task Submission for details.

2. Path Encoding

For maximum compatibility with different filesystems, it’s recommended to use only ASCII characters in file and directory names. If using other characters is absolutely necessary, all systems involved should be configured to use UTF-8 encoding when possible. See platform notes below.

The API uses JSON as its data format, and all strings in JSON are Unicode. Since Linux filesystems and the GridFTP protocol use raw bytes for path names, the Transfer API service must decode the bytes in order to display them as characters. It does this assuming UTF-8 encoding, which is the most common encoding on Linux systems.

Note:Globus will handle non UTF-8 filesystems in a limited fashion. If the bytes received from the GridFTP server can’t be decoded as UTF-8, they will be decoded as latin-1 and prefixed with unicode character "\uFFFD", which renders as a question mark. These paths will not display to the user as intended but can be manipulated by passing to other API calls. The rendering used is round trip safe, in that it will be converted back to bytes in a reliable way if passed back to the API for an operation like mkdir or rename. Problems are still likely to arise though, from transferring data to other systems that do use UTF-8 encoding or a different non UTF-8 encoding, or from accepting user input which will be encoded as UTF-8. This also means that if the user provides a path component beginning with the unicode character "\uFFFD", then it will be misinterpreted by the system. This character was chosen because it’s very unlikely to be used in actual filenames, and has an appearance that helps indicate that the path may not be displayed correctly.

2.1. Invalid Path Names

Globus does not allow the string "\r\n" in any file or directory names, and passing such a path will result in an error from the API.

Depending on the filesystem of the endpoint, other characters or strings may be disallowed. For example, Windows filesystem do not allow several common punctuation characters, including <, >, and *. Globus attempts to classify such errors with code InvalidPath, but there may be combinations of GridFTP server and filesystem that result in a generic EndpointError code.

2.2. Linux and Unix

In Linux and Unix filesystems, file names are stored as raw bytes. The common case is that the bytes will be UTF-8 encoded unicode, but it depends on user space configuration, which can be set system wide and overridden by individual users or even by individual applications or login shells. On most modern Linux systems, UTF-8 will be used everywhere unless a user goes out of their way to use something else. Transferring data between two such Linux systems using UTF-8 encoding is the best case scenario - no path name corruption will occur.

2.3. Windows

Windows systems use UTF-16 encoding. It is recommended that windows users limit themselves to ASCII characters for file names. Non-ascii special characters, accented characters, and non-english characters could end up encoded in the wrong way, resulting in name corruption. We plan on fixing this in a future update to Globus Connect Personal for Windows, by having the GridFTP server convert everything to/from UTF-8 for communication with Globus. Please contact support@globus.org if you have concerns.

2.4. Mac OS X

Mac OS X uses UTF-8 by default, but it also does NFD normalization. This can cause path name corruption when copying files with non-ASCII names between Mac OSX and Linux, if it contains any multi-byte characters that are altered by the normalization.

3. Document Types

3.1. Result

The "result" family document types, which includes resource specific result types like "mkdir_result", represents the result to a foreground operation. If the operation fails, an error result we be returned, but some operations have multiple success cases.

Result Document Example
{
  "DATA_TYPE": "mkdir_result",
  "code": "DirectoryCreated",
  "message": "The directory was created successfully",
  "resource": "/operation/endpoint/user#ep1/mkdir",
  "request_id": "ABCdef789"
}

3.1.1. Result Fields

Field Name JSON Type Description

DATA_TYPE

string

Has value "result" or "(subtype)_result" to indicate a result family document type. Some result subtypes have additional fields.

code

string

Code indicating how the operation succeeded. Depends on the specific operation.

message

string

Message describing how the operation succeeded in more detail.

resource

string

Path relative to the API version root of the request.

request_id

string

Id of the request, which can be used by Globus admins to look up the request in the server logs. Useful when submitting support requests or posting to the mailing list.

4. Path Arguments

Name Type Description

endpoint_xid

string

The id field of the endpoint, or for backward compatibility the canonical_name of the endpoint. The latter is deprecated, and all clients should be updated to use id.

5. Common Query Parameters

Name Type

Description

fields

string

Comma separated list of fields to include in the response. This can be used to save bandwidth on large list responses when not all fields are needed.

6. Common Errors

Code HTTP Status Description

ServiceUnavailable

503

If the service is down for maintenance.

OperationPaused

409

If the administrator of the endpoint has set a pause rule for the operation. The error response will include a pause_message string field that contains a message from the administrator about why the pause rule was set.

7. Operations

7.1. List Directory Contents

List the contents of the directory at the specified path on an endpoint’s filesystem. The endpoint must be activated before performing this operation.

The path is specified in the path query parameter. If the parameter is not passed, the default path depends on the type of endpoint:

  • For shared endpoints, S3 endpoints, and anonymous FTP endpoints, the default is /.

  • For GridFTP endpoints, the default is /~/. Most of the time this will map to the user’s home directory. However the administrator of the GridFTP server can configure it to point elsewhere. Also as a special case, if the restricted paths configuration on the server does not allow the user’s home directory, it will fall back to /.

Note:If a directory contains over 100,000 entries, a "DirectorySizeLimit" error will be returned. There is currently no way around this limit for directory listings, but these very large directories can still be transferred recursively.

Results can be paged, sorted, and filtered. By default all entries up to the 100,000 entry limit are returned, sorted by (type, name).

URL

/operation/endpoint/<endpoint_xid>/ls [?path=/path/to/dir/]

Method

GET

Response Body

{
    "DATA_TYPE": "file_list",
    "path": "/~/path/to/dir",
    "endpoint": "auser#epname",
    "rename_supported": true,
    "DATA": [
        {
            "DATA_TYPE": "file",

            "name": "somefile",
            "type": "file",
            "link_target": null,

            "user": "auser",
            "group": "agroup",
            "permissions": "0644",

            "last_modified": "2000-01-02 03:45:06+00:00",
            "size": 1024
        }
    ]
}

7.1.1. Dir Listing Query Parameters

Name Type Description

path

string

Path to a directory on the remote endpoint to list.

show_hidden

boolean

"1" for true, "0" for false. If true, show hidden files (files with a name that begins with a dot). Default true.

limit

int

Change the page size. Defaults to 100,000, which is also the maximum. Note that the entire directory is is still fetched from the endpoint server on every request. This is because the GridFTP protocol used by most endpoints does not support paging, so paging must be implemented in the REST API server.

offset

int

If using a limit less than 100,000, this can be used to page through the results.

orderby

string

A comma separated list of order by options. Each order by option is either a field name, or a field name followed by space and ASC or DESC for ascending and descending; ascending is the default. For the directory listing results, any "file" document field can be used in the orderby. Default orderby=type,name.

filter

string

Return only file documents that match all of the specified filters. The param value must be a forward slash separated list of filter clauses. For string fields, the clause is of the form FIELD_NAME:value1,value2,..., or FIELD_NAME:~pattern1,pattern1,.... For example, filter=name:~.*/type:dir will return hidden directories. For size, a filter clause can be one of size:>MIN_SIZE, size:<MAX_SIZE, or size:EXACT_SIZE. For last_modified the clause is a date range, with dates specified in ISO 9660 format: last_modified:start,end. Either start or end can be omitted to specify an open range.

7.1.2. Dir Listing Response

The "file" document represents a single file or directory. The response is an encapsulated list of "file" documents, with some additional fields providing directory level data.

The "rename_supported" field indicates if the endpoint software supports rename operations. This does not necessarily mean the current user has authorization to rename a file.

File Document Example
{
    "DATA_TYPE": "file",
    "name": "somefile",
    "type": "file",

    "user": "auser",
    "group": "agroup",
    "permissions": "0644",

    "last_modified": "2000-01-02 03:45:06+00:00",
    "link_target": null,
    "size": 1024
}
File List Document Example
{
    "DATA_TYPE": "file_list"

    "path": "/~/path/to/dir",
    "endpoint": "go#ep1",

    "DATA": [
        {
            "DATA_TYPE": "file",
            ...
        },
        ...
    ],
}

7.1.3. File Fields

Field Name JSON Type Description

DATA_TYPE

string

Always has value "file" to indicate this document type.

name

string

The name of the file or directory.

type

string

The type of the listing; possible values include dir and file, and for unix special files chr, blk, pipe, and other.

link_target

string

For symlinks, the absolute path of the target, otherwise null.

permissions

string

The unix permissions, as an octal mode string.

size

int

The file size in bytes.

user

string

The user owning the file or directory, if applicable on the endpoint’s filesystem.

group

string

The group owning the file or directory, if applicable.

last_modified

string

The date and time the file or directory was last modified, in modified ISO 9660 format: YYYY-MM-DD HH:MM:SS+00:00, i.e. using space instead of "T" to separate date and time. Always in UTC, indicated explicitly with a trailing "+00:00" timezone.

7.1.4. File List Fields

Field Name JSON Type Description

DATA_TYPE

string

Always has value "file_list" to indicate this document type.

endpoint

string

Canonical name of endpoint the listing is for.

path

string

Path on endpoint the listing is for. Particularly useful when listing with the default path, to see what the default was actually mapped to which can depend on the endpoint type and the configuration of the server.

DATA

list

List of "file" documents.

7.1.5. Errors

Code HTTP Status Description

ClientError.NotFound

404

<endpoint_xid> not found.

EndpointError

502

Catch all for errors returned by endpoint server that don’t have specific types.

ClientError.ActivationRequired

400

The endpoint in the request is not activated or the activation has expired. Activate the endpoint and retry the operation.

Conflict

409

The endpoint in the request is a shared endpoint and its host endpoint has been deleted.

7.2. Make Directory

Create a directory at the specified path on an endpoint filesystem. The endpoint must be activated before performing this operation.

URL

/operation/endpoint/<endpoint_xid>/mkdir

Method

POST

Request Body

{
  "DATA_TYPE": "mkdir"
  "path": "/~/newdir",
}

Response Body

{
  "DATA_TYPE": "mkdir_result",
  "code": "DirectoryCreated",
  "message": "The directory was created successfully",
  "request_id": "ShbIUzrWT",
  "resource": "/operation/endpoint/go%23ep1/mkdir"
}

7.2.1. Result Codes

The "code" field of the result document will be one of the following:

Code HTTP Status Description

DirectoryCreated

202

Directory created successfully.

7.2.2. Mkdir Request Fields

Field Name JSON Type Description

DATA_TYPE

string

Always has value "mkdir" to indicate this document type.

path

string

Absolute path on remote endpoint.

7.2.3. Errors

The mkdir operation can return any error returned by directory listing, as well as the following errors.

Code HTTP Status Description

ExternalError.MkdirFailed.Exists

502

The path already exists.

ExternalError.MkdirFailed.PermissionDenied

403

The user does not have permission to read or write one of the specified file or directories.

7.3. Rename

Rename or move a file or directory on an endpoint filesystem. The endpoint must be activated before performing this operation. When moving to a different parent directory, the parent directory of the new path must already exist.

Note:Most servers will require that the new path is on the same filesystem as the old path, so this is not a general purpose move operation.

URL

/operation/endpoint/<endpoint_xid>/rename

Method

POST

Request Body

{
  "DATA_TYPE": "rename",
  "old_path": "/~/typo_name.txt",
  "new_path": "/~/fixed_name.txt"
}

Response Body

{
  "DATA_TYPE": "result",
  "code": "FileRenamed",
  "message": "File or directory renamed successfully"
  "request_id": "ShbIUzrWT",
  "resource": "/operation/endpoint/go%23ep1/mkdir"
}

7.3.1. Result Codes

The "code" field of the result document will be one of the following:

Code HTTP Status Description

FileRenamed

200

File or directory renamed successfully.

7.3.2. Rename Request Fields

JSON strings are Unicode, but will be encoded as UTF-8 to interact with byte oriented filesystems. See the Path Encoding section for details. Directories paths must end in /.

Field Name JSON Type Description

DATA_TYPE

string

Always has value "rename" to indicate this document type.

old_path

string

Current absolute path of a file or directory on the endpoint.

new_path

string

New absolute path to rename/move the file or directory to.

7.3.3. Errors

Note:New error codes may be added in the future. Clients should have a generic handler which displays the message field to the user.
Code HTTP Status Description

NotSupported

409

<endpoint_xid> does not support the rename operation. Currently S3 endpoints do not support rename.

EndpointNotFound

404

<endpoint_xid> doesn’t exist or is not visible to the current user.

NoCredException

409

<endpoint_xid> is not activated.

NoPhysicalsException

409

<endpoint_xid> has no active servers. Note: physical endpoint or physical for short is an alternate name for server used by the CLI.

GCDisconnectedException

409

<endpoint_xid> is a Globus Connect Personal endpoint and is not currently connected.

GCPausedException

409

<endpoint_xid> is a Globus Connect Personal endpoint and is paused.

EndpointPermissionDenied

403

The user does not have permission to read or write one of the specified paths.

NotFound

404

old_path doesn’t exist. Note: if the parent directory of new_path does not exist, then EndpointError is returned instead.

InvalidPath

400

One of the specified paths contains characters that are not supported by the remote filesystem or is otherwise not valid.

Exists

409

new_path already exists

EndpointError

502

Catch all for other errors received from the server. Examples include connection failure, authentication failure, and filesystem failures like new_path being on a different filesystem from old_path or the parent directory of new_path not existing. The message field of the error document will contain the actual message returned by the server, and should be displayed to the user for further interpretation. It may include complex details not understood by some users, but it can be used in support requests with Globus and endpoint administrators.


© 2010- The University of Chicago Legal