File Operations

Table of Contents
  • 1. Overview
  • 2. Path Encoding
    • 2.1. Invalid Path Names
    • 2.2. Linux and Unix
    • 2.3. Windows
    • 2.4. Mac OS X
  • 3. Document Types
    • 3.1. Result
    • 3.2. file_list Document
    • 3.3. File Document
  • 4. Path Arguments
  • 5. Common Query Parameters
  • 6. Common Errors
  • 7. Operations
    • 7.1. List Directory Contents
      • 7.1.1. Directory Listing Query Parameters
      • 7.1.2. Directory Listing Filtering
      • 7.1.3. Directory Listing Response
      • 7.1.4. Errors
    • 7.2. Get File or Directory Status
      • 7.2.1. Stat Query Parameters
      • 7.2.2. Stat Response
      • 7.2.3. Errors
    • 7.3. Make Directory
      • 7.3.1. Mkdir Request Fields
      • 7.3.2. Result Codes
      • 7.3.3. Errors
    • 7.4. Rename
      • 7.4.1. Rename Request Fields
      • 7.4.2. Result Codes
      • 7.4.3. Errors

1. Overview

This document describes synchronous operations that can be performed on a collection.

The operations described in this document are short foreground operations that don’t return data until they complete or encounter an error. The API resources for these operations have the prefix '/operation/'. This prefix indicates that they involve communication with the collection and could raise errors related to network communication or authentication failures.

Long running operations, including delete and transfer, are documented elsewhere, and result in the creation of a task to track progress. See Task Submission for details.

2. Path Encoding

For maximum compatibility with different filesystems, it’s recommended to use only ASCII characters in file and directory names. If using other characters is absolutely necessary, all systems involved should be configured to use UTF-8 encoding when possible. See platform notes below.

The API uses JSON as its data format, and all strings in JSON are Unicode. Since Linux filesystems and the GridFTP protocol use raw bytes for path names, the Transfer API service must decode the bytes in order to display them as characters. It does this assuming UTF-8 encoding, which is the most common encoding on Linux systems.

Note

Globus handles non-UTF-8 filesystems in a limited fashion. If the bytes received from the GridFTP server can’t be decoded as UTF-8, they are decoded as latin-1 and prefixed with the Unicode character "\uFFFD", which renders as a question mark. These paths will not display to the user as intended, but they can be manipulated by passing them to other API calls. The rendering is round-trip safe: it is converted back to the original bytes reliably if passed back to the API for an operation like mkdir or rename. Problems are still likely to arise, though, when transferring data to other systems that use UTF-8 or a different non-UTF-8 encoding, or when accepting user input, which will be encoded as UTF-8. This also means that if the user provides a path component beginning with the Unicode character "\uFFFD", it will be misinterpreted by the system. This character was chosen because it’s very unlikely to be used in actual filenames, and its appearance helps indicate that the path may not be displayed correctly.
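The fallback described in this note can be illustrated with a minimal Python sketch. It is not the service's implementation: the function names are hypothetical, and applying the marker once per name (rather than per full path) is an assumption made for illustration.

```python
MARKER = "\ufffd"  # Unicode replacement character used as the prefix

def decode_name(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, falling back to marked latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to one code point, so nothing is lost
        return MARKER + raw.decode("latin-1")

def encode_name(name: str) -> bytes:
    """Invert decode_name so the original bytes are recovered."""
    if name.startswith(MARKER):
        return name[len(MARKER):].encode("latin-1")
    return name.encode("utf-8")
```

Because latin-1 assigns a character to every byte value, the marked form can always be turned back into the original bytes, which is what makes the rendering round-trip safe. The sketch also shows the caveat from the note: a genuine UTF-8 name that starts with "\uFFFD" would be wrongly treated as a latin-1 fallback.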

2.1. Invalid Path Names

Globus does not allow the string "\r\n" in any file or directory names, and passing such a path will result in an error from the API.

Depending on the collection’s underlying filesystem, other characters or strings may be disallowed. For example, Windows filesystems do not allow several common punctuation characters, including '<', '>', and '*'. Globus attempts to classify such errors with code InvalidPath, but there may be combinations of GridFTP server and filesystem that result in a generic EndpointError[1] code.

2.2. Linux and Unix

In Linux and Unix filesystems, file names are stored as raw bytes. The common case is that the bytes will be UTF-8 encoded Unicode, but it depends on user space configuration, which can be set system wide and overridden by individual users or even by individual applications or login shells. On most modern Linux systems, UTF-8 will be used everywhere unless a user goes out of their way to use something else. Transferring data between two such Linux systems using UTF-8 encoding is the best-case scenario: no path name corruption will occur.

2.3. Windows

Windows systems use UTF-16 encoding but do not enforce any particular normalization. It is recommended that Windows users limit themselves to ASCII characters for file names. Non-ASCII special characters, accented characters, and non-English characters could be incorrectly encoded, resulting in file name corruption. We plan on fixing this in a future update to Globus Connect Personal for Windows, by having the GridFTP server convert everything to/from UTF-8 for communication with Globus. Please contact support@globus.org if you have concerns.

2.4. Mac OS X

Mac OS X uses UTF-8 by default, but HFS+ also forces NFD normalization. This can cause path name corruption when copying files with non-ASCII names from Linux or Windows systems.

The newer file system, APFS, does not force NFD normalization, which fixes the most common cause of name mangling for a single file. However, there is still a potential issue: two file names that differ only in normalization are allowed on Linux and Windows, but will alias a single file on Mac APFS because it is normalization-insensitive (an issue similar to case-insensitivity).
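To make the normalization issue concrete, the following Python sketch uses the standard unicodedata module to show that the same visible name can have two different byte representations. The helper name is hypothetical, and the comparison it performs is only an approximation of how a normalization-insensitive filesystem treats names.

```python
import unicodedata

composed = "caf\u00e9"  # "é" stored as a single code point (NFC form)
decomposed = unicodedata.normalize("NFD", composed)  # "e" + combining accent

def same_name_ignoring_normalization(a: str, b: str) -> bool:
    """Compare two names after bringing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The two strings look identical when rendered but differ as bytes,
# so a byte-oriented filesystem treats them as two distinct names.
print(composed.encode("utf-8"))
print(decomposed.encode("utf-8"))
```

A client that copies files between platforms could use a comparison like this to detect names that would collide on a normalization-insensitive destination.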

3. Document Types

3.1. Result

The "result" family of document types, which includes resource-specific result types like "mkdir_result", represents the result of a foreground operation. If the operation fails, an error result will be returned. Some operations have multiple success cases.

Fields

Field Name | JSON Type | Description
DATA_TYPE | string | Has value "result" or "(subtype)_result" to indicate a result family document type. Some result subtypes have additional fields.
code | string | Code indicating how the operation succeeded. Depends on the specific operation.
message | string | Message describing how the operation succeeded in more detail.
resource | string | Path relative to the API version root of the request.
request_id | string | ID of the request, which can be used by Globus admins to look up the request in the server logs. Useful when submitting support requests or posting to the mailing list.

Example
{
  "DATA_TYPE": "mkdir_result",
  "code": "DirectoryCreated",
  "message": "The directory was created successfully",
  "request_id": "ABCdef789",
  "resource": "/operation/endpoint/6c54cade-bde5-45c1-bdea-f4bd71dba2cc/mkdir"
}
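A client can recognize any member of the result family by its DATA_TYPE value. The following is a minimal Python sketch under that reading of the field table; parse_result is a hypothetical helper name, not part of any Globus library.

```python
import json

def parse_result(body: str) -> dict:
    """Parse a response body and check it is a result-family document."""
    doc = json.loads(body)
    data_type = doc.get("DATA_TYPE", "")
    # Accept the plain "result" type and any "(subtype)_result" type
    if data_type != "result" and not data_type.endswith("_result"):
        raise ValueError(f"not a result document: {data_type!r}")
    return doc
```

Keeping the request_id from the parsed document is useful, since it is the value support staff need to find the request in the server logs.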

3.2. file_list Document

Fields

Field Name | JSON Type | Description
DATA_TYPE | string | Always has value "file_list" to indicate this document type.
endpoint[1] | string | The collection ID that was requested.
path | string | The path that was listed; may start with /~/ when listing the default home directory.
absolute_path | string | The path that was listed; /~/ is expanded to the actual directory like /home/alice/ when listing the default home directory. This field will not include the host path for guest collections; it is always a virtual root based path. This field is not a "physical path", meaning it does not resolve symlinks like "pwd -P".
rename_supported | bool | Indicates if the collection supports rename operations. This does not necessarily mean the current user has authorization to rename a file.
symlink_supported | bool | Indicates if the collection supports creating symbolic links.
DATA | list | List of "file" documents.

Example
{
    "DATA_TYPE": "file_list",
    "path": "/~/path/to/dir",
    "endpoint": "5d3c6c59-5244-11e5-84dd-22000bb3f45d",
    "rename_supported": true,
    "symlink_supported": true,
    "DATA": [
        {
            "DATA_TYPE": "file",
            ...
        },
        ...
    ]
}

3.3. File Document

Fields

Field Name | JSON Type | Description
DATA_TYPE | string | Always has value "file" to indicate this document type.
name | string | The name of this entry in the filesystem.
type | string | The type of the entry: "dir", "file", or "invalid_symlink". For Unix special files: "chr", "blk", "pipe", or "other". If this entry is a valid symlink, the type will describe the target ("file", "dir", etc.), and the permissions, size, user, group, and last_modified attributes will describe the target of the symlink. If this entry is an invalid symlink, the type will be "invalid_symlink", and the permissions, size, user, group, and last_modified attributes will describe the symlink itself.
link_target | string | If this entry is a symlink (valid or invalid), this is the path of its target, which may be an absolute or relative path. If this entry is not a symlink, this field is null.
permissions | string | The Unix permissions, as an octal mode string.
size | int | The file size in bytes.
user | string | The user owning the file or directory, if applicable on the collection’s filesystem.
group | string | The group owning the file or directory, if applicable.
last_modified | string | The date and time the file or directory was last modified, in modified ISO 8601 format: YYYY-MM-DD HH:MM:SS+00:00, i.e. using space instead of "T" to separate date and time. Always in UTC, indicated explicitly with a trailing "+00:00" timezone.
link_size, link_user, link_group, link_last_modified | various | If this entry is a symlink (valid or invalid), these fields show attributes of the symlink itself, not its target. Same format as the size, user, group, and last_modified fields, respectively. These fields will be null for older GridFTP versions. If this entry is not a symlink, these fields are null.

Example
{
  "DATA_TYPE": "file",
  "name": "somefile",
  "type": "file",
  "user": "auser",
  "group": "agroup",
  "permissions": "0644",
  "last_modified": "2000-01-02 03:45:06+00:00",
  "link_target": null,
  "size": 1024
}
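Two fields of the "file" document are strings that usually need conversion before use: last_modified and permissions. The sketch below shows one way to convert them in Python with the standard library; the helper names are hypothetical.

```python
from datetime import datetime, timezone

def parse_last_modified(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS+00:00' into a timezone-aware datetime."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S%z")

def permission_bits(value: str) -> int:
    """Convert an octal mode string such as '0644' into an integer."""
    return int(value, 8)

when = parse_last_modified("2000-01-02 03:45:06+00:00")
mode = permission_bits("0644")
owner_can_write = bool(mode & 0o200)  # test the owner write bit
```

The explicit format string mirrors the space-separated variant of ISO 8601 described in the field table, so a timestamp in any other layout raises ValueError instead of being silently misread.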

4. Path Arguments

Name | Type | Description
collection_id | string | The id of the collection.

5. Common Query Parameters

Name | Type | Description
fields | string | Comma separated list of fields to include in the response. This can be used to save bandwidth on large list responses when not all fields are needed.

6. Common Errors

The error code can be found in the HTTP response body JSON document. See the error overview.

Code | HTTP Status | Description
ServiceUnavailable | 503 | The service is down for maintenance.
OperationPaused | 409 | An administrator of the endpoint or collection has set a pause rule for the operation. The error response will include a 'pause_message' string field that contains a message from the administrator about why the pause rule was set.
ConsentRequired | 403 | The collection requires consent to a data_access scope missing from the user’s current consents. See Data Access Consent for details.
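The common errors above can be handled in one place before any operation-specific handling. The following Python sketch assumes the error document exposes "code" and "message" fields as described in this section; the function name and the wording of the returned strings are illustrative only.

```python
def describe_common_error(status: int, doc: dict) -> str:
    """Turn a common error document into a short user-facing string."""
    code = doc.get("code", "")
    if code == "OperationPaused":
        # pause_message carries the administrator's explanation
        return "Paused by administrator: " + doc.get("pause_message", "(no message)")
    if code == "ConsentRequired":
        return "The collection requires consent to a data_access scope."
    if code == "ServiceUnavailable":
        return "The service is down for maintenance."
    # Anything else: show whatever the service reported
    return f"{code or status}: {doc.get('message', '')}"
```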

7. Operations

7.1. List Directory Contents

List the contents of the directory at the specified path on a collection.

The path is specified in the path query parameter. If the parameter is not passed, the default path depends on the type of collection:

  • For guest collections the default is '/'.

  • For mapped collections, the default is '/~/'. Most of the time this will be the mapped user’s home directory.

Note

If a directory contains over 100,000 entries, a "DirectorySizeLimit" error will be returned. There is currently no way around this limit for directory listings, but these very large directories can still be transferred recursively.

Results can be paged, sorted, and filtered. By default all entries up to the 100,000 entry limit are returned, sorted by (type, name).

URL

/operation/endpoint/<collection_id>/ls [?path=/path/to/dir/][1]

Method

GET
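As an illustration, the listing URL can be assembled with Python's standard library. The base URL below is an assumption that does not come from this document, and ls_url is a hypothetical helper; sending the request, including authentication, is out of scope for this sketch.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL of the Transfer API; adjust for your deployment.
BASE_URL = "https://transfer.api.globus.org/v0.10"

def ls_url(collection_id: str, path: str = None, fields: list = None) -> str:
    """Build the GET URL for a directory listing."""
    params = {}
    if path is not None:
        params["path"] = path  # omitted: the server applies its default
    if fields:
        params["fields"] = ",".join(fields)  # comma separated field list
    url = f"{BASE_URL}/operation/endpoint/{quote(collection_id, safe='')}/ls"
    if params:
        url += "?" + urlencode(params)  # percent-encodes path characters
    return url
```

Leaving path out entirely, rather than sending an empty string, lets the service choose '/' or '/~/' according to the collection type.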

7.1.1. Directory Listing Query Parameters

Name | Type | Description
path | string | Path to a directory on the collection to list. Non-absolute paths are treated as relative to /~/.
show_hidden | boolean | If true, show hidden files (files with a name that begins with a dot). If false, hide them. Default is true.
limit | int | Change the page size. Defaults to 100,000, which is also the maximum. Note that the entire directory is still fetched from the collection on every request. This is because the GridFTP protocol does not support paging, so paging must be handled by the Transfer service.
offset | int | If using a limit less than 100,000, this can be used to page through the results.
orderby | string | A comma separated list of order by options. Each order by option is either a field name, or a field name followed by space and 'ASC' or 'DESC' for ascending and descending; ascending is the default. For the directory listing results, any "file" document field can be used in the orderby. Default orderby=type,name.
filter | string | Return only file documents that match the filter clauses specified in this string. This parameter can be passed multiple times. See Directory Listing Filtering for details.
local_user | string | Optional value passed to identity mapping specifying which local user account to map to. Only usable with Globus Connect Server v5 mapped collections.
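Paging with limit and offset can be sketched as a generator. In this Python sketch, fetch_page is a hypothetical callable supplied by the caller that performs the actual request and returns a parsed "file_list" document; the stopping rule (a page shorter than the limit) is an assumption of the sketch.

```python
from typing import Callable, Iterator

def iter_entries(fetch_page: Callable[..., dict], limit: int = 1000) -> Iterator[dict]:
    """Yield every "file" document by walking the listing page by page."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        entries = page["DATA"]
        yield from entries
        if len(entries) < limit:
            break  # a short page means the listing is exhausted
        offset += limit
```

Since the whole directory is fetched from the collection on every request, small pages reduce response size but not the work done per request, so a larger limit is usually the cheaper choice for complete listings.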

7.1.2. Directory Listing Filtering

An individual filter parameter for directory listing is made up of filter clauses. Each clause starts with a field from the Directory Listing Response followed by a colon and filter syntax dependent on the field chosen. Additional clauses may be added, separated by a forward slash (/) (representing a logical AND condition). To be matched by that filter, an item must match every clause given.

For example, "filter=name:~.*/type:dir" would match only hidden directories.

The filter parameter can be passed multiple times to allow a logical OR across the different filter clauses. An item will be included in the response if it matches at least one of the filters.

For example, "filter=type:dir&filter=name:~*.txt" would match both directories and txt files.
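The AND and OR rules can be captured in a small query-string builder. This Python sketch uses a hypothetical helper, filter_query, in which each argument is one filter (a list of clauses joined with "/"), and each filter becomes its own filter parameter.

```python
from urllib.parse import urlencode

def filter_query(*filters: list) -> str:
    """Build a query string: clauses are ANDed, filters are ORed."""
    # One ("filter", value) pair per filter; clauses joined by "/"
    pairs = [("filter", "/".join(clauses)) for clauses in filters]
    return urlencode(pairs)

# Hidden directories only (a single filter with two clauses)
hidden_dirs = filter_query(["name:~.*", "type:dir"])
# Directories OR .txt files (two separate filters)
dirs_or_txt = filter_query(["type:dir"], ["name:~*.txt"])
```

Passing a list of pairs to urlencode keeps repeated filter parameters in order and percent-encodes characters such as "/" and "*" where needed.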

String Fields

String fields such as name and type accept comma separated lists of patterns. An item matching any of the patterns is considered matching the clause. Patterns start with special characters to determine how they are applied. If no character is given, = is assumed.

= requires the strings match exactly.

~ matches against a pattern that can include * and ? as wildcards. * will match any number of other characters. ? will match any single character.

! is the inverse of =, allowing any string that doesn’t match exactly.

!~ is the inverse of ~, allowing any string that doesn’t match the pattern.
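The pattern prefixes can be emulated on the client side, which is handy for testing a filter before sending it. This Python sketch is only an approximation of the rules above, not the service's matcher; case-sensitive matching is an assumption, and the function names are hypothetical.

```python
import re

def _wildcard_regex(pattern: str) -> "re.Pattern":
    """Translate * and ? wildcards into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any number of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def matches_pattern(value: str, pattern: str) -> bool:
    """Apply one pattern, dispatching on its leading special characters."""
    if pattern.startswith("!~"):
        return _wildcard_regex(pattern[2:]).fullmatch(value) is None
    if pattern.startswith("~"):
        return _wildcard_regex(pattern[1:]).fullmatch(value) is not None
    if pattern.startswith("!"):
        return value != pattern[1:]
    if pattern.startswith("="):
        return value == pattern[1:]
    return value == pattern  # no prefix: = is assumed

def matches_clause(value: str, patterns: str) -> bool:
    """A clause matches if ANY of its comma separated patterns matches."""
    return any(matches_pattern(value, p) for p in patterns.split(","))
```

The any-of rule explains why negated patterns should be split into separate clauses: a value that equals one negated name still satisfies the other negated pattern in the same clause.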

Some examples:

"type:=file" or just "type:file" would filter out any items that are not files.

"name:~*.txt,~*.pdf" would filter out any items that do not end with the .txt or .pdf extensions.

"user:!alice/user:!bob" would filter out any items owned by local users named "alice" or "bob". Note that this is made up of two separate filter clauses, since "user:!alice,!bob" would match all items.

"name:!~.*" would filter out hidden items.

size

The size field supports a comma separated list of comparison operators along with an integer value in bytes. An item matching any of the operations is considered matching the clause. The supported comparison operators are: =, !, <, >, <=, and >=. If no operator is given, = is assumed.
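The size comparison rules can also be checked locally. This Python sketch mirrors the description above: terms within a clause are ORed, and separate clauses are ANDed. The function names are hypothetical, and this is not the service's code.

```python
import operator

# Two-character operators are listed first so ">=" is not read as ">"
_OPERATORS = [
    (">=", operator.ge), ("<=", operator.le),
    (">", operator.gt), ("<", operator.lt),
    ("!", operator.ne), ("=", operator.eq),
]

def matches_size_clause(size: int, clause: str) -> bool:
    """True if the size satisfies ANY comma separated term of the clause."""
    def term_matches(term: str) -> bool:
        for symbol, compare in _OPERATORS:
            if term.startswith(symbol):
                return compare(size, int(term[len(symbol):]))
        return size == int(term)  # no operator: = is assumed
    return any(term_matches(term) for term in clause.split(","))

def matches_all_clauses(size: int, clauses: list) -> bool:
    """Clauses separated by "/" in a filter must ALL match."""
    return all(matches_size_clause(size, clause) for clause in clauses)
```

Running a range through both helpers shows why a range needs two clauses: every size is either at least 500 or below 1000, so a single clause ">=500,<1000" accepts everything.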

Some examples:

"size:=1,=2,=3" or "size:1,2,3" would filter out any items that weren’t 1, 2, or 3 bytes in size.

"size:!0" would filter out any items that were 0 bytes in size.

"size:>=500/size:<1000" would filter out any items that were not between 500 and 1000 bytes in size, keeping 500 byte items but excluding 1000 byte items. Note that this is made up of two separate filter clauses, since "size:>=500,<1000" would match all items.

last_modified

The last_modified field supports a comma separated date range with dates specified in ISO 8601 format. Either end of the date range may be left out to specify an open range. If no comma is given the range defaults to after the given time.
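A date range clause can be generated from Python date objects, as in the sketch below. The helper name is hypothetical, and date-only ISO 8601 values are used for simplicity.

```python
from datetime import date
from typing import Optional

def last_modified_clause(start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Build a last_modified clause; either end of the range may be open."""
    if start is None and end is None:
        raise ValueError("at least one end of the range is required")
    left = start.isoformat() if start else ""
    right = end.isoformat() if end else ""
    # The comma is always emitted, so an open-ended range stays explicit
    return f"last_modified:{left},{right}"
```

Always emitting the comma avoids relying on the default that applies when no comma is given.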

Some examples:

"last_modified:2020-01-01," or "last_modified:2020-01-01" would list only items that were last modified on or after Jan 1, 2020.

"last_modified:,2021-01-01" would list only items that were last modified before Jan 1, 2021.

"last_modified:2020-01-01,2021-01-01" would list only items that were last modified in 2020.

7.1.3. Directory Listing Response

The response is a "file_list" document, containing a list of "file" documents, and some additional directory-level fields. Each "file" document represents a single file or directory. See the "Document Types" section for details.

Example
{
  "DATA_TYPE": "file_list",
  "path": "/~/path/to/dir/",
  "endpoint": "5d3c6c59-5244-11e5-84dd-22000bb3f45d",
  "rename_supported": true,
  "symlink_supported": true,
  "DATA": [
    {
      "DATA_TYPE": "file",
      "name": "somefile",
      "type": "file",
      "link_target": null,
      "user": "auser",
      "group": "agroup",
      "permissions": "0644",
      "last_modified": "2000-01-02 03:45:06+00:00",
      "size": 1024
    }
  ]
}

7.1.4. Errors

Code | HTTP Status | Description
ClientError.NotFound | 404 | Collection not found.
EndpointError[1] | 502 | Catch all for errors returned by the collection that don’t have specific types.

7.2. Get File or Directory Status

Stat the file or directory at the specified path on a collection.

Like ls, the path is specified in the path query parameter. Unlike ls, there is no "default" path — it must be specified.

URL

/operation/endpoint/<collection_id>/stat [?path=/path/to/item][1]

Method

GET

7.2.1. Stat Query Parameters

Name | Type | Description
path | string | Path to a file or directory on the collection. Non-absolute paths are treated as relative to /~/.
local_user | string | Optional value passed to identity mapping specifying which local user account to map to. Only usable with Globus Connect Server v5 mapped collections.

7.2.2. Stat Response

The response is a single "file" document, the same document type that ls returns as a list. The type field of the response can be used to determine whether the item is a file or a directory. See the "Document Types" section for more details.

Example
{
  "DATA_TYPE": "file",
  "group": "agroup",
  "last_modified": "2024-01-02 03:45:06+00:00",
  "link_group": null,
  "link_last_modified": null,
  "link_size": null,
  "link_target": null,
  "link_user": null,
  "name": "my_directory",
  "permissions": "0755",
  "size": 4096,
  "type": "dir",
  "user": "auser"
}
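A stat response can be interpreted with a few field checks. This Python sketch follows the field descriptions in the "Document Types" section; the helper name and the output wording are illustrative only.

```python
def describe_entry(doc: dict) -> str:
    """Summarize a "file" document, taking symlinks into account."""
    name = doc["name"]
    kind = doc["type"]
    target = doc.get("link_target")  # null/None unless the entry is a symlink
    if kind == "invalid_symlink":
        return f"{name}: broken symlink -> {target}"
    if target is not None:
        # For a valid symlink, type and size describe the target
        return f"{name}: {kind} (symlink -> {target})"
    return f"{name}: {kind}"
```

Checking link_target rather than type is what distinguishes a valid symlink from a regular entry, because a valid symlink reports the type of its target.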

7.2.3. Errors

Code | HTTP Status | Description
InvalidPath | 400 | The path contains characters that are not supported by the remote filesystem or is otherwise not valid.
EndpointPermissionDenied[1] | 403 | The user does not have permission to read the status of the specified path on the collection. For example, if the path is a symlink to outside the collection.
ClientError.NotFound | 404 | Collection or path not found.
EndpointError[1] | 502 | Catch all for errors returned by the collection that don’t have specific types.

7.3. Make Directory

Create a directory at the specified path on a collection.

URL

/operation/endpoint/<collection_id>/mkdir[1]

Method

POST

Request Body

{
  "DATA_TYPE": "mkdir",
  "path": "/~/newdir"
}

Response Body

{
  "DATA_TYPE": "mkdir_result",
  "code": "DirectoryCreated",
  "message": "The directory was created successfully",
  "request_id": "ShbIUzrWT",
  "resource": "/operation/endpoint/6c54cade-bde5-45c1-bdea-f4bd71dba2cc/mkdir"
}

7.3.1. Mkdir Request Fields

Field Name | JSON Type | Description
DATA_TYPE | string | Always has value "mkdir" to indicate this document type.
path | string | Path of the directory to be created. Non-absolute paths are treated as relative to /~/.
local_user | string | Optional value passed to identity mapping specifying which local user account to map to. Only usable with Globus Connect Server v5 mapped collections.
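The request body can be built and checked before it is sent. This Python sketch assembles the mkdir document from the fields above and rejects the "\r\n" sequence that Globus disallows in path names; mkdir_request is a hypothetical helper, and sending the POST request is left to the caller.

```python
import json
from typing import Optional

def mkdir_request(path: str, local_user: Optional[str] = None) -> str:
    """Return the JSON body for a mkdir request."""
    if "\r\n" in path:
        # The API rejects such paths, so fail before making the call
        raise ValueError("path must not contain the string '\\r\\n'")
    doc = {"DATA_TYPE": "mkdir", "path": path}
    if local_user is not None:
        doc["local_user"] = local_user  # only for GCS v5 mapped collections
    return json.dumps(doc)
```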

7.3.2. Result Codes

The "code" field of the result document will be one of the following:

Code | HTTP Status | Description
DirectoryCreated | 202 | Directory created successfully.

7.3.3. Errors

The mkdir operation can return any error returned by directory listing, as well as the following errors.

Code | HTTP Status | Description
ExternalError.MkdirFailed.Exists | 502 | The path already exists.
ExternalError.MkdirFailed.PermissionDenied | 403 | The user does not have permission to read or write one of the specified files or directories.
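Clients that only need the directory to exist often treat the "Exists" error as acceptable. The Python sketch below shows that pattern under the assumption that the parsed response document carries the code field in both the success and the error case; path_present_after_mkdir is a hypothetical helper.

```python
def path_present_after_mkdir(doc: dict) -> bool:
    """True if the path exists after a mkdir attempt."""
    code = doc.get("code", "")
    if code == "DirectoryCreated":
        return True
    if code == "ExternalError.MkdirFailed.Exists":
        # The path was already there. It may be a file rather than a
        # directory, so a follow-up stat is needed to be certain.
        return True
    return False
```

Any other code, including the permission error, is reported as a failure so that the caller can surface the message field to the user.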

7.4. Rename

Rename or move a file, directory, or symlink on a collection. If the object is a symlink, the symlink itself is renamed, not its target.

When moving to a different parent directory, the parent directory of the new path must already exist.

Note

Most servers will require that the new path is on the same filesystem as the old path, so this is not a general purpose move operation.

URL

/operation/endpoint/<collection_id>/rename[1]

Method

POST

Request Body

{
  "DATA_TYPE": "rename",
  "old_path": "/~/typo_name.txt",
  "new_path": "/~/fixed_name.txt"
}

Response Body

{
  "DATA_TYPE": "result",
  "code": "FileRenamed",
  "message": "File or directory renamed successfully",
  "request_id": "ShbIUzrWT",
  "resource": "/operation/endpoint/6c54cade-bde5-45c1-bdea-f4bd71dba2cc/rename"
}

7.4.1. Rename Request Fields

JSON strings are Unicode, but will be encoded as UTF-8 to interact with byte oriented filesystems. See the Path Encoding section for details.

Field Name | JSON Type | Description
DATA_TYPE | string | Always has value "rename" to indicate this document type.
old_path | string | Current path of a file, directory, or symlink. Non-absolute paths are treated as relative to /~/.
new_path | string | Path the item at old_path will be renamed to. Non-absolute paths are treated as relative to /~/.
local_user | string | Optional value passed to identity mapping specifying which local user account to map to. Only usable with Globus Connect Server v5 mapped collections.

7.4.2. Result Codes

The "code" field of the result document will be one of the following:

Code | HTTP Status | Description
FileRenamed | 200 | File or directory renamed successfully.

7.4.3. Errors

Note

New error codes may be added in the future. Clients should have a generic handler which displays the message field to the user.

Code | HTTP Status | Description
NotSupported | 409 | Collection does not support the rename operation.
EndpointNotFound[1] | 404 | Collection doesn’t exist or is not visible to the current user.
GCDisconnectedException | 409 | The Globus Connect Personal collection is not currently connected.
GCPausedException | 409 | The Globus Connect Personal collection is paused.
EndpointPermissionDenied[1] | 403 | The user does not have permission to read or write one of the specified paths on the collection.
NotFound | 404 | old_path doesn’t exist. Note: if the parent directory of new_path does not exist, then EndpointError is returned instead.
InvalidPath | 400 | One of the specified paths contains characters that are not supported by the remote filesystem or is otherwise not valid.
Exists | 409 | new_path already exists.
EndpointError[1] | 502 | Catch all for other errors received from the collection. Examples include connection failure, authentication failure, and filesystem failures like new_path being on a different filesystem from old_path or the parent directory of new_path not existing. The message field of the error document will contain the actual message returned by the server, and should be displayed to the user for further interpretation. It may include complex details not understood by some users, but it can be used in support requests with Globus and collection administrators.
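The recommended generic handler can be sketched as a lookup table with a fallback. In this Python sketch the summaries are paraphrases of the table above, the function name is hypothetical, and the presence of request_id in error documents is an assumption, so it is read defensively.

```python
# Short summaries for codes listed in this section
KNOWN_RENAME_ERRORS = {
    "NotSupported": "The collection does not support rename",
    "EndpointNotFound": "The collection was not found or is not visible",
    "GCDisconnectedException": "The Globus Connect Personal collection is not connected",
    "GCPausedException": "The Globus Connect Personal collection is paused",
    "EndpointPermissionDenied": "Permission denied on one of the paths",
    "NotFound": "old_path does not exist",
    "InvalidPath": "One of the paths is not valid",
    "Exists": "new_path already exists",
    "EndpointError": "The collection reported an error",
}

def format_rename_error(doc: dict) -> str:
    """Format an error document, always including the server message."""
    code = doc.get("code", "UnknownError")
    message = doc.get("message", "(no message provided)")
    summary = KNOWN_RENAME_ERRORS.get(code, "Unexpected error")
    text = f"{summary} [{code}]: {message}"
    request_id = doc.get("request_id")
    if request_id:
        text += f" (request_id {request_id})"  # useful in support requests
    return text
```

Because unknown codes fall through to the generic summary and still show the message field, the handler keeps working if new error codes are introduced.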


1. This use of the term "endpoint" is a case of legacy endpoint terminology and can also, or exclusively, refer to collections.
© 2010- The University of Chicago Legal Privacy Accessibility