Identifiers

The identifiers fetch action retrieves a list of items that are present in the repository and returns an identifier for each item. Front-end applications can use this action to provide an interface for browsing a repository. You can use the identifiers that this action returns in other connector actions that require identifiers.

By default, the identifiers action returns identifiers only for items that would be synchronized using the specified task configuration. If you have set configuration parameters to exclude certain items, those items are not returned. To see excluded items, set the action parameter ShowExcluded to TRUE.

TIP: The identifiers fetch action does not expand container files, and does not provide identifiers for sub-files.

Type: Asynchronous

Parameter Name Description Required
Config A base-64 encoded configuration. The configuration parameters that are set override the same parameters in the connector's configuration file. No
ConfigSection The name of the configuration file section that contains the task settings. Yes
ContainersOnly A Boolean value (default false) that specifies whether to return only those items that represent containers. No
FilterTypes A comma-separated list of the types of items to return identifiers for. If you omit this parameter, the action returns items of all types. No
Identifiers A comma-separated list of identifiers. The action returns identifiers (and status information if supported) for these items and their ancestors, but does not return descendant items (to return descendants, set the ParentIdentifiers parameter instead). Set this parameter or ParentIdentifiers
IdentifiersAction The name of an action to perform on the returned identifiers. Only the collect fetch action is available. If the action you specify would require additional parameters, specify them as parameters to this action. No
MaxDepth The maximum depth that the connector crawls in the repository (from ParentIdentifiers). The default maximum depth is 1. To specify no limit, set this parameter to 0 (zero). Be aware that if you increase the maximum depth or specify an unlimited maximum depth, the action could take a long time to complete. No
ParentIdentifiers A comma-separated list of identifiers. The action returns identifiers (and status information if supported) for these items, and for ancestors and descendants of these items. To specify the root of the repository, set this parameter to ROOT. Set this parameter or Identifiers
ShowAncestors A Boolean value (default true) that specifies whether to return the identifiers of ancestors for items specified by ParentIdentifiers or Identifiers. The action returns parent items up to the root of the repository. No
ShowAttributes A Boolean value (default true) that specifies whether to show attributes in the response. For example, shows whether an item is a container, and shows whether a document could be ingested to represent the item. No
ShowDocStatus A Boolean value (default false) that specifies whether to show status information. This can include the ingestion status for each document and the modification history for items in the repository. No
ShowExcluded A Boolean value (default false) that specifies whether to return identifiers for excluded items (items that would not be synchronized because they are excluded by the task configuration). By default, the action returns only items that would be synchronized. To see items that would be ignored, set this parameter to true. No
ShowMetadata

A comma-separated list of basic metadata fields to return for each item. The list can contain any of the following values:

  • createdDate - the date when the item was created.
  • modifiedDate - the date when the item was last modified.
  • sizeBytes - the size of the item in bytes.

If you omit this parameter the connector does not return metadata. This feature is not supported by all connectors, and some platforms, repositories, or items might not support all of the metadata fields.

No
ShowNames A Boolean value (default true) that specifies whether the response shows a display name for each item (if one is available). No
ShowTypes A Boolean value (default true) that specifies whether the response shows the type of item that each identifier represents. No
Override_Config_Parameters

Any other action parameters that you set override settings in the connector's configuration file. For example:

/action=fetch&fetchaction=...
&[Section]Parameter=Value

where [Section] (optional) is the name of a configuration file section, Parameter is the name of a configuration parameter, and Value is the parameter value.

No

Example

The following example sends the identifiers fetch action to the connector. The connector returns the items that it finds by crawling from the root of the repository to a maximum depth of two levels:

http://localhost:7002/action=Fetch&FetchAction=Identifiers
                                  &ConfigSection=MyTask
                                  &ParentIdentifiers=ROOT
                                  &MaxDepth=2
                                  &ShowDocStatus=False
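
The request above can also be assembled programmatically. The following Python sketch is one way to do that; the helper name build_identifiers_url is hypothetical, and the host, port, and task name are simply those used in the example above. It assumes that parameter values only need standard URL encoding.

```python
from urllib.parse import urlencode


def build_identifiers_url(base_url, config_section, parent_identifiers="ROOT",
                          max_depth=1, **extra_params):
    """Build the URL for an identifiers fetch action (hypothetical helper).

    extra_params can carry any other action parameter from the table above,
    for example ShowDocStatus=False or ShowExcluded=True.
    """
    params = {
        "action": "Fetch",
        "FetchAction": "Identifiers",
        "ConfigSection": config_section,
        "ParentIdentifiers": parent_identifiers,
        "MaxDepth": max_depth,
    }
    params.update(extra_params)
    # The example above appends the parameters directly after the slash.
    return f"{base_url}/{urlencode(params)}"


url = build_identifiers_url("http://localhost:7002", "MyTask",
                            max_depth=2, ShowDocStatus=False)
print(url)
```

Because the action is asynchronous, sending this request returns a token rather than the list of identifiers.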

Response

The fetch action is asynchronous, so it returns a token. You can use the token with the QueueInfo action to retrieve the response.

<autnresponse>
  <action>QUEUEINFO</action>
  <response>SUCCESS</response>
  <responsedata>
    <actions>
      <action>
        <status>Finished</status>
        ...
        <documentcounts>
          <documentcount errors="0" seen="3" task="MYTASK"/>
        </documentcounts>
        <fetchaction>IDENTIFIERS</fetchaction>
        <identifiers parent_identifier="ROOT">
          <identifier attributes="container" name="C:\MyFiles" type="Directory">identifier1</identifier>
        </identifiers>
        <identifiers parent_identifier="identifier1" descendant="true">
          <identifier attributes="container" name="MyFolder" type="Directory">identifier2</identifier>
          <identifier attributes="document" name="File.txt" type="File">identifier3</identifier>
        </identifiers>
        <token>...</token>
      </action>
    </actions>
  </responsedata>
</autnresponse>
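
As a rough illustration of how a client might read a response like the one above, the following Python sketch lists every identifier with its parent. The function name list_identifiers is hypothetical, and the sample is an abbreviated copy of the response shown above; a real response might contain further elements or attributes.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example response (raw string because of the backslash).
SAMPLE = r"""<autnresponse>
  <action>QUEUEINFO</action>
  <response>SUCCESS</response>
  <responsedata>
    <actions>
      <action>
        <status>Finished</status>
        <fetchaction>IDENTIFIERS</fetchaction>
        <identifiers parent_identifier="ROOT">
          <identifier attributes="container" name="C:\MyFiles" type="Directory">identifier1</identifier>
        </identifiers>
        <identifiers parent_identifier="identifier1" descendant="true">
          <identifier attributes="container" name="MyFolder" type="Directory">identifier2</identifier>
          <identifier attributes="document" name="File.txt" type="File">identifier3</identifier>
        </identifiers>
      </action>
    </actions>
  </responsedata>
</autnresponse>"""


def list_identifiers(response_xml):
    """Return one dict per <identifier> element, including its parent identifier."""
    root = ET.fromstring(response_xml)
    items = []
    for group in root.iter("identifiers"):
        parent = group.get("parent_identifier")
        for item in group.findall("identifier"):
            # The attributes value is a comma-separated list, and can be absent.
            attributes = [a for a in item.get("attributes", "").split(",") if a]
            items.append({
                "parent": parent,
                "identifier": (item.text or "").strip(),
                "name": item.get("name"),
                "type": item.get("type"),
                "attributes": attributes,
            })
    return items


items = list_identifiers(SAMPLE)
for entry in items:
    print(entry["parent"], "->", entry["identifier"], entry["attributes"])
```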

The response contains an <identifiers parent_identifier="..."> element for each of the identifiers passed to the action in the ParentIdentifiers or Identifiers action parameter. If you use the ParentIdentifiers action parameter and set the MaxDepth action parameter to a value greater than 1, the response also contains an <identifiers parent_identifier="..."> element for descendant items that have child identifiers, down to the requested depth. The parent_identifier attribute specifies the identifier of the item. These elements can also include the following attributes:

  • self="true" indicates that this is one of the identifiers you passed to the action in the ParentIdentifiers or Identifiers parameter.
  • ancestor="true" indicates that this item is an ancestor of one of the identifiers you passed to the action in the ParentIdentifiers or Identifiers parameter. You can hide these elements by setting the action parameter ShowAncestors=False.
  • descendant="true" indicates that this item is a descendant of one of the identifiers you passed to the action in the ParentIdentifiers parameter. You can limit the number of descendants that are returned by setting the action parameter MaxDepth.

Each <identifiers parent_identifier="..."> element contains <identifier ...> elements for items that are direct descendants of the parent item.

An <identifiers ...> or <identifier ...> element can provide the following attributes:

  • name - a display name for the item (if one is available).
  • attributes - contains a comma-separated list of attributes for the item.

    • If an item has the container attribute it is an item that can contain other items. To retrieve the identifiers of the child items, increase the value of MaxDepth or run the action again, using the identifier of the container as the value of the ParentIdentifiers action parameter.
    • If an item has the document attribute it is a file or has metadata that can be ingested.
  • meta_* - these attributes contain basic metadata for the item. They are present in the response only if you set the ShowMetadata action parameter. This feature is not supported by all connectors, and some platforms, repositories, or items might not support all of the metadata fields.

    • meta_createdDate - the date when the item was created, in epoch seconds.
    • meta_modifiedDate - the date when the item was last modified, in epoch seconds.
    • meta_sizeBytes - the size of the item, in bytes.
  • type - the type of item that the identifier represents.

  • exclude - indicates whether the item is excluded from being synchronized by the task configuration. This attribute can have any combination of the values self (the item itself is excluded) and children (all descendants of the item are excluded, to an unlimited depth). For example, exclude="self,children" means that the item and its descendants are both excluded. Excluded items and the exclude attribute are returned in the response only when you set the action parameter ShowExcluded=TRUE. Be aware that an identifier might not be marked as excluded but could be excluded because one of its ancestors has the exclude attribute exclude="children".
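
Because an item can be excluded through an ancestor, a client that wants to know whether an item would really be synchronized has to check the ancestors as well. The following Python sketch shows one possible check. The function names and the two dictionaries (a child-to-parent map and a map of parsed exclude values) are hypothetical structures that a client would build from the response; they are not part of the connector.

```python
def parse_exclude(value):
    """Turn an exclude attribute such as "self,children" into a set of values."""
    return {part.strip() for part in (value or "").split(",") if part.strip()}


def is_effectively_excluded(identifier, parents, exclude):
    """Return True if the item, or any of its ancestors, excludes it.

    parents: dict mapping an identifier to its parent identifier.
    exclude: dict mapping an identifier to its parsed exclude set.
    """
    # The item itself is excluded.
    if "self" in exclude.get(identifier, set()):
        return True
    # Otherwise walk up the tree looking for an ancestor that excludes its children.
    current = parents.get(identifier)
    while current is not None:
        if "children" in exclude.get(current, set()):
            return True
        current = parents.get(current)
    return False


# Hypothetical tree: identifier1 > identifier2 > identifier3
parents = {"identifier2": "identifier1", "identifier3": "identifier2"}
exclude = {"identifier1": parse_exclude("children")}

print(is_effectively_excluded("identifier3", parents, exclude))
print(is_effectively_excluded("identifier1", parents, exclude))
```

The sketch assumes that the parent map forms a tree, so the upward walk always ends at an item that has no parent.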

The identifiers action for the File System Connector can return the following types:

Type | Possible Attributes | Description | Display Name | Child Identifier Types
Directory | container | A folder in the file system. | The folder path when the parent identifier is the root of the repository. Otherwise, the folder name. | Directory, File
File | document | A file in the file system. | The file name. |

An <identifiers parent_identifier="..."> or <identifier ...> element can include the following attributes that describe the status of the item. Some or all of these attributes can be omitted if the information is not available.

  • attributesmodified - the number of times the attributes of a file have been modified. This might be a minimum number; if an item is modified more than once between synchronize cycles the connector might only observe a single change (this depends on the information available from the repository).
  • attributesmodified_history - the time intervals between changes to attributes (up to 50, one character per interval, with the most recent change at the end of the list). For information about how to read this string, see Read the Document Modification History.
  • modified - the number of modifications (to content) that have been observed by the connector. This might be a minimum number; if an item is modified more than once between synchronize cycles the connector might only observe a single change (this depends on the information available from the repository).
  • modified_history - the time intervals between changes to the content of the item (up to 50, one character per interval, with the most recent change at the end of the list). For information about how to read this string, see Read the Document Modification History.
  • status - a comma-separated list of values that describe the status of ingestion for the item.

    Status Description Notes
    Invalid Invalid identifier. This value always appears alone.
    NonExistent The identifier is valid but the item does not exist in the repository or in the connector's datastore. This value always appears alone.
    Pending An operation is pending. At most one of these values.
    PendingAdd An add (or full update) is pending.
    PendingUpdate A metadata update is pending.
    PendingDelete A delete is pending.
    PendingRecursive A recursive operation is pending (changes are pending but only on child items). At most one of these values, but they can be combined with the pending statuses above. For example, an item with a status of PendingAdd,PendingRecursiveUpdate requires a full update but its children require only a metadata update.
    PendingRecursiveAdd A recursive full update is pending.
    PendingRecursiveUpdate A recursive metadata update is pending.
    PendingRecursiveDelete A recursive delete is pending.
    Unseen The connector has not seen the item during a synchronize action. At most one of these values.
    Seen The connector has seen the item during a synchronize action.
    NotCrawled Child items have not been processed. At most one of these values.
    Crawled Some or all child items have been processed.
    NotIngested The item has not been ingested. At most one of these values. If neither of these values is returned the ingestion state is not known.
    Ingested The item has been ingested.
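
The rules in the Notes column can be checked mechanically. The following Python sketch splits a status value into the groups described in the table and rejects combinations that the table rules out. The group names and the function parse_status are hypothetical. Values that do not appear in the table are kept under an "unknown" key, on the assumption that a connector might report values that are not listed here.

```python
# Groups taken from the status table; at most one value from each group can appear.
STATUS_GROUPS = {
    "pending": {"Pending", "PendingAdd", "PendingUpdate", "PendingDelete"},
    "recursive": {"PendingRecursive", "PendingRecursiveAdd",
                  "PendingRecursiveUpdate", "PendingRecursiveDelete"},
    "seen": {"Unseen", "Seen"},
    "crawled": {"NotCrawled", "Crawled"},
    "ingested": {"NotIngested", "Ingested"},
}
# These values always appear alone.
STANDALONE = {"Invalid", "NonExistent"}


def parse_status(status):
    """Split a comma-separated status value into one entry per group."""
    values = [v.strip() for v in status.split(",") if v.strip()]
    if any(v in STANDALONE for v in values):
        if len(values) != 1:
            raise ValueError(f"standalone status combined with others: {status}")
        return {"standalone": values[0]}
    result = {}
    for value in values:
        group = next((name for name, members in STATUS_GROUPS.items()
                      if value in members), None)
        if group is None:
            result.setdefault("unknown", []).append(value)
        elif group in result:
            raise ValueError(f"more than one {group} value in: {status}")
        else:
            result[group] = value
    return result


print(parse_status("PendingAdd,PendingRecursiveUpdate"))
```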

Read the Document Modification History

The attributes modified_history and attributesmodified_history contain strings where each character represents a time duration.

To convert a character into a time duration:

  1. Convert the character to an integer, n (0 to 61), given by its zero-based position in this string: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.
  2. Using floating point arithmetic, calculate (13/9)^n, that is, 13/9 raised to the power n.
  3. Read the resulting number as a number of seconds. This is the minimum duration. Due to rounding, the actual value is in the range (13/9)^n to (13/9)^(n+1).

For example, to read "F":

  • "F" is the 16th character in the string above, so n=15.
  • (13/9)^15 = (1.44444...)^15 = 248.6... This gives a minimum duration of 0:04:08. The full range is 248.6 to 359.1 seconds, that is, a duration between 0:04:08 and 0:05:59.
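
The conversion can be sketched in Python as follows. The function names are hypothetical. The sketch reads the formula as 13/9 raised to the power n, which is the reading that reproduces the figures in the example for "F".

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = 13 / 9


def decode_interval(char):
    """Return the (minimum, maximum) duration in seconds for one history character."""
    n = ALPHABET.index(char)  # raises ValueError for a character outside the alphabet
    return BASE ** n, BASE ** (n + 1)


def decode_history(history):
    """Decode a whole history string; the most recent change is the last entry."""
    return [decode_interval(char) for char in history]


low, high = decode_interval("F")
print(f"F: between {low:.1f} and {high:.1f} seconds")
```

The largest character, "z" (n=61), corresponds to a minimum of (13/9)^61 seconds, so a single character can represent intervals ranging from about a second up to very long periods.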