Create Identifiers
Although a document's reference should identify a unique file or item in the repository, the reference alone might not be enough for the connector to retrieve that item again efficiently (for example, in response to a collect request). You can therefore add information to a document's identifier. The identifier always holds the reference, but can also hold a set of key-value pairs that help the connector retrieve the document.
To illustrate this, the following code sample creates a document for an imaginary e-mail archive system:
IDocInfo doc = DocInfo.Create(task.TaskConfig, "32998F5A-852D-404A-BD7B-730724D89784", "tempFile.msg", true);
In this system, each message has a unique GUID, which is a good choice for the reference. However, imagine that the system provides no way to retrieve a message by GUID other than searching through every folder of every mailbox. To help the connector retrieve the message efficiently, you might add the mailbox and folder to the document's identifier:
doc.Identifier.SetProperty("Mailbox", "someone@example.com");
doc.Identifier.SetProperty("Folder", "/Inbox");
Now, given only the document identifier, the connector can quickly find the original message from which the document was created. When the document is indexed into IDOL, the identifier is stored in the AUTN_IDENTIFIER document field.
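The retrieval side can be sketched as follows. This is a minimal, self-contained illustration and does not use the connector library: SketchIdentifier, SketchArchive, TryGetProperty, and FoldersVisited are hypothetical names invented for this example, and the archive is an in-memory stand-in for the imaginary e-mail system. It shows how the Mailbox and Folder properties let a lookup go straight to one folder, with a full scan only as a fallback.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a document identifier:
// a reference plus a set of key-value pairs.
public sealed class SketchIdentifier
{
    private readonly Dictionary<string, string> properties =
        new Dictionary<string, string>();

    public string Reference { get; }

    public SketchIdentifier(string reference)
    {
        Reference = reference;
    }

    public void SetProperty(string key, string value)
    {
        properties[key] = value;
    }

    public bool TryGetProperty(string key, out string value)
    {
        return properties.TryGetValue(key, out value);
    }
}

// Hypothetical in-memory archive: mailbox -> folder -> (GUID -> message body).
public sealed class SketchArchive
{
    private readonly Dictionary<string, Dictionary<string, Dictionary<string, string>>> mailboxes =
        new Dictionary<string, Dictionary<string, Dictionary<string, string>>>();

    // Number of folders examined by the most recent call to Retrieve.
    public int FoldersVisited { get; private set; }

    public void Add(string mailbox, string folder, string guid, string body)
    {
        if (!mailboxes.TryGetValue(mailbox, out var folders))
        {
            folders = new Dictionary<string, Dictionary<string, string>>();
            mailboxes[mailbox] = folders;
        }
        if (!folders.TryGetValue(folder, out var messages))
        {
            messages = new Dictionary<string, string>();
            folders[folder] = messages;
        }
        messages[guid] = body;
    }

    // Returns the message body, or null if the message cannot be found.
    public string Retrieve(SketchIdentifier id)
    {
        FoldersVisited = 0;

        // Fast path: the identifier names the mailbox and folder,
        // so only that one folder is examined.
        if (id.TryGetProperty("Mailbox", out string mailbox) &&
            id.TryGetProperty("Folder", out string folder) &&
            mailboxes.TryGetValue(mailbox, out var hintedFolders) &&
            hintedFolders.TryGetValue(folder, out var hintedMessages))
        {
            FoldersVisited = 1;
            if (hintedMessages.TryGetValue(id.Reference, out string hintedBody))
            {
                return hintedBody;
            }
        }

        // Slow path: search every folder of every mailbox by GUID.
        FoldersVisited = 0;
        foreach (var folderMap in mailboxes.Values)
        {
            foreach (var messageMap in folderMap.Values)
            {
                FoldersVisited++;
                if (messageMap.TryGetValue(id.Reference, out string scannedBody))
                {
                    return scannedBody;
                }
            }
        }
        return null;
    }
}
```

With an identifier that carries both properties, Retrieve examines a single folder. With the reference alone, it has to walk the whole archive, which is the cost the extra key-value pairs are meant to avoid.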
NOTE: The identifier should include information for finding the document, but not for connecting to the repository. Do not include information such as web service URLs or user credentials. This information is not document-specific and can be stored in the connector's configuration file. Storing connection details in the configuration file means that document identifiers remain valid if the connection details change. It also helps to minimize the length of the AUTN_IDENTIFIER field.