This structure contains information about a subfile in a container file. It is initialized by calling fpGetSubFileInfo(). This structure is defined in kvxtract.h.
typedef struct tag_KVSubFileInfo
{
    KVStructHeader;
    char *subFileName;
    int subFileType;
    long subFileSize;
    unsigned long infoFlag;
    KVCharSet charset;
    int isMSBLSB;
    BYTE fileTime[8];
    int parentIndex;
    int childCount;
    int *childArray;
} KVContainerSubFileInfoRec, *KVSubFileInfo;
Member | Description |
---|---|
KVStructHeader | The KeyView version of the structure. See |
subFileName | The path, file name, or path and file name of the subfile. If the subfile is the body text of a mail file or is an embedded OLE object, KeyView provides a default file name. See Default File Names for Extracted Subfiles. |
subFileType | The subfile’s position in the container file’s hierarchy. NOTE: The classification of embedded images into images, icons, content, and previews is supported only for some Microsoft Office file formats (DOC, DOCX, XLSX, PPT, PPTX). |
subFileSize | The size of the subfile in bytes. This information might be useful if you do not want to extract very large files. This value is approximate and is the maximum size of the subfile. The subfile is usually smaller than this value when it is extracted. |
infoFlag | A bitwise flag that provides additional information about the subfile. The following flags are available: |
charset | If the subfile is not an attachment, this is the character set of the subfile. If the subfile is an attachment, the character set is KVCS_UNKNOWN. |
isMSBLSB | This flag indicates whether the byte order for Unicode text is Big Endian (MSBLSB) or Little Endian (LSBMSB). |
fileTime | When the subfile is a mail message, this is the file’s |
parentIndex | The index number of this file’s parent. For example, the index of a folder in which the subfile is stored, or the file to which the subfile is attached. If a file does not have a parent, the parentIndex is -1. |
childCount | The number of first-level children in the subfile. |
childArray | A pointer to an array of first-level children in the subfile. |
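Because subFileSize is an upper bound, an application that wants to avoid very large files can compare it against its own limit before extracting. The following is a minimal sketch of that check. The limit MAX_EXTRACT_BYTES and the helper within_extract_limit() are hypothetical names chosen for illustration and are not part of KeyView.

```c
#include <stdbool.h>

/* Hypothetical application limit (50 MB). This is not a KeyView setting. */
#define MAX_EXTRACT_BYTES (50L * 1024L * 1024L)

/* Returns true when the size reported in subFileSize is within the limit.
 * subFileSize is a maximum, so a subfile rejected here might in fact
 * extract to something smaller than the limit. */
static bool within_extract_limit(long reportedSize, long limitBytes)
{
    return reportedSize <= limitBytes;
}
```

Since the reported value is a maximum, this check errs on the side of skipping: it never lets an oversized subfile through, but it can reject one that would have fit.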
Embedded images (subFileType matching KVSubFileType_EmbeddedImage, KVSubFileType_EmbeddedIcon, KVSubFileType_EmbeddedContent, and KVSubFileType_EmbeddedPreview) are not extracted unless you set ExtractImages=TRUE in the configuration file (or the flag KVFLT_EXTRACTIMAGES). However, text contained in these objects is present in the filter output from the container file. As a result, if you filter a document but also extract and filter its embedded images, the output from KeyView will contain duplicate content.
If you prefer not to see the duplicate content, you can modify your application so that it ignores these subfiles based on their subFileType. Alternatively, in the Filter API, you can set the flag KVFLT_NOEMBEDDEDOBJECT using the function fpFilterConfig(). This instructs KeyView not to include information from embedded previews (subFileType matching KVSubFileType_EmbeddedPreview) in the filter output for the container file.
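The application-side approach, ignoring subfiles by subFileType, might look like the sketch below. So that it compiles on its own, it declares a stand-in enum with the four type names mentioned above. The numeric values are placeholders, not the values in kvxtract.h, and the helper is_embedded_image_type() is a hypothetical name. In a real application you would include kvxtract.h and remove the stand-in enum.

```c
#include <stdbool.h>

/* Stand-in for the declarations in kvxtract.h so that this sketch compiles
 * by itself. The numeric values are placeholders, not KeyView's values. */
enum {
    KVSubFileType_Main,
    KVSubFileType_EmbeddedImage,
    KVSubFileType_EmbeddedIcon,
    KVSubFileType_EmbeddedContent,
    KVSubFileType_EmbeddedPreview
};

/* Returns true for the subfile types whose text already appears in the
 * filter output of the container file. */
static bool is_embedded_image_type(int subFileType)
{
    switch (subFileType) {
    case KVSubFileType_EmbeddedImage:
    case KVSubFileType_EmbeddedIcon:
    case KVSubFileType_EmbeddedContent:
    case KVSubFileType_EmbeddedPreview:
        return true;
    default:
        return false;
    }
}
```

An application could call this helper with the subFileType of each subfile and skip filtering those for which it returns true.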
The KVSubFileType_Main type applies to the following for each file format:
File format | KVSubFileType_Main applies to... |
---|---|
MSG and EML | The message body. |
Zip files | A file inside the archive. |
PST files | An item that is not an attachment, an OLE object, or a root node. |
MBX files | A message in the MBX file. |
NSF files | An item that is not an attachment, an OLE object, or a root node. |
PDF files | An item that is not an attachment or a root node. |
If the KVSubFileInfoFlag_NeedsExtraction flag is set in infoFlag, open the subfile and extract its children. See fpOpenFile() and fpExtractSubFile().
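Because infoFlag is a bitwise flag, a bitwise AND is the usual way to test for an individual flag. The sketch below assumes that KVSubFileInfoFlag_NeedsExtraction occupies a single bit. The value defined here is a placeholder so that the example compiles without kvxtract.h, and needs_extraction() is a hypothetical helper name.

```c
#include <stdbool.h>

/* Placeholder bit value so that this sketch compiles by itself.
 * Real code takes the definition from kvxtract.h instead. */
#define KVSubFileInfoFlag_NeedsExtraction 0x1UL

/* Returns true when the NeedsExtraction bit is set in infoFlag,
 * regardless of which other bits are also set. */
static bool needs_extraction(unsigned long infoFlag)
{
    return (infoFlag & KVSubFileInfoFlag_NeedsExtraction) != 0UL;
}
```

Testing with a mask rather than comparing for equality keeps the check correct when other flags are set at the same time.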
The parentIndex and childArray members provide information about the subfile’s parent and children. You can use this information to recreate the file hierarchy on extraction. Because childArray retrieves only the first-level children in the subfile, you must call fpGetSubFileInfo() repeatedly until information for the leaf-node children is extracted. See Recreate a File’s Hierarchy.
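The repeated traversal described above can be sketched as a depth-first walk. This example does not call KeyView. It uses a hypothetical DemoSubFileInfo struct that mirrors only the members needed here, and a small hand-written table in place of the results that fpGetSubFileInfo() would return. It also assumes that childArray holds subfile index numbers, and that the root is at index 0 in the sample data only.

```c
#include <stdio.h>

/* Hypothetical stand-in that mirrors four members of KVSubFileInfo. */
typedef struct {
    const char *subFileName;
    int parentIndex;        /* -1 when the subfile has no parent */
    int childCount;         /* number of first-level children */
    const int *childArray;  /* assumed to hold the children's indexes */
} DemoSubFileInfo;

/* Sample data standing in for repeated fpGetSubFileInfo() calls. */
static const int rootChildren[] = {1, 2};
static const int folderChildren[] = {3};
static const DemoSubFileInfo demoInfos[] = {
    {"root",        -1, 2, rootChildren},
    {"message.txt",  0, 0, NULL},
    {"attachments",  0, 1, folderChildren},
    {"report.doc",   2, 0, NULL},
};

/* Writes the subfile at index and all of its descendants to out, one per
 * line, indented two spaces per level. Returns the number of subfiles
 * visited. In a real application, the lookup infos[index] would be
 * replaced by retrieving the information for that index from KeyView. */
static int walk_subfiles(const DemoSubFileInfo *infos, int index,
                         int depth, FILE *out)
{
    int visited = 1;
    fprintf(out, "%*s%s\n", depth * 2, "", infos[index].subFileName);
    for (int i = 0; i < infos[index].childCount; i++) {
        visited += walk_subfiles(infos, infos[index].childArray[i],
                                 depth + 1, out);
    }
    return visited;
}
```

Calling walk_subfiles(demoInfos, 0, 0, stdout) visits all four sample entries. Each level of recursion handles one set of first-level children, which is why the lookup must be repeated until the leaf nodes are reached.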