ClusterSGDataGen
Allows you to generate spectrograph data from snapshots that you have taken by using the ClusterSnapshot action. The ClusterSGDataGen
action uses the snapshots in the time span defined by one or more of the StartDate, EndDate, and Interval parameters.
NOTE: This is an administrative action that can be sent only by users that belong to an authorization role that allows the Admin
standard role, or which enables the action explicitly. See Authorization Roles Configuration Parameters.
ClusterSGDataGen generates a data set based on multiple cluster snapshots with the same job name (which you specify with SourceJobName). You can use the data to generate a spectrograph, which is a visual representation of how clusters change over a given time period.
You must have created at least two snapshots with the specified SourceJobName for the ClusterSGDataGen
action to run successfully. You can generate additional snapshots by using the FillGaps parameter.
When you send the ClusterSGDataGen
action, IDOL Server queues it. After it finishes, the spectrograph data sets it has generated are stored in the cluster/SGDATA
directory in your IDOL Server installation directory. You can retrieve the spectrograph image, data, or documents by using the ClusterSGPicServe, ClusterSGDataServe, and ClusterSGDocsServe actions.
Example
http://12.3.4.56:9000/action=ClusterSGDataGen&SourceJobName=Job1&TargetJobName=Job1a&StartDate=1000290039&EndDate=1000290650
This action generates spectrograph data from all snapshots called Job1 that were generated between September 12 2001 at 11:20:39 and September 12 2001 at 11:30:50 (if no snapshot is available for these dates, IDOL Server uses the snapshots that were generated before these dates). The spectrograph data set that is generated is called Job1a.
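The example request can also be assembled programmatically. The following Python sketch rebuilds the example URL from its parts and converts the two epoch values to readable times. The helper name build_action_url is hypothetical and not part of any IDOL client library. The fixed UTC+1 offset is an assumption, chosen because it reproduces the times quoted in this example; in UTC the same values fall one hour earlier.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_action_url(host, port, action, **params):
    """Build a request URL in the style of the example above.

    Hypothetical helper for illustration only.
    """
    query = urlencode(params)  # percent-encodes values where needed
    return f"http://{host}:{port}/action={action}&{query}"


url = build_action_url(
    "12.3.4.56", 9000, "ClusterSGDataGen",
    SourceJobName="Job1",
    TargetJobName="Job1a",
    StartDate=1000290039,
    EndDate=1000290650,
)
print(url)

# Assumption: the times quoted in the example are local times at UTC+1.
local = timezone(timedelta(hours=1))
start = datetime.fromtimestamp(1000290039, tz=local)
end = datetime.fromtimestamp(1000290650, tz=local)
print(start.strftime("%Y-%m-%d %H:%M:%S"))
print(end.strftime("%Y-%m-%d %H:%M:%S"))
```

Building the query string with urlencode keeps job names that contain spaces or reserved characters from breaking the request.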
Required Parameters
The following action parameters are required.
Parameter | Description |
---|---|
SourceJobName | The snapshot to generate spectrograph data from. |
TargetJobName | The name of the spectrograph data to generate. |
You must define a time span, which you can set by using one or more of the following parameters.
Parameter | Description |
---|---|
EndDate | The time span for which to generate spectrograph data. |
Interval | The time span for which to generate spectrograph data. |
StartDate | The time span for which to generate spectrograph data. |
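A client can check these rules before it sends the action. The following Python sketch assumes the parameters are held in a dictionary; the function name missing_parameters is hypothetical, and the check is a client-side convenience that does not replace whatever validation the server applies. For simplicity it matches parameter names case-sensitively.

```python
REQUIRED = ("SourceJobName", "TargetJobName")
TIME_SPAN = ("StartDate", "EndDate", "Interval")


def missing_parameters(params):
    """Return a list of problems found in a ClusterSGDataGen parameter set.

    Hypothetical client-side check for illustration only.
    """
    problems = [f"missing required parameter: {name}"
                for name in REQUIRED if name not in params]
    # At least one of the time span parameters must be present.
    if not any(name in params for name in TIME_SPAN):
        problems.append("define a time span with StartDate, EndDate, or Interval")
    return problems


ok = {"SourceJobName": "Job1", "TargetJobName": "Job1a", "StartDate": 1000290039}
bad = {"SourceJobName": "Job1"}
print(missing_parameters(ok))
print(missing_parameters(bad))
```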
Optional Parameters
This action accepts the following optional parameters.
Parameter | Description |
---|---|
BindLevel | The conceptual similarity of clusters. |
ComparisonTolerance | The maximum amount of time between snapshots to compare. |
Cycles | The number of times to run the action. |
Dependencies | A list of classification schedules that must be complete before the ClusterSGDataGen action can run. |
DREQuery | A query to use to restrict snapshot generation when you set FillGaps to True. |
FillGaps | Whether to create additional snapshots for the spectrograph generation if none exist in the time span. |
FillGapsFrequency | The interval at which IDOL Server checks if snapshots exist in the time span for which it is generating a spectrograph. |
ForceTimestamp | A specific time stamp to use for the generated spectrograph data. |
Params | The names of parameters to use in the Suggest actions that IDOL Server uses to create seeds for additional snapshots, when you set FillGaps to True. |
ProfileSourceJobName | A profile snapshot to compare to the SourceJobName. |
RankSections | The relevance that documents must have to a cluster. |
Repeat | The time to elapse between runs of the action. |
Retries | The number of times to retry a failed action. |
RetryInterval | The number of seconds to wait before retrying a failed action. |
SeedBindLevel | A value that specifies how closely bound concepts must be to form a cluster seed for creating snapshots when you set FillGaps to True. |
SeedSize | The size of the document group that forms a seed for creating snapshots when you set FillGaps to True. |
SourceJobName2 | The name of the snapshot to compare to the SourceJobName snapshot. |
StartTime | The time to run the first action. |
Username | The name of the user performing the action. |
Values | The values for the specified Params. |
XMLEncoding | Overrides the default XML encoding. |
This action accepts the following standard ACI action parameters.
Parameter | Description |
---|---|
ActionID | A string to use to identify an ACI action. |
EncryptResponse | Encrypt the output. |
FileName | The file to write output to. |
ForceTemplateRefresh | Forces the server to load the template from disk. |
Output | Writes output to a file. |
ResponseFormat | The format of the action output. |
Template | The template to use for the action output. |
TemplateParamCSVs | A list of variables to use for the specified template. |
Comments
You must define a time span, which you can set with one or more of the StartDate, EndDate, and Interval parameters.
Result Format
The spectrograph data results return each cluster in a <node> element, which has the following attributes:
Attribute | Description |
---|---|
nodeID | The ID for the node. |
title | The cluster title and a summary of its contents. |
clusterID | The ID of the cluster (this ID is the same across multiple snapshots). |
numDocs | The number of documents in the cluster, including any duplicates (duplicates can occur because a document can be used in multiple cluster seeds, which might then be combined during the clustering process). |
absDocs | The number of unique documents in the cluster. |
selfSimilarity | A measure of how tight the cluster is, between zero and one. A value close to one indicates a well-defined single subject. A value closer to zero indicates a broader grouping. |
whatsHotScore | A measure of how important the cluster is. This score measures how similar the documents in a cluster are, and how narrow the range of concepts in that cluster is. When a cluster contains a lot of very similar data, it has a higher What’s Hot score, so higher scores represent more important trends. |
intensity | A measure of the cluster size, represented on the spectrograph by how brightly the cluster is colored. |
radius | A measure of the cluster importance, represented on the spectrograph by the width of the cluster. This value is based on the whatsHotScore for the cluster. |
yPos | The y-position (in pixels) of this cluster on the spectrograph. (The x-position is defined by the times at which the snapshots were taken.) |
colour | The top color for the cluster. This value is returned as an integer, where a higher number means a darker color. You can use these values to assign colors for each cluster. |
connection | (nodeID and weight). An indication of a link to other nodes in adjacent snapshots, and the strength of this link. |
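The attributes described above can be read with any XML parser. The following Python sketch parses a small, invented fragment and ranks the clusters by whatsHotScore. The sample XML, its wrapper element, and the function name read_nodes are all hypothetical: a real response may nest the node elements differently, use namespaced element names, and carry the remaining attributes, so treat this only as an outline of the approach.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; not a real server response.
SAMPLE = """
<clusters>
  <node nodeID="0" title="example cluster A" clusterID="7" numDocs="12"
        absDocs="10" selfSimilarity="0.82" whatsHotScore="35"/>
  <node nodeID="1" title="example cluster B" clusterID="9" numDocs="30"
        absDocs="21" selfSimilarity="0.41" whatsHotScore="58"/>
</clusters>
"""


def read_nodes(xml_text):
    """Collect selected attributes of every <node> element as dictionaries."""
    nodes = []
    for node in ET.fromstring(xml_text.strip()).iter("node"):
        nodes.append({
            "nodeID": node.get("nodeID"),
            "title": node.get("title"),
            "clusterID": node.get("clusterID"),
            "numDocs": int(node.get("numDocs")),
            "absDocs": int(node.get("absDocs")),
            "selfSimilarity": float(node.get("selfSimilarity")),
            "whatsHotScore": float(node.get("whatsHotScore")),
        })
    return nodes


nodes = read_nodes(SAMPLE)
# Rank clusters by What's Hot score, highest first.
ranked = sorted(nodes, key=lambda n: n["whatsHotScore"], reverse=True)
for n in ranked:
    # numDocs includes duplicates and absDocs does not, so the difference
    # is the number of entries beyond the unique documents.
    duplicates = n["numDocs"] - n["absDocs"]
    print(n["title"], n["whatsHotScore"], duplicates)
```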