The parse_document_csv function parses a CSV file into documents and calls a function on each document.
This function can handle CSV files with or without a header row. If the file has no header row, you must set use_header_row to false and provide the field names by setting csv_field_names.

parse_document_csv( filename, handler [, params ] )
Argument | Description |
---|---|
filename | (string) The path and file name of the CSV file to parse into documents. |
handler | (document_handler_function) The function to call on each document that is parsed from the CSV file. |
params | (table) A table of named parameters to configure parsing. The table maps parameter names (string) to parameter values. For information about the parameters that you can set, see the following table. |
Named Parameter | Description |
---|---|
content_field | (string, default DRECONTENT) The name of the field, in the CSV file, to use as the document content. |
csv_field_names | (string list) A list of names for the fields that exist in the CSV file. This overrides any header row, if one is present. |
reference_field | (string, default DREREFERENCE) The name of the field, in the CSV file, to use as the document reference. |
use_header_row | (boolean, default true) Specify whether the CSV file includes a header row (whether the first row is a list of field names and not values). If this parameter is true and you do not set csv_field_names, the field names in the header row are used as the names of the document fields. |
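
The interaction between use_header_row and csv_field_names can be easier to follow as code. The following is a minimal sketch of the rules described in the table above, not the connector's implementation. The function resolve_field_names and its second return value (whether the first row is treated as a header and not as values) are hypothetical names introduced for illustration. The source does not say what happens when there is no header row and csv_field_names is not set, so the sketch returns nil in that case.

```lua
-- Hypothetical helper (not part of the connector API). It models the
-- documented rules for choosing document field names.
--   first_row: list of strings read from the first row of the CSV file
--   params:    the named parameter table passed to parse_document_csv
function resolve_field_names(first_row, params)
    params = params or {}

    -- use_header_row defaults to true when it is not set.
    local use_header_row = params.use_header_row
    if use_header_row == nil then
        use_header_row = true
    end

    -- csv_field_names overrides any header row, if one is present.
    if params.csv_field_names then
        return params.csv_field_names, use_header_row
    end

    -- Otherwise the header row supplies the field names.
    if use_header_row then
        return first_row, true
    end

    -- No header row and no explicit names: the documentation requires
    -- csv_field_names here, so this sketch reports "no names".
    return nil, false
end

-- Example: a file with a header row and no overrides.
local names, first_row_is_header = resolve_field_names({ "item_id", "body" }, {})
print(table.concat(names, ", "), first_row_is_header)
```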
The following example parses a CSV file named data.csv and calls the function documentHandler on each document. The values in the field item_id become document references and the values in the field body become document content.

    function documentHandler(document)
        -- do something, for example
        print(document:getReference())
    end

    ...

    parse_document_csv("./data.csv", documentHandler, { reference_field="item_id", content_field="body" })

The following example shows how to provide field names when there is no header row in the CSV file:

    parse_document_csv("./data_no_header.csv", documentHandler, {
        use_header_row=false,
        csv_field_names={"DREREFERENCE", "title", "modified", "DRECONTENT"}
    })

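
Because the handler is called once per document, a closure is one way to accumulate results across calls. The following is a minimal sketch under stated assumptions: make_reference_collector and fake_document are hypothetical names introduced for illustration, and fake_document is only a stand-in that lets the snippet run outside the connector. It mimics the getReference method used in the examples above and nothing else.

```lua
-- Hypothetical helper: returns a handler and the table it fills.
-- Each call to the handler appends one document reference.
function make_reference_collector()
    local references = {}
    local function handler(document)
        references[#references + 1] = document:getReference()
    end
    return handler, references
end

-- Stand-in for a connector document, so the sketch runs without the
-- connector. Real documents are supplied by parse_document_csv.
local function fake_document(reference)
    return { getReference = function(self) return reference end }
end

local handler, references = make_reference_collector()
handler(fake_document("item-1"))
handler(fake_document("item-2"))
print(#references)

-- Inside the connector, the same handler would be passed directly:
-- parse_document_csv("./data.csv", handler, { reference_field="item_id" })
```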
The parse_document_csv function returns nil.