This package provides a single function called read.csv.columns. Like other CSV-reading functions, it reads data from a CSV file, but here you specify in advance what type each column is.
The signature of the function is
read.csv.columns(file.name, column.types="", max.line.length=16384, has.header=TRUE, num.threads=1)
where the meaning of the arguments is as follows:
file.name
: the path to the CSV file
column.types
: a string containing as many characters as there are columns in the CSV file. If this string is empty, the data type of each column will be guessed based on the first line containing data. The allowed characters in this string and their meanings are:
i
: the column contains integers
r
: the column contains real numbers
s
: the column contains arbitrary strings
.
: the column should be ignored
max.line.length
: specifies an upper limit to the length of a line in the CSV file (the default is probably more than enough)
has.header
: if set to TRUE (the default), the values of the first line are used as labels for the columns of the CSV file. If set to FALSE, the first line is also considered to consist of data values.
num.threads
: by default, a single processor thread is used to parse the strings into numbers. If this is set to a number larger than one, that number of threads will be used to parse the data, possibly offering a speedup. If the number is zero or negative, the number of cores reported by the detectCores function (from the parallel package) will be used.
The function returns a list where each entry corresponds to a column in the CSV file. Columns that were marked as ignored (with the . character) are not present in this list.
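As an illustration of the arguments described above, the following is a minimal sketch of a call on a small four-column file. It assumes the package is installed under the name readcsvcolumns, which this text does not state; the file contents and the variable names csv.path, types and columns are made up for the example.

```r
# Write a small CSV file with a header line and four columns.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("id,price,name,comment",
             "1,2.50,apple,fresh",
             "2,0.75,pear,ripe",
             "3,1.20,plum,sour"), csv.path)

# One character per column: integer, real, string, ignored.
types <- "irs."

# Assumption: the package is installed as 'readcsvcolumns'.
# The call is skipped if it is not available.
if (requireNamespace("readcsvcolumns", quietly = TRUE)) {
  columns <- readcsvcolumns::read.csv.columns(csv.path,
                                              column.types = types,
                                              has.header = TRUE,
                                              num.threads = 1)
  # Inspect the returned list; the ignored fourth column
  # should not appear in it.
  str(columns)
}
```

Passing an empty string for column.types instead would let the function guess each type from the first data line, at the cost of not being able to skip columns explicitly.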