HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Parsing DATE while copying csv file into PostgreSQL table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlfilewhileintocsvdatecopyingparsingtable

Problem

I have a long series of .csv files, which I want to import into a local database. I believe my query is correct, but there are some problems in parsing DATE and TIMESTAMP columns. PostgreSQL reads these columns expecting an ISO format "yyyy/mm/dd", but my data has it in another format: "dd/mm/yyyy".

I read online and on other Stack Overflow answers that one can SET the datestyle to be different, but it's not recommended.

Is there a way to specify the format of the columns to import? Also, I do not need to import all columns from the csv file: can I leave some out?
Details

First, I wrote the code to create the table (sorry if column names are in Italian, but it's not important):

CREATE TABLE IF NOT EXISTS bikes (
    bici INT,
    tipo_bici VARCHAR(20),
    cliente_anonimizzato INT,
    data_riferimento_prelievo DATE,
    data_prelievo TIMESTAMP,
    numero_stazione_prelievo INT,
    nome_stazione_prelievo TEXT,
    slot_prelievo SMALLINT,
    data_riferimento_restituzione DATE,
    data_restituzione TIMESTAMP,
    numero_stazione_restituzione INT,
    nome_stazione_restituzione TEXT,
    slot_restituzione SMALLINT,
    durata VARCHAR(10),
    distanza_totale REAL,
    co2_evitata REAL,
    calorie_consumate REAL,
    penalità CHAR(2)
);


Then I add the query to copy data into the table:

COPY bikes(
    bici,
    tipo_bici,
    cliente_anonimizzato,
    data_riferimento_prelievo,
    data_prelievo,
    numero_stazione_prelievo,
    nome_stazione_prelievo,
    slot_prelievo,
    data_riferimento_restituzione,
    data_restituzione,
    numero_stazione_restituzione,
    nome_stazione_restituzione,
    slot_restituzione,
    durata,
    distanza_totale,
    co2_evitata,
    calorie_consumate,
    penalità
)
FROM '/Users/luca/tesi/data/2019q3.csv'
DELIMITER ','
CSV HEADER;


The code seems fine, except the following error pops up:
`ERROR: date/time field value out of range: "31/07/2019"
HINT: Perhaps you need a different "datestyle" settin

Solution

Set datestyle to ISO, DMY, and your dates will be parsed as you want. There is nothing wrong with setting that parameter - do it with SET right before you COPY.

There is no way to skip columns from the CSV file. Add extra columns to the table and drop them later, that is cheap.

Context

StackExchange Database Administrators Q#290762, answer score: 4

Revisions (0)

No revisions yet.