HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Import in postgres json data in a csv file

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
filepostgrescsvjsondataimport

Problem

I'm trying to import into postgres a csv file containing the data for a table. One of the column of the table has jsonb type.

One line of my csv file contains something like

1,{"a":"b"}


Suppose the table has a schema

id              | smallint          | 
data            | jsonb             |


If I try just to insert the data, everything works fine

INSERT INTO table VALUES (1, '{"a":"b"}');


Trying to import directly from the file with

COPY table FROM '/path/to/file.csv' DELIMITER ',' csv;


gives me the following error:

ERROR:  invalid input syntax for type json
DETAIL:  Token "a" is invalid.
CONTEXT:  JSON data, line 1: {a...
COPY availability, line 1, column services: "{a: b}"


I tried to quote the fields with ', with ", with \" and \', but nothing works..

Which is the correct syntax do do it?

Solution

The PostgreSQL COPY command is seldom ideal, but it often works. For reference there are better methods to figure this out than guessing.

CREATE TEMP TABLE baz AS
  SELECT 1::int, '{"a":"b"}'::jsonb;


This is your exact sample data. Now we can test different settings..

# COPY baz TO STDOUT;
1   {"a": "b"}

COPY baz TO STDOUT DELIMITER ',';
1,{"a": "b"}


You'll see that the above generates the exact data your questioning...

COPY baz TO '/tmp/data.csv' DELIMITER ',';


There is no problem. At least not with PostgreSQL 9.5.

CSV Mode

So where is your problem, it's with CSV-mode. Observe,

# COPY baz TO STDOUT;
1   {"a": "b"}
# COPY baz TO STDOUT CSV;
1,"{""a"": ""b""}"


You can see these two are different now. Let's try to load the non-CSV file in CSV mode which assumes the format that CSV mode generated above.

TRUNCATE baz;
COPY baz FROM '/tmp/data.csv' DELIMITER ',' CSV;
ERROR:  invalid input syntax for type json
DETAIL:  Token "a" is invalid.
CONTEXT:  JSON data, line 1: {a...
COPY baz, line 1, column jsonb: "{a: b}"


Now we error. The reason for that comes from RFC 4180


Each field may or may not be enclosed in double quotes (however
some programs, such as Microsoft Excel, do not use double quotes
at all). If fields are not enclosed with double quotes, then
double quotes may not appear inside the fields.

  • So JSON RFC 4627 specifies an object's names in name/value pairs must be strings which require double quotes.



  • And CSV RFC 4180 specifies that if any double quotes are inside the field, then the whole field must be quoted.



At this point you have two options..

  • Don't use CSV mode.



  • Or, Escape the inner quotes.



So these would be valid inputs under the same options in CSV mode.

#COPY baz TO STDOUT DELIMITER ',' CSV ESCAPE E'\\';
1,"{\"a\": \"b\"}"

# COPY baz TO STDOUT DELIMITER ',' CSV;
1,"{""a"": ""b""}"

Code Snippets

CREATE TEMP TABLE baz AS
  SELECT 1::int, '{"a":"b"}'::jsonb;
# COPY baz TO STDOUT;
1   {"a": "b"}

COPY baz TO STDOUT DELIMITER ',';
1,{"a": "b"}
COPY baz TO '/tmp/data.csv' DELIMITER ',';
# COPY baz TO STDOUT;
1   {"a": "b"}
# COPY baz TO STDOUT CSV;
1,"{""a"": ""b""}"
TRUNCATE baz;
COPY baz FROM '/tmp/data.csv' DELIMITER ',' CSV;
ERROR:  invalid input syntax for type json
DETAIL:  Token "a" is invalid.
CONTEXT:  JSON data, line 1: {a...
COPY baz, line 1, column jsonb: "{a: b}"

Context

StackExchange Database Administrators Q#156372, answer score: 8

Revisions (0)

No revisions yet.