debugMinor
Unicode error with \u0000 on COPY of large JSON file into Postgres
Viewed 0 times
errorfilepostgreswithintounicodelargejsonu0000copy
Problem
I'm using Postgres 9.4 on Ubuntu 14.04.
I have a 30GB JSON file that I'm trying to COPY into Postgres.
But I keep getting the following error:
I tried searching for the offending Unicode string to remove it, but because it's \u0000, which is NULL, I can't seem to get a regex to work. And when I print it into console, I have no idea where the NULL is, because (at least my guess is) it seems to print as nothing (though I have no idea what it is actually doing).
Is there a way to skip these errors?
Alternatively, how could I replace any instances of that Unicode in my JSON file?
I have a 30GB JSON file that I'm trying to COPY into Postgres.
But I keep getting the following error:
COPY (comment_jsonb) FROM '' WITH (format csv, quote e'\x01', delimiter e'\x02', escape e'\x01');
ERROR: unsupported Unicode escape sequence
DETAIL: \u0000 cannot be converted to text.I tried searching for the offending Unicode string to remove it, but because it's \u0000, which is NULL, I can't seem to get a regex to work. And when I print it into console, I have no idea where the NULL is, because (at least my guess is) it seems to print as nothing (though I have no idea what it is actually doing).
Is there a way to skip these errors?
Alternatively, how could I replace any instances of that Unicode in my JSON file?
Solution
In a (perhaps) similar case I had been confronted with \u0000 in a string I needed to process as JSON. The follow replace worked for me:
Hope this helps.
regexp_replace(stringWithNull, '\\u0000', '', 'g')Hope this helps.
Code Snippets
regexp_replace(stringWithNull, '\\u0000', '', 'g')Context
StackExchange Database Administrators Q#115029, answer score: 5
Revisions (0)
No revisions yet.