snippetMinor
How can I load csv data into mongodb and choose the types I'd like to have created for each column?
Viewed 0 times
typescanthehoweachcolumncreatedintocsvmongodb
Problem
I'm looking for a method to load csv data in mongodb and specify the types that I'd like to have created for each column?
I've been using mongoimport but I find it quite painful because there doesn't appear to be any way to specify the datatype to be used for a particular column in the csv file.
I load the data ...
Then I do analysis to see what types mongo actually created.
Then I do surgery to correct errors like so ...
This is tedious.
It would be great if there were a method that allowed me to specify at import time what type to create for each column in the csv file like so:
http://docs.mongodb.org/manual/reference/operator/query/type/
I've been using mongoimport but I find it quite painful because there doesn't appear to be any way to specify the datatype to be used for a particular column in the csv file.
I load the data ...
mongoimport --drop -d myDatabase -c myCollection --type csv --fields name --file myFile.csvThen I do analysis to see what types mongo actually created.
> db.artist.itunes.feed.count()
36545
> db.artist.itunes.feed.count( { 'name' : { $type : 2 } } )
36511
> db.artist.itunes.feed.count( { 'name' : { $type : 1 } } )
1
> db.artist.itunes.feed.count( { 'name' : { $type : 16 } } )
33Then I do surgery to correct errors like so ...
db.artist.itunes.feed.find( { 'name' : { $type : 1 } } ).forEach( function (x) {
x.name = new String(x.name); // convert field to string
db.artist.itunes.feed.save(x);
});This is tedious.
It would be great if there were a method that allowed me to specify at import time what type to create for each column in the csv file like so:
mongoimport --drop -d myDatabase -c myCollection --type csv --fields field1,field2 --types 2,2 --file myFile.csvhttp://docs.mongodb.org/manual/reference/operator/query/type/
Solution
Ensure you're using
See the MongoDB Reference Documentation for details.
The following types can be specified:
Consider the following CSV data:
Katherine Gray, 1996-02-03, F, 1235, TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2NpbmcgZWxpdCwgc2VkIGRvIGVpdXNtb2QgdGVtcG9yIGluY2lkaWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWduYSBhbGlxdWEuIFV0IGVuaW0gYWQgbWluaW0gdmVuaWFtLCBxdWlzIG5vc3RydWQgZXhlcmNpdGF0aW9uIHVsbGFtY28gbGFib3JpcyBuaXNpIHV0IGFsaXF1aXAgZXggZWEgY29tbW9kbyBjb25zZXF1YXQuIER1aXMgYXV0ZSBpcnVyZSBkb2xvciBpbiByZXByZWhlbmRlcml0IGluIHZvbHVwdGF0ZSB2ZWxpdCBlc3NlIGNpbGx1bSBkb2xvcmUgZXUgZnVnaWF0IG51bGxhIHBhcmlhdHVyLiBFeGNlcHRldXIgc2ludCBvY2NhZWNhdCBjdXBpZGF0YXQgbm9uIHByb2lkZW50LCBzdW50IGluIGN1bHBhIHF1aSBvZmZpY2lhIGRlc2VydW50IG1vbGxpdCBhbmltIGlkIGVzdCBsYWJvcnVtLg==
Albert Gilbert, 1992-04-24, T, 13, Q3VwY2FrZSBpcHN1bSBkb2xvciBzaXQgYW1ldCB0b290c2llIHJvbGwgYm9uYm9uIHRvZmZlZS4gQ2FuZHkgY2FuZXMgcGllIGNyb2lzc2FudCBjaG9jb2xhdGUgYmFyIGxvbGxpcG9wIGJlYXIgY2xhdyBtYWNhcm9vbi4gU3dlZXQgcm9sbCBjdXBjYWtlIGNoZWVzZWNha2Ugc291ZmZsw6kgYnJvd25pZSBpY2UgY3JlYW0uIEp1anViZXMgY2FrZSBjdXBjYWtlIG1hY2Fyb29uIGRhbmlzaCBqZWxseS1vIHNvdWZmbMOpLiBDYWtlIGFwcGxlIHBpZSBnaW5nZXJicmVhZCBjaG9jb2xhdGUgc3VnYXIgcGx1bS4gU3dlZXQgY2hvY29sYXRlIGNha2UgY2hvY29sYXRlIGNha2UganVqdWJlcyB0aXJhbWlzdSBvYXQgY2FrZS4gU3dlZXQgc291ZmZsw6kgY2hvY29sYXRlLiBMaXF1b3JpY2UgY290dG9uIGNhbmR5IGNob2NvbGF0ZSBtYXJzaG1hbGxvdy4gSmVsbHkgY29va2llIGNha2UgamVsbHkgYm
The
mongoimport --db users --collection contacts --type csv
--columnsHaveTypes --fields "name.string(),birthdate.date(2006-01-02),contacted.boolean(),followerCount.int32(),user thumbnail.binary(base64)"
--file /example/file.csv
mongoimport version 3.4 or higher, and use the --columnsHaveTypes parameter to tell the import what type each column is.See the MongoDB Reference Documentation for details.
The following types can be specified:
auto()
binary()
boolean()
date()
date_go()
date_ms()
date_oracle()
decimal()
double()
int32()
int64()
string()Consider the following CSV data:
Katherine Gray, 1996-02-03, F, 1235, TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2NpbmcgZWxpdCwgc2VkIGRvIGVpdXNtb2QgdGVtcG9yIGluY2lkaWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWduYSBhbGlxdWEuIFV0IGVuaW0gYWQgbWluaW0gdmVuaWFtLCBxdWlzIG5vc3RydWQgZXhlcmNpdGF0aW9uIHVsbGFtY28gbGFib3JpcyBuaXNpIHV0IGFsaXF1aXAgZXggZWEgY29tbW9kbyBjb25zZXF1YXQuIER1aXMgYXV0ZSBpcnVyZSBkb2xvciBpbiByZXByZWhlbmRlcml0IGluIHZvbHVwdGF0ZSB2ZWxpdCBlc3NlIGNpbGx1bSBkb2xvcmUgZXUgZnVnaWF0IG51bGxhIHBhcmlhdHVyLiBFeGNlcHRldXIgc2ludCBvY2NhZWNhdCBjdXBpZGF0YXQgbm9uIHByb2lkZW50LCBzdW50IGluIGN1bHBhIHF1aSBvZmZpY2lhIGRlc2VydW50IG1vbGxpdCBhbmltIGlkIGVzdCBsYWJvcnVtLg==
Albert Gilbert, 1992-04-24, T, 13, Q3VwY2FrZSBpcHN1bSBkb2xvciBzaXQgYW1ldCB0b290c2llIHJvbGwgYm9uYm9uIHRvZmZlZS4gQ2FuZHkgY2FuZXMgcGllIGNyb2lzc2FudCBjaG9jb2xhdGUgYmFyIGxvbGxpcG9wIGJlYXIgY2xhdyBtYWNhcm9vbi4gU3dlZXQgcm9sbCBjdXBjYWtlIGNoZWVzZWNha2Ugc291ZmZsw6kgYnJvd25pZSBpY2UgY3JlYW0uIEp1anViZXMgY2FrZSBjdXBjYWtlIG1hY2Fyb29uIGRhbmlzaCBqZWxseS1vIHNvdWZmbMOpLiBDYWtlIGFwcGxlIHBpZSBnaW5nZXJicmVhZCBjaG9jb2xhdGUgc3VnYXIgcGx1bS4gU3dlZXQgY2hvY29sYXRlIGNha2UgY2hvY29sYXRlIGNha2UganVqdWJlcyB0aXJhbWlzdSBvYXQgY2FrZS4gU3dlZXQgc291ZmZsw6kgY2hvY29sYXRlLiBMaXF1b3JpY2UgY290dG9uIGNhbmR5IGNob2NvbGF0ZSBtYXJzaG1hbGxvdy4gSmVsbHkgY29va2llIGNha2UgamVsbHkgYm
The
--fields option specifies which field type mongoimport will use when importing the data into MongoDB (command-line wrapped for readability):mongoimport --db users --collection contacts --type csv
--columnsHaveTypes --fields "name.string(),birthdate.date(2006-01-02),contacted.boolean(),followerCount.int32(),user thumbnail.binary(base64)"
--file /example/file.csv
Code Snippets
auto()
binary(<arg>)
boolean()
date(<arg>)
date_go(<arg>)
date_ms(<arg>)
date_oracle(<arg>)
decimal()
double()
int32()
int64()
string()Context
StackExchange Database Administrators Q#77135, answer score: 4
Revisions (0)
No revisions yet.