HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Backup a sample of a PostgreSQL database

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqldatabasesamplebackup

Problem

Is it possible to back up a PostgreSQL database with only a fraction of the data (1000 Rows from each table)?

I am able to back up that database using pg_dump. In some case I need db with minimum data.

pg_dump --host=localhost --port=5432 --username=postgres --password 
        --column-inserts --schema=test  testdb > test_backup.sql


How can I modify the above command to take the back up with 1000 number of data??

Solution

I had a similar problem, I wanted to copy the most recent rows from a few tables from one db to another hosted on a different server, and I ended up writing a bash script that executes the pg_dump command followed by various psql commands:

#!/bin/bash
declare -a arr=("my_table_1" "my_table_2" "my_table_3")
startdate=2014-12-01
enddate=2014-12-30
for table in  "${arr[@]}"
do
    echo -e \\n$table\\n$(for each in $(seq 1 ${#table}); do printf "-"; done)
    pg_dump -h host_name_1 --schema-only -t $table db_name_1 -U my_username -c | psql -h host_name_2 -U my_username db_name_2
    psql -h host_name_1 -U my_username db_name_1 -c "\copy (select * from $table where date >='$startdate' and date <='$enddate' order by date desc) to '/tmp/data.csv' csv header;"
    psql -h host_name_2 -U my_username db_name_2 -c "\copy $table from '/tmp/data.csv' csv header"
done


Line by line in the body of the loop:

  • the echo is just prettyfication: it prints the name of the table and underlines it like a title,



  • the pg_dump command generates some SQL to copy the table schema from the source and performs a few more SQL housekeeping commands (e.g. the -c flag means that the SQL will contain a command to delete the table first before creating it), and this SQL is piped with | to psql pointing at the destination (you can try running just the pg_dump part at the comand line, pg_dump -h host_name_1 --schema-only -t $table db_name_1 -U my_username -c, and see the SQL it generates),



  • the psql on the next line copies a table in the source db into a csv file,



  • the final psql copies from that csv file into a table in the destination db.

Code Snippets

#!/bin/bash
declare -a arr=("my_table_1" "my_table_2" "my_table_3")
startdate=2014-12-01
enddate=2014-12-30
for table in  "${arr[@]}"
do
    echo -e \\n$table\\n$(for each in $(seq 1 ${#table}); do printf "-"; done)
    pg_dump -h host_name_1 --schema-only -t $table db_name_1 -U my_username -c | psql -h host_name_2 -U my_username db_name_2
    psql -h host_name_1 -U my_username db_name_1 -c "\copy (select * from $table where date >='$startdate' and date <='$enddate' order by date desc) to '/tmp/data.csv' csv header;"
    psql -h host_name_2 -U my_username db_name_2 -c "\copy $table from '/tmp/data.csv' csv header"
done

Context

StackExchange Database Administrators Q#30869, answer score: 5

Revisions (0)

No revisions yet.