Copy data between Redshift databases (same or remote servers)

My challenge was to copy data from one Redshift database to another, in this case on the same server instance. This solution works well for remote instances, too.

I tried to find a native ‘sql’ solution to the problem and found some tips of using dblink, but they did not work with Redshift.

I then found the solution on the website http://optimalbi.com/blog/2014/04/23/moving-redshift-data-between-databases/, but I did not look into it at first as involving an S3 related command did not seem to make sense for a native Sql approach mindset that I had at that time.

What the article tells you to do is to ‘temp-database-ish’ perform an UNLOAD sql query against redshift (SELECT X FROM Y WHERE Z>1), and instead of getting the results as a SQL query, store the result in S3. This is very fast even with big databases. Next you perform a COPY sql statement against S3, loading the same result into the target Redshift database in question.

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s