Saturday, March 8, 2014

Postgres en_US - UTF 8 Collation on Windows

I was restoring a schema today on a stage server and ran into a collation error. The source of schema backup was Postgres 9.1, CentOS and the destination was Postgresq 9.1, windows server 2008 R2.

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2125939; 1259 50334466 INDEX ix_rd_vendorlist_facts_defaultsort_en_us
pg_restore: [archiver (db)] could not execute query: ERROR: collation "public.en_US" for encoding "UTF8" does not exist

After reading a related stackoverflow thread I realized en_US locale is Linux specific and the windows equivalent is 'English_United States.1252'. I created these collations on template database and the user database I was trying to restore this schema on and the restore finally worked.

CREATE COLLATION pg_catalog."en_US" (
 LC_COLLATE = 'English_United States.1252'
 ,LC_CTYPE = 'English_United States.1252'
 );

CREATE COLLATION pg_catalog."fr_CA" (
 LC_COLLATE = 'French_Canada.1252'
 ,LC_CTYPE = 'French_Canada.1252'
 );

CREATE COLLATION pg_catalog."pt_BR" (
 LC_COLLATE = 'Portuguese_Brazil.1252'
 ,LC_CTYPE = 'Portuguese_Brazil.1252'
 );

-- add public collations
CREATE COLLATION PUBLIC."en_US" (
 LC_COLLATE = 'English_United States.1252'
 ,LC_CTYPE = 'English_United States.1252'
 );

CREATE COLLATION PUBLIC."fr_CA" (
 LC_COLLATE = 'French_Canada.1252'
 ,LC_CTYPE = 'French_Canada.1252'
 );

CREATE COLLATION PUBLIC."pt_BR" (
 LC_COLLATE = 'Portuguese_Brazil.1252'
 ,LC_CTYPE = 'Portuguese_Brazil.1252'
 );