Change charset for all tables.

A lot of linux distributions comes with mysql with default charset latin1. In modern world it seems to be rudimentary, for i18n utf8 is a must. So converting existing mysql databases/tables to utf8 is not an uncommon task. There’s no problem with it, mysql has ALTER TABLE SET CHARSET statement. But what if you have several dozens of tables? And what if each table has a lot of text fields? You’ll spend a lot of time in front of mysql console or phpmyadmin. Or you’ll have to generate script from table/field names.
I offer simple solution when all your data is english (I can’t predict behaviour if you have other data).

  1. Make mysqldump of your entire database
  2. Change all charset definitions sed -i 's/latin1/utf8/' dumpfile.sql
  3. Load dump back to database

Monday, September 24th, 2007

Parsing large mysql dumps

Recently I faced a problem – I had large mysql dump on remote machine and I needed just some tables from it. Dump file was about 900 Mb, bzip’ed – around 60 Mb. I didn’t want to download it just for about 1 Mb of required data. First idea that came into my head – load this dump to the database on remote machine and then dump again only required tables. The problem that I had no possibility to do that – I had just ftp/ssh access. So, I opened this ~1Gb sql file with vi. It opened it even with syntax highlighting! But when I tried to remove 2 millions of unnecessary lines it hanged forever… Emacs wasn’t better.
(more…)

Saturday, July 7th, 2007