Changing database encoding from latin1 to UTF8
Nowadays UTF-8 is the most widely used text encoding, and a database that doesn't use it gets really annoying, especially when you have to integrate different systems that don't share one unified encoding.
So if you think it's time to move your data to UTF-8, that's what this post is all about.
I'll list the steps below. One clarification first: the data here is really windows-1256 (the main Arabic encoding used in web applications), but it is stored in the database as latin1 (my data -> windows-1256 -> latin1). Also note that I'm using a MySQL database.
Here are the steps:
- Export only the schema of the db, without the "SET NAMES" statement in the output SQL file. This way the dump comes out in the original encoding (windows-1256).
- Export the data of the db, again without the "SET NAMES" statement in the output SQL file. This gives you the data back in the original encoding (windows-1256).
- Change the encoding of both files from Arabic (windows-1256) to UTF-8, for example with iconv (check the notes if you are using Windows).
- Open the file 'db_name_schema.sql' in any editor and replace every "DEFAULT CHARSET=latin1" with "DEFAULT CHARSET=utf8".
- Create a new db, encoded in UTF-8.
- Import the schema and the data in UTF-8 encoding.
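The steps above can be sketched as a small shell script. This is only a rough outline, not necessarily the exact commands used for this post: the new database name, the data file name and the MySQL user are made-up placeholders, and it assumes mysqldump, mysql and iconv are installed.

```shell
#!/bin/sh
# Sketch only: the values below are placeholders, adjust them to your setup.
OLD_DB="db_name"          # existing database whose tables are declared latin1
NEW_DB="db_name_utf8"     # hypothetical name for the new UTF-8 database
DB_USER="root"            # assumed MySQL account (-p prompts for its password)

# Steps 1 and 2: dump schema and data into separate files.
# --skip-set-charset leaves "SET NAMES" out of the dump, and
# --default-character-set=latin1 keeps the stored bytes from being converted.
export_db() {
    mysqldump -u "$DB_USER" -p --default-character-set=latin1 \
        --skip-set-charset --no-data "$OLD_DB" > "${OLD_DB}_schema.sql"
    mysqldump -u "$DB_USER" -p --default-character-set=latin1 \
        --skip-set-charset --no-create-info "$OLD_DB" > "${OLD_DB}_data.sql"
}

# Step 3: re-encode one dump file from windows-1256 to UTF-8 in place.
to_utf8() {
    iconv -f WINDOWS-1256 -t UTF-8 "$1" > "$1.utf8" && mv "$1.utf8" "$1"
}

# Steps 5 and 6: create the UTF-8 database, then load schema and data into it.
import_db() {
    mysql -u "$DB_USER" -p \
        -e "CREATE DATABASE $NEW_DB CHARACTER SET utf8 COLLATE utf8_general_ci"
    mysql -u "$DB_USER" -p --default-character-set=utf8 "$NEW_DB" < "${OLD_DB}_schema.sql"
    mysql -u "$DB_USER" -p --default-character-set=utf8 "$NEW_DB" < "${OLD_DB}_data.sql"
}
```

A run would then be: `export_db`, `to_utf8 db_name_schema.sql`, `to_utf8 db_name_data.sql`, the "DEFAULT CHARSET" replacement in the schema file, and finally `import_db`. Try it on a copy of the database first.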
Notes
- If you are wondering why the schema is exported separately from the data: replacing "DEFAULT CHARSET=latin1" with "DEFAULT CHARSET=utf8" only has to happen in the schema file, so keeping the two apart means you don't get stuck loading the big data file into an editor.
- If you are a Windows user and can't use iconv, you can use any editor to do the job for you. Try SciTE, Notepad++ or even Dreamweaver.
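If you would rather not open the schema file in an editor at all, the "DEFAULT CHARSET" replacement can also be scripted. A minimal sketch, assuming sed is available; the function name is made up for this example:

```shell
#!/bin/sh
# Replace the table charset in a schema dump, in place.
# sed keeps the untouched original next to it as <file>.bak.
fix_schema_charset() {
    sed -i.bak 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' "$1"
}
```

Call it as `fix_schema_charset db_name_schema.sql`, then have a quick look at the result before importing it.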
enjoy!!!