Can Anyone Help me Save my Chinese Blog?
June 21st, 2007 by Mark
As you may have noticed, my Chinese blog has become completely wrecked over the last couple of weeks. It happened when I restored an SQL backup (made through phpmyadmin). 道聽塗說 became é “è ½å¡—èªª, and all the other Chinese characters (i.e. the whole blog) became similarly corrupted. I’ve tried deleting the whole database and creating a new one. It’s clear that the problem is related to SQL character set encodings, and so I’ve tried importing my .sql backup as a variety of character sets, including Latin-1 and UTF8, but to no avail.
However, I don’t think my blog is irretrievably lost. The Chinese in my .sql backup is readable in UTF8! If you know anything about SQL and have the inclination to help, I’ll get you a copy of it.
The file has to be viewed in UTF8 for the Chinese to display properly and it’s filled with DEFAULT CHARSET=latin1 commands. That’s probably what’s causing the problem, but I’m just not familiar enough with SQL to know how to repair it.
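For anyone curious how this kind of garbling can happen, here is a minimal Python sketch. It assumes, purely as a guess from the symptoms, that the UTF-8 bytes of the Chinese text were read as a single-byte Western encoding somewhere along the way. The function names garble and ungarble are made up for illustration. Python's latin-1 codec is used because it accepts every byte value; the curly quote and dash in the garbled text above suggest Windows-1252 instead, which differs from Latin-1 only for bytes 0x80 to 0x9F.

```python
# Illustration only: reproduce the kind of corruption described above.
# Assumption: UTF-8 bytes were misread as a single-byte Western encoding.

def garble(text: str) -> str:
    """Encode as UTF-8, then misread those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def ungarble(text: str) -> str:
    """Reverse the misreading, provided no bytes were lost along the way."""
    return text.encode("latin-1").decode("utf-8")


original = "道聽塗說"
garbled = garble(original)

# Each Chinese character becomes three separate "Western" characters.
print(repr(garbled))
print(ungarble(garbled))
```

If the data in the database really was damaged this way and nothing was dropped in the process, the round trip shows the damage is reversible in principle.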

June 21st, 2007 at 12:38 pm
Hi Mark, this is Sebastian. Faced similar problems when programming taiwan.roomsDB.net.
What you can try first is to put the following meta tag into the header of your blog’s .html files:
This will tell the browsers who read your blog that they should use UTF-8 to decode your texts.
If that doesn’t work, then it means that your database really did not correctly save your sql dump in UTF-8 and made some kind of conversion to its default character set, which in your case may be “latin”. In this case, you have to change your MySql’s default character set to UTF-8 Unicode (utf8). You can do this directly on the first page of your phpmyadmin. Well, I can do it there; if you cannot, that might mean you have a different version of phpmyadmin or your rights are not sufficient to change the default character set. However, you can try to change the default character set of the specific database that you use for your blog texts:
ALTER DATABASE db_name DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci
After you did that, check again. If it still doesn’t work it means that your “alter database” command does not change the data that is already in it. You then have to copy your sql dump into the database again.
Hope that helps! Bye, Sebastian
June 21st, 2007 at 12:41 pm
Seems like your comment function deleted the meta tag that I wrote. Here’s another shot; I replaced the “<” “>” with “{” “}”
{meta http-equiv="Content-Type" content="text/html; charset=utf-8"}
June 21st, 2007 at 2:22 pm
Bummer…Glad someone is trying to help. I’m afraid the only good I can do is pray! Hope you figure it out.
June 21st, 2007 at 9:45 pm
btw, your site still loads very very slowly.
June 21st, 2007 at 11:54 pm
Sebastian, thanks for the suggestions. Actually, both the Chinese blog and the English blog already have <meta> tags setting the character encoding to UTF8. The problem really is in the database import. BTW, you have to escape less than and greater than signs in html. I.e., write &lt; for less than signs and &gt; for greater than signs.
I guess I’ll keep plugging away and see if I come up with a way to import my backup. Any other advice would be welcome.
June 22nd, 2007 at 12:49 am
I had similar problems when my host provider forced me to move from MySql 4 to 5. I found that I needed to make two changes to get posts with Chinese characters to show up properly:
1. Assuming that you are using MySql 5 create the database using utf8_unicode_ci (Similar to what Sebastian suggested). Then, try to restore your data. I’m not sure what will happen if you try to change the collation after restoring.
2. Since you are using WordPress you need to make sure that it knows which character set to use when reading the posts from MySql. If you have WP 2.2 you can add the following lines to your wp-config.php file:
If you are using WP 2.1 or older you’ll need to modify the WP core files.
Add the following line:
mysql_query("SET NAMES 'utf8'") or die('SET NAMES failed');
right after the call to
mysql_connect(...)
in /wp-includes/wp-db.php. Put it right after the if statement that checks whether the connection was successful. No need to call SET NAMES if there is no connection.
Hope this helps.
June 22nd, 2007 at 2:18 am
I have to agree with PR, the front page of your blog takes a while to load. Hope you are able to figure out the coding part of things.
June 22nd, 2007 at 5:50 am
Shaun, that’s exactly what I did and you can see the results… The database is indeed using utf8_unicode_ci, but after restoring from that .sql file, everything is messed up. Inside the .sql file, there are commands to create the tables as latin1, though. I’ve tried using a text editor to change each instance of “latin1” into “utf8”, but importing the resulting .sql file failed. Any other ideas?
Range and PR, I’m not sure why it’s so slow. It may be because of my theme, but I sure can’t figure out where the bottleneck is. Other than some simple changes to the CSS and a couple of alterations to the comments to number them and make mine a separate color, it’s just plain old K2.
June 27th, 2007 at 4:17 am
The guy who runs World Learner Chinese is good at MySQL. He’s a cool dude. Drop by http://www.worldlearnerchinese.com . He is also in Taiwan.
June 28th, 2007 at 12:37 am
I’m sorry, but I think this has nothing to do with your import, but rather with your export. This file is corrupted …
I tried it, and repaired the SQL tables through this little trick here - http://www.filination.com/tech/2007/04/27/drupal-change-db-and-table-collation-to-unicode-utf-8/
but I already suspected the file doesn’t look right.
Do you have a gzipped SQL backup file that hasn’t been exported through phpmyadmin?
Since you only have a few posts, you can do what you already did before - use the Google Archive to restore everything. It will take an hour or so, but…
-
As for load time, use OctaGate (http://www.octagate.com/service/SiteTimer/) and Web Page Analyzer (http://www.websiteoptimization.com/services/analyze/) to see what’s taking so long.
The index page reports “” which is insane.
For example, you can see that “rolling archives” - whatever that is - is loading all your archives every single time, etc.
June 30th, 2007 at 9:55 am
Have you tried just removing the latin-1 “hints” from the script? You shouldn’t need them, as it should just take the database default if you don’t.
Otherwise, perhaps the “Utf8” tag is incorrect. I assume you’re using phpMyAdmin to do the import also?
I’ll look at the script today too, and try it out on my server…
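Removing the hints by hand is error-prone in a large dump, so here is a rough Python sketch of how it could be scripted. Everything in it is an assumption for illustration: the function names, the sample CREATE TABLE statement, and the idea that the hints always appear literally as DEFAULT CHARSET=latin1. It reads and writes the file as UTF-8 so that the Chinese text passes through untouched.

```python
import re

# Matches clauses such as " DEFAULT CHARSET=latin1" in CREATE TABLE statements.
# Caveat: it would also match the same literal text inside post content.
LATIN1_HINT = re.compile(r"\s*DEFAULT CHARSET=latin1\b", re.IGNORECASE)


def strip_latin1_hints(sql_text: str) -> str:
    """Return the dump text with the latin1 table-default clauses removed."""
    return LATIN1_HINT.sub("", sql_text)


def rewrite_dump(src_path: str, dst_path: str) -> None:
    """Read a dump as UTF-8, strip the hints, and write a new UTF-8 file."""
    with open(src_path, encoding="utf-8") as src:
        cleaned = strip_latin1_hints(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(cleaned)


# Made-up sample statement, not taken from the actual backup.
sample = (
    "CREATE TABLE wp_posts (\n"
    "  ID bigint(20) NOT NULL\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;\n"
)
print(strip_latin1_hints(sample))
```

This only changes the table defaults. It does nothing about the character set the import connection uses, which may turn out to be the real issue.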
July 1st, 2007 at 9:15 pm
I have tried removing all the “latin-1”s from the file, Stephan. The import fails. I also tried just doing a search and replace via a text editor to change all the “latin-1” tags into UTF8.
I really appreciate your suggestions, though. I’m really not much of an SQL guy at all.
July 2nd, 2007 at 5:17 am
Fili, your comment was caught by Akismet. That’s why I missed it. Thanks for the ideas. Unfortunately, my backups were created by Dreamhost via their panel. I have some of my own, but they’re really old and missing over a hundred comments. Maybe I’ll just start over.
Hmm… the rolling archives are part of the K2 theme. The strange thing is that all that stuff still loads when I switch back to my old theme. I think the drop-down archives in my sidebar may be the culprit!
July 2nd, 2007 at 1:12 pm
I’ve looked at the “backup”. Now I see that it wasn’t done by phpmyadmin. The latin-1 is only a “default” for the table, but it should be changed anyways. However, I think the REAL problem is that every data (insert) line is encoded with en-US.
Tomorrow is last day of finals. After that, I’m going to try and load directly with phpMyAdmin after changing the encoding of the inserts.
July 2nd, 2007 at 1:22 pm
Well, now I think I was wrong about the EN-US. I see that’s part of the browser ID that you’re storing.
July 3rd, 2007 at 2:51 pm
OK,
I’ve successfully imported the wp_comments table into my local MySql database using phpMyadmin.
Well, actually, I only imported the first couple hundred records. I’ve sent you a private email to your gmail account. Please read it as soon as you can.
What I ended up doing (so far) is probably overkill, so I’ve still got to reduce the number of steps. Basically, I made a new table (different encodings, etc.) to see if that would help. Using “Import”, it did NOT help, but if I run the sql Insert commands from the command window, they all work fine.
So, the file upload/import step is where the data gets corrupted.
I’m going to upload the file to the web site, and import direct from there next to see if that also allows a quick fix.
Do you have phpMyAdmin? Do you have shell access and ability to run mysql from the command line?
July 3rd, 2007 at 2:53 pm
I should mention, it’s “my web site” that I’m talking about in previous post. I don’t have access to yours.
July 16th, 2007 at 11:19 am
If you haven’t been able to fix this yet, zip your backup database sql file and post a link to it here, and I will update it so it will work for you.
July 16th, 2007 at 5:05 pm
I think Stephan has pretty much gotten it fixed for me. Thanks, everybody.