Chinese Number Tool

March 30th, 2006 by Mark

Last week, John suggested to me that a tool for converting big Chinese numbers into Arabic numerals could be useful for a lot of beginning students. I completely agree, so I’ve written a Chinese Number Tool. It works a bit differently from most Chinese number converters, so it needs some explanation.

The primary purpose of this tool is to help foreigners reading online newspapers who come across big numbers such as 百萬 or 120億. Anything to the left of a 萬 (万) or an 億 (亿), whether written in Chinese characters or Arabic digits, is interpreted as a number. Chinese conventions for 零 and for omitting trailing 千s, 百s, and 十s are supported. The tool also accepts a wide variety of input that I haven’t seen handled by other similar Chinese math tools. Here are some examples:

Input Output
五千八百 5800
兩百五十 250
三百萬 3000000
三百万 3000000
7十 70
九億 900000000
二十万零三点一八 200003.18
6.25亿 625000000
1500万 15000000
五億三千九百二十萬四千四百四十一 539204441
壹仟叄佰柒拾捌 1378
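
To give a feel for what a conversion like the ones in the table involves, here is a minimal Python sketch. It is not the tool’s actual code (the tool itself is written in JavaScript); the function name `parse_chinese_integer` and the lookup tables are assumptions made for illustration. It only covers plain Chinese integers, including 零, and does not handle Arabic digits, decimals, 大写 characters, omitted trailing units, or compounds such as 萬億.

```python
# Hypothetical lookup tables; not taken from the tool's source.
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "兩": 2, "两": 2, "三": 3,
          "四": 4, "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"萬": 10**4, "万": 10**4, "億": 10**8, "亿": 10**8}


def parse_chinese_integer(text):
    """Convert a plain Chinese integer such as 三百萬 to an int."""
    total = 0    # sum of the completed 萬/億 sections
    section = 0  # running value of the section currently being read
    digit = 0    # last digit seen, not yet attached to a unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare 十/百/千 (as in 百萬) counts as one of that unit.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            section += digit
            # A bare 萬/億 counts as one of that unit.
            total += (section or 1) * BIG_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


print(parse_chinese_integer("五億三千九百二十萬四千四百四十一"))  # 539204441
```

Keeping a separate running value for the current section is what lets 零 work without special handling: in 四百零一 the 零 simply resets the pending digit before 一 is read.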

I think the tool is pretty much what you’d expect. If you see buggy behavior, let me know. Tell me what kinds of numbers you come across that are inconvenient to convert and I’ll see if I can add support for them. Suggestions are welcome.

Updates:

  • As per Micah’s suggestion, the tool now supports 大写 numbers.
  • Conversions can have commas added automatically.
  • Decimals are also now supported.
  • A clickable interface has been added so users can input Chinese numbers without an IME.

35 Responses to “Chinese Number Tool”

  1. 1 Micah Says:

    Support 大写 numbers.

  2. 2 Kanwa-kyudai Says:

    Mark-san,

    I have just tried the tool now. The convention for 零 seems to be supported. My results are as follows:

    四百零一   401
    四千零一 4001
    四万零一 40001
    四十万零一 400001
    四百万零一 4000001
    四千万零一 40000001
    四億零五 400000005
    四億零五十五 400000055
    一百零五万零五百零一 1050501

    If this important rule was not supported, the tool would be completely useless. And I can not think of the reason for not supporting the rule of omitted numbers that is as important as 零.

  3. 3 Mark Says:

    Micah, I’ve added support for 壹, 貳, 贰, 叄, 叁, 肆, 伍, 陸, 柒, 捌, 玖, 拾, 佰, 仟 as well as the short-hand 〇.

    Kanwa-kyudai-san, I didn’t add support for omitted numbers yet because it is comparatively tedious to program, and (I think) less useful for foreigners learning Chinese. At least for myself, I never had problems quickly understanding numbers like “五百八”. For me the hard part was when I saw things like 6.2億, or 750萬. For many westerners, it’s hard to quickly recognize numbers based on 10^4 and 10^8 instead of 10^3, 10^6 and 10^9 like we’re used to.

    I’ll add support for omitted numbers, but I’m going to have to think of a way of doing so that won’t interfere with the properties of this tool that make it useful for newspaper numbers like those I mentioned above. I still want to make sure floating point numbers such as 13.5 work before 萬 and 億.
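
The difference between the two groupings can be made concrete with a small Python sketch. The helper `group_digits` is hypothetical, written only for this illustration. Splitting a digit string into groups of four lines up with 萬 and 億, while groups of three line up with thousands and millions.

```python
def group_digits(n, size):
    """Split a non-negative integer into digit groups, counted from the right."""
    digits = str(n)
    groups = []
    while digits:
        groups.insert(0, digits[-size:])  # take the last `size` digits
        digits = digits[:-size]           # and drop them from the string
    return groups


print(group_digits(7500000, 3))    # ['7', '500', '000']   -> 7.5 million
print(group_digits(7500000, 4))    # ['750', '0000']       -> 750萬
print(group_digits(620000000, 4))  # ['6', '2000', '0000'] -> 6億2000萬, i.e. 6.2億
```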

  4. 4 Mark Says:

    Omitted numbers are now supported, for the most part.

  5. 5 Kanwa-kyudai Says:

    Mark-san,

    I did not know it was you who wrote the program of the tool. I was careless not to notice it when I first read your post. I thought such an improvement was relatively easy because I do not have much knowledge of programming. As far as I have tried, there seem to be no major problems now. I want to say, “You are great!”

    一百零五万零五百一 1050510
    四億五 450000000

    It might be more helpful for everyone if the tool returned, for example, 450,000,000 instead of 450000000. Sorry, I am too demanding.

    The Chinese, Korean, and Japanese also have trouble reading English numbers, just as you cannot quickly recognize numbers in Chinese. It surely is a tough job for me to “decode” a number like this: five hundred sixty-seven million eight hundred ninety thousand one hundred twenty-three. We need another tool… just a joke.

  6. 6 John B Says:

    Schweet! :)

    Though I must say while this is good when you’re first learning, I would recommend to any Chinese learner that you get a handle on numbers quick, because they’re really important :)

  7. 7 Mark Says:

    Kanwa-kyudai-san, That’s a great idea! I added an “add commas” button.
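
In Python, this kind of digit grouping is built into the format mini-language, so a comma option could be sketched as below. This is an illustration only: the tool itself is JavaScript, and `add_commas` is a made-up name.

```python
from decimal import Decimal


def add_commas(number):
    """Format an int or Decimal with a comma between every three digits."""
    return format(number, ",")


print(add_commas(450000000))
print(add_commas(Decimal("200003.18")))
```

Passing decimals as `Decimal` values rather than floats keeps the digits after the point exactly as they were entered.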

  8. 8 John Says:

    Whoa, you did it! Nice! I like it.

    You say “the largest supported number is 億 (亿)”. What does that mean exactly? I noticed that 十亿 and 四千亿 work fine, but 四白亿 and 四万亿 get killed by the script.

  9. 9 Mark Says:

    John, characters larger than 億/亿, such as 兆 (10^12), aren’t supported. 万亿 isn’t a valid number because it should be represented as 兆. I didn’t support 白 because it’s a color. :P

    Integers, floating point numbers, 千,百, and 十 can all be used in front of 億/亿.

  10. 10 John Says:

    Haha, I am so lame. I totally did not notice my typo.

    Regardless of 万亿’s “invalidity” on logical grounds, you still see it quite a bit. Especially since 兆 isn’t supported, it would be nice to support 万亿.

  11. 11 Mark Says:

    John, think nothing of it. Despite my previous Japanese study, I thought that the simplified 萬 (万) you wrote in your original email to me was 方. I’ll defer to your greater exposure to Chinese reading. If 萬億 is commonly used, then I’ll support it.

    I’ve added support for 兆, and any occurrences of 萬億 are converted to 兆 before conversion. That means you can get really funky and input things like “2.13萬億”, or even “貳.壹叄萬億” if you’re so inclined. Are there any other Chinese habits related to number usage you can think of that I should add support for?
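
The pre-processing step described here amounts to a string replacement before the number is scaled. A minimal Python sketch follows. It is not the tool’s actual JavaScript, the names `normalize` and `convert_newspaper_number` are invented for illustration, and it only handles an Arabic number followed by a single unit (so input like 貳.壹叄萬億 is out of scope).

```python
from decimal import Decimal

# Unit values as stated in the thread; 兆 is taken to be 10^12.
BIG_UNITS = {"萬": 10**4, "万": 10**4, "億": 10**8, "亿": 10**8, "兆": 10**12}


def normalize(text):
    """Rewrite 萬億 / 万亿 as 兆 so later steps see a single unit."""
    return text.replace("萬億", "兆").replace("万亿", "兆")


def convert_newspaper_number(text):
    """Convert input such as 2.13萬億 or 6.25亿 to an exact Decimal."""
    text = normalize(text)
    unit = text[-1]
    if unit not in BIG_UNITS:
        raise ValueError(f"expected a trailing unit, got {unit!r}")
    # Decimal keeps 2.13 exact, so no rounding is needed after scaling.
    return Decimal(text[:-1]) * BIG_UNITS[unit]


print(convert_newspaper_number("2.13萬億") == 2130000000000)  # True
```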

  12. 12 John Says:

    Mark,

    I can’t think of anything at the moment. In the comments of my post promoting your new tool, it has been suggested that the commas be added into the output by default. I think that’s a good idea.

  13. 13 John Says:

    I just did a Google search and discovered that the Chinese number tool idea has been done before. Here’s the top ranked one: http://www.mandarintools.com/numbers.html

    I think yours is better. That one doesn’t support decimals, only accepts Big5, and requires the page to reload. You might want to take a look if you’re still looking for ways to improve yours, though.

  14. 14 Mark Says:

    Thanks for the heads up, John. It’s clear to me that whoever made that tool didn’t have much input from guys like you or Micah. These are the results I got with that tool:

    Input Output
    五千八 Number value is 5008
    萬億 Number value is 0
    一萬零兩百 Number value is 10100

    That page probably has such a high page rank because it’s so old, and because of the explanation about Chinese numbers it has at the bottom.

    I do have another idea for improving the tool, though. Since some people may not know how to input Chinese characters, I’ve made an “Examples” button, and buttons for inputting each of the characters.

  15. 15 Kanwa-kyudai Says:

    Mark-san,

    Your “Examples” buttons are very useful. Please add “大写” buttons to make it perfect. Incidentally, do you have any plan to add the function for converting Arabic numbers into Chinese ones? It would be helpful for all the beginners of Chinese.

    Although I do not know if its programming is difficult or not, I imagine that you have already gotten enough know-how on conversion of numbers. This tool is the best in the world now, but it would be the best in the universe after the improvement.

    p.s. Sorry again that I am too demanding!

  16. 16 Kanwa-kyudai Says:

    Sorry. I have made a mistake in sending my last comment.

  17. 17 Mark Says:

    I deleted the parts you repeated. If it isn’t right, email me and I’ll fix it.

    I think your advice has been very helpful. I think I’ll add a button to convert digits into Chinese. It won’t be hard at all from a programming standpoint, but I’m not sure about how to arrange the buttons or what to name them. The layout of the tool is beginning to feel a little bit crowded.

  18. 18 Kanwa-kyudai Says:

    Mark-san,

    (A)
    Chinese to Arabic (commas) go!
    Chinese to Arabic (no commas) go!
    Arabic to Chinese (簡体字) go!
    Arabic to Chinese (繁体字) go!
    Clear

    (B)
    Chinese to Arabic (comma, no comma) go!
    Arabic to Chinese (簡体字,繁体字) go!
    Clear

    In case of (B), you can return the two results simultaneously as follows:

    123,456,789
    123456789

    (簡)一兆二亿三千四百五十六万七千八百九十九
    (繁)一兆二億三千四百五十六萬七千八百九十九

    But when you convert several numbers at the same time, you will find the results a little crowded. As for the name of this tool, I think “Chinese Number Tool” is still suitable. I am not confident about it, though. Anyway, I hope you will think of a better layout and name.

  19. 19 John Says:

    Here’s a number from a news article I just read that your tool currently can’t handle: 四点八二亿.

  20. 20 Gin Says:

    How about 四分之三亿, or 七十个亿?

  21. 21 Mark Says:

    Good idea, John. I added in conversions from “点” and “點” to “.” during pre-processing.

  22. 22 Gin Says:

    (1) The following examples still do not convert correctly: 七十四点二五亿, 二十万零三点一八….

    (2) It would be nice to display the original input above the result after the conversion.

    (3) A student must not rely on a converter so much as to neglect the practice of conversion by heart!

  23. 23 Mark Says:

    (1) Wow, Gin. I never even considered that sort of input. To be honest, I didn’t even know myself that one could use 二五 together (with no 十s, 百s, or 千s) after a 點. The problem is, since I’m dealing with large numbers and doing floating point conversions, there’s a little bit of error. Since I don’t want numbers like 700,000,002 returned when it should be 700,000,000, I round the result of each parsing of the number. 七十四点二五亿 is being turned into 75億 because of this issue.

    Initially, my goal was just to support floating point (Arabic) numbers since they often occur in newspapers. The kinds of numbers you’re writing sure won’t come up much. Still, it bothers me when apparently valid input breaks the tool. I’m thinking the best course of action might just be to return an error if people enter 點, unless I can think of a way to eliminate the chance of misleading feedback on valid input without greatly increasing the complexity of the tool. I really appreciate your testing feedback, though!

    Seeing X個萬/億 isn’t too uncommon, so I’ve changed the tool to parse out the 個s. I’m going to have to think about what to do about X分之Y, and the 點.

    (2) What are you thinking of? Maybe two separate text boxes?
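
The rounding described in (1) is needed because binary floating point cannot represent most decimal fractions exactly. One way to sidestep the issue, sketched here in Python under the assumption that the parsed pieces are available as decimal strings, is to do the scaling in exact decimal arithmetic. The function names are hypothetical, and this is not how the tool itself works.

```python
from decimal import Decimal


def scale_with_float(value_text, unit):
    """Scale using binary floats; the result may be slightly off."""
    return float(value_text) * unit


def scale_exactly(value_text, unit):
    """Scale using decimal arithmetic; no rounding step is needed."""
    return Decimal(value_text) * unit


# Compare the two approaches on the same input.
print(scale_with_float("4.35", 100))
print(scale_exactly("4.35", 100))

# The input from (1), scaled by 億 without losing the .25.
print(scale_exactly("74.25", 10**8))
```

With exact arithmetic there is nothing to round away, so a fraction such as the .25 in 七十四点二五亿 survives the multiplication.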

  24. 24 Gin Says:

    (1) Yes, strings after the decimal point are most frequently read without 十s, 百s, or 千s, e.g., π+40 = 四十三点一四一五九二六….

    (2) Something like “三点一四 = 3.14” or “三点一四 converts to 3.14” or
    “三点一四
    3.14”

  25. 25 Mark Says:

    Gin, check it out!

    二十万零三点一八 = 200,003.18
    七十四点二五亿 = 7,425,000,000
    一万零四十三点一四一五九二六 = 10,043.1415926

    You can now use numbers with 點 before 萬,億, and 兆. The tool does NOT support things like 一點八十 or other decimals before 十,百, and 千, though. Have you seen people write things like “2.6百”?
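
The digit-by-digit reading after 點 that Gin describes can be sketched in Python as follows. This is only an illustration under assumed names (`fraction_digits`, `split_decimal`), not the tool’s JavaScript. It splits the input and converts the fractional part, leaving the whole-number part to a separate parser.

```python
# Hypothetical table mapping Chinese digits to digit characters.
DIGITS = {"零": "0", "〇": "0", "一": "1", "二": "2", "三": "3", "四": "4",
          "五": "5", "六": "6", "七": "7", "八": "8", "九": "9"}
BIG_UNITS = "萬万億亿兆"


def fraction_digits(text):
    """Read the characters after the point one digit at a time."""
    # Units such as 十 are not in DIGITS, so 一點八十 raises KeyError.
    return "".join(DIGITS[ch] for ch in text)


def split_decimal(text):
    """Return (whole part, fraction digits, trailing unit) for the input."""
    text = text.replace("點", "点")
    unit = ""
    # Only peel off a trailing unit when it follows a decimal fraction.
    if "点" in text and text[-1] in BIG_UNITS:
        text, unit = text[:-1], text[-1]
    whole, _, fraction = text.partition("点")
    return whole, fraction_digits(fraction), unit


print(split_decimal("七十四点二五亿"))
```

A caller would then parse the whole part, append the fraction digits after a decimal point, and multiply by the unit if one was found.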

  26. 26 Gin Says:

    No, not 2.6百, nor 35千.
    You are beautiful, man.

  27. 27 Bill Shi Says:

    What about: 十之

    Should it be 10%?

    The tool can’t handle that.

    thx

  28. 28 Mark Says:

    Should it? I’ve never heard people say that before.

  29. 29 Erik Says:

    The converter at http://www.mandarintools.com/numbers.html did have many problems (thanks for the bug reports) but I’ve gone through and it should be fixed now. I’d appreciate feedback on how it currently works.

    Does the new tool handle negative numbers?

  30. 30 Mark Says:

    Wow, it’s a little bit of a surprise to hear from you. I’d thought you’d stopped working on your page. You’ve really improved your converter quite a bit, and done it in under a week. Way to go! I’m amazed, to be honest. Your converter still is generating some weird output, though. My advice is to go through the comments on this post and run all of them through your tool. After that, if you like, I’ll make a separate post about your converter and see if people will test it for you.

    It was a really hard choice for me NOT to make this tool in Perl as you did with yours. As I’ve said elsewhere, I don’t even know JavaScript, really. If you try putting multi-line input into my tool you’ll see that while it works in both IE and Mozilla, it doesn’t look quite as nice in IE, due to IE’s less than perfect JScript regex support. In the end, though, I decided to use JavaScript anyway, for performance reasons. I think your switch from pure CGI to AJAX was a huge improvement. It’s a good compromise. Another thing you did that I’ve been considering is to make separate text boxes for the input and the results.

    As for negative numbers, I didn’t think about them but it turns out that my tool does support them.

  31. 31 Erik Says:

    I found it wasn’t working with 四万亿 or 萬億 or with 个 added in but I’ve fixed that now. It seems to work as expected with the other numbers in the posts above. Could you say which are causing errors? A post asking others to test it would be much appreciated.

    AJAX is pretty cool. I’ll be updating a lot of my web pages to use it over the next few months. The reason I kept everything in Perl is to make it easier to reuse with my other projects. The main reason I use a separate results box that isn’t a text area is so that I can display the Chinese results using GIFs at some point for people without Chinese support on their computer.

    I tried negative numbers in your tool, using both - and 負, and neither seemed to work. I use Firefox 1.5.

  32. 32 Mark Says:

    Hmm… you must have tried negative 十, 百 or 千, then. The first time I checked, I just did a few negative floating point numbers, 負四 and 負万. I just checked the tool and those were the three that it didn’t work with. Thanks for the feedback. I’ll post about your tool on my blog.

  33. 33 b1lfwa Says:

    很不错哦,,支持呀,,,

  34. 34 Sinosplice: Life Says:

    Chinese Number Tool…

    A little while back I recommended that Mark of the blog Doubting to shuō make an online number conversion tool similar to his Pinyin Tone Tool and Cantonese Tone Tool. Well, Mark has done it. These are the kinds of conversions it can do:

    Input
    O…

  35. 35 Robert Says:

    Why bother to “convert” large numbers?

    I was doing some arithmetic with large numbers the other day. I found it easier to use four-digit “chunks” than three digit chunks. Why do you think the Chinese language splits up large numbers that way? It is just easier to think of numbers in the Chinese notation.
