CEDICT
Chinese–English dictionary
Content
CEDICT is a plain text file, so another program (a text editor such as Notepad, egrep, or an equivalent) is needed to search and display it. Several other Chinese-English projects use its data. The Unihan Database uses CEDICT data for most of its information about character compounds, but this data is auxiliary and is explicitly not part of the main Unicode database.[6]
Features:
- Traditional Chinese and Simplified Chinese
- Pinyin (several pronunciations)
- American English (several equivalents)
- As of 16 July 2025, it had 123,524 entries in UTF-8.[7]
The basic format of a CEDICT entry is:
Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/
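The entry format above can be split into its fields with a single regular expression. The following is a minimal sketch in Python, not part of the CEDICT project itself: the names `ENTRY_RE` and `parse_entry` are hypothetical, and the pattern assumes that every entry follows the basic format exactly (two space-separated headwords, bracketed pinyin, then slash-delimited equivalents).

```python
import re

# Hypothetical pattern for the basic format:
# Traditional Simplified [pin1 yin1] /equivalent 1/equivalent 2/
ENTRY_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_entry(line):
    """Return the fields of one entry line as a dict, or None if it does not match."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, body = match.groups()
    return {
        "traditional": traditional,
        "simplified": simplified,
        "pinyin": pinyin.split(),        # one syllable per list item
        "definitions": body.split("/"),  # equivalents are separated by slashes
    }


entry = parse_entry("漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/")
print(entry["simplified"], entry["pinyin"], entry["definitions"])
```

Lines that do not fit the pattern return `None` rather than raising an error, so a caller can skip them when reading a whole file.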
Example of a simple egrep search:
$ egrep -i 有勇無謀 cedict.txt
有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/
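The same kind of case-insensitive search can be sketched in Python for readers without egrep. This is an illustration only: `search_lines` and `search_file` are hypothetical helper names, and Python's `re` syntax is assumed to be close enough to egrep's extended regular expressions for simple patterns like the one above, although the two dialects are not identical.

```python
import re


def search_lines(lines, pattern):
    """Yield each line that matches the pattern, ignoring case (similar to egrep -i)."""
    regex = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        if regex.search(line):
            yield line.rstrip("\n")


def search_file(path, pattern):
    """Search a UTF-8 dictionary file; the caller supplies the path, e.g. cedict.txt."""
    with open(path, encoding="utf-8") as handle:
        return list(search_lines(handle, pattern))


# In-memory sample so the sketch runs without the dictionary file.
sample = [
    "有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/\n",
    "漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/\n",
]
for hit in search_lines(sample, "有勇無謀"):
    print(hit)
```

With a local copy of the dictionary, `search_file("cedict.txt", "有勇無謀")` would return the matching lines as a list.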
History
| Year | Event |
|---|---|
| 1991 | EDICT Japanese dictionary project was started by Jim Breen. |
| 1997 | CEDICT project started by Paul Denisowski, on the model of EDICT. Continued by Erik Peterson. |
| 2007 | MDBG started a new project called CC-CEDICT, which continues the CEDICT project under a new license, the Creative Commons Attribution-Share Alike 3.0 License, allowing more projects to use it.[8] A workflow was also set up to streamline submitting, reviewing, and processing new entries. |
Related projects
CEDICT has served as a model or data source for several other projects:
- HanDeDict (~149,000 Chinese entries) for German[9]
- CFDICT (~60,000 entries) for French[10]
- Some older CEDICT data is also found in the Adsotrans dictionary.
- ChE-DICC, a free Spanish-Chinese dictionary, started in February 2012 (described as being in beta)
- CHDICT (~19,000 entries) for Hungarian[11]
- CC-Canto is Pleco Software's addition of Cantonese language readings in Jyutping transcription to CC-CEDICT[12]
- Cantonese CEDICT features Cantonese language readings in Yale transcription and has Cantonese-specific words, many of which were taken from "A Dictionary of Cantonese Slang",[13] possibly infringing its copyright.[14]
- CC-CIDICT (over 124,000 entries) is a free, open-source Chinese–Indonesian dictionary. It is a direct derivative of the CC-CEDICT project. The project is community-managed and released under a Creative Commons Attribution–ShareAlike 4.0 International (CC BY-SA 4.0) license.[15]