MeCab

Developers	Taku Kudo, Google Japanese Input project
Stable release	0.996 / 18 February 2013; 13 years ago
Written in	C++, has modules for C, C#, Java, Perl, Python, and Ruby
Platform	Cross-platform
License	Tri-licensed under GPL, LGPL and BSD licenses
Website	https://taku910.github.io/mecab
Repository	github.com/taku910/mecab

MeCab is an open-source text segmentation library for written Japanese. It was originally developed by the Nara Institute of Science and Technology and is maintained by Taku Kudo (工藤拓) as part of his work on the Google Japanese Input project.^[1]^[2] The name derives from the developer's favorite food, mekabu (和布蕪), a Japanese dish made from wakame leaves.


The software was originally based on ChaSen and was developed under the name ChaSenTNG. It has since been rewritten from scratch and is now developed independently of ChaSen. MeCab's analysis accuracy is comparable to ChaSen's, and it runs about 3–4 times faster.

MeCab segments a sentence into its component words and labels each with its part of speech. Several dictionaries are available for MeCab; as with ChaSen, IPADIC is the most commonly used.
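As a rough illustration of what the segmented result looks like, the sketch below parses MeCab-style output using only the Python standard library. It assumes the default output layout used with IPADIC: one segment per line as a surface form, a tab, and a comma-separated feature list whose first field is the part of speech, with a final `EOS` line. The `Token` type, the `parse_mecab_output` helper, and the hard-coded `SAMPLE` text are hypothetical names introduced for this example; the sample was not produced by running MeCab here, and the naive comma split ignores any quoting a dictionary might use.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    surface: str         # the segment exactly as it appears in the input text
    pos: str             # top-level part of speech (first feature field)
    features: List[str]  # all comma-separated feature fields


def parse_mecab_output(text: str) -> List[Token]:
    """Parse 'surface<TAB>feature,feature,...' lines, skipping the EOS marker."""
    tokens = []
    for line in text.splitlines():
        if not line or line == "EOS":  # EOS marks the end of a sentence
            continue
        surface, _, feature_str = line.partition("\t")
        features = feature_str.split(",")  # simplification: no CSV quoting handled
        tokens.append(Token(surface, features[0], features))
    return tokens


# Assumed sample in the IPADIC feature layout (名詞 = noun, 助詞 = particle).
SAMPLE = (
    "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ\n"
    "の\t助詞,連体化,*,*,*,*,の,ノ,ノ\n"
    "うち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ\n"
    "EOS\n"
)

tokens = parse_mecab_output(SAMPLE)
surfaces = [t.surface for t in tokens]

for t in tokens:
    print(t.surface, t.pos)
```

Joining the surface forms back together reproduces the unsegmented input, which is a convenient sanity check when post-processing segmenter output.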

In 2007, Google used MeCab to generate n-gram data for a large corpus of Japanese text, which it published on its Google Japan blog.^[3]

MeCab is also used for Japanese input on Mac OS X 10.5 and 10.6, and in iOS since version 2.1.^[4]^[5]

[1]

[2]

[3]

[4]

[5]
