A corpus collection well suited to research in natural language processing and machine learning. Data is available in forms suited to a range of applications, including text, audio, and lexicons.
These corpora are provided by the LDC (Linguistic Data Consortium), based at the University of Pennsylvania in the United States. The collection includes text databases, audio databases, lexicons (dictionaries), and other forms of data. It can be used for purposes such as natural language processing research, machine learning, and the development of automatic speech processing.
Basic information
【Examples of Corpora Handled】

GALE Phase 4 Chinese Broadcast Conversation Transcripts
LDC Catalog No.: LDC2013S04
DCMI Type(s): Text
Data Source(s): broadcast conversation
Language(s): Mandarin Chinese, Chinese

2000 NIST Speaker Recognition Evaluation
LDC Catalog No.: LDC2001S97
DCMI Type(s): Sound
Data Source(s): telephone speech
Language(s): English

Morphologically Annotated Korean Text
LDC Catalog No.: LDC2004T03
Data Type: text
Data Source(s): newswire
Language(s): Korean
Price information
To purchase an LDC corpus, the end user must contact the manufacturer directly.
*Our company acts as an intermediary for payments between the manufacturer and the customer.
Delivery time
Applications/Examples of results
Natural language processing, syntactic analysis, annotation, machine learning, automatic speech processing.
Company information
Tegara Corporation is building a research and development platform that integrates the procurement and sale of specialized products, information provision, and support services for researchers and developers nationwide. In research and development, where speed is valuable, Tegara's mission is to help customers accelerate their work and thereby contribute to the advancement of research and development in Japan and around the world. To remain a strong partner for researchers and developers, the company continually refines new technologies and strengthens its support system.