ASLIB London: Automatic Translation Tools at WIPO
The Association for Information Management: Translating and the Computer Conference – 17 & 18 November 2011
Bruno Pouliquen, World Intellectual Property Organization
- UN agency in charge of Intellectual Property (including patents)
- Millions of patent applications (about 150K per year)
- Patentscope: search engine developed specifically for patents; user can search in different languages
- Using Statistical Machine Translation (Moses): inexpensive; quality depends on the amount of training data
- CLIR: Cross-Lingual Information Retrieval; developed by WIPO; used for keyword searches
- Enter a search query in EN, DE, ES, FR, JP, KO, PT, RU or ZH; the keywords are translated and the query is automatically expanded into the other languages
- Example: a search for “toothbrush” returns over 7,000 documents containing the term in multiple languages
- Using Google Translate, Microsoft Translator, KIPO Translate
- TAPTA suite:
  - TAPTA-js (Java suite; the user drives the translation)
  - TAPTA-Web (similar to TAPTA-js, but web-based)
  - TAPTA-Web-Lite: WIPO’s external gisting tool using MT, available to users for free
- COPPA: Corpus of Parallel Patent Applications
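The keyword expansion described in the CLIR bullets above can be illustrated with a minimal sketch. This is not WIPO's implementation: the term table, the `expand_query` function and the `lang:"term"` query syntax are all hypothetical, and a real system would presumably derive translations from a parallel corpus rather than a hand-written dictionary.

```python
# Hypothetical toy term table (illustration only). A production system
# would learn term translations from parallel text instead.
TERM_TABLE = {
    "toothbrush": {
        "de": ["Zahnbürste"],
        "fr": ["brosse à dents"],
        "es": ["cepillo de dientes"],
    },
}


def expand_query(keyword, source_lang="en", table=TERM_TABLE):
    """Return an OR query covering the keyword and its known translations."""
    variants = [(source_lang, keyword)]
    # Sort target languages so the output is deterministic.
    for lang, terms in sorted(table.get(keyword.lower(), {}).items()):
        variants.extend((lang, term) for term in terms)
    # Quote each term so multi-word translations are matched as phrases.
    return " OR ".join(f'{lang}:"{term}"' for lang, term in variants)


print(expand_query("toothbrush"))
```

A keyword with no entry in the table is passed through unchanged in the source language, so the search still runs, only without the cross-lingual expansion.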
Conclusion:
- Data-driven approach, using a parallel corpus that is built automatically
- TAPTA suite: prototype level; TAPTA-Web-Lite now in production (500 documents translated per day, about 60% of them from Chinese)
- Future work: encourage innovation through Corpus of Parallel Patent Applications