ASLIB London: Automatic Translation Tools at WIPO

The Association for Information Management: Translating and the Computer Conference – 17 & 18 November 2011

Bruno Pouliquen, World Intellectual Property Organization

  • WIPO: UN agency in charge of intellectual property (including patents)
  • Millions of patent applications (about 150K per year)
  • Patentscope: search engine developed specifically for patents; user can search in different languages
  • Using Statistical Machine Translation (Moses) – inexpensive; quality depends on the amount of training data
  • CLIR: Cross-Lingual Information Retrieval; developed by WIPO; used for keyword searches
    • Enter a search query in EN, DE, ES, FR, JP, KO, PT, RU, ZH (keyword translation, automatically expanded into other languages)
    • Example: when searching for the word “toothbrush”, the search yields over 7,000 results with documents containing the word “toothbrush” in multiple languages
    • Using Google Translate, Microsoft Translator, KIPO Translate
    • TAPTA suite:
      • TAPTA-js (Java suite – the user drives the translation)
      • TAPTA-Web (similar to js, but web-based)
      • TAPTA-Web-Lite: WIPO’s external gisting tool using MT, available to users for free
      • COPPA: Corpus of Parallel Patent Applications
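
The keyword expansion described for CLIR above can be illustrated with a minimal sketch. This is not WIPO's implementation: the tiny hard-coded dictionary, the function names (expand_query, to_boolean_query) and the OR-joined query format are all assumptions made for illustration. A real system would draw its term translations from a parallel corpus.

```python
# Hypothetical toy dictionary (an assumption for illustration only);
# a production system would learn term translations from parallel data.
TERM_TRANSLATIONS = {
    "toothbrush": {
        "DE": ["Zahnbürste"],
        "FR": ["brosse à dents"],
        "ES": ["cepillo de dientes"],
    },
}


def expand_query(keyword, source_lang="EN", translations=TERM_TRANSLATIONS):
    """Return (language, term) pairs: the original keyword plus known translations."""
    expanded = [(source_lang, keyword)]
    # Sort by language code so the output order is deterministic.
    for lang, terms in sorted(translations.get(keyword.lower(), {}).items()):
        expanded.extend((lang, term) for term in terms)
    return expanded


def to_boolean_query(expanded):
    """Join the expanded terms with OR, quoting each term as a phrase."""
    return " OR ".join(f'"{term}"' for _, term in expanded)


if __name__ == "__main__":
    print(to_boolean_query(expand_query("toothbrush")))
```

A keyword with no dictionary entry is simply passed through unchanged, so the search still runs in the source language.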

Conclusion:

  • Data-driven approach, using an automatically built parallel corpus
  • TAPTA suite: prototype level; TAPTA-Web-Lite is now in production (500 documents translated per day, about 60% of them from Chinese)
  • Future work: encourage innovation through Corpus of Parallel Patent Applications