NLP = all interaction between computers and spoken / written language


Nr.10
14 August 2017, 02:59
By way of introduction, a definition of NLP: natural language processing.

We are on the eve of this technology's big breakthrough.

Nr.10
14 August 2017, 03:18
Just under twenty years ago, Lernout & Hauspie were going to change the world.
Good vision, but their technology was not good enough.

Now the technology is there. And it appears to be coming from China.
Baidu has the best speech recognition technology in house.

How Baidu's Deep Speech 2 Is Winning The Speech Recognition Game (http://blog.tutorming.com/business/baidu-deep-speech-2-recognition-voice-command-china)
18 Oct 2016

Nr.10
2 September 2017, 23:15
At the end of 2007, a former Google employee, the German Gereon Frahling, started Linguee.
http://www.linguee.nl/?from=com
In August 2017, Linguee announced DeepL.
DeepL: New Supercomputer-Powered Translator Beats Google and Microsoft (https://winbuzzer.com/2017/08/29/deepl-supercomputer-powered-translator-beats-microsoft-google-xcxwbn/)
29 Aug 2017
DeepL has a supercomputer with a capacity of 5.1 petaFLOPS, located in Iceland.
In the coming months, DeepL plans to release an API, making its translations available to digital assistants and language learning apps.
CEO Gereon Frahling, a former Google Research member started working on a search engine for translations already back in 2007 after leaving Google. Together with his partner Leo Fink they developed crawlers and different machine learning systems for checking translation quality which resulted in the launch of Linguee in 2009.
In 2010 Linguee was expanded to more language combinations and hit a milestone of one million monthly visitors, making it the world's most widely used dictionary website. Today, Linguee has more than 200 million visitors per month.
Work on DeepL started back in 2014 when Linguee began to add machine learning tools to learn from its huge database of high-quality translations to nurture a new automated translation system. To improve the translation quality the company recruited hundreds of professional lexicographers who check and correct content generated by algorithms.
In 2016 Linguee starts working on a neural network translation system which is the core of the new DeepL translator.
To try it out:
https://www.deepl.com/
TechCrunch article:
DeepL schools other online translators with clever machine learning (https://techcrunch.com/2017/08/29/deepl-schools-other-online-translators-with-clever-machine-learning/)
29 Aug 2017
(...) In an email, Frahling told me that the time was ripe: “We have built a neural translation network that incorporates most of the latest developments, to which we added our own ideas.” An enormous database of over a billion translations and queries, plus a method of ground-truthing translations by searching for similar snippets on the web, made for a strong base in the training of the new model. They also put together what they claim is the 23rd most powerful supercomputer in the world, conveniently located in Iceland.

Developments published by universities, research agencies, and indeed Linguee’s competitors showed that convolutional neural networks were the way to go, rather than the recurrent neural networks the company had been using previously. Now isn’t really the place to go into the differences between CNNs and RNNs, so it must suffice to say that for accurate translation of long, complex strings of related words, the former is a better bet as long as you can control for its weaknesses.

For example, a CNN could roughly be said to tackle one word of the sentence at a time. This becomes a problem when, for instance, as commonly happens, a word at the end of the sentence determines how a word at the beginning of the sentence should be formed. It’s wasteful to go through the whole sentence only to find that the first word the network picked is wrong, and then start over with that knowledge, so DeepL and others in the machine learning field apply “attention mechanisms” that monitor for such potential trip-ups and resolve them before the CNN moves on to the next word or phrase. There are other secret techniques in play, of course, and their result is a translation tool that I personally plan to make my new default.
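For anyone wondering what the "attention mechanisms" in the quote look like in practice, here is a minimal sketch of generic scaled dot-product attention in plain Python. This is not DeepL's method (the article itself says their techniques are secret); it is just the textbook idea. The function names and the toy word vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Generic scaled dot-product attention for a single query vector.

    Each source word has a key and a value vector. The query is compared
    with every key, and the values are averaged using the resulting weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Hypothetical 2-dimensional vectors for a three-word source sentence.
source_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical query: the word currently being translated "asks" mostly
# about the second dimension, so words 2 and 3 should get more weight.
query = [0.0, 2.0]

weights, context = attend(query, source_vectors, source_vectors)
print("attention weights:", weights)
print("context vector:", context)
```

The point is that every source word, including one at the very end of the sentence, can influence the word being produced right now, so the network does not have to walk through the whole sentence and start over.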

Nr.10
2 September 2017, 23:34
Linguee’s Founder Launches DeepL in Attempt to Challenge Google Translate (https://slator.com/technology/linguees-founder-launches-deepl-attempt-challenge-google-translate/)
30 Aug 2017
Barely two years after bursting into the translation tech scene, neural machine translation (NMT) is everything the MT community is talking about. Microsoft, Google, Facebook, and other large technology companies have all transitioned to NMT, as did the European Patent Office and the World Intellectual Property Organization. Even end-buyers are starting to build their own systems based on open-source models. NMT systems are data and power hungry. In late August, Chinese internet company Sogou invested into translation data provider UTH to secure a large corpus of quality data. (...)
NMT:

Neural
Machine
Translation

MT = "AI-complete (https://en.wikipedia.org/wiki/AI-complete)"
In other words, a problem considered as hard as achieving strong AI.