Some tentative first steps towards a Star Trek universal communicator

(Greg Baker) We urgently need computerised translation software for the rest of the world's languages. We will probably lose around 90% of the world's languages in the next 80 years.

If you want to build a translator that can translate all the world's languages, you can't use Google Translate's approach of training on millions of documents because most of the world's languages don't even have a million words written down. You have to be much more parsimonious with your data.

I've been writing software that populates the Leaftop database, which aims to be the largest lexicon of its kind (so far it holds an average of 300 automatically extracted words for each of 1400 languages). I am also building a universal grammar extractor, which can currently inflect a plural from a singular for 11% of the world's nouns. It learned all the Latin noun declensions on its own.

This is a talk for language geeks and machine learning nerds. I'll talk about the weirdest distance metric you'll ever see (and why it is so easy to code), and I'll talk about Hiligaynon and Swahili, why Chadian Arabic was so helpful, and the trouble with Khmer. You'll see more Unicode character sets in one presentation than you'll see at an internationalisation conference.

https://lca2022.linux.org.au/schedule...

Videos licensed as CC BY-NC-SA 4.0

linux.conf.au is a conference about the Linux operating system, and all aspects of the thriving ecosystem of Free and Open Source Software that has grown up around it. Run since 1999, in a different Australian or New Zealand city each year, by a team of local volunteers, LCA invites more than 500 people to learn from the people who shape the future of Open Source. For more information on the conference see https://linux.conf.au/

Produced by Next Day Video Australia: https://nextdayvideo.com.au

#linux.conf.au #linux #foss #opensource

Sat Jan 15 15:45:00 2022 at Wominjeka Theatre
