AIR 038 | Academician Li Ming, Royal Society of Canada: How to deal with NLP problems with deep learning


Li Ming, Academician of the Royal Society of Canada, Professor at the University of Waterloo, Founder of Modern Information Theory, Expert of the National "Thousand Talents Program"

At the 2016 CCF-GAIR Global Artificial Intelligence and Robotics Summit, Academician Li Ming gave an interview to Lei Feng Network and shared his views on using deep learning to tackle NLP (natural language processing) problems and on the future application of deep learning in NLP research.

Lei Feng Network: Please tell us briefly about your election to the Royal Society of Canada.

Academician Li Ming: The process of being elected to the Royal Society of Canada is not very cumbersome. After someone is selected as a candidate, one or two academicians write letters of support. The Academician Selection Committee then carries out the selection, and it does so fairly. Unlike the Chinese Academy of Engineering, the Royal Society of Canada covers science, engineering, the social sciences, and law.

Lei Feng Network: Is there a breakthrough in the scientific understanding of deep learning?

Academician Li Ming: Siri mainly uses keywords to identify semantic information. For example, if you ask "What do fish eat", it will answer with "seafood museum" type information, so it makes mistakes easily. Template (pattern) matching is too strict and inflexible. For example, if you ask "Who is the president of the United States", you get the exact answer "Obama", but when the same question is phrased in a different form, template-based semantic understanding cannot recognize it. The deep learning we use is different from that of other companies. We can make the answer to this kind of question robust, without many mistakes. This is the critical point: deep learning can solve this problem. Deep learning also has its limitations. For example, it requires a large amount of data for training, and our company has the advantage that we can generate large amounts of data on our own.

Lei Feng Network: Microsoft's chat robots are trained on data collected from the Internet, and their answers can come across as unfriendly. Will it be possible to create a personalized chat robot in the future?

Academician Li Ming: Take the chat robot Xiao Bing as an example. The main reason it has so many problems is information replication. The deep learning model we use can select the most mainstream answers (the friendly ones) and filter out the non-mainstream answers. This is also a weakness of deep learning: the answers are not distinctive. But deep learning can, to a certain extent, avoid unfriendly answers. It may also be that, once the sample is large enough, deep learning will be able to cover all kinds of questions and achieve personalized chat. According to statistics, our Doudou is ten times more likely to answer a question than Xiao Bing. Much of what is currently called deep learning is not deep learning in the true sense. Doudou has more than 20 deep learning models, which lets it filter out the best answer. In addition, the chat model of our chat robot does not depend on dialect; it depends only on the semantic information of the input.

Lei Feng Network: In practical applications, does the "information distance" theory have theoretical limitations or technical problems?

Academician Li Ming: Our theory of "information distance" has no theoretical limitation, but it does have technical limitations. Semantic distance is not defined, so it cannot be computed. Information distance is defined, but it is not computable either; it can only be approximated through compression. In practice, we use "information distance" to approximate "semantic distance". As for the compression itself: put simply, a phrase such as "tomorrow tomorrow" is compressed into the single word "tomorrow", which simplifies the information.
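
The idea of approximating an uncomputable information distance with a real compressor can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Academician Li Ming's team: it assumes the normalized compression distance formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), and uses the standard-library zlib as a stand-in compressor C. The function names `compressed_size` and `ncd` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (a stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Assumption: zlib approximates the ideal compressor. Values near 0
    mean the strings share most of their information; values near 1
    mean they share very little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # cost of describing both together
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    question = "Who is the president of the United States of America today?"
    paraphrase = "Tell me, who is the president of the United States of America?"
    unrelated = "Lin Daiyu went to Jia Baoyu's room early in the morning."
    print("question vs paraphrase:", ncd(question, paraphrase))
    print("question vs unrelated: ", ncd(question, unrelated))
```

A general-purpose compressor only captures repeated character sequences, so this sketch measures surface similarity; approximating semantic distance would need a far stronger model of the language.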

Lei Feng Network: When it comes to understanding large-scale text relations (the structure and meaning of human language), what advantages does deep learning have over other algorithms and models?

Academician Li Ming: Compared with other algorithms and models, the advantage of deep learning lies in its robustness in dialogue. The methods commonly used to implement robot dialogue are mostly the keyword method and the template matching method. In contrast to these two methods, deep learning can process dialogue input that appears in many forms and can tolerate some errors, which makes for more natural human-machine dialogue.
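
The brittleness of the two traditional methods can be illustrated with a toy Python sketch. Everything in it is hypothetical (the template table, the keyword table, and the function names) and it does not reproduce Siri or any real system; it only shows why exact templates miss rephrased questions and why bare keywords can trigger the wrong answer.

```python
import re

# Hypothetical template table: a question must match a pattern exactly.
TEMPLATES = [
    (re.compile(r"^who is the president of the united states\??$"), "Obama"),
]

# Hypothetical keyword table: an answer fires when all keywords are present.
KEYWORDS = [
    (frozenset({"president", "united", "states"}), "Obama"),
]


def template_answer(question: str):
    """Return the canned answer only if a template matches the whole question."""
    normalized = question.strip().lower()
    for pattern, answer in TEMPLATES:
        if pattern.match(normalized):
            return answer
    return None


def keyword_answer(question: str):
    """Return the canned answer if every keyword of an entry occurs in the question."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for keywords, answer in KEYWORDS:
        if keywords <= words:
            return answer
    return None


if __name__ == "__main__":
    # Exact wording: the template matches.
    print(template_answer("Who is the president of the United States?"))
    # Same meaning, different wording: the template misses it.
    print(template_answer("Tell me the name of the US president"))
    # Different meaning, same keywords: the keyword method answers anyway.
    print(keyword_answer("Who was the first president of the United States?"))
```

The template method is too strict and the keyword method too loose; a model trained on many phrasings of the same question is meant to sit between the two.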

Lei Feng Network: What is the difference between the Chinese and English word segmentation proposed by your team?

Academician Li Ming: Unlike Chinese, English does not require our system to perform word segmentation, because English words are already separated by spaces.

Lei Feng Network: In what respects do you think deep learning needs to improve for the intelligent processing of natural language by computers?

Academician Li Ming: As a next step, with big data and good training through deep learning, future intelligent robots will be able to read books and newspapers. For example, after reading Dream of the Red Chamber, a future intelligent robot could work out the relationships between the characters, such as who likes whom in the novel and who is whose aunt, or work out who is allied with whom in Romance of the Three Kingdoms. Deep learning can now answer questions like the following: "Lin Daiyu went to Jia Baoyu's room. Q: Who went to Jia Baoyu's room?" Current deep learning can answer "Lin Daiyu", and it answers such questions with an accuracy of 70% to 80%.

Lei Feng Network: Can deep learning resolve ambiguities in demonstrative pronouns?

Academician Li Ming: At this stage, deep learning does not yet have enough background knowledge to resolve ambiguities that depend on context. However, I think this can be achieved with training. Robot dialogue can now handle a wide range of questions and answers. But deep learning cannot yet deal with some small-scale linguistic problems, such as how to interpret the ambiguous phrase "chicken does not eat".

Lei Feng Network: Human-machine dialogue models are developing rapidly. Could you share your views on the question "Will machines become conscious?"

Academician Li Ming: In fact, most current natural language dialogue for robots is the product of training. Across many studies, no one has been able to define clearly what consciousness is. I think a robot talking is similar to a person talking in their sleep. If you ask a sleepwalker some questions, he is not conscious when he answers, yet he can answer coherently. Robot dialogue is actually the same.
