China’s "FSD"? Exploring Shang Tang’s original AGI technology
In November 2023, Tesla announced that it had begun rolling out Full Self-Driving (FSD) V12 to employees. The version introduces a neural-network system and uses the latest end-to-end autonomous driving technology instead of relying on long, complicated hand-written code. In 2024, FSD V12 began to roll out widely, and it also became possible for FSD to enter China. "End-to-end" quickly became the hottest term in autonomous driving worldwide.
What exactly is end-to-end? Does deploying end-to-end mean that the car becomes a "robot"? How much do large models, multimodality and neural-network learning help and influence fully autonomous driving? How strong are Chinese companies in the field of AGI? Some time ago, we visited the headquarters of Shang Tang (SenseTime) and talked with Dr. Wang Xiaogang, an industry expert who is co-founder and chief scientist of Shang Tang Technology and president of the Jueying Smart Car Group. In the interview, we also learned more about Shang Tang Jueying’s development and plans in the automotive industry.
"Shang Tang is a leader in the field of AGI"
"Shang Tang’s business is all over the world"
1. What is the difference between AGI and traditional AI? What is end-to-end?
AI stands for artificial intelligence. Traditional AI performs specific tasks or solves specific problems, such as speech recognition, image processing and natural language processing. It can be highly specialized, but it is limited to specific fields. Today, AI technology is mature and widely used in healthcare, finance, transportation and other industries.
"AI is applied to high-speed rail detection"
"AI is applied to mine operations"
"AI is applied to medical examination"
AGI (artificial general intelligence) refers to a system with human-like general intelligence that can adapt broadly across different tasks and fields. Developing an AI system for a specific field is relatively simple: it only requires training a model on a large amount of data with a specific algorithm. AGI, however, needs to emulate the broad cognitive and self-learning abilities of human beings, which is extremely difficult to achieve.
"AGI requires more technology"
"Shang Tang’s big artificial intelligence device"
To better understand end-to-end, we need to compare it with traditional autonomous driving control logic. A traditional autonomous driving system adopts a modular design in which each function, such as perception, prediction and planning, is developed independently and then integrated into the system, and the functions run one step after another. End-to-end autonomous driving makes judgments directly after "seeing" the external scene, much as a person does, going from information input to decision execution in one pass without intermediate stages.
Direct perception means that the system obtains environmental information directly from the raw data, without many intermediate processing and conversion steps; it is a key starting point of end-to-end. Direct decision-making means generating driving strategies and action commands directly from the perception results, which cuts out the complicated intermediate reasoning and conversion and is another important embodiment of end-to-end.
End-to-end makes the whole autonomous driving process coherent and complete. It covers not only perception and decision-making but also the conversion of decisions into actual vehicle control actions, with seamless connection and efficient cooperation throughout. Direct perception and direct decision-making are therefore core features of end-to-end, but they cannot simply be equated with it. End-to-end is a more comprehensive concept that covers the whole autonomous driving system from input to output.
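The contrast between the modular pipeline and end-to-end can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Tesla’s or Shang Tang’s implementation: the one-dimensional "camera frame", the `Control` type, the stage functions and the hand-picked weights are all hypothetical. In a real end-to-end system the weights would be learned from driving data.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Control:
    steering: float  # positive = steer right, negative = steer left
    throttle: float  # 0.0 (stop) to 1.0 (full)


# Modular pipeline: separately written stages with explicit hand-offs.

def perceive(frame: Sequence[float]) -> List[float]:
    """Toy perception: bright pixels become obstacle offsets in [-0.5, 0.5)."""
    n = len(frame)
    return [i / n - 0.5 for i, value in enumerate(frame) if value > 0.8]


def predict(obstacles: List[float]) -> List[float]:
    """Toy prediction: assume every obstacle stays where it is."""
    return list(obstacles)


def plan(obstacles: List[float]) -> Control:
    """Toy planner: slow down and steer away from the most central obstacle."""
    if not obstacles:
        return Control(steering=0.0, throttle=1.0)
    nearest = min(obstacles, key=abs)
    return Control(steering=0.3 if nearest <= 0 else -0.3, throttle=0.5)


def modular_drive(frame: Sequence[float]) -> Control:
    """Run the three stages in sequence, passing intermediate results along."""
    return plan(predict(perceive(frame)))


# End-to-end: one parameterized function maps the raw frame to controls.

class EndToEndPolicy:
    def __init__(self, steer_weights: Sequence[float], throttle_bias: float):
        self.steer_weights = list(steer_weights)
        self.throttle_bias = throttle_bias

    def __call__(self, frame: Sequence[float]) -> Control:
        # No obstacle list or plan is ever materialized: input goes straight to output.
        steer = math.tanh(sum(w * x for w, x in zip(self.steer_weights, frame)))
        brightness = sum(frame)
        throttle = 1.0 / (1.0 + math.exp(brightness - self.throttle_bias))
        return Control(steering=steer, throttle=throttle)


def placeholder_policy(n: int) -> EndToEndPolicy:
    """Hand-picked weights standing in for trained ones (an assumption)."""
    return EndToEndPolicy([0.5 - i / n for i in range(n)], throttle_bias=2.0)
```

Both `modular_drive(frame)` and `placeholder_policy(8)(frame)` take the same raw frame and return a `Control`. Only the modular version exposes intermediate results such as the obstacle list, which is the trade-off the comparison points to.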
"End-to-end is the key process of AGI technology development"
Of course, end-to-end is not limited to intelligent driving. An end-to-end large model works more like human thinking: it skips the complicated intermediate steps and reduces information loss. From information input to strategy output, a single algorithmic model is used, often a large model that contains a great deal of data and information. End-to-end application is a key step in the development of AGI technology.
2. The core of developing AGI technology is originality
AGI technology has been a development focus across industries worldwide in recent years. As a top expert in the AGI field, Wang Xiaogang shared some of his views with us.
When it comes to AGI, we cannot help but mention the hugely popular ChatGPT and the new GPT-4o, which combines a large language model with multimodal capabilities and is leading the development of the whole industry. Behind that success, however, lies the collective progress of many top technology companies: Microsoft has provided OpenAI with large-scale hardware and software infrastructure, and Google has spent years studying the underlying algorithms and the Transformer neural network model.
"Shang Tang Ruying Digital Human Video Generation Platform"
Although China has a seemingly endless stream of large language models and related applications, most of them are not original, and they risk falling into the awkward situation in which "prices keep dropping, but the core technology progresses slowly".
Therefore, large-model development should not rush to commercialization but should focus on improving the models’ own capabilities. The key to the future lies in the joint training of multimodal data, which requires cooperation across many fields such as physics, psychology, cognitive science, data science and mathematics. Diverse data will help offset bias, reduce hallucinations and make large models more stable and reliable.
"Language models have been a hot topic in the past two years"
At present, OpenAI has made some progress in the fused training of multimodal data such as video, images, voice and text. Low latency and human-like interaction are only the surface; behind them is a prototype of AGI. The path to AGI depends on the quality and diversity of training data, and aligning and fusing multimodal data in a high-dimensional space is the biggest technical difficulty at present. The development of AGI needs not only technology but also faith and love. China’s AGI needs its own Oppenheimer. Enterprises should focus on improving their core competitiveness and technical originality instead of falling into a price war, so as to promote the long-term development of China’s AGI.
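To make "alignment of multimodal data in a high-dimensional space" concrete, here is a minimal sketch of one common approach, contrastive alignment, in which matching image and text embeddings are pulled together and mismatched ones pushed apart. The article does not say which method OpenAI or Shang Tang uses, so the technique, the function names and the tiny two-dimensional vectors are all assumptions made for illustration. The sketch also assumes that no embedding is the zero vector.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def contrastive_loss(
    image_vecs: Sequence[Vector],
    text_vecs: Sequence[Vector],
    temperature: float = 0.1,
) -> float:
    """Image-to-text contrastive loss: image i should match text i.

    For each image, the similarities to every text are treated as logits
    and the cross-entropy of picking the paired text is averaged.
    """
    total = 0.0
    for i, image in enumerate(image_vecs):
        logits = [cosine(image, text) / temperature for text in text_vecs]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        total += log_denominator - logits[i]
    return total / len(image_vecs)
```

With hypothetical embeddings `images = [[1.0, 0.0], [0.0, 1.0]]`, the loss is lower when the texts are `[[0.9, 0.1], [0.1, 0.9]]` (each text close to its own image) than when the two texts are swapped. Training a multimodal model in this style means adjusting the encoders so that this loss falls.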
"Shang Tang has always insisted on technological originality."
Shang Tang has steadily insisted on technical originality, which is why it stands out among so many artificial intelligence companies. As early as 2014, the Shang Tang team released the DeepID series of face recognition algorithms, which exceeded the recognition rate of the human eye for the first time and even surpassed the DeepFace algorithm released by Facebook in the same period, achieving a breakthrough from 0 to 1.
"The Shang Tang team stands out among the many artificial intelligence companies in China."
In 2018, Shang Tang started its research on large models. At that time, no infrastructure could provide enough computing power, not even at the top domestic internet companies Alibaba and Tencent. Shang Tang began building infrastructure in Lingang, Shanghai, laying out its AIDC artificial intelligence computing center ahead of time for future AI cloud computing and cloud services. With its own large-scale infrastructure, Shang Tang can develop in the industry more easily.
"The AIDC artificial intelligence computing center is located in Lingang, Shanghai"
In 2023, end-to-end became the industry’s key word with Tesla’s release of FSD V12, but as early as 2022, Shang Tang had released end-to-end technology and said that end-to-end is the future. The multimodal technology behind the recent GPT-4o boom is also not new to Shang Tang, which has studied it and put it into use for many years. Not long ago, Shang Tang’s SenseChat V5 set a new SuperCLUE record with a total score of 80.03 and surpassed GPT-4-Turbo-0125 in the Chinese comprehensive score. This is the first time that a domestic large model has surpassed GPT-4 Turbo to top SuperCLUE’s Chinese benchmark.
"Shang Tang’s SenseChat V5 has set a new SuperCLUE record in China"
Shang Tang has always insisted on originality in AGI-related technologies and has moved to the global forefront. Wang Xiaogang believes that homogeneous competition wastes resources and that originality is the driving force behind the development of the global artificial intelligence industry. Originality also means more uncertainty and greater risk, but when it succeeds, the breakthrough is huge for the whole industry, and that is what Shang Tang wants to achieve.