Written by Yuan Jinhui; Translated by Wang Kaiyan, Dong Wenwen
This year, the industrialization of AI has become a hot and controversial topic. On the negative side, academics complain that AI research results are "hard to break through in academia and difficult to commercialize in industry," and some AI scientists have left industry to return to academia; on the positive side, a number of AI unicorns have successfully gone public. So is there an opportunity for AI industrialization, and where does it lie? Yuan Jinhui, founder of OneFlow, gave a systematic take on these questions in a QbitAI livestream.
The following is based on the talk, lightly abridged:
In previous years, society was crazy about AI: there was talk of the coming singularity, of AI replacing humans, and of fully autonomous driving by 2020. Practitioners' salaries soared, and many famous professors jumped into industry.
But in the past year, there has been a trend of turning away from AI. AI has experienced three booms and three winters, and many people wonder whether it is now at a low point again. I think it is important for AI practitioners to understand the essential advantages and shortcomings of AI, so that they do not simply follow the herd.
We should start from a basic judgment: today's downturn is not like the previous ones, because the two earlier AI booms did not actually solve many real problems. Today, even though there are doubts about AI, it has produced real achievements and is making inroads into more areas. The problem is that people were too optimistic and expected too much from AI; now we should be more realistic and look at it objectively.
In this sharing, I would like to discuss with you the following questions:
- What is the essence of deep learning?
- Is there any fundamental progress in deep learning?
- What are the limitations of deep learning?
- Is deep learning a bubble or a technological revolution?
- Are there any industrialization opportunities for deep learning? Where are the opportunities?
How to Take an Accurate and Objective View of AI
The Essence of Deep Learning
It has been almost 10 years since the rise of deep learning. At first, deep learning was mythologized because it produced some amazing, almost miraculous results. But as more and more people became familiar with it, they began to trivialize it, saying that deep learning is nothing but function fitting.
In simple terms, deep learning finds the mapping that best fits the data inside a given hypothesis space; that is its mathematical definition. Take image classification: training AI to tag images from a camera with whether they contain cars, people, or fruit means learning a mapping from the image's pixels to linguistic labels. There are infinitely many possible mappings, and the best known implementation of this one is the human brain.
Fitting means that we want the computer to automatically search, within a space of mappings, for one that comes very close to the function computed by the human brain. Deep learning provides an excellent initial search space, namely the function space formed by compositions of multi-layer nonlinear mappings, together with a set of algorithms that automatically search this space for the "optimal" mapping.
Neural networks have universal approximation capability: in theory, deep learning can approximate any mapping, no matter how complex. As the figure above (right) shows, a one-layer perceptron can only divide space with a hyperplane, a two-layer network can already represent convex regions, and a three-layer network can express regions with holes or concave boundaries.
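To make this concrete, here is a minimal sketch (assuming PyTorch; it is not code from the talk) that contrasts the two hypothesis spaces on the classic XOR problem: a single-layer perceptron can only carve the plane with one hyperplane, while adding a hidden layer is enough to represent the non-convex region XOR requires.

```python
# A sketch, not code from the talk: compare two hypothesis spaces on XOR.
import torch
import torch.nn as nn

# XOR: no single hyperplane separates the two classes.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

def train_and_score(model, steps=2000):
    """Fit the model to the four XOR examples and return its accuracy."""
    opt = torch.optim.Adam(model.parameters(), lr=0.1)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return ((model(X) > 0).float() == y).float().mean().item()

one_layer = nn.Linear(2, 1)                                             # a hyperplane only
two_layer = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))  # one hidden layer

print("one-layer accuracy:", train_and_score(one_layer))  # stuck around 0.5-0.75
print("two-layer accuracy:", train_and_score(two_layer))  # typically reaches 1.0
```

Both models are trained by the same gradient-based search; only the hypothesis space differs.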
Riding on big data and growing computing power, deep learning has beaten traditional machine learning algorithms in field after field since it emerged, unifying data-driven AI algorithms into neural networks and thereby standardizing the algorithms themselves. Previously, an AI graduate student had to learn many algorithms: support vector machines, decision trees, hidden Markov models, Bayesian networks, Markov random fields, and so on. Each had its own mathematical machinery, and each field had its own most effective algorithm. Today, in almost every field, the SOTA algorithms are neural networks, and behind those neural networks is the backpropagation (BP) algorithm.
So the remarkable thing about neural networks and deep learning is standardization, and that standardization is still under way: even the network architectures used in different fields are converging.
If you follow recent trends, you will find that Transformers are entering CV and CNNs are being used in NLP; network architectures are becoming ever simpler and more unified.
The biggest benefit of algorithm standardization is that instead of writing software for each algorithm, one set of software (i.e., a deep learning framework) can serve all domains.
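As a rough sketch of what that means in practice (assuming PyTorch; the models below are toy placeholders, not any particular SOTA architecture), the same generic training loop can serve a vision model and an NLP model, because both reduce to forward pass, loss, backpropagation, and update:

```python
# A sketch of algorithm standardization: one training loop, many domains.
import torch
import torch.nn as nn

def fit(model, batches, loss_fn, epochs=1, lr=1e-3):
    """Domain-agnostic loop: the framework neither knows nor cares whether
    the tensors came from camera pixels or token ids."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for inputs, targets in batches:
            opt.zero_grad()
            loss_fn(model(inputs), targets).backward()
            opt.step()
    return model

# Computer vision: a tiny CNN over 28x28 grayscale images (random stand-in data).
cnn = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Flatten(),
                    nn.Linear(8 * 26 * 26, 10))
cv_batches = [(torch.randn(4, 1, 28, 28), torch.randint(0, 10, (4,)))]

# NLP: a tiny bag-of-embeddings classifier over token ids (random stand-in data).
nlp = nn.Sequential(nn.EmbeddingBag(1000, 16), nn.Linear(16, 2))
nlp_batches = [(torch.randint(0, 1000, (4, 12)), torch.randint(0, 2, (4,)))]

fit(cnn, cv_batches, nn.CrossEntropyLoss())   # same loop...
fit(nlp, nlp_batches, nn.CrossEntropyLoss())  # ...different domain
```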
Limitations of Deep Learning
Deep learning is essentially statistical machine learning, so it inherits the limitations of machine learning as well.
For example, since deep learning searches a hypothesis space for a function that fits the data well, there is a problem: if the pre-specified hypothesis space does not contain the true solution, then no matter how the search is carried out, it can only find a solution close to the true answer rather than the optimal one. And when the true solution is far from the hypothesis space, even a good approximation is hard to find.
Take the figure on the left as an example: the goal is to separate the circles from the crosses, and we can see at a glance that the best boundary is a quadratic curve. If we restrict the search to linear models, we can only get a straight line; the hypothesis space is too small, and the model underfits. If we allow cubic curves, the space contains the quadratic solution in theory, but the algorithm may fail to find it. It is best, of course, to let the algorithm search among quadratic curves directly.
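Since the figure itself is not reproduced here, the following is a minimal sketch of the same situation (assuming scikit-learn, with a circle, which is also a quadratic curve, standing in for the boundary in the figure): restricting the search to linear boundaries underfits, while adding quadratic features puts the true solution inside the hypothesis space.

```python
# A sketch of how the choice of hypothesis space decides underfitting.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(int)   # "circles" inside, "crosses" outside

# Hypothesis space 1: straight lines only -> underfits badly.
linear = LogisticRegression().fit(X, y)
print("linear boundary accuracy:   ", linear.score(X, y))

# Hypothesis space 2: add degree-2 features -> the true boundary becomes representable.
X_quad = PolynomialFeatures(degree=2).fit_transform(X)
quadratic = LogisticRegression().fit(X_quad, y)
print("quadratic boundary accuracy:", quadratic.score(X_quad, y))   # near 1.0
```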
Those familiar with machine learning will recognize this as the problem of choosing a hypothesis space, which is the core job of the algorithm scientist: given a problem, pick a good hypothesis space so that the problem becomes tractable for machine learning. The hypothesis space is like a block of stone and the algorithm scientist is like a sculptor: the sculptor keeps chipping away the useless parts, and once what remains is close to the target shape, the stone is handed to the machine to be polished into the final sculpture.
At the same time, deep learning must obey the laws of statistical learning: if the hypothesis space of a problem is very complex, a particularly large sample may be needed to guide the algorithm to a satisfactory solution (the sample complexity problem). This drives the cost of deep learning: how expensive it is to collect and label enough data, how much computing power is needed to train the model, and so on.
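A toy illustration of the sample-complexity point, assuming NumPy (my example, not the speaker's): the same flexible hypothesis space, here degree-9 polynomials, usually generalizes poorly from a handful of noisy samples but well once the sample is large.

```python
# A sketch: the richer the hypothesis space, the more data is needed to pin down the answer.
import numpy as np

rng = np.random.default_rng(0)
x_test = np.linspace(0, 3, 200)

def test_mse(n_train, degree=9):
    """Fit a degree-9 polynomial to n_train noisy samples of sin(x); measure error on a test grid."""
    x = rng.uniform(0, 3, n_train)
    y = np.sin(x) + rng.normal(0, 0.1, n_train)   # noisy labels
    coeffs = np.polyfit(x, y, degree)             # search within the hypothesis space
    return float(np.mean((np.polyval(coeffs, x_test) - np.sin(x_test)) ** 2))

print("test MSE with    15 samples:", test_mse(15))    # usually large: the noise gets memorized
print("test MSE with 2,000 samples:", test_mse(2000))  # small: enough data to constrain the fit
```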
By examining the essence and limitations of deep learning, we can make sense of the highs and lows AI has experienced: deep learning is effective, but it is not a panacea. We should not be over-optimistic when AI is at its peak, nor over-pessimistic when it is in a trough.
Is Deep Learning a Bubble or a Wave?
So is AI a tidal wave, or just a bubble? I think it has to be understood in a larger context, that is, the context of information technology development.
Information technology is essentially the simulation of the real world: real or physical things are modeled as program code and run on a computer. Because this simulation runs faster and more cheaply than the real world, its results can be used to make predictions and feed back into the real world. This is the essential reason computers can help humans in so many settings.
Previously, to solve a problem with a computer, engineers and scientists first had to understand the real-world mechanism, build a model, and then express it as a program: a white-box modeling approach. Modeling is the hardest step in simulating the real world on a computer.
AI, however, brings a new approach that requires far less understanding: black-box modeling. Given enough data, and without humans having to understand the mechanism, black-box modeling can fit models that match or even exceed the white-box ones and can replace them.
For example, computer vision researchers spent decades trying to hand-write programs that recognize images, and deep learning achieved it in one stroke. This new mechanism can greatly accelerate the migration of the real world into the virtual world. In my view, this is the most essential progress brought by AI and data-driven methods, and from this perspective AI is unquestionably an advance that will be recorded in the history of science and technology.
White-box + Black-box, the Right Way for Deep Learning
We can draw a diagram of the different ways of building an information system to solve a problem.
The two columns on the far left are both white-box approaches from before the rise of deep learning; both require a human to write the program. The first is to write the code without dividing it into modules. Anyone with programming experience knows the pitfall: the software's complexity exceeds what a human can comprehend and gets out of hand. In practice, a host of programming techniques is introduced to cope, such as object orientation, isolation, decoupling, modularity, architecture, and design patterns, which is the second column.
If we are confident that deep learning can learn anything no matter how hard the problem is, we tend to adopt the third approach: collect the raw inputs and expected outputs as training data and hand them directly to the AI. For most scenarios this is too optimistic. In many cases the problem is too complex, requiring too much training data and too much computation; if the learning has to be carried out in a large hypothesis space with little data, the result will be disappointing.
This shows that even when using deep learning, you must modularize: decompose the big problem into several smaller ones with well-defined relationships between them. For example, autonomous driving is decomposed into sub-problems such as perception and decision making, each limited to a scope that AI or deep learning can actually handle, instead of expecting AI to solve the whole problem end to end.
In practical scenarios, it may not even be optimal to use deep learning for every sub-problem. If the mechanism of a module is clear enough to be solved easily by a white-box approach (e.g., a mathematical formula), there is no need to use a sledgehammer to crack a nut.
So, in the end, the most reasonable solution may be the "white-box + black-box" approach shown on the far right. For a particular business, business experts and AI experts work together to break the problem down and decide which sub-problems are suitable for AI and which must be solved by hand, which calls for objective judgment and sensible choices.
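As a purely hypothetical sketch of such a decomposition (assuming PyTorch; none of the module names or numbers come from a real driving system), one sub-problem is handled by a learned black box and another by a known formula:

```python
# A hypothetical "white-box + black-box" pipeline, not any real driving stack.
import torch
import torch.nn as nn

# Black box: a placeholder network that estimates the distance (in meters) to the
# nearest obstacle from a camera frame. In practice this would be a trained detector;
# here it is an untrained stand-in.
perception = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 64),
                           nn.ReLU(), nn.Linear(64, 1))

def braking_distance(speed_mps, friction=0.7, g=9.81):
    """White box: d = v^2 / (2 * mu * g), a textbook physics formula. No data needed."""
    return speed_mps ** 2 / (2 * friction * g)

def should_brake(frame, speed_mps, margin_m=5.0):
    """Compose the modules: learned estimate in, rule-based decision out."""
    obstacle_distance = perception(frame).item()
    return obstacle_distance < braking_distance(speed_mps) + margin_m

frame = torch.randn(1, 3, 64, 64)           # a dummy camera frame
print(should_brake(frame, speed_mps=20.0))  # the decision combines both modules
```

The point is not these particular modules but the division of labor: learn what cannot easily be written down, and write down what is already well understood.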
Software 2.0: The Age of Data Programming
Many thought leaders have summarized the value of deep learning well. For example, Andrej Karpathy, head of AI at Tesla, wrote a blog post called Software 2.0 in 2017. In his view, the old world was Software 1.0, in which all software had to be written by people; in Software 2.0, software can be produced by programming with data.
Training the weights of a neural network is, in essence, programming.
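A toy illustration of that sentence, assuming PyTorch (my example, not Karpathy's): the same AND behavior is first written by hand as code, then "written" as weights that gradient descent derives from input/output examples.

```python
# Software 1.0 vs Software 2.0 on the smallest possible example.
import torch
import torch.nn as nn

def software_1_0_and(a, b):
    return a and b   # a human writes the logic explicitly

# Software 2.0: the "source code" is the dataset; gradient descent writes the weights.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [0.], [0.], [1.]])
neuron = nn.Linear(2, 1)
opt = torch.optim.SGD(neuron.parameters(), lr=0.5)
for _ in range(500):
    opt.zero_grad()
    nn.functional.binary_cross_entropy_with_logits(neuron(X), y).backward()
    opt.step()

print([int(software_1_0_and(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
print((neuron(X) > 0).int().flatten().tolist())                                    # typically [0, 0, 0, 1]
```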
ARK Invest, famous for its bets on Tesla and Bitcoin, wrote a report listing the one or two dozen technologies it considers most important for the future, the first of which is deep learning. ARK explains the importance of deep learning from the Software 2.0, data-programming perspective and predicts that by 2037 AI will have created more market value than all previous information technologies combined, more than $30 trillion.
Looking at history, the industrial revolution, from the steam engine to electricity, broke through the limits of human physical strength; information technology and AI break through the limits of human brainpower. I am therefore very optimistic about the social value AI technology will create. AI has both essential advantages and real limitations, and we should neither over-expect from it nor undervalue it.
To summarize our judgment on AI: it is an epic technological advance and a general-purpose technology that will certainly become a very important part of the digital infrastructure.
How to Predict the Future of the AI Industry?
Recently, AI industrialization seems to be taking off. In many cases it is the customer who owns the business scenario that applies AI and achieves good results, which is a relatively healthy model.
Next, let us focus on AI industrialization. Many people with technical backgrounds are asking: if I have a good hammer, how do I make money with it? In fact, the exploration of recent years has run into many problems, mainly because AI is not standardized enough. Some people therefore claim that AI cannot be standardized at all and question whether it can ever be industrialized. I do not agree.
To predict the opportunities for AI industrialization, I think we have to look at it from the perspective of popularization, standardization, automation, tooling, and servitization.
Trends of AI Standardization
AI standardization is actually happening, and not only at the algorithm level: it is happening at every level. Standardized algorithms create the opportunity to standardize software; deep learning frameworks are converging, and so are hardware, technology platforms, and best practices.
Take deep learning frameworks as an example. One noticeable trend is the standardization of interfaces: engineers like PyTorch's API the most, and every framework is learning from it at the API-design level. Models also need to migrate between frameworks, and since training and deployment are often not done in the same framework, intermediate formats are being standardized as well; the IRs of the various frameworks are in fact quite similar.
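As one concrete example of such an intermediate format (my illustration, assuming PyTorch with ONNX export available), a model defined against the now-familiar PyTorch-style API can be exported to a framework-neutral graph that deployment runtimes consume:

```python
# A sketch of interface and IR standardization via ONNX export.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
dummy_input = torch.randn(1, 16)

# The exported .onnx file is a framework-neutral graph, so training and deployment
# no longer have to happen in the same framework.
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["features"], output_names=["logits"])
```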
At the hardware level, although the chip market is highly competitive, the programming interfaces are looking more and more alike. Many APIs imitate CUDA, and there is a trend toward standardization at the graph-compiler level, with common components such as MLIR emerging. The interfaces these chips expose to the software above them are increasingly consistent, and the architectures from the chip to the cluster level are very similar.
There are pain points at the technology-platform level. If individuals want to set up a server and install a framework, they have to deal with drivers, versions, and possible conflicts between several frameworks; during training, some files may sit on remote network file systems; if the server is shared, users have to coordinate time slots; and there are further issues such as data permissions and file isolation. But standardized solutions such as Kubernetes and Docker are now widely available. Some enterprises need elastic scaling, some run in a private cloud, and some occasionally burst onto a public cloud, which requires multi-cloud support.
Putting a business into production also raises the design questions mentioned earlier when discussing the nature and limitations of deep learning: which modules should be solved with AI and which do not need it? How much data has to be labeled, and how? Should the model be a CNN, BERT, or another Transformer? Larger enterprises may already have discovered and distilled best practices for these questions, so that engineers do not have to rediscover them on their own.
A recent buzzword, MLOps, names a set of principles and practices that address exactly these problems: automating the workflow, tracking the many models and hyperparameters along with the results and process of every training run, debugging, testing, monitoring and visualization, continuous integration, and putting a model online automatically after training.
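As one concrete flavor of this (a sketch assuming the MLflow tracking library; MLOps itself is a set of practices, not a single tool), every training run records its hyperparameters and metrics so that results can be compared, reproduced, and promoted to production deliberately rather than by memory:

```python
# A sketch of experiment tracking, one ingredient of MLOps.
import mlflow

params = {"model": "bert-base", "lr": 3e-5, "batch_size": 32}   # hypothetical run configuration

with mlflow.start_run(run_name="sentiment-baseline"):
    mlflow.log_params(params)                                   # record the hyperparameters
    for epoch in range(3):
        # train(...) would go here; the metric values below are placeholders.
        mlflow.log_metric("val_accuracy", 0.80 + 0.05 * epoch, step=epoch)
    # A promotion gate here is where "go online automatically after training" would plug in.
```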
Once all of this is standardized, everything from data preparation and model development to testing, monitoring, and resource management can be done on a single platform. When algorithm scientists or business staff work on such a platform, they consume the least computing power, the process is disciplined, efficiency is at its highest, and labor costs drop sharply.
We believe that, following this trend from framework standardization to platform and workflow standardization, we will soon be able to offer enterprise customers AI products the way databases are offered: a traditional enterprise only needs to understand the interface to obtain AI capabilities through a very standardized operation.
History repeats itself. Over several decades, databases grew into a very large industry, and combined with cloud computing they are now creating new business opportunities. The path databases have traveled is very instructive for AI industrialization.
In the early days there were no databases; every information system had to be built separately on top of the file system. Later, people found that different information systems share common structured data and a fixed set of create, read, update, and delete operations that can be described by relational algebra, and so the relational database was born.
If every company in a vertical industry built its own database, the R&D cost would be high and the result unsatisfactory, so companies specializing in databases gradually appeared.
In every vertical industry there are similar businesses that need to build various information systems. Where do these systems come from? There are two channels: companies that specialize in databases, and integrators that combine databases with users' specific needs in areas such as finance, HR, and supply chain. These intermediate integrator companies emerged later.
There are large companies specializing in databases, such as Oracle, IBM, and Microsoft. Databases are now moving toward cloud-native data warehouses, and opportunities for standardization have also opened up for the intermediate integrators. Previously, software was sold to each enterprise under a license, often with customization, and user stickiness was weak. Now that customers' businesses run on the cloud, standardized products and services can be delivered there, and relatively large SaaS companies doing vertical application integration have emerged.
Given the three layers that the database industry developed into, will AI also divide into three such layers? In the first few years the most active layer was the AI algorithm providers, very much like the database world's system integrators before SaaS appeared: low barriers to entry and not enough standardization. Can standardization, and an AI-as-a-service model, one day be realized here as well?
Previously, the whole industry neglected the lowest layer, standardized infrastructure. But with the intermediate algorithm-provider model running into obstacles and standardization advancing, this layer has attracted more and more attention over the past year, and the shape of the AI industry is becoming clearer and clearer.
Finally, I would like to summarize my views:
- The essence of deep learning is that it provides a black-box way of modeling. It is precisely this black-box nature that lets it be standardized and applied anywhere; an interpretable white-box model is tied to a specific domain, so its degree of standardization is lower.
- Compared with traditional machine learning, deep learning's biggest advance is standardization, and standardization creates the opportunity for industrialization. Its limitation is that it must be given a suitable hypothesis space in advance; otherwise we end up like the person who drops the key in the dark but searches for it under the street lamp.
- From the Software 2.0 perspective of programming with data, AI is a technological revolution, not a bubble.
- The AI industry is likely to evolve toward the same layered, specialized division of labor as the database industry. We therefore believe there is a market opportunity in standardized infrastructure, which was overlooked in previous years when algorithm integration was all the rage.
- The opportunities for AI industrialization lie in standardized infrastructure and the cloud-native trend.
Live playback link: https://www.bilibili.com/video/BV1eL4y1a7SV/
Welcome to visit OneFlow on GitHub and follow us on Twitter and LinkedIn.
Also, welcome to join our Discord group to discuss and ask OneFlow related questions, and connect with OneFlow contributors and users all around the world.