Did you ever feel you’ve had enough of your current line of work and wanted to shift gears? If you have, you’re definitely not alone. Besides taking part in the Great Resignation, however, there are also less radical approaches, like the one Andrew Ng is taking.
Ng, among the most prominent figures in AI, is founder of LandingAI and DeepLearning.AI, co-chairman and cofounder of Coursera, and adjunct professor at Stanford University. He was also chief scientist at Baidu and a founder of the Google Brain Project. Yet, his current priority has shifted, from “bits to things,” as he puts it.
In 2017, Andrew Ng founded Landing AI, a startup working to facilitate the adoption of AI in manufacturing. This effort has helped shape Ng’s perception of what it takes to get AI to work beyond big tech.
We connected with Ng to discuss what he calls the “data-centric approach” to AI, and how it relates to his work with Landing AI and the big picture of AI today.
From bits to things
Ng explained that his motivation is industry-oriented. He considers manufacturing “one of those great industries that has a huge impact on everyone’s lives, but is so invisible to many of us.” Many countries, the U.S. included, have lamented manufacturing’s decline. Ng wanted “to take AI technology that has transformed internet businesses and use it to help people working in manufacturing.”
This is a growing trend: According to a 2021 survey from The Manufacturer, 65% of leaders in the manufacturing sector are working to pilot AI. Implementation in warehouses alone is expected to hit a 57.2% compound annual growth rate over the next five years.
While AI is being increasingly applied in manufacturing, going from bits to things has turned out to be much harder than Ng expected. When Landing AI started, Ng confessed, the company was focused mostly on consulting work.
But after working on many customer projects, Ng and Landing AI developed a new toolkit and playbook for making AI work in manufacturing and industrial automation. This led to Landing Lens, Landing AI’s platform, and the development of a data-centric approach to AI.
Landing Lens strives to make it fast and easy for customers in manufacturing and industrial automation to build and deploy visual inspection systems. Ng had to adapt his work in consumer software to target AI in the manufacturing sector. For example, AI-driven computer vision can help manufacturers with tasks such as identifying defects in production lines. But that is no easy task, he explained.
“In consumer software, you can build one monolithic AI system to serve a hundred million or a billion users, and truly get a lot of value in that way,” he said. “But in manufacturing, every plant makes something different. So every manufacturing plant needs a custom AI system that is trained on their data.”
The challenge that many companies in the AI world face, he continued, is how, for example, to help 10,000 manufacturing plants build 10,000 custom systems.
The data-centric approach advocates that AI has reached a point where data is more important than models. If AI is seen as a system with moving parts, it makes more sense to keep the models relatively fixed, while focusing on quality data to fine-tune the models, rather than continuing to push for marginal improvements in the models.
Ng is not alone in his thinking. Chris Ré, who leads the Hazy Research group at Stanford, is another advocate for the data-centric approach. Of course, the importance of data in itself is not new. There are well-established mathematical, algorithmic, and systems techniques for working with data, which have been developed over decades.
What is new, however, is building on and re-examining these techniques in light of modern AI models and methods. Just a few years ago, we did not have long-lived AI systems or the current breed of powerful deep models. Ng noted that the reactions he has gotten since he started talking about data-centric AI in March 2021 remind him of when he and others began discussing deep learning about 15 years ago.
“The reactions I’m getting today are some mix of ‘I’ve known this all along, there’s nothing new here,’ all the way to ‘this could never work,’” he said. “But then there are also some people who say, ‘Yes, I’ve been feeling like the industry needs this. This is a great direction.’”
Data-centric AI and foundation models
If data-centric AI is a great direction, how does it work in the real world? As Ng has noted, expecting organizations to train their own custom AI models is not realistic. The only way out of this dilemma is to build tools that empower customers to build their own models, engineer the data and express their domain knowledge.
Ng and Landing AI do that through Landing Lens, enabling domain experts to express their knowledge with data labeling. Ng pointed out that in manufacturing, there is often no big data to rely on. If the task is to identify faulty products, for example, then a reasonably good production line won’t have a lot of faulty product images to go by.
In manufacturing, sometimes only 50 images exist globally, Ng said. That’s hardly enough for most current AI models to learn from. This is why the focus needs to shift to empowering experts to document their knowledge via data engineering.
Landing AI’s platform does this, Ng said, by helping customers to find the most useful examples that create the most consistent possible labels and improve the quality of both the images and the labels fed into the learning algorithm.
The key here is “consistent.” What Ng and others before him have found is that expert knowledge is not singularly defined. What counts as a defect for one expert may be given the green light by another. Such disagreements can persist for years, only coming to light when the experts are forced to produce a consistently annotated dataset.
This is why, Ng said, you need good tools and workflows that help experts quickly identify where they agree. There’s no need to spend time where there is agreement. Instead, the goal is to focus on where the experts disagree, so they can hash out the definition of a defect. Consistency throughout the data turns out to be critical for getting an AI system to reach good performance quickly.
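To make the workflow concrete, the disagreement-surfacing step Ng describes can be sketched in a few lines. This is an illustrative example, not Landing AI’s implementation; the `find_disagreements` helper, the expert names, and the image ids are all hypothetical.

```python
# Hypothetical sketch: surface the images where expert labelers disagree,
# so review time is spent only on contested examples.
from collections import Counter

def find_disagreements(labels_by_expert):
    """labels_by_expert maps expert name -> {image_id: label}.
    Returns the image ids whose labels are not unanimous."""
    image_ids = set()
    for labels in labels_by_expert.values():
        image_ids.update(labels)
    contested = []
    for image_id in sorted(image_ids):
        votes = Counter(labels[image_id]
                        for labels in labels_by_expert.values()
                        if image_id in labels)
        if len(votes) > 1:  # more than one distinct label means disagreement
            contested.append(image_id)
    return contested

labels = {
    "expert_a": {"img_1": "defect", "img_2": "ok", "img_3": "defect"},
    "expert_b": {"img_1": "defect", "img_2": "defect", "img_3": "defect"},
}
print(find_disagreements(labels))  # ['img_2']
```

Only `img_2` would be routed back to the experts to hash out the definition of a defect; the unanimous images need no further review.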
This approach not only makes lots of sense, but also draws some parallels. The process that Ng described is clearly a departure from the “let’s throw more data at the problem” approach often taken by AI today, pointing more towards approaches based on curation, metadata, and semantic reconciliation. In other words, there is a move towards the type of knowledge-based, symbolic AI that preceded machine learning in the AI pendulum motion.
In fact, this is something that people like David Talbot, former machine translation lead at Google, have been saying for a while: applying domain knowledge, in addition to learning from data, makes lots of sense for machine translation. In the case of machine translation and natural language processing (NLP), that domain knowledge is linguistics.
We have now reached a point where we have so-called foundation models for NLP: humongous models like GPT3, trained on tons of data, that people can use to fine-tune for specific applications or domains. However, those NLP foundation models don’t really utilize domain knowledge.
What about foundation models for computer vision? Are they possible, and if yes, how and when can we get there, and what would that enable? Foundation models are a matter of both scale and convention, according to Ng. He thinks they will happen, as there are multiple research groups working on building foundation models for computer vision.
“It’s not that one day it’s not a foundation model, but the next day it is,” he explained. “In the case of NLP, we saw development of models, starting from the BERT model at Google, the transformer model, GPT2 and GPT3. It was a sequence of increasingly large models trained on more and more data that then led people to call some of these emerging models foundation models.”
Ng said he believes we will see something similar in computer vision. “Many people have been pre-training on ImageNet for many years now,” he said. “I think the gradual trend will be to pre-train on larger and larger data sets, increasingly on unlabeled datasets rather than just labeled datasets, and increasingly a little bit more on video rather than just images.”
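The small-data recipe that grows out of this pre-training trend is usually to keep the pre-trained backbone frozen and train only a small head on the handful of new labels. The sketch below is a toy illustration of that idea, not Landing AI’s method: the “backbone” here is a stand-in fixed feature function (a real system would use a network pre-trained on a large image corpus), and the dataset, features, and hyperparameters are invented for the example.

```python
# Toy illustration of the frozen-backbone, trainable-head recipe for
# small datasets. Everything here is a stand-in, not a real vision model.
import math

def backbone(image):
    # Stand-in for a frozen pre-trained feature extractor: maps a raw
    # "image" (a list of pixel values) to two fixed features.
    return [sum(image) / len(image), max(image) - min(image)]

def train_head(dataset, epochs=200, lr=0.5):
    # Logistic-regression head trained on top of the frozen features
    # with plain stochastic gradient descent on the log-loss.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, label in dataset:
            f = backbone(image)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1 / (1 + math.exp(-z))
            g = p - label  # gradient of log-loss with respect to z
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b

def predict(params, image):
    w, b = params
    f = backbone(image)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# A "small dataset": high-contrast images labeled defective (1).
data = [([0.1, 0.9, 0.1], 1), ([0.5, 0.5, 0.5], 0),
        ([0.0, 1.0, 0.2], 1), ([0.4, 0.6, 0.5], 0)]
params = train_head(data)
print([predict(params, img) for img, _ in data])  # [1, 0, 1, 0]
```

Because the backbone stays fixed, only a handful of head parameters need to be learned, which is why a few dozen labeled images can be enough, and why consistent labels on those few images matter so much.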
The next 10 years in AI
As a computer vision insider, Ng is very much aware of the steady progress being made in AI. He believes that at some point, the press and public will declare a computer vision model to be a foundation model. Predicting exactly when that will happen, however, is a different story. How will we get there? Well, it’s complicated.
For applications where you have a lot of data, such as NLP, the amount of domain knowledge injected into the system has gone down over time. In the early days of deep learning – both computer vision and NLP – people would routinely train a small deep learning model and then combine it with more traditional domain knowledge base approaches, Ng explained, because deep learning wasn’t working that well.
But as the models got bigger, fed with more data, less and less domain knowledge was injected. According to Ng, people tended to have a learning algorithm view of a huge amount of data, which is why machine translation eventually demonstrated that end-to-end purity of learning approaches could work quite well. But that only applies to problems with lots of data to learn from.
When you have relatively small data sets, then domain knowledge does become important. Ng considers AI systems as providing two sources of knowledge – from the data and from the human experience. When we have a lot of data, the AI will rely more on data and less on human knowledge.
However, where there’s very little data, such as in manufacturing, you need to rely heavily on human knowledge, Ng added. The technical approach then has to be about building tools that let experts express the knowledge that is in their brains.
That seemed to point towards approaches such as Robust AI, Hybrid AI or Neuro-Symbolic AI and technologies such as knowledge graphs to express domain knowledge. However, while Ng said he is aware of those and finds them interesting, Landing AI is not working with them.
Ng also finds so-called multimodal AI, or combining different forms of inputs, such as text and images, to be promising. Over the last decade, the focus was on building and perfecting algorithms for a single modality. Now that the AI community is much bigger, and progress has been made, he agreed, it makes sense to pursue this direction.
While Ng was among the first to utilize GPUs for machine learning, these days he is less focused on the hardware side. While it’s a good thing to have a burgeoning AI chip ecosystem, with incumbents like Nvidia, AMD and Intel as well as upstarts with novel architectures, it’s not the be-all and end-all either.
“If someone can get us ten times more computation, we’ll find a way to use it,” he said. “There are also many applications where the dataset sizes are small. So there, you still want to process those 50 images faster, but the compute requirements are actually quite different.”
Much of the focus of AI over the last decade was on big data – that is, let’s take giant data sets and train even bigger neural networks on them. This is something Ng himself has helped promote. But while there’s still progress to be made in big models and big data, Ng now says he thinks that AI’s attention needs to shift towards small data and data-centric AI.
“Ten years ago, I underestimated the amount of work that would be needed to flesh out deep learning, and I think a lot of people today are underestimating the amount of work, innovation, creativity and tools that will be needed to flesh out data-centric AI to its full potential,” Ng said. “But as we collectively make progress on this over the next few years, I think it will enable many more AI applications, and I’m very excited about that.”