Collection: The computing architecture of device-cloud collaborative graph neural network learning over distributed environments as well as its applications

2022-11-15



The team conducts systematic research on device-cloud collaborative learning over large-scale graph data (which in general consists of billions of nodes and millions of edges). Specifically, the team is the first to establish a collaborative learning mechanism for cloud and edge modeling, together with the architectures that enable it. The research promotes the co-evolution of learning models between cloud and edge and makes efficient use of computing resources on both sides.

The device-cloud collaborative computing platform for graph neural networks over large-scale data

The research has overcome three bottlenecks: dynamic learning of graph neural networks over large-scale data, device-cloud collaborative learning across heterogeneous systems, and a lightweight deep learning framework for on-device inference. It proposes efficient algorithms for subgraph segmentation, subgraph storage and node sampling; enables large-scale graph neural network learning and inference through the co-design of software and hardware; and supports the implementation of device-cloud collaborative learning. The research is the first systematic implementation of edge-cloud collaborative AI.
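To make the ideas of subgraph segmentation and node sampling more concrete, here is a minimal Python sketch. It is an illustration only and does not reflect AliGraph's actual algorithms or API: the function names (`partition_by_hash`, `sample_neighbors`, `sample_k_hop`), the hash-by-source-node partitioning rule, and the uniform fanout sampling are all assumptions chosen for simplicity.

```python
import random
from collections import defaultdict


def partition_by_hash(edges, num_parts):
    """Store each edge in the partition that owns its source node (src % num_parts)."""
    parts = [defaultdict(list) for _ in range(num_parts)]
    for src, dst in edges:
        parts[src % num_parts][src].append(dst)
    return parts


def sample_neighbors(parts, node, fanout, rng):
    """Uniformly sample up to `fanout` out-neighbors of `node` from its owning partition."""
    neighbors = parts[node % len(parts)].get(node, [])
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)


def sample_k_hop(parts, seeds, fanouts, rng):
    """Return one list of nodes per hop: the seeds first, then each sampled frontier."""
    layers = [list(seeds)]
    for fanout in fanouts:
        frontier = []
        for node in layers[-1]:
            frontier.extend(sample_neighbors(parts, node, fanout, rng))
        layers.append(frontier)
    return layers


# Toy directed graph with integer node ids (hypothetical data).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 0)]
parts = partition_by_hash(edges, num_parts=2)
layers = sample_k_hop(parts, seeds=[0], fanouts=[2, 1], rng=random.Random(0))
print(layers)
```

In a real distributed setting each partition would live on a different server and `sample_neighbors` would become a remote call; the sketch keeps everything in one process so the control flow is easy to follow.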

The research has built a universal infrastructure consisting of the following components:
1) Large-scale graph neural network platform (AliGraph): AliGraph consists of distributed graph storage, optimized sampling operators, and a runtime that efficiently supports not only existing popular GNNs but also a series of in-house models developed for different scenarios.
2) Lightweight deep learning framework for on-device inference (Walle): Walle is a highly efficient and lightweight deep learning framework that supports inference on-device. The core goal behind MNN, the inference engine underlying Walle, is to boost the performance of diverse ML tasks on the heterogeneous hardware backends of mobile devices and cloud servers.
3) Device-cloud collaboration architecture (luoxi): luoxi consists of a deployment platform, which distributes ML tasks to billion-scale mobile devices in a timely manner; a compute container, which provides a cross-platform, high-performance execution environment while facilitating daily task iteration; and a data pipeline, which supports a more natural and reasonable data flow between mobile devices and the cloud.

The research received the 2021 Science and Technology Progress Award (First Prize) of the Chinese Institute of Electronics and the Super Artificial Intelligence Leader (SAIL) Award at the World Artificial Intelligence Conference 2020.

[Figure: Chain technology framework supporting the co-evolution of device and cloud models]

Scenarios empowered by device-cloud collaborative computing

The device-cloud collaborative computing platform empowers many scenarios, such as the digital economy, industry and judicial adjudication.

GNN over Aliyun serves more than 100 enterprises and institutions and is invoked more than 100 billion times a day on average, promoting the intelligent upgrading of the digital economy by combining cloud services with device intelligence. The lightweight on-device inference serves transportation departments (such as the highway administration), large enterprises (such as steel and coal companies), and small and medium-sized enterprises and merchants, moving intelligence closer to the front line in production safety monitoring, product quality analysis, and the management of logistics and people flow. The project also provides core technical support, such as graph neural networks, to the Zhejiang Provincial High People's Court, covering 45 civil, criminal and administrative causes of action and serving 26 pilot courts across the province. The rate of verdicts pronounced in court has exceeded 90%, and trials take over 50% less time than traditional ones, which provides technical support for the shift to an intelligent adjudication mode of "instant action and instant trial".

Technologies enabled by device-cloud collaborative computing

The research has built the first industrial-grade, end-to-end, general-purpose machine learning infrastructure for device-cloud collaborative computing, and has open-sourced its computing framework and algorithm code. Leading the field of device-cloud collaborative applications, it has been adopted by more than 100 enterprises and institutions in government affairs, public security, finance and insurance, the Internet and industry, and it enables core scenarios such as vision, recommendation, touch and voice. It meets the real-time online demand of more than 300 million active users, performing end-to-end inference 150 billion times a day (up to 223.5 billion times at peak), and reshapes business models such as the online economy and intelligent justice by technical means. Over three years, sales revenue increased by 6.857 billion yuan and profit increased by 2.111 billion yuan.
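For a sense of scale, the daily figures quoted above can be converted into per-second and per-user rates. The short Python calculation below is only a back-of-the-envelope sketch: it assumes load is spread evenly over a 24-hour day and treats "more than 300 million" active users as exactly 300 million, so the per-user figure is an upper bound on the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the text; the user count is a lower bound ("more than").
avg_calls_per_day = 150e9
peak_calls_per_day = 223.5e9
active_users = 300e6

# Assumption: requests are spread uniformly across the day.
avg_per_second = avg_calls_per_day / SECONDS_PER_DAY
peak_per_second = peak_calls_per_day / SECONDS_PER_DAY
calls_per_user = avg_calls_per_day / active_users

print(f"average day: {avg_per_second:,.0f} inferences per second")
print(f"peak day:    {peak_per_second:,.0f} inferences per second")
print(f"per user:    at most {calls_per_user:,.0f} inferences per day on average")
```

Under these assumptions the platform handles roughly 1.7 million inferences per second on an average day and about 2.6 million per second on the peak day, or at most about 500 inferences per active user per day.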

[Figure: Project research results empower intelligent justice]

Integrating Science and Education to Cultivate Artificial Intelligence Innovative Talents

The project's open-source platform has become part of Wise Ocean, "a new generation of AI science and education innovation open platform". Through the construction of a large-scale open AI science and education platform and its ecosystem community, it has promoted the effective connection of the talent chain, industrial chain and innovation chain. It has also taken part in the AI+X micro-major organized under "China's five initiatives, six school alliances, and enterprise participation", which teaches AI knowledge to students from non-computer disciplines.

[Figure: Wise Ocean, a new-generation AI science and education platform]

[Figure: AI+X Major of Five Initiatives from Alliances of Six Universities with Participation of Enterprises]

[Figure: The architecture of Wise Ocean]