The Impact of IC Chips with Artificial Intelligence

- Slides: 24
The Impact of IC Chips with Artificial Intelligence on Telecommunication Infrastructure
Vimicro AI Chip Technology Corporation
Yinong Zhang, Ph.D.
Apr. 25th, 2018
PART ONE AI 2.0 and Telecommunication
AI 2.0
Definition from the Chinese Academy of Engineering: AI 2.0 refers to artificial intelligence in the era of big data and ubiquitous networks, and to artificial intelligence that integrates data-driven and knowledge-guided techniques.
• AI 2.0 changes computing itself, transforms big data into knowledge, and supports better human decision-making.
• We are still at a very early stage of AI 2.0.
Four Key Factors in Promoting AI
• Algorithm: The breakthrough in deep learning algorithms has led to the prosperity of the current generation of AI.
• Business: Identifying appropriate usage scenarios with business value determines whether an AI product succeeds.
• Data: Massive amounts of extracted, simulated, or multidimensional data are used to train AI models for later real-time inference.
• Chip: Deep learning places higher demands on parallel computing and data exchange. IC chips are instrumental in providing the appropriate processing power.
What is the killer application?
The Market Scale for AI, AI Chips, and China Mobile Internet
[Chart: Global AI Chip Market Scale ($100M)]
[Chart: Global AI Market Scale ($100M)]
• System vs. chip ratio: 10:1
• Chip grows faster
[Chart: Data Rate and Spectrum Efficiency]
[Chart: China Mobile Internet User Scale (million)]
• In China, each user contributes $50 to AI
• 5G provides steam for a new growth engine
Challenges to Network Architecture after Introducing AI
Real-time training data derived from the mobile network may contain private user information and carries the risk of privacy leakage.
The cost of data transmission is high. If unprocessed data are uploaded for cloud data analysis, the network could be frequently congested.
The amount of computing on the cloud side is far too large if data from edge devices are aggregated without pre-processing. This consumes a lot of computing resources and becomes the bottleneck of data processing.
The mobile network is sensitive to round-trip delay, which makes cloud data processing for real-time (<10 s) training and inference unrealistic.
• Enhance security checking on edge devices, and plan for it up front.
• Push AI to edge devices (base stations, terminals), with a reasonable amount of AI in edge devices and an extensive amount of AI in cloud servers.
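The edge/cloud split described on this slide can be illustrated with a minimal sketch. It assumes a simple rule: a task runs on the edge whenever the cloud round trip plus cloud compute time would miss the task's latency budget. The `Task` fields, the function names, and all numbers are hypothetical and serve only to make the trade-off concrete.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    latency_budget_ms: float  # hypothetical deadline for the result
    raw_bytes: int            # size of the unprocessed data
    feature_bytes: int        # size after edge pre-processing


def place(task: Task, cloud_rtt_ms: float, cloud_compute_ms: float) -> str:
    """Return 'edge' when the cloud path would miss the deadline, else 'cloud'."""
    if cloud_rtt_ms + cloud_compute_ms > task.latency_budget_ms:
        return "edge"
    return "cloud"


def upload_saving(task: Task) -> float:
    """Fraction of uplink traffic avoided by pre-processing on the edge."""
    return 1.0 - task.feature_bytes / task.raw_bytes


# Illustrative values only.
beam_task = Task("beam-selection", latency_budget_ms=10,
                 raw_bytes=1_000_000, feature_bytes=10_000)
report_task = Task("daily-report", latency_budget_ms=60_000,
                   raw_bytes=1_000_000, feature_bytes=10_000)

beam_placement = place(beam_task, cloud_rtt_ms=40, cloud_compute_ms=5)
report_placement = place(report_task, cloud_rtt_ms=40, cloud_compute_ms=5)
saving = upload_saving(beam_task)
```

Under these assumed numbers the latency-critical task stays on the edge, the batch task goes to the cloud, and pre-processing removes most of the uplink traffic in both cases.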
Applications of AI in Telecommunication Network Infrastructure Operation & Maintenance • Network health analysis and early warning • Network alarm root cause analysis • Network fault debug and fix • Network fault prediction • …… Network Optimization • Traffic regulation and congestion prevention • Parameter management and RF automation Service Assurance • Intelligent collection and monitoring of operation data • Network energy saving • Environment-aware usage experience improvement • Mobile network coverage and capacity optimization • Intelligent service recommendation • Mobile network load balancing • QoS assurance for different types of requests • 5G massive MIMO automatic configuration • …… Network Management Network Service • …… Usage Analysis Operational Deployment Management • Hot spot analysis and prediction • IP network management • Optimal deployment of network services • User mobility pattern analysis • Intelligent network software deployment • Server architecture and optimal wiring deployment • User network NPS prediction • …… • Smart forward pass management and combination • …… • Intelligent network time slicing management • Intention based closed loop network control • ……
Examples of Better Telecommunication through AI
1. OFDM Channel Estimation & Signal Detection
• Trains the model using data generated from simulation based on channel statistics
• Estimates CSI implicitly and recovers the transmitted symbols directly
2. Mobile Network Compatibility Improvement
• Trains the model using compatibility and interoperability workarounds among baseband chips, cell phones, protocol stacks, base stations, and carriers
• Selects the best strategy to avoid compatibility and interoperability issues and keep links connected
3. Time Slice Management and Optimization
• Trains the model using the distribution of user-requested content and where and when the network is congested
• Optimizes the network slice assigned to a user based on the user's mobility and the predicted content to be requested
4. Personalized Mobile Edge Caching
• Edge caching and edge computing nodes have automated content distribution and traffic scheduling abilities. Pushing content to the edge nodes closer to the user reduces repeated transmission of redundant data in the network, reduces response time, etc.
Key: How to acquire data for training?
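To make the edge caching example concrete, here is a minimal sketch of an edge node that keeps recently requested content close to the user. It uses a plain least-recently-used (LRU) policy as a stand-in; the slide describes a learned, personalized policy, which this sketch does not implement. The class name `EdgeCache` and the `fetch_from_origin` callback are hypothetical.

```python
from collections import OrderedDict


class EdgeCache:
    """Tiny LRU cache standing in for an edge caching node."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, content_id, fetch_from_origin):
        if content_id in self.items:
            # Served from the edge: no repeated transfer across the network.
            self.items.move_to_end(content_id)
            self.hits += 1
            return self.items[content_id]
        # Miss: fetch once from the origin, then keep a copy at the edge.
        self.misses += 1
        data = fetch_from_origin(content_id)
        self.items[content_id] = data
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item
        return data


def origin(content_id):
    # Hypothetical origin server lookup.
    return f"payload-{content_id}"


cache = EdgeCache(capacity=2)
for requested in ["a", "b", "a", "c", "b"]:
    cache.get(requested, origin)
```

A learned policy would replace the eviction rule with a prediction of what each user is likely to request next, which is where the training data question on the slide comes in.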
Examples of Better Services through AI
Customer Personas
• Classifies users with real data such as terminal type, chip type, transmission capacity, service type, buffer length, location, movement speed, motion trajectory, etc.
• Provides targeted services on demand, ensures a better user experience, and improves network utilization
Fraud Detection and Loophole Prevention
• Identifies telecom and network fraud based on deep learning
• Recognizes firmware/software vulnerabilities based on deep learning
Human Computer Interaction
• Converts human voice commands to machine-executable commands
• Generates text automatically from voice
• Mines data from the converted text
Risk: Attack the Network with Traditional Approaches
Edge Device: illegal access / invasion
• How to guarantee the authenticity of the original data?
Transmission: network eavesdropping
• How to guarantee the security of data transmission?
Cloud Server: network attack / invasion
• How to guarantee the fairness of the weight matrix?
• Attackers can even poison the training set and let the deep learning system learn a wrong model.
Safety is as important as AI itself.
Risk: Attack the Network with AI
Hackers or adversaries may also take advantage of deep learning, and sometimes may even take the lead. No general solution yet.
• In side-channel attacks, which leverage electromagnetic radiation to intercept cryptographic keys, deep learning has been used to achieve better results than regular template attacks.
  Action: integrate cryptographic computing into an SOC chip rather than using a standalone SE chip. Disturb the electromagnetic radiation with other irregular on-chip processing to fool the adversaries.
• In static code analysis, deep learning is used to identify vulnerabilities and automatically initiate attacks.
  Action: the same AI code for vulnerability analysis can be used to find weaknesses and reinforce them beforehand.
• While driving, an adversary can mislead the deep learning model into interpreting traffic signs as having other, harmful meanings.
  Action: the adversarial algorithm is only effective against AI-based object recognition. Additional pattern recognition algorithms can be used to vote for a better decision.
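The voting idea in the last action item can be sketched as a simple majority vote over independent recognizers. This is a minimal illustration, assuming each recognizer (one deep learning model plus additional pattern recognition algorithms) returns a label string; the function name `vote`, the `min_agreement` threshold, and the example labels are hypothetical.

```python
from collections import Counter


def vote(predictions, min_agreement=2):
    """Return the label most recognizers agree on, or None if agreement is too weak."""
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= min_agreement else None


# The deep learning model is fooled, but two conventional recognizers are not.
fooled_case = vote(["speed_limit_45", "stop", "stop"])

# No two recognizers agree, so the caller should fall back to a safe default.
undecided_case = vote(["stop", "yield", "speed_limit_45"])
```

Returning `None` on weak agreement leaves the fallback behavior to the caller instead of guessing.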
PART TWO AI 2.0 and IC Design
CPU, GPU, FPGA, or ASIC: A Lesson from Bitcoin
Emerging computing demands pose challenges to traditional processor architectures.
[Chart: Performance/cost ratio of CPU, GPU, and ASIC in Bitcoin mining; labeled Perf/Cost gains: 20X and 1600X]
[Chart annotations: 7 nm, 10 nm, 3 years; 10 process nodes in 5 years, much faster than the industry norm of 2 years per process node]
• In the field of Bitcoin mining, as long as there is real demand for computing capability and power saving, it is just a matter of time before ASICs appear on the horizon.
• It took Bitcoin equipment only three years to walk through the CPU → GPU → FPGA → ASIC path. The intensity of the enormous demand accelerates the evolution.
What is the curve for AI deployment?
Data Driven Computing and Moore’s Law
CPU frequency / performance growth is close to the limit
• CPU instructions-per-clock (IPC) improvement has been very limited over the last 20 years, less than 2X. The cost of improvement, e.g. adding cache, branch prediction, ROB, data prediction, and multiple cores, is very high.
• On the other hand, data-driven applications still have much headroom. It is no coincidence that academia has shifted its research focus to memory architecture in recent years.
[Chart: CPU performance growth rate in the past few years is much slower than that of the Bitcoin chip]
[Chart: Hot areas of ISCA paper discussion since 1991]
“Data Driven Moore”
• Deep learning has inherent parallelism, capable of hundreds to thousands of MAC operations per clock cycle, far more than the dynamic parallelism seen by most multicore / superscalar / VLIW CPUs.
• Also, unlike MIMD and SIMD, a programmer does not need to intervene deeply in detailed coding.
• Brain-inspired chips may have several orders of magnitude more MACs.
• Integrated flash
• Integrated Re-RAM or other alternatives
• Integrated HBM/HMC through TSV, silicon interposer, and SIP
Typical Deep Learning Chips
Cloud side (representative cloud AI chip: TPU from Google): 64K MACs, floating point, 75 W, 20 DDR chips
Edge side (representative edge AI chip: Starlight from Vimicro AI): 512 to 4K MACs, fixed point, 2 W, 1 to 2 DDR chips
• The main computing power comes from the MAC matrix.
• MAC matrix, systolic array, SIMD, cluster, heterogeneous processing and communication engines, and so on all refer to essentially similar parallel processing paradigms.
• Architecture-wise, the parallelism is expressed by the deep learning algorithm itself.
• A good architecture and a poor architecture may differ several times in performance, but not dozens of times.
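As a rough illustration of why the MAC matrix dominates computing power, the sketch below models a matrix multiplication on a hypothetical `rows` x `cols` MAC array. It assumes an output-stationary schedule in which every MAC in the array fires once per cycle; real chips differ in dataflow and pipelining, so the cycle count is only a teaching model and does not describe any specific product.

```python
def matmul_on_mac_array(a, b, rows, cols):
    """Multiply a (m x k) by b (k x n) tile by tile on a rows x cols MAC array.

    Returns the result, the modeled cycle count, and the MAC utilization.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    cycles = 0
    for i0 in range(0, m, rows):          # one output tile per array pass
        for j0 in range(0, n, cols):
            for step in range(k):         # one cycle per accumulation step
                cycles += 1
                for i in range(i0, min(i0 + rows, m)):
                    for j in range(j0, min(j0 + cols, n)):
                        out[i][j] += a[i][step] * b[step][j]
    useful_macs = m * n * k
    utilization = useful_macs / (cycles * rows * cols)
    return out, cycles, utilization


a = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
b = [[1, 0], [0, 1]]           # 2 x 2 identity
result, cycles, utilization = matmul_on_mac_array(a, b, rows=2, cols=2)
```

In this small case the last tile only fills half of the array, which is why utilization falls below 1.0 even though the hardware is busy on every cycle.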
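Chip Vendors and IP Vendors in Deep Learning Era
Leading AI chip company • Acquired Movidius for $400 million • Cloud AI servers • Released Starlight Intelligence No. 1 (星光智能一号) in 2016 • Huawei Mate 10 • Face ID computation • Alibaba DAMO Academy
Well Known Chip Vendors
“DianNao” series chips, DianNaoYu (电脑语) instruction set
Startup Chip Vendors
IP Vendors
Ecosystem focused • Compression algorithms, FPGA • Heterogeneous intelligence (异构智能) • Alibaba investment • AI with DSP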
Edge AI Chip: the Main Focus of Deep Learning IC Design
• Edge inference
• Extracts structured data through deep learning, and enables smart indexing/search on servers
• So far, the spec for a general-purpose deep learning module can be heavily leveraged as the spec for a telecom domain-specific deep learning module
Deep Learning Architectural Focus
• The AI module in an SOC conducts convolution-centric computing
• Other blocks of the SOC typically use dedicated hardware acceleration or a flexible DSP for application-specific computing
• Improve the overall utilization of the MAC array, rather than boasting about peak performance
• Reduce data exchange between the chip and external DDR to save power
SOC Integration
• A big enough shipping volume is mandatory to get a good wafer price and to amortize the cost of tape-outs and IP
• Limited by the availability of analog IP, the 7 nm and 10 nm processes are not realistic yet
Emphasis on Performance vs. Power Ratio
• The higher the TOPS/W, the better
• Power saving also benefits heat dissipation, which can affect the lifespan of a chip
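The relationship between peak performance, MAC utilization, and TOPS/W can be worked through with a few lines of arithmetic. The sketch below counts each MAC as two operations (one multiply, one add), a common convention. The 4K MAC count and 2 W power follow the edge-side figures given earlier in this deck, while the 500 MHz clock and 50% utilization are purely assumed values for illustration.

```python
def peak_tops(num_macs: int, clock_hz: float) -> float:
    """Peak tera-operations per second, counting each MAC as two operations."""
    return 2 * num_macs * clock_hz / 1e12


def effective_tops_per_watt(num_macs: int, clock_hz: float,
                            utilization: float, power_w: float) -> float:
    """Delivered TOPS per watt once MAC array utilization is taken into account."""
    return peak_tops(num_macs, clock_hz) * utilization / power_w


# Assumed clock (500 MHz) and utilization (0.5); not vendor figures.
edge_peak = peak_tops(4096, 500e6)
edge_efficiency = effective_tops_per_watt(4096, 500e6, utilization=0.5, power_w=2.0)
```

With these assumptions the array peaks at about 4.1 TOPS but delivers roughly 1.0 TOPS/W, which illustrates why the slide stresses utilization over headline peak numbers.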
Other Architectural Considerations in Edge AI Chip
[Diagram labels: Memory Architecture, Security Element, Future Flexibility, Tool Chain]
PART THREE Analogies between AI Implementation in Telecom Domain and Surveillance Domain
SVAC: Surveillance Video Audio Codec
《Technical specifications for surveillance video and audio coding for public security》
Application areas: Crime Crack Down, Municipal Service, Public Security, Peace & Order Protection, Society Management
[Diagram labels: Standard, Ecosystem, Path for Evolution]
• SVAC 1.0: published in 2010, similar compression ratio to H.264, with security and structured data support
• SVAC 2.0: published in 2017, similar compression ratio to H.265/HEVC, enforced deployment
• SVAC 3.0: expected to be published in 2022; integrates deep learning into compression seamlessly to further improve the compression ratio
• SVAC is also seeking to become an ITU standard
• Calls for speeding up AI-related standards to facilitate AI penetration
• In China, there is a consensus that surveillance is among the first wave of adopters for AI 2.0 deployment. The experience accumulated from SVAC promotion, in particular the experience with structured data extraction, cryptographic integration, and the open platform, can be leveraged in promoting AI in telecom.
• Sometimes, these "other" considerations in standard formulation are as important as the AI algorithms themselves.
SVAC Feature: Structured Data Description Channel
[Diagram: SVAC Compression Flow. Labels: Video & Audio Data; AI Analysis (Face Recognition, Vehicle Recognition, Trajectory Recognition, ...); Extracted Surveillance Data; SVAC Codec; Encryption; Authentication; SVAC Video Stream (Video, Audio, Extracted Data)]
[Diagram: SVAC Smart Camera inputs. Labels: RFID Sensor, MAC Address Perception, Access Card Perception, IOT Sensor, Identity Card Perception, Boundary Crossing Sensor]
Perceived Info
• ID: ID card, access card, bank card, RFID
• Behavior: boundary crossing, movement, speed, data throughput
• Environment: atmosphere, soil, wind, rain, snow, haze
AI Inferred Info
• People: face, clothes, texture, bag, glasses, height, direction
• Vehicle: license plate, model, color, sign, speed, direction
• Behavior: invasion, wrong direction, boundary crossing, wandering, moving, traffic rule violation, unusually large head count
• SVAC smart cameras use AI models to infer the identity, characteristics, and behaviors of people, vehicles, and objects. The inference results, together with sensor-perceived data, are inserted into the video to be encoded, which yields a structured description of the video.
• Advantages of structured data: 1. enables text search, 2. interoperability, 3. expansion capability
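The text-search advantage of structured data can be illustrated with a small sketch. It assumes each video segment carries a record with sensor-perceived and AI-inferred fields; the `FrameDescription` class and all field names are hypothetical and do not reflect the actual SVAC bitstream syntax.

```python
from dataclasses import dataclass, field


@dataclass
class FrameDescription:
    camera_id: str
    timestamp: float
    perceived: dict = field(default_factory=dict)  # sensor-perceived info
    inferred: dict = field(default_factory=dict)   # AI-inferred info


def search(descriptions, **criteria):
    """Return the descriptions whose perceived or inferred fields match all criteria."""
    matches = []
    for description in descriptions:
        merged = {**description.perceived, **description.inferred}
        if all(merged.get(key) == value for key, value in criteria.items()):
            matches.append(description)
    return matches


records = [
    FrameDescription("cam-01", 1000.0,
                     perceived={"rfid": "tag-17"},
                     inferred={"vehicle_color": "red", "behavior": "wrong_direction"}),
    FrameDescription("cam-02", 1003.5,
                     perceived={"access_card": "card-9"},
                     inferred={"vehicle_color": "blue", "behavior": "moving"}),
]

red_vehicles = search(records, vehicle_color="red")
```

Because the search runs over the attached records, no video has to be decoded or re-analyzed to answer a query of this kind.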
SVAC Feature: Security Enforcement
Extensive Security Control
• ID Authentication: Ensure the identity and authenticity of all users and devices in the video system
• Video Integrity: Verify whether the video content has been tampered with
• Command Reliability: Verify whether a command comes from a trusted source and whether the content of the command has been tampered with
• End to End Encryption: Ensure that video cannot be stolen at any stage of transmission, storage, and downloading
[Diagram labels: SVAC Security Management Platform, Cryptographic Key Service System, Data Tampering Prevention, Data Encryption, SVAC Video Stream, Device Certification, Abstract Signature, Client Device, SVAC Smart Camera, Device Authentication, Reliable Video]
Data encryption and authentication are enforced in data transmission, storage, playback, downloading, replication, and any other steps.
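The integrity and command-reliability checks can be sketched with a keyed hash. The example below uses HMAC-SHA256 from the Python standard library as a stand-in; the deck does not say which algorithms SVAC mandates, so the algorithm choice, the shared key, and the function names are assumptions for illustration only.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Compute an authentication tag over a video chunk or a control command."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Check that the payload is unmodified and was tagged by a holder of the key."""
    return hmac.compare_digest(sign(key, payload), tag)


# Hypothetical key provisioned by a key service; never hard-code keys in practice.
device_key = b"demo-key-provisioned-out-of-band"

frame = b"encoded-video-chunk-0001"
frame_tag = sign(device_key, frame)

untouched_ok = verify(device_key, frame, frame_tag)
tampered_ok = verify(device_key, frame + b"x", frame_tag)
wrong_key_ok = verify(b"some-other-key", frame, frame_tag)
```

A production system would more likely use certificate-based digital signatures so that verifiers do not need to hold the signing key; the keyed hash here only shows the tamper-detection principle.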
SVAC Feature: Open Platform for the Entire Ecosystem
[Diagram layers: Chip, System Solution, PCBA, SVAC Product Series, Platform Equipment]
• Provide industry solutions and customized development support
• Provide chip specification and reference design RDK
• Provide user manual and reference design SDK
• Provide reference SDK for decoding, extended information parsing, and so on
• Provide user manual and reference design SDK for further development
Thank You