The remarkable performance achieved in a variety of application areas (natural language processing, computer vision, games, etc.) has led to the emergence of heterogeneous architectures to accelerate machine learning workloads. In parallel, production deployment, model complexity, and model diversity have pushed for higher-productivity systems, more powerful programming abstractions, software and system architectures, dedicated runtime systems and numerical libraries, and deployment and analysis tools. Deep learning models are generally memory- and compute-intensive, for both training and inference. Accelerating these operations has obvious advantages, first by reducing energy consumption (e.g. in data centers), and second by making these models usable on smaller devices at the edge of the Internet. In addition, while convolutional neural networks have motivated much of this effort, numerous applications and models involve a wider variety of operations, network architectures, and data processing. These applications and models continually challenge computer architecture, the system stack, and programming abstractions. The high level of interest in these areas calls for a dedicated forum to discuss emerging acceleration techniques and computation paradigms for machine learning algorithms, as well as the application of machine learning to the construction of such systems.
The workshop brings together researchers and practitioners working on computing systems for machine learning, and using machine learning to build better computing systems. It also reaches out to a wider community interested in this rapidly growing area, to raise awareness of the existing efforts, to foster collaboration and the free exchange of ideas.
This builds on the success of our previous events:
Topics of interest include (but are not limited to):
Novel ML systems: heterogeneous multi/many-core systems, GPUs and FPGAs;
Software ML acceleration: languages, primitives, libraries, compilers and frameworks;
Novel ML hardware accelerators and associated software;
Emerging semiconductor technologies with applications to ML hardware acceleration;
ML for the construction and tuning of systems;
Cloud and edge ML computing: hardware and software to accelerate training and inference;
Computing systems research addressing the privacy and security of ML-dominated systems.
Paper submission deadline: April 30, 2022 (extended from April 15)
Notification to authors: May 18, 2022 (extended from April 30)
Papers should be in double-column IEEE format and between 4 and 8 pages long, including references. Papers should be uploaded as PDF and should not be anonymized.
Submissions can be made at https://easychair.org/my/conference?conf=4thaccml.
Papers will be reviewed by the workshop's technical program committee according to criteria regarding a submission's quality, relevance to the workshop's topics, and, foremost, its potential to spark discussions about directions, insights, and solutions on the topics mentioned above. Research papers, case studies, and position papers are all welcome.
In particular, we encourage authors to keep the following options in mind when preparing submissions:
Tentative Research Ideas: Present your research idea early on to get feedback and enable collaborations.
Works-In-Progress: To facilitate the sharing of thought-provoking ideas and high-potential but preliminary research, authors are welcome to submit papers describing early-stage, in-progress, and/or exploratory work in order to elicit feedback, discover collaboration opportunities, and generally spark discussion.
Technical Director, IEEE & ST Fellow - System Research and Applications, STMicroelectronics
Title: Hybrid precision for ultra-tiny neural computing
The quest to conceive machine learning approaches that achieve efficient storage, minimal computation, maximum accuracy, cheap silicon area, and very low power consumption is just beginning. Challenges lie ahead for researchers trying to use low bit-depth neural networks, in terms of tools, algorithms, software, hardware, etc. Case studies encompassing anomaly detection and associated classification designs are complex tasks when neural networks are targeted at ultra-low-power devices for in-sensor, in-actuator, and microcontroller computing. Deeply Quantized Neural Networks (DQNNs) promise highly interesting advantages for these and other machine learning workloads. However, given the extended hyper-parameter space, the design and training of DQNNs is not a trivial task. Unfortunately, current off-the-shelf microcontrollers are not yet capable of fully exploiting their potential. The realization of custom energy-efficient hardware accelerators may sometimes represent a viable shortcut in terms of energy efficiency, especially in a rising field such as in-sensor neural computing. Hybrid neural network variants developed with experimental deep learning tools can achieve interesting accuracies compared to more traditional ML design approaches. This talk will discuss all of the above aspects with reference to the latest efforts of ST, including a) tools for the efficient deployment of DQNNs on microcontrollers for heterogeneous use cases, b) custom ultra-low-power hardware circuitry for real-time execution of hybrid neural networks with traditional CMOS technologies, implemented on field-programmable gate arrays, and c) the latest ST solutions for in-sensor deep learning computing. Part of the talk will include demos and code inspection to aid comprehension of the various topics within the allocated time slot.
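As a rough illustration of what low bit-depth quantization means in practice, the following is a generic sketch in Python (not ST's toolchain or the speaker's method) that simulates symmetric uniform quantization of a weight tensor at different bit-depths:

```python
import numpy as np

def quantize_uniform(w, bits):
    # Symmetric uniform quantization: map float weights onto
    # 2**bits evenly spaced levels around zero.
    levels = 2 ** bits
    scale = np.max(np.abs(w)) / (levels // 2)
    q = np.clip(np.round(w / scale), -(levels // 2), levels // 2 - 1)
    return q * scale  # dequantized values, for simulating accuracy impact

rng = np.random.default_rng(0)
w = rng.standard_normal(8).astype(np.float32)

w2 = quantize_uniform(w, 2)   # 4 levels: aggressive, DQNN-style
w8 = quantize_uniform(w, 8)   # 256 levels: near-lossless
```

At 2 bits only four representable values remain, which is one reason why the training and hyper-parameter choices for deeply quantized networks are non-trivial, and why "hybrid precision" (different bit-depths per layer) is attractive.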
Danilo Pau graduated in Electronic Engineering from Politecnico di Milano in 1992, having joined STMicroelectronics in 1991. He has worked on HDMAC and MPEG-2 video memory reduction, video coding, transcoding, embedded graphics, and computer vision. Currently, his work focuses on developing tools for deploying deep learning on tiny devices, with associated applications. Danilo has been an IEEE Fellow since 2019. He served as Industry Ambassador coordinator for IEEE Region 8 South Europe, was vice-chairman of the "Intelligent Cyber-Physical Systems" Task Force within IEEE CIS, and is a member of the Machine Learning, Deep Learning and AI in the CE (MDA) Technical Stream Committee of the IEEE Consumer Electronics Society (CESoc). With over 80 patents, 124 publications, 113 authored MPEG documents, and 50 invited talks/seminars at universities and conferences worldwide, Danilo's favorite activity remains mentoring undergraduate students, MSc engineers, and PhD students from various universities in Italy, the US, France, and India.
Nvidia Deep Learning Institute Certified Instructor and Ambassador, University of Debrecen
Title: NVIDIA AI and Data Science
Thanks to the technological advances of recent years, there is significant market, industrial, and societal demand for AI solutions. The variety and complexity of the application fields, including security, the need to process large amounts of high-quality data, and communication issues pose several challenges. Today, there is hardly a sector without demand for, or results from, the application of AI. Increasingly complex solutions have led to the advance of general GPU server-based computing and the emergence of IoT tools, target-specific hardware, and simulations. The application areas include widely available mobile devices and a growing number of target-specific devices such as cameras, robots, and manufacturing systems. Today, theoretical and technological solutions allow us to analyze data and efficiently implement simulation, data augmentation, synthesis, and real-time data processing, even in distributed or remote environments, to accurately describe the complete physical environment.
Dr. Laszlo Kovacs is a certified Instructor and Ambassador at the Nvidia Deep Learning Institute and a researcher and assistant professor at the Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen. He focuses on machine learning-based data-driven solutions from industry to healthcare, including advanced HPC and embedded AI solutions, in both education and research. An essential element of this is the Self-Driven Vehicle Research Laboratory, within which autonomous vehicles equipped with varying levels of self-driving functions, sensor networks, communications, and AI-based processing of sensor data for smart cities are studied and built at both real and model scale.
PhD researcher, KU Leuven
Title: Enabling Energy-Efficient Acceleration of Probabilistic Machine Learning
In real-world systems, deep learning models are often combined with other models and algorithms, e.g., particle filtering for autonomous driving, Monte Carlo tree search for reinforcement learning (as in AlphaGo), Bayesian denoising for speech recognition, etc. Such hybrid approaches attempt to manage real-world uncertainty and complexity, which the pure end-to-end learning approach does not yet handle efficiently. One such model used to augment DL is the probabilistic circuit (PC), which allows robust inference even with missing inputs (e.g., due to faulty or switched-off sensors), making it attractive for edge IoT applications. Unfortunately, PCs exhibit irregular graph-like computational patterns that are not suitable for DL accelerators designed for tensors. This talk will discuss these challenging computational patterns of PCs and the specialized processor and compiler stack we designed for the energy-efficient acceleration of this emerging workload.
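To make the graph-like structure concrete, here is a minimal, hypothetical probabilistic circuit over two binary variables (a generic sketch, not the speaker's processor or compiler stack). Sum nodes mix distributions, product nodes combine independent factors, and a missing input is marginalized simply by setting both of its indicator leaves to 1:

```python
def leaf(var, val, evidence):
    # Indicator leaf: 1 if the evidence agrees with `val`,
    # or if the variable is unobserved (marginalization).
    return 1.0 if evidence.get(var) in (val, None) else 0.0

def circuit(evidence):
    # Hypothetical circuit: a sum (mixture) of two product components,
    # P = 0.6 * P1(A) P1(B) + 0.4 * P2(A) P2(B), evaluated bottom-up.
    a1 = 0.8 * leaf('A', 1, evidence) + 0.2 * leaf('A', 0, evidence)
    b1 = 0.7 * leaf('B', 1, evidence) + 0.3 * leaf('B', 0, evidence)
    a2 = 0.1 * leaf('A', 1, evidence) + 0.9 * leaf('A', 0, evidence)
    b2 = 0.4 * leaf('B', 1, evidence) + 0.6 * leaf('B', 0, evidence)
    return 0.6 * a1 * b1 + 0.4 * a2 * b2

p_full = circuit({'A': 1, 'B': 1})   # joint probability, all sensors on
p_missing = circuit({'A': 1})        # B marginalized out, e.g. sensor off
```

Unlike a dense tensor workload, evaluation follows a sparse DAG of scalar sums and products with data-dependent fan-in, which is exactly the pattern that maps poorly onto tensor-oriented DL accelerators.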
Nimish Shah is a PhD researcher working under the guidance of Prof. Marian Verhelst at KU Leuven. His research focuses on parallel computer architectures and digital systems targeting challenging computation problems from machine learning, linear algebra, and graph analytics. He likes designing hardware/software co-optimized solutions involving novel scheduling techniques, customized hardware dataflow, specialized on-chip interconnection & memory hierarchy, low-precision arithmetic, and low-power VLSI techniques. Before his PhD, Nimish was working on GPU memory subsystems at Nvidia.
|Time||22nd June 2022|
|10:00 – 10:10||Welcome|
|10:10 – 11:00||Keynote talk: Hybrid precision for ultra-tiny neural computing (Danilo Pau, STMicroelectronics)|
|11:00 – 11:30||Coffee break|
|11:30 – 12:10||Invited talk 1: NVIDIA AI and Data Science (László Kovács, University of Debrecen)|
|12:10 – 12:30||Paper talk: QONNX: Representing Arbitrary-Precision Quantized Neural Networks (Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Muhizi, Matthew Trahms, Shih-Chieh Hsu and Javier Duarte)|
|12:30 – 12:50||Paper talk: Productive Reproducible Workflows for DNNs: A Case Study for Industrial Defect Detection (Perry Gibson and José Cano)|
|12:50 – 13:00||Short invited talk: Code Generation and Optimization for Deep-Learning Computations on GPUs via Multi-Dimensional Homomorphisms (Richard Schulze, Ari Rasch and Sergei Gorlatch)|
|13:00 – 14:00||Buffet lunch|
|14:00 – 14:50||Invited talk 2: Enabling Energy-Efficient Acceleration of Probabilistic Machine Learning (Nimish Shah, KU Leuven)|
|14:50 – 15:10||Paper talk: HW-Aware Initialization of DNN Auto-Tuning to Improve Exploration Time and Robustness (Dennis Rieber, Moritz Reiber, Oliver Bringmann and Holger Fröning)|
|15:10 – 15:30||Paper talk: Scaling MLPerf Inference vision benchmarks with Qualcomm Cloud AI 100 accelerators (Arjun Suresh, Gavin Simpson and Anton Lokhmotov)|
|15:30 – 16:00||Coffee break|
|16:00 – 16:20||Paper talk: SAMO: Optimised Mapping of Convolutional Neural Networks to Streaming Architectures (Alexander Montgomerie-Corcoran, Zhewen Yu and Christos-Savvas Bouganis)|
|16:20 – 16:40||Paper talk: Convolution Operators for Deep Learning Inference: Libraries or Automatic Generation? (Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Enrique S. Quintana-Orti and Francisco D. Igual)|
|17:00 – 17:20||Paper talk: ProGNNosis: A Data-driven Model to Predict GNN Computation Time Using Graph Metrics (Axel Wassington and Sergi Abadal)|
|17:00 – 17:05||Closing remarks|