ScaledML 2020 notes

26 – 27 Feb at the Computer History Museum in Mountain View

This was the 5th year of this conference. Matroid, who put it on, pick a useful mixture of academic and industry speakers, drawn from people working at the leading edge of getting machine learning products into early adoption. The most entertaining talk was from Josh Bloom, on applying ML in astrophysics. The most significant talks were from Jim Keller (Intel) on how Moore’s Law continues, and from Dennis Abts on his 14th chip design, Groq’s Tensor Streaming Processor.

http://scaledml.org/2020/

——

Oren Etzioni Allen Institute for AI 

Paper https://www.technologyreview.com/s/615264/artificial-intelligence-destroy-civilization-canaries-robot-overlords-take-over-world-ai/ 

———

Matroid  turning research into product

Look for definition and understanding of ‘adequate accuracy’ for the context of the problem

Understand the rate of change of adequate accuracy – predictor of time to develop and investment requirements. Army of annotators available?

Can do model compression and optimization to match low-capability edge AI chips.
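
One common form of model compression is post-training quantization of weights to INT8. The sketch below is an illustration only, not something shown in the talk: it assumes symmetric per-tensor quantization, and the function names and sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to the INT8 range [-127, 127].

    Hypothetical helper for illustration; real toolchains also handle
    per-channel scales, activations and calibration data.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to approximate float values."""
    return [v * scale for v in q]


# Made-up weights, chosen only to show the round trip.
weights = [0.8, -1.27, 0.031, 0.0, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight then needs 8 bits instead of 32, at the cost of an error of at most half a quantization step per weight.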

————-

Megan Kacholia VP engineering Google. https://ai.google/ 

https://www.blog.google/outreach-initiatives/accessibility/impaired-speech-recognition/

TensorFlow (TF) 2.1 released Jan 2020 https://github.com/tensorflow/tensorflow/releases

————–

Andrej Karpathy Tesla https://www.tesla.com/autopilotAI 

Aiming for full self driving. Using the fleet (of customers’ cars) for data gathering.

——————

Ilya Sutskever Open AI 

Dactyl robot hand manipulation Rubik’s cube https://openai.com/blog/learning-dexterity/

MuseNet for music. AI Dungeon game discussion at https://www.reddit.com/r/AIDungeon/

—————-

Wes McKinney Ursa Labs  forum for discussion https://discuss.ossdata.org/ 

Open-source support model. Apache Arrow. Looking for Swift and Julia developers.

—————-

Savin Goyal Netflix Framework for AI development https://metaflow.org/ (open source)

Sandbox, free, at AWS

———–

Panel discussion. Another framework: https://mlflow.org/ MLSys conference.

————–

Posters: Pure Storage; Logical Clocks (Sweden); Samsung IoT chip, no system design

———–

Ion Stoica UC Berkeley  https://rise.cs.berkeley.edu/ 

————

David Aronchick Microsoft Leads open source machine learning at Azure

https://www.davidaronchick.com/  david_aronchick@microsoft.com

Structured schemas required for ML Ops. Design and test discipline for both data and algorithms.

———–

Joshua Bloom UC Berkeley 

https://bids.berkeley.edu/events/physics-machine-learning-workshop

Towards Physics-informed ML Inference In Astrophysics 

Searching for Planet 9. One-hot encoding.
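
For reference, one-hot encoding turns each categorical label into a vector with a single 1 in the position of its class. A minimal sketch follows; the class names are hypothetical and are not taken from the talk.

```python
def one_hot(labels):
    """Return the sorted class list and one one-hot vector per label."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    vectors = []
    for label in labels:
        v = [0] * len(classes)
        v[index[label]] = 1  # exactly one position is set per label
        vectors.append(v)
    return classes, vectors


# Hypothetical object classes, for illustration only.
labels = ["star", "galaxy", "asteroid", "star"]
classes, encoded = one_hot(labels)
print(classes)
print(encoded)
```

Sorting the classes keeps the column order stable between runs, so the same label always maps to the same position.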

Physics informed deep learning https://arxiv.org/abs/1711.10561

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations:  Maziar Raissi, Paris Perdikaris, George Em Karniadakis

Reverse-Engineering Deep ReLU Networks https://arxiv.org/abs/1910.00744

David Rolnick, Konrad P. Kording

——————-

Matei Zaharia Databricks Scaling Machine Learning Development with MLflow 

More ML devtools. Reproducible runs. Auto-logging to support data versioning.

———

Jim Keller Intel 

Moore’s law continues  1000 scalars. Abstraction layers are critically important 

Extreme ultraviolet lithography is the next phase of chip manufacture.

EUV is a step function that enables 100x finer printing.

Once you have stable data and a stable platform, the platform can evolve from CPU to GPU to special purpose accelerator.

————–

Josh Romero Nvidia Scaling Deep Learning on the Summit Supercomputer 

Used the Horovod (Uber) framework for DL training. Needed hierarchical all-reduce.
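
The idea behind a hierarchical all-reduce is to sum gradients inside each node first, then across nodes, then broadcast the total back, so that only one participant per node talks over the slower inter-node network. The sketch below simulates that data flow in plain Python. It is an assumption-laden illustration, not Horovod's implementation or API, and the function name and gradient values are invented.

```python
def hierarchical_allreduce(nodes):
    """Simulate a sum all-reduce in three stages.

    nodes: list of nodes, each node a list of per-GPU gradient vectors.
    Returns the same nested shape, with every GPU holding the global sum.
    """
    # Stage 1: reduce within each node (fast intra-node links).
    node_sums = [[sum(vals) for vals in zip(*gpus)] for gpus in nodes]
    # Stage 2: all-reduce across one leader per node (slower network).
    total = [sum(vals) for vals in zip(*node_sums)]
    # Stage 3: broadcast the global sum back to every GPU in each node.
    return [[list(total) for _ in gpus] for gpus in nodes]


# Two hypothetical nodes with two GPUs each, two gradient values per GPU.
nodes = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
result = hierarchical_allreduce(nodes)
print(result)
```

In this simulation only the per-node sums cross the node boundary, which is the reason a hierarchical scheme helps when there are several GPUs per node.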

——–

Peter Mattson Google MLPerf: driving innovation by measuring performance 

Need benchmarks for training, inference, mobile. Hard to get contributors. MLCommons non-profit formed to encourage innovation. People’s Speech dataset aiming for 100k hours of transcribed speech by diverse speakers.

——–

Sean Lie Cerebras Wafer-Scale ML

——–

Dennis Abts  Ditching the ‘C’ in CPU: Groq’s Tensor Streaming Processor (TM)

https://research.google/people/author36240/

Dataflow in a superlane. 220 MiBytes shared SRAM.

Memory has an address and a direction. 1 TeraOp/sec/mm².

Deterministic instruction time. INT8, FP16.

14th chip.  No arbiter, no replay mechanism, no flow control in chip, no hardware interlocks – orchestrated by the compiler. 

Groq Announces World’s First Architecture Capable of 1,000,000,000,000,000 Operations per Second on a Single Chip

Groq-Rocks-NNs-Linley-Group-MPR-2020Jan06.pdf (Linley Group MPR, PDF)

Slides and video are now available at https://info.matroid.com/scaledml-media-archive-preview (Matroid ask for an email address in exchange for access).

———–