26 – 27 Feb at the Computer History Museum in Mountain View
This was the 5th year of this conference. Matroid, who put it on, picked a useful mixture of academic and industry speakers, drawn from people working at the leading edge of getting machine learning products into early adoption. The most entertaining talk was from Josh Bloom, on applying ML in astrophysics. The most significant talks were from Jim Keller (Intel), on how Moore’s Law continues, and from Dennis Abts, on his 14th chip design, Groq’s Tensor Streaming Processor.
——
Oren Etzioni Allen Institute for AI
———
Matroid turning research into product
Look for a definition and understanding of ‘adequate accuracy’ in the context of the problem.
Understand the rate of change of adequate accuracy: it predicts development time and investment requirements. Is an army of annotators available?
Model compression and optimization can fit models to low-capability edge AI chips.
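The notes do not say which compression technique was discussed. As one common example, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. The function names and the sample weights are illustrative assumptions, not anything shown in the talk.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to signed 8-bit codes."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the 8-bit codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
weights = [0.50, -1.27, 0.03, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale step. Whether that error is acceptable is exactly the ‘adequate accuracy’ question raised above.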
————-
Megan Kacholia VP engineering Google. https://ai.google/
https://www.blog.google/outreach-initiatives/accessibility/impaired-speech-recognition/
TensorFlow 2.1 released Jan 2020 https://github.com/tensorflow/tensorflow/releases
————–
Andrej Karpathy Tesla https://www.tesla.com/autopilotAI
Aiming for full self driving. Using the fleet (of customers’ cars) for data gathering.
——————
Ilya Sutskever OpenAI
Dactyl robot hand manipulating a Rubik’s Cube https://openai.com/blog/learning-dexterity/
MuseNet for music. AI Dungeon game discussion at https://www.reddit.com/r/AIDungeon/
—————-
Wes McKinney Ursa Labs forum for discussion https://discuss.ossdata.org/
Open source support model. Apache Arrow. Looking for Swift and Julia developers.
—————-
Savin Goyal Netflix. Metaflow, an open source framework for AI development https://metaflow.org/
A free sandbox is available on AWS.
———–
Panel discussion. Another framework: MLflow https://mlflow.org/ Also mentioned: the MLSys conference.
————–
Posters: Pure Storage; Logical Clocks (Sweden); Samsung IoT chip, no system design
———–
Ion Stoica UC Berkeley https://rise.cs.berkeley.edu/
————
David Aronchick Microsoft Leads open source machine learning at Azure
https://www.davidaronchick.com/ david_aronchick@microsoft.com
Structured schemas required for ML Ops. Design and test discipline for both data and algorithms.
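As a rough illustration of what a structured schema for training data can look like, here is a minimal sketch using only the Python standard library. The field names, types and ranges are hypothetical; the talk did not present this code, and a production pipeline would more likely use a dedicated validation tool.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical schema: field name -> (expected type, (low, high) bounds).
Bounds = Tuple[Optional[float], Optional[float]]
SCHEMA: Dict[str, Tuple[type, Bounds]] = {
    "age": (int, (0, 120)),
    "income": (float, (0.0, None)),
    "label": (int, (0, 1)),
}


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of schema violations; an empty list means the record is valid."""
    errors: List[str] = []
    for name, (expected_type, (low, high)) in SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
            continue
        if low is not None and value < low:
            errors.append(f"{name} below minimum {low}")
        if high is not None and value > high:
            errors.append(f"{name} above maximum {high}")
    for name in record:
        if name not in SCHEMA:
            errors.append(f"unexpected field: {name}")
    return errors


good = validate_record({"age": 34, "income": 52000.0, "label": 1})
bad = validate_record({"age": -3, "income": "high"})
```

Running the same check in tests and again at ingestion time is one way to apply design and test discipline to the data as well as the algorithms.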
———–
Joshua Bloom UC Berkeley
https://bids.berkeley.edu/events/physics-machine-learning-workshop
Towards Physics-informed ML Inference In Astrophysics
Searching for Planet 9. One-hot encoding.
Physics informed deep learning https://arxiv.org/abs/1711.10561
Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations: Maziar Raissi, Paris Perdikaris, George Em Karniadakis
Reverse-Engineering Deep ReLU Networks https://arxiv.org/abs/1910.00744
David Rolnick, Konrad P. Kording
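For readers unfamiliar with the one-hot encoding mentioned in the Planet 9 notes above, here is a minimal sketch in plain Python. The class labels are invented for illustration; the notes do not record what was actually being encoded in the talk.

```python
from typing import List, Sequence, Tuple


def one_hot(labels: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Encode each label as a vector with a single 1 at its category index."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for label in labels:
        row = [0] * len(categories)
        row[index[label]] = 1
        vectors.append(row)
    return categories, vectors


# Hypothetical class labels for detections in a sky survey.
labels = ["asteroid", "artifact", "candidate", "artifact"]
categories, encoded = one_hot(labels)
```

The encoding avoids implying an order between categories, which a single integer code per class would do.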
——————-
Matei Zaharia Databricks Scaling Machine Learning Development with MLflow
More ML devtools. Reproducible runs. Auto-logging to support data versioning.
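To make the idea of a reproducible, logged run concrete, here is a minimal standard-library sketch. It deliberately does not use the MLflow API; the record layout, the function names and the stand-in "training" step are all assumptions made for illustration.

```python
import hashlib
import json
import random
from typing import Any, Dict, List


def data_fingerprint(rows: List[Dict[str, Any]]) -> str:
    """Hash the dataset so a run record pins the exact data version used."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_experiment(rows: List[Dict[str, Any]], params: Dict[str, Any]) -> Dict[str, Any]:
    """Run a seeded stand-in for training and return a loggable run record."""
    rng = random.Random(params["seed"])  # a private, seeded generator
    sample = rng.sample(rows, k=params["sample_size"])
    mean_y = sum(row["y"] for row in sample) / len(sample)
    return {
        "params": dict(params),
        "data_sha256": data_fingerprint(rows),
        "metrics": {"mean_y": mean_y},
    }


# Hypothetical dataset and parameters.
rows = [{"x": i, "y": float(i * i)} for i in range(10)]
params = {"seed": 42, "sample_size": 4}
first = run_experiment(rows, params)
second = run_experiment(rows, params)
changed = run_experiment(rows + [{"x": 10, "y": 100.0}], params)
```

With the same data, parameters and seed the two records are identical, while any change to the data shows up as a different hash. A tracking tool automates the capture and storage of records like these.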
———
Jim Keller Intel
Moore’s law continues. 1000 scalars. Abstraction layers are critically important.
Extreme ultraviolet lithography is the next phase of chip manufacture.
EUV is a step function: it enables 100x finer printing.
Once you have stable data and a stable platform, the platform can evolve from CPU to GPU to special purpose accelerator.
————–
Josh Romero Nvidia Scaling Deep Learning on the Summit Supercomputer
Used the Horovod (Uber) framework for DL training. Needed hierarchical all-reduce.
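The idea behind a hierarchical all-reduce can be sketched as a small single-process simulation. This is not Horovod's implementation and uses no real communication library; the node layout and gradient values are made up for illustration.

```python
from typing import List

Vector = List[float]


def elementwise_sum(vectors: List[Vector]) -> Vector:
    """Sum a list of equal-length vectors element by element."""
    return [sum(values) for values in zip(*vectors)]


def hierarchical_allreduce(nodes: List[List[Vector]]) -> List[List[Vector]]:
    """Simulate an all-reduce (sum) done in three phases.

    nodes[n][g] is the gradient vector held by GPU g on node n.
    """
    # Phase 1: reduce inside each node over the fast local interconnect.
    node_sums = [elementwise_sum(gpus) for gpus in nodes]
    # Phase 2: all-reduce across nodes, one partial sum per node,
    # so the slower network carries far fewer messages.
    total = elementwise_sum(node_sums)
    # Phase 3: broadcast the global sum back to every GPU on every node.
    return [[list(total) for _ in gpus] for gpus in nodes]


# Two hypothetical nodes with two GPUs each.
nodes = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
result = hierarchical_allreduce(nodes)
```

The result matches a flat all-reduce over all four GPUs. The benefit on a large machine is that only one participant per node takes part in the cross-node phase.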
——–
Peter Mattson Google MLPerf: driving innovation by measuring performance
Need benchmarks for training, inference, mobile. Hard to get contributors. MLCommons non-profit formed to encourage innovation. People’s Speech dataset aiming for 100k hours of transcribed speech by diverse speakers.
—
Sean Lie Cerebras Wafer-Scale ML
——–
Dennis Abts Ditching the ‘C’ in CPU: Groq’s Tensor Streaming Processor (TM)
https://research.google/people/author36240/
Dataflow in a superlane. 220 MiB shared SRAM.
Memory has an address and a direction. 1 TeraOp/s/mm².
Deterministic instruction time. INT8, FP16.
14th chip. No arbiter, no replay mechanism, no flow control in chip, no hardware interlocks – orchestrated by the compiler.
Groq Announces World’s First Architecture Capable of 1,000,000,000,000,000 Operations per Second on a Single Chip
Linley Group report: Groq-Rocks-NNs-Linley-Group-MPR-2020Jan06.pdf
Slides and video are now available at https://info.matroid.com/scaledml-media-archive-preview (Matroid ask for an email address in exchange for access).
———–