Blog

ScaledML 2020 notes

26–27 February 2020 at the Computer History Museum in Mountain View

This was the 5th year of the conference. Matroid, who put it on, pick a useful mixture of academic and industry speakers: people working at the leading edge of getting machine-learning products into early adoption. The most entertaining talk was from Josh Bloom, on applying ML in astrophysics. The most significant talks were from Jim Keller (Intel), on how Moore’s Law continues, and from Dennis Abts on his 14th chip design, Groq’s Tensor Streaming Processor.

http://scaledml.org/2020/

——

Oren Etzioni, Allen Institute for AI

Article: https://www.technologyreview.com/s/615264/artificial-intelligence-destroy-civilization-canaries-robot-overlords-take-over-world-ai/

———

Matroid: turning research into product

Look for a definition and understanding of ‘adequate accuracy’ in the context of the problem.

Understand the rate of change of adequate accuracy – a predictor of development time and investment requirements. Is an army of annotators available?

Model compression and optimization can be applied to match low-capability edge AI chips.
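
As a minimal sketch of what that compression step can look like (my example, not Matroid’s pipeline; the model here is a placeholder): post-training quantization with the TensorFlow Lite converter.

```python
# Hedged sketch: shrink a trained Keras model for a low-capability edge chip
# using TensorFlow Lite post-training quantization. The model is a
# placeholder assumption; any trained tf.keras model would work.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables weight quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```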

————-

Megan Kacholia, VP of Engineering, Google. https://ai.google/

https://www.blog.google/outreach-initiatives/accessibility/impaired-speech-recognition/

TensorFlow 2.1 released Jan 2020: https://github.com/tensorflow/tensorflow/releases

————–

Andrej Karpathy, Tesla. https://www.tesla.com/autopilotAI

Aiming for full self driving. Using the fleet (of customers’ cars) for data gathering.

——————

Ilya Sutskever, OpenAI

Dactyl robot hand manipulation Rubik’s cube https://openai.com/blog/learning-dexterity/

MuseNet for music. AI Dungeon game discussion at https://www.reddit.com/r/AIDungeon/

—————-

Wes McKinney, Ursa Labs. Forum for discussion: https://discuss.ossdata.org/

Open-source support model for Apache Arrow. Looking for Swift and Julia developers.
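
For context, a minimal sketch of what Arrow provides (my example; the column names and data are invented): a columnar in-memory table, round-tripped through Parquet with the pyarrow package.

```python
# Hedged sketch of Apache Arrow's columnar model via pyarrow.
# The table contents are illustrative assumptions.
import pyarrow as pa
import pyarrow.parquet as pq

# Build a columnar in-memory table.
table = pa.table({
    "speaker": ["McKinney", "Goyal", "Zaharia"],
    "talk_minutes": [30, 25, 30],
})

pq.write_table(table, "talks.parquet")      # persist in Parquet format
roundtrip = pq.read_table("talks.parquet")  # read back; schema preserved
print(roundtrip.schema)
```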

—————-

Savin Goyal, Netflix. Metaflow, an open-source framework for AI development: https://metaflow.org/

A free sandbox is available on AWS.
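
A minimal sketch of Metaflow’s programming model (my illustration; the flow and step contents are invented, only the FlowSpec/@step API is Metaflow’s): a flow is a Python class whose steps are chained with self.next.

```python
# Hedged sketch of a Metaflow flow. Step bodies are placeholder assumptions.
from metaflow import FlowSpec, step

class TrainFlow(FlowSpec):

    @step
    def start(self):
        self.examples = list(range(10))  # stand-in for real data loading
        self.next(self.train)

    @step
    def train(self):
        # Stand-in for training; instance attributes persist between steps.
        self.model = sum(self.examples)
        self.next(self.end)

    @step
    def end(self):
        print("trained model:", self.model)

if __name__ == "__main__":
    TrainFlow()
```

Run with `python train_flow.py run`; Metaflow snapshots each step’s artifacts, so runs can be inspected and resumed.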

———–

Panel discussion. Another framework: https://mlflow.org/. The MLSys conference was also mentioned.

————–

Posters: Pure Storage; Logical Clocks (Sweden); Samsung IoT chip, no system design.

———–

Ion Stoica, UC Berkeley. https://rise.cs.berkeley.edu/

————

David Aronchick, Microsoft. Leads open-source machine learning at Azure.

https://www.davidaronchick.com/  david_aronchick@microsoft.com

Structured schemas are required for MLOps: design and test discipline for both data and algorithms.
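
As a minimal sketch of the idea (my example, not Aronchick’s tooling; column names and bounds are invented): declare an expected schema for incoming training data and fail fast when a record violates it.

```python
# Hedged sketch: a hand-rolled schema check for training data.
# Column names, types, and bounds are illustrative assumptions.
EXPECTED_SCHEMA = {
    "age": (int, lambda v: 0 <= v < 130),
    "income": (float, lambda v: v >= 0.0),
    "label": (int, lambda v: v in (0, 1)),
}

def validate_row(row: dict) -> None:
    """Raise ValueError if a data row violates the declared schema."""
    for column, (ctype, check) in EXPECTED_SCHEMA.items():
        if column not in row:
            raise ValueError(f"missing column: {column}")
        value = row[column]
        if not isinstance(value, ctype) or not check(value):
            raise ValueError(f"bad value for {column}: {value!r}")

validate_row({"age": 42, "income": 50000.0, "label": 1})  # passes
```

The same discipline applies to algorithms: pin versions, test against fixed datasets, and treat schema changes as breaking changes.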

———–

Joshua Bloom, UC Berkeley

https://bids.berkeley.edu/events/physics-machine-learning-workshop

Towards Physics-informed ML Inference In Astrophysics 

Searching for Planet 9. One-hot encoding.

Physics-informed deep learning: https://arxiv.org/abs/1711.10561

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations. Maziar Raissi, Paris Perdikaris, George Em Karniadakis.
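
A minimal sketch of the core idea in that paper (my code, not the authors’; the Burgers’-equation example follows the paper, the network size and everything else is an illustrative assumption): train a network u(t, x) against both observed data and the PDE residual, obtained by automatic differentiation.

```python
# Hedged sketch of a physics-informed loss (after Raissi et al.,
# arXiv:1711.10561) for Burgers' equation: u_t + u*u_x - (0.01/pi)*u_xx = 0.
import numpy as np
import tensorflow as tf

u_net = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="tanh", input_shape=(2,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

def pde_residual(t, x):
    # Nested gradient tapes give u_t, u_x, and u_xx by autodiff.
    with tf.GradientTape(persistent=True) as g2:
        g2.watch(x)
        with tf.GradientTape(persistent=True) as g1:
            g1.watch([t, x])
            u = u_net(tf.concat([t, x], axis=1))
        u_t = g1.gradient(u, t)
        u_x = g1.gradient(u, x)  # recorded by g2 for the second derivative
    u_xx = g2.gradient(u_x, x)
    return u_t + u * u_x - (0.01 / np.pi) * u_xx

def loss(t_d, x_d, u_d, t_c, x_c):
    # Data misfit plus PDE residual at collocation points.
    u_pred = u_net(tf.concat([t_d, x_d], axis=1))
    mse_data = tf.reduce_mean(tf.square(u_pred - u_d))
    mse_pde = tf.reduce_mean(tf.square(pde_residual(t_c, x_c)))
    return mse_data + mse_pde
```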

Reverse-Engineering Deep ReLU Networks https://arxiv.org/abs/1910.00744

David Rolnick, Konrad P. Kording

——————-

Matei Zaharia, Databricks. Scaling Machine Learning Development with MLflow.

More ML devtools. Reproducible runs; auto-logging to support data versioning.
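
A minimal sketch of the reproducible-runs idea (my example; the parameters and metric are invented): MLflow records parameters, metrics, and artifacts per run, so an experiment can be replayed and compared later.

```python
# Hedged sketch of MLflow experiment tracking. Values are placeholders;
# the mlflow calls are the library's standard tracking API.
import mlflow

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("batch_size", 32)
    # ... training would happen here ...
    mlflow.log_metric("val_accuracy", 0.93)
```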

———

Jim Keller, Intel

Moore’s Law continues. 1000 scalars. Abstraction layers are critically important.

Extreme ultraviolet (EUV) lithography is the next phase of chip manufacture: a step function that enables 100x finer printing.

Once you have stable data and a stable platform, the platform can evolve from CPU to GPU to special purpose accelerator.

————–

Josh Romero, Nvidia. Scaling Deep Learning on the Summit Supercomputer

Used the Horovod (Uber) framework for DL training. Needed hierarchical allreduce.
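
A minimal sketch of the Horovod pattern (my example, assuming TensorFlow/Keras; the model and data are invented). If I recall correctly, hierarchical allreduce is requested with the HOROVOD_HIERARCHICAL_ALLREDUCE environment variable at launch time.

```python
# Hedged sketch of Horovod data-parallel training with TensorFlow/Keras.
# Model and data are placeholder assumptions; hvd.* calls are Horovod's API.
import numpy as np
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # one process per GPU, launched e.g. with horovodrun

model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(32,))])

# Scale the learning rate by worker count; the wrapper averages gradients
# across workers with allreduce.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))
model.compile(loss="mse", optimizer=opt)

callbacks = [
    # Broadcast initial weights from rank 0 so all workers start identical.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# Stand-in for this worker's shard of the training data.
x = np.random.rand(256, 32).astype("float32")
y = np.random.rand(256, 10).astype("float32")
model.fit(x, y, batch_size=32, epochs=1, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```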

——–

Peter Mattson, Google. MLPerf: driving innovation by measuring performance

Benchmarks are needed for training, inference, and mobile. It is hard to get contributors. The MLCommons non-profit was formed to encourage innovation. The People’s Speech dataset aims for 100k hours of transcribed speech from diverse speakers.

Sean Lie, Cerebras. Wafer-Scale ML

——–

Dennis Abts, Groq. Ditching the ‘C’ in CPU: Groq’s Tensor Streaming Processor (TM)

https://research.google/people/author36240/

Dataflow in a ‘superlane’. 220 MiB of shared SRAM.

Memory has an address and a direction. 1 TeraOp/s/mm².

Deterministic instruction timing. INT8 and FP16 data types.

His 14th chip. No arbiters, no replay mechanism, no on-chip flow control, no hardware interlocks – everything is orchestrated by the compiler.

https://groq.com/wp-content/uploads/2020/01/Groq-Rocks-NNs-Linley-Group-MPR-2020Jan06.pdf

Slides and video are now available at https://info.matroid.com/scaledml-media-archive-preview – Matroid ask for an email address in exchange for access.

———–

Understanding the new business of AI

Adding to the agreement with, and reaction to, the useful Andreessen Horowitz post on The New Business of AI from last week…

Image from the 2017 AI Index, Stanford Institute for Human-Centered AI

Reaching the sunlit uplands of AI is going to be rather harder than many of its investors and protagonists have predicted. https://a16z.com/2020/02/16/the-new-business-of-ai-and-how-its-different-from-traditional-software/

In particular, many AI companies have:

  • Lower gross margins due to heavy cloud infrastructure usage and ongoing human support;
  • Scaling challenges due to the thorny problem of edge cases;
  • Weaker defensive moats due to the commoditization of AI models and challenges with data network effects.

Adding a couple of examples to illustrate some of the systems design issues:

Frank Denneman, from VMware, on parallelism used to scale model training: https://frankdenneman.nl/2020/02/19/multi-gpu-and-distributed-deep-learning/

Emily Potyraj, from Pure Storage, on optimizing ECG data layout to improve deep learning training performance: https://towardsdatascience.com/what-format-should-i-store-my-ecg-data-in-for-dl-training-bc808eb64981

All of this points to a continuing requirement for a high degree of skilled problem analysis and systems design to make the best use of AI/ML. There’s an opportunity for existing services companies to improve dramatically with judicious use of ML/AI.

More people working more


As unemployment rates go down, economists begin to explore what it would take to get more people working for longer.

This research concludes that flexibility in working hours, controlled by the person working, would make a significant difference in the total effort available for work.

More flexibility in working hours requires management and scheduling, so that people who need to collaborate and interrupt each other agree on times and days for that, with other work time set aside for working alone. Some jobs are mainly customer-facing, so the majority of the hours are interruptible.

People who have managed international teams on assorted time zones have a head start on understanding how to do this.

Vanguard-sponsored research published by the American Economic Association: https://www.aeaweb.org/research/older-workers-labor-force-tonetti-interview

Quantum computing

Snapshot, November 2019

Research
State-of-the-art summary, references, and curated comments: Scott Aaronson, https://www.scottaaronson.com/blog/

Small companies

IonQ Trapped ion computing
PsiQ Silicon photonics https://psiquantum.com/
QCWare Software services
Rigetti Quantum cloud services
Xanadu Quantum photonic processors https://www.xanadu.ai/

Large companies
Google https://ai.google/research/teams/applied-science/quantum/
IBM https://www.ibm.com/quantum-computing/
Microsoft https://www.microsoft.com/en-us/quantum/
Honeywell Trapped-ion qubits https://www.honeywell.com/en-us/company/quantum

Upcoming conference

https://q2b.qcware.com/ San Jose, 10–12 December 2019

AI and new jobs

Last year, when we were preparing for the AI and ML panel at the Markets Group meeting, we spent a lot of effort preparing for questions on potential and actual adverse effects – but no-one asked. The audience were institutional investors, many of them managing pension funds for employees, so we had really expected pointed questions about the potential removal of existing jobs and about how new occupations might arise.

Prompted by a blog post from Timothy Taylor, and quoting from a paper titled ‘The Wrong Kind of AI’, it seems useful to think “about the future of work as a race between automation and new, labor-intensive tasks. Labor demand has not increased steadily over the last two centuries because of technologies that have made labor more productive in everything. Rather, many new technologies have sought to eliminate labor from tasks in which it previously specialized. All the same, labor has benefited from advances in technology, because other technologies have simultaneously enabled the introduction of new labor-intensive tasks. These new tasks have done more than just reinstate labor as a central input into the production process; they have also played a vital role in productivity growth.”

References

IZA DP No. 12292, Institute of Labor Economics: The Wrong Kind of AI? Artificial Intelligence and the Future of Labor Demand, April 2019.
Daron Acemoglu (MIT and IZA) and Pascual Restrepo (Boston University).

Consolidation in high capacity interface business

First Nvidia announced it was to acquire Mellanox. Now Xilinx has announced the acquisition of Solarflare. These were the two big sources of expertise in the high-capability, high-throughput network interface card (NIC) market.

When this sort of consolidation happens, it’s a signal to watch for one or more smaller players to emerge – potentially with expertise and money coming from the acquired companies – to develop the next state change in one of the often-overlooked but critical enablers of the very-large-scale datacenters behind cloud operational scale.

The competition is AWS, who have built their own ASIC, used in the NICs in the Nitro System. James Hamilton describes the system, which is used for I/O acceleration, security, and to implement a hypervisor.

February 2019 https://perspectives.mvdirona.com/2019/02/aws-nitro-system/
March 2019 https://nvidianews.nvidia.com/news/nvidia-to-acquire-mellanox-for-6-9-billion
April 2019 https://www.prnewswire.com/news-releases/xilinx-to-acquire-solarflare-300837025.html

More machine learning – ScaledML

27–28 March 2019

The ScaledML conference is growing up: from a Saturday at Stanford to a two-day event at the Computer History Museum, with sponsors. http://scaledml.org/2019/

Two big new themes emerged:

  • Concern for power efficiency (Simon Knowles, Graphcore, talked about megawatts; Pete Warden, TensorFlow, talked about milliwatts and energy harvesting)
  • Development platforms – Adam D’Angelo, Quora, was particularly clear on how Quora operate development to efficiently support a small number of good developers

David Patterson gave the first talk, on domain-specific architectures for neural networks – an updated version of this talk: https://cacm.acm.org/magazines/2018/9/230571-a-domain-specific-architecture-for-deep-neural-networks/fulltext

The roofline performance model is a useful way to visualize comparative performance. For future performance improvements, domain-specific architectures are the way forward; this requires both hardware updates (what Google is doing with its TPUs) and improved compiler front ends and back ends.
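
A minimal sketch of the roofline calculation (my example; the hardware numbers are invented): attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity.

```python
# Hedged sketch of the roofline model. The hardware numbers below are
# illustrative assumptions, not any real chip's specification.
def roofline_gflops(peak_gflops: float,
                    mem_bw_gbs: float,
                    arithmetic_intensity: float) -> float:
    """Attainable GFLOP/s for a kernel with the given arithmetic
    intensity (FLOPs per byte moved to or from memory)."""
    return min(peak_gflops, mem_bw_gbs * arithmetic_intensity)

# A kernel doing 2 FLOPs/byte on a 1000 GFLOP/s, 100 GB/s machine is
# memory-bound: min(1000, 100 * 2) = 200 GFLOP/s attainable.
print(roofline_gflops(peak_gflops=1000.0, mem_bw_gbs=100.0,
                      arithmetic_intensity=2.0))
```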

Fig 3 from the Domain Specific Architectures paper linked above.

Intel recognizes this trend – Wei Li described the work his team is doing to incorporate domain-specific support into Xeon processors. This blog post has the gist of what he presented.

Most of the talks are here on YouTube.