Source: infoq.com
Horace He recently published an article summarising The State of Machine Learning Frameworks in 2019. The article uses several metrics to argue that PyTorch is quickly becoming the dominant framework for research, whereas TensorFlow remains the dominant framework for applications deployed in a commercial/industrial context.
He, a research student at Cornell University, counted the number of papers discussing either PyTorch or TensorFlow that were presented at a series of well-known machine learning conferences, namely ECCV, NIPS, ACL, NAACL, ICML, CVPR, ICLR, ICCV and EMNLP. He found that the majority of papers at every major conference in 2019 were implemented in PyTorch. PyTorch outnumbered TensorFlow by 2:1 at vision-related conferences and 3:1 at language-related conferences, and was also referenced more often in papers published at more general machine learning conferences such as ICLR and ICML.
He argued that PyTorch is gaining ground because of its simplicity, its intuitive and easy-to-use API, and its (at least) acceptable performance compared to TensorFlow.
On the other hand, the author's metrics for measuring industry adoption, which included job listings, GitHub popularity, and the number of Medium articles, show that TensorFlow is still the leader. He posited three reasons for the disparity between academia and industry. First, the overhead of a Python runtime is something that many companies will try to avoid where possible. Second, PyTorch has historically offered no support for mobile "edge" ML, although mobile support was added to PyTorch by Facebook in version 1.3, released earlier this month. Third, PyTorch lacks features around model serving, which makes PyTorch systems harder to productionize than equivalent systems developed with TensorFlow.
In the past year, PyTorch and TensorFlow have been converging in several ways. PyTorch introduced TorchScript and a JIT compiler, whereas TensorFlow announced that it would be moving to an "eager mode" of execution starting from version 2.0. TorchScript is essentially a graph representation of PyTorch code; extracting a graph from the code means the model can be deployed in C++ and optimized. TensorFlow's eager mode provides an imperative programming environment that evaluates operations immediately, without building graphs. This is similar to PyTorch's eager mode in both its advantages and its shortcomings: it helps with debugging, but models then cannot be exported outside of Python, optimized, run on mobile, and so on.
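As a minimal sketch of the TorchScript idea, the `torch.jit.script` decorator compiles an ordinary Python function into a graph that can be serialized and run without a Python runtime (the function and tensor values here are illustrative, not taken from the article):

```python
import torch

# Hypothetical example: TorchScript captures this function as a graph
# rather than running it through the Python interpreter each call.
@torch.jit.script
def relu_twice(x: torch.Tensor) -> torch.Tensor:
    # TorchScript-compiled ops; the resulting graph can be saved
    # and later loaded from C++ for deployment/optimization.
    return torch.relu(x) * 2

x = torch.tensor([-1.0, 2.0])
print(relu_twice(x))  # tensor([0., 4.])
# relu_twice.save("relu_twice.pt")  # serialize the graph for use outside Python
```

This is the deployment path the article alludes to: the same code runs eagerly during development, while the scripted version can be exported and optimized.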
In the future, the two frameworks are likely to be closer than they are today, and new contenders may challenge them in areas like code generation or higher-order differentiation. He identified JAX as one potential contender. It is built by the same people who worked on the popular Autograd project, and it features both forward- and reverse-mode auto-differentiation, which allows computation of higher-order derivatives "orders of magnitude faster than what PyTorch/TensorFlow can offer".
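The higher-order differentiation JAX is known for comes from the fact that its gradient transform composes with itself; a small sketch (the function `f` is an illustrative choice, not from the article):

```python
import jax

# Illustrative function: f(x) = x^3
f = lambda x: x ** 3

df = jax.grad(f)      # f'(x)   = 3x^2
d2f = jax.grad(df)    # f''(x)  = 6x
d3f = jax.grad(d2f)   # f'''(x) = 6

print(d3f(2.0))  # 6.0
```

Each application of `jax.grad` adds one order of differentiation, which is what makes computing higher-order derivatives straightforward compared with manually nesting gradient tapes.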
Horace He, the author of the article, can be contacted via Twitter; he has published both the code used to generate the datasets and the interactive charts from the article.