PyTorch for Newbies
As a newbie in this field, surrounded by an incredible amount of information, I find myself constantly searching the web for resources to guide me through any given library. Below you will find a brief overview of what PyTorch has to offer, along with links to its technical documentation and to articles for a more in-depth look. Hopefully this information will help guide you, as it has me, as you get started in deep learning.
History
PyTorch was created by Facebook's AI group and was released in September of 2016. Since its inception it has gained popularity among professional developers and researchers due to its ease of use, debugging capabilities, speed, and Pythonic design. Now, five years later, PyTorch is one of the leaders in machine learning (ML) and is contending with TensorFlow for first place per Google Trends.
What is PyTorch?
PyTorch is used for deep learning and artificial intelligence. Primarily it is used for applications such as computer vision and natural language processing.
PyTorch is essentially NumPy with strong GPU acceleration. To break this down further, torch.zeros(3) and np.zeros(3) [Example 1] return the same data, just stored in different structures (a tensor vs. an array). What makes a tensor object powerful is its ability to tap into the resources of a GPU, which can significantly increase computational speed. There are many other similarities between PyTorch and NumPy. If you'd like to see more, take a look at this article.
[Example 1]
import torch
import numpy as np
x = np.zeros(3)
y = torch.zeros(3)
print(x,y)
Returns:
[0. 0. 0.] tensor([0., 0., 0.])
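To take the comparison one step further, here is a minimal sketch of moving data between NumPy and PyTorch and placing a tensor on a GPU when one is available. The variable names are my own for illustration, and the code falls back to the CPU if no GPU is visible, so it should run on any machine with both libraries installed.

```python
import numpy as np
import torch

# Pick the GPU when one is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = np.ones(3, dtype=np.float32)  # a plain NumPy array
t = torch.from_numpy(a)           # CPU tensor that shares memory with `a`
t_dev = t.to(device)              # copied to the GPU if present; unchanged on CPU

# Do the math on the selected device, then bring the result back to NumPy.
doubled = (t_dev * 2).cpu().numpy()
print(device.type, doubled)
```

The only line that changes between a CPU run and a GPU run is the device selection at the top; the rest of the code stays the same.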
Why PyTorch?
As previously mentioned, PyTorch is extremely Pythonic. This is a huge advantage in the ML world, as 57% of Data Scientists and Machine Learning Developers use Python. Since the majority of these communities use Python, they feel right at home when they jump into PyTorch. This reinforces the point about ease of use and makes it easier for users to quickly learn its ins and outs.
Another benefit is support for Data Parallelism. Data Parallelism allows PyTorch to spread work across multiple CPUs and GPUs, increasing the capabilities of THE MACHINE. (probably my favorite stand up act)
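As a rough illustration of Data Parallelism, the sketch below wraps a toy model in torch.nn.DataParallel when more than one GPU is visible. The model and batch sizes are hypothetical, chosen only for the example, and on a machine with one GPU or none the model simply runs unwrapped. Note that the PyTorch documentation recommends DistributedDataParallel for serious multi-GPU training; DataParallel is shown here because it is the shortest way to see the idea.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # hypothetical toy model: 4 inputs, 2 outputs

# Wrap only when more than one GPU is visible; otherwise use the model as is.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

batch = torch.randn(8, 4, device=device)  # 8 samples, 4 features each
out = model(batch)  # under DataParallel the batch is split along dimension 0
print(out.shape)
```

Each GPU receives a slice of the batch, runs the same model on it, and the outputs are gathered back into one tensor of shape (8, 2).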
PyTorch uses dynamic computation graphs, which are built as your code runs and so allow you to change how the network behaves on the fly. This is a more complex feature set that is broken down well in this technical article.
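To make the idea of a dynamic graph a bit more concrete, here is a small sketch in which an ordinary Python while loop decides how many operations are recorded. The function name and the thresholds are my own assumptions for the example, not anything taken from the PyTorch docs.

```python
import torch

def forward(x, w):
    # Ordinary Python control flow: the number of multiplications depends
    # on the data, so the recorded graph can differ from call to call.
    h = x
    steps = 0
    while h.norm() < 10 and steps < 5:
        h = h * w
        steps += 1
    return h.sum(), steps

w = torch.tensor(2.0, requires_grad=True)  # the value we want a gradient for
x = torch.ones(3)

loss, steps = forward(x, w)
loss.backward()  # autograd walks back through exactly the steps that ran
print(steps, loss.item(), w.grad.item())
```

With these inputs the loop runs three times, so the loss is 3 * w^3 = 24 and the gradient with respect to w is 9 * w^2 = 36. Feed in a different x or w and the loop may run a different number of times, with autograd following along without any extra work from you.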
PyTorch vs TensorFlow
Both of these open source libraries have their pros and cons. Over the years they have borrowed features from each other to improve their own platforms. Below is a list of their strengths and weaknesses:
TensorFlow:
Pros
- built-in high-level API
- TensorBoard: visualizing training
- TensorFlow Serving: production ready
- mobile support
- solid documentation and community support
Cons
- static graph
- proprietary debugging method
- quick changes are not easy to do on the fly
PyTorch:
Pros
- Pythonic
- Ability to use Python libraries & PDB (Python Debugger)
- Dynamic graph
- quick editing on the fly
Cons
- 3rd party needed for visualization (Seaborn/Matplotlib)
- API server needed for production
Based on my research, the general consensus is that Data Scientists love PyTorch and that it is highly recommended for research developers. TensorFlow seems to have a more powerful and customizable platform but, as with any customization, it comes with added boilerplate code.
To summarize this section, PyTorch is better suited for rapid prototyping in research and small scale projects, while TensorFlow will suit your needs better for large scale production environments and integrated visualizations.
PyTorch Support
Per their GitHub, PyTorch has a 90-day release cycle and has an additional section where end users can file bug reports. Based on the release cycle and the activity seen within the repo, the library is constantly being upgraded. Another great point is that there are four maintainers on the platform, along with hundreds of major contributions noted under 'The Team' section.
Additional Resources
Below are links to helpful resources that will further your dive into PyTorch:
Here is an article showing the seven top paid PyTorch and Keras courses