Deep Learning Frameworks Compared

November 23, 2019 · By Stanley Isaacs


Hello world, it's Siraj! In this video we're going to compare the most popular deep learning frameworks out there right now to see what works best. The deep learning space is exploding with frameworks; it's like every single week some major tech company decides to open source their own deep learning library, and that's not counting the dozens of deep learning frameworks being released every single week on GitHub by cowboy developers. *How many layers do you have?*

Let's start off with scikit-learn. It was made to provide an easy-to-use interface for developers to use off-the-shelf, general-purpose machine learning algorithms for both supervised and unsupervised learning. scikit-learn provides functions that let you apply classic machine learning algorithms like support vector machines, logistic regression, and k-nearest neighbors very easily, but the one type of machine learning algorithm it doesn't let you implement is a neural network. It doesn't provide GPU support either, which is what helps neural networks scale. And as of a couple of months ago, pretty much every single general-purpose algorithm that scikit-learn implemented has since been implemented in TensorFlow. scikit, you just got LEARNED.
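To make that concrete, here's a minimal sketch of scikit-learn's off-the-shelf interface; the dataset and parameters here are made up purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Generate a synthetic classification dataset (purely illustrative).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Classic algorithms share the same simple fit/predict interface.
clf = SVC(kernel='rbf')
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on the held-out data
```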
There's also Caffe, which was basically the first mainstream production-grade deep learning library, started in 2014. But Caffe isn't very flexible. Think of a neural network as a computational graph: in Caffe, each node is considered a layer, so if you want new layer types you have to define the full forward, backward, and gradient update computations. These layers are building blocks that are unnecessarily big, and there's an endless list of them to pick from. In TensorFlow, each node is considered a tensor operation, like matrix add, matrix multiply, or convolution, and a layer can be defined as a composition of those operations. So TensorFlow's building blocks are smaller, which allows for more modularity.
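As a rough sketch of what composing a layer from smaller ops looks like (using the TensorFlow 1.x-style graph API; the layer sizes here are arbitrary):

```python
import tensorflow as tf  # TF 1.x-style API

# A "layer" is just a composition of primitive tensor operations:
# a matrix multiply, an add, and a nonlinearity.
def dense_layer(x, in_dim, out_dim):
    W = tf.Variable(tf.random_normal([in_dim, out_dim]))
    b = tf.Variable(tf.zeros([out_dim]))
    return tf.nn.relu(tf.add(tf.matmul(x, W), b))

x = tf.placeholder(tf.float32, [None, 784])  # e.g., flattened images
hidden = dense_layer(x, 784, 256)            # one layer built from small ops
```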
Caffe also requires a lot of unnecessary verbosity: if you want to support both the CPU and the GPU, you need to implement extra functions for each, and you have to define your model in a plain-text configuration file, which is just clunky. Models should be defined programmatically, because it's better for modularity between different components. Also, Caffe's main architect now works on the TensorFlow team. We're all out of Caffe.

Speaking of modularity, let's talk about Keras.
Keras has been the go-to way to get started with deep learning for a while, because it provides a very high-level API for building deep learning models. Keras sits on top of other deep learning libraries like Theano and TensorFlow. It uses an object-oriented design, so everything is considered an object, be that layers, models, or optimizers, and all the parameters of the model can be accessed as object properties: model.layers[3].output will give you the output tensor for the layer at index 3 in the model, and model.layers[3].weights is a list of its symbolic weight tensors. This is a cleaner interface than the functional approach of making layers functions that create weights when called. Great documentation; it's all Gucci (yes, I'm bringing that back). But because it's so general-purpose, it lacks on the performance side: Keras has been known to have performance issues when used with a TensorFlow backend, since it's not really optimized for it, but it does work pretty well with the Theano backend.
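Here's a minimal sketch of that object-oriented style (assuming the Keras Sequential API of the era, with either backend; the layer sizes are illustrative):

```python
from keras.models import Sequential
from keras.layers import Dense, Activation

# Build a small feedforward network; every piece is an object.
model = Sequential()
model.add(Dense(64, input_dim=100))
model.add(Activation('relu'))
model.add(Dense(32))
model.add(Activation('relu'))

# Layers and their tensors are exposed as object properties.
print(model.layers[3].output)   # symbolic output tensor of the layer at index 3
print(model.layers[0].weights)  # list of symbolic weight tensors (W and b)
```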
The two frameworks that are neck and neck right now in the race to be the best library for both research and industry are TensorFlow and Theano. Theano currently outperforms TensorFlow on a single GPU, but TensorFlow outperforms Theano for parallel execution across multiple GPUs. Theano has more documentation because it's been around longer, and it has native Windows support, which TensorFlow doesn't yet (dammit, Windows).

In terms of syntax, let's just take a look at some code to see some differences.
We're going to compare two scripts, one in TensorFlow and one in Theano. They both do the same thing: initialize some phony data, then learn the line of best fit for that data so they can predict future data points. Let's look at the first step in both TensorFlow and Theano: generating the data. Both do it pretty much the same way, using NumPy arrays, so there's not really a difference there.
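A sketch of that shared first step (the slope 0.1 and intercept 0.3 are arbitrary demo values):

```python
import numpy as np

# 100 phony data points lying on the line y = 0.1x + 0.3.
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3
```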
Now let's look at the model initialization parts. This is the basic y = mx + b formula. TensorFlow doesn't require any special treatment of the x and y variables; they're just used natively. But in Theano, we have to explicitly declare the variables as symbolic inputs to the function. The TensorFlow syntax for defining the b and W variables is cleaner.
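Here's a sketch of that difference (TensorFlow 1.x-style on top, Theano below; the variable names are my own):

```python
import numpy as np
import tensorflow as tf   # TF 1.x-style API
import theano
import theano.tensor as T

x_data = np.random.rand(100).astype(np.float32)  # phony data from the last step

# --- TensorFlow: the NumPy arrays are consumed natively ---
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b        # y = Wx + b, no symbolic declarations needed

# --- Theano: inputs must first be declared as symbolic variables ---
X = T.vector('X')
W_t = theano.shared(np.random.randn(), name='W')
b_t = theano.shared(0.0, name='b')
y_pred = W_t * X + b_t    # a symbolic expression, compiled into a function later
```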
Then we implement our gradient descent step, which is what helps us learn: we're trying to minimize the mean squared error over time, which is what makes our model more accurate as we train. The syntax for defining what we're minimizing is pretty similar, but when we look at the actual optimizer that helps us do that, we'll notice a difference in syntax again. TensorFlow just gives you access to a bunch of optimizers right out of the box, things like gradient descent or Adam; Theano makes you implement this from scratch.
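A sketch of that contrast, continuing the variables from the previous snippet (TF 1.x-era optimizer on top; the Theano updates are derived by hand):

```python
# --- TensorFlow: define the loss, then grab an off-the-shelf optimizer ---
loss = tf.reduce_mean(tf.square(y - y_data))       # mean squared error
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

# --- Theano: same loss, but the gradient updates are written out by hand ---
Y = T.vector('Y')
cost = T.mean(T.sqr(y_pred - Y))                   # mean squared error
gW, gb = T.grad(cost, [W_t, b_t])                  # manual gradients
updates = [(W_t, W_t - 0.5 * gW),                  # manual SGD update rule
           (b_t, b_t - 0.5 * gb)]
```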
Then we have our training function, which is again more verbose. See the trend here? Theano so far is making us write more code than TensorFlow, so it seems to give us more fine-grained control, but at the cost of readability. Finally, we get to the actual training part itself.
The two loops look pretty identical, but TensorFlow's methodology of encapsulating the computational graph in a session feels conceptually cleaner than Theano's.
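A sketch of that final step, again continuing the snippets above:

```python
# --- TensorFlow: run the graph inside a session ---
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for step in range(201):
        sess.run(train_op)
    print(sess.run([W, b]))          # should approach [0.1, 0.3]

# --- Theano: compile a training function, then call it in a loop ---
train = theano.function(inputs=[X, Y], outputs=cost, updates=updates)
for step in range(201):
    train(x_data, y_data)
print(W_t.get_value(), b_t.get_value())
```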
TensorFlow is just growing so fast that it seems inevitable that whatever features it lacks right now, because of how new it is, it will gain very rapidly. I mean, just look at the amount of activity happening in the TensorFlow repo versus the Theano repo on GitHub right now. And while Keras serves as an easy-to-use wrapper around different libraries, it's not optimized for TensorFlow. A better alternative, if you want to learn and get started easily with deep learning, is TFLearn, which is basically Keras but optimized for TensorFlow.
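For a taste, here's a hedged sketch of what TFLearn code looks like (the layer sizes are illustrative, and the commented fit call assumes you have training arrays X and Y):

```python
import tflearn

# Build a small fully connected network with TFLearn's high-level API.
net = tflearn.input_data(shape=[None, 784])        # e.g., flattened MNIST images
net = tflearn.fully_connected(net, 128, activation='relu')
net = tflearn.fully_connected(net, 10, activation='softmax')
net = tflearn.regression(net, optimizer='adam',
                         loss='categorical_crossentropy')

model = tflearn.DNN(net)
# model.fit(X, Y, n_epoch=10)  # trains Keras-style, on a native TensorFlow graph
```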
So, to sum things up: the best library to use for research is TensorFlow; world-class researchers at both OpenAI and DeepMind are now using it. For production, the best library to use is still TensorFlow, since it scales better across multiple GPUs than its closest competitor, Theano. Lastly, for learning, the best library to use is TFLearn, a high-level wrapper around TensorFlow that lets you get started really easily. Also, shout out to Rahul for generating an upbeat MIDI file, badass of the week. Please subscribe for more programming videos. For now, I've got to go worship TensorFlow some more, so thanks for watching!