In the previous chapters, we discussed the theory behind the perceptron and logistic regression, including mathematical explanations of how and why they are able to learn from examples. In this chapter, we transition from math to code: specifically, we discuss how to implement these models in the Python programming language. All the code introduced throughout this book is also available in the following GitHub repository: https://github.com/clulab/gentlenlp. To gain a better understanding of how these algorithms work under the hood, we start by implementing them from scratch. However, as the book progresses, we introduce some of the popular tools and libraries that make Python the language of choice for machine learning, such as PyTorch and Hugging Face's transformers library. The code for all the examples in the book is provided in the form of Jupyter notebooks, and fragments of these notebooks are presented in the implementation chapters so that the reader has the whole picture just by reading the book.
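As an illustration of what implementing these models from scratch can look like, the sketch below trains a binary logistic regression classifier with stochastic gradient descent using only NumPy. It is a minimal example of ours, not the code from the repository or notebooks; the toy data, function names, and hyperparameters are illustrative assumptions.

import numpy as np

def sigmoid(z):
    # logistic function, mapping scores to probabilities in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, epochs=10, lr=0.1):
    # X: (n_examples, n_features) feature matrix; y: 0/1 labels
    w = np.zeros(X.shape[1])  # weight vector
    b = 0.0                   # bias term
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            p = sigmoid(w @ x_i + b)   # predicted probability of class 1
            error = p - y_i            # gradient of the cross-entropy loss wrt the score
            w -= lr * error * x_i      # SGD update for the weights
            b -= lr * error            # SGD update for the bias
    return w, b

# Toy usage: learn a simple linearly separable function (logical AND)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])
w, b = train_logistic_regression(X, y, epochs=100, lr=0.5)
print(sigmoid(X @ w + b))  # probability is high only for the last example

The same skeleton, with the sigmoid and cross-entropy gradient replaced by a hard threshold and the perceptron update rule, yields a from-scratch perceptron.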
The previous chapter was our first exposure to recurrent neural networks, which included intuitions for why they are useful for natural language processing, various architectures, and training algorithms. In this chapter, we put them to use to implement a common sequence modeling task: a Spanish part-of-speech tagger built from a bidirectional long short-term memory and a set of pretrained, static word embeddings. Along the way, we also introduce several new PyTorch features such as the pad_sequence, pack_padded_sequence, and pad_packed_sequence functions, which allow us to work more efficiently with variable-length sequences in recurrent neural networks.
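To make these functions concrete, the following is a small self-contained sketch, not the chapter's notebook code, that pads a batch of variable-length sequences, packs it, runs it through a bidirectional LSTM, and unpacks the result; the embedding and hidden dimensions are arbitrary choices made for illustration.

import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Three "sentences" of different lengths, each token a 4-dimensional embedding
sentences = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([5, 3, 2])

# Pad to a common length so the batch fits in one tensor
padded = pad_sequence(sentences, batch_first=True)  # shape: (3, 5, 4)

# Pack so the LSTM skips the padding positions
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=True)

lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True, bidirectional=True)
packed_output, _ = lstm(packed)

# Unpack back to a padded tensor of per-token hidden states
output, out_lengths = pad_packed_sequence(packed_output, batch_first=True)
print(output.shape)  # (3, 5, 16): 2 directions x 8 hidden units per token
print(out_lengths)   # tensor([5, 3, 2])

In a tagger, the per-token hidden states in output would then be projected to part-of-speech scores, ignoring the positions beyond each sequence's true length.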
In this chapter, we provide an implementation of the multilayer neural network described in Chapter 5, along with several of the best practices discussed in Chapter 6. Keeping things fairly simple, our network consists of two fully connected layers: a hidden layer and an output layer. Between these layers, we include dropout and a nonlinearity. Further, we make use of two PyTorch classes: a Dataset and a DataLoader. The advantage of using these classes is that they simplify several tasks, such as data shuffling and batching. Finally, since the classifier's architecture has become more complex, we switch the optimizer from stochastic gradient descent to Adam to take advantage of its additional features such as momentum and L2 regularization.
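The sketch below puts these pieces together: a toy Dataset wrapped in a DataLoader, a two-layer network with a ReLU nonlinearity and dropout, and the Adam optimizer with weight decay for L2 regularization. It is a simplified stand-in for the chapter's code; the dimensions, hyperparameters, and random data are assumptions made purely for illustration.

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    # Wraps feature and label tensors so a DataLoader can shuffle and batch them
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, index):
        return self.features[index], self.labels[index]

# A hidden layer and an output layer, with a nonlinearity and dropout in between
model = nn.Sequential(
    nn.Linear(100, 32),  # hidden layer: 100 input features -> 32 hidden units
    nn.ReLU(),           # nonlinearity
    nn.Dropout(p=0.3),   # dropout for regularization
    nn.Linear(32, 2),    # output layer: 2 classes
)

# Random tensors stand in for real features and labels in this sketch
dataset = ToyDataset(torch.randn(500, 100), torch.randint(0, 2, (500,)))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

loss_fn = nn.CrossEntropyLoss()
# weight_decay adds L2 regularization to the Adam update
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

for epoch in range(5):
    for X_batch, y_batch in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(X_batch), y_batch)
        loss.backward()
        optimizer.step()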