How to Download & Install NLTK on Windows/Mac

Installing NLTK in Windows

In this part, we will learn how to set up NLTK from the terminal (Command Prompt in Windows).

The instructions given below assume that you don’t have Python installed, so the first step is to install Python.

Installing Python in Windows

Step 1) Go to https://www.python.org/downloads/ and select the latest version for Windows.

Note: If you don’t want to download the latest version, you can visit the Downloads tab and see all releases.

Step 2) Click on the downloaded file

Step 3) Select Customize Installation

Step 4) Click NEXT

Step 5) On the next screen

  1. Select the advanced options
  2. Give a custom install location. In this example, a folder on the C drive is chosen so that it is easy to find later
  3. Click Install

Step 6) Click the Close button once the installation is done.

Step 7) Copy the path of your Scripts folder.

Step 8) In the Windows command prompt

  • Navigate to the Scripts folder copied in Step 7, which contains pip
  • Enter the command to install NLTK
    pip3 install nltk
  • The installation should complete successfully

NOTE: For Python 2, use the command pip2 install nltk

Step 9) In the Windows Start Menu, search for and open the Python shell

Step 10) You can verify that the installation succeeded by entering the command below

import nltk

If you see no error, the installation is complete.
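
As an additional check, the installed version can be reported from a script. The sketch below uses the standard library's importlib.metadata module (available in Python 3.8 and later). The helper name installed_version is hypothetical and is not part of NLTK.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# installed_version is a hypothetical helper name used only for illustration.
version = installed_version("nltk")
if version is None:
    print("NLTK is not installed for this interpreter")
else:
    print("NLTK version:", version)
```

If the script reports that NLTK is missing even though pip finished without errors, pip may have installed the package for a different Python installation than the one running the script.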

Installing NLTK in Mac/Linux

Installing NLTK on Mac/Linux requires pip, the Python package manager. If pip is not installed, follow the instructions below to complete the process. Note that the apt commands shown here apply to Debian- and Ubuntu-based Linux systems; macOS does not ship with apt.

Step 1) Update the package index by typing the command below

sudo apt update

Step 2) Install pip for Python 3:

sudo apt install python3-pip

You can also install pip using easy_install, although easy_install is deprecated and may not be available with recent setuptools releases.

sudo apt-get install python-setuptools python-dev build-essential

Now easy_install is installed. Run the command below to install pip

sudo easy_install pip

Step 3) Use one of the following commands to install NLTK, depending on whether your pip executable is named pip or pip3

sudo pip install -U nltk
sudo pip3 install -U nltk
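
If it is unclear which Python installation a given pip or pip3 executable belongs to, one common approach is to run pip as a module of the interpreter you intend to use. The following is a minimal sketch of that approach, not the only way to do it. The function name pip_install_command is a hypothetical helper, and the actual installation call is left commented out so that nothing is installed by accident.

```python
import subprocess  # only needed if the installation line below is uncommented
import sys


def pip_install_command(package, upgrade=True):
    """Build a command that runs pip with the same interpreter as this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("-U")  # same upgrade flag as in the commands above
    command.append(package)
    return command


# pip_install_command is a hypothetical helper name used only for illustration.
command = pip_install_command("nltk")
print(" ".join(command))

# Uncomment to perform the installation:
# subprocess.check_call(command)
```

Because the command starts with sys.executable, the package is installed for the interpreter that runs the script, on Windows as well as on Mac/Linux.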

Installing NLTK through Anaconda

Step 1) Install Anaconda (which can also be used to install other packages) by visiting https://www.anaconda.com/products/individual and selecting the Python version you need.

Note: Refer to this tutorial for detailed steps to install Anaconda

Step 2) In the Anaconda prompt,

  1. Enter the command
    conda install -c anaconda nltk
  2. Review the package upgrade, downgrade, and install information, then enter yes
  3. NLTK is downloaded and installed

NLTK Dataset

The NLTK module has many datasets available, which you need to download before you can use them. More technically, such a dataset is called a corpus. Some examples are stopwords, gutenberg, framenet_v15, large_grammars, and so on.

How to Download all packages of NLTK

Step 1) Run the Python interpreter in Windows or Linux

Step 2)

  1. Enter the commands
    import nltk
    nltk.download()
  2. The NLTK Downloader window opens. Click the Download button to download the dataset. This process will take time, depending on your internet connection

NOTE: You can change the download location by clicking File > Change Download Directory
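
The graphical downloader is not required. nltk.download also accepts a package identifier, together with the optional download_dir and quiet arguments, so individual datasets can be fetched from a script. The sketch below is one way to wrap that call. The name download_packages and its download parameter are hypothetical, introduced here so that the helper can be tried with a stand-in function when NLTK or a network connection is not available.

```python
def download_packages(package_ids, download_dir=None, download=None):
    """Download each named NLTK package and report which ones succeeded.

    download_packages and the `download` parameter are hypothetical names
    used only for this illustration. By default the real nltk.download
    function is used.
    """
    if download is None:
        import nltk  # deferred so the helper can be defined without NLTK present
        download = nltk.download

    results = {}
    for package_id in package_ids:
        # nltk.download returns True on success and False on failure
        results[package_id] = bool(
            download(package_id, download_dir=download_dir, quiet=True)
        )
    return results


# Example usage (requires NLTK and an internet connection):
# print(download_packages(["stopwords", "brown"]))
```

Leaving download_dir as None lets NLTK pick its default data directory, which corresponds to not changing the directory in the downloader window.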

Step 3) To test the installed data, use the following code

>>> from nltk.corpus import brown
>>> brown.words()
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]

Running the NLP Script

We will now discuss how to run an NLP script on a local PC. There are many libraries available for Natural Language Processing, so the choice of library depends on your requirements. Here is the list of NLP libraries.

How to Run NLTK Script

Step 1) In your favorite code editor, copy the code below and save the file as NLTKsample.py

from nltk.tokenize import RegexpTokenizer

# Keep only runs of word characters (letters, digits, underscore)
tokenizer = RegexpTokenizer(r'\w+')
filterdText = tokenizer.tokenize('Hello Guru99, You have build a very good site and I love visiting your site.')
print(filterdText)

Code Explanation:

  1. In this program, the objective is to remove all punctuation from the given text. We import “RegexpTokenizer”, a class in NLTK's tokenize module. It splits text according to a regular expression of your choice, so anything the pattern does not match is dropped.
  2. We pass the regular expression \w+ to “RegexpTokenizer”. This pattern matches runs of letters, digits, and underscores, so punctuation and spaces are left out.
  3. Next, we tokenize the sentence using the “tokenize” method. The output is stored in the “filterdText” variable.
  4. Finally, we print the result using “print()”.
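
To see which pieces of text the pattern \w+ selects, the same regular expression can be tried with Python's built-in re module. This is only an illustration of the pattern, not NLTK's implementation, although for this pattern the resulting tokens should match those produced by the script above.

```python
import re

text = 'Hello Guru99, You have build a very good site and I love visiting your site.'

# \w+ matches one or more word characters (letters, digits, underscore),
# so commas, periods and spaces never end up inside a token.
tokens = re.findall(r'\w+', text)
print(tokens)
```

Changing the pattern changes what counts as a token. For example, [A-Za-z]+ would split “Guru99” at the digits and keep only “Guru”.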

Step 2) In the command prompt

  • Navigate to the location where you saved the file
  • Run the command python NLTKsample.py

This will show the following output:

['Hello', 'Guru99', 'You', 'have', 'build', 'a', 'very', 'good', 'site', 'and', 'I', 'love', 'visiting', 'your', 'site']