ridicurious.com

Web scraping sentences from an online dictionary using PowerShell

INTRODUCTION 

Everybody comes across a word they don’t know how to use in a sentence. I face this often because I read a lot. Normally I would run a simple Google search, say for the word “Elixir”, which returns a few websites with example sentences.


I would then open one of these websites and read the example sentences. But I noticed some uniformity in both the data presentation and the URLs on one website, yourdictionary.com, and on inspecting the source code I could easily trace the HTML tags that enclosed the data.

So I thought: why not harvest this website’s data (data scraping) and get all the sentences for a word?

HOW IT WORKS

To implement this solution in PowerShell, I identified the HTML tag that holds the data and its class (“Li_Content”), which lets me filter exactly the sentences I want.

Once I had that information, a simple Invoke-WebRequest to the site, with my query word (“Elixir”) appended to the URL, did most of the work:

Invoke-WebRequest "http://sentence.yourdictionary.com/Elixir" 
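The request above can be wrapped so that the query word is not hard-coded. The following is only a minimal sketch: the helper names Get-SentencePageUrl and Get-SentencePageHtml are hypothetical and do not come from the original script, and the URL pattern is assumed to be the one shown above for every word.

```powershell
# Hypothetical helper (not from the original script): builds the page URL for a word.
function Get-SentencePageUrl {
    param(
        [Parameter(Mandatory)]
        [string]$Word
    )
    # Escape the word so spaces or special characters stay valid in a URL path.
    $escaped = [System.Uri]::EscapeDataString($Word.Trim())
    "http://sentence.yourdictionary.com/$escaped"
}

# Hypothetical helper: downloads the page and returns its raw HTML as a string.
function Get-SentencePageHtml {
    param(
        [Parameter(Mandatory)]
        [string]$Word
    )
    $url = Get-SentencePageUrl -Word $Word
    # -UseBasicParsing avoids the Internet Explorer dependency on Windows PowerShell 5.1.
    $response = Invoke-WebRequest -Uri $url -UseBasicParsing -ErrorAction Stop
    $response.Content
}
```

Calling Get-SentencePageHtml -Word 'Elixir' would then issue the same request as the one-liner above, assuming the site still serves pages at that address.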

Then some data wrangling on the HTML tag and class extracts the sentences; the following image shows the result.
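One way to do this wrangling is sketched below. It is an illustration under assumptions, not the original script: the function name Get-SentenceFromHtml is hypothetical, and it uses a regular expression on the raw HTML rather than a real HTML parser, which is fragile if elements of the same tag are nested or if the site changes its markup. Only the class name “Li_Content” comes from the page inspection described above.

```powershell
# Hypothetical helper: pulls the text of every element carrying a given class.
function Get-SentenceFromHtml {
    param(
        [Parameter(Mandatory)]
        [string]$Html,
        [string]$ClassName = 'Li_Content'   # class name taken from the article
    )
    $class = [regex]::Escape($ClassName)
    # Match <tag ... class="... Li_Content ..."> body </tag>, case-insensitive, across lines.
    $pattern = '(?is)<(?<tag>\w+)\b[^>]*\bclass\s*=\s*["''][^"'']*\b' + $class +
               '\b[^"'']*["''][^>]*>(?<body>.*?)</\k<tag>\s*>'
    foreach ($match in [regex]::Matches($Html, $pattern)) {
        # Drop inner tags, decode entities such as &amp;, and collapse whitespace.
        $text = $match.Groups['body'].Value -replace '<[^>]+>', ''
        $text = [System.Net.WebUtility]::HtmlDecode($text)
        $text = ($text -replace '\s+', ' ').Trim()
        if ($text) { $text }
    }
}

# Usage with a small inline sample instead of a live page:
$sampleHtml = '<ul><li class="Li_Content">The <b>elixir</b> worked.</li><li class="other">skip me</li></ul>'
Get-SentenceFromHtml -Html $sampleHtml
```

On Windows PowerShell 5.1 the ParsedHtml property of the response is another option, but it depends on Internet Explorer components and is not available in PowerShell 7, which is why this sketch works on the plain HTML string.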

HOW TO USE IT

Run the function ‘Get-Sentence’ with your word, and use the -WordLimit parameter to control the length of the sentences or the -Count parameter to control the number of sentences.

You can also use the -HighlightWord switch to highlight the word you queried in each sentence.

The following animation also demonstrates how to run the function.
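To make the effect of these parameters concrete, here is a rough sketch of how such filtering could work on a list of sentences that has already been scraped. The function name Select-Sentence is hypothetical, -WordLimit is assumed to mean the maximum number of words per sentence, and the highlight is shown as square brackets as a plain-text stand-in; how Get-Sentence actually implements these is not shown here.

```powershell
# Hypothetical helper: applies word-limit, count, and highlight options to sentences.
function Select-Sentence {
    param(
        [Parameter(Mandatory)]
        [string[]]$Sentence,
        [Parameter(Mandatory)]
        [string]$Word,
        [int]$WordLimit = 0,      # assumption: 0 means no limit on words per sentence
        [int]$Count = 0,          # assumption: 0 means return every sentence
        [switch]$HighlightWord
    )
    $selected = $Sentence
    if ($WordLimit -gt 0) {
        # Keep only sentences with at most $WordLimit whitespace-separated words.
        $selected = $selected | Where-Object { ($_.Trim() -split '\s+').Count -le $WordLimit }
    }
    if ($Count -gt 0) {
        $selected = $selected | Select-Object -First $Count
    }
    foreach ($line in $selected) {
        if ($HighlightWord) {
            # -replace is case-insensitive; $0 stands for the matched text.
            $line -replace ('\b' + [regex]::Escape($Word) + '\b'), '[$0]'
        }
        else {
            $line
        }
    }
}

# Usage with made-up sample sentences:
$samples = 'The elixir worked.', 'No elixir known to any healer could cure him of this.'
Select-Sentence -Sentence $samples -Word 'elixir' -WordLimit 5 -HighlightWord
```

A console version could color the matched word with Write-Host instead of bracketing it; brackets are used here so the output stays ordinary pipeline text.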

SCRIPT

 

Author of “PowerShell Guide to Python”, “Windows Subsystem for Linux (WSL)” and currently writing the most awaited book: “PowerShell to C# and Back”!

