My top preference for data munging and harvesting from the web is Internet Explorer. Yes, Internet Explorer! 🙂 I can create an InternetExplorer.Application COM object and access the HTML DOM to scrape web data as and when required.
$InternetExplorer = New-Object -ComObject InternetExplorer.Application -Property @{
    Navigate = 'http://www.twitter.com/following'
    Visible = $true
}
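As a rough illustration of what accessing the HTML DOM can look like once the page has loaded, the sketch below reads link targets from the document. `Get-PageLink` is a hypothetical helper name, not something from the original script, and it assumes the object passed in exposes the DOM `links` collection through its `Document` property, as an InternetExplorer.Application object normally does.

```powershell
# Hypothetical helper (not part of the original post): list the href of
# every link in the loaded document via the DOM "links" collection.
function Get-PageLink {
    param(
        # An InternetExplorer.Application object, or anything with the same shape
        [Parameter(Mandatory)] $Browser
    )
    foreach ($link in $Browser.Document.links) {
        # Skip anchors that have no target
        if ($link.href) { $link.href }
    }
}
```

It could be called as `Get-PageLink -Browser $InternetExplorer | Select-Object -Unique` after the page has finished loading.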
The problem arises when a web page does not load all of its content up front, and you have to manually drag Internet Explorer's scroll bar down to populate more. So how do you scroll Internet Explorer programmatically?
Luckily, the window object of the HTML DOM exposes a scrollTo() method, which PowerShell can reach through the document's parentWindow property. It can be used to scroll a web page and populate its content, which is very handy when web scraping.
Hence this quick post for anyone who may find it useful, because I only found the answer after a lot of research 🙂
The example in the animation above is used for harvesting data from Twitter, and you can find the full blog post here.
SCRIPT:
$InternetExplorer = New-Object -ComObject InternetExplorer.Application -Property @{
    Navigate = 'http://www.twitter.com/following'
    Visible = $true
}
# Wait while IE is busy loading the page
while ($InternetExplorer.Busy) {
    Write-Verbose "Internet Explorer is busy, waiting for a few seconds"
    Start-Sleep -Seconds 5
}
$start = Get-Date
$VerticalScroll = 0
Write-Host "Scrolling the web page to auto-populate all profiles for the next 60 seconds" -ForegroundColor Yellow
# Keep scrolling for 60 seconds so that items which only load on scroll are populated
while ((Get-Date) -lt $start.AddSeconds(60)) {
    # window.scrollTo(x, y): move 100 pixels further down on each pass
    $InternetExplorer.Document.parentWindow.scrollTo(0, $VerticalScroll)
    $VerticalScroll = $VerticalScroll + 100
}
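A variation on the fixed 60-second loop is to stop once the page stops growing. The sketch below is only one possible approach: `Invoke-ScrollToEnd` and its parameters are hypothetical names, and it assumes that `Document.body.scrollHeight` reflects the full page height (on some pages `Document.documentElement.scrollHeight` may be the better value to read).

```powershell
# Hypothetical helper (not part of the original post): jump to the current
# bottom of the page, wait, and repeat until the height stops increasing.
function Invoke-ScrollToEnd {
    param(
        # An InternetExplorer.Application object, or anything with the same shape
        [Parameter(Mandatory)] $Browser,
        [int] $PauseMilliseconds = 1000,  # time given to the page to load more items
        [int] $MaxIdleRounds = 3,         # consecutive rounds without growth before stopping
        [int] $TimeoutSeconds = 60        # hard upper limit, as in the original script
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $height = [int]$Browser.Document.body.scrollHeight
    $idleRounds = 0
    while (((Get-Date) -lt $deadline) -and ($idleRounds -lt $MaxIdleRounds)) {
        # Scroll to the bottom as currently known
        $null = $Browser.Document.parentWindow.scrollTo(0, $height)
        Start-Sleep -Milliseconds $PauseMilliseconds
        $newHeight = [int]$Browser.Document.body.scrollHeight
        if ($newHeight -gt $height) {
            # More content was appended: remember the new bottom and keep going
            $height = $newHeight
            $idleRounds = 0
        }
        else {
            $idleRounds++
        }
    }
    # Return the last observed page height
    $height
}
```

Called as `Invoke-ScrollToEnd -Browser $InternetExplorer`, it returns the final page height. The timeout keeps the 60-second cap for pages that never stop loading.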
Also, take a look at the web series here to learn more about data munging and harvesting, and about where Internet Explorer scrolling fits into harvesting web data. The series covers:
- Automating “From the Blog Archives” Tweets using Powershell
- Pumping Reddit user trend to AWS CloudWatch with Powershell
- Capturing & Analyzing online users Trend on Reddit with Powershell
- Powershell fiddling around Web scraping, Twitter – User Profiles, Images and much more
- Get example Sentence’s for a Word using Web scraping on online dictionary
- Get-Quote using Powershell
- PowerShell: Web-hosted Image Scraping
- [ Powershell ] Data Harvesting all dictionary words for each alphabet from Web
- PowerShell: Get Synonyms using Online Thesaurus
- Powershell: How to get Cricket Live Scores to Your Powershell Console
- PowerShell: Import / Query All Windows System Error Codes for Description
Please do follow me on Twitter for more interesting PowerShell material, and don't forget to show off the cool web scraping techniques you learn to your colleagues. Thanks for reading. Cheers! 😉
Prateek Singh