PowerShell: Decompiling Compiled HTML Help (.CHM) Files and Data Wrangling
WHAT IS COMPILED HTML HELP (.CHM)?
Microsoft Compiled HTML Help is a proprietary Microsoft online help format consisting of a collection of HTML pages, an index and other navigation tools. The files are compressed and deployed in a binary format with the extension .CHM, for Compiled HTML. The format is often used for software documentation, such as the help files that ship with the Sysinternals tools.
Today a friend and I were looking for a way to decompile .chm files into HTML and then parse the HTML DOM to extract some information. After some googling I found that Windows ships with a command-line utility, HH.exe, which can decompile .CHM files to HTML when called with the right command-line options.
So I wrapped the commands in a PowerShell function.
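A minimal sketch of such a wrapper follows. It is not the original function: the name Expand-ChmFile and its parameters are made up for illustration. The only part taken from HH.exe itself is the `-decompile <output folder> <chm file>` switch.

```powershell
# Sketch only: Expand-ChmFile and its parameter names are hypothetical.
function Expand-ChmFile {
    [CmdletBinding()]
    param(
        # Path to the .chm file to decompile.
        [Parameter(Mandatory)]
        [string]$Path,

        # Folder that receives the extracted files; created if it is missing.
        [Parameter(Mandatory)]
        [string]$Destination
    )

    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
        throw "CHM file not found: $Path"
    }

    # Resolve both to absolute paths so HH.exe does not depend on the current directory.
    $chm  = (Resolve-Path -LiteralPath $Path).Path
    $null = New-Item -ItemType Directory -Path $Destination -Force
    $out  = (Resolve-Path -LiteralPath $Destination).Path

    # hh.exe -decompile <output folder> <chm file>
    # -Wait blocks until HH.exe has exited, so the files exist before we list them.
    Start-Process -FilePath 'hh.exe' -ArgumentList '-decompile', "`"$out`"", "`"$chm`"" -Wait

    # Return the extracted HTML pages for further processing.
    Get-ChildItem -Path $out -Recurse -File |
        Where-Object { $_.Extension -in '.htm', '.html' }
}
```

Assuming the help file sits at a path such as C:\Tools\procmon.chm (a placeholder, not a path from this post), a call would look like `Expand-ChmFile -Path 'C:\Tools\procmon.chm' -Destination 'C:\Temp\ProcMonHelp'`.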
I then extracted the required information from the decompiled HTML files.
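The extraction step might look roughly like the sketch below. The page title is only an assumed example of the "required information", since the exact fields depend on what you are after. Get-HtmlTitle and the folder path are hypothetical names chosen for illustration. The sketch also uses a simple regular expression rather than a full DOM parser, which is enough for well-formed help pages but not for arbitrary HTML.

```powershell
# Sketch only: Get-HtmlTitle is a hypothetical helper, and extracting the <title>
# is an assumed example of the information to pull out of each page.
function Get-HtmlTitle {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string]$Html
    )

    # (?is) makes the match case-insensitive and lets '.' span line breaks.
    $match = [regex]::Match($Html, '(?is)<title[^>]*>(.*?)</title>')
    if (-not $match.Success) { return $null }

    # Decode entities such as &amp; and collapse runs of whitespace.
    $text = [System.Net.WebUtility]::HtmlDecode($match.Groups[1].Value)
    ($text -replace '\s+', ' ').Trim()
}

# Assumed output folder of the decompile step; adjust to your own path.
$folder = 'C:\Temp\ProcMonHelp'

if (Test-Path -LiteralPath $folder) {
    Get-ChildItem -Path $folder -Recurse -File |
        Where-Object { $_.Extension -in '.htm', '.html' } |
        ForEach-Object {
            # One object per page: file name plus extracted title.
            [pscustomobject]@{
                File  = $_.Name
                Title = Get-HtmlTitle -Html (Get-Content -LiteralPath $_.FullName -Raw)
            }
        }
}
```

Returning objects rather than formatted text keeps the result easy to sort, filter, or export with cmdlets such as Export-Csv.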
HOW TO RUN
For this run I used the Compiled HTML Help file of ProcMon.exe (Process Monitor, a Sysinternals tool) as the sample .chm file.
Hope you find it useful, happy learning 🙂
Prateek Singh (Twitter: @SinghPrateik)