This post is part of a series that shows example PowerShell code for those learning the language.
This time we’re using PowerShell to count lines, count words, find the longest word and find the most frequently used words in a text file. To make it interesting, we’re using a plain text version of “Alice in Wonderland” downloaded from the Project Gutenberg site.
This example explores string manipulation and the use of hash tables. It also shows the use of Write-Progress.
#
# Counting words in a text file
# Uses the text from Alice in Wonderland
# from http://www.gutenberg.org/ebooks/11.txt.utf-8
#
Clear-Host
$SearchWord = "WONDERLAND"
$File | foreach {
Write-Progress -Activity "Processing words…" -Completed
$Dictionary.GetEnumerator() | ? { $_.Name.Length -gt 4 } |
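To make the hash-table technique concrete, here is a minimal sketch of the counting step. It is not the script from this post: the function name Get-WordCount, the splitting pattern and the two sample lines are assumptions made for illustration.

```powershell
# Hypothetical helper (not from the original script): counts words in an array of lines.
function Get-WordCount {
    param(
        [string[]]$Lines
    )

    $Dictionary = @{}      # word -> number of occurrences
    $TotalWords = 0
    $LineNumber = 0

    foreach ($Line in $Lines) {
        $LineNumber++

        # Refresh the progress bar every 100 lines to keep the overhead low
        if ($LineNumber % 100 -eq 0) {
            Write-Progress -Activity 'Processing words...' -PercentComplete (100 * $LineNumber / $Lines.Count)
        }

        # Upper-case the line, then split on anything that is not a letter or an
        # apostrophe (straight or typographic), so that DON'T stays one word.
        $Words = $Line.ToUpper() -split "[^A-Z'\u2019]+" | Where-Object { $_ -ne '' }

        foreach ($Word in $Words) {
            if ($Dictionary.ContainsKey($Word)) {
                $Dictionary[$Word]++
            }
            else {
                $Dictionary[$Word] = 1
            }
            $TotalWords++
        }
    }

    # Remove the progress bar once all lines are processed
    Write-Progress -Activity 'Processing words...' -Completed

    [pscustomobject]@{
        TotalWords    = $TotalWords
        DistinctWords = $Dictionary.Count
        Dictionary    = $Dictionary
    }
}

# Two sample lines for demonstration; a real run would pass the lines of the text file
$Sample = 'Alice was beginning to get very tired', 'of sitting by her sister; Alice yawned.'
$Result = Get-WordCount -Lines $Sample

"There were $($Result.TotalWords) total words in the text"
"There were $($Result.DistinctWords) distinct words in the text"
"The word ALICE was found $($Result.Dictionary['ALICE']) times."
```

On a real file, the lines could come from Get-Content, for example `Get-WordCount -Lines (Get-Content -Path $Path)`, where `$Path` is a placeholder for the location of the downloaded text. Upper-casing every word before it goes into the hash table means "Alice" and "ALICE" are counted together.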
In case you were wondering what the output would look like, here it is:
Reading file .\Alice.TXT…
3339 lines read from the file.
There were 25599 total words in the text
There were 2616 distinct words in the text
The word WONDERLAND was found 3 times.
The longest word was DISAPPOINTMENT
Most used words with more than 4 letters:
Name Value
---- -----
ALICE 385
LITTLE 128
ABOUT 94
AGAIN 83
HERSELF 83
WOULD 78
COULD 77
THOUGHT 74
THERE 71
QUEEN 68
BEGAN 58
TURTLE 57
QUITE 55
HATTER 55
DON’T 55
GRYPHON 55
THINK 53
THEIR 51
FIRST 50
THING 49
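A list like this comes from filtering and sorting the entries of the hash table. As a rough sketch of that last step, the snippet below finds the longest key and the most used words with more than 4 letters. The counts are toy values made up for illustration, not figures from the book, and the variable names are assumptions.

```powershell
# Toy counts for illustration only; these are not taken from the book
$Dictionary = @{ THE = 9; RABBIT = 4; GARDEN = 3; CROQUET = 2; TEA = 5 }

# Longest word: sort the keys by length, longest first, and keep the first one
$LongestWord = $Dictionary.Keys |
    Sort-Object -Property Length -Descending |
    Select-Object -First 1

# Most used words with more than 4 letters: filter, sort by count, keep the top 2
$TopWords = $Dictionary.GetEnumerator() |
    Where-Object { $_.Name.Length -gt 4 } |
    Sort-Object -Property Value -Descending |
    Select-Object -First 2

"The longest word was $LongestWord"
"Most used words with more than 4 letters:"
$TopWords | Format-Table -AutoSize
```

GetEnumerator() is what lets the pipeline see each key/value pair as a separate object; piping the hash table itself would pass it along as a single item. For the full text, `-First 2` would become `-First 20` to match the list above.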
Very neat. FYI, I had to remove the double quote before the escape character for the script to run properly (line 23). I ran this against the summary field of our ticketing system (10,000 entries just for my group). Most of the results are: can’t, won’t, unable, access, working, please, needs, include. Lol.