Quantcast
Channel: Active questions tagged excel - Stack Overflow
Viewing all articles
Browse latest Browse all 88066

Replace Whole Words from a Predefined List

$
0
0

I currently have coding which will review an equipment description field, the aim of which to standardize entries. That is - whatever is found in COL A, replace with COL B I want to post the answer back to a new clean description column (that will work OK, no dramas on that section, but I don't need any messages etc, and this may be doing 100,000+ descriptions at a time, so looking for efficient coding).

However when it applies the Replace function, it also replaces part words, instead of distinct whole words, no matter how I sort the words on the Dictionary tab. ** 99 times out of a hundred there are no preceding or trailing spaces in Col A entries, but there are rare occasions...

Description Examples:

AIR COMPRESSOR

LEVEL GAUGE OIL SEPARATOR GAS COMPRESSOR

PRESS CTRL VV

PRESSURE GAUGE FLAME FRONT

PRESS as part of word becomes PRESSURE, e.g.:

COL A:              COL B:                    
COMPRESSSOR              COMPRESSOR                
PRESSURE                 PRESSURE                  
PRESSURE   GAUGE         PRESSURE GAUGE
PRESS                    PRESSURE
AIR COMPRESSOR           AIR COMPRESSOR

I think I'm very close to getting this right, but I can't figure out how to adjust to make it run and replace whole words only - I think it is the order of where I have stuff, but not 100% sure, or if something is missing.
I would greatly appreciate your help with this.

Thanks, Wendy

Function CleanUntil(original As String, targetReduction As Integer)

Dim newString As String
newString = original

Dim targetLength As Integer
targetLength = Len(original) - targetReduction

Dim rowCounter As Integer
rowCounter = 2

Dim CleanSheet As Worksheet
Set CleanSheet = ActiveWorkbook.Sheets("Dictionary")

Dim word As String
Dim cleanword As String

' Coding for replacement of WHOLE words - with a regular expression using a pattern with the \b marker (for the word boundary) before and after word
Dim RgExp As Object
Set re = CreateObject("VBScript.RegExp")
With RgExp
    .Global = True
        '.IgnoreCase = True     'True if search is case insensitive.  False otherwise
End With

'Loop through each word until we reach the target length (or other value noted), or run out of clean words to apply
'While Len(newString) > 1     (this line will do ALL descriptions - confirmed)
'While Len(newString) > targetLength   (this line will only do to target length)

While Len(newString) > 1

    word = CleanSheet.Cells(rowCounter, 1).Value
    cleanword = CleanSheet.Cells(rowCounter, 2).Value

    RgExp.Pattern = "\b"& word & "\b"

    If (word = "") Then
        CleanUntil = newString
        Exit Function
    End If

    ' TODO: Make sure it is replacing whole words and not just portions of words
'    newString = Replace(newString, word, cleanword)     ' This line works if no RgExp applied, but finds part words.
    newString = RgExp.Replace(newString, word, cleanword)

    rowCounter = rowCounter + 1
Wend

' Once word find/replace finished, set close out loop for RgExp Object with word boundaries.
Set RgExp = Nothing

' Finally return the cleaned string as clean as we could get it, based on dictionary
CleanUntil = newString

End Function

Viewing all articles
Browse latest Browse all 88066

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>