Remove Duplicate Mail from Outlook for Free

Remove Duplicate Mail from Outlook for Free

Source code at the top for the TLDRers because I’m a gentleman. Please take the time to read my tale.

This code is VBA in Outlook; I think the keyboard shortcut is Alt + F11.

  • Make a backup PST before running this code
  • Open the VBA editor (ah, nostalgia of making aol progs)
  • Paste it
  • Hit the play button
  • Select Outlook folder (it’s recursive, so it will iterate down from that folder)
  • Saves removals log to %DESKTOP%\dupes_removed.txt

You may want to modify the criterion that constitutes a match for your dataset. Mine had weird whitespace differences so weren’t true duplicates. I was going to get all fancy and use Levenshtein method for string comparison, but I didn’t since efficient. I use OutlookObject.Subject, OutlookObject.SentOn, and OutlookObject.Sender; If those match I delete.

Sub Main()
    'Select a source folder
   
    Dim PickFolder As Folder
    Set PickFolder = Outlook.Application.Session.PickFolder

    Call processFolder(PickFolder)
End Sub
Public Sub SaveToFile(Data As String)
    Open GetDesktop & "\dupes_removed.txt" For Append As #1
        Write #1, Data
    Close #1
End Sub
Function GetDesktop() As String
    Dim oWSHShell As Object

    Set oWSHShell = CreateObject("WScript.Shell")
    GetDesktop = oWSHShell.SpecialFolders("Desktop")
    Set oWSHShell = Nothing
End Function
 Private Sub processFolder(ByVal oParent As Outlook.MAPIFolder)

        Dim oFolder As Outlook.MAPIFolder
        Dim oMail As Outlook.MailItem

        Dim OutlookObject As Object
        Dim DuplicatesFound As Long
        Dim EmailDict As Object
        Dim Filter As String
        Dim Items As Object
        
        Set EmailDict = CreateObject("Scripting.Dictionary")
        
        DuplicatesFound = 0
        
        EmailDict.RemoveAll
        
        For Each OutlookObject In oParent.Items
            Select Case TypeName(OutlookObject)
                Case "MailItem"
                    'email item
                    On Error Resume Next
                    strKey = OutlookObject.Subject & ";;" & OutlookObject.SentOn & ";;" & OutlookObject.Sender
                    'Debug.Print strKey
                    
                    If EmailDict.exists(strKey) Then
                        DuplicatesFound = DuplicatesFound + 1
    
                        OutlookObject.Subject = "**DUPLICATE**"
                        OutlookObject.Save
                        DoEvents
                    Else
                        EmailDict.Add strKey, True
                    End If
                    
                    'Debug.Print OutlookObject.UserProperties
                Case "AppointmentItem"
                    DoEvents
                Case Else
                    Debug.Print TypeName(OutlookObject)
            End Select
        Next
        
        Filter = "[Subject] = '**DUPLICATE**'"
        
        Set Items = oParent.Items.Restrict(Filter)
        
        While Items.Count > 0
            On Error Resume Next
            'Debug.Print "   " & Items.GetFirst.Subject
            Items.Remove 1
        Wend
        
        Debug.Print "Removed " & DuplicatesFound & " dupes"
        Call SaveToFile("Removed " & DuplicatesFound & " dupes")

        If (oParent.Folders.Count > 0) Then
            For Each oFolder In oParent.Folders
                processFolder oFolder
            Next
        End If
End Sub
Sub Dedupe(objFolder As Folder)
    Dim OutlookObject As Object
    Dim DuplicatesFound As Long
    Dim EmailDict As Object
    Dim Filter As String
    Dim Items As Object
    
    Set EmailDict = CreateObject("Scripting.Dictionary")
    
    DuplicatesFound = 0
    
    EmailDict.RemoveAll
    
    For Each OutlookObject In objFolder.Items
        Select Case TypeName(OutlookObject)
            Case "MailItem"
                'email item
                strKey = OutlookObject.Subject & ";;" & OutlookObject.SentOn & ";;" & OutlookObject.Sender
                'Debug.Print strKey
                
                If EmailDict.exists(strKey) Then
                    DuplicatesFound = DuplicatesFound + 1

                    OutlookObject.Subject = "**DUPLICATE**"
                    OutlookObject.Save
                    DoEvents
                Else
                    EmailDict.Add strKey, True
                End If
                
                'Debug.Print OutlookObject.UserProperties
            Case "AppointmentItem"
                DoEvents
            Case Else
                Debug.Print TypeName(OutlookObject)
        End Select
    Next
    
    Filter = "[Subject] = '**DUPLICATE**'"
    
    Set Items = objFolder.Items.Restrict(Filter)
    
    While Items.Count > 0
        On Error Resume Next
        'Debug.Print "   " & Items.GetFirst.Subject
        Items.Remove 1
    Wend

    MsgBox "Found " & DuplicatesFound & " duplicates"
End Sub
This script would not have been possible had it not been for the people on Stackoverflow, internet hugs to everyone.

There are two things in this life that I excel :

  • Having kids
  • Automating tasks I don’t want to do manually because I am lazy efficient

I found myself in a pickle recently with a user’s inbox having loads of duplicates. All the 3rd party utilities I found online wouldn’t detect them (even ODIR)

This was a result of mail initially being imported via an OLM file with Outlook for Mac- Don’t. Ever. Do. This. It crashed in the middle of an import that was syncing up to 365. When it failed, I decided we needed to this with a PST and Outlook for Windows.

Due to privacy concerns the client insisted upon doing the migration via VM in their office. Otherwise I’d have used a migration wizard. Anyways I was able to script my way out of it, and I figured this could help someone else out down the road.


Comments are closed.