In Part I of this series, I demonstrated how to use Range Objects, Arrays and VBA to Filter and Delete data from a table. Today, I’ll use Power Query to Filter data from Table A based on criteria in Table B.

Load Data

I’ll begin by loading data from an Excel Workbook to Power Query in Excel. Here is the data I created in Part I of the series where I grabbed NFL winning percentages for all 32 NFL Teams. I replicated the data 9 more times giving me a total of 320 records.

Just like in Part I, I want to filter out the 4 teams in the NFC East. So, if everything works correctly, I should have 280 records when I am done.

Load Data Into Power Query

In Excel, click on the data menu (1). In the Get & Transform Data Group A(2), click on the Get Data icon (3).

In the pop-up menu, click on From File (4). From the next pop-up menu, select your file type. I’m using an Excel Workbook as my data source, so I’ll select Excel Workbook.

From the Import Dialog Window, navigate to the file that has your data, highlight the file (6) and click on the import button (7).

In the Navigator Dialog Window, tick the box next to “Select Multiple Items” (8). Next, tick the box next to the name of each worksheet you wish to import into Power Query (9). In this instance, there are 2 worksheets and I want to import both worksheets. Lastly, click on the Transform Data Button (10).

The data is now loaded into Power Query. There are two tables showing in the queries pane. The table with all records is named “Data” (Table A) and the table with the 4 teams I wish to filter out of the data are in the table named “crit” (Table B).

Now that I have all data that I need loaded to Power Query, I want to perform any transformations on the data and/or tables that are needed before I can use Table B as a filter on Table A.

Transformations

Power Query added a few transformations when I imported the worksheets into Power Query. The other transformation I need, is to convert the criteria table (Table B) into a List.

  1. Click on the crit Table in the Queries Pane
  2. Click on the Transform menu across the top of the Power Query Window
  3. Click on the Convert To List icon in the Any Column Group

The crit table (4) has been converted to a list (5). Note the different icon in the queries panel.

Filter Manually

Let’s take a look at what happens when I deselect the 4 teams that I want to remove from the filter drop-down in the Data Table. in the Text Filters drop-down (1), I removed the tick marks for the 4 NFC East teams (2)

A new Applied Step is added to the Applied Steps Pane (3). There are now 280 records (4) which is what I was after.

All well and good, however I want to use the list as the list of items to exclude, I don’t want to manually deselect teams. It is easy enough for this example, but imagine if I have a much larger table with many more items I need to exclude. Let’s take a look at the expression created when I manually deselected the 4 teams:

= Table.SelectRows(#"Changed Type", each ([Team] <> "Dallas Cowboys" and [Team] <> "New York Giants" and [Team] <> "Philadelphia Eagles" and [Team] <> "Washington Redskins"))

The expression is telling Power Query to return all Rows except those where the team is Dallas Cowboys, New York Giants, Philadelphia Eagles, or Washington Redskins. Those are the 4 teams I have listed on my criteria list “crit” so I need to replace

"Dallas Cowboys" and [Team] <> "New York Giants" and [Team] <> "Philadelphia Eagles" and [Team] <> "Washington Redskins"

with

List Contains

Also note in the expression, that Power Query is using the Applied Step Name #”Changed Type” in the place of the Data Table, so I will want to use the same Applied Step in my revised expression:

= Table.SelectRows(#"Changed Type", each List.Contains(crit,[Team])=false)

That is much easier to read. Note that I used =false to exclude the teams from the crit list. Here is the data table with the revised expression showing the 280 records as expected.

List Changes – Data Updates

The idea is to pass a dynamic list to filter the data table so I’ll check to see what happens to the data table when the list changes. I added the Cincinnati Bengals to the data source, saved and closed the workbook.

I returned to the Power Query Editor and refreshed all to update the list and the table. I added a fifth team so the list should return the 5 teams and the record count in the data table should now be 270.

I now have a List (Table B) that is filtering a Table (Table A). The expression now refers to a a list so that the filter is dynamic – as the data source expands and/or contracts, I can refresh all and the Data table will show more or fewer records depending on the teams listed in the data source.

Power BI Desktop

The steps are the same if you are using Power BI Desktop.

Tidy Up

That’s it for today. Grab both files from my OneDrive

Stay tuned for Part 3 in the series.

, , , , , , , ,

Recently, I was working with Oracle Business Intelligence Enterprise Edition (OBIEE) using the Smart View Add-In for Excel. I noticed that at times after ZoomIn to bottom level (Zoom Level =2) Child Level account from a Top Level Parent Account, I was left with a few Sub-total Accounts that I did not want in my final table. I decided that the best approach was to filter the data for the unwanted G/L’s and delete the visible rows. This is a good dynamic approach so that if the list expands or contracts in the future, the user can easily maintain the list on a worksheet that is read into the array that will be passed to the filter as the criteria.

The Data –

I grabbed some NFL team data from Wikipedia for the current 32 teams in the NFL:

And I added a second sheet to use to list the criteria I want to load into the array to be used to filter the first sheet

I listed the 4 teams from the NFC East (Least). I’ll remove these teams from the list since they were all terrible teams in 2019.

Code

The full code is at the bottom, but there are a few code blocks I want look at specifically

Load Criteria Array from Worksheet Range

The Range in this case is very straightforward. It is a list that begins in A1 and continues down column A with no blanks. Therefore, I can use the CurrentRegion property of the Range Object to load the worksheet range to the criteria array. If you have a more complex range, you will need to update this piece of code.

'Transfer criteria range to array
'Use CurrentRegion Property of Range Object
    Criteria_2D = wsCrit.Range("A1").CurrentRegion

Since I transferred a worksheet range to an array- the array now has 2 dimensions where the rows are the 1st dimension and the columns are the 2nd dimension. Let see what happens when I try to use this array as the criteria for the filter:

'Call the filter function
'Pass the array as the filter criteria
     x = GetFilterDeleteRows(ws:=wsData, _
                             FilterCriteria:=Criteria_2D, _
                             ColNumber:=ColumNumber)

There are still 31 teams in the table. I have 4 teams listed on the criteria sheet, so there should only be 28 teams showing. With the 2D array the only team deleted was the Dallas Cowboys, the first team listed on the criteria sheet. To get all of the teams on the criteria sheet, I am going to have to convert the array from 2 dimensions to 1 dimension.

Make the 1D Array the Same Size as the 2D Array

I’ll start the conversion process by making sure the 1D array is the same size as the 2D array for the rows part of the 2D array. Use the ReDim statement:

 'Now that we know the size of the 2D array
 'We can resize the 1D array
        ReDim Criteria_1D(LBound(Criteria_2D) To UBound(Criteria_2D))

Load the 2D Array Into the 1D Array

Now that the 1D array is the same size as the 2D array, I can safely load the contents of the 2D array into the 1D array:

    'Load 2D array into 1D array
        For i = LBound(Criteria_2D, 1) To UBound(Criteria_2D, 1)
            Criteria_1D(i) = Criteria_2D(i, 1)
        Next i

Filter and Delete Data Based on Criteria Array

Now that I have the criteria loaded into the 1D array, I can pass the array to my function to filter the data and delete the visible rows after the filter:

    'Call the filter function
    'Pass the array as the filter criteria
        x = GetFilterDeleteRows(ws:=wsData, _
                                FilterCriteria:=Criteria_1D, _
                                ColNumber:=ColumNumber)

Filter Data

This code chunk will filter the data based on the criteria array. I see that the team name I am filtering for is in Column A (Field Number 1) so I passed “1” to the function. There are other, better ways to find the position of the field to make the code more dynamic in case the column containing the team name were to move:

 'Filter range based on criteria array
        rng.AutoFilter _
            Field:=ColNumber, _
            Criteria1:=FilterCriteria, _
            Operator:=xlFilterValues

Delete Visible Rows Except Header Row

This bit of code will delete the visible rows based on the criteria I filtered for. Note that I use the Resize Property of the Range Object so that the header row is not deleted. Again, simply passing Field Number as “1”.

 'Filter range based on criteria array
        rng.AutoFilter _
            Field:=ColNumber, _
            Criteria1:=FilterCriteria, _
            Operator:=xlFilterValues

Now that I have the array with 1 dimension as the filter criteria, I am getting the correct expected results. The 4 teams from the NFC East have been deleted leaving 28 teams.

Multiple Instances

My data only has 1 unique instance of team. What if there is more than 1 instance that I want to delete? No problem, the filter will show all instances of the criteria and once visible, the SpecialCells Method of the Range Object with the xlCellTypeVisible enumeration will handle all instances of the given criteria.

I made 9 additional copies of of the data for a total of 10 instances of each team giving a total of 320 records in the table.

There are 10 instances of each team and I want to delete 4 teams for a total of 40 records. When all is done and said I should have 280 records remaining.

The final data after running the procedure again to filter and delete all instances of the 4 teams. 280 records remain expected – sweet succeess!

The Main Procedure

Here’s the main procedure

Option Explicit

Sub FilterOutUnwantedValues()

    'Developer      :   Winston Snyder
    'Created Date   :   January 23, 2020
    'Purpose        :   Remove rows of data the user does not want in the final dataset
    'Comments       :

    'Objects
        Dim wb As Workbook
        Dim wsData As Worksheet
        Dim wsCrit As Worksheet

    'Variables
        Dim x As Long
        Dim r As Long
        Dim i As Long
        Dim Criteria_2D() As Variant
        Dim Criteria_1D() As String
        
    'Constants
        Const ColumNumber As Long = 1

    'Initialize
        Set wb = ThisWorkbook
        Set wsData = wb.Worksheets("Data")
        Set wsCrit = wb.Worksheets("crit")

    'Transfer criteria range to array
    'Use CurrentRegion Property of Range Object
        Criteria_2D = wsCrit.Range("A1").CurrentRegion

    'Now that we know the size of the 2D array
    'We can resize the 1D array
        ReDim Criteria_1D(LBound(Criteria_2D) To UBound(Criteria_2D))

    'Load 2D array into 1D array
        For i = LBound(Criteria_2D, 1) To UBound(Criteria_2D, 1)
            Criteria_1D(i) = Criteria_2D(i, 1)
        Next i

    'Check contents of 1D Array
        For i = LBound(Criteria_1D) To UBound(Criteria_1D)
            Debug.Print i, Criteria_1D(i)
        Next i

    'Call the filter function
    'Pass the array as the filter criteria
        x = GetFilterDeleteRows(ws:=wsData, _
                                FilterCriteria:=Criteria_1D, _
                                ColNumber:=ColumNumber)

    'Tidy up
        Erase Criteria_1D
        Erase Criteria_2D
        Set wsData = Nothing
        Set wsCrit = Nothing
        Set wb = Nothing

End Sub

The Filter and Delete Function

Here is the function to filter and delete the data

Option Explicit

Public Function GetFilterDeleteRows(ws As Worksheet, _
                                    FilterCriteria As Variant, _
                                    ColNumber As Long) As Long


    'Developer      :       Winston Snyder
    'Date           :       January 23, 2020
    'Purpose        :       Filter a range based on criteria in array
    '                       Once filtered, delete all visible rows except header row
    '
    'Paramers       :       ws              :   A worksheet object
    '               :       FilterCriteria  :   A one dimensional array of values to filter the data for
    '               :       ColNumber       :   The Column Number where the filter is to be applied
    '===================================================================================================

    'Objects
        Dim rng As Range
        Dim rngDelete As Range
        
    'Variables
        Dim i As Long
        
    'Initialize
        With ws
            Set rng = .Range("A1").CurrentRegion
        End With

    'Filter range based on criteria array
        rng.AutoFilter _
            Field:=ColNumber, _
            Criteria1:=FilterCriteria, _
            Operator:=xlFilterValues

    'Create a range object of all visible rows
    'Do not include the header row
        Set rngDelete = rng.Offset(1, 0) _
                           .Resize(rng.Rows.Count - 1, rng.Columns.Count) _
                           .SpecialCells(xlCellTypeVisible) _
                           .EntireRow

    'Delete all visible rows
    'Do not include the header row
        rngDelete.Delete

    'Remove the filter
        ws.ShowAllData
        ws.AutoFilterMode = False

    'Destroy objects
        Set rng = Nothing
        Set rngDelete = Nothing

    'Return
        GetFilterDeleteRows = 0

End Function

Tidy Up

That’s it for today. You can download the workbook from my OneDrive. Stay tuned for Part II. I’ll demonstrate some other options for filtering and deleting data.

, , , , ,

DandelionFinal2

Dandelions are weeds. I’ve spent more hours weeding them out of my family’s yard and my grandparents’ yards than I care to count. Yet, when you come across them in a meadow, they are very nice to look at and add a kind of tranquility in their own right. They are edible and make a nice tasting wine. Here’s a recipe from AllRecipes. Let us know how it turns out.

Today’s post, however, is not about dandelions or wine making – it about the Filter Property of ADO Recordsets.


    edit: As is my usual practice, I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see these links for a detailed explanation of Late / Early Binding.

    1. Beyond Excel
    2. JP Software Technologies


In my last post on ADO Recordsets, I demonstrated some VBA to load a Worksheet Range into a Recordset – check it out here.

First Step

Before I filter the Recordset, I would like to manually filter the dataset so I can determine what the final results should be so I can compare to make sure everything goes correctly with the Recordset Filter.

RecordsAllFinal

Here’s some data I borrowed from Marco Russo and Alberto Ferrari. It looks as though it may have come from the AdventureWorks Database. There are a total of 60,398 records.

Now I’ll filter on the SalesManager field to look at records that are not related to Marco.

RecordsFilteredFinal

OK, 25,109 records remain after I filter out records for Marco so When I filter the Recordset, I should receive 25,109 records. I removed the AutoFilter, now I am ready to Filter the Recordset

Global Constants

In my last post. on ADO Recordsets, I began by adding some Global Constants to a module named “M_Globals”. I’m going to add a few new constants for the Filter Group Enumeration. I may use them, I may not. But at least I have them defined if I do need them.

'Filter Group Enumeration
    Public Const gcladFilterNone = 0                 'No filter. This value removes the current filter and restores all records to view.
    Public Const gcladFilterPendingRecords = 1       'Use the pending records. This value allows viewing only those records that have changed but have not yet been sent to the server. This value is only applicable for batch update mode.
    Public Const gcladFilterAffectedRecords = 2      'Use only records affected by the last Delete, Resync, UpdateBatch, or CancelBatch call.
    Public Const gcladFilterFetchedRecords = 3       'Use the last fetched records. This value allows viewing the records in the current cache returned as a result of the last call to retrieve records (implying a resynchronization).
    Public Const gcladFilterConflictingRecords = 5   'Use the conflicting records. This value allows viewing only those records that failed the last batch update.

Load The Recordset

I won’t clutter this tutorial by reposting the same code I posted in my last article on ADO Recordsets, check the Sub() out here.

Filter The Recordset

Now that I have a Recordset, I just need to add a bit of code to filter it. Recall, I am interested in all records where the Sales Manager is not Marco. So my criteria string will be something like “SalesManager <> ‘Marco Russo'”

I just need to add 6 lines to my original Sub() and of those, 2 lines are comment lines (I could use fewer lines, I’m using additional lines for clarity)

        Dim strFilter As String
        'Filter string
            strFilter = "SalesManager <> 'Marco Russo'"
        'Filter the Recordset and display the filter record and field count to check results
            rs.Filter = strFilter
            Debug.Print "The filtered recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"

Returns:

The original recordset contains 60,398 records and 23 fields
The filtered recordset contains 25,109 records and 23 fields

Perfect! The Filtered Recordset matches with the results I obtained earlier by manually filtering the Range.

Gimmee The Data…

Most likely, we want to return the dataset back to the user in either a new workbook or a new worksheet. For today, I’ll return the Filtered Recordset back to the same Workbook on a new Worksheet.

Add A Worksheet

I’ll create a Function to add a worksheet to a workbook so that I have a safe place to return the results of the Filtered Recordset

Public Function AddWorksheet(wb As Workbook) As Worksheet

    'Declare variables
        Dim wsNew As Worksheet
        
    'Add worksheet to end of other worksheets in the workbook
        With wb
            Set wsNew = .Worksheets.Add _
                                    (After:=.Worksheets(.Worksheets.Count))
                            
        End With
        
    'Return object to function
        Set AddWorksheet = wsNew
        
    'Tidy up
        Set wsNew = Nothing
        
End Function

And I call the Function here:

        'Add a worksheet for the filtered results
            Set wsResults = AddWorksheet(wb:=wb)

CopyFromRecordset Method

The Range Object has a CopyFromRecordset Method, so I’ll use that:

        'Copy the filtered recordset to the results range
        'The CopyFromRecordset Method does not include headers
            wsResults.Cells(1, 1).CopyFromRecordset rs

And the output:
RecordsetOutFinal

It’s looking good. The record count matches with what I expected from the manual filter process at the top of the post. The only problem is that the CopyFromRecordset Method does not include the field headers, so I’ll need a small Sub() to get the field headers and then output the Recordset to Cell(2,1) instead of Cell(1,1).

The Fields Collection

The Recordset Object has a Fields Collection, so I can loop through the Fields Collection to get the Field Names. The gotcha here is that the Fields Collection begins as zero – so be aware of that.

Here’s the Sub():

Public Sub GetRSFieldNames(ws As Worksheet, _
                           rs As Object)
                           
    'Declare variables
        Dim x As Long
        
    'Get field names
        For x = 0 To rs.Fields.Count - 1
            ws.Cells(1, x + 1).Value = rs.Fields(x).Name
        Next x
    
End Sub

Here is how I called the Sub():

        'Output Recordset Field Names to the worksheet
            Call GetRSFieldNames(ws:=wsResults, _
                                 rs:=rs)

And the Output:
RecordWFieldsFinal

Looks pretty good – I just want to add some formatting to improve readability.

CharlieDanielsFinal

Fiddle Factor

No – not Charlie Daniels pictured here sawing on a fiddle and playing it hot as in The Devil Went Down To Georgia. Rather, Fiddle Factor is a term I learned from one of my supervisors which refers to the amount of time and energy spent formatting an Excel Report. The more time and energy spent – the higher the Fiddle Factor.

But I think formatting is very important. Not only does it make data and information easier to read and understand, but if it is done well, it actually draws or invites the reader in. Stephen Few has quite a bit to say about well-done formatting on his blog, Perceptual Edge.

Enough of my soap box, my goal in this case is not so lofty. I just want to add a bit of color to the header row, fit the column width to the data and maybe play with the zoom level:

Here is the Sub() to format the output:

Sub FormatOutput(ws As Worksheet)

    'Declare variables
        Dim LastColumn As Long
        Dim rngHeader As Range
        Dim lngColor As Long
        
    'initialize
        lngColor = RGB(68, 84, 106)
        
    'Get last column of header row range
        LastColumn = GetLast(ws:=ws, _
                             strType:="c")
                             
    'Create Range Object - header row range
        With ws
            Set rngHeader = .Range(.Cells(1, 1), .Cells(1, LastColumn))
        End With
        
    'Format the header row range
        With rngHeader
            .Interior.Color = lngColor
            .Font.Bold = True
            .Font.Color = vbWhite
        End With
        
    'Format Dates
        With ws
            .Range("L2").EntireColumn.NumberFormat = "MM/DD/YYYY"
        End With
               
    'Set zoom level
        ws.Activate
        ActiveWindow.Zoom = 75
        
    'Fit column width to data
        Columns.AutoFit

End Sub

And here is how I call the Sub():

        'Format the output
            Call FormatOutput(ws:=wsResults)

The Full Monty

Here is the main Sub() with the additions to Filter the Recordset, Output the Recordset to a new Worksheet, and Format the data:

Sub FilterRecordset()

    'Declare variables
        Dim wb As Workbook
        Dim wbADO As Workbook
        Dim ws As Worksheet
        Dim wsResults As Worksheet
        Dim rng As Range
        Dim rngResults As Range
        Dim cn As Object
        Dim rs As Object
        Dim cmd As Object
        Dim strWorksheet As String
        Dim strSQL As String
        Dim strWorkbookADO As String
        Dim strFilter As String
                
    'Excel environemnt
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With

    'Initialize
        Set wb = ThisWorkbook
        
        'Get worksheet to be loaded into recordset
            strWorksheet = GetSelectedSheet(strPrompt:="Select a cell on the worksheet to be loaded into the recordset", _
                                            strTitle:="Worksheet To Recordset")
                                            
        'Create a new workbook to hold all data from the selected worksheet
            Set wbADO = Workbooks.Add
            
        'Copy everything from the selected worksheet to the new workbook
            Call CopyData(wbSource:=wb, _
                          wbDestination:=wbADO, _
                          strSource:=strWorksheet)
                          
        'Cleanup the destination workbook
            Call CleanupWorkbook(wb:=wbADO)
            
        'Save and close the data workbook
            With wbADO
                .SaveAs wb.Path & "\" & Mid(wb.Name, 1, Len(wb.Name) - 5) & "_ADO.xlsx", FileFormat:=xlOpenXMLWorkbook
                strWorkbookADO = wbADO.FullName
                .Close
            End With

        'Create a range object to measure source data against final recordset data
            Set ws = wb.Worksheets(strWorksheet)
            Set rng = ws.Range("A1").CurrentRegion

        'SQL string
            strSQL = "SELECT * FROM [Data$]"
            
        'Filter string
            strFilter = "SalesManager <> 'Marco Russo'"

        'Create ADO Connection Object
            Set cn = GetADOConnection()
            cn.Open ("Provider=Microsoft.ACE.OLEDB.12.0;" & _
                     "Data Source=" & strWorkbookADO & ";" & _
                     "Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1'")

        'Create ADO Command Object
            Set cmd = GetCommand()
            Set cmd.ActiveConnection = cn
            cmd.CommandType = gcladCmdText
            cmd.CommandText = strSQL                        'Pass SQL String to the command object

        'Create ADO Recordset Object and load records
            Set rs = GetRecordset()
            With rs
                .CursorLocation = gcladUseClient
                .CursorType = gcladOpenDynamic
                .LockType = gcladLockOptimistic
                .Open cmd
            End With

        'Compare recordset results to original data
            Debug.Print "The original recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"
            Debug.Print "The range contains " & Format(rng.Rows.Count - 1, "##,##0") & " rows and " & rng.Columns.Count & " columns" '-1 to discount header row
            
        'Filter the Recordset
            rs.Filter = strFilter
            
        'Add a worksheet for the filtered results
            Set wsResults = AddWorksheet(wb:=wb)
            
        'Output Recordset Field Names to the worksheet
            Call GetRSFieldNames(ws:=wsResults, _
                                 rs:=rs)
            
        'Copy the filtered recordset to the results range
        'The CopyFromRecordset Method does not include headers
            wsResults.Cells(2, 1).CopyFromRecordset rs
            
        'Format the output
            Call FormatOutput(ws:=wsResults)
    
            
        'Tidy up
            'Close objects
                rs.Close
                cn.Close
                
            'Destroy objects
                Set rs = Nothing
                Set cmd = Nothing
                Set cn = Nothing
                Set rng = Nothing
                Set ws = Nothing
                Set wsResults = Nothing
                Set wbADO = Nothing
                Set wb = Nothing
                
            'Excel environemnt
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                    
End Sub

And the final output:

RecordWFieldsAutoFitComplete

Tidy up

Final Thoughts

This post was about Filtering ADO Recordsets. The Filter I used was very simple and only scratches the surface of what is possible. You may use the Filters in combinations with AND, OR, LIKE and Wildcard Characters. Make sure you check out the link to Recordset Filter Property. Lots of great information.

I don’t like that I hard coded the Filter String inside the Sub(). It would be better to offer a user form at run-time to read the fields in the recordset and prompt the user to make choices through Combo Boxes, Check Boxes, etc…

Other Recordset Posts At dataprose.org

Additional Resources

Downloads

Download the file from OneDrive. The filename is Excel – Recordset_v3.xlsm

Credits

Data courtesy Microsoft Excel 2013 Building Data Models with PowerPivot by Alberto Ferrari and Marco Russo (Mar 25, 2013)