I like music – lots of different kinds of music, including The Beatles pictured here. These days, I’m more into “Adult Alternative” music, but I still like to listen to some stuff from the ’60’s and ’70’s every now and again.
Around these parts, we get “Breakfast With The Beatles” with Andre Gardner every Sunday from 7am-9am. Check your local listings to see if it is available in your area or you can stream it on KSLX. Really excellent show if you are into The Beatles.
However, today’s post is not about music or The Beatles. It is about Merging Excel Workbooks. Over on the LinkedIn Excel Groups there are many questions about merging data from several Excel Workbooks into 1 Workbook. Many folks suggest linking the workbooks using a formula. Others suggest using the INDIRECT() function. I prefer to use some VBA to copy the desired data from the Source Workbooks to the Destination Workbook.
All testing, screen shots, code in this post are from Office 365. If you are using another version of Excel, your results may vary (though they should not).
Linking Files
A common way to get data from one file to another is to link them. In this sample, I opened 2 Excel files, entered a value in 1 file, activated another file, typed, “=” and clicked on a cell in the 1st file:
DestinationWB.xlsx in the active workbook and cell $A$1 is selected. The formula bar give us:
- [SourceWB.xlsx] <- The name of the workbook
- Sheet1! <- The name of the worksheet
- $A$1 <- The cell reference
When I close the source workbook, the file path is added to the linked cell formula:
The folder location ‘C:\Data\ has been added to the linked formula.
There are a few things I do not like about linked cells
- The absolute references were added automatically. If I drag the formula to the right, the formula will still reference $A$1, not $B$1 as we may require
- What happens if a Row is added at Row 1 pushing all subsequent rows down by 1? This could cause an error in the linked formula, erroneous results, or at least the need for additional maintenance.
- What happens if the source file(s) are moved from the referenced folder location?
Here is the formula with the file in the folder ‘C:\OriginalFolder\.
Linking Files – File Migration
One problem with linking files is that files move in our directory structures from time-to-time. We may initiate this move ourselves as we come up with a new way to organize our file structures. Other times this move may be initiated by the IT Department as they are updating or migrating storage.
Let’s see what happens to the original link formula when I move the source file.
I moved the file from ‘C:\OriginalFolder\ to ‘C:\NewFolder\ in Windows Explorer.
Now when I try to update the link value in the Destination Workbook, I receive an error message
I can click on the change source button, but I would like a solution that provides for as little maintenance needs as possible.
Linking Files – Data Moves On Worksheet
The original linked formula is linked to cell A1 in the source file. What happens if, unbeknownst to you, a user inserts a blank row at A1 and all of the data shifts from A1 to A2?
The Edit Links Dialog status reads, “OK”, but it is not. The Linked Cell formula is still linked to Cell $A$1 and the updated Cell Value is now 0. We could use Find-Replace to update from $1 to $2, but that could have unintended consequences. What if other workbook moved data 2-3 rows? You see the point I’m sure.
Copy Data From Source To Destination
I propose to loop through a folder and copy some data from a worksheet for each workbook found in the folder – I would like the solution to be as dynamic as possible. As I am looping, I will paste the data on a worksheet in Top-Down fashion such that the first file results will be near the top, then the data from the second file and so on.
For today, my source files all have an Excel Table in them, though we could make the process work with Ranges as well with just a bit more work. For additional resources working with Excel Tables, see the links at the bottom of the post.
Some Functions & Properties
First, I’ll look at some VBA functions that I am going to use in the final code. These functions will make your code more dynamic and user-friendly.
CurDir
CurDir returns the current path. I use CurDir to trap the current path so I can restore it at then before I change the path to make navigating the file structure faster. Use as follows:
Option Explicit Sub foo2() Dim strDirectory As String strDirectory = CurDir() Debug.Print strDirectory End Sub
Output:
C:\Users\wsnyder\Documents
ChDir
ChDir changes the the current path. I use ChDir to change the current path to get the user closer the final folder they will eventually choose using the FileDialog to select a folder for processing. Use as follows:
Option Explicit Sub foo2() Dim strDirectory As String strDirectory = "C:\Data\" ChDir (strDirectory) Debug.Print CurDir() End Sub
Output:
C:\Data
FileDialog Property
The FileDialog Property of the Application Object returns a FileDialog Object. This give you the ability to interact with users at runtime by allowing the user to choose File(s) or Folder(s) to work with. The FileDialog accepts one argument, the DialogType.
There are 4 DialogTypes in the MsoFileDialogType Enumeration :
- msoFileDialogFilePicker
- msoFileDialogFolderPicker
- msoFileDialogOpen
- msoFileDialogSaveAs
For today’s purposes, I’ll use msoFileDialogFolderPicker
Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker)
The Setup…
As I mentioned earlier, each source file has an Excel Table in it. I would like to loop through each file in the source folder and copy the DataBodyRange of the Excel Table to the destination workbook. Additionally, on the first pass, I would also like to copy the HeaderRowRange to create headers in the destination workbook. Lastly, I would like to add some data to the right of the data from the source files, such as the date the data was copied and the name of the source file.
Create Some Sample Data and Files
I quickly whipped up some sample data using Dick’s Random Data Generator and creatively saved the files as File1.xlsx, File2.xlsx, File3.xlsx.
Again, I highlighted the data [Ctrl] + [a] and added an Excel Table [Ctrl] + [t] in each file. I applied a different Table Style to each Table simply to highlight the fact that there are 3 different Tables in 3 different Excel files. If you need to brush up on Excel Tables or need to start at the beginning:
Loop Through Files In A Folder
When I need to loop through files in a folder – I use the FileSystemObject (FSO). The FileSystemObject is a top-level object in the Microsoft Scriping Runtime Library (Scrrun.dll). Here are some additional references if you need to brush up or are not familiar with FSO.
edit: I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see the “Additional Resources” at bottom for links to detailed explanation of Late / Early Binding.
Copying Data From Source To Destination
It’s finally time to copy the data from the source workbooks to the destination workbook. I’m using three files with 50 records each, but you could use this code with an unlimited number of records or variable number of records and unlimited number of files (3, 10, 50,…) – as long as you do not exceed 1,048,576 rows (though I would never use that many row in Excel – time to consider a database).
Option Explicit Sub CopyDataFromSourceFiles() 'Author : Winston Snyder 'Created Date : 1/26/2014 'Comments : Assumes each source file contains at least one list object (Excel Table) 'Delare variables Dim wb As Workbook Dim wbData As Workbook Dim ws As Worksheet Dim wsData As Worksheet Dim rngData As Range Dim rngDestination As Range Dim lo As ListObject Dim fso As Object Dim fsoFolder As Object Dim fsoFile As Object Dim strSelectedFolder As String Dim strCurrentPath As String Const strSpecifiedPath As String = "C:\" Dim lngRows As Long Dim blnFlag As Boolean 'Excel environment - speed things up With Application .ScreenUpdating = False .DisplayAlerts = False .EnableEvents = False .Calculation = xlCalculationManual End With 'Initialize variables Set wb = ThisWorkbook Set ws = wb.Worksheets("Data") blnFlag = True 'Clear data from control workbook from previous consolidations ws.UsedRange.ClearContents 'Get the current path, so reset the path at the end of the procedure strCurrentPath = CurDir() 'Set the target directory to get the user closer to the working folder 'This will minimize the time the user must spend drilling into the file system 'once they are presented with the FileDialog ChDir (strSpecifiedPath) 'Create a FileSystemObject Set fso = GetFSO 'Prompt the user to select a folder 'Return the path of the selected folder strSelectedFolder = GetSelectedFolder 'Get the FSO Folder of the selected folder Set fsoFolder = fso.GetFolder(strSelectedFolder) 'Loop each file in folder 'Copy data from each file to control workbook For Each fsoFile In fsoFolder.Files Debug.Print fsoFile.Name Set wbData = Workbooks.Open(fsoFile) Set wsData = wbData.Worksheets("Sheet1") 'Get next blank row from destination worksheet 'If first time, need row 1, else, next blank row lngRows = GetRows(ws:=ws) If blnFlag = False Then lngRows = lngRows + 1 'The Destination Range Set rngDestination = ws.Cells(lngRows, 1) 'If first time, include the header row With wsData For Each lo In .ListObjects If blnFlag = True Then Set rngData = Union(lo.HeaderRowRange, lo.DataBodyRange) blnFlag = False Else Set rngData = lo.DataBodyRange End If Next lo End With 'Copy the Data Range to the Destination Range rngData.Copy rngDestination.PasteSpecial xlPasteValuesAndNumberFormats 'Close the source file wbData.Close Next fsoFile 'Tidy up 'Restore to original path ChDir (strCurrentPath) 'Restore Excel environment With Application .ScreenUpdating = True .DisplayAlerts = True .EnableEvents = True .Calculation = xlCalculationAutomatic End With 'Destroy objects Set fsoFolder = Nothing Set fso = Nothing Set rngData = Nothing Set rngDestination = Nothing Set ws = Nothing Set wb = Nothing End Sub '------------------------------------------------------------------------------------ Private Function GetRows(ws As Worksheet) As Long Dim r As Long With ws r = .Cells(Rows.Count, 1).End(xlUp).Row GetRows = r End With End Function '------------------------------------------------------------------------------------ Private Function GetFSO() Dim fso As Object Set fso = CreateObject("Scripting.FileSystemObject") Set GetFSO = fso Set fso = Nothing End Function '------------------------------------------------------------------------------- Private Function GetSelectedFolder() As String Dim diaFolder As FileDialog Dim strFolder As String Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker) With diaFolder .AllowMultiSelect = False .Show strFolder = .SelectedItems(1) End With GetSelectedFolder = strFolder End Function
I formatted the output and hid some rows to show that there are 151 records as expected (3 files * 50 records each + 1 header row).
My favorite part of the code is here
'If first time, include the header row With wsData For Each lo In .ListObjects If blnFlag = True Then Set rngData = Union(lo.HeaderRowRange, lo.DataBodyRange) blnFlag = False Else Set rngData = lo.DataBodyRange End If Next lo End With
The properties of the ListObject (Excel Table) such as HeaderRowRange and DataBodyRange are 2 reasons why the ListObject is far superior to the Range Object. Couple these kinds of properties with the fact that you can move the Excel Table anywhere on the worksheet you want and add rows to the Table or redact rows from the Table and the consolidation code will still work flawlessly. No Excel Hell! Awesome!
Tidy Up
Final Thoughts
That’s it for today. That was a long post and there is quite a bit going on. I spent a bit of time on why not to link files and then covered the gamut: some VBA Functions, VBA, the FileDialog Property, the FileSystemObject and ListObjects (Excel Tables) and some ListObject properties. Hopefully, I was able to answer more questions than I opened. The only question remaining in my mind – Who was The Walrus? :-)
Downloads
Download the zip archive from SkyDrive. The name is MergeExcelFiles.zip
Additional Recources
Early / Late Binding
Beyond Excel
JP Software Technologies
Merge Excel Workbooks
Merging Data from Multiple Workbooks into a Summary Workbook in Excel
RDBMerge, Excel Merge Add-in for Excel for Windows