While working with excel, I believe, everyone faces this situation sometime or other. How to figure out duplicates? There could be different dimensions of duplicate data. For example, it could be at sheets level. You want to figure out if two sheets are similar or the same. Some time, you might be more interested in rows or cells level.
On the other hand, you might be looking for actions and that could be just identification, or removal of duplicates or sometime there could be more actions. Lets look at some scenarios and and find probable solutions
Find if Two Sheets are same or not
If the sheets are in question have different number of rows or columns, then it is obvious that those two excel sheets do not contain same data. number of rows and columns can quickly be verified by visually checking
Find which rows are same in two sheets
Before we explore this situation, let us visualize what we are talking about Suppose, we have two sheets that have different values in some cells. You can have a look at the two pictures of sheet1 and sheet2. Contents of the cell which has different value has been highlighted
We will take another sheet to compare sheet1 and sheet2. in this article we are showing the sheets are in the same Excel workbook. But approach would be the same for sheets in different excel workbooks. The trick that I am going to use is compare content of a row in in one sheet to the content of same row in the other sheet. To do that, we will use and excel function called CONCAT. this function concatenates the content of the the cells that is passed to it as range. with the formula CONCAT(A1:D1), it will concatenate contents of A1, B1, C1 and D1.
Let us not touch the excel sheet1 and sheet2 as these are our input and we want to compare them. let's create a new sheet call sheet3 and put the formula =CONCAT(Sheet1!A2:D2)=CONCAT(Sheet2!A2:D2) in A2 cell of sheet3. We copy and paste the formula till A8. Since in this example, we are focusing on the trick, I have made the example file fairly small. Only 8 rows of data including the header and data till D column
So, what this formula does is concatenates the content of all the cells that have data in row2 of sheet1 and compares it with the concatenated contents of row2 of sheet2. If it matches, then the value in A2 is true and otherwise it is False
If we copy the formula for all the rows including row 1 till the number of rows that we have in the data sheets, it will show a list of True and False. If there is a single False, we know there is some change and exactly which row has differences
In the following example it shows that row number 4 and 7 data do not match
Where is the duplicate row - if any sheet was sorted?
Sometime it happens that the sheets are filtered, sorted etc to do some analysis. So, the entire row has moved it's position. This is discussed in another blog: Trace a duplicate row in another sheet