Channel: Active questions tagged row - Stack Overflow
Viewing all articles
Browse latest Browse all 447

Identifying duplicate rows within and between dataframes based on two columns in R

I have two dataframes. For this question, I will create 2 fake example dataframes:

ppl <- data.frame(
  id    = c(1123, 5323, 5342, 1234, 6434, 2342),
  state = c('ME', 'WY', 'FL', 'MA', 'TN', 'ME'),
  name  = c('Paul', 'Lisa', 'Simone', 'James', 'Ruby', 'Paul')
)

and

ppl_2 <- data.frame(
  id    = c(1823, 5123, 5842, 1004, 6034, 1342),
  state = c('MI', 'TN', 'NM', 'TX', 'OR', 'NH'),
  name  = c('Fred', 'Ruby', 'Estelle', 'Sylvia', 'Jim', 'Bob')
)

To identify the rows where two columns (state and name) are duplicated within a dataframe, I can use:

ppl[duplicated(ppl[, c('state', 'name')]), ]
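One thing worth noting about the line above: `duplicated()` flags only the second and later occurrences, so it returns the row with id 2342 but not the first ME Paul row (id 1123). Below is a minimal sketch that returns every copy by combining a forward and a backward pass. The names `keys`, `first_dups`, and `all_dups` are illustrative only.

```r
# Example data from the question
ppl <- data.frame(
  id    = c(1123, 5323, 5342, 1234, 6434, 2342),
  state = c("ME", "WY", "FL", "MA", "TN", "ME"),
  name  = c("Paul", "Lisa", "Simone", "James", "Ruby", "Paul")
)

# Columns that define a duplicate (illustrative helper name)
keys <- c("state", "name")

# Forward pass only: later occurrences of each state/name pair
first_dups <- ppl[duplicated(ppl[keys]), ]

# Forward OR backward pass: every row whose state/name pair occurs more than once
is_dup   <- duplicated(ppl[keys]) | duplicated(ppl[keys], fromLast = TRUE)
all_dups <- ppl[is_dup, ]

print(all_dups)
```

With the example data, `first_dups` holds one row and `all_dups` holds both ME Paul rows.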

However, I am not sure how to extend this to find rows that are duplicated across the two dataframes. I want to identify the rows whose state and name appear in both ppl and ppl_2; in this example, that is TN Ruby.
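For reference, here are two possible approaches in base R, offered as a sketch rather than a definitive answer. The first uses `merge()` as an inner join on the two key columns. The second builds a combined key with `paste()` and filters each dataframe with `%in%`. The names `keys`, `both`, `key_1`, `key_2`, and the `_ppl` / `_ppl_2` suffixes are assumptions made for illustration, and the `"|"` separator assumes that character does not occur in the state or name values.

```r
# Example data from the question
ppl <- data.frame(
  id    = c(1123, 5323, 5342, 1234, 6434, 2342),
  state = c("ME", "WY", "FL", "MA", "TN", "ME"),
  name  = c("Paul", "Lisa", "Simone", "James", "Ruby", "Paul")
)
ppl_2 <- data.frame(
  id    = c(1823, 5123, 5842, 1004, 6034, 1342),
  state = c("MI", "TN", "NM", "TX", "OR", "NH"),
  name  = c("Fred", "Ruby", "Estelle", "Sylvia", "Jim", "Bob")
)

# Columns that define a match between the two dataframes
keys <- c("state", "name")

# Approach 1: inner join on the key columns.
# The suffixes keep the two id columns apart in the result.
both <- merge(ppl, ppl_2, by = keys, suffixes = c("_ppl", "_ppl_2"))
print(both)

# Approach 2: build one key string per row, then filter each dataframe.
# Assumes "|" never appears in state or name.
key_1 <- paste(ppl$state, ppl$name, sep = "|")
key_2 <- paste(ppl_2$state, ppl_2$name, sep = "|")

in_both_1 <- ppl[key_1 %in% key_2, ]    # rows of ppl that also occur in ppl_2
in_both_2 <- ppl_2[key_2 %in% key_1, ]  # rows of ppl_2 that also occur in ppl
print(in_both_1)
print(in_both_2)
```

The join puts the matching ids side by side in one row, while the `%in%` version keeps each dataframe's original columns and row names, which may be easier if the matching rows need to be dropped or flagged in place.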

