Hi Jay, I've been testing this function and have found some odd behaviour. Keep in mind these files have in excess of 100,000 lines. If I compare the same file against itself, it works perfectly (for both of the large files I'm testing on). If I compare short (30-line) files which have only a little overlap, again it works perfectly. But if I compare the two different large files, I get very few matches (13, when I would expect thousands). Any ideas what would kill it? I'm conscious that one file is not English and might contain some odd characters, such as the accented letters used in Spanish and French. I'm at a loss in debugging now, since it only bombs on really big files (it ends almost instantaneously). Cheers, David