On May 21, 2006, at 5:19 PM, David Cottrell wrote:

> Hi All
>
> Can anyone see a way to speed-up the function below?
>
> What I have is two text files which always start with a word and
> are delimited in some way (usually tabs). I need to build a list of
> matching lines. The problem is that the files are very long (one is
> all the words in English, the other Spanish).
>
> My approach is to build a dynamic array of words in the first list
> (fast) and then scan this list with each item in the other list
> (what this function does). The problem is it takes hours (or
> appears to).
>
> Any suggestions welcome.
>
> Cheers

David,

I couldn't actually test this, so it may be bug infested, but I think the concept is sound. It optimizes register calculations and comparisons, minimizing reading from and writing to RAM.

This should shave a little time off your routine, but the thing that would make the biggest difference would be to sort your gArrayF1 as you're building it. Then you could use a MUCH faster search to see if each word in the other file was there. I expect it would cut the time at least in half--perhaps dramatically more! Is that a possibility? It would likely be worth building a second array of indices and sorting that for the search. I may play with that a little further.

Let me know if you need any help with bug extermination.
hth,

 e-e
=J= a  y
 "

'------------------------------------------------------------
local
dim as ptr  p, p1, pF1
dim as long pEnd, L, F1item, target
local fn scanFIle2 (delim as long)
dim as long handleSize

if gFile2Hnd = 0 then exit fn
handleSize = FN GetHandleSize( gFile2Hnd )
if handleSize < 8 then exit fn
HLock(gFile2Hnd)
p    = [gFile2Hnd]
pEnd = p + handleSize - 1
gMatchCount = 0

for p = p to pEnd' loop through each line
  for p1 = p + 1 to pEnd' Search for end of word (delim)
    long if p1.0`` < _"a"
      long if p1.0`` == delim' Found a valid word--process it
        L = p1 - p - 1
        target = ( L << 8 ) + p.0``
        pF1 = @gWordArrayF1( 0 )
        for F1item = 0 to gNoInIndex' Search for match in gWordArrayF1
          long if target = pF1.0%' Len & 1st char match
            while L > 1
              if | p + L | <> | pF1 + L | then exit "nextWord"
              L --
            wend
            gMatchCount++' Words match
            gMatchedItems(gMatchCount) = F1item
          end if
          "nextWord"
          pF1 += sizeOf( gWordArrayF1( 0 ) )' Go to next word in array
        next
      xelse' Not a valid char
        for p = p1 + 1 to pEnd' Bad word--move to next
          if p.0`` == delim then exit for
        next
      end if
    end if
  next p1
  // ??? Can this go before the loop ???
  //IF SYSTEM(_sysVers) => 1000 THEN FN KillSpinningCursor
  p = fn skipToEndOfLine( p )'error
next p

gDataReady = _True
compress dynamic gMatchedItems
end fn
'------------------------------------------------------
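
For the sort-then-search idea mentioned in the message, here is a rough sketch. It is written in Python rather than FutureBASIC, purely to illustrate the concept, and it is not a drop-in replacement for the routine above. The function names (first_word, build_sorted_index, find_matches) and the sample data are made up for the example, and it assumes both files fit in memory as lists of lines with the word before the first delimiter. The "second array of indices" is modelled as a list of (word, line number) pairs sorted once, so each lookup becomes a binary search instead of a full scan.

```python
from bisect import bisect_left


def first_word(line, delim="\t"):
    """Return the leading word of a delimited line (text before the first delimiter)."""
    return line.split(delim, 1)[0]


def build_sorted_index(lines, delim="\t"):
    """Sort (word, line number) pairs once so each later lookup is a binary search."""
    return sorted((first_word(line, delim), i) for i, line in enumerate(lines))


def find_matches(index, lines2, delim="\t"):
    """Return file-1 line numbers whose leading word also starts a line in file 2."""
    keys = [word for word, _ in index]  # sorted words, parallel to index
    matches = []
    for line in lines2:
        target = first_word(line, delim)
        pos = bisect_left(keys, target)  # O(log n) probe instead of scanning every entry
        # Duplicate words sit next to each other after sorting, so collect them all.
        while pos < len(keys) and keys[pos] == target:
            matches.append(index[pos][1])
            pos += 1
    return matches


# Hypothetical sample data standing in for the two word files.
file1 = ["dog\tnoun", "cat\tnoun", "actual\tadjective", "animal\tnoun"]
file2 = ["animal\tsustantivo", "actual\tadjetivo", "gato\tsustantivo"]

index = build_sorted_index(file1)
print(find_matches(index, file2))  # prints [3, 2]
```

With n words in the first file and m in the second, the linear scan needs on the order of n x m comparisons, while the sorted version needs roughly m x log2(n) probes after a one-time sort. The same approach should carry over to FutureBASIC by sorting an array of indices into gWordArrayF1 and replacing the inner for/next scan with a binary search.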