Thursday, July 07, 2005

more file processing stats

So I can't get over the fact that I want to write the GUI in Swing. I had the idea to use Ruby on the back end to get the list of files and write the whole list to a file, then use Java to pick up that file, parse it, search it (per user input) for a given file and use Swing as the GUI! All, of course, kicked off from within a Ruby script. Sound impossible? Check it out:


time ruby finderPrinter.rb
17370

real 0m0.598s
user 0m0.427s
sys 0m0.164s

So that's 17370 files, written to disk in 0.598 seconds! Beautiful! The file is ~1.5 MB. And how fast can Java process the file?


java -classpath . JReader
17370
took 169 milliseconds
Looked for 'a' in all files: 17258 169 milliseconds


Incredible. Here's the Java code:

// imports needed at the top of the file
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;

public void readFile(String fileName) throws FileNotFoundException {
    // see how long it takes to read in the file,
    // then search through the list for a given name
    long timeStarted = System.currentTimeMillis();
    BufferedReader reader = new BufferedReader(new FileReader(fileName));
    String currentLine = null;
    try {
        // readLine() returns null at end of file
        while ((currentLine = reader.readLine()) != null) {
            fileList.add(currentLine);
        }
        reader.close();
    }
    catch (IOException ex) {
        ex.printStackTrace();
    }

    for (Iterator fileIter = fileList.iterator(); fileIter.hasNext();) {
        String filePath = (String) fileIter.next();
        // >= 0 so that a match at the very start of the path counts too
        if (filePath.indexOf("a") >= 0) {
            resultList.add(filePath);
        }
    }
    System.out.println(fileList.size());
    // both timings below are measured from timeStarted, so each covers the read and the search
    System.out.println("took " + (System.currentTimeMillis() - timeStarted) + " milliseconds");
    System.out.println("Looked for 'a' in all files: " + resultList.size() + " " + (System.currentTimeMillis() - timeStarted) + " milliseconds");
}


Pretty simple, but that's exactly what I want. The two Collections (fileList and resultList) are ArrayLists. If it operates this quickly, it *should* be acceptable to get a GUI in place that can execute with these ideas. The real kicker is to run the Ruby script that then kicks off the Java code...


Finding files
found
17373
done writing file, now kicking off java process
17373
took 143 milliseconds
Looked for 'a' in all files: 17260 143milliseconds

real 0m0.895s
user 0m0.630s
sys 0m0.211s
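
One thing to note about the numbers JReader prints: both are measured from the same start time, so each covers the read and the search together. A rough sketch of how the two steps could be timed separately is below. The class and method names (FileListSearch, load, search) are made up for illustration and are not part of JReader, and it assumes Java 5 generics and a list file with one path per line.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names here are illustrative, not from JReader.
class FileListSearch {

    // Read one path per line from the list file written by the Ruby script.
    static List<String> load(String fileName) throws IOException {
        List<String> paths = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new FileReader(fileName));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                paths.add(line);
            }
        } finally {
            reader.close();
        }
        return paths;
    }

    // Return every path that contains the query anywhere, including at index 0.
    static List<String> search(List<String> paths, String query) {
        List<String> hits = new ArrayList<String>();
        for (String path : paths) {
            if (path.indexOf(query) >= 0) {
                hits.add(path);
            }
        }
        return hits;
    }

    // Usage: java FileListSearch <list file> <query>
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        List<String> paths = load(args[0]);
        long loaded = System.nanoTime();
        List<String> hits = search(paths, args[1]);
        long searched = System.nanoTime();
        System.out.println(paths.size() + " paths read in " + (loaded - start) / 1000000 + " ms");
        System.out.println(hits.size() + " matches found in " + (searched - loaded) / 1000000 + " ms");
    }
}
```

Keeping the list in memory and timing only search() is probably closer to what a GUI would feel like, since the file only has to be read once.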

Acceptable? Just have to code the GUI to find out.
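
For what it's worth, here is a minimal sketch of what that GUI could look like: a text field on top and a list underneath that is re-filtered on every keystroke. Everything in it is an assumption for illustration, not the actual design: the FinderFrame class, the refresh method and the hard-coded sample paths are all hypothetical, and it uses the generic DefaultListModel and JList, which need Java 7 or later.

```java
import java.awt.BorderLayout;
import java.util.Arrays;
import java.util.List;
import javax.swing.DefaultListModel;
import javax.swing.JFrame;
import javax.swing.JList;
import javax.swing.JScrollPane;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;

// Hypothetical class; none of these names come from the original code.
class FinderFrame {

    // Replace the model's contents with the paths that contain the query.
    // An empty query matches everything, because indexOf("") returns 0.
    static void refresh(DefaultListModel<String> model, List<String> paths, String query) {
        model.clear();
        for (String path : paths) {
            if (path.indexOf(query) >= 0) {
                model.addElement(path);
            }
        }
    }

    // Build and show the window; must be called on the event dispatch thread.
    static void show(final List<String> paths) {
        final DefaultListModel<String> model = new DefaultListModel<String>();
        final JTextField input = new JTextField();
        refresh(model, paths, "");

        // Re-run the search on every edit to the text field.
        input.getDocument().addDocumentListener(new DocumentListener() {
            public void insertUpdate(DocumentEvent e) { refresh(model, paths, input.getText()); }
            public void removeUpdate(DocumentEvent e) { refresh(model, paths, input.getText()); }
            public void changedUpdate(DocumentEvent e) { refresh(model, paths, input.getText()); }
        });

        JFrame frame = new JFrame("Finder");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.add(input, BorderLayout.NORTH);
        frame.add(new JScrollPane(new JList<String>(model)), BorderLayout.CENTER);
        frame.setSize(600, 400);
        frame.setVisible(true);
    }

    public static void main(String[] args) {
        // Stand-in data; a real run would load the list file instead.
        final List<String> paths = Arrays.asList("/tmp/alpha.txt", "/tmp/beta.rb", "/tmp/notes");
        SwingUtilities.invokeLater(new Runnable() {
            public void run() { show(paths); }
        });
    }
}
```

Whether refilling the list model on every keystroke stays snappy with 17,000+ entries is exactly the open question, so treat this as a starting point rather than an answer.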
