Posted Oct 21, 2008 at 3:15:32 AM
Subject: Slow Search
Hi,
I have a server with 7 TB of storage on ext3. I need a way to speed up searches on the filesystem. A simple find takes ages, and I cannot use locate because I could not find any option in it to find files by size.
Is there an application that can save a filesystem index into a database, so that I can query it by file attributes?
Or please suggest the best way to find files by size on such large storage.
Thanks in advance.
PerlCoder
Joined Jun 30, 2008 Posts: 148
Posted: Oct 25, 2008 7:38:04 PM
I've never used either of these, but maybe they have what you are looking for?
http://beagle-project.org/Main_Page
http://www.lesbonscomptes.com/recoll/
PerlCoder (http://indicium.us)
Egyptian
Joined Jun 21, 2007 Posts: 5
Posted: Oct 26, 2008 9:42:02 AM
You may need to do some research on the best filesystem to use. Other filesystems may be better suited to your usage, e.g. JFS, XFS, ReiserFS.
I say this because not all filesystems are created equal. Some are designed with the objective of handling large files (greater than 1 GB), etc.
Hope that helps.
Slightcrazed
Joined Apr 30, 2008 Posts: 8
Posted: Oct 27, 2008 1:44:47 PM
[quote=Egyptian]You may need to do some research on the best filesystem to use. Other filesystems may be better suited to your usage, e.g. JFS, XFS, ReiserFS.
I say this because not all filesystems are created equal. Some are designed with the objective of handling large files (greater than 1 GB), etc.
Hope that helps.[/quote]
I doubt that he is in a position to change the filesystem in use, and even if he is, that isn't going to speed up searching in any perceivable way.
I wrote (but sadly, can't FIND) a script a while back that ran find in parallel for the same reason. Instead of searching the entire tree with a single process, it dove into predefined directories and ran a separate find process for each directory, all of which wrote their results to stdout.
The problem with this method is that A - you need a multiprocessor or multi-core system to run the individual processes efficiently, and B - you'd better have a storage back-end with some seriously high I/O limits. If (and that is a big IF) your system is processor-bound when doing a find, then this method will probably net you some speed. If not, then this probably won't do you much good.
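The original script is lost, so the following is only a rough sketch of the same idea, not a reconstruction of it. It swaps the separate find processes for Python worker threads, one per top-level directory, each doing its own walk. The names `walk_one` and `parallel_find`, the `min_size` threshold, and the default of 4 workers are all assumptions made for illustration.

```python
import os
import stat
from concurrent.futures import ThreadPoolExecutor


def walk_one(top, min_size):
    """Walk one subtree; return paths of regular files of at least min_size bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size >= min_size:
                hits.append(path)
    return hits


def parallel_find(root, min_size, workers=4):
    """Search each top-level directory under root in its own worker thread."""
    subdirs, hits = [], []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                # Files sitting directly in root are checked here, not by a worker.
                if entry.stat(follow_symlinks=False).st_size >= min_size:
                    hits.append(entry.path)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(lambda d: walk_one(d, min_size), subdirs):
            hits.extend(found)
    return hits


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python parallel_find.py /srv/storage 1073741824
    for hit in parallel_find(sys.argv[1], int(sys.argv[2])):
        print(hit)
```

Splitting only at the top level is the simplest choice, but it balances badly when one directory holds most of the files, and the caveat about storage I/O limits applies here just as it does to multiple find processes.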
Otherwise, the suggestion of a search indexer like Beagle is valid, though it is not nearly as flexible as find.
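For the size-based queries the original poster asked about, a metadata-only index may be enough. The sketch below is not how Beagle or Recoll work; it is a minimal illustration that walks the tree once, stores each file's path, size and modification time in SQLite, and then answers size queries from the database. The table name `files` and the functions `iter_files`, `build_index` and `files_larger_than` are hypothetical.

```python
import os
import sqlite3
import stat


def iter_files(root):
    """Yield (path, size, mtime) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                yield path, st.st_size, st.st_mtime


def build_index(root, db_path):
    """Rebuild the index from scratch; this is the slow step, run it off-hours."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DROP TABLE IF EXISTS files")
            conn.execute("CREATE TABLE files (path TEXT, size INTEGER, mtime REAL)")
            conn.executemany("INSERT INTO files VALUES (?, ?, ?)", iter_files(root))
            # An index on size lets range queries avoid a full table scan.
            conn.execute("CREATE INDEX files_by_size ON files (size)")
    finally:
        conn.close()


def files_larger_than(db_path, min_size):
    """Return (path, size) rows for files of at least min_size bytes, largest first."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "SELECT path, size FROM files WHERE size >= ? ORDER BY size DESC",
            (min_size,),
        )
        return cur.fetchall()
    finally:
        conn.close()
```

Like locate, this trades freshness for speed: results are only as current as the last `build_index` run. Paths that are not valid UTF-8 may need extra handling before they can be stored as text.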
while beer in fridge:
    drink(beer)
else:
    get(beer)