Hi,
The most effective way to use BEDOPS is to process data by chromosome. Each utility takes a chromosome name as an option. For example:
bedops -e --chrom chr18 file1.bed file2.bed file3.bed file4.bed
This restricts the work to chr18. Looping over every chromosome and running each one in parallel is therefore often very effective, either cluster-wide or as background processes on a multiprocessor machine. You can then quickly glue the results back together with a simple call to cat (or starchcat if the outputs are in Starch format).
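As a rough sketch of the background-process variant on a single machine, something like the following bash function could work. The function name per_chrom_element_of and the CHROMS variable are made up for illustration; the only BEDOPS call is the bedops -e --chrom form shown above. It assumes CHROMS lists the chromosome names in the same order your inputs are sorted in, so that the concatenated output stays sorted, and it does not check the exit status of the individual jobs.

```shell
# Hypothetical helper: run "bedops -e --chrom <name>" once per chromosome
# as background jobs, then concatenate the per-chromosome results.
# Usage: per_chrom_element_of <output-file> <bed-file> [<bed-file> ...]
# Assumes CHROMS holds a space-separated list of chromosome names,
# already in the sort order of the input files.
per_chrom_element_of() {
    local out="$1"; shift
    local tmpdir chrom
    tmpdir="$(mktemp -d)" || return 1

    # Launch one background bedops process per chromosome.
    for chrom in $CHROMS; do
        bedops -e --chrom "$chrom" "$@" > "$tmpdir/$chrom.bed" &
    done

    # Block until every background job has finished.
    wait

    # Glue the pieces back together in the order given by CHROMS.
    for chrom in $CHROMS; do
        cat "$tmpdir/$chrom.bed"
    done > "$out"

    rm -rf "$tmpdir"
}
```

With CHROMS="chr1 chr18 chr2" set, calling per_chrom_element_of answer.bed file1.bed file2.bed file3.bed file4.bed would write the combined result to answer.bed. Running bedextract --list-chr on a sorted input is one way to build the chromosome list. On a cluster, the loop body would become a job submission instead of a background process.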
True multi-threading (as opposed to multi-processing) is not very effective for most utilities because their outputs must stay in sorted order. Even if items can be calculated in parallel, everything must be synchronized at the output. We are experimenting with where multithreading can be used effectively, however.
If you have a specific example of what you are trying to achieve, we often have ways to show how to speed things up. For example, bedextract can be much faster than bedops -e when its data preconditions are met. Also, bedmap --faster can be a huge help in cases where it applies, as can piping commands efficiently, sometimes using named pipes. We do take great care in keeping our tools general and fast, so we are definitely interested in cases where the tools are not performing at speeds that you find acceptable. Please send details.
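To illustrate the named-pipe idea, here is a minimal bash sketch. The function name filter_then_element_of and the awk filter (keep rows whose fifth column is greater than zero) are invented for the example; they stand in for whatever upstream step produces your sorted BED stream. It assumes bedops reads the pipe sequentially like a regular file, so --chrom is deliberately not used here.

```shell
# Hypothetical helper: filter a sorted BED file on the fly and hand the
# filtered stream to bedops through a named pipe, with no intermediate file.
# Usage: filter_then_element_of <reference.bed> <raw.bed>
filter_then_element_of() {
    local ref="$1" raw="$2" fifo_dir fifo
    fifo_dir="$(mktemp -d)" || return 1
    fifo="$fifo_dir/filtered.bed"
    mkfifo "$fifo" || return 1

    # Writer (background): placeholder filter step feeding the pipe.
    awk '$5 > 0' "$raw" > "$fifo" &

    # Reader: bedops consumes the pipe as its second input.
    bedops -e "$ref" "$fifo"

    # Make sure the writer has finished before cleaning up.
    wait
    rm -rf "$fifo_dir"
}
```

The writer and the reader run at the same time, so the filtered data never has to be written to disk in full. Whether this actually helps depends on where the time is going in your pipeline.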
We are also looking to have some pretty big speedups in our next release of BEDOPS for several utilities.