Recent Posts

61
bedops / Re: problem with bedops -n, not an element of
« Last post by sjn on May 22, 2015, 04:24:52 PM »
I tried some quick things here and didn't see a problem with -n/-e 2.

Technically, there needs to be a newline at the end of the file.  Could you add --ec to the call and see if bedops -n 2 tells you what the problem is?

<edit>
Looks like Alex just gave the same advice about the ending newline.  --ec can help to find what's wrong.
62
bedops / Re: problem with bedops -n, not an element of
« Last post by AlexReynolds on May 22, 2015, 04:23:47 PM »
Check that the last line of your sorted BED file (mask.txt) ends with a newline character.

You can use the following command to verify this:

$ tail -1 foo.bed | cat -e

You should see a dollar sign symbol at the end of the line. For example:

$ tail -1 test/vec_test4.bed | cat -e
chr1   3568000   3568150   id-4$


If you don't see that dollar sign, that element does not have the required trailing newline character. See the following thread for suggestions on how to add a newline to the end of a file:

http://unix.stackexchange.com/questions/31947/how-to-add-a-newline-to-the-end-of-a-file
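
A minimal sketch of one way to do this, in case it helps. The function name ensure_trailing_newline and the demo file are made up for illustration; only standard tools (tail, printf) are used, and the newline is appended only when the last byte is not already one:

```shell
# Hypothetical helper: append a newline only if the file's last byte is not one.
ensure_trailing_newline() {
  # $(tail -c 1 file) expands to an empty string when the last byte is a
  # newline, because command substitution strips trailing newlines.
  if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
    printf '\n' >> "$1"
  fi
}

# Demo on a throwaway file whose last line lacks the trailing newline.
demo=$(mktemp "${TMPDIR:-/tmp}/demo.XXXXXX")
printf 'chr1\t10\t100\nchrY\t59034017\t59034077' > "$demo"
ensure_trailing_newline "$demo"

# The last line should now end with a dollar sign.
tail -1 "$demo" | cat -e
```

Running the helper a second time on the same file should leave it unchanged.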
63
bedops / problem with bedops -n, not an element of
« Last post by cohendm on May 22, 2015, 05:55:57 AM »
Running Bedops 2.4.14 on Mac OS 10.8.5. Having problems with the following command:

bedops -n 2 RNAPII.USHP.all mask.txt > RNAPII.USHP.masked

The script runs without error. However, for some reason, the very last interval of my mask.txt file is being ignored by this function. The interval is chrY:59034017-59034077, and is preceded by 1,000,000 other intervals, correctly sorted, and including other intervals on the Y chromosome. The -n function removes everything as expected EXCEPT this very last interval. Puzzled as to why this would be the case. If instead I try the following

bedops --element-of 2 RNAPII.USHP.all mask.txt > intersectwithmask.bed

then I see everything as expected, including the intersecting intervals for the chrY:59034017-59034077 region in the RNAPII.USHP.all bed file. So this is not an issue with data already being absent from the reference file, or a data corruption issue in the mask.txt file.
64
bedops / Re: Bedops Intersect not consistent with Bedtools
« Last post by sjn on May 21, 2015, 06:58:27 AM »
Yes - there is a distinction between a genomic intersection and a genomic element-of (subset) in bedops.

bedops --intersect is a literal coordinate intersection, while bedops --element-of is a subset feature.  With the latter, you will receive the entire element if it meets your overlap criterion and all the extra columns are there.

Consider 2 files:
cat a.bed
chr1  10  100  id-1  2.34  SNV312

cat b.bed
chr1  5   77 file2  4.56  +

If I intersect those with bedops -i a.bed b.bed, I receive:
chr1 10 77

It isn't clear what extra columns should go in this intersected region.  Should they come from a.bed or b.bed?  What if a.bed has multiple elements in the intersection, or what if you use 100 input files instead of just these 2?  Philosophically and generally, the intersected results make up a new BED file with regions that are different from elements found in either a.bed or b.bed, as shown above.
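
To make the coordinate arithmetic concrete, here is a rough illustration with plain awk. This is not how bedops works internally and it only handles one pair of elements; it just shows that the overlap runs from the larger start to the smaller end, with nothing else carried along. The variable names are made up for the example:

```shell
# Illustration only: overlap of one a.bed element with one b.bed element.
a=$(printf 'chr1\t10\t100\tid-1\t2.34\tSNV312')
b=$(printf 'chr1\t5\t77\tfile2\t4.56\t+')

# Put both elements on one line: a's columns are $1-$6, b's are $7-$12.
overlap=$(printf '%s\t%s\n' "$a" "$b" | awk 'BEGIN { OFS = "\t" }
  $1 == $7 {                        # same chromosome
    s = ($2 > $8) ? $2 : $8         # larger of the two starts
    e = ($3 < $9) ? $3 : $9         # smaller of the two ends
    if (s < e) print $1, s, e       # only a non-empty overlap is reported
  }')

printf '%s\n' "$overlap"
```

The ID, score, and remaining columns of both inputs have no natural place in that three-column result.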

If you instead use bedops -e 1 a.bed b.bed, you get:
chr1  10  100  id-1  2.34  SNV312

because the element in a.bed overlaps something in b.bed by 1 bp or more (you can change the overlap criterion from 1 to a different value or use a percentage, like 50%).

If you are looking for the literal coordinate intersection (--intersect/-i) and you want extra columns from 1 (or more) of your input files, then you couple bedops with bedmap.  You first get the literal coordinate intersection with bedops, and then you map on information from the file(s) of interest.  For example,
bedops -i a.bed b.bed | bedmap --echo --echo-map-id --delim "\t" - a.bed

will pull out the information in the 4th column of a.bed and map it onto the intersected regions.  Other columns can be mapped too with --echo-map-score and --echo-map, which maps information from all columns.  --echo-map is the most general, and you sometimes need a small awk or cut statement at the end to pull out particular columns of interest.  We can help with that if needed.

Hope that helps.
65
bedops / Re: Bedops Intersect not consistent with Bedtools
« Last post by alisterd17 on May 21, 2015, 06:36:33 AM »
Thanks, that appears to have solved most of the issues.

There are still a few entries that appear in the bedtools output but not in the bedops output. It has to do with having extra columns in the BED files. One of my input files for the intersect call has 3 additional columns giving information about the region, such as accession number and strand orientation.

When I call bedtools with the file that has the added columns as my -a, I get this additional information in my output BED file. However, when I call bedops, regardless of file order, I don't have these columns in my output file and only get the first 3 basic columns. My downstream analysis requires this information for certain tasks. Is there any way to keep this information in my output using bedops?
66
bedops / Re: Bedops Intersect not consistent with Bedtools
« Last post by sjn on May 20, 2015, 02:05:26 PM »
The most likely culprit is that you need to sort the files for use with bedops.
sort-bed your-file.bed > your-file.sorted.bed

Then, take the intersection.  The output from bedops will also then be sorted and you can use results immediately for downstream processing.
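
If you want a quick check of whether a file is already in order before running bedops, something like the following may help. It is only a sketch: is_bed_sorted is a made-up helper name, it assumes tab-delimited input, and it assumes the sort-bed ordering matches sort -k1,1 -k2,2n -k3,3n in the C locale. When in doubt, just run sort-bed.

```shell
# Hypothetical helper: succeed if the first three columns are already ordered
# by chromosome (byte order), then start, then end (both numeric).
is_bed_sorted() {
  cut -f1-3 "$1" \
    | LC_ALL=C sort -c -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n 2>/dev/null
}

# Demo files (made up) in a throwaway directory.
tmp=$(mktemp -d "${TMPDIR:-/tmp}/sortcheck.XXXXXX")
printf 'chr2\t5\t9\nchr1\t10\t20\n' > "$tmp/unsorted.bed"
printf 'chr1\t10\t20\nchr2\t5\t9\n' > "$tmp/sorted.bed"

for f in "$tmp/unsorted.bed" "$tmp/sorted.bed"; do
  if is_bed_sorted "$f"; then
    echo "sorted: ${f##*/}"
  else
    echo "NOT sorted: ${f##*/}"
  fi
done
```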
67
bedops / Bedops Intersect not consistent with Bedtools
« Last post by alisterd17 on May 20, 2015, 01:31:39 PM »
Hi everyone,

I was looking to convert current calls to bedtools intersect in my code to bedops intersect or similar. When I used small BED files, I was able to get identical output between the two functions, knowing where the overlaps were and what the outputs should be beforehand. However, when I tried larger files (~200,000+ lines), I received different output; specifically, bedops didn't detect many regions of overlap between the two BED files. Only 6 chromosomes were shown to have regions of overlap, whilst in actuality all of them did, as the bedtools output showed.

I made my calls to both programs as follows:

./bedtools intersect -a exome.bait.bed -b cardio.bed
./bedops --intersect cardio.bed exome.bait.bed

I am using bedtools v 2.2.2, and bedops v 2.4.5. I'm not at liberty to attach the files themselves, but if anyone else has had this problem before, or has any ideas or suggestions for how to fix it, please let me know.

Thanks,
Alister
68
bedmap / Re: fragment length extension
« Last post by cohendm on May 12, 2015, 12:14:57 PM »
Thanks for the fast and detailed reply, Shane. This is a huge help! Will give it a go per your examples.
69
bedmap / Re: fragment length extension
« Last post by sjn on May 12, 2015, 11:32:12 AM »
Hi.  Thanks for joining the forum.

To use BEDOPS in a strand-sensitive way, you'll have to break your analysis up a bit.  If your strand information is in the 6th column, for example:

awk '$6 == "+"' myfile.bed \
  | bedmap --echo --count .... \
  > result.forward-strand.bed

Do the same on the opposing strand, and, if you'd like, you can glue things back together using bedops -u into a final result.

By default, bedmap will map anything in your second file that overlaps the first file's region by 1 bp or more (not necessarily the 5' end).  This seems to do what you want by default, but you can add the --bp-ovr 1 flag explicitly if you like.

When extending a region, you'll want to use bedops --range since you want a non-symmetrical padding.  The --range option with bedmap is a bit different and it's a symmetrical thing.

I'll pretend that you have two files, both of which have strand information in the 6th column.  And, I'll pretend that you want to add 10 bp of upstream padding.
a.bed
b.bed

awk '$6 == "+"' a.bed | bedops -u --range -10:0 > a.plus.bed
awk '$6 == "-"' a.bed | bedops -u --range 0:10 > a.minus.bed

You would need to parse out per-strand information from b.bed if you want strand-sensitive results.  So, your original 2 files become 4.
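
The commands that follow read b.plus.bed and b.minus.bed. A sketch of the split, using the same awk pattern as for a.bed and again assuming strand is in the 6th column. The demo rows and the throwaway directory are made up for illustration; in practice you would run the two awk lines on your own sorted b.bed:

```shell
# Demo input (hypothetical rows) in a throwaway directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/strand.XXXXXX")
printf 'chr1\t5\t77\tb-1\t4.56\t+\n'    >  "$workdir/b.bed"
printf 'chr1\t90\t120\tb-2\t1.00\t-\n'  >> "$workdir/b.bed"
printf 'chr1\t200\t260\tb-3\t7.10\t+\n' >> "$workdir/b.bed"

# Split by the strand column; row order is preserved, so sorted input
# yields sorted per-strand files.
awk '$6 == "+"' "$workdir/b.bed" > "$workdir/b.plus.bed"
awk '$6 == "-"' "$workdir/b.bed" > "$workdir/b.minus.bed"

wc -l "$workdir/b.plus.bed" "$workdir/b.minus.bed"
```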

bedmap --echo --count a.plus.bed b.plus.bed > plus.final.bed
bedmap --echo --count a.minus.bed b.minus.bed > minus.final.bed

If you want these results in one file at the end:
bedops -u plus.final.bed minus.final.bed > final.answer.bed

Hope that helps.  Most important to everything in BEDOPS is that your files are sorted properly using the sort-bed program - make sure to do that for a.bed and b.bed before anything else.  The other outputs you generate will be in sorted order, so you don't have to do that again.
Shane
70
bedmap / fragment length extension
« Last post by cohendm on May 12, 2015, 11:17:44 AM »
New to the BEDOPS toolset, and I have a very basic question. The documentation for bedmap indicates that it will count 5' ends of tags that map to regions of interest, and that you can extend the boundaries of that region, if desired, to smooth the data. I would like to perform an analysis that counts tags that intersect another BED file, based not solely on the 5' end of the tag but on the fragment length (i.e. count the tag if any portion of it intersects). Is there an easy way to do this? I want to preserve strand information in this fragment length analysis (that is, I want to extend the 5' end only in one direction by an arbitrary length). Thanks!