Hi,
I have two BED files, as follows:
file1.bed
scaffold11296 36365 36414 7
scaffold11296 36471 36526 3
The first column is the scaffold ID, the second column is the start, the third column is the end, and the fourth column is the total number of reads within that range.
Sorted_genome.bed
scaffold11296 36302 36334 -
scaffold11296 36303 36334 +
scaffold11296 36339 36370 +
scaffold11296 36340 36369 +
scaffold11296 36365 36395 -
scaffold11296 36366 36394 -
scaffold11296 36367 36395 -
scaffold11296 36368 36395 -
scaffold11296 36394 36414 -
scaffold11296 36471 36502 +
scaffold11296 36483 36516 +
scaffold11296 36495 36526 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
Now I would like to find out how many of the 7 reads within scaffold11296 36365 36414
are "+" reads and how many are "-" reads. Similarly, for the 3 reads within scaffold11296 36471 36526,
how many are "+" reads and how many are "-" reads? I tried "bedtools annotate", but it didn't yield what I want. All I want is output like the following:
scaffold11296 36365 36414 7 5 2
scaffold11296 36471 36526 3 0 3
where "5" is the number of "-" reads and "2" is the number of "+" reads within scaffold11296 36365 36414.
Similarly, "0" is the number of "-" reads and "3" is the number of "+" reads within scaffold11296 36471 36526.
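To make the counting rule concrete, here is a minimal Python sketch of what I am after. It is not a bedtools solution, only an illustration. It assumes that a read counts toward a range when it overlaps the range by at least one base (half-open BED coordinates), since that is the only rule that gives 7 and 3 for the sample data. The function names `parse_bed` and `count_strands` are made up for this example, and the two files are pasted in as strings.

```python
from collections import defaultdict

FILE1 = """\
scaffold11296 36365 36414 7
scaffold11296 36471 36526 3
"""

SORTED_GENOME = """\
scaffold11296 36302 36334 -
scaffold11296 36303 36334 +
scaffold11296 36339 36370 +
scaffold11296 36340 36369 +
scaffold11296 36365 36395 -
scaffold11296 36366 36394 -
scaffold11296 36367 36395 -
scaffold11296 36368 36395 -
scaffold11296 36394 36414 -
scaffold11296 36471 36502 +
scaffold11296 36483 36516 +
scaffold11296 36495 36526 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
"""


def parse_bed(text):
    """Parse whitespace-delimited 4-column lines into (scaffold, start, end, value)."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        rows.append((fields[0], int(fields[1]), int(fields[2]), fields[3]))
    return rows


def count_strands(intervals, reads):
    """For each interval, count overlapping "-" and "+" reads on the same scaffold."""
    # Group reads by scaffold so each interval only scans its own scaffold.
    by_scaffold = defaultdict(list)
    for scaffold, start, end, strand in reads:
        by_scaffold[scaffold].append((start, end, strand))

    results = []
    for scaffold, start, end, total in intervals:
        minus = plus = 0
        for r_start, r_end, strand in by_scaffold.get(scaffold, []):
            # Assumption: half-open coordinates, any overlap of >= 1 base counts.
            if r_start < end and r_end > start:
                if strand == "-":
                    minus += 1
                elif strand == "+":
                    plus += 1
        results.append((scaffold, start, end, total, minus, plus))
    return results


result = count_strands(parse_bed(FILE1), parse_bed(SORTED_GENOME))
for row in result:
    print(" ".join(str(field) for field in row))
```

This is a plain nested scan, which is fine for a small example but would be slow on whole-genome files; an interval-aware tool would be the better fit there.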
Kindly guide me.