Applications of BEDOPS > sort-bed

Problem with multiple comments in a bed file

pachkov:
Hi again!

I have a rather unusual BED file which confuses sort-bed.
This is what it looks like:

#Deleted in new
10      3100141 3100242 HISEQ:130:C2EWUACXX:8:2202:9761:39782   1       -
#Deleted in new
10      3100151 3100252 HISEQ:130:C2EWUACXX:8:2106:1494:66677   1       +
#Deleted in new


sort-bed says the following:

Non-numeric start coordinate.  See line 3 in tmp.bed.
(remember that chromosome names should not contain spaces.)

I think this is the wrong behaviour: all lines starting with "#" should be skipped.
Is that right?

Best,

Mikhail

AlexReynolds:
Thanks for the report, which we will investigate.

In the meantime, you can also do this to strip comment lines and sort your file:

$ grep -v '^#' unsorted.bed | sort-bed - > sorted.bed

sjn:
sort-bed only removes header lines that start with a '#' (or other supported header lines; see bedmap --help or the docs for that list).  In your file the '#' lines are not all at the top, so the ones that follow the first data line are not stripped.  On line 3, sort-bed tries to read 'in' as a numeric start coordinate and dies a miserable death.

Alex's suggestion will work well for your input type.
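
If you need this often, here is a minimal sketch of that workaround wrapped in a shell function. The name strip_bed_comments is made up for illustration and is not part of BEDOPS. It assumes a comment is any line whose first character is '#', and the file names in the usage note are placeholders.

```shell
# Hypothetical helper (not part of BEDOPS): drop every line that starts
# with '#', wherever it appears in the input, and pass all other lines
# through unchanged. Reads the named files, or stdin if none are given.
strip_bed_comments() {
    grep -v '^#' "$@"
}

# Usage, assuming sort-bed is on your PATH and unsorted.bed exists:
#   strip_bed_comments unsorted.bed | sort-bed - > sorted.bed
```

Note that grep exits with status 1 when the input contains nothing but comment lines, which matters only if your script runs under 'set -e'.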

Shane

pachkov:
Thank you both!

I am doing exactly what Alex suggested, but it would be nice if sort-bed ignored all comment lines itself.

Best,

Mikhail
