pybedtools documentation: reading and writing

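1. Reading

pybedtools reads every supported interval format through the BedTool class. It can read BED, GFF, and GTF files, and it also accepts gzip-compressed (.gz) versions of them.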

from pybedtools import BedTool

snps = BedTool('snps.bed.gz')

genes = BedTool('hg19.gff')
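To make the idea of reading plain and gzip-compressed files through one entry point more concrete, here is a minimal standard-library sketch. It does not use pybedtools and does not claim to show how BedTool does it internally; `read_bed_lines` is a hypothetical helper, and choosing the opener from the `.gz` extension is an assumption made for the illustration.

```python
import gzip
import os
import tempfile


def read_bed_lines(path):
    """Return the non-empty lines of a BED-like file, plain or gzip-compressed.

    Hypothetical helper for illustration only; not part of pybedtools.
    """
    # Pick the opener from the file extension; "rt" is text mode for both.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


# Write the same two features to a plain file and to a gzip file.
content = "chr1\t1\t100\nchr1\t25\t800\n"
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "snps.bed")
gz_path = os.path.join(tmpdir, "snps.bed.gz")
with open(plain_path, "w") as out:
    out.write(content)
with gzip.open(gz_path, "wt") as out:
    out.write(content)

plain_lines = read_bed_lines(plain_path)
gz_lines = read_bed_lines(gz_path)
print(plain_lines == gz_lines)  # True: both files hold the same features
```

With pybedtools itself none of this is needed: you pass the filename to BedTool as in the snippet above.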



pybedtools.bedtool.BedTool

class pybedtools.bedtool.BedTool(fn=None, from_string=False, remote=False)

__init__(fn=None, from_string=False, remote=False)

Wrapper around Aaron Quinlan's BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.

fn is typically the name of a BED-like file, but can also be one of the following:

a string filename

another BedTool object

an iterable of Interval objects

an open file object

a "file contents" string (see below)

If from_string is True, then you can pass a string that contains the contents of the BedTool you want to create. This will treat all spaces as TABs and write to tempfile, treating whatever you pass as fn as the contents of the bed file. This also strips empty lines.

Typical usage is to point to an existing file:

a=BedTool('a.bed')

But you can also create one from scratch from a string:

>>> s = '''
... chrX  1  100
... chrX 25  800
... '''
>>> a = BedTool(s, from_string=True)
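The from_string behaviour described above (spaces become TABs, empty lines are dropped, the result goes to a temporary file) can be sketched with the standard library alone. This is not pybedtools' own code: `bed_string_to_tempfile` is a hypothetical name, and collapsing each run of spaces into a single TAB is an assumption about what "treat all spaces as TABs" means.

```python
import os
import tempfile


def bed_string_to_tempfile(contents):
    """Write a whitespace-separated string to a temp file as tab-separated lines.

    Hypothetical helper mirroring the documented from_string behaviour;
    it is not pybedtools' own implementation.
    """
    rows = []
    for line in contents.splitlines():
        fields = line.split()  # any run of whitespace separates fields
        if fields:  # skip empty lines
            rows.append("\t".join(fields))
    handle = tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False)
    with handle:
        handle.write("".join(row + "\n" for row in rows))
    return handle.name


s = """
chrX  1  100
chrX 25  800
"""
path = bed_string_to_tempfile(s)
with open(path) as fh:
    text = fh.read()
os.unlink(path)  # clean up the temporary file
print(repr(text))  # 'chrX\t1\t100\nchrX\t25\t800\n'
```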

Or use examples that come with pybedtools:

>>> example_files = pybedtools.list_example_files()
>>> assert 'a.bed' in example_files
>>> a = pybedtools.example_bedtool('a.bed')


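2. Writing

To save a result under a meaningful filename for later use, call the BedTool.saveas() method, which copies the BedTool's underlying file to the new path. The method also lets you optionally supply a track line, so the file can be uploaded directly to the UCSC Genome Browser instead of being opened afterwards to add the track line by hand.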

>>> c=a_with_b.saveas('intersection-of-a-and-b.bed',trackline='track name="a and b"')

>>> print(c.fn)

intersection-of-a-and-b.bed

>>> # opening the underlying file shows the track line

>>> print(open(c.fn).read())

track name="a and b"
chr1        155    200    feature2        0      +
chr1        155    200    feature3        0      -
chr1        900    901    feature4        0      +

>>> # printing file-based BedTool objects will not print the track line

>>> print(c)

chr1        155    200    feature2        0      +
chr1        155    200    feature3        0      -
chr1        900    901    feature4        0      +

Note that BedTool.saveas() returns a new BedTool object that points to the newly created file on disk. This lets you insert a saveas() call in the middle of a chain of commands.



BedTool.saveas(*args, **kwargs)

Make a copy of the BedTool.

Optionally adds trackline to the beginning of the file.

Optionally compresses output using gzip.

If the filename extension is .gz, or compressed=True, the output is compressed using gzip.

Returns a new BedTool for the newly saved file.

A newline is automatically added to the trackline if it does not already have one.

Example usage:

>>> a = pybedtools.example_bedtool('a.bed')
>>> b = a.saveas('other.bed')
>>> b.fn
'other.bed'
>>> print(b == a)
True

>>> b = a.saveas('other.bed', trackline="name='test run' color=0,55,0")
>>> open(b.fn).readline()
"name='test run' color=0,55,0\n"
>>> if os.path.exists('other.bed'):
...     os.unlink('other.bed')
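The three documented saveas() behaviours (copy the file, optionally prepend a track line with a trailing newline, gzip the output when the name ends in .gz or compressed=True) can be sketched without pybedtools as follows. `copy_with_trackline` is a hypothetical stand-in written only for this illustration; it is not the library's implementation, and its parameter names simply echo the ones mentioned above.

```python
import gzip
import os
import shutil
import tempfile


def copy_with_trackline(src, dst, trackline=None, compressed=False):
    """Copy src to dst, optionally prepending a track line and gzipping.

    Hypothetical stand-in for the documented saveas() behaviour.
    """
    use_gzip = compressed or dst.endswith(".gz")
    opener = gzip.open if use_gzip else open
    with open(src, "rt") as fin, opener(dst, "wt") as fout:
        if trackline is not None:
            # Add the trailing newline only if the caller left it off.
            if not trackline.endswith("\n"):
                trackline += "\n"
            fout.write(trackline)
        shutil.copyfileobj(fin, fout)
    return dst


tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "a.bed")
with open(src, "w") as fh:
    fh.write("chr1\t155\t200\tfeature2\t0\t+\n")

plain = copy_with_trackline(src, os.path.join(tmpdir, "out.bed"),
                            trackline='track name="a and b"')
zipped = copy_with_trackline(src, os.path.join(tmpdir, "out.bed.gz"))

with open(plain) as fh:
    first_line = fh.readline()
with gzip.open(zipped, "rt") as fh:
    zipped_text = fh.read()

print(repr(first_line))   # 'track name="a and b"\n'
print(os.path.exists(src))  # True: a copy leaves the source in place
```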



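Alternatively, if you do not need to add a track line, you can use BedTool.moveto(), which is faster and therefore better suited to large files. It renames the file rather than copying it, so any later attempt to use the original file will fail because that file no longer exists.

Use it with caution.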

>>> d=a_with_b.moveto('another_location.bed')



BedTool.moveto(*args, **kwargs)

Move to a new filename (can be much quicker than BedTool.saveas())

Move BED file to new filename, fn.

Returns a new BedTool for the new file.

Example usage:

>>> # make a copy so we don't mess up the example file
>>> a = pybedtools.example_bedtool('a.bed').saveas()
>>> a_contents = str(a)
>>> b = a.moveto('other.bed')
>>> b.fn
'other.bed'
>>> b == a_contents
True
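The practical difference between copying and moving is easy to check with the standard library. The sketch below uses `shutil.copy` and `shutil.move` on a throwaway file as an analogy for saveas() and moveto(); that these are the calls pybedtools uses internally is not claimed here.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
original = os.path.join(tmpdir, "a.bed")
with open(original, "w") as fh:
    fh.write("chr1\t1\t100\n")

# Copying leaves the source file in place.
copied = shutil.copy(original, os.path.join(tmpdir, "copy.bed"))
source_exists_after_copy = os.path.exists(original)

# Moving renames the file, so the old path disappears.
moved = shutil.move(original, os.path.join(tmpdir, "moved.bed"))
source_exists_after_move = os.path.exists(original)

print(source_exists_after_copy, source_exists_after_move)  # True False
```

This is why a BedTool that still points at the old filename stops working after a move.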

