Biopython Sequence Alignment


Download Pfam-A.seed.gz, a file containing many multiple sequence alignments, from the InterPro website (https://www.ebi.ac.uk/interpro/download/Pfam/):

wget https://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.seed.gz

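Decompress the archive and extract the first multiple sequence alignment. In a Stockholm file, every alignment ends with a line containing only "//":

gunzip Pfam-A.seed.gz
while IFS= read -r line; do printf '%s\n' "${line}"; if [[ ${line} == "//" ]]; then break; fi; done < Pfam-A.seed > Pfam-A-1.seed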
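
The same extraction can be sketched in Python, which avoids decompressing the whole archive to disk. This is a minimal sketch rather than part of the original workflow: the function names are hypothetical, and the `latin-1` encoding is an assumption chosen because it maps every byte and so cannot raise a decode error on non-ASCII annotation lines.

```python
import gzip
from typing import Iterable, Iterator


def first_stockholm_block(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines up to and including the first "//" terminator."""
    for line in lines:
        yield line
        if line.strip() == "//":
            return  # stop reading as soon as the first alignment ends


def extract_first_alignment(gz_path: str, out_path: str) -> None:
    """Copy the first alignment of a gzipped Stockholm file to out_path."""
    # latin-1 is an assumption (see the note above); it round-trips bytes unchanged.
    with gzip.open(gz_path, "rt", encoding="latin-1") as src, \
            open(out_path, "w", encoding="latin-1") as dst:
        dst.writelines(first_stockholm_block(src))


# Small in-memory demonstration with two toy alignments.
sample = [
    "# STOCKHOLM 1.0\n", "seqA AC-GT\n", "//\n",
    "# STOCKHOLM 1.0\n", "seqB ACGGT\n", "//\n",
]
first_block = list(first_stockholm_block(sample))
print("".join(first_block), end="")
```

With the downloaded archive in the working directory, `extract_first_alignment("Pfam-A.seed.gz", "Pfam-A-1.seed")` would write the same file that the shell loop produces.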

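InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium. It integrates the protein signatures from these member databases into a single searchable resource, drawing on their individual strengths to produce a powerful integrated database and diagnostic tool.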

from Bio import AlignIO
align_file = "/path_to_file/Pfam-A-1.seed"
### 1. Read a sequence alignment file
## AlignIO.read returns the single alignment stored in the given file.
# The file format is Stockholm
align = AlignIO.read(align_file, "stockholm")
# Other common multiple sequence alignment formats include "clustal" and "phylip"
print("Alignment length %i" % align.get_alignment_length())
for record in align:
    print(f"{record.seq} {record.id}")

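## AlignIO.parse returns an iterator of alignment objects; loop over it to get each alignment
alignments = AlignIO.parse(align_file, "stockholm")
print(alignments)  # prints the iterator object itself, not the alignments

for alignment in alignments:
    print(alignment)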
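
To make the difference between `read` and `parse` concrete, it helps to see what a Stockholm file contains. The sketch below is a deliberately simplified, standard-library parser written for illustration; it is not Biopython's implementation, the function name `parse_stockholm` and the toy sequences are hypothetical, and it ignores all `#=` annotation lines. `AlignIO.read` expects exactly one "//"-terminated block and raises `ValueError` otherwise, whereas `AlignIO.parse` yields one alignment per block, as the generator here does.

```python
from typing import Dict, Iterable, Iterator


def parse_stockholm(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one {sequence id: aligned sequence} dict per alignment block."""
    current: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # header, blank line, or #=GF/#=GS/#=GR/#=GC markup
        if line == "//":
            yield current  # "//" closes one alignment
            current = {}
            continue
        name, seq = line.split(None, 1)
        # Interleaved files repeat the id, so append to any existing row.
        current[name] = current.get(name, "") + seq.replace(" ", "")


demo = """# STOCKHOLM 1.0
#=GF ID toy_family_1
seqA/1-6  ACCGGT
seqB/1-4  AC--GT
//
# STOCKHOLM 1.0
seqC/1-3  ACG
//
""".splitlines()

blocks = list(parse_stockholm(demo))
for number, block in enumerate(blocks, start=1):
    width = len(next(iter(block.values())))
    print(f"alignment {number}: {len(block)} rows, {width} columns")
```

Because Pfam-A-1.seed holds a single block, both `AlignIO.read` and `AlignIO.parse` work on it; the full Pfam-A.seed file holds many blocks and therefore needs `parse`.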

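### 2. Pairwise sequence alignment
# Note: Bio.pairwise2 is deprecated in recent Biopython releases; Bio.Align.PairwiseAligner is its replacement.
from Bio import pairwise2
from Bio.Seq import Seq
seq1 = Seq("ACCGGT")
seq2 = Seq("ACGT")

# globalxx: global alignment, 1 point per match, no mismatch or gap penalties
alignments = pairwise2.align.globalxx(seq1, seq2)
print(alignments)  # list of all optimal alignments

for alignment in alignments:
    print(alignment)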
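
The score that `globalxx` reports can be reproduced with a short dynamic-programming routine. The following is a minimal Needleman-Wunsch sketch using only the standard library, assuming a linear gap cost; it is not Biopython's code, and the function name `global_score` and its parameters are hypothetical. With the defaults (match 1, mismatch 0, gap 0) it mirrors the `xx` scoring scheme.

```python
def global_score(a: str, b: str, match: float = 1.0,
                 mismatch: float = 0.0, gap: float = 0.0) -> float:
    """Return the optimal global alignment score of a and b.

    gap is added once per gap character, so pass a negative value
    to penalise gaps.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    # First row and column: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_score("ACCGGT", "ACGT"))                               # 4.0
print(global_score("ACCGGT", "ACGT", mismatch=-1.0, gap=-1.0))      # 2.0
```

With free gaps and mismatches, the score equals the length of the longest common subsequence, which is why all four letters of ACGT can be matched. Once each gap costs 1, the two unavoidable gap characters reduce the score to 2.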

## Formatted output
from Bio.pairwise2 import format_alignment
alignments = pairwise2.align.globalxx(seq1, seq2)
for alignment in alignments:
    print(format_alignment(*alignment))
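
A formatted pairwise alignment is essentially the two gapped sequences with a marker line between them. The sketch below shows that idea with a hypothetical helper, `render_pair`; its layout only approximates what `format_alignment` prints and should not be read as the library's exact output.

```python
def render_pair(seq_a: str, seq_b: str, score: float) -> str:
    """Render two gapped sequences with a match line between them."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            marks.append(" ")   # gap column
        elif x == y:
            marks.append("|")   # match
        else:
            marks.append(".")   # mismatch
    return "\n".join([seq_a, "".join(marks), seq_b, f"  Score={score:g}"])


text = render_pair("ACCGGT", "AC--GT", 4.0)
print(text)
```

The pair shown is one of several equally good alignments of ACCGGT and ACGT. Shifting the gaps by one column gives another alignment with the same score, which is why `globalxx` returns a list rather than a single result.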

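### 3. Biopython provides interfaces to many sequence alignment tools through the Bio.Align.Applications module.
# Note: these command-line wrappers are deprecated in newer Biopython releases, so this import may warn or fail depending on the installed version.
from Bio.Align.Applications import ClustalwCommandline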
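
The wrapper classes in that module mainly assemble a command line for an external program. A comparable invocation can be sketched with `subprocess` from the standard library, which is the approach the Biopython deprecation notice suggests. The sketch rests on assumptions: the executable name `clustalw2`, the file names, and the helper `clustalw_command` are placeholders, and the external tool is only run if both the executable and the input file are actually present.

```python
import os
import shutil
import subprocess
from typing import List


def clustalw_command(infile: str, outfile: str,
                     executable: str = "clustalw2") -> List[str]:
    """Build a ClustalW command line as an argument list."""
    return [
        executable,
        f"-INFILE={infile}",    # unaligned input sequences
        f"-OUTFILE={outfile}",  # where to write the alignment
        "-OUTPUT=CLUSTAL",      # output format
    ]


cmd = clustalw_command("sequences.fasta", "sequences.aln")
print(" ".join(cmd))

# Run only when the tool is installed and the (hypothetical) input exists.
if shutil.which(cmd[0]) is not None and os.path.exists("sequences.fasta"):
    subprocess.run(cmd, check=True)
```

The resulting file could then be loaded with `AlignIO.read("sequences.aln", "clustal")`, in the same way as the Stockholm file in section 1.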

References
https://www.yiibai.com/biopython/biopython_sequence_alignments.html
https://biopython.org/wiki/AlignIO
