Find the likely protein domains and biological functional sites in this protein sequence.
Source: 百度文库 (Baidu Wenku) Editor: 查人人中国名人网 Time: 2024/10/05 14:52:55
MMATQHTQYPDARLSSPIVLDQCDLVTRACGLYSEYSLNPKLRTCRLPKHIYRLKYDAIV
LRFISDVPVATIPIDYIAPMLINVLADSKNAPLEPPCLSFLDEIVNYTVQDAAFLNYYMN
QIKTQEGVITDQLKQNIRRVIHKNRYLSALFFWHDLSILTRRGRMNRGNVRSTWFVTNEV
VDILGYGDYIFWKIPIALLPMNSANVPHASTDWYQPNIFKEAIQGHTHIISVSTAEVLIM
CKDLVTSRFNTLLIAELARLEDPVSADYPLVDDIQSLYNAGDYLLSILGSEGYQIIKYLE
PLCLAKIQLCSQYTERKGRFLTQMHLAVIQTLRELLLNRGLKKSQLSKIREFHQLLLRLR
STPQQLCELFSIQKHWGHPVLHSEKAIQKVKNHATVLKALRPIIIFETYCVFKYSVAKHF
FDSQGTWYSVISDRCLTPGLNSYIRRNQFPPLPMIKDLLWEFYHLDHPPLFSTKIISDLS
IFIKDRATAVEQTCWDAVFEPNVLGYSPPYRFNTKRVPEQFLEQEDFSIESVLQYAQELR
YLLPQNRNFSFSLKEKELNVGRTFGKLPYLTRNVQTLCEALLADGLAKAFPSNMMVVTER
EQKESLLHQASWHHTSDDFGEHATVRGSSFVTDLEKYNLAFRYEFTAPFIKYCNQCYGVR
NVFDWMHFLIPQCYMHVSDYYNPPHNVTLENREYPPEGPSAYRGHLGGIEGLQQKLWTSI
SCAQISLVEIKTGFKLRSAVMGDNQCITVLSVFPLESSPNEQERCAEDNAARVAASLAKV
TSACGIFLKPDETFVHSGFIYFGPKQYLNGIQLPQSLKTAARMAPLSDAIFDDLQGTLAS
IGTAFERSISETRHILPSRVAAAFHTYFSVRILQHHHLGFHKGSDLGQLAINKPLDFGTI
ALSLAVPQVLGGLSFLNPEKCLYRNLGDPVTSGLFQLKHYLSMVGMSDIFHALVAKSPGN
CSAIDFVLNPGGLNVPGSQDLTSFLRQIVRRSITLSARNKLINTLFHASADLEDELVCKW
LLSSTPVMSRFAADIFSRTPSGKRLQILGYLEGTRTLLASKMISNNAETPILERLRKITL
QRWNLWFSYLDHCDSALMEAIQPIRCTVDIAQILREYSWAHILGGRQLIGATLPCIPEQF
QTTWLKPYEQCVECSSTNNSSPYVSVALKRNVVSAWPDASRLGWTIGDGIPYIGSRTEDK
IGQPAIKPRCPSAALREAIELTSRLTWVTQGSANSDQLIRPFLEARVNLSVQEILQMTPS
HYSGNIVHRYNDQYSPHSFMANRMSNTATRLMVSTNTLGEFSGGGQAARDSNIIFQNVIN
FAVALYDIRFRNTCTSSIQYHRAHIHLTDCCTREVPAQYLTYTTTLNLDLSKYRNNELIY
DSEPLRGGLNCNLSIDSPLMKGPRLNIIEDDLIRLPHLSGWELAKTVLQSIISDSSNSST
DPISSGETRSFTTHFLTYPKIGLLYSFGALISFYLGNTILCTKKIGLTEFLYYLQNQIHN
LSHRSLRIFKPTFRHSSVMSRLMDIDPNFSIYIGGTAGDRGLSDAARLFLRIAISTFLSF
VEEWVIFRKANIPLWVVYPLEGQRPDPPGEFLNRVKSLIVGIEDDKNKGSILSRSEEKCS
SNLVYNCKSTASNFFHASLAYWRGRHRPKKTIGATKATTAPHIILPLGNSDRPPGLDLNQ
SNDTFIPTRIKQIVQGDSRNDRTTTTRLPPQSRSTPTSATEPPTKIYEGSTTYRGKSTDT
HLDEGHNAKEFPFNPHRLVVPFFKLTKDGEYSIEPSPEESRSNIKGLLQHLRTMVDTTIY
CRFTGIVSSMHYKLDEVLWEYNKFESAVTLAEGEGSGALLLIQKYGVKKLFLNTLATEHS
IESEVISGYTTPRMLLSVMPRTHRGELEVILNNSASQITDITHRDWFSNQKNRIPNDVDI
ITMDAETTENLDRSRLYEAVYTIICNHINPKTLKVVILKVFLSDLDGMCWINNYLAPMFG
SGYLIKPITSSARSSEWYLCLSNLLSTLRTTQHQTQANCLHVVQCALQQQVQRGSYWLSH
LTKYTTSRLHNSYIAFGFPSLEKVLYHRYNLVDSRNGPLVSITRHLALLQTEIRELVTDY
NQLRQSRTQTYHFIKTSKGRITKLVNDYLRFELVIRALKNNSTWHHELYLLPELIGVCHR
FNHTRNCTCSERFLVQTLYLHRMSDAEIKLMDRLTSLVNMFPEGFRSSSV
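Try this site for a dedicated analysis: http://www.compbio.dundee.ac.uk/~www-jpred/
Steps:
1.1 Go to JPred http://www.compbio.dundee.ac.uk/~www-jpred/
1.2 Click Prediction (Submit a protein sequence for secondary structure prediction)
1.3 Choose to receive the results by email (recommended), or leave the field blank to have the results shown on the web page
1.4 Enter the protein sequence (raw sequence)
1.5 Set the three File format parameters
1.6 Click Run to submit
1.7 Find the results link in your mailbox. On the results page that opens, choose item 3 (Your results in HTML can be found here.) or item 4 (A simple display of your query sequence and the prediction can be found here.) for a quick look at the results, and item 5 (Postscript output can be found here.) for graphical output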
PS00005 PKC_PHOSPHO_SITE Protein kinase C phosphorylation site :
44 - 46: TcR
160 - 162: TrR
246 - 248: TsR
314 - 316: TeR
331 - 333: TlR
383 - 385: SeK
432 - 434: SdR
472 - 474: StK
514 - 516: TkR
552 - 554: SlK
598 - 600: TeR
624 - 626: TvR
816 - 818: SlK
869 - 871: SvR
996 - 998: SaR
1041 - 1043: SgK
1222 - 1224: TsR
1482 - 1484: TkK
1502 - 1504: ShR
1505 - 1507: SlR
1512 - 1514: TfR
1670 - 1672: SdR
1705 - 1707: TtR
1732 - 1734: TyR
1871 - 1873: TpR
1882 - 1884: ThR
1902 - 1904: ThR
1952 - 1954: TlK
1991 - 1993: SaR
2007 - 2009: TlR
2046 - 2048: TsR
2116 - 2118: TsK
2170 - 2172: SeR
PS00006 CK2_PHOSPHO_SITE Casein kinase II phosphorylation site :
99 - 102: SflD
108 - 111: TvqD
233 - 236: StaE
331 - 334: TlrE
488 - 491: TavE
493 - 496: TcwD
552 - 555: SlkE
576 - 579: TlcE
598 - 601: TerE
615 - 618: TsdD
632 - 635: TdlE
726 - 729: SlvE
758 - 761: SpnE
843 - 846: TafE
848 - 851: SisE
962 - 965: SaiD
1088 - 1091: SylD
1185 - 1188: TigD
1195 - 1198: SrtE
1250 - 1253: SvqE
1297 - 1300: TlgE
1419 - 1422: SgwE
1438 - 1441: SstD
1444 - 1447: SsgE
1536 - 1539: TagD
1559 - 1562: SfvE
1613 - 1616: SrsE
1698 - 1701: SrnD
1718 - 1721: SatE
1740 - 1743: ThlD
1776 - 1779: SpeE
1793 - 1796: TmvD
1829 - 1832: TlaE
1902 - 1905: ThrD
1963 - 1966: SdlD
2168 - 2171: TcsE
2184 - 2187: SdaE
PS00001 ASN_GLYCOSYLATION N-glycosylation site :
106 - 109: NYTV
548 - 551: NFSF
686 - 689: NVTL
960 - 963: NCSA
1158 - 1161: NNSS
1248 - 1251: NLSV
1392 - 1395: NLSI
1437 - 1440: NSST
1500 - 1503: NLSH
1528 - 1531: NFSI
1679 - 1682: NQSN
1682 - 1685: NDTF
1892 - 1895: NNSA
2140 - 2143: NNST
2141 - 2144: NSTW
2162 - 2165: NHTR
2166 - 2169: NCTC
PS00008 MYRISTYL N-myristoylation site :
168 - 173: GNvrST
340 - 345: GLkkSQ
425 - 430: GTwySV
585 - 590: GLakAF
707 - 712: GGieGL
836 - 841: GTlaSI
883 - 888: GSdlGQ
959 - 964: GNcsAI
1231 - 1236: GSanSD
1303 - 1308: GGgqAA
1304 - 1309: GGqaAR
1387 - 1392: GGlnCN
1388 - 1393: GLncNL
1462 - 1467: GLlySF
1468 - 1473: GAliSF
1534 - 1539: GGtaGD
1541 - 1546: GLsdAA
1609 - 1614: GSilSR
1653 - 1658: GAtkAT
1675 - 1680: GLdlNQ
1805 - 1810: GIvsSM
PS00003 SULFATION Tyrosine sulfation site :
261 - 275: edpvsadYplvddiq
687 - 701: vtlenreYppegpsa
1373 - 1387: yrnneliYdseplrg
1764 - 1778: kltkdgeYsiepspe
PS00009 AMIDATION Amidation site :
1041 - 1044: sGKR
PS00004 CAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site :
1076 - 1079: RKiT
PS00007 TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site :
1374 - 1380: Rnn.Eli.Y
1764 - 1771: KltkDge.Y
1792 - 1800: RtmvDttiY
1935 - 1941: Rly.Eav.Y
PS00029 LEUCINE_ZIPPER Leucine zipper pattern :
1480 - 1501: LctkkigLteflyyLqnqihnL
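The short-motif hits listed above can be roughly reproduced with a plain regular-expression scan. The following is only a minimal sketch, not the scanner that produced this output. The three pattern strings are the PROSITE patterns for PS00001, PS00005 and PS00006 as commonly quoted and should be checked against the current PROSITE entries. The function names `prosite_to_regex` and `scan`, and the `offset` parameter, are hypothetical helpers introduced for illustration. To keep the example short it scans only the final 50 residues of the sequence, assuming the 36 preceding lines hold 60 residues each (offset 2160).

```python
import re

# Patterns as commonly quoted for these PROSITE entries (assumption: verify
# against the current PROSITE release before relying on them).
PATTERNS = {
    "PS00001 ASN_GLYCOSYLATION": "N-{P}-[ST]-{P}",
    "PS00005 PKC_PHOSPHO_SITE": "[ST]-x-[RK]",
    "PS00006 CK2_PHOSPHO_SITE": "[ST]-x(2)-[DE]",
}

def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern to a Python regex.

    Handles only x, [..], {..} and (n) / (n,m) repeats; anchors (< >) are
    not supported in this sketch.
    """
    parts = []
    for element in pattern.split("-"):
        element = element.replace("x", ".")                      # any residue
        element = element.replace("{", "[^").replace("}", "]")   # excluded residues
        element = element.replace("(", "{").replace(")", "}")    # repeat counts
        parts.append(element)
    return "".join(parts)

def scan(sequence, pattern, offset=0):
    """Return (start, end, match) tuples with 1-based inclusive positions.

    A lookahead is used so that overlapping hits are all reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [
        (offset + m.start(1) + 1, offset + m.end(1), m.group(1))
        for m in regex.finditer(sequence.upper())
    ]

# Last line of the query sequence; residues 2161-2210 under the stated assumption.
FRAGMENT = "FNHTRNCTCSERFLVQTLYLHRMSDAEIKLMDRLTSLVNMFPEGFRSSSV"

if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name)
        for start, end, match in scan(FRAGMENT, pattern, offset=2160):
            print(f"{start} - {end}: {match}")
```

On this fragment the scan finds the same final entries as the lists above (for example NHTR at 2162 - 2165 and SdaE at 2184 - 2187). Such short patterns occur very frequently by chance, so a hit only marks a candidate site.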
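This question is actually rather interesting. Based on sequence similarity, NCBI annotates the protein as "RNA-dependent RNA polymerase", yet there is no annotation of the corresponding domain.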