Permissiveness on GenBank files #5

Open
isaacguerreir opened this issue May 20, 2024 · 0 comments

I was testing seqparse in an application with a specific GenBank file. One of my tests was to remove parts of the file and try to trigger a thrown error. However, Seqparse seems very permissive about file content, even when the content does not follow the format. The output was a sequence with some incongruent characters and an `unknown` type.

For now I'm using a conditional that checks whether the type is unknown, to catch a possible error during file parsing (`if (seq.type == 'unknown')`).
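A minimal sketch of that workaround, wrapped in a helper so the check throws instead of silently passing a bad result along. `ParsedSeq` and `assertParsed` are hypothetical names for illustration, not part of seqparse; the only assumption about the parsed result is that it has a `type` field (which can be `'unknown'`, as described above) and a `seq` field.

```typescript
// Hypothetical shape of a parsed result; only the two fields used here are assumed.
interface ParsedSeq {
  type: string; // "unknown" is the value observed for the malformed file
  seq: string;
}

// Hypothetical helper: turns a suspicious parse result into a thrown error.
function assertParsed(parsed: ParsedSeq): ParsedSeq {
  if (parsed.type === "unknown") {
    throw new Error("Could not determine sequence type; the file may be malformed");
  }
  if (parsed.seq.length === 0) {
    throw new Error("Parsed sequence is empty");
  }
  return parsed;
}
```

The caller would pass whatever the parser returned through `assertParsed` and handle the error in a `try`/`catch`. This only detects the symptom after parsing; it does not validate the file format itself.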

Should this permissiveness be the expected behavior?

STX-10034               2918 bp ds-DNA     linear       23-JAN-2024
DEFINITION  .
ACCESSION   urn.local...t-hbr8b24
KEYWORDS    "Indication:PKU" "CpG Content:140" "Molecular Weight:1896700";
            "Genetic Elements:RES-Base-ITR-v1, spacer_left-ITR_v2,;
            VandenDriessche_PromoterSet, PmeI_site,;
            Mod_Minimum_Consensus_Kozak_v2, hPAH_codop_ORF_v2, PacI_site,;
            WPRE_3pUTR, bGH, spacer_right-ITR_v1, RES-Base-ITR-v1" "Long Form;
            Name:RES-Base-ITR-v1, spacer_left-ITR_v2,;
            VandenDriessche_PromoterSet, PmeI_site,;
            Mod_Minimum_Consensus_Kozak_v2, hPAH_codop_ORF_v2, PacI_site,;
            WPRE_3pUTR, bGH, spacer_right-ITR_v1, RES-Base-ITR-v1";
            "Length:2918" "5' Oligo:SO-300002" "3' Oligo:SO-300002" "Parent;
            Plasmid (SP-#):SP-210174" "5' Cut Site:BsaI" "3' Cut Site:BsaI";
            "Tissue Specificity (Promoter Only):Liver";
            "Comments/Reference:pHK11-412 with WT-ITR-v1 oligo"
            "Name:pHK11-412; with WT-ITR-v1 oligo".
SOURCE      
  ORGANISM  .
FEATURES             Location/Qualifiers
     misc_feature    58..111
                     /standard_name="RES-Base-ITR-v1"
     misc_feature    112..145
                     /standard_name="spacer_left-ITR_v2"
     misc_feature    153..551
                     /standard_name="VandenDriessche_PromoterSet"
     misc_feature    154..225
                     /standard_name="SerpinEnhancer"
     misc_feature    277..460
                     /standard_name="Mouse TTR 5pUTR (NM_013697.5)"
     misc_feature    461..551
                     /standard_name="MVM Intron"
     misc_feature    552..559
                     /standard_name="PmeI_site"
     misc_feature    560..569
                     /standard_name="Mod_Minimum_Consensus_Kozak_v2"
     CDS             570..1928
                     /translation="MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEE
                     VGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDI
                     GATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ
                     FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCG
                     FHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPM
                     YTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLC
                     KQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESF
                     NDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQK
                     IK*"
                     /standard_name="Translation 570-1928"
     misc_feature    570..1928
                     /standard_name="hPAH_codop_ORF_v2"
     misc_feature    1932..1939
                     /standard_name="PacI_site"
     misc_feature    1940..2520
                     /standard_name="WPRE_3pUTR"
     misc_feature    2521..2745
                     /standard_name="bGH"
     misc_feature    2746..2806
                     /standard_name="spacer_right-ITR_v1"
     misc_feature    complement(2807..2860)
                     /standard_name="RES-Base-ITR-v1"
ORIGIN      
    gcgagcgagc gcgcagagag ggagtggcca actccatcac taggggttcc ttgtagttaa
      121 tgattaaccc gccatgctac ttatcgcggc cgcgggggag gctgctggtg aatattaacc
      181 aaggtcaccc cagttatcgg aggagcaaac aggggctaag tccacacgcg tggtaccgtc
      241 tgtctgcaca tttcgtagag cgagtgttcc gatactctaa tctccctagg caaggttcat
      301 atttgtgtag gttacttatt ctccttttgt tgactaagtc aataatcaga atcagcaggt
      361 ttggagtcag cttggcaggg atcagcagcc tgggttggaa ggagggggta taaaagcccc
      421 ttcaccagga gaagccgtca cacagatcca caagctcctg aagaggtaag ggtttaaggg
      481 atggttggtt ggtggggtat taatgtttaa ttacctggag cacctgcctg aaatcacttt
      541 ttttcaggtt ggtttaaacc gcagccacca tgagcaccgc cgtgctggaa aatcctggcc
      601 tgggcagaaa gctgagcgac ttcggccaag agacaagcta catcgaggac aactgcaacc
      661 agaacggcgc catcagcctg atcttcagcc tgaaagaaga agtgggcgcc ctggccaagg
      721 tgctgagact gttcgaagag aacgacgtga acctgacaca catcgagagc agacccagca
      781 gactgaagaa ggacgagtac gagttcttca cccacctgga caagcggagc ctgcctgctc
      841 tgaccaacat catcaagatc ctgcggcacg acatcggcgc cacagtgcac gaactgagcc
      901 gggacaagaa aaaggacacc gtgccatggt tccccagaac catccaagag ctggacagat
      961 tcgccaacca gatcctgagc tatggcgccg agctggacgc tgatcaccct ggctttaagg
     1021 accccgtgta ccgggccaga agaaagcagt ttgccgatat cgcctacaac taccggcacg
     1081 gccagcctat tcctcgggtc gagtacatgg aagaggaaaa gaaaacctgg ggcaccgtgt
     1141 tcaagaccct gaagtccctg tacaagaccc acgcctgcta cgagtacaac cacatcttcc
     1201 cactgctcga aaagtactgc ggcttccacg aggacaatat ccctcagctt gaggacgtgt
     1261 cccagttcct gcagacctgc accggcttta gactgaggcc agttgccgga ctgctgagca
     1321 gcagagattt tctcggcggc ctggccttca gagtgttcca ctgtacccag tacatcagac
     1381 acggcagcaa gcccatgtac acccctgagc ctgatatctg ccacgagctg ctgggacatg
     1441 tgcccctgtt cagcgataga agcttcgccc agttcagcca agagatcgga ctggcttctc
     1501 tgggagcccc tgacgagtac attgagaagc tggccaccat ctactggttc accgtggaat
     1561 tcggcctgtg caagcagggc gacagcatca aagcttatgg cgctggcctg ctgtctagct
     1621 tcggcgagct gcagtactgt ctgagcgaga agcctaagct gctgcccctg gaactggaaa
     1681 agaccgccat ccagaactac accgtgaccg agttccagcc tctgtactac gtggccgaga
     1741 gcttcaacga cgccaaagaa aaagtgcgga acttcgccgc caccattcct cggcctttca
     1801 gcgtcagata cgacccctac acacagcgga tcgaggtgct ggacaacaca cagcagctga
     1861 aaattctggc cgactccatc aacagcgaga tcggcatcct gtgcagcgcc ctgcagaaaa
     1921 tcaagtgata gttaattaag agcatcttac cgccatttat tcccatattt gttctgtttt
     1981 tcttgatttg ggtatacatt taaatgttaa taaaacaaaa tggtggggca atcatttaca
     2041 tttttaggga tatgtaatta ctagttcagg tgtattgcca caagacaaac atgttaagaa
     2101 actttcccgt tatttacgct ctgttcctgt taatcaacct ctggattaca aaatttgtga
     2161 aagattgact gatattctta actatgttgc tccttttacg ctgtgtggat atgctgcttt
     2221 atagcctctg tatctagcta ttgcttcccg tacggctttc gttttctcct ccttgtataa
     2281 atcctggttg ctgtctcttt tagaggagtt gtggcccgtt gtccgtcaac gtggcgtggt
     2341 gtgctctgtg tttgctgacg caacccccac tggctggggc attgccacca cctgtcaact
     2401 cctttctggg actttcgctt tccccctccc gatcgccacg gcagaactca tcgccgcctg
     2461 ccttgcccgc tgctggacag gggctaggtt gctgggcact gataattccg tggtgttgtc
     2521 tgtgccttct agttgccagc catctgttgt ttgcccctcc cccgtgcctt ccttgaccct
     2581 ggaaggtgcc actcccactg tcctttccta ataaaatgag gaaattgcat cgcattgtct
     2641 gagtaggtgt cattctattc tggggggtgg ggtggggcag gacagcaagg gggaggattg
     2701 ggaagacaat agcaggcatg ctggggatgc ggtgggctct atggctctag agcatggcta
     2761 cgtagataag tagcatggcg ggttaatcat taactacacc tgcaggagga acccctagtg
     2821 atggagttgg ccactccctc tctgcgcgct cgctcgctca actgaggccg cccgggcaaa
     2881 gcccgggcgt cgggcgacct ttggtcgccc ggcctcag
//
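
For comparison, a stricter caller-side check could reject a file like the one above before it reaches the parser. This is only a sketch under stated assumptions: `checkGenbankStructure` is a hypothetical helper, not part of seqparse, and the rules it applies (a leading `LOCUS` line, an `ORIGIN` section, numbered sequence lines, and a `//` terminator) are a small subset of the GenBank flat file format chosen for illustration.

```typescript
// Hypothetical helper, not part of seqparse: returns a list of structural problems.
function checkGenbankStructure(text: string): string[] {
  const problems: string[] = [];
  // Ignore blank lines so trailing newlines do not affect the checks.
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return ["file is empty"];

  if (!/^LOCUS\s/.test(lines[0])) {
    problems.push("first line does not start with LOCUS");
  }

  let inOrigin = false;
  let sawOrigin = false;
  let sawTerminator = false;
  for (const line of lines) {
    if (/^ORIGIN/.test(line)) {
      inOrigin = true;
      sawOrigin = true;
      continue;
    }
    if (!inOrigin) continue;
    if (line.trim() === "//") {
      sawTerminator = true;
      inOrigin = false;
      continue;
    }
    // Sequence lines are expected to look like "<position> <bases> <bases> ...".
    if (!/^\s*\d+(\s+[A-Za-z]+)+\s*$/.test(line)) {
      problems.push(`malformed sequence line: "${line.trim().slice(0, 20)}"`);
    }
  }

  if (!sawOrigin) problems.push("no ORIGIN section");
  else if (!sawTerminator) problems.push("no // terminator after ORIGIN");
  return problems;
}
```

An empty array means none of these checks failed; it does not mean the file is fully valid GenBank.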
