I am seeking guidance to resolve an apparent inconsistency in my analysis. While learning to use hifiasm for a genome assembly project, I noticed two results that I might be misunderstanding:
k-mer profile: The k-mer coverage distribution shows a single peak at ~16X ([M::ha_ft_gen] peak_hom: 16), which initially suggested a homozygous genome based on standard k-mer theory.
Heterozygosity statistics: However, the log reports a high heterozygosity ratio ([M::stat] heterozygous:homozygous bases ≈289:1), which typically implies a highly heterozygous genome with bimodal k-mer peaks.
Could you help me understand how these observations can coexist under hifiasm’s internal model, and whether adjusting parameters (e.g., --hom-cov) could reconcile the discrepancy?
I would greatly appreciate any insights to improve my understanding and ensure proper assembly configuration.
darkxin12 changed the title from "[hifiasm_0.25.0-r726] Inconsistent homozygous coverage threshold (29X) vs k-mer peak (16X) in highly heterozygous genome" to "[hifiasm_0.25.0-r726] Single k-mer peak (16X) contradicts high heterozygosity ratio (289:1) in assembly" on Mar 21, 2025
Hi,
You have a clear shoulder at around 30-32x coverage, so I would not classify this distribution as "single peak". The genome is in fact quite heterozygous (with the homozygous k-mers at around 32x), as you say.
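One way to check the shoulder numerically is to look at how quickly the histogram falls off to the right of the main peak. The sketch below copies the counts for coverage 24 to 36 from the `[M::ha_hist_line]` output in the original post and prints the drop between neighbouring bins. The variable names are illustrative and not part of hifiasm, and this is only a rough check, not how hifiasm itself detects peaks.

```python
# Counts copied from the [M::ha_hist_line] lines of the posted log (coverage 24..36).
counts = {
    24: 32037788, 25: 28721116, 26: 26486023, 27: 25044742,
    28: 24079502, 29: 23632477, 30: 23467603, 31: 23153048,
    32: 23015716, 33: 22380247, 34: 21389297, 35: 20376062,
    36: 19146936,
}

# Drop in k-mer count from each coverage bin to the next one.
drops = {c: counts[c] - counts[c + 1] for c in sorted(counts) if c + 1 in counts}

# The bin where the curve falls the least is the flattest part of the shoulder.
flattest = min(drops, key=drops.get)

for cov, drop in drops.items():
    print(f"{cov:>2}x -> {cov + 1}x: drop of {drop:>9,}")
print("flattest step starts at", flattest)
```

The smallest drops sit between 29x and 32x, roughly twice the 16x main peak, which is consistent with reading 16x as the heterozygous peak and about 32x as the homozygous one.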
Since this is close to the homozygous coverage we discussed, do I need to reset the --hom-cov value and rerun hifiasm, or is the current setting sufficient?
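For reference, a rerun with an explicit homozygous coverage could be assembled as in the sketch below. It only builds and prints the command line and does not run anything. The value 32 is taken from the estimate in the reply above; the read file name, output prefix, and thread count are placeholders, and whether a rerun is needed at all is the open question here.

```python
import shlex

def hifiasm_command(reads, prefix, hom_cov, threads=32):
    """Build (but do not run) a hifiasm command line with an explicit --hom-cov."""
    return [
        "hifiasm",
        "-o", prefix,               # output prefix (placeholder)
        "-t", str(threads),         # number of threads (placeholder)
        "--hom-cov", str(hom_cov),  # homozygous read coverage to assume
        reads,                      # HiFi reads file (placeholder)
    ]

cmd = hifiasm_command("reads.fastq.gz", "asm_homcov32", hom_cov=32)
print(shlex.join(cmd))
```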
Thanks for the suggestion!
In our current phased (haplotype-resolved) assembly, the primary contigs (p_ctg) are roughly 25% larger than the latest version of the reference genome (reference: 1.6 Gb vs. assembly: 2.0 Gb).
Specifically, haplotype 1 (hap1) approximates the reference size at 1.6 Gb, whereas haplotype 2 (hap2) reaches 2.0 Gb.
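As a quick arithmetic check on these sizes, the sketch below computes how much larger each haplotype is relative to the reference. The sizes are the rounded values quoted above, so the percentages are approximate; the function name is just for illustration.

```python
def percent_larger(size_gb, reference_gb):
    """Relative size increase over the reference, in percent."""
    return (size_gb - reference_gb) / reference_gb * 100

reference_gb = 1.6
assemblies = {"hap1": 1.6, "hap2": 2.0}

for name, size_gb in assemblies.items():
    increase = percent_larger(size_gb, reference_gb)
    print(f"{name}: {size_gb} Gb, {increase:.0f}% larger than the reference")
```

With these numbers, going from 1.6 Gb to 2.0 Gb is an increase of about 25% relative to the reference; the extra 0.4 Gb is 20% of the 2.0 Gb assembly.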
I am testing smaller values for -s.
[M::ha_analyze_count] lowest: count[5] = 11210960
[M::ha_analyze_count] highest: count[16] = 73404125
[M::ha_hist_line] 2: ********** 7517307
[M::ha_hist_line] 3: ******** 5950814
[M::ha_hist_line] 4: *********** 7923837
[M::ha_hist_line] 5: *************** 11210960
[M::ha_hist_line] 6: ********************* 15651122
[M::ha_hist_line] 7: ***************************** 21069458
[M::ha_hist_line] 8: ************************************** 27882266
[M::ha_hist_line] 9: ************************************************ 35166958
[M::ha_hist_line] 10: *********************************************************** 43039027
[M::ha_hist_line] 11: ********************************************************************* 50968341
[M::ha_hist_line] 12: ******************************************************************************** 58504231
[M::ha_hist_line] 13: ***************************************************************************************** 65161626
[M::ha_hist_line] 14: ************************************************************************************************ 70143067
[M::ha_hist_line] 15: *************************************************************************************************** 72966931
[M::ha_hist_line] 16: **************************************************************************************************** 73404125
[M::ha_hist_line] 17: ************************************************************************************************* 71437531
[M::ha_hist_line] 18: ******************************************************************************************** 67359635
[M::ha_hist_line] 19: ************************************************************************************ 61349348
[M::ha_hist_line] 20: ************************************************************************** 54563791
[M::ha_hist_line] 21: ***************************************************************** 47861182
[M::ha_hist_line] 22: ********************************************************* 41575691
[M::ha_hist_line] 23: ************************************************* 36265920
[M::ha_hist_line] 24: ******************************************** 32037788
[M::ha_hist_line] 25: *************************************** 28721116
[M::ha_hist_line] 26: ************************************ 26486023
[M::ha_hist_line] 27: ********************************** 25044742
[M::ha_hist_line] 28: ********************************* 24079502
[M::ha_hist_line] 29: ******************************** 23632477
[M::ha_hist_line] 30: ******************************** 23467603
[M::ha_hist_line] 31: ******************************** 23153048
[M::ha_hist_line] 32: ******************************* 23015716
[M::ha_hist_line] 33: ****************************** 22380247
[M::ha_hist_line] 34: ***************************** 21389297
[M::ha_hist_line] 35: **************************** 20376062
[M::ha_hist_line] 36: ************************** 19146936
[M::ha_hist_line] 37: ************************ 17626954
[M::ha_hist_line] 38: ********************** 15984342
[M::ha_hist_line] 39: ******************** 14378007
[M::ha_hist_line] 40: ***************** 12787803
[M::ha_hist_line] 41: *************** 11132014
[M::ha_hist_line] 42: ************* 9605896
[M::ha_hist_line] 43: *********** 8143294
[M::ha_hist_line] 44: ********* 6926763
[M::ha_hist_line] 45: ******** 5801225
[M::ha_hist_line] 46: ******* 4850012
[M::ha_hist_line] 47: ****** 4048750
[M::ha_hist_line] 48: ***** 3410010
[M::ha_hist_line] 49: **** 2845662
[M::ha_hist_line] 50: *** 2418978
[M::ha_hist_line] 51: *** 2045373
[M::ha_hist_line] 52: ** 1798807
[M::ha_hist_line] 53: ** 1573588
[M::ha_hist_line] 54: ** 1425472
[M::ha_hist_line] 55: ** 1288114
[M::ha_hist_line] 56: ** 1208866
[M::ha_hist_line] 57: ** 1117988
[M::ha_hist_line] 58: * 1034859
[M::ha_hist_line] 59: * 973983
[M::ha_hist_line] 60: * 928617
[M::ha_hist_line] 61: * 856342
[M::ha_hist_line] 62: * 814749
[M::ha_hist_line] 63: * 773728
[M::ha_hist_line] 64: * 733380
[M::ha_hist_line] 65: * 684491
[M::ha_hist_line] 66: * 661488
[M::ha_hist_line] 67: * 630514
[M::ha_hist_line] 68: * 609039
[M::ha_hist_line] 69: * 575526
[M::ha_hist_line] 70: * 555132
[M::ha_hist_line] 71: * 535845
[M::ha_hist_line] 72: * 517110
[M::ha_hist_line] 73: * 497096
[M::ha_hist_line] 74: * 473363
[M::ha_hist_line] 75: * 449207
[M::ha_hist_line] 76: * 429007
[M::ha_hist_line] 77: * 410004
[M::ha_hist_line] 78: * 400178
[M::ha_hist_line] 79: * 385339
[M::ha_hist_line] 80: * 375382
[M::ha_hist_line] rest: ************************** 19199508
[M::ha_analyze_count] left: none
[M::ha_analyze_count] right: none
[M::ha_ft_gen] peak_hom: 16; peak_het: -1
[M::ha_ct_shrink::930.680*7.59] ==> counted 19574890 distinct minimizer k-mers
[M::ha_ft_gen::943.417[email protected]] ==> filtered out 19574890 k-mers occurring 80 or more times
[M::ha_opt_update_cov] updated max_n_chain to 100
[M::yak_count] collected 1440731874 minimizers
[M::ha_pt_gen::1164.312*11.07] ==> counted 83289253 distinct minimizer k-mers
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
……
[M::stat] # heterozygous bases: 3820503823; # homozygous bases: 13212505
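The ≈289:1 figure quoted in the post can be reproduced directly from the two numbers on the `[M::stat]` line. The sketch below only checks that arithmetic; how hifiasm decides which bases count as heterozygous or homozygous is not stated in this thread, so no interpretation is attached to the result.

```python
# Values copied from the [M::stat] line of the posted log.
het_bases = 3_820_503_823
hom_bases = 13_212_505

ratio = het_bases / hom_bases
het_fraction = het_bases / (het_bases + hom_bases)

print(f"heterozygous:homozygous = {ratio:.1f}:1")
print(f"heterozygous share of counted bases = {het_fraction:.2%}")
```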
hifiasm2.log