Persian-Keyword-Extraction

A hybrid syntactic-statistical method for extracting keywords from Persian texts.

Methodology

Preprocessing: Normalize text (e.g., unify spellings).
Processing:
- POS tagging using Hazm.
- Generate noun/adjective combinations.
- Stemming and frequency counting.
Selection: Pick top 5 combinations by frequency.
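
The three stages above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the function names (`normalize`, `generate_candidates`, `top_keywords`), the tag sets, and the three-word limit on combinations are all hypothetical. To stay self-contained it takes already-tagged `(word, tag)` pairs instead of calling Hazm, and stemming is passed in as a callable (something like Hazm's `Stemmer().stem` could be supplied there).

```python
from collections import Counter

# Assumption: "unify spellings" includes mapping Arabic letter variants
# to their Persian counterparts (Arabic yeh -> Persian yeh, Arabic kaf -> Persian kaf).
CHAR_MAP = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9"})

# Assumption: tag names vary between tagger versions, so both short
# and universal-style names are accepted here.
NOUN_TAGS = {"N", "Ne", "NOUN"}
ADJ_TAGS = {"AJ", "AJe", "ADJ"}


def normalize(text):
    """Unify letter variants and collapse runs of whitespace."""
    return " ".join(text.translate(CHAR_MAP).split())


def base_tag(tag):
    """Drop any suffix after a comma, e.g. 'NOUN,EZ' -> 'NOUN'."""
    return tag.split(",")[0]


def generate_candidates(tagged, max_len=3, stem=lambda word: word):
    """Collect noun/adjective sequences that start with a noun.

    `tagged` is a list of (word, tag) pairs. Each candidate is a run of
    1..max_len consecutive noun/adjective tokens, stemmed and space-joined.
    """
    allowed = NOUN_TAGS | ADJ_TAGS
    candidates = []
    for start in range(len(tagged)):
        if base_tag(tagged[start][1]) not in NOUN_TAGS:
            continue
        words = []
        for end in range(start, min(start + max_len, len(tagged))):
            word, tag = tagged[end]
            if base_tag(tag) not in allowed:
                break  # the run ends at the first non-noun/adjective token
            words.append(stem(word))
            candidates.append(" ".join(words))
    return candidates


def top_keywords(candidates, k=5):
    """Return the k most frequent candidates."""
    return [phrase for phrase, _ in Counter(candidates).most_common(k)]
```

Starting every candidate at a noun is one simple way to avoid adjective-only phrases; other combination rules are equally plausible, and the repository may use a different one.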

Usage

Install dependencies:

pip install -r requirements.txt

Run the pipeline:

from preprocessing import normalize_persian_text
from processing import generate_combinations
from selection import select_keywords

# Read the sample input as UTF-8 so Persian characters are decoded correctly.
with open("data/sample_input.txt", encoding="utf-8") as f:
    text = f.read()

normalized = normalize_persian_text(text)          # preprocessing
combinations = generate_combinations(normalized)   # processing
keywords = select_keywords(combinations)           # selection
print(keywords)

