Value error when max_stopword_similarity too low in extract_terms method #29

loctimize · 2022-08-17T05:06:16Z

When the max_stopword_similarity value passed to the extract_terms method is too low, e.g. 0.10, no terms may be found at all. In that case, the following error is raised at line 124 of term_extractor.py:

raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Suggestion:
Check whether top_spans actually contains any term candidates by wrapping lines 124-132 in an if condition:

        if len(top_spans) > 0:
            if collapse_similarity is True:
                top_spans = self._collapse_similarity(top_spans)

            for i, span in enumerate(top_spans):
                span._.span_id = i
            top_spans = sorted(top_spans, key=lambda span: span._.span_id)

            if return_as_table is True:
                top_spans = self._return_as_table(top_spans)
        return top_spans
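
To illustrate the guard in isolation, here is a minimal, self-contained sketch that does not depend on the library. The names `postprocess_spans` and `strict_collapse` are hypothetical stand-ins (plain strings instead of spaCy spans, a simple deduplication instead of the real `_collapse_similarity`); they only mimic the situation where a later step rejects empty input.

```python
from typing import Callable, List, Sequence


def strict_collapse(spans: List[str]) -> List[str]:
    """Hypothetical stand-in for a step that fails on empty input."""
    if not spans:
        # Mimics the kind of error reported above; not the library's real code.
        raise ValueError("Expected 2D array, got 1D array instead")
    # Deduplicate case-insensitively, keeping the first occurrence.
    seen = set()
    result = []
    for span in spans:
        key = span.lower()
        if key not in seen:
            seen.add(key)
            result.append(span)
    return result


def postprocess_spans(
    top_spans: Sequence[str],
    collapse: Callable[[List[str]], List[str]],
    collapse_similarity: bool = True,
) -> List[str]:
    """Hypothetical stand-in for the tail of extract_terms, with the guard."""
    spans = list(top_spans)
    # Guard: with no candidates, skip every step that needs at least one span.
    if len(spans) > 0:
        if collapse_similarity:
            spans = collapse(spans)
    return spans


if __name__ == "__main__":
    print(postprocess_spans([], strict_collapse))
    print(postprocess_spans(["Term", "term", "other"], strict_collapse))
```

With the guard in place, an empty candidate list is returned unchanged instead of reaching the step that raises.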

Does this make sense to you?
