Metadata
cEP	9
Version	0.1
Title	Extract and utilize information from project files
Authors	Satwik Kansal mailto:satwikkansal@gmail.com
Status	Active
Type	Feature

Abstract

There are many kinds of project files like ".editorconfig", "Gruntfile.js", etc. that contain meta-information that can used by tools like coala-quickstart to provide better analysis. This cEP proposes a mechnaism to represent and extract information from project files and an interface to utilize the extracted information.

Proposed Approach

1. Implementation of the “Info” class

A subclass of Info class will represent a specific kind of information. For example, the class IgnorePathInfo may represent “ignored file paths”.

class Info:
    description = "Some information"
    # for validation
    value_type = ("data-type of the information value, can also be an ``Enum``"
                  "of possible choices.")

    def __init__(self,
                 source,
                 value: value_type
                 scope=None):
        """
        :param source: Source from where the information is discovered.
        :param value:  Value of the infomation to be represented.
        :param scope:  Scope of the applicability of the information. Some
                       of the values may be "bear-specific" or "global".
        """
        self.source = source
        self._value = value

        # optional metadata
        self.scope = scope

    @property
    def value(self):
        return self._value

    @property
    def name(self):
        return self.__class__.__name__

2. Implementation of “InfoExtractor” class

Any derived class of the InfoExtractor abstract class consists of:

Methods for parsing the project file.
Methods to find information in the parsed file.
A collection of Info objects representing the information extracted from a particular kind of project file.
Files to parse and extract information from.

from coalib.parsing.Globbing import glob

class InfoExtractor:
    # Kinds of files supported by the InfoExtractor
    supported_files = []

    # Meta-data
    # A list of ``Info`` classes relevant to the class
    supported_information_kinds = []

    def __init__(self,
                 target_files,
                 project_directory):
        """
        :param target_files:      list of file globs to extract information
                                  from.
        :param project_directory: Absolute path to project directory in which
                                  the target files will be searched.
        """
        self.target_files = target_files
        self.directory = project_directory
        self._information = dict()

    @property
    def information(self):
        """
        Return extracted information (if any)
        """
        return self._information

    def parse_file(self, filename):
        """
        Parses the given file and returns the parsed file.
        """
        raise NotImplementedError

    def _add_info(self, info):
        """
        Method to organize and add the supplied information in self.information attribute.

        :param info: Collection of ``Info`` instances.
        """
        pass

    def extract_information(self):
        """
        Extracts the information, saves in the object and returns it.
        """
        filenames = retrieve_files(self.target_files)

        for fname in filenames:
            pfile = self.parse_file(fname)
            file_info = find_information(pfile)
            if file_info:
                self._add_info(fname, file_info)

        return self.information

    @staticmethod
    def find_information(self, fname, parsed_file):
        """
        Returns a collection of ``Info`` instances.
        """
        raise NotImplementedError

    @staticmethod
    def retrieve_files(file_globs, directory):
        """
        Returns matched filenames acoording to the list of file globs and
        supported files of the extractor.
        """
        matched_files = []

        for g in file_globs:
            matched_files.append(glob(g))

        return matched_files

Example

import json


class LicenseUsedInfo(Info):
    description = "License of the project."
    value_type = Enum('MIT', 'Apache 2.0', 'GNU GPL')

class PackageJSONInfoExtractor(InfoExtractor):
    supported_files = "package.json"
    supported_information_kinds = [LicenseUsedInfo]

    def parse_file(self, filename):
        return json.loads(filename)

    def find_information(self, fname, parsed_file):
        if parsed_file.get("license"):
            return LicenseUsedInfo(
                fname,
                parsed_file["license"])


project_dir = "any_project_dir"
pjson_info =  PackageJSONInfoExtractor(
    "package.json",
    project_dir).extract_information()
license_used = pjson_info.get("LicenseUsedInfo")

Another example

class LinterUsedInfo(Info):
    description = "Linter used in project."
    value_type = str
    def __init__(self,
                 source,
                 value: value_type,
                 is_supported_by_coala=True):
        super().__init__(source, value)
        self.is_supported_by_coala = is_supported_by_coala

class GruntfileExtractor(InfoExtractor):
    def parse_file(self, filename):
        # parse the JS file using some library function
        parsed_file = parse_javascript(filename)
        return parsed_file

    def find_information(self, fname, parsed_file):
        # extract lint subtasks that coala supports from parsed Gruntfile.js
        linters_used = ["jshint", "csslint"]

        return [LinterUsedInfo(fname, l, True) for l in linters_used]


gfe = GruntfileExtractor("Gruntfile.js", project_dir)
gfe.extract_information()
coala_supported_linters = gfe.information.get("LinterUsedInfo")

Possible Applications

This mechanism can be used by coala-quickstart to automatically extract some of the setting values for the bears and also to suggest project-relevant bears.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cEP-0009.md

cEP-0009.md

Abstract

Proposed Approach

1. Implementation of the “Info” class

2. Implementation of “InfoExtractor” class

Example

Possible Applications

Files

cEP-0009.md

Latest commit

History

cEP-0009.md

File metadata and controls

Abstract

Proposed Approach

1. Implementation of the “Info” class

2. Implementation of “InfoExtractor” class

Example

Possible Applications