[Function Proposal] Simple Get Content from URL #315

Closed
@rawkintrevo

Function Proposal

Description

We already have a tool that fetches pages with a browser, but that's overkill for a lot of use cases.

Use Cases

Any content that doesn't need JavaScript to render, such as static HTML pages.

Proposed API Interface

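import logging

import requests

# BaseTool is the project's existing tool base class; import it from wherever it lives in the repo.
logger = logging.getLogger(__name__)  # assumption: swap in the project's own logger if it has one

class GetUrlContent(BaseTool):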
    def __init__(self, name="get_url_content"):  
        super().__init__()  
        self.name = name  
  
    @property  
    def definition(self):  
        return {  
            "type": "function",  
            "function": {  
                "name": self.name,  
                "description": "Fetches the text content of a given URL. This tool makes a simple GET request and returns the raw text content. It does not render JavaScript or handle complex interactions.",  
                "parameters": {  
                    "type": "object",  
                    "properties": {  
                        "url": {  
                            "type": "string",  
                            "description": "The URL to fetch content from (e.g., 'https://www.example.com')."  
                        }  
                    },  
                    "required": ["url"]  
                }  
            }  
        }  
  
    def fn(self, url: str):  
        logger.debug(f"Attempting to fetch content from URL: {url}")  
        try:  
            # Add a User-Agent header to mimic a browser, which can help with some sites  
            headers = {  
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'  
            }  
            # Add a timeout to prevent the request from hanging indefinitely  
            response = requests.get(url, headers=headers, timeout=15)  
            response.raise_for_status()  # This will raise an HTTPError for bad responses (4xx or 5xx)  
            logger.info(f"Successfully fetched content from URL: {url}")  
            return response.text  
        except requests.exceptions.HTTPError as http_err:  
            logger.error(f"HTTP error occurred while fetching URL {url}: {http_err}")  
            return f"Error: HTTP error occurred - {http_err}"  
        # Catch Timeout before ConnectionError: ConnectTimeout subclasses both,
        # so this order reports connect timeouts as timeouts.
        except requests.exceptions.Timeout as timeout_err:
            logger.error(f"Timeout occurred while fetching URL {url}: {timeout_err}")
            return f"Error: Timeout occurred while fetching URL - {timeout_err}"
        except requests.exceptions.ConnectionError as conn_err:
            logger.error(f"Connection error occurred while fetching URL {url}: {conn_err}")
            return f"Error: Connection error occurred - {conn_err}"
        except requests.exceptions.RequestException as req_err:  
            logger.error(f"An error occurred during the request to URL {url}: {req_err}")  
            return f"Error: An error occurred during the request - {req_err}"  
        except Exception as e:  
            logger.error(f"An unexpected error occurred while fetching URL {url}: {e}")  
            return f"Error: An unexpected error occurred - {e}"  
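
For reviewers who want to see how `definition` and `fn` might fit together at call time, here is a rough offline sketch. Everything in it is an assumption made for illustration: the `BaseTool` stand-in, the `EchoUrlTool` stub (which has the same shape as `GetUrlContent` but never touches the network), and the `dispatch` helper are hypothetical and not part of the repo. It borrows the proposal's convention of returning `Error: ...` strings rather than raising.

```python
import json


class BaseTool:
    """Stand-in for the project's BaseTool (assumption: the real one may differ)."""


class EchoUrlTool(BaseTool):
    """Hypothetical offline stub with the same shape as GetUrlContent."""

    def __init__(self, name="get_url_content"):
        super().__init__()
        self.name = name

    @property
    def definition(self):
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": "Offline stub: echoes the URL instead of fetching it.",
                "parameters": {
                    "type": "object",
                    "properties": {"url": {"type": "string"}},
                    "required": ["url"],
                },
            },
        }

    def fn(self, url: str):
        # No network access here; the real tool would call requests.get().
        return f"<content of {url}>"


def dispatch(tools, tool_name, raw_arguments):
    """Route a tool call (name plus JSON-encoded arguments) to the matching fn()."""
    by_name = {tool.name: tool for tool in tools}
    tool = by_name.get(tool_name)
    if tool is None:
        return f"Error: unknown tool '{tool_name}'"
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as err:
        return f"Error: arguments are not valid JSON - {err}"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    # Check the arguments against the "required" list from the tool definition.
    required = tool.definition["function"]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        return f"Error: missing required arguments - {missing}"
    return tool.fn(**arguments)


tool = EchoUrlTool()
result = dispatch([tool], "get_url_content", '{"url": "https://www.example.com"}')
print(result)
```

Swapping `EchoUrlTool` for the proposed `GetUrlContent` should leave `dispatch` unchanged, since it relies only on `name`, `definition`, and `fn`.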

Dependencies

requests

Documentation

Somewhere under docs/web/? Dealer's choice.

Contribution Checklist

  • I have read the CONTRIBUTING.md guide
  • I have checked that this function doesn't already exist in the repository
  • I have followed the single responsibility principle
  • I have designed a simple and intuitive API interface
  • I have considered error handling and logging
  • I have outlined clear use cases
  • I have considered documentation requirements
