Add instructions for embedding #434

berengerdoneux opened this issue May 3, 2025 · 2 comments

Comments

@berengerdoneux

Hi, this may be a feature request, but many open-source embedding models require additional instructions for best performance (e.g. nomic-embed-text-v1.5, multilingual-e5-large-instruct).

Is there a way to provide these instructions?

Regards.

@prasmussen15
Collaborator

Hey, so Graphiti has a generic abstract EmbedderClient that is used for our actual calls. It defines the embedderClient.create and embedderClient.create_batch methods. If you want to use an embedder that isn't currently supported, you can define a new class that inherits from EmbedderClient and implements those methods.

For example, you could make a NomicEmbed(EmbedderClient) class and either have it instantiate the embedding instructions or use some defaults on the nomicEmbed.create call. I can help advise on this and review if you need help implementing this.
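
A minimal sketch of what such a subclass might look like, assuming the base class exposes async `create` and `create_batch` methods. The `EmbedderClient` below is a local stand-in so the snippet runs on its own (the real signatures may differ), and `embed_fn` is a hypothetical callable standing in for whatever actually runs the Nomic model. The `search_document: ` and `search_query: ` strings are the task prefixes nomic-embed-text-v1.5 expects.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Awaitable, Callable


# Stand-in for graphiti's abstract EmbedderClient so this sketch is runnable.
# Only the two method names come from the thread; the signatures are assumed.
class EmbedderClient(ABC):
    @abstractmethod
    async def create(self, input_data: str) -> list[float]: ...

    @abstractmethod
    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]: ...


# Hypothetical type: any async function that maps texts to embedding vectors.
EmbedFn = Callable[[list[str]], Awaitable[list[list[float]]]]


class NomicEmbed(EmbedderClient):
    """Prepends a task prefix to every text before embedding it."""

    def __init__(self, embed_fn: EmbedFn, prefix: str = "search_document: ") -> None:
        self.embed_fn = embed_fn
        self.prefix = prefix  # default instruction, overridable per instance

    async def create(self, input_data: str) -> list[float]:
        vectors = await self.create_batch([input_data])
        return vectors[0]

    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]:
        prefixed = [self.prefix + text for text in input_data_list]
        return await self.embed_fn(prefixed)
```

With this shape, one instance could be built with the document prefix and another with `prefix="search_query: "`, which keeps the instruction choice out of the calling code.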

Also if you create a new working embedder client for one of these models we would also be happy to merge in your PR so others in the community can use it as well!

@berengerdoneux
Author

berengerdoneux commented May 3, 2025

Hi, thanks for your answer.

I looked quickly at the code, but I don't think it will be a quick and easy implementation.
These embedders generally require different prompts for document embedding vs. retrieval.
Actually, multilingual-e5-large-instruct uses many different prompts, depending on the type of source: https://github.com/microsoft/unilm/blob/9c0f1ff7ca53431fe47d2637dfe253643d94185b/e5/utils.py#L106

I'm not sure how to implement this, but it shouldn't be infeasible, I guess.
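
To illustrate the document vs. retrieval split, here is a rough sketch of the prompt formatting for multilingual-e5-large-instruct. It assumes the `Instruct: ...\nQuery: ...` template from the model card, with documents embedded without any instruction. `Role`, `TASK_INSTRUCTIONS`, and `format_for_e5_instruct` are hypothetical names, and the single task description is only an example; the linked `e5/utils.py` holds the actual per-source prompts.

```python
from enum import Enum


class Role(Enum):
    QUERY = "query"        # text used to search
    DOCUMENT = "document"  # text being indexed


# Illustrative only: one example task description keyed by a made-up task name.
TASK_INSTRUCTIONS = {
    "web_search": "Given a web search query, retrieve relevant passages that answer the query",
}


def format_for_e5_instruct(text: str, role: Role, task: str = "web_search") -> str:
    """Return the string that would be sent to the embedding model."""
    if role is Role.DOCUMENT:
        # Documents are embedded as-is, with no instruction.
        return text
    # Queries carry a one-sentence task description.
    return f"Instruct: {TASK_INSTRUCTIONS[task]}\nQuery: {text}"
```

The difficulty is that the embedder has to be told which role and which task apply to each call, which a single `create(text)` method does not express.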

Regards.
