Commit a4abd28

Add IEnumerable and more 'data management' docs
1 parent 80babed commit a4abd28

10 files changed, +176 -14 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 Add:

 - Added `VectorTextResultItem.Id` property so it's easy to get the database ID for search results if necessary.
+- `IVectorDatabase` now inherits from `IEnumerable` so you can easily look through the text documents that have been added to the database.

 Fixed:

docs/docs/get-started/data-management/index.md

Lines changed: 79 additions & 3 deletions
@@ -6,7 +6,7 @@ title: Data Management

 Since `Build5Nines.SharpVector` is a database, it also has data management methods available. These methods enable you to add, remove, and update the text documents that are vectorized and indexed within the semantic database.

-## Get Text ID
+## Get Text Item IDs

 Every text item within a `Build5Nines.SharpVector` database is assigned a unique identifier (ID). There are a few ways to get access to the ID of the text items.

@@ -22,10 +22,86 @@ Every text item within a `Build5Nines.SharpVector` database is assigned a unique

 === ".Search()"

-
+When you perform a semantic search, the search results contain the list of matching texts; each has an `Id` property.

-## Update Text and Metadata
+```csharp
+var results = vdb.Search("query text");
+
+foreach (var item in results.Texts) {
+    var id = item.Id;
+    var text = item.Text;
+    var metadata = item.Metadata;
+    // do something here
+}
+```
+
+=== "Enumerator"
+
+The `IVectorDatabase` classes implement `IEnumerable` so you can easily loop through all the text items that have been added to the database.
+
+```csharp
+foreach (var item in vdb) {
+    var id = item.Id;
+    var text = item.Text;
+    var metadata = item.Metadata;
+    var vector = item.Vector;
+
+    // do something here
+}
+```
+
+## Get
+
+If you know the `id` of a text item in the database, you can retrieve it directly.
+
+### Get By Id
+
+The `.GetText` method can be used to retrieve a text item from the vector database directly.
+
+```csharp
+vdb.GetText(id);
+```
+
+## Update
+
+Once text items have been added to the database, the "Update" methods can be used to modify them.
+
+### Update Text
+
+The `.UpdateText` method can be used to update the `Text` value, and associated vectors will be updated.
+
+```csharp
+vdb.UpdateText(id, newTxt);
+```
+
+When the `Text` is updated, new vector embeddings are generated for the new text.
+
+### Update Metadata
+
+The `.UpdateTextMetadata` method can be used to update the `Metadata` for a given text item by `Id`.
+
+```csharp
+vdb.UpdateTextMetadata(id, newMetadata);
+```
+
+When `Metadata` is updated, the vector embeddings are not updated.
+
+### Update Text and Metadata
+
+The `.UpdateTextAndMetadata` method can be used to update the `Text` and `Metadata` for a text item in the database for the given text item `Id`.

 ```csharp
 vdb.UpdateTextAndMetadata(id, newTxt, newMetadata);
 ```
+
+## Delete
+
+The vector database supports the ability to delete text items.
+
+### Delete Text
+
+The `.DeleteText` method can be used to delete a text item from the database for the given `Id`.
+
+```csharp
+vdb.DeleteText(id);
+```
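
Because the database is documented above as implementing `IEnumerable`, standard LINQ operators should apply to it as well as `foreach`. The sketch below illustrates that idea on its own: `TextItem` and `EnumerationSketch.FindIds` are hypothetical stand-ins, not library types, and only the `Id`, `Text`, and `Metadata` property names come from the documentation in this diff.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the items yielded when enumerating the database.
// Only the property names mirror the docs; the concrete types are assumed.
public record TextItem(int Id, string Text, string Metadata);

public static class EnumerationSketch
{
    // Returns the IDs of all items whose text contains the given term,
    // ignoring case. Works on any IEnumerable<TextItem>.
    public static List<int> FindIds(IEnumerable<TextItem> items, string term) =>
        items
            .Where(item => item.Text.Contains(term, StringComparison.OrdinalIgnoreCase))
            .Select(item => item.Id)
            .ToList();
}
```

With the real library, the same `Where`/`Select` chain would presumably be applied to `vdb` directly, but that has not been verified against the package here.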

docs/mkdocs.yml

Lines changed: 4 additions & 2 deletions
@@ -139,8 +139,10 @@ nav:
     - Basic Example: get-started/#basic-example
     - Data Management:
       - get-started/data-management/index.md
-      - Get Text ID: get-started/data-management/#get-text-id
-      - Update Text and Metadata: get-started/data-management/#update-text-and-metadata
+      - Get Text Item IDs: get-started/data-management/#get-text-item-ids
+      - Get By Id: get-started/data-management/#get-by-id
+      - Update: get-started/data-management/#update
+      - Delete: get-started/data-management/#delete
   - Concepts:
     - concepts/index.md
     - What is a Vector Database?: concepts/#what-is-a-vector-database

src/Build5Nines.SharpVector/IVectorDatabase.cs

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ namespace Build5Nines.SharpVector;
 /// <typeparam name="TMetadata"></typeparam>
 /// <typeparam name="TDocument"></typeparam>
 public interface IVectorDatabase<TId, TMetadata, TDocument>
+    : IEnumerable<IVectorTextDatabaseItem<TId, TDocument, TMetadata>>
     where TId : notnull
 {
     /// <summary>

src/Build5Nines.SharpVector/MemoryVectorDatabaseBase.cs

Lines changed: 36 additions & 4 deletions
@@ -10,6 +10,7 @@
 using System.Text.Json;
 using Build5Nines.SharpVector.Embeddings;
 using System.Runtime.ExceptionServices;
+using System.Collections;

 namespace Build5Nines.SharpVector;

@@ -145,8 +146,14 @@ public void UpdateTextMetadata(TId id, TMetadata metadata) {
         if (VectorStore.ContainsKey(id))
         {
             var existing = VectorStore.Get(id);
-            existing.Metadata = metadata;
-            VectorStore.Set(id, existing);
+
+            var item = new VectorTextItem<TVocabularyKey, TMetadata>(
+                existing.Text,
+                metadata,
+                existing.Vector
+            );
+
+            VectorStore.Set(id, item);
         }
         else
         {
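
The hunk above stops mutating the stored item and instead builds a new item that reuses the existing `Text` and `Vector`. One plausible reading is that this lets the store hand out a read-only view (compare the `IVectorTextItem` return types later in this commit). The sketch below shows the replace-instead-of-mutate pattern in isolation; `IStoredItem`, `StoredItem`, and `MetadataStoreSketch` are hypothetical names invented for illustration and are not the library's types.

```csharp
using System.Collections.Generic;

// Hypothetical read-only view of a stored item (illustrative only).
public interface IStoredItem
{
    string Text { get; }
    string Metadata { get; }
    float[] Vector { get; }
}

// Immutable implementation: values are fixed at construction time.
public sealed class StoredItem : IStoredItem
{
    public StoredItem(string text, string metadata, float[] vector)
    {
        Text = text;
        Metadata = metadata;
        Vector = vector;
    }

    public string Text { get; }
    public string Metadata { get; }
    public float[] Vector { get; }
}

public sealed class MetadataStoreSketch
{
    private readonly Dictionary<int, IStoredItem> _items = new();

    public void Set(int id, IStoredItem item) => _items[id] = item;

    public IStoredItem Get(int id) => _items[id];

    // Replaces the stored item with a copy carrying the new metadata.
    // Text and Vector are reused as-is, so nothing is re-embedded.
    public void UpdateMetadata(int id, string metadata)
    {
        if (!_items.TryGetValue(id, out var existing))
        {
            throw new KeyNotFoundException($"Text with ID {id} not found.");
        }

        _items[id] = new StoredItem(existing.Text, metadata, existing.Vector);
    }
}
```

Reusing the same vector array matches the documented behaviour that a metadata update does not regenerate embeddings.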
@@ -340,6 +347,16 @@ public virtual void DeserializeFromBinaryStream(Stream stream)
         }
         DeserializeFromBinaryStreamAsync(stream).Wait();
     }
+
+    public IEnumerator<IVectorTextDatabaseItem<TId, TVocabularyKey, TMetadata>> GetEnumerator()
+    {
+        return VectorStore.Select(kvp => new VectorTextDatabaseItem<TId, TVocabularyKey, TMetadata>(kvp.Key, kvp.Value.Text, kvp.Value.Metadata, kvp.Value.Vector)).GetEnumerator();
+    }
+
+    IEnumerator IEnumerable.GetEnumerator()
+    {
+        return GetEnumerator();
+    }
 }


@@ -472,8 +489,14 @@ public void UpdateTextMetadata(TId id, TMetadata metadata) {
         if (VectorStore.ContainsKey(id))
         {
             var existing = VectorStore.Get(id);
-            existing.Metadata = metadata;
-            VectorStore.Set(id, existing);
+
+            var item = new VectorTextItem<string, TMetadata>(
+                existing.Text,
+                metadata,
+                existing.Vector
+            );
+
+            VectorStore.Set(id, item);
         }
         else
         {
@@ -660,4 +683,13 @@ public virtual void DeserializeFromBinaryStream(Stream stream)
         DeserializeFromBinaryStreamAsync(stream).Wait();
     }

+    public IEnumerator<IVectorTextDatabaseItem<TId, string, TMetadata>> GetEnumerator()
+    {
+        return VectorStore.Select(kvp => new VectorTextDatabaseItem<TId, string, TMetadata>(kvp.Key, kvp.Value.Text, kvp.Value.Metadata, kvp.Value.Vector)).GetEnumerator();
+    }
+
+    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
+    {
+        return this.GetEnumerator();
+    }
 }
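
Both `GetEnumerator` additions in this file follow the usual C# pattern for `IEnumerable<T>`: a public generic enumerator that projects the stored key/value pairs, and an explicitly implemented non-generic one that delegates to it. A minimal sketch of that pattern, using a hypothetical `Entry` record and `EnumerableStoreSketch` class rather than the library's own types:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type yielded during enumeration (illustrative only).
public record Entry(int Id, string Text);

public sealed class EnumerableStoreSketch : IEnumerable<Entry>
{
    private readonly Dictionary<int, string> _texts = new();

    public void Add(int id, string text) => _texts[id] = text;

    // Generic enumerator: projects each key/value pair into an Entry.
    public IEnumerator<Entry> GetEnumerator() =>
        _texts.Select(kvp => new Entry(kvp.Key, kvp.Value)).GetEnumerator();

    // Non-generic enumerator, implemented explicitly and delegating to the
    // generic version so both paths yield the same items.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Implementing the non-generic method explicitly keeps it off the class's public surface, so callers using `foreach` or LINQ bind to the strongly typed version.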

src/Build5Nines.SharpVector/VectorStore/IVectorStore.cs

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ public interface IVectorStore<TId, TMetadata, TDocument>
     /// <param name="id"></param>
     /// <returns></returns>
     /// <exception cref="KeyNotFoundException"></exception>
-    VectorTextItem<TDocument, TMetadata> Get(TId id);
+    IVectorTextItem<TDocument, TMetadata> Get(TId id);

     /// <summary>
     /// Gets all the Ids for every text.
@@ -51,7 +51,7 @@ public interface IVectorStore<TId, TMetadata, TDocument>
     /// <param name="id"></param>
     /// <returns>The removed text item</returns>
     /// <exception cref="KeyNotFoundException"></exception>
-    VectorTextItem<TDocument, TMetadata> Delete(TId id);
+    IVectorTextItem<TDocument, TMetadata> Delete(TId id);

     /// <summary>
     /// Checks if the database contains a key

src/Build5Nines.SharpVector/VectorStore/MemoryDictionaryVectorStore.cs

Lines changed: 2 additions & 2 deletions
@@ -62,7 +62,7 @@ public async Task SetAsync(TId id, VectorTextItem<TDocument, TMetadata> item)
     /// <param name="id"></param>
     /// <returns></returns>
     /// <exception cref="KeyNotFoundException"></exception>
-    public VectorTextItem<TDocument, TMetadata> Get(TId id)
+    public IVectorTextItem<TDocument, TMetadata> Get(TId id)
     {
         if (_database.TryGetValue(id, out var entry))
         {
@@ -77,7 +77,7 @@ public VectorTextItem<TDocument, TMetadata> Get(TId id)
     /// <param name="id"></param>
     /// <returns>The removed text item</returns>
     /// <exception cref="KeyNotFoundException"></exception>
-    public VectorTextItem<TDocument, TMetadata> Delete(TId id)
+    public IVectorTextItem<TDocument, TMetadata> Delete(TId id)
     {
         if (_database.ContainsKey(id))
         {
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+namespace Build5Nines.SharpVector;
+
+public interface IVectorTextDatabaseItem<TId, TDocument, TMetadata>
+{
+    TId Id { get; }
+    TDocument Text { get; }
+    TMetadata? Metadata { get; }
+    float[] Vector { get; }
+}
+
+public class VectorTextDatabaseItem<TId, TDocument, TMetadata>
+    : IVectorTextDatabaseItem<TId, TDocument, TMetadata>
+{
+    public VectorTextDatabaseItem(TId id, TDocument text, TMetadata? metadata, float[] vector)
+    {
+        Id = id;
+        Text = text;
+        Metadata = metadata;
+        Vector = vector;
+    }
+
+    public TId Id { get; private set; }
+    public TDocument Text { get; private set; }
+    public TMetadata? Metadata { get; private set; }
+    public float[] Vector { get; private set; }
+}

src/SharpVectorTest/Preprocessing/BasicTextPreprocessorTests.cs

Lines changed: 3 additions & 1 deletion
@@ -18,8 +18,10 @@ public class VectorDatabaseTests
     public void TokenizeAndPreprocess_Null()
     {
         var preprocessor = new BasicTextPreprocessor();
+#pragma warning disable CS8625 // Cannot convert null literal to non-nullable reference type.
         var tokens = preprocessor.TokenizeAndPreprocess(null);
-
+#pragma warning restore CS8625 // Cannot convert null literal to non-nullable reference type.
+
         Assert.AreEqual(0, tokens.Count());
     }

src/SharpVectorTest/VectorDatabaseTests.cs

Lines changed: 22 additions & 0 deletions
@@ -1002,6 +1002,28 @@ public void EmbeddingGeneratorMemoryVectorDatabase_001()
         var db = new EmbeddingGeneratorMemoryVectorDatabase();
         db.AddText("Test string", "metadata");
     }
+
+
+    [TestMethod]
+    public void BasicMemoryVectorDatabase_LoopThroughAllTexts_01()
+    {
+        var vdb = new BasicMemoryVectorDatabase();
+
+        // Load Vector Database with some sample text
+        vdb.AddText("The 👑 King", "metadata1");
+        vdb.AddText("It's 🔥 Fire.", "metadata2");
+        vdb.AddText("No emoji", "metadata3");
+
+        foreach (var item in vdb)
+        {
+            var id = item.Id;
+            var text = item.Text;
+            var metadata = item.Metadata;
+            var vector = item.Vector;
+            //Console.WriteLine($"ID: {item.Id}, Text: {item.Text}, Metadata: {item.Metadata}");
+            vdb.UpdateText(item.Id, item.Text + " - Updated");
+        }
+    }
 }

 public class MockMemoryVectorDatabase
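
The new test calls `vdb.UpdateText` while still enumerating `vdb`. Whether that is safe depends on the backing collection, which this diff does not show; some .NET collections throw `InvalidOperationException` when modified during enumeration. A defensive alternative is to take a snapshot first and iterate over the copy. The sketch below illustrates this on a plain `Dictionary`; `SnapshotUpdateSketch` is a hypothetical helper, not part of the library.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SnapshotUpdateSketch
{
    // Appends a suffix to every value. ToList() copies the keys first, so the
    // loop iterates over the snapshot rather than the live dictionary.
    public static void AppendToAll(Dictionary<int, string> texts, string suffix)
    {
        foreach (var id in texts.Keys.ToList())
        {
            texts[id] = texts[id] + suffix;
        }
    }
}
```

Applied to the test above, the equivalent change would presumably be to iterate over `vdb.ToList()`, at the cost of one extra copy of the item list.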
