Feature/semantic similarity (#36)

* fix: mixed up migrations

* feat: semantic similarity first take

* feat: semantic similarity first round of fixes

* feat: more work on making semantic similarity work properly

* feat: getPostBySlug for the AI

* feat: show similarity in post-link-insert-modal

* chore: remove done doc

---------

Co-authored-by: hugo <hugoms@me.com>
Georg Bauer
2026-03-05 22:05:32 +01:00
committed by GitHub
parent 8ac8305e01
commit 7e1e8981a3
64 changed files with 6429 additions and 499 deletions

API.md

@@ -1,6 +1,6 @@
# API Documentation
Contract version: 1.11.0
Contract version: 1.12.0
This reference documents all Python runtime API calls available through `bds_api` in embedded Pyodide.
@@ -29,6 +29,7 @@ project = await bds.meta.get_project_metadata()
- [chat](#chat)
- [sync](#sync)
- [publish](#publish)
- [embeddings](#embeddings)
- [Data Structures](#data-structures)
## projects
@@ -325,6 +326,7 @@ None # or
- [posts.update](#postsupdate)
- [posts.delete](#postsdelete)
- [posts.get](#postsget)
- [posts.getBySlug](#postsgetbyslug)
- [posts.getPreviewUrl](#postsgetpreviewurl)
- [posts.getAll](#postsgetall)
- [posts.getByStatus](#postsgetbystatus)
@@ -500,6 +502,49 @@ None # or
}
```
### posts.getBySlug
Fetch one post by slug.
**Parameters**
- slug (str, required)
**Response specification**
- Return type: `PostData | null`
- Nullability: Returns `None` when no matching value exists.
- Data structures: `PostData`
**Example call**
```python
from bds_api import bds
result = await bds.posts.get_by_slug(slug='slug')
```
**Example response**
```python
None # or
{
'id': 'value',
'projectId': 'value',
'title': 'value',
'slug': 'value',
'excerpt': 'value',
'content': 'value',
'status': 'draft',
'author': 'value',
'language': 'value',
'createdAt': 'value',
'updatedAt': 'value',
'publishedAt': 'value',
'tags': 'value',
'categories': 'value'
}
```
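Because `posts.getBySlug` returns `None` for an unknown slug, callers generally need a guard before reading fields. The following is a minimal sketch, not part of the contract: `title_for_slug` is a hypothetical helper, and `FakePosts` is a hypothetical stand-in for `bds.posts` so the snippet runs outside Pyodide.

```python
import asyncio


class FakePosts:
    """Hypothetical stand-in for bds.posts, used only to make this sketch runnable."""

    def __init__(self, posts):
        self._by_slug = {post["slug"]: post for post in posts}

    async def get_by_slug(self, slug):
        # Mirrors the documented contract: a PostData dict, or None when nothing matches.
        return self._by_slug.get(slug)


async def title_for_slug(posts_api, slug, default="(not found)"):
    """Return the title of the post with this slug, or a default when no post matches."""
    post = await posts_api.get_by_slug(slug=slug)
    if post is None:
        return default
    return post["title"]


fake = FakePosts([{"id": "p1", "slug": "hello-world", "title": "Hello World"}])
found_title = asyncio.run(title_for_slug(fake, "hello-world"))
missing_title = asyncio.run(title_for_slug(fake, "no-such-slug"))
```

In the embedded runtime you would presumably pass `bds.posts` and `await` the helper directly instead of using `asyncio.run`.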
### posts.getPreviewUrl
Get preview URL for post.
@@ -2946,7 +2991,8 @@ result = await bds.meta.sync_on_startup()
'pythonRuntimeMode': 'webworker',
'picoTheme': 'value',
'categoryMetadata': {},
'categorySettings': {}
'categorySettings': {},
'semanticSimilarityEnabled': False
}
]
```
@@ -2988,7 +3034,8 @@ None # or
'pythonRuntimeMode': 'webworker',
'picoTheme': 'value',
'categoryMetadata': {},
'categorySettings': {}
'categorySettings': {},
'semanticSimilarityEnabled': False
}
```
@@ -3029,7 +3076,8 @@ None # or
'pythonRuntimeMode': 'webworker',
'picoTheme': 'value',
'categoryMetadata': {},
'categorySettings': {}
'categorySettings': {},
'semanticSimilarityEnabled': False
}
```
@@ -3070,7 +3118,8 @@ None # or
'pythonRuntimeMode': 'webworker',
'picoTheme': 'value',
'categoryMetadata': {},
'categorySettings': {}
'categorySettings': {},
'semanticSimilarityEnabled': False
}
```
@@ -3851,6 +3900,218 @@ result = await bds.publish.upload_site(credentials={})
[↑ Back to Table of contents](#table-of-contents)
## embeddings
**Module APIs**
- [embeddings.findSimilar](#embeddingsfindsimilar)
- [embeddings.computeSimilarities](#embeddingscomputesimilarities)
- [embeddings.getProgress](#embeddingsgetprogress)
- [embeddings.suggestTags](#embeddingssuggesttags)
- [embeddings.findDuplicates](#embeddingsfindduplicates)
- [embeddings.dismissPair](#embeddingsdismisspair)
- [embeddings.indexUnindexedPosts](#embeddingsindexunindexedposts)
### embeddings.findSimilar
Find posts semantically similar to the given post. Requires semantic similarity to be enabled in project settings.
**Parameters**
- postId (str, required)
- k (int | float, optional)
**Response specification**
- Return type: `SimilarPost[]`
- Data structures: `SimilarPost`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.find_similar(post_id='post-1')
```
**Example response**
```python
[
{
'postId': 'value',
'similarity': 0
}
]
```
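The `SimilarPost[]` result can be post-processed on the caller side. The sketch below works on plain data shaped like the documented response; `filter_similar`, its `min_similarity` floor, and the sample values are illustrative assumptions, not part of the API.

```python
def filter_similar(similar_posts, min_similarity=0.5, exclude_ids=()):
    """Keep SimilarPost entries at or above a similarity floor, best match first."""
    excluded = set(exclude_ids)
    kept = [
        entry
        for entry in similar_posts
        if entry["similarity"] >= min_similarity and entry["postId"] not in excluded
    ]
    # Sort explicitly rather than relying on the order returned by the API.
    return sorted(kept, key=lambda entry: entry["similarity"], reverse=True)


# Sample data shaped like a find_similar response (values are made up).
sample = [
    {"postId": "post-2", "similarity": 0.42},
    {"postId": "post-3", "similarity": 0.91},
    {"postId": "post-4", "similarity": 0.77},
]
strong_matches = filter_similar(sample, min_similarity=0.5)
```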
### embeddings.computeSimilarities
Compute cosine similarity between a source post and a list of target posts. Returns a mapping of target post IDs to similarity scores (0.0-1.0). Posts without embeddings are omitted.
**Parameters**
- sourcePostId (str, required)
- targetPostIds (list, required)
**Response specification**
- Return type: `Record<string, number>`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.compute_similarities(source_post_id='source_post-1', target_post_ids=[])
```
**Example response**
```python
{}
```
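To illustrate the shape of the result, here is a pure-Python sketch of cosine similarity over a small in-memory embedding table. It is an assumption about how such a mapping could be computed, not the runtime's implementation; the local `compute_similarities` function and the two-dimensional vectors are hypothetical. Cosine similarity can in general be negative, and how the API arrives at the documented 0.0-1.0 range is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # A zero vector has no direction; treat it as dissimilar.
    return dot / (norm_a * norm_b)


def compute_similarities(source_id, target_ids, embeddings):
    """Map each target ID to its similarity with the source, omitting posts without embeddings."""
    source = embeddings.get(source_id)
    if source is None:
        return {}
    return {
        target_id: cosine_similarity(source, embeddings[target_id])
        for target_id in target_ids
        if target_id in embeddings
    }


# Toy embeddings; "d" has none and is therefore omitted from the result.
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
scores = compute_similarities("a", ["b", "c", "d"], embeddings)
```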
### embeddings.getProgress
Get the embedding indexing progress for the active project.
**Parameters**
- None
**Response specification**
- Return type: `{ indexed: number; total: number }`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.get_progress()
```
**Example response**
```python
{
'indexed': 0,
'total': 0
}
```
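A progress value of the documented shape `{ indexed, total }` is often shown as a percentage. A minimal sketch follows; `progress_percent` is a hypothetical helper, and returning `0.0` for an empty project is an arbitrary choice made here to avoid dividing by zero.

```python
def progress_percent(progress):
    """Convert {'indexed': n, 'total': m} into a percentage rounded to one decimal."""
    total = progress.get("total", 0)
    if total <= 0:
        # Assumption: a project with no posts reports 0.0 rather than raising.
        return 0.0
    indexed = min(progress.get("indexed", 0), total)
    return round(100.0 * indexed / total, 1)


percent = progress_percent({"indexed": 42, "total": 120})
empty = progress_percent({"indexed": 0, "total": 0})
```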
### embeddings.suggestTags
Suggest tags for a post based on tags used by semantically similar posts.
**Parameters**
- postId (str, required)
- excludeTags (list, required)
**Response specification**
- Return type: `TagSuggestion[]`
- Data structures: `TagSuggestion`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.suggest_tags(post_id='post-1', exclude_tags=[])
```
**Example response**
```python
[
{
'name': 'value',
'score': 0
}
]
```
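The description above does not state how the score is aggregated. One plausible scheme, shown purely as an assumption, sums the similarity of every similar post that carries a tag and skips tags listed in `excludeTags`. The local `suggest_tags` function and the `tags_by_post` lookup are hypothetical.

```python
from collections import defaultdict


def suggest_tags(similar_posts, tags_by_post, exclude_tags=()):
    """Score each tag by the summed similarity of the similar posts that use it."""
    excluded = set(exclude_tags)
    scores = defaultdict(float)
    for entry in similar_posts:
        for tag in tags_by_post.get(entry["postId"], []):
            if tag not in excluded:
                scores[tag] += entry["similarity"]
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [{"name": name, "score": score} for name, score in ranked]


similar = [
    {"postId": "p2", "similarity": 0.5},
    {"postId": "p3", "similarity": 0.25},
]
tags_by_post = {"p2": ["python", "ai"], "p3": ["ai", "search"]}
suggestions = suggest_tags(similar, tags_by_post, exclude_tags=["python"])
```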
### embeddings.findDuplicates
Find post pairs with high content similarity (potential duplicates). Threshold is a similarity value from 0.0 to 1.0 (default 0.85).
**Parameters**
- threshold (int | float, optional)
**Response specification**
- Return type: `DuplicatePair[]`
- Data structures: `DuplicatePair`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.find_duplicates()
```
**Example response**
```python
[
{
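'postA': {
'id': 'value',
'title': 'value',
'slug': 'value',
'publishedAt': 'value'
},
'postB': {
'id': 'value',
'title': 'value',
'slug': 'value',
'publishedAt': 'value'
},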
'similarity': 0
}
]
```
### embeddings.dismissPair
Dismiss a duplicate pair so it no longer appears in results.
**Parameters**
- postIdA (str, required)
- postIdB (str, required)
**Response specification**
- Return type: `void`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.dismiss_pair(post_id_a='post_id_a-1', post_id_b='post_id_b-1')
```
**Example response**
```python
None
```
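`findDuplicates` and `dismissPair` are naturally used together in a review loop. The sketch below assumes the `DuplicatePair` shape from the Data Structures section and that `find_duplicates` accepts a `threshold` keyword argument. `FakeEmbeddings` is a hypothetical stand-in for `bds.embeddings`; its order-insensitive dismissal is an assumption, not documented behavior.

```python
import asyncio


class FakeEmbeddings:
    """Hypothetical stand-in for bds.embeddings so the sketch runs outside Pyodide."""

    def __init__(self, pairs):
        self._pairs = list(pairs)
        self.dismissed = set()

    async def find_duplicates(self, threshold=0.85):
        return [
            pair
            for pair in self._pairs
            if pair["similarity"] >= threshold
            and frozenset((pair["postA"]["id"], pair["postB"]["id"])) not in self.dismissed
        ]

    async def dismiss_pair(self, post_id_a, post_id_b):
        # Assumption: a dismissal applies regardless of argument order.
        self.dismissed.add(frozenset((post_id_a, post_id_b)))


async def dismiss_false_positives(embeddings_api, is_false_positive, threshold=0.85):
    """Dismiss every reported pair that the predicate flags; return how many were dismissed."""
    dismissed = 0
    for pair in await embeddings_api.find_duplicates(threshold=threshold):
        if is_false_positive(pair):
            await embeddings_api.dismiss_pair(
                post_id_a=pair["postA"]["id"], post_id_b=pair["postB"]["id"]
            )
            dismissed += 1
    return dismissed


def post_ref(post_id):
    return {"id": post_id, "title": post_id.upper(), "slug": post_id}


api = FakeEmbeddings([
    {"postA": post_ref("p1"), "postB": post_ref("p2"), "similarity": 0.97},
    {"postA": post_ref("p3"), "postB": post_ref("p4"), "similarity": 0.88},
])
# Illustrative rule only: treat anything below 0.9 as a false positive.
dismissed_count = asyncio.run(dismiss_false_positives(api, lambda p: p["similarity"] < 0.9))
remaining = asyncio.run(api.find_duplicates())
```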
### embeddings.indexUnindexedPosts
Trigger background indexing of all posts not yet embedded.
**Parameters**
- None
**Response specification**
- Return type: `void`
**Example call**
```python
from bds_api import bds
result = await bds.embeddings.index_unindexed_posts()
```
**Example response**
```python
None
```
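Since indexing is described as a background task, a caller may want to poll `getProgress` until it finishes. The sketch below assumes that `index_unindexed_posts` returns before indexing completes and that completion can be read as `indexed >= total`. `FakeIndexer`, `index_and_wait`, and the polling limits are hypothetical.

```python
import asyncio


class FakeIndexer:
    """Hypothetical stand-in for bds.embeddings that advances on every progress call."""

    def __init__(self, total, step):
        self.indexed = 0
        self.total = total
        self.step = step
        self.started = False

    async def index_unindexed_posts(self):
        self.started = True

    async def get_progress(self):
        if self.started:
            self.indexed = min(self.total, self.indexed + self.step)
        return {"indexed": self.indexed, "total": self.total}


async def index_and_wait(embeddings_api, poll_seconds=1.0, max_polls=60):
    """Trigger indexing, then poll until everything is indexed or max_polls is reached."""
    await embeddings_api.index_unindexed_posts()
    progress = await embeddings_api.get_progress()
    polls = 1
    while progress["indexed"] < progress["total"] and polls < max_polls:
        await asyncio.sleep(poll_seconds)
        progress = await embeddings_api.get_progress()
        polls += 1
    return progress


final_progress = asyncio.run(index_and_wait(FakeIndexer(total=5, step=2), poll_seconds=0))
capped_progress = asyncio.run(
    index_and_wait(FakeIndexer(total=100, step=1), poll_seconds=0, max_polls=3)
)
```

The `max_polls` cap keeps the loop from running forever if indexing stalls; the caller can inspect the returned progress to see whether it completed.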
[↑ Back to Table of contents](#table-of-contents)
## Data Structures
Shared structures referenced by response types are defined once here.
@@ -4021,6 +4282,7 @@ Extended project metadata from project settings.
- picoTheme (`string`, optional): Preferred Pico theme token.
- categoryMetadata (`object`, optional): Category metadata keyed by category slug.
- categorySettings (`object`, optional): Category render settings keyed by category slug.
- semanticSimilarityEnabled (`boolean`, optional): Enable local ONNX embedding-based semantic similarity features.
[↑ Back to Table of contents](#table-of-contents)
@@ -4113,6 +4375,40 @@ Result from AI image analysis containing generated title, alt text, and caption.
[↑ Back to Table of contents](#table-of-contents)
### SimilarPost
A post with its semantic similarity score relative to a reference post.
**Fields**
- postId (`string`, required): Post identifier.
- similarity (`number`, required): Cosine similarity score from 0.0 to 1.0.
[↑ Back to Table of contents](#table-of-contents)
### TagSuggestion
A tag suggested for a post, drawn from the tags of semantically similar posts.
**Fields**
- name (`string`, required): Tag name.
- score (`number`, required): Aggregated suggestion score.
[↑ Back to Table of contents](#table-of-contents)
### DuplicatePair
A pair of posts with high content similarity that may be duplicates.
**Fields**
- postA (`{ id: string; title: string; slug: string; publishedAt?: string }`, required): First post in the pair.
- postB (`{ id: string; title: string; slug: string; publishedAt?: string }`, required): Second post in the pair.
- similarity (`number`, required): Cosine similarity score from 0.0 to 1.0.
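For callers who want static type hints, the `SimilarPost`, `TagSuggestion`, and `DuplicatePair` structures can be mirrored as `TypedDict` classes. This is a sketch derived from the field lists in this section; the class names `PostRef` and `_PostRefRequired` are hypothetical, and `bds_api` is not assumed to export any of these types.

```python
from typing import TypedDict


class SimilarPost(TypedDict):
    postId: str
    similarity: float


class TagSuggestion(TypedDict):
    name: str
    score: float


class _PostRefRequired(TypedDict):
    id: str
    title: str
    slug: str


class PostRef(_PostRefRequired, total=False):
    # publishedAt is optional, so it lives in a total=False subclass.
    publishedAt: str


class DuplicatePair(TypedDict):
    postA: PostRef
    postB: PostRef
    similarity: float


# Made-up sample values; at runtime a TypedDict is an ordinary dict.
pair: DuplicatePair = {
    "postA": {"id": "p1", "title": "First", "slug": "first", "publishedAt": "2026-03-01"},
    "postB": {"id": "p2", "title": "Second", "slug": "second"},
    "similarity": 0.9,
}
```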
[↑ Back to Table of contents](#table-of-contents)
---
Generated from contract at 2026-02-27T00:00:00.000Z.
Generated from contract at 2026-03-05T00:00:00.000Z.