@ghukill commented Nov 7, 2025

This PR is merging the approved code from #19 into main. The previous PR had the wrong base branch.

Why these changes are being introduced:

Initially, the CLI command create-embeddings only supported reading input records
from the TIMDEX dataset via TDA. While this is likely how input records will
normally be supplied, supporting a JSONLines file as input is helpful for testing.

How this addresses that need:
* Adds a new --input-jsonl argument that reads a JSONLines file and uses
those rows as input for creating embeddings.
* Args --dataset-location and --run-id are required when --input-jsonl
is not set.
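
The argument rules above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: it uses the standard library's argparse, which may not be the framework the real CLI uses, and the helper names `build_parser`, `parse_args`, and `read_jsonl` are hypothetical. Only the flag names `--input-jsonl`, `--dataset-location`, and `--run-id` come from the description.

```python
import argparse
import json
from collections.abc import Iterator


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag names are taken from the PR description.
    parser = argparse.ArgumentParser(prog="create-embeddings")
    parser.add_argument("--input-jsonl", help="JSONLines file to use as input records")
    parser.add_argument("--dataset-location", help="location of the TIMDEX dataset")
    parser.add_argument("--run-id", help="run to read records from")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # The dataset arguments are only required when no JSONLines input is given.
    if args.input_jsonl is None:
        required = (
            ("--dataset-location", args.dataset_location),
            ("--run-id", args.run_id),
        )
        missing = [flag for flag, value in required if value is None]
        if missing:
            # parser.error prints a usage message and exits with status 2.
            parser.error(
                f"{', '.join(missing)} required when --input-jsonl is not set"
            )
    return args


def read_jsonl(path: str) -> Iterator[dict]:
    # Yield one parsed JSON object per non-empty line of the file.
    with open(path, encoding="utf-8") as jsonl_file:
        for line in jsonl_file:
            line = line.strip()
            if line:
                yield json.loads(line)
```

With this sketch, `parse_args(["--input-jsonl", "records.jsonl"])` succeeds without the dataset arguments, while `parse_args(["--run-id", "abc123"])` exits with a usage error because `--dataset-location` is missing.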

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/USE-137

@ghukill merged commit 0e295af into main Nov 7, 2025
4 checks passed