Skip to content

dstengle/knowledgebase-query-system

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

4 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

KB Query CLI - Natural Language SPARQL Interface

Python 3.8+ Tests Coverage

A powerful command-line interface that converts natural language queries into SPARQL using OWL ontologies. Built with the SPARC methodology (Specification, Pseudocode, Architecture, Refinement, Completion) for systematic development.

โœจ Features

  • ๐Ÿ—ฃ๏ธ Natural Language Queries: "meetings with John Smith", "todos assigned to Sarah"
  • ๐Ÿง  Automatic Pattern Generation: Creates query patterns from OWL ontology structure
  • โšก Interactive & Command Modes: Both REPL and single-command execution
  • ๐ŸŽฏ No SPARQL Knowledge Required: Users query in plain English
  • ๐Ÿ” Intelligent Suggestions: Provides corrections for failed queries
  • ๐Ÿ“Š Debug Mode: Shows pattern matching and SPARQL generation details
  • ๐ŸŒ SPARQL Endpoint Execution: Execute queries against any SPARQL endpoint
  • ๐Ÿ“‚ Named Graph Support: Query specific graphs or datasets
  • ๐Ÿ“‹ Multiple Output Formats: JSON, CSV, Table, Rich terminal, Turtle
  • ๐Ÿ” Authentication Support: Basic and Digest authentication for secured endpoints
  • ๐Ÿ—๏ธ Extensible Architecture: Clean separation of concerns, fully tested

๐Ÿš€ Quick Start

Installation

# Install directly from source
pip install -e .

# Or install required dependencies
pip install rdflib click

Basic Usage

# Query with an ontology file
kb-query --ontology data/kb-vocabulary.ttl "meetings with John Smith"

# Show generated SPARQL
kb-query --ontology data/kb-vocabulary.ttl --show-sparql "todos with Sarah"

# Execute query against SPARQL endpoint
kb-query --ontology data/kb-vocabulary.ttl -x -e http://localhost:3030/kb/sparql "meetings with John"

# Query specific named graphs
kb-query -o data/kb-vocabulary.ttl -x -e http://localhost:3030/dataset/sparql \
  -g http://example.org/graph1 -g http://example.org/graph2 "meetings"

# Query with default graph
kb-query -o data/kb-vocabulary.ttl -x -e http://localhost:3030/dataset/sparql \
  --default-graph http://example.org/default "todos"

# Test endpoint connection
kb-query --test-connection -e http://dbpedia.org/sparql

# Different output formats
kb-query -o data/kb-vocabulary.ttl -x -e http://localhost:3030/kb/sparql -f json "meetings"
kb-query -o data/kb-vocabulary.ttl -x -e http://localhost:3030/kb/sparql -f csv "todos"

# Interactive mode
kb-query --ontology data/kb-vocabulary.ttl --interactive

# List available query patterns
kb-query --ontology data/kb-vocabulary.ttl --list-patterns

Example Session

$ kb-query --ontology data/kb-vocabulary.ttl --interactive --show-sparql
KB Query Interactive Mode
Type 'help' for commands, 'exit' to quit
----------------------------------------
kb-query: meetings with John Smith
โœ“ Query understood

Generated SPARQL:
----------------------------------------
SELECT ?meeting ?person_name
WHERE {
  ?meeting <http://example.org/kb#hasAttendee> ?person .
  ?person <http://xmlns.com/foaf/0.1/name> ?person_name .
  FILTER (lcase(str(?person_name)) = lcase("John Smith"))
}
----------------------------------------

Execution time: 0.001s
kb-query: exit
Goodbye!

Example with Graph Specification

$ kb-query -o data/kb-vocabulary.ttl --show-sparql -g http://example.org/mydata "meetings"
โœ“ Query understood

Generated SPARQL:
----------------------------------------
SELECT ?meeting ?person_name
WHERE {
  GRAPH <http://example.org/mydata> {
    ?meeting <http://example.org/kb#hasAttendee> ?person .
    ?person <http://xmlns.com/foaf/0.1/name> ?person_name .
  }
}
----------------------------------------

๐Ÿ“‹ Supported Query Patterns

The system automatically generates patterns from your OWL ontology. For example, with the included sample ontology:

Query Pattern Example
meetings with {person} "meetings with John Smith"
{person}'s meetings "John's meetings"
todos with {person} "todos with Sarah"
{person}'s todos "Sarah's todos"
notes with {person} "notes with John"
meetings tagged {tag} "meetings tagged project"

๐Ÿ—๏ธ Architecture

Built with a clean 4-layer architecture:

  • CLI Layer: Interactive REPL + Command-line interface
  • Service Layer: Query orchestration and business logic
  • Core Layer: Grammar engine, pattern matching, SPARQL building
  • Infrastructure Layer: OWL parsing, caching, HTTP clients

Key Components

  • Grammar Engine: Parses OWL ontologies and generates natural language patterns
  • Pattern Matcher: Fuzzy matching of user queries to patterns with entity extraction
  • SPARQL Builder: Template-based SPARQL generation with injection prevention
  • Query Service: Orchestrates the complete query processing pipeline

๐Ÿงช Testing

The project follows Test-Driven Development with comprehensive test coverage:

# Run all tests
python -m pytest

# Run with coverage
python -m pytest --cov=kb_query

# Run specific test categories
python -m pytest tests/unit/        # Unit tests
python -m pytest tests/integration/ # Integration tests
python -m pytest tests/e2e/         # End-to-end tests

Test Statistics:

  • 17 unit tests (100% passing)
  • 79% code coverage (target: 80%+)
  • TDD approach throughout development

๐Ÿ“ Project Structure

kb_query/
โ”œโ”€โ”€ cli/                 # Command-line interface
โ”‚   โ””โ”€โ”€ main.py         # Main CLI entry point
โ”œโ”€โ”€ core/               # Core business logic
โ”‚   โ”œโ”€โ”€ entities.py     # Data structures
โ”‚   โ”œโ”€โ”€ grammar.py      # OWL parsing & pattern generation
โ”‚   โ”œโ”€โ”€ matcher.py      # Query pattern matching
โ”‚   โ””โ”€โ”€ builder.py      # SPARQL query building
โ”œโ”€โ”€ services/           # Service orchestration
โ”‚   โ””โ”€โ”€ query.py        # Main query service
โ””โ”€โ”€ exceptions.py       # Exception hierarchy

tests/
โ”œโ”€โ”€ unit/               # Unit tests
โ”œโ”€โ”€ integration/        # Integration tests
โ””โ”€โ”€ e2e/               # End-to-end tests

docs/
โ”œโ”€โ”€ adr/               # Architecture Decision Records
โ”œโ”€โ”€ sparc/             # SPARC methodology documentation
โ”œโ”€โ”€ architecture.md    # System architecture
โ”œโ”€โ”€ api-reference.md   # API documentation
โ””โ”€โ”€ examples.md        # Usage examples

๐Ÿ”ง Development

SPARC Methodology

This project was developed using the SPARC methodology:

  1. โœ… Specification: Requirements, user stories, success criteria
  2. โœ… Pseudocode: High-level algorithmic design
  3. โœ… Architecture: System design, interfaces, diagrams
  4. โœ… Refinement: TDD implementation, iterative improvement
  5. โœ… Completion: Integration, testing, documentation

Running Tests

# Install development dependencies
pip install -e .[dev]

# Run tests with coverage
python -m pytest --cov=kb_query --cov-report=html

# Run specific component tests
python -m pytest tests/unit/test_grammar_engine.py -v
python -m pytest tests/unit/test_pattern_matcher.py -v

Adding New Ontologies

  1. Create your OWL ontology file in Turtle format
  2. Ensure proper class and property definitions with domains/ranges
  3. Use the CLI to see generated patterns: kb-query --ontology your-file.ttl --list-patterns

Example Ontology Structure

@prefix kb: <http://example.org/kb#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Classes
kb:Meeting a owl:Class .
kb:Person a owl:Class .

# Properties (generates patterns automatically)
kb:hasAttendee a owl:ObjectProperty ;
    rdfs:domain kb:Meeting ;
    rdfs:range foaf:Person .

๐Ÿ“š Documentation

๐Ÿค Contributing

  1. Follow TDD: Write tests first, then implement
  2. Update Documentation: Keep docs current with changes
  3. Use Type Hints: Full type annotations required
  4. Run Tests: Ensure all tests pass before submitting
# Development setup
git clone <repository>
cd knowledgebase-query-system
pip install -e .[dev]

# Make changes, add tests
python -m pytest  # Ensure tests pass

๐Ÿ† Key Achievements

  • Systematic Development: Complete SPARC methodology application
  • High Test Coverage: 79% coverage with 17 comprehensive tests
  • Production Ready: Clean architecture with proper error handling
  • User Friendly: Natural language interface requires no SPARQL knowledge
  • Extensible: Easy to add new ontologies and query patterns

๐Ÿ”ฎ Future Enhancements

  • Configuration Profiles: Multiple endpoint support with authentication
  • Result Export: CSV, JSON output formats
  • Query History: Persistent query history and favorites
  • Advanced Patterns: Support for aggregation and complex queries
  • Web Interface: Browser-based query interface
  • Performance Optimization: Caching and query optimization

๐Ÿ“„ License

MIT License - see LICENSE file for details.

๐Ÿ™ Acknowledgments

Built using modern Python best practices:

  • rdflib for OWL/RDF processing
  • click for CLI interface
  • pytest for testing framework
  • SPARC methodology for systematic development

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages