Release Notes | AllegroGraph 8.4.0

Introduction

This document lists all changes in the current release, which is 8.4.0.

See the release-notes document for links to release notes for some previous releases.

Change History repeats the list of user-visible modifications for this release and includes similar lists for all earlier releases.

Release 8.4.0

This is a minor release with significant new features, as well as other improvements and bug fixes.

The new features include:

Enhanced AI-powered Natural Language Query interface with collaborative features

AllegroGraph's Natural Language Query interface allows users to ask questions in natural language and automatically converts them into SPARQL queries for precise knowledge graph interrogation. This AI-powered capability depends on a vector database containing query examples that help the system learn and improve over time. With this feature, you have built-in GraphRAG capabilities for your bot-like AI applications.

In this release, we've enhanced the collaborative workflow around these Natural Language Query examples with new metadata tracking:

Author: Records who initially created the query example
Editor: Tracks who last modified the query example
Creation time: Timestamps when the query was first created
Edit time: Records when the query was last modified

These fields provide essential context about the history and ownership of query examples, making it easier for teams to collaboratively improve their AI question-answering capabilities. For existing query examples, these fields start empty and populate on first edit.

Additionally, a new tabular view option has been introduced that provides a more structured presentation of query metadata, making it easier to sort, filter, and compare query examples at a glance. This enhancement streamlines the process of maintaining high-quality training examples that drive improved natural language understanding.

Seamless Databricks integration with AllegroGraph.

AllegroGraph now seamlessly integrates with Databricks to facilitate data transfer from Databricks into your Knowledge Graph. This integration allows you to:

Import CSV files directly from Databricks File System (DBFS)
Generate and download CSV files from SQL statements executed in Databricks
Import entire SQL database tables from Databricks
Process previously downloaded CSV files from Databricks

This integration positions AllegroGraph as a powerful metadata AI agent that can analyze and contextualize your Databricks data. For example, organizations can use AllegroGraph to build a knowledge graph from their Databricks data lake that:

Enriches raw data with semantic relationships and domain-specific ontologies
Provides more accurate and contextually relevant responses to natural language queries
Discovers hidden connections across disparate datasets through graph analytics
Acts as an intelligent metadata layer that enhances AI applications with domain knowledge from your enterprise data

The integration is available through the WebView interface and provides an intuitive workflow:

Connect to your Databricks environment using OAuth credentials
Select your data source (files, SQL queries, or tables)
Download the data to AllegroGraph
Define transformation rules to convert rows into triples

This feature requires an Enterprise Databricks account and the Databricks CLI to be available on the AllegroGraph server. Configuration options in the AllegroGraph configuration file allow you to specify workspace folders and volumes for data exchange. For complete details, see the Databricks Integration documentation.

Dynamic updates to AllegroGraph Vector store clusters.

Once a vector store has been clustered (agtool llm cluster), new objects may still be added to it. To be found by a clustering-based search, those new objects must be added to the existing clusters. The new capability (agtool llm cluster --update) does just that.
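
As a rough illustration of what adding new objects to existing clusters involves, the sketch below assigns each new embedding to the nearest existing centroid. It is not AllegroGraph's implementation: the nearest-centroid rule, the function names, and the data layout are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence


def nearest_cluster(vector: Sequence[float], centroids: List[Sequence[float]]) -> int:
    """Return the index of the centroid closest to vector (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))


def add_to_clusters(
    new_objects: Dict[str, Sequence[float]],
    centroids: List[Sequence[float]],
    members: Dict[int, List[str]],
) -> Dict[int, List[str]]:
    """Assign each new object to an existing cluster without re-clustering.

    members maps a cluster index to the ids of the objects already in it.
    A new mapping is returned; the input mapping is left untouched.
    """
    updated = {index: list(ids) for index, ids in members.items()}
    for object_id, vector in new_objects.items():
        updated.setdefault(nearest_cluster(vector, centroids), []).append(object_id)
    return updated


# Hypothetical data: two existing clusters and two newly added objects.
centroids = [(0.0, 0.0), (10.0, 10.0)]
members = {0: ["a"], 1: ["b"]}
result = add_to_clusters({"c": (9.0, 8.5), "d": (0.5, -1.0)}, centroids, members)
```

The point of the sketch is that an update only has to place the new objects; the existing cluster assignments are not recomputed.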

Verify Only file validation saves time by catching formatting issues before loading.

A new validation option has been added across all data import interfaces that allows you to verify file syntax without loading any triples:

The agtool load command now supports a --parse-only flag for validation-only operations
WebView includes a new VERIFY ONLY button on import pages to validate files without loading them

This feature is especially useful for validating large or complex files before committing to the import process, helping to identify formatting issues early without the need to roll back transactions.

A list of all significant user-visible changes is below.

Database internal format is unchanged since release 6.4.6

Release 6.4.6 had a different internal data format. All releases since then have the same internal format. Any archived repository or set of repositories (see Repository Backup and Restore) from a version later than 6.4.6 may be loaded into this version 8.4.0. Note that backing up repositories before loading them into a new version is always recommended. Once a database has been used in a later release, it cannot be used in an earlier release, regardless of whether the database format is the same. Upgrading is discussed in the Repository Upgrading document.

Version 8.4.0 Admin notes

There are no admin notes for version 8.4.0.

Version 8.4.0 Programmer notes

There are no programmer notes for release 8.4.0.

User-visible changes version 8.4.0

AllegroGraph Server: General Changes

Extend Freetext indexes configuration to improve search performance.

AllegroGraph's freetext indexing capability has been significantly enhanced with new filtering options that provide more precise control over which triples are included in an index:

Graph-based filtering - Restrict freetext indexes to include only triples from specific graphs, allowing you to create specialized search indexes for different data partitions or contexts
Type-based filtering - Limit indexed content to only those triples whose subjects are of specific RDF types, enabling more focused domain-specific search capabilities

These new configuration options complement the existing ability to filter by predicates, literal types, and URI components. With these enhancements, you can create more granular and targeted freetext indexes that:

Improve search performance by reducing index size
Provide more relevant search results by focusing on specific data subsets
Reduce memory consumption by avoiding indexing unnecessary content

The new filtering options are available through all interfaces for managing freetext indices: WebView, Lisp API, and the new agtool freetext command line tools. For WebView UI, please read the Create Freetext Index section.
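
To make the effect of the two new filters concrete, here is a small sketch that selects which quads would remain candidates for indexing. It only models the idea described above; the Quad type, the select_for_index function, and the choice to honor rdf:type statements from any graph are assumptions of this example, not AllegroGraph behavior or API.

```python
from typing import Iterable, List, NamedTuple, Optional, Set

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


def select_for_index(
    quads: Iterable[Quad],
    graphs: Optional[Set[str]] = None,
    types: Optional[Set[str]] = None,
) -> List[Quad]:
    """Keep only quads that pass the graph filter and the subject-type filter.

    A filter set to None is treated as "no restriction".
    """
    quads = list(quads)
    typed_subjects = None
    if types is not None:
        # Subjects that have at least one rdf:type among the requested types.
        typed_subjects = {
            q.subject for q in quads if q.predicate == RDF_TYPE and q.obj in types
        }
    selected = []
    for q in quads:
        if graphs is not None and q.graph not in graphs:
            continue
        if typed_subjects is not None and q.subject not in typed_subjects:
            continue
        selected.append(q)
    return selected


# Hypothetical data set with two graphs and two subject types.
data = [
    Quad("ex:p1", RDF_TYPE, "ex:Person", "ex:g1"),
    Quad("ex:p1", "ex:bio", "Graph database engineer", "ex:g1"),
    Quad("ex:d1", RDF_TYPE, "ex:Document", "ex:g1"),
    Quad("ex:d1", "ex:text", "Quarterly report", "ex:g1"),
    Quad("ex:p1", "ex:note", "Archived note", "ex:g2"),
]
people_in_g1 = select_for_index(data, graphs={"ex:g1"}, types={"ex:Person"})
```

With both filters set, only the two ex:p1 quads in ex:g1 remain, which is how a narrower index ends up smaller than one built over the whole repository.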

Include all literals in the embedding process

Before this release, the embedding process worked only for string literals. Now it includes all literals (numbers, dates, etc.).

Override an api key for vector databases

A vector database usually contains the API key used to get authorization from an LLM server. If you wish to use a different API key for a particular LLM command, you can now override the key found in the vector database.

This affects the vector-store-add-object and vector-store-nearest-neighbor functions (in Lisp). The agtool llm index command also has a new optional --api-key API_KEY argument.

Non-standard load-transform `prefix`es are expanded

Load-transform's prefix rules (--prefix NS or --prefix key=NS) used to only expand namespace abbreviation NS if it was one of the standard namespace abbreviations, otherwise an error of the form

Unable to resolve UPI because there is no namespace mapping for prefix "NS"

used to be signaled. This has been fixed and load-transform can now expand any user-defined namespace abbreviations.

Add `http://www.w3.org/ns/shacl` namespace as `sh` abbreviation

The sh abbreviation is now a default (global) namespace abbreviation for the SHACL namespace, following section 1.2 (Document Conventions) of the W3C SHACL specification: sh: http://www.w3.org/ns/shacl#.

Extend OAuth configuration directive

AllegroGraph 8.4.0 enhances the OAuth/OIDC authentication support with new security options for token type validation:

explicit-id-token-type - Allows explicit checking of ID Token types to prevent token misuse
explicit-logout-token-type - Controls validation of Logout Token types to prevent token misuse

These new options provide more granular security control over token validation. By default, ID tokens are only checked to be different from Logout token types, while Logout tokens are checked to match the "logout+jwt" type exactly. This prevents potential token misuse while maintaining compatibility with most identity providers. Both checks can be fully disabled if needed for specific integration scenarios.

These enhancements make AllegroGraph more secure and compatible with modern Single Sign-On systems. For complete details, see the OAuth configuration documentation.
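
The default checks described above can be summarized in a few lines. This is only a sketch of the stated rules applied to the typ header of a JWT; the function names and the check_enabled parameter are hypothetical and do not correspond to AllegroGraph code or to the exact syntax of the configuration directives.

```python
from typing import Optional

LOGOUT_TOKEN_TYPE = "logout+jwt"


def id_token_type_ok(typ: Optional[str], check_enabled: bool = True) -> bool:
    """Default ID token rule: the type only has to differ from the logout type."""
    if not check_enabled:
        return True
    return typ != LOGOUT_TOKEN_TYPE


def logout_token_type_ok(typ: Optional[str], check_enabled: bool = True) -> bool:
    """Default logout token rule: the type has to match "logout+jwt" exactly."""
    if not check_enabled:
        return True
    return typ == LOGOUT_TOKEN_TYPE


# A logout token presented where an ID token is expected is rejected,
# while an ID token with a generic or missing type is accepted.
checks = [
    id_token_type_ok("JWT"),
    id_token_type_ok(None),
    id_token_type_ok("logout+jwt"),
    logout_token_type_ok("logout+jwt"),
    logout_token_type_ok("JWT"),
]
```

The asymmetry is deliberate in the description above: the ID token rule stays loose for compatibility with identity providers, while the logout token rule is strict.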

HTTP Client

New endpoints to register/unregister a vector database with Natural Language Query examples

There are two new endpoints:

PUT /catalogs/<catalog-name>/repositories/<repository-name>/nlq-vdb
DELETE /catalogs/<catalog-name>/repositories/<repository-name>/nlq-vdb

The first endpoint registers a vector database as a database with natural language query examples, and the second one unregisters it.
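
A minimal sketch of how a client might address these two endpoints, using only the Python standard library. The base URL, catalog name, and repository name are placeholders, and the helper names are made up for this example. Authentication and any request parameters the server may expect are not shown, because they are not described here. The requests are only constructed, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def nlq_vdb_url(base_url: str, catalog: str, repository: str) -> str:
    """Build the nlq-vdb endpoint URL for a repository in a catalog."""
    return "{}/catalogs/{}/repositories/{}/nlq-vdb".format(
        base_url.rstrip("/"), quote(catalog, safe=""), quote(repository, safe="")
    )


def register_request(base_url: str, catalog: str, repository: str) -> Request:
    """PUT registers the vector database for Natural Language Query examples."""
    return Request(nlq_vdb_url(base_url, catalog, repository), method="PUT")


def unregister_request(base_url: str, catalog: str, repository: str) -> Request:
    """DELETE unregisters it again."""
    return Request(nlq_vdb_url(base_url, catalog, repository), method="DELETE")


# Placeholder server address and names; pass the request to
# urllib.request.urlopen (with credentials) to actually send it.
put_req = register_request("http://localhost:10035/", "demo", "kg")
del_req = unregister_request("http://localhost:10035/", "demo", "kg")
```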

Update /sna/paths documentation

The generator parameter of the GET /catalogs/<catalog-name>/repositories/<repository-name>/sna/paths API is only useful in a direct client session. Therefore, it has been removed from the HTTP endpoint.

SPARQL

Resolved issues with parsing escaped quotes inside SPARQL literals

The SPARQL lexer used to produce incorrect tokens when an escaped quote character followed several escaped `\` characters in a literal enclosed in the same kind of quotes. For example, the expressions

# Search for \" sequence.  
FILTER (regex(str(?o), "\\\\\""))  
 
# Search for \'\ sequence.  
FILTER (regex(str(?o), '\\\\\'\\\\'))

would cause errors like the following

MALFORMED QUERY: Line 3, Invalid literal "\"\\\\\\\\\\\"".  
Note that single backslashes in literals must be escaped.  
I.e., use '\\x', not '\x'

This has been fixed.

Fixed GeoSPARQL geohash magic predicates with unbound variables

All GeoSPARQL magic predicates now work properly when both the subject and the object are variables, for example:

PREFIX my: <http://example.org/ApplicationSchema#>  
PREFIX geo: <http://www.opengis.net/ont/geosparql#>  
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>  
 
SELECT DISTINCT ?f ?x  
WHERE {  
   ?f my:hasPointGeometry ?fGeom .  
   ?x my:hasExactGeometry ?xGeom .  
   ?fGeom geo:sfWithin ?xGeom .  
   FILTER (?f != ?x)  
}

New boolean rewriteJoinsWithUnions query option

When this option is yes, join-union query algebra constructions like these

join(A, union(B1, B2, ... BN))

will be rewritten as

union(join(A, B1), join(A, B2), ... join(A, BN))

which may improve performance significantly in some cases. This rewrite is on by default. It can also worsen performance when A is expensive, because A is then evaluated N times, so the main use of the query option is to turn the rewrite off in such cases.
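
The rewrite can be illustrated on a toy algebra tree. The sketch below represents expressions as nested tuples and applies the distribution rule shown above; the representation and the function names are assumptions of this example and say nothing about how the AllegroGraph query planner is implemented.

```python
def rewrite_join_union(expr):
    """Rewrite join(A, union(B1, ..., BN)) into union(join(A, B1), ..., join(A, BN)).

    Expressions are strings (leaf patterns) or tuples such as
    ("join", left, right) and ("union", b1, b2, ...).
    """
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [rewrite_join_union(arg) for arg in args]
    if op == "join" and len(args) == 2:
        left, right = args
        if isinstance(right, tuple) and right[0] == "union":
            # Distribute the join over every branch of the union.
            return ("union", *[("join", left, branch) for branch in right[1:]])
    return (op, *args)


def count_leaf(expr, name):
    """Count how many times a leaf pattern occurs in an expression."""
    if isinstance(expr, str):
        return 1 if expr == name else 0
    return sum(count_leaf(arg, name) for arg in expr[1:])


before = ("join", "A", ("union", "B1", "B2", "B3"))
after = rewrite_join_union(before)
```

Counting the occurrences of A before and after the rewrite (one versus three in this example) shows where the extra cost comes from when A is expensive.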

Fix errors when creating constraints with constant args

The following query caused an error due to constraints with constants:

 select ?___ {  
   values ( ?x ) { ( 2 ) }  
   bind( IF(?x = 2, 'yes', 'no') as ?___ )  
 }

Now the bug is fixed.

AGWebView

Load CSV file in WebView

The agtool load CLI command can already transform CSV rows into triples and load them. In this release, WebView adds a user interface for loading CSV files, making it easier to import CSV data directly through the web interface. The new UI provides:

An intuitive form for configuring how CSV data is converted to RDF triples
Preview of CSV sample data
Validation to ensure proper mapping configuration
The ability to save and reuse mapping configurations

Additionally, the WebView interface includes a PRINT AGTOOL COMMANDS button that produces the equivalent agtool load commands for the current configuration. This feature helps users with automation needs by allowing them to experiment with settings in the UI first, then easily transfer those settings to scripts or batch processes. For more details, please see the Load CSV file section.

Introduce UI to define expected store size on creating a store

WebView now provides a user interface for configuring the expected store size when creating a new repository. This setting doesn't impose a hard limit on the repository's size but rather serves as a performance optimization hint to AllegroGraph. Setting an appropriate expected store size helps AllegroGraph achieve optimal performance for both reading and inserting data. The tradeoff is shared memory usage - increasing the expected store size leads to more shared memory being allocated when opening a connection to the repository. This UI makes it easier to configure this important performance parameter without needing to modify configuration files directly.

Enable "External references" import options by default in WebView

This option is used for RDF/XML and JSON-LD input formats and determines whether external links should be followed or not. Enabling this option by default avoids import errors during loading, so we decided to relax the import options.

Add confirmation dialog on enabling or disabling Superuser permission for a user in WebView

Since Superuser permission is a significant change for a user, we decided to add a confirmation dialog to avoid accidental clicks.

Guard against accidental double load on importing pages in WebView

When you attempt to upload a file that appears to be a duplicate (by clicking the VERIFY AND IMPORT FILE button a second time for the same file), a confirmation dialog is now displayed to either cancel the operation or confirm that you actually want to load the same file again. This simple safeguard helps prevent unintended duplicate data while still preserving the flexibility to load the same file multiple times when needed.

Optimize WebView UI performance of Natural Language Queries configuration pages

As mentioned in the first release note, there is a Natural Language Query feature that includes several configuration pages to improve the quality of generated SPARQL queries. These pages cover viewing and editing of SHACL shapes and natural language query examples. In this release, we optimized performance to support managing thousands of SHACL shapes or query examples.

Fix the error on sign-in in WebView

Previously, if the WebView cookie did not have a value, an error was thrown that prevented new authentication. This issue has now been resolved.

Add hover effect for rows in all tables

This enhancement improves navigation over long rows by adding a hover effect to table rows.

Fix downloading SPARQL results in NL to SPARQL editor

There was an error when users downloaded SPARQL results after clicking the RUN SPARQL button. This issue has been fixed in this release.

Improve SPARQL editor autocomplete suggestions on large repositories

Instead of using general SPARQL queries to get autocomplete suggestions based on user input, a dedicated HTTP endpoint is now used. This endpoint has better time and memory performance since it is specifically designed to gather suggestions.

LLM support

No significant changes.

Ollama support

No significant changes.

agtool

New tools for copying a vector store into another store, vector store or not.

A new agtool command (see the agtool command description) and a new Lisp function (see vector-store-import) copy a vector store into another store. This is a good way to garbage collect a vector store and shrink the vectors file. It also puts the vector data into a knowledge repository, so that interesting selectors can be applied to it.

New arguments to agtool repo creation commands

Both agtool repos create and agtool load (when it creates the new repository into which the data is loaded) now accept these arguments:

[--expected-store-size INTEGER]  [--string-table-size INTEGER]

They allow you to specify the expected sizes of the repository, overriding what is found in agraph.cfg. Note that while an abbreviation like 512M is accepted in agraph.cfg, the value of the parameter here must be an integer like 512000000.
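
Since the command line requires plain integers, a script that reuses values written in the agraph.cfg style needs to expand them first. The helper below is a sketch under stated assumptions: it treats the suffixes as decimal multipliers (as the 512M and 512000000 pair above suggests), and the K and G suffixes are included as a guess. The function names are hypothetical; only the two option names come from this release note.

```python
from typing import List, Optional

# Assumed decimal multipliers; only "M" is suggested by the example above.
_MULTIPLIERS = {"K": 10**3, "M": 10**6, "G": 10**9}


def to_plain_integer(value: str) -> int:
    """Expand an abbreviated size such as "512M" into a plain integer."""
    value = value.strip()
    suffix = value[-1:].upper()
    if suffix in _MULTIPLIERS:
        return int(value[:-1]) * _MULTIPLIERS[suffix]
    return int(value)


def size_options(
    expected_store_size: Optional[str] = None,
    string_table_size: Optional[str] = None,
) -> List[str]:
    """Build the option fragment to append to a repository creation command."""
    options: List[str] = []
    if expected_store_size is not None:
        options += ["--expected-store-size", str(to_plain_integer(expected_store_size))]
    if string_table_size is not None:
        options += ["--string-table-size", str(to_plain_integer(string_table_size))]
    return options


fragment = size_options(expected_store_size="512M", string_table_size="64000000")
```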

Remove a vector store from inside a repository

The agtool repos delete-vector-store command removes a vector store from inside a repository.

Seamless Namespace Sharing Between Gruff and WebView

Now, any namespaces you create in WebView can be used and edited in Gruff, and vice versa. All Gruff namespaces are marked as "User" type because they are specific to each AllegroGraph user. This makes it easier to manage and utilize your namespaces across both tools.

HTTP Client

AG-1557 - Add support for x-binary-rdf-results-table

SPARQL select queries made via the REST API now support an Accept header of "application/x-binary-rdf-results-table" to improve interoperability with RDF4J.
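
As an illustration, a client outside RDF4J could request this format by setting the Accept header explicitly. The sketch below only constructs the request with the Python standard library. The server address, catalog, and repository names are placeholders, the helper name is made up, and authentication is omitted.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BINARY_RESULTS = "application/x-binary-rdf-results-table"


def binary_select_request(base_url: str, catalog: str, repository: str, query: str) -> Request:
    """Build a GET request for a SPARQL select query asking for binary results."""
    url = "{}/catalogs/{}/repositories/{}?{}".format(
        base_url.rstrip("/"),
        quote(catalog, safe=""),
        quote(repository, safe=""),
        urlencode({"query": query}),
    )
    return Request(url, headers={"Accept": BINARY_RESULTS}, method="GET")


# Placeholder address and names; the response body would be binary data
# that an RDF4J-compatible parser can decode.
req = binary_select_request(
    "http://localhost:10035", "demo", "kg", "SELECT * { ?s ?p ?o } LIMIT 1"
)
```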

SPARQL

AG-1372 - exporting a repo with geospatial UPIs using multiple workers

The agtool export command can be told to use multiple workers, which greatly speeds up the export process. However, using multiple workers would fail if there were geospatial UPIs in the repository. With this change, multiple workers can be used when there are geospatial UPIs, as long as --blank-node-handling is set to together.

There was already code to put all triples with blank nodes into one file. This change puts all triples with geospatial UPIs in that one file as well. Then multiple workers will be used to export all other triples in the repo.

AGWebView

No significant changes.

LLM support

No significant changes.

Ollama support

No significant changes.

agtool

AG-1454 - `agtool` commands for managing and querying freetext indices

The agtool freetext command tree has been added. Among other things, it supports exporting and importing text index configurations in lineparse format to facilitate transferring more complex text indices to different repositories.

Extend and improve template language for non-RDF data import.

The template language used for transforming non-RDF data (CSV, JSON) into RDF triples has been significantly enhanced with a more powerful expression system. The new capabilities include:

Function composition - Apply multiple transformations in sequence using Lisp-style s-expressions
Enhanced string manipulation - New functions for string operations, including split and join
DateTime normalization - The datetime function converts various date/time formats to W3C-DTF standard format
List processing - Generate multiple triples from a single data field by splitting values and processing each element
Field references with complex names - Support for referencing fields containing spaces or special characters using pipe (|) delimiters

These improvements make it much easier to transform complex source data into well-structured RDF. For example, you can now split a semicolon-separated list of dates in a single field, normalize each date, and create a separate triple for each one:

--tr-transform 'visit=${(datetime (split visits ";"))}'

The enhanced template language is available in all interfaces that support transform rules: agtool load, WebView's CSV import, and the Lisp API. For complete details, see the Using complex load-transform templates documentation.
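
To show what the split-then-normalize example above produces conceptually, here is a Python sketch of the same data flow: one field value is split on semicolons and each piece is normalized to a W3C-DTF date. It does not reproduce the template language itself. The accepted input formats, the field name, and the function names are assumptions of this example.

```python
from datetime import datetime
from typing import Dict, List

# Input formats assumed for this sketch; the formats actually accepted by
# the datetime template function are described in the product documentation.
_ASSUMED_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(text: str) -> str:
    """Convert a date string in one of the assumed formats to YYYY-MM-DD."""
    text = text.strip()
    for fmt in _ASSUMED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError("unrecognized date: {!r}".format(text))


def visit_values(row: Dict[str, str], field: str = "visits", sep: str = ";") -> List[str]:
    """Split one field into parts and normalize each part.

    Each returned value would become the object of its own triple.
    """
    return [normalize_date(part) for part in row[field].split(sep) if part.strip()]


row = {"id": "42", "visits": "2024-01-05; 02/10/2024;20240315"}
values = visit_values(row)
```

A single input row thus yields three normalized values, matching the idea of generating a separate triple for each element of the list.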

Fixed error looking up transform rules in the repo without `--graph`

Transform rules can be added to the repository as RDF data and can be looked up by specifying the top-level statement subject with the --transform-rules option. When the --graph option was omitted, the default value would cause errors of the form

:default is not a valid UPI conversion keyword.  
Try :maximum or :minimum instead

This has been fixed. The transform rules subject is now always looked up in all graphs, which is more convenient.

Loading JSON/CSV with `agtool` now respects `--error-strategy`

Loading JSON/CSV via load-transform using agtool now respects the --error-strategy option. Previously, only missing key references were ignored, and any type error during a transform caused the load process to be aborted. This can now be overridden with the --error-strategy option, just like with the other formats.

Changes to the Lisp API

The `count-query` function returns 0 instead of nil for remote triple stores

An HTTP optimization caused nil to be returned instead of 0 for a remote triple store. This issue has been fixed so that count-query will always return an integer.

Prolog

No significant changes.

Documentation

Fixed links to third-party classes in AGraph Java client documentation

AGraph Java client documentation now uses correct links to third-party packages (json-20240205, apache.httpcomponents, com.atomikos).

Increased suggested --shm-size argument value for docker

The previous value was 1g, but this is too small for vector databases. It has been changed to 2g so that the kennedy vector database can be opened.

Include an example of how to combine results from vector search and triples matching

The documentation now includes a detailed example demonstrating how to effectively combine vector similarity search results with traditional triple pattern matching in SPARQL queries. This provides a powerful technique for hybrid semantic search that leverages both vector embeddings and structured data.

The example shows how to:

Understand the VDB ID concept that connects embedded content with its original subject
Query both vector similarity results and related properties in a single SPARQL statement
Create queries that filter vector search results using standard triple patterns
Set up a repository that serves as both a vector database and a triple store

This addition helps developers create more sophisticated knowledge graph applications that take advantage of both vector similarity search for semantic understanding and traditional triple patterns for precise filtering and relationship traversal.

New monitoring page to describe AllegroGraph monitoring capabilities

A new comprehensive documentation page has been added that details AllegroGraph's built-in monitoring capabilities. This page provides system administrators and developers with a complete reference to the monitoring features available through HTTP endpoints, including:

Real-time process statistics and resource usage metrics
Active query monitoring with detailed timing information
Configurable audit logs for security and compliance
Repository disk usage and statistics reports

The monitoring documentation includes practical examples of how to use these endpoints with tools like curl, as well as suggestions for setting up automated monitoring systems. This enhancement makes it easier for users to implement proper operational oversight of their AllegroGraph deployments. For full details, see the AllegroGraph Monitoring Capabilities page.

New guidelines for query performance tuning using query options

A comprehensive new documentation section has been added that provides detailed guidance on improving SPARQL query performance using AllegroGraph's query options. This guidance covers:

How to identify and fix cross-product issues that can cause memory consumption problems
When and how to use alternative execution strategies like Chunk-at-a-Time (CaaT) processing and the MJQE query engine
Techniques for optimizing pattern ordering when the automatic query planner doesn't produce optimal results
Strategic use of dynamic reordering during execution for better performance
When to enable or disable the join-union optimization rewrite

This new documentation provides a systematic approach to performance tuning on a per-query basis. The guidelines include specific query option examples and detailed explanations of when to apply each technique.

For complete information, see the Using query options for query performance tuning section in the SPARQL reference documentation.

Java client

Update Java & Dependencies

Updated org.apache.derby:derby from 10.14.2.0 to 10.17.1.0.
Updated Java from 17 to 21.

Support multiple SPARQL update queries in a single commit

The rdftransaction parser now allows clients to submit multiple SPARQL update queries in a single message. This allows code like the following to run more efficiently:

 conn.begin();  
 for (String queryString : queryStrings) {  
     conn.prepareUpdate(QueryLanguage.SPARQL, queryString).execute();
 }  
 conn.commit();

Distributed AllegroGraph

Fix COUNT(DISTINCT *) SPARQL query for FedShard repository

There was an error in processing the following query in a FedShard repository:

 select (count(distinct *) as ?count) {   
   ?s rdf:type ?o   
 }

The issue has been fixed.

Multi-Master Replication (MMR)

Secure data transport (SecureMMR)

Transaction records can now be transferred between instances over the SSL/TLS protocol in order to protect sensitive data. Each instance determines whether records sent to it must be encrypted. It does so by specifying an SSL configuration and setting the SecureMMR parameter to yes in the agraph.cfg file.

User-visible changes in earlier releases

See the general release-notes document for links to release notes for other recent releases. See Change History for user-visible changes to all earlier releases.
