Releases: Pometry/Raphtory

v0.18.5

@github-actions github-actions released this 23 Jun 13:56
ca16562

Release v0.18.5

v0.18.0

@github-actions github-actions released this 23 Jun 13:43
ca16562

Release v0.18.0

v0.17.0

@github-actions github-actions released this 09 Mar 16:09
fa6d8d2

API Changes

Unified Filter API

The filter system has been completely overhauled. Multiple filter methods (filter_nodes, filter_edges, filter_exploded_edges) are replaced by a single unified filter() method. Filter expressions now require explicit context (filter.Node, filter.Edge, filter.ExplodedEdge). Views now consistently apply "to the right" in the call chain — no more one-hop semantics where filters reset after traversals. Pandas-style [] indexing is also supported for accessing nodes/edges without keeping the filter applied.

# Before (removed)
filtered = graph.filter_nodes(filter.Property("name") == "foo")
filtered = graph.filter_edges(filter.Property("weight") > 0.5)

# After
filtered = graph.filter(filter.Node.property("name") == "foo")
filtered = graph.filter(filter.Edge.property("weight") > 0.5)
filtered = graph[filter.Node.property("active") == True]  # Pandas-style

Important

See the Master filter PR (#2254) for a full migration guide covering Python and GraphQL changes. You can also read about this in the new documentation.


Consolidated Load Functions

All format-specific load functions (load_edges_from_pandas, load_edges_from_parquet, etc.) have been removed in favour of unified load_edges(), load_nodes(), load_edge_metadata(), and load_node_metadata() methods that accept any Arrow-compatible data source, file path, or directory.

# Before (removed)
g.load_edges_from_pandas(df, ...)
g.load_edges_from_parquet("data.parquet", ...)

# After — single function, any source
g.load_edges(data=df, time="time", src="src", dst="dst")
g.load_edges(data="data.parquet", time="time", src="src", dst="dst")
g.load_edges(data="/dir/of/csvs/", time="time", src="src", dst="dst")

New capabilities include a schema parameter for explicit column type casting, a csv_options parameter for CSV reading configuration (delimiter, quoting, comments, etc.), and support for any Python object implementing the __arrow_c_stream__ interface — including Polars, FireDucks, DuckDB results, and PyArrow Tables — enabling zero-copy streaming into the Rust core. (#2423, #2391)


New NodeState

NodeState has been revamped and can now be used to join multiple Raphtory outputs together.

New capabilities include:

  • merge() — combine multiple node states into multi-column results (e.g. merge PageRank with community labels)
  • sort_by(cols) / top_k(cols) — sort or rank by multiple columns
  • groups(cols) — group nodes by column values
  • to_parquet() / from_parquet() — serialise/deserialise node states
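
As a rough illustration of what these operations mean, the sketch below uses plain Python dictionaries rather than Raphtory's NodeState API. The node names, scores and community labels are made-up sample values, and the variable names are hypothetical.

```python
from collections import defaultdict

# Two per-node results, as might come from separate algorithms (sample values).
pagerank = {"a": 0.42, "b": 0.31, "c": 0.27}
community = {"a": 0, "b": 1, "c": 0}

# "merge": one multi-column row per node.
merged = {
    node: {"pagerank": pagerank[node], "community": community[node]}
    for node in pagerank
}

# "sort by multiple columns": community ascending, then pagerank descending.
ranked = sorted(
    merged,
    key=lambda node: (merged[node]["community"], -merged[node]["pagerank"]),
)

# "groups": collect nodes that share a column value.
groups = defaultdict(list)
for node, row in merged.items():
    groups[row["community"]].append(node)

print(ranked)
print(dict(groups))
```

The real merge(), sort_by(), top_k() and groups() methods operate on node states directly; this only shows the shape of the result they are described as producing.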

History API

  • All time-returning functions (earliest_time, latest_time, start, end, etc.) now return an EventTime object instead of a raw integer. The EventTime object provides .t (timestamp) and .dt (datetime) accessors, removing the need for separate earliest_datetime, latest_datetime, start_datetime, end_datetime etc. variants — these have all been removed.

  • A new History object replaces the old plain list return type, providing rich, built-in functionality for working with temporal histories. This includes merging different histories together and exploring intervals. You can read more about this at https://docs.pometry.com/docs/querying/history.

(#2075)
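
To make the idea of one time object with .t and .dt accessors concrete, here is a minimal stand-in written with the standard library only. The class name EventTimeSketch is hypothetical, and it assumes the timestamp is in milliseconds since the Unix epoch and that the datetime is reported in UTC. It is not Raphtory's EventTime implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventTimeSketch:
    """Hypothetical stand-in: one object carrying both views of a time."""

    t: int  # assumed: milliseconds since the Unix epoch

    @property
    def dt(self) -> datetime:
        # Derive the datetime view from the integer timestamp (UTC assumed).
        return datetime.fromtimestamp(self.t / 1000, tz=timezone.utc)


earliest = EventTimeSketch(t=1_736_949_743_000)
print(earliest.t)   # integer timestamp
print(earliest.dt)  # timezone-aware datetime
```

Because both views hang off the same object, a single earliest_time-style call can serve callers that previously needed a separate *_datetime variant.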


Embedding & Vectorisation API

The embedding API has been significantly reworked. Key breaking changes:

  • set_embeddings() removed — replaced by vectorise_graph() and vectorise_all_graphs(), which take the new OpenAIEmbeddings object directly
  • with_vectorised_graphs() removed — use vectorise_graph() instead
  • scores renamed to distances throughout all APIs (Python, Rust, GraphQL) — similarity search results are now ranked in ascending order of distance rather than descending order of score
  • get_documents_with_scores() → get_documents_with_distances()
  • New classes: OpenAIEmbeddings, VectorCache, and embedding_server() decorator for custom embedding functions

# Before
server = GraphServer().set_embeddings(cache="/tmp/cache", embedding=my_fn)
selection.get_documents_with_scores()

# After
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
server = GraphServer()
server.vectorise_graph("my_graph", embeddings=embeddings)
selection.get_documents_with_distances()

(#2249)
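
The change from scores to distances flips the sort direction but not the ranking itself. The sketch below illustrates this with cosine distance, defined here as 1 minus cosine similarity. The choice of metric, the vectors and the document names are assumptions for illustration only; the release notes do not say which distance Raphtory uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
docs = {"doc_a": [1.0, 0.0], "doc_b": [1.0, 1.0], "doc_c": [0.0, 1.0]}

scores = {name: cosine_similarity(query, vec) for name, vec in docs.items()}
distances = {name: 1.0 - score for name, score in scores.items()}

# Old convention: best match has the highest score.
by_score = sorted(docs, key=lambda name: scores[name], reverse=True)
# New convention: best match has the smallest distance.
by_distance = sorted(docs, key=lambda name: distances[name])

print(by_score)
print(by_distance)
```

Code that previously took the first result as the best match keeps working, but anything that thresholds on the numeric value needs its comparison reversed.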


Improvements

  • SSSP algorithm is now directed by default (#2426)
  • Stable GraphQL schema output for deterministic builds (#2427)
  • Local clustering coefficient no longer requires a static graph (#2433)
  • Forbid special characters in graph names and namespaces (#2496)
  • Filter documentation added (#2472)
  • Generate random graphs using the Erdős–Rényi model directly from Raphtory. (#2253)

Bug Fixes

  • Fix client errors silently disappearing with async_graphql 7.1.0 (#2439)
  • Fix GraphQL query cache not evicting automatically when graph data changes (#2454)
  • Fix test issues revealed by pandas 3.0 (#2448)
  • Fix schema casting issues (#2479)
  • Fix auto-release GitHub Action (#2498)

Full Changelog: v0.16.4...v0.17.0

v0.16.5

@github-actions github-actions released this 24 Feb 10:38

v0.16.4

@github-actions github-actions released this 12 Dec 14:16
7e5a1e6

Release v0.16.4

Highlights

  • Raphtory is now available for Python 3.14.
  • We have dropped support for 3.10 to allow Raphtory to be brought up to date with the latest version of PyO3. The minimum Python version is now 3.11.

UI

  • Filtered out all non-valid edges in direct connections.
  • Added a node type loading indicator to search page to let you know when the graph has finished loading into the cache.
  • Added a new layout customiser panel which allows you to change the parameters of the layout algorithm and run multiple layouts as part of a pipeline.
  • Added node and edge styling which can be set in metadata for node types, layers or individual nodes and edges. Styles can be manually written into the graph or added from the UI. Currently, you can adjust the node colour or size and the edge colour in the timeline.

    • In the future, style options will expand to everything that our underlying graph visualisation library (g6) offers.

Bug fixes


  • The DataFrame and Arrow loaders now correctly handle different datetime formats (including Date32).
  • DataFrame and Arrow loaders now correctly convert strings into timestamps (if possible) when provided as the time column. This brings them in line with the functionality of add_node and add_edge.
  • The GraphQL health check now goes through the read and write rayon pools before returning, so deadlocks don't go undetected when using it.
  • Added a check on otlp_agent_host to confirm the host accepts OpenTelemetry data (otherwise logs the failure to let the user know).
  • Enabled logs from OTLP libraries so any connection errors are properly logged when debug level is enabled.
  • Fixed an issue where the Python server, when given wrong arguments, would only fail after the timeout had elapsed.

Known issues


What's Changed

This version also adds a docker compose setup under examples/grafana with:

  • Raphtory set up to send traces to Tempo.
  • Tempo set up to compute TraceQL metrics.
  • Grafana set up with Tempo as a datasource and a basic dashboard template.

Full tracing for complex queries generates quite large spans so we have added some different tracing levels. Available options are:

  • COMPLETE: Provides full traces for each query.
  • ESSENTIAL: Tracks key functions — addEdge, addEdges, deleteEdge, graph, updateGraph, addNode, node, nodes, edge, edges.
  • MINIMAL: Provides only summary execution times.

v0.16.3

@github-actions github-actions released this 21 Oct 09:19
fa389cb

Highlights

Step aligned windows

Rolling and expanding functions have been updated so that the start of each window is aligned with the smallest unit of time passed by the user within the step.

For example, if the step is "1 month and 1 day", the first window will begin at the start of the most recent day. Explicitly, if the earliest time in the graph is 15/01/25 14:02:23 and you call the rolling function you would get the following increments:

Increments in previous versions:

15/01/25 14:02:23 → 16/01/25 14:02:23 → 17/01/25 14:02:23 → 18/01/25 14:02:23 → …

Increments in v0.16.3:

15/01/25 00:00:00 → 16/01/25 00:00:00 → 17/01/25 00:00:00 → 18/01/25 00:00:00 → …

This change was made to make windows more intuitive. If someone wants a rolling window over "1 year", they typically want it to start at the beginning of the calendar year and end at the end of the year. You can also explicitly set the alignment_unit. For example, you can set g.rolling("1 month", alignment_unit="day") if you want to align to the most recent day.
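
The alignment itself amounts to flooring the start time to a unit boundary. The sketch below shows that idea with the standard library. The helper align_start is hypothetical and only covers three units; it is not how Raphtory implements alignment_unit.

```python
from datetime import datetime


def align_start(ts: datetime, unit: str) -> datetime:
    """Hypothetical helper: floor a timestamp to the start of the given unit."""
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")


# Earliest time from the example above: 15/01/25 14:02:23.
earliest = datetime(2025, 1, 15, 14, 2, 23)

aligned_day = align_start(earliest, "day")
aligned_year = align_start(earliest, "year")
print(aligned_day)   # 2025-01-15 00:00:00
print(aligned_year)  # 2025-01-01 00:00:00
```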

In addition to this change, if you roll or expand in monthly increments starting on the 29th, 30th or 31st, each window now returns to that day whenever it exists in the next month (or gets as close to it as possible). Previously, once the date had been pulled back to fit a shorter month, later windows stayed on that earlier day:

Increments in previous versions:

31/01/25 → 28/02/25 → 28/03/25 → 28/04/25 → …

Increments in v0.16.3:

31/01/25 → 28/02/25 → 31/03/25 → 30/04/25 → …
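
One way to reproduce both sequences is to compare offsetting from the original anchor date with stepping from the previous result. The sketch below does this with the standard library. The helper add_months_clamped is hypothetical, and this is an illustration of the behaviour described, not Raphtory's implementation.

```python
import calendar
from datetime import date


def add_months_clamped(anchor: date, months: int) -> date:
    """Offset `anchor` by `months`, clamping the day to the target month's length."""
    total = anchor.month - 1 + months
    year = anchor.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(anchor.day, last_day))


start = date(2025, 1, 31)

# Offsetting from the original anchor returns to the 31st wherever it exists.
from_anchor = [add_months_clamped(start, k) for k in range(4)]

# Stepping from the previous result keeps the clamped day (the older behaviour).
chained = [start]
for _ in range(3):
    chained.append(add_months_clamped(chained[-1], 1))

print([d.strftime("%d/%m/%y") for d in from_anchor])
print([d.strftime("%d/%m/%y") for d in chained])
```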

Bug fixes

  • Previously, the timeline_start and timeline_end fallbacks for graphs that were not explicitly windowed looked at the filtered earliest and latest time. This made rolling/expanding inconsistent between different layers. Now, when you call rolling or expanding functions on individual layers, they will have the same window alignment.
  • Improved the performance of computing the filtered time.
  • Extensive stress testing of the server uncovered several deadlocks at high concurrency. We rebuilt the locking mechanism in the GraphQL server to fix this.
  • Fixed panics in case of simultaneous additions and reads (not all nodes were guaranteed to be initialised in iterators).

Full Changelog: v0.16.2...v0.16.3

v0.16.2

@github-actions github-actions released this 30 Sep 14:18
ccacea1

Full Changelog: v0.16.1...v0.16.2

v0.16.1

@github-actions github-actions released this 14 Aug 15:15
b81c0c1

Full Changelog: v0.16.0...v0.16.1

v0.16.0

@github-actions github-actions released this 30 Jul 18:09
3ea4e8f

Replace constant properties with metadata

Constant properties have been completely separated from temporal properties and are now known as metadata. This means that expressions like x.properties.constant should be replaced with x.metadata as in the sample below.

This was done for two reasons:

  • The fallback search where x.properties.get("...") would first check temporal properties and then constant properties was confusing and caused very unexpected behaviour in the filters.
  • These are quite different concepts and, upon reflection, we felt that completely separating them in the API would make it clearer that there isn't any overlap.

You can now have metadata and properties of different types with the same key:

from raphtory import PersistentGraph

g = PersistentGraph()
node = g.add_node(timestamp=1, id=1, properties={"weight": 1})
node.add_metadata(metadata={"weight": "string weight"})
print(node.metadata.get("weight"))    # reads the metadata entry
print(node.properties.get("weight"))  # reads the temporal property

Time semantics overhaul

  • Separated explicit node updates from connected edge updates, allowing for better filtering.
  • Filtering layers or edges now filters nodes if all the edge updates that added them are filtered out i.e. the node is not added explicitly via add_node.
    • As a result, subgraph filters out nodes that don't have edges in the subgraph and were not explicitly added via add_node.
  • Changed latest_time semantics for the PersistentGraph to return the time of the last update for the node, edge, or graph in the current view or the start of the window if there are no updates (previously + Infinity).
  • The earliest_time and latest_time within a filtered Event Graph will now reflect the updates within the graph view instead of just window bounds.
  • Added a Graph.valid() filter that only keeps edges that are currently valid without removing their history.
  • For a PersistentGraph, is_valid and is_active are no longer the same.
    • Active means there is an update during the period (addition or deletion).
    • Valid means that the edge's most recent update is an addition (persistent semantics).
    • Deleted means that the edge's most recent update is a deletion.
  • The event graph preserves deletions if created from a persistent graph. An edge can have the following statuses:
    • Included - is active in the window (has an addition or deletion event).
    • Valid - has an addition event in the current view.
    • Deleted - has a deletion event in the current view.
  • The default layer only exists if it has updates on it.
  • Filtering an edge update on a persistent graph turns it into a deletion to keep the semantics sensible.
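
The active / valid / deleted distinction for a PersistentGraph can be illustrated with a small classifier over an edge's update history. This is a sketch of the definitions given above, not Raphtory code. The function edge_status is hypothetical, and the half-open window [start, end) is an assumption.

```python
from typing import Dict, List, Tuple

# Each update is (time, kind), where kind is "add" or "delete".
Update = Tuple[int, str]


def edge_status(updates: List[Update], start: int, end: int) -> Dict[str, bool]:
    """Classify an edge for the window [start, end) under persistent semantics."""
    # Only updates before the window end can influence the view.
    seen = sorted((u for u in updates if u[0] < end), key=lambda u: u[0])
    # Active: at least one update falls inside the window itself.
    active = any(t >= start for t, _ in seen)
    # Valid / deleted depend on the most recent update, even if it precedes the window.
    last_kind = seen[-1][1] if seen else None
    return {
        "active": active,
        "valid": last_kind == "add",
        "deleted": last_kind == "delete",
    }


history = [(1, "add"), (5, "delete")]

quiet_window = edge_status(history, start=2, end=4)  # no updates inside
later_window = edge_status(history, start=4, end=6)  # contains the deletion
print(quiet_window)
print(later_window)
```

In the first window the edge is valid without being active, which is the case that makes is_valid and is_active diverge.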

New APIs

  • Edge filtering and exploded edge filtering is now available on the PersistentGraph.
  • Enabled filter negation within the property filter APIs.
  • filter_exploded_edges now takes a FilterExpr as input in Python.
    • The old Prop("name") API has been removed; use filter.Property("name") instead.
  • Added node filters to PathFromNode and PathFromGraph.
  • Added edge_history_count() to the nodes API.

GraphQL server

  • Drastically improved the performance of the server - over 100 times faster within internal benchmarks.
  • Enabled compression by default.
  • Changed the Python client to only have one internal client instead of creating one for each query, resulting in 100x faster querying from Python.
  • Added rolling and expanding to Graph, Node, Nodes, PathFromNode, Edge and Edges.
  • Renamed all GraphQL structs that started with GQL to make the user facing schema cleaner.
  • Changed all page endpoints to have two separate arguments for item-based and page-based offsets. The existing offset argument has been changed to be item-based, and a separate page_index argument has been added for the old page-based behavior. Both can also be used simultaneously.
  • Added a new API for fetching both namespaces and graphs at the same time.
    • The new object is called a NamespacedItem.
  • Added apply_views to PathFromNode.
  • You can now generate the GraphQL schema in Raphtory via the new CLI.
    • You can run raphtory-graphql schema > schema.graphql removing the need to run a server.
  • You can now insert a custom UI into your custom Raphtory builds via an environment variable.
  • Exposed the GraphQL schema in Python - can now be printed via raphtory.graphql.schema()
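
For the page endpoint change listed above, the sketch below shows one plausible way an item-based offset and a page-based page_index could combine when both are supplied. The rule start = page_index * limit + offset is an assumption made for illustration, and the function page is a hypothetical stand-in, not the server's resolver.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def page(items: Sequence[T], limit: int, offset: int = 0, page_index: int = 0) -> List[T]:
    """Return up to `limit` items, skipping whole pages first and then single items."""
    # Assumed combination rule: pages skipped, then individual items skipped.
    start = page_index * limit + offset
    return list(items[start:start + limit])


items = list(range(20))

item_based = page(items, limit=5, offset=2)              # skip 2 items
page_based = page(items, limit=5, page_index=1)          # skip 1 page of 5
combined = page(items, limit=5, offset=1, page_index=2)  # skip 2 pages + 1 item
print(item_based, page_based, combined)
```

Clients that relied on the old page-based meaning of offset would pass the same number as page_index instead.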

GraphQL Bug fixes

  • Fixed GraphQL signed integer fields not accepting negative numbers.
  • Fixed a problem with namespaces returning null paths and not returning root.
  • Fixed an issue with recursive writing of indexes causing the server to crash.
  • Fixed an issue in rolling where if the step was bigger than the window size the final window would be empty.
  • Changed caching policy to never kick out graphs after some timeout by default.
  • Changed WindowSet to not allow zero size step.
  • Added validation to edge and node filters to ensure the property type matches the given value.

Raphtory CLI

  • Added a Raphtory CLI, installed via Python, from which you can start the server or print the schema.

UI

Temporal View

  • Scrolling has been drastically improved so that hovering over the bar behaves nicely.
  • Added the ability to pin nodes in the Temporal view to keep them at the top.
  • Nodes are now highlighted in the Temporal view when selected in the graph. The old behaviour of filtering only to edges between highlighted nodes can be toggled from the bottom right of the Temporal view.
  • The bucketing of edges is now fixed.

Graph view

  • Fixed visual artifacts when swapping between highlighting.
  • Highlighting relationship types now highlights the edges correctly.
  • The activity log and direct connections in the Context menu are now sorted correctly.

Search page

  • Added relationship searching.
  • Added namespace searching.
  • Clarified that timeline filtering is optional.
  • Fixed the filters so that comparisons, like 'greater than' or 'less than', work.
  • String searching now supports partial matching.

Saved graphs page

  • Minor bug fixes and UX improvements.

GraphRag

  • Swapped our default embedded vector store from a homebrewed solution to Arroy.
  • Added an argument to the vectorise function so that the user can set a path for storing the vector cache.
  • Added support for missing APIs on the template:
    • access to constant_properties
    • temporal_properties.

Property Indexes Alpha

  • Indexes in Raphtory are now updatable and produce the same answer as the filter APIs. They can be saved to disk alongside the proto file and loaded back into memory via Rust, Python or a GraphQL server.
  • Indexes are turned off by default, but can be enabled for the whole graph or for individual properties via Graph.create_index().

Python

  • Removed unneeded Python dependencies and made those that are not needed for core functions optional.
  • Relaxed the NumPy version to 1.26.

General Bug fixes

  • Fixed filter_edges for layers after adding a constant property.
  • Fixed a bug in the interaction between windowing and exploded edge filtering.
  • Fixed the Parquet reader, where Utf8View columns were being converted to LargeUtf8, which was causing problems further down the pipeline.
  • Fixed some issues with decoding updates from proto between different versions of Raphtory.

v0.15.1

@github-actions github-actions released this 23 Apr 22:42
a9e6f61

GraphQL

  • Added a new option to output the GraphQL schema without running the server via raphtory-graphql schema > schema.graphql
  • GraphQL now accepts signed integers (a bug in the underlying library that we patched).
  • Created gqldocuments + output nodes and edges as well as gqldocument in that object -- for vector search
  • You can now provide a custom UI as part of a private Raphtory server.

Misc

  • Removed the dependency on NumPy 2.0; Raphtory will now install and run with NumPy <2.
  • Several library upgrades for CVE reasons.
  • Improved the Python testing pipeline.

Full Changelog: v0.15.0...v0.15.1