Neo4j

If you’re new to Unstructured, read this note first.

Before you can create a destination connector, you must first sign in to your Unstructured account:

If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
If you already have an Unstructured account, sign in by using the URL of the sign-in page that Unstructured provided to you when your Unstructured account was created.
If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

After you sign in, the Unstructured user interface (UI) appears, which you use to get your Unstructured API key. To learn how, watch this 40-second how-to video.

After you create the destination connector, add it along with a source connector to a workflow. Then run the workflow as a job. To learn how, try out the hands-on Workflow Endpoint quickstart, go directly to the quickstart notebook, or watch the two 4-minute video tutorials for the Unstructured Python SDK.

You can also create destination connectors with the Unstructured user interface (UI). Learn how.

If you need help, reach out to the community on Slack, or contact us directly.

You are now ready to start creating a destination connector! Keep reading to learn how.

Send processed data from Unstructured to Neo4j.

The requirements are as follows.

A Neo4j deployment.
- For the Unstructured UI or the Unstructured API, local Neo4j deployments are not supported.
- For Unstructured Ingest, local and non-local Neo4j deployments are supported.
The following video shows how to set up a Neo4j Aura deployment:
The username and password for the user who has access to the Neo4j deployment. The default user is typically neo4j.
- For a Neo4j Aura instance, the default user’s password is typically set when the instance is created.
- For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, the default user is typically set during the deployment process.
- For a local Neo4j deployment, you can set the default user’s initial password or recover an admin user and its password.
The connection URI for the Neo4j deployment, which starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; is followed by localhost or the host name; and sometimes ends with a colon and the port number (such as :7687). For example:
- For a Neo4j Aura deployment, browse to the target Neo4j instance in the Neo4j Aura account and click Connect > Drivers to get the connection URI, which follows the format neo4j+s://<host-name>. A port number is not used or needed.
- For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, see Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP for details about how to get the connection URI.
- For a local Neo4j deployment, the URI is typically bolt://localhost:7687.
- For other Neo4j deployment types, see the deployment provider’s documentation.
Learn more.
The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains two standard databases: one named neo4j for user data and another named system for system data and metadata. Some Neo4j deployment types support more than these two databases per deployment; Neo4j Aura instances do not.
- Create additional databases for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.
- Get a list of additional available databases for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.
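
The connection URI rules in the requirements above can be checked before a connector is created. The following is a minimal sketch that uses only the Python standard library; the helper name check_neo4j_uri is hypothetical and is not part of any Unstructured or Neo4j package.

```python
from urllib.parse import urlsplit

# The URI schemes named in the requirements above.
ALLOWED_SCHEMES = {"neo4j", "neo4j+s", "bolt", "bolt+s"}


def check_neo4j_uri(uri: str) -> dict:
    """Split a Neo4j connection URI into scheme, host, and optional port.

    Hypothetical helper for illustration only.
    Raises ValueError if the scheme is not one of the four listed above
    or if the host name is missing.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Unsupported scheme in URI: {uri!r}")
    if not parts.hostname:
        raise ValueError(f"Missing host name in URI: {uri!r}")
    # The port is None when the URI does not include one (as with Neo4j Aura).
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


if __name__ == "__main__":
    print(check_neo4j_uri("bolt://localhost:7687"))
```

A check like this only validates the shape of the URI. It does not confirm that the deployment is reachable or that the credentials are correct.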

Graph Output

The graph output of the Neo4j destination connector is represented in the following diagram:

In the preceding diagram:

The Document node represents the source file.
The UnstructuredElement nodes represent the source file’s Unstructured Element objects, before chunking.
The Chunk nodes represent the source file’s Unstructured Element objects, after chunking.
Each UnstructuredElement node has a PART_OF_DOCUMENT relationship with the Document node.
Each Chunk node also has a PART_OF_DOCUMENT relationship with the Document node.
Each UnstructuredElement node has a PART_OF_CHUNK relationship with a Chunk node.
Each Chunk node, except for the “last” Chunk node, has a NEXT_CHUNK relationship with its “next” Chunk node.
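
To make the relationships listed above concrete, the following sketch builds the same set of edges in plain Python from an ordered list of chunks. It models only the description above and is not the connector's actual implementation; the function name build_edges and the tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (start node id, relationship type, end node id)


def build_edges(document_id: str, chunks: List[Tuple[str, List[str]]]) -> List[Edge]:
    """Return the edges described above for one document.

    chunks is an ordered list of (chunk_id, [element_id, ...]) pairs.
    Hypothetical helper for illustration only.
    """
    edges: List[Edge] = []
    for chunk_id, element_ids in chunks:
        # Each Chunk node is part of the Document node.
        edges.append((chunk_id, "PART_OF_DOCUMENT", document_id))
        for element_id in element_ids:
            # Each UnstructuredElement node is part of the Document node
            # and part of the Chunk node that contains it.
            edges.append((element_id, "PART_OF_DOCUMENT", document_id))
            edges.append((element_id, "PART_OF_CHUNK", chunk_id))
    # Every Chunk node except the last one points to its next Chunk node.
    for (current_id, _), (next_id, _) in zip(chunks, chunks[1:]):
        edges.append((current_id, "NEXT_CHUNK", next_id))
    return edges


if __name__ == "__main__":
    for edge in build_edges("doc", [("c1", ["e1", "e2"]), ("c2", ["e3"])]):
        print(edge)
```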

Learn more about document elements and chunking.

Some related example Neo4j graph queries include the following.

Query for all available nodes and relationships:

MATCH path=(source)-[relationship]->(target)
RETURN path

Query for Chunk to Document relationships:

MATCH (chunk:Chunk)-[relationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN chunk, relationship, doc

Query for UnstructuredElement to Document relationships:

MATCH (element:UnstructuredElement)-[relationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, relationship, doc

Query for UnstructuredElement to Chunk relationships:

MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
RETURN element, relationship, chunk

Query for Chunk to Chunk relationships:

MATCH (current:Chunk)-[relationship:NEXT_CHUNK]->(following:Chunk)
RETURN current, relationship, following

Query for UnstructuredElement to Chunk to Document relationships:

MATCH (element:UnstructuredElement)-[ecrelationship:PART_OF_CHUNK]->(chunk:Chunk)-[cdrelationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, ecrelationship, chunk, cdrelationship, doc

Query for UnstructuredElements containing the text jury, and show their Chunk relationships:

MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
WHERE element.text =~ '(?i).*jury.*'
RETURN element, relationship, chunk

Query for the Chunk with the specified id, and show its UnstructuredElement relationships:

MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
WHERE chunk.id = '731508bf53637ce4431fe93f6028ebdf'
RETURN element, relationship, chunk
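
The text search query above hard-codes the search term. When the term comes from user input, it is usually safer to pass it as a query parameter. The following sketch builds a parameterized version of that query; the function name text_search_query and the parameter name pattern are hypothetical, and using Python's re.escape to escape the term for Neo4j's regular expression matching is an assumption that should hold for ordinary words.

```python
import re
from typing import Dict, Tuple


def text_search_query(term: str) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized, case-insensitive text search over elements.

    Hypothetical helper for illustration only. Returns the Cypher text
    and the parameters to send along with it.
    """
    query = (
        "MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)\n"
        "WHERE element.text =~ $pattern\n"
        "RETURN element, relationship, chunk"
    )
    # Same pattern shape as the example above: (?i).*<term>.*
    pattern = "(?i).*" + re.escape(term) + ".*"
    return query, {"pattern": pattern}


if __name__ == "__main__":
    query, parameters = text_search_query("jury")
    print(query)
    print(parameters)
```

The returned query and parameters can then be handed to a Neo4j client, for example through a session's run method in the official neo4j Python driver.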

Additionally, for the Unstructured UI and Unstructured Workflow Endpoint, when a Named entity recognition (NER) DAG node is added to a custom workflow, any recognized entities are output as Entity nodes in the graph.

This additional graph output of the Neo4j destination connector is represented in the following diagram:

In the preceding diagram:

The Chunk node represents one of the source file’s Unstructured Element objects, after chunking.
The Entity node represents a recognized entity.
A Chunk node can have HAS_ENTITY relationships with Entity nodes.
An Entity node can have ENTITY_TYPE relationships with other Entity nodes.

Some related example Neo4j graph queries include the following.

Query for all available nodes and relationships:

MATCH path=(source)-[relationship]->(target)
RETURN path

Query for Entity to Entity relationships:

MATCH (child:Entity)-[relationship:ENTITY_TYPE]->(parent:Entity)
RETURN child, relationship, parent

Query for Entity nodes containing the text PERSON, and show their Entity relationships:

MATCH (child:Entity)-[relationship:ENTITY_TYPE]->(parent:Entity)
WHERE parent.id = 'PERSON'
RETURN child, relationship, parent

Query for Entity nodes containing the text amendment, and show their Chunk relationships:

MATCH (chunk:Chunk)-[relationship:HAS_ENTITY]->(entity:Entity)
WHERE entity.id =~ '(?i).*amendment.*'
RETURN chunk, relationship, entity

Query for Entity nodes containing the text PERSON, and show their Entity to Entity to Chunk relationships:

MATCH (chunk:Chunk)-[ccrelationship:HAS_ENTITY]->(child:Entity)-[cprelationship:ENTITY_TYPE]->(parent:Entity)
WHERE parent.id = 'PERSON'
RETURN chunk, ccrelationship, child, cprelationship, parent

To create a Neo4j destination connector, see the following examples.

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateDestinationRequest
from unstructured_client.models.shared import (
    CreateDestinationConnector,
    DestinationConnectorType,
    Neo4jDestinationConnectorConfigInput
)

with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
    response = client.destinations.create_destination(
        request=CreateDestinationRequest(
            create_destination_connector=CreateDestinationConnector(
                name="<name>",
                type=DestinationConnectorType.NEO4J,
                config=Neo4jDestinationConnectorConfigInput(
                    uri="<uri>",
                    database="<database>",
                    username="<username>",
                    password="<password>",
                    batch_size=<batch-size>
                )
            )
        )
    )

    print(response.destination_connector_information)

Replace the preceding placeholders as follows:

<name> (required) - A unique name for this connector.
<uri> (required) - The connection URI for the Neo4j deployment, which typically starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; is followed by the host name; and sometimes ends with a colon and the port number (such as :7687).
<database> (required) - The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains a standard database named neo4j for user data.
<username> (required) - The name of the user who has access to the Neo4j deployment. A default Neo4j deployment typically contains a default user named neo4j.
<password> (required) - The password for the user.
<batch-size> - The maximum number of nodes or relationships to be transmitted per batch. The default is 100 if not otherwise specified.
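
One way to avoid hard-coding these values is to read them from environment variables. The following is a minimal sketch under that assumption; the variable names (NEO4J_URI and so on) and the helper neo4j_config_from_env are hypothetical. The keys of the returned dictionary match the keyword arguments passed to Neo4jDestinationConnectorConfigInput in the example above.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names for the required settings.
REQUIRED_VARS = {
    "uri": "NEO4J_URI",
    "database": "NEO4J_DATABASE",
    "username": "NEO4J_USERNAME",
    "password": "NEO4J_PASSWORD",
}


def neo4j_config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the connector settings, applying the documented batch size default.

    Hypothetical helper for illustration only.
    Raises ValueError if any required setting is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS.values() if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    config = {key: env[name] for key, name in REQUIRED_VARS.items()}
    # The batch size is optional; the default is 100, as noted above.
    config["batch_size"] = int(env.get("NEO4J_BATCH_SIZE", 100))
    return config


if __name__ == "__main__":
    example_env = {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_DATABASE": "neo4j",
        "NEO4J_USERNAME": "neo4j",
        "NEO4J_PASSWORD": "example-password",
    }
    print(neo4j_config_from_env(example_env))
```

The resulting dictionary could then be unpacked into the config object, for example Neo4jDestinationConnectorConfigInput(**config).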
