If you’re new to Unstructured, read this note first.
Before you can create a destination connector, you must first sign in to your Unstructured account:
After you sign in, the Unstructured user interface (UI) appears, which you use to get your Unstructured API key. To learn how, watch this 40-second how-to video.
After you create the destination connector, add it along with a source connector to a workflow. Then run the workflow as a job. To learn how, try out the hands-on Workflow Endpoint quickstart, go directly to the quickstart notebook, or watch the two 4-minute video tutorials for the Unstructured Python SDK.
You can also create destination connectors with the Unstructured user interface (UI). Learn how.
If you need help, reach out to the community on Slack, or contact us directly.
You are now ready to start creating a destination connector! Keep reading to learn how.
Send processed data from Unstructured to Neo4j.
The requirements are as follows.
The following video shows how to set up a Neo4j Aura deployment:
The username and password for the user who has access to the Neo4j deployment. The default user is typically neo4j.
The connection URI for the Neo4j deployment, which starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; followed by localhost or the host name; and sometimes ending with a colon and the port number (such as :7687). For example:
neo4j+s://<host-name> (a port number is not used or needed)
bolt://localhost:7687
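As a rough illustration of the URI shape described above, the following sketch checks a connection string with Python's standard library before it is handed to a connector. The function name check_neo4j_uri and the returned dictionary layout are hypothetical and are not part of Unstructured or Neo4j.

```python
from urllib.parse import urlsplit

# Schemes named in the requirements above.
ALLOWED_SCHEMES = {"neo4j", "neo4j+s", "bolt", "bolt+s"}


def check_neo4j_uri(uri: str) -> dict:
    """Split a Neo4j connection URI into scheme, host, and optional port.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host name")
    # parts.port is None when the URI has no ":<port>" suffix.
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


if __name__ == "__main__":
    print(check_neo4j_uri("bolt://localhost:7687"))
    print(check_neo4j_uri("neo4j+s://example-host"))
```

The port is returned as None when it is omitted, which matches the case above where a port number is not used or needed.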
The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains two standard databases: one named neo4j for user data and another named system for system data and metadata. Some Neo4j deployment types support more than these two databases per deployment; Neo4j Aura instances do not.
The graph output of the Neo4j destination connector is represented in the following diagram:
In the preceding diagram:
The Document node represents the source file.
The UnstructuredElement nodes represent the source file’s Unstructured Element objects, before chunking.
The Chunk nodes represent the source file’s Unstructured Element objects, after chunking.
Each UnstructuredElement node has a PART_OF_DOCUMENT relationship with the Document node.
Each Chunk node also has a PART_OF_DOCUMENT relationship with the Document node.
Each UnstructuredElement node has a PART_OF_CHUNK relationship with a Chunk node.
Each Chunk node, except for the “last” Chunk node, has a NEXT_CHUNK relationship with its “next” Chunk node.
Learn more about document elements and chunking.
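The relationships listed above can be sketched as plain (source, relationship, target) triples. This is a minimal model for reasoning about the graph shape, not the connector's implementation; the function name build_relationships and the string identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

# (source node id, relationship type, target node id)
Triple = Tuple[str, str, str]


def build_relationships(
    document_id: str,
    chunk_ids: List[str],
    element_to_chunk: Dict[str, str],
) -> List[Triple]:
    """Return the relationships described above for one source file."""
    triples: List[Triple] = []
    # Every element belongs to the document and to exactly one chunk.
    for element_id, chunk_id in element_to_chunk.items():
        triples.append((element_id, "PART_OF_DOCUMENT", document_id))
        triples.append((element_id, "PART_OF_CHUNK", chunk_id))
    # Every chunk belongs to the document.
    for chunk_id in chunk_ids:
        triples.append((chunk_id, "PART_OF_DOCUMENT", document_id))
    # Every chunk except the last points at the chunk that follows it.
    for current, following in zip(chunk_ids, chunk_ids[1:]):
        triples.append((current, "NEXT_CHUNK", following))
    return triples


if __name__ == "__main__":
    for triple in build_relationships("doc", ["c1", "c2"], {"e1": "c1", "e2": "c2"}):
        print(triple)
```

Pairing the chunk list with itself shifted by one position is what leaves the last chunk without an outgoing NEXT_CHUNK relationship.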
Some related example Neo4j graph queries include the following.
Query for all available nodes and relationships:
Query for Chunk to Document relationships:
Query for UnstructuredElement to Document relationships:
Query for UnstructuredElement to Chunk relationships:
Query for Chunk to Chunk relationships:
Query for UnstructuredElement to Chunk to Document relationships:
Query for UnstructuredElement nodes containing the text "jury", and show their Chunk relationships:
Query for the Chunk with the specified id, and show its UnstructuredElement relationships:
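The following sketch collects Cypher strings for the queries described above, held in a Python dictionary so they can be pasted into a Neo4j query tool or passed to a driver. The node labels and relationship types come from the diagram description; the relationship directions, the text and id property names, and the $needle and $chunk_id parameter names are assumptions and may need adjusting to the actual graph.

```python
# Labels and relationship types are taken from the description above.
# Relationship directions and the `text` / `id` property names are assumptions.
QUERIES = {
    "all": "MATCH path = (source)-[relationship]->(target) RETURN path",
    "chunk_to_document": (
        "MATCH (chunk:Chunk)-[r:PART_OF_DOCUMENT]->(doc:Document) "
        "RETURN chunk, r, doc"
    ),
    "element_to_document": (
        "MATCH (element:UnstructuredElement)-[r:PART_OF_DOCUMENT]->(doc:Document) "
        "RETURN element, r, doc"
    ),
    "element_to_chunk": (
        "MATCH (element:UnstructuredElement)-[r:PART_OF_CHUNK]->(chunk:Chunk) "
        "RETURN element, r, chunk"
    ),
    "chunk_to_chunk": (
        "MATCH (chunk:Chunk)-[r:NEXT_CHUNK]->(next_chunk:Chunk) "
        "RETURN chunk, r, next_chunk"
    ),
    "element_to_chunk_to_document": (
        "MATCH (element:UnstructuredElement)-[r1:PART_OF_CHUNK]->(chunk:Chunk)"
        "-[r2:PART_OF_DOCUMENT]->(doc:Document) "
        "RETURN element, r1, chunk, r2, doc"
    ),
    # Pass the search text as the $needle parameter, for example "jury".
    "elements_containing_text": (
        "MATCH (element:UnstructuredElement)-[r:PART_OF_CHUNK]->(chunk:Chunk) "
        "WHERE element.text CONTAINS $needle "
        "RETURN element, r, chunk"
    ),
    # Pass the chunk identifier as the $chunk_id parameter.
    "chunk_by_id": (
        "MATCH (element:UnstructuredElement)-[r:PART_OF_CHUNK]->(chunk:Chunk) "
        "WHERE chunk.id = $chunk_id "
        "RETURN element, r, chunk"
    ),
}

if __name__ == "__main__":
    for name, query in QUERIES.items():
        print(f"{name}:\n  {query}")
```

Using query parameters instead of string formatting keeps user-supplied search text out of the query text itself.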
Additionally, for the Unstructured UI and Unstructured Workflow Endpoint, when a named entity recognition (NER) DAG node is added to a custom workflow, any recognized entities are output as Entity nodes in the graph.
This additional graph output of the Neo4j destination connector is represented in the following diagram:
In the preceding diagram:
Each Chunk node represents one of the source file’s Unstructured Element objects, after chunking.
Each Entity node represents a recognized entity.
Each Chunk node can have HAS_ENTITY relationships with Entity nodes.
Each Entity node can have ENTITY_TYPE relationships with other Entity nodes.
Some related example Neo4j graph queries include the following.
Query for all available nodes and relationships:
Query for Entity to Entity relationships:
Query for Entity nodes containing the text "PERSON", and show their Entity relationships:
Query for Entity nodes containing the text "amendment", and show their Chunk relationships:
Query for Entity nodes containing the text "PERSON", and show their Entity to Entity to Chunk relationships:
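A comparable sketch for the entity queries described above follows. The Entity label and the HAS_ENTITY and ENTITY_TYPE relationship types come from the diagram description; the relationship directions, the id property assumed to hold an entity's text, and the $needle parameter name are assumptions.

```python
# Relationship directions and the `id` property on Entity nodes are assumptions.
ENTITY_QUERIES = {
    "entity_to_entity": (
        "MATCH (entity:Entity)-[r:ENTITY_TYPE]->(entity_type:Entity) "
        "RETURN entity, r, entity_type"
    ),
    # Pass the entity type text as $needle, for example "PERSON".
    "entities_of_type": (
        "MATCH (entity:Entity)-[r:ENTITY_TYPE]->(entity_type:Entity) "
        "WHERE entity_type.id CONTAINS $needle "
        "RETURN entity, r, entity_type"
    ),
    # Pass the entity text as $needle, for example "amendment".
    "chunks_for_entity": (
        "MATCH (chunk:Chunk)-[r:HAS_ENTITY]->(entity:Entity) "
        "WHERE entity.id CONTAINS $needle "
        "RETURN chunk, r, entity"
    ),
    # Walks from chunks to entities to the matching entity type.
    "chunks_for_entity_type": (
        "MATCH (chunk:Chunk)-[r1:HAS_ENTITY]->(entity:Entity)"
        "-[r2:ENTITY_TYPE]->(entity_type:Entity) "
        "WHERE entity_type.id CONTAINS $needle "
        "RETURN chunk, r1, entity, r2, entity_type"
    ),
}

if __name__ == "__main__":
    for name, query in ENTITY_QUERIES.items():
        print(f"{name}:\n  {query}")
```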
To create a Neo4j destination connector, see the following examples.
Replace the preceding placeholders as follows:
<name> (required) - A unique name for this connector.
<uri> (required) - The connection URI for the Neo4j deployment, which typically starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; is followed by the host name; and ends with a colon and the port number (such as :7473, :7474, or :7687).
<database> (required) - The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains a standard database named neo4j for user data.
<username> (required) - The name of the user who has access to the Neo4j deployment. A default Neo4j deployment typically contains a default user named neo4j.
<password> (required) - The password for the user.
<batch-size> - The maximum number of nodes or relationships to be transmitted per batch. The default is 100
if not otherwise specified.
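To make the placeholder rules above concrete, the following sketch gathers the values into a dictionary, rejects missing required values, and applies the default batch size of 100. The function name neo4j_destination_settings and the key names (including batch_size) are hypothetical; they mirror the placeholders rather than a confirmed request schema.

```python
from typing import Optional

# Default stated in the placeholder list above.
DEFAULT_BATCH_SIZE = 100


def neo4j_destination_settings(
    name: str,
    uri: str,
    database: str,
    username: str,
    password: str,
    batch_size: Optional[int] = None,
) -> dict:
    """Collect connector settings; key names are illustrative assumptions."""
    required = {
        "name": name,
        "uri": uri,
        "database": database,
        "username": username,
        "password": password,
    }
    missing = [key for key, value in required.items() if not value]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    settings = dict(required)
    # Fall back to the documented default when no batch size is given.
    settings["batch_size"] = DEFAULT_BATCH_SIZE if batch_size is None else batch_size
    return settings


if __name__ == "__main__":
    print(neo4j_destination_settings(
        name="my-neo4j-destination",
        uri="bolt://localhost:7687",
        database="neo4j",
        username="neo4j",
        password="<password>",
    ))
```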