jq is a lightweight and flexible command-line JSON processor. You can use jq on a local development machine to slice, filter, map, and transform the JSON data that Unstructured outputs in much the same ways that tools such as sed, awk, and grep let you work with text.
To get jq, see the Download jq page.
jq is not owned or supported by Unstructured. For questions about jq and feature requests for future versions of jq, see the Issues tab of the jq repository in GitHub.
The following command examples use jq with the spring-weather.html.json file in the example-docs directory within the Unstructured-IO/unstructured repository in GitHub.
Find the element with a type of Address, and print the value of the element’s text field.
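A minimal sketch of one way to write this filter, assuming the file's top level is a JSON array of element objects that each carry type, text, and metadata fields. The sample-elements.json file and its contents are hypothetical stand-ins so that the block runs on its own; substitute spring-weather.html.json to query the real file.

```shell
# Hypothetical sample data standing in for spring-weather.html.json.
cat > sample-elements.json <<'EOF'
[
  {"type": "Title", "text": "Example Forecast", "metadata": {}},
  {"type": "Address", "text": "123 Example Street, Springfield", "metadata": {}}
]
EOF

# Iterate over the top-level array, keep the element whose type is "Address",
# and print the value of its text field (as a JSON string, quotes included).
address_text=$(jq '.[] | select(.type == "Address") | .text' sample-elements.json)
echo "$address_text"
# Prints: "123 Example Street, Springfield"
```

Adding the -r option to jq would print the string without the surrounding quotes.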
The output is:
Find all elements with a type of Title, and print the text field of each found element as a string in a JSON array.
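One possible form of this query, sketched under the assumption that the file is a JSON array of element objects. The sample-elements.json file and its contents are hypothetical; replace the filename with spring-weather.html.json to run the filter against the real data.

```shell
# Hypothetical sample data standing in for spring-weather.html.json.
cat > sample-elements.json <<'EOF'
[
  {"type": "Title", "text": "First Title", "metadata": {}},
  {"type": "NarrativeText", "text": "Some body text.", "metadata": {}},
  {"type": "Title", "text": "Second Title", "metadata": {}}
]
EOF

# Wrapping the whole pipeline in [ ... ] collects every matching text value
# into a single JSON array. The -c option prints the array on one line.
titles=$(jq -c '[.[] | select(.type == "Title") | .text]' sample-elements.json)
echo "$titles"
# Prints: ["First Title","Second Title"]
```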
The output is:
Find all elements with a type of Title. Of these, find the ones that have a text field that contains the phrase Contact Us, and print the contents of each found element’s metadata.link_urls field.
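The two conditions can be chained as successive select filters. The following is a sketch only, using jq's contains builtin for the substring check (test with a regular expression would be an alternative). The sample-elements.json file, its contents, and the example.com URLs are hypothetical.

```shell
# Hypothetical sample data standing in for spring-weather.html.json.
cat > sample-elements.json <<'EOF'
[
  {"type": "Title", "text": "Contact Us",
   "metadata": {"link_urls": ["https://example.com/contact"]}},
  {"type": "Title", "text": "Forecast",
   "metadata": {"link_urls": ["https://example.com/forecast"]}},
  {"type": "NarrativeText", "text": "Contact Us by phone.", "metadata": {}}
]
EOF

# First keep Title elements, then keep those whose text contains "Contact Us",
# then print the link_urls array nested under metadata for each match.
contact_urls=$(jq -c '
  .[]
  | select(.type == "Title")
  | select(.text | contains("Contact Us"))
  | .metadata.link_urls
' sample-elements.json)
echo "$contact_urls"
# Prints: ["https://example.com/contact"]
```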
The output is:
Find all elements with a type of ListItem. Of these, find the ones that have a text field that contains the phrase Weather Safety. For each item in metadata.link_texts, print the item’s value as the key, followed by the matching item in metadata.link_urls as the value. Trim any leading and trailing whitespace from all values. Wrap the output in a JSON array.
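A sketch of how these steps might be combined, assuming link_texts and link_urls are parallel arrays of equal length. It pairs the two arrays with transpose and trims with a small helper named strip, which is a hypothetical name defined inside the filter, not a jq builtin. The sample-elements.json file and its contents are also hypothetical.

```shell
# Hypothetical sample data standing in for spring-weather.html.json.
cat > sample-elements.json <<'EOF'
[
  {"type": "ListItem", "text": "Weather Safety Tips Preparedness",
   "metadata": {"link_texts": [" Safety Tips ", "Preparedness "],
                "link_urls": ["https://example.com/safety ", "https://example.com/prepare"]}},
  {"type": "ListItem", "text": "Forecast Maps",
   "metadata": {"link_texts": ["Maps"], "link_urls": ["https://example.com/maps"]}}
]
EOF

link_map=$(jq -c '
  # Hypothetical helper: trim leading and trailing whitespace from a string.
  def strip: sub("^\\s+"; "") | sub("\\s+$"; "");
  [
    .[]
    | select(.type == "ListItem" and (.text | contains("Weather Safety")))
    # Pair each link text with the URL at the same index.
    | [.metadata.link_texts, .metadata.link_urls]
    | transpose
    | .[]
    # Emit one single-key object per pair: trimmed text as key, trimmed URL as value.
    | {(.[0] | strip): (.[1] | strip)}
  ]
' sample-elements.json)
echo "$link_map"
# Prints: [{"Safety Tips":"https://example.com/safety"},{"Preparedness":"https://example.com/prepare"}]
```

Two separate sub calls are used for trimming so that the sketch does not depend on newer builtins such as trim, which older jq releases lack.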
The output is: