Airbyte

Airbyte is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.

AirbyteLoader

This loader is built on top of PyAirbyte for easy setup and use.

Installation and Setup

pip install -U langchain-airbyte

Note: Currently, the airbyte library does not support Pydantic v2. Please downgrade to Pydantic v1 to use this package.

This package also currently requires Python 3.10+.

The integration package doesn't have any global environment variables that need to be set, but some integrations (e.g. source-github) may need credentials passed in.
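As a rough sketch of passing credentials in, the snippet below builds the keyword arguments for a source-github loader from an environment variable. The helper name github_loader_kwargs, the GITHUB_TOKEN variable, the "issues" stream, and the config keys (repositories, credentials.personal_access_token) are assumptions for illustration; check the connector's own spec for the exact schema.

```python
import os


def github_loader_kwargs(repository: str, stream: str = "issues") -> dict:
    """Build AirbyteLoader keyword arguments for source-github (hypothetical helper)."""
    # Assumption: the token lives in an environment variable named GITHUB_TOKEN.
    token = os.environ.get("GITHUB_TOKEN")
    if not token:
        raise RuntimeError("Set GITHUB_TOKEN before loading from source-github")
    return {
        "source": "source-github",
        "stream": stream,
        # Assumption: these config keys follow the source-github connector spec.
        "config": {
            "repositories": [repository],
            "credentials": {"personal_access_token": token},
        },
    }
```

The result could then be passed along as AirbyteLoader(**github_loader_kwargs("owner/repo")), which keeps the token out of source code.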

Document Loader

The AirbyteLoader class exposes a single document loader for Airbyte sources.

from langchain_airbyte import AirbyteLoader

loader = AirbyteLoader(
    source="source-faker",
    stream="users",
    config={"count": 100},
)
docs = loader.load()
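
Once the documents are loaded, a quick way to check what came back is to print a short summary of each one. The helper below is a minimal sketch, not part of langchain-airbyte: preview_documents and its limit and width parameters are hypothetical names, and it relies only on the page_content and metadata attributes that LangChain Document objects expose.

```python
from typing import Iterable, List


def preview_documents(docs: Iterable, limit: int = 3, width: int = 60) -> List[str]:
    """Return one short summary line per document (hypothetical helper)."""
    lines = []
    for index, doc in enumerate(docs):
        if index >= limit:
            break
        # Collapse whitespace so multi-line content fits on one line, then truncate.
        text = " ".join(doc.page_content.split())[:width]
        keys = ", ".join(sorted(doc.metadata))
        lines.append(f"[{index}] {text} (metadata keys: {keys})")
    return lines
```

Calling print("\n".join(preview_documents(docs))) would show the first few records without dumping the whole list.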

For more information, see the full AirbyteLoader docs.

AirbyteJSONLoader (Deprecated)

This loader is deprecated and should be replaced with AirbyteLoader, which doesn't require any of the Docker setup.

Installation and Setup

These instructions show how to load any Airbyte source into a local JSON file that can be read in as a document.

Prerequisites: Docker Desktop installed.

Steps:

  1. Clone Airbyte from GitHub - git clone https://github.com/airbytehq/airbyte.git.
  2. Switch into Airbyte directory - cd airbyte.
  3. Start Airbyte - docker compose up.
  4. In your browser, visit http://localhost:8000. You will be asked for a username and password; the defaults are username airbyte and password password.
  5. Set up any source you wish.
  6. Set the destination to Local JSON with a specified destination path, say /json_data. Set up a manual sync.
  7. Run the connection.
  8. To see what files are created, navigate to: file:///tmp/airbyte_local/.
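
The last step can also be done from Python. The sketch below lists the files under the destination directory; find_airbyte_json_files is a hypothetical helper, and the assumption that each stream is written as a .jsonl file should be checked against what actually appears on disk.

```python
from pathlib import Path
from typing import List


def find_airbyte_json_files(
    destination: str = "/json_data",
    local_root: str = "/tmp/airbyte_local",
) -> List[Path]:
    """List files written by the Local JSON destination (hypothetical helper)."""
    # The destination path is resolved relative to the local mount root.
    target = Path(local_root) / destination.lstrip("/")
    if not target.is_dir():
        return []
    # Assumption: each synced stream is written as a .jsonl file.
    return sorted(target.glob("*.jsonl"))
```

Each returned path could then be handed to AirbyteJSONLoader as its file path.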

Document Loader

See a usage example.

from langchain_community.document_loaders import AirbyteJSONLoader