From any source to the lake. In minutes.
ERP, eCommerce, CRM, API and file connectors on the same engine. Streaming and batch. Validation at entry. Retries with backoff. No manual pipelines.
What kind of sources?
Modern APIs
REST, GraphQL, gRPC. OAuth 2.0, JWT and API-key authentication. Rate limiting is handled automatically.
Legacy protocols
SOAP, RFC, EDI (EDIFACT, X12), COM. For ERPs that haven't been updated in decades.
Databases
PostgreSQL, MySQL, SQL Server, Oracle, DB2. Change Data Capture when available; scheduled polling otherwise.
Files
CSV, JSON, XML, Parquet. On FTP, SFTP, S3, Azure Blob or local folders. Incremental processing.
Webhooks
Secure, signed endpoint for receiving push events. PrestaShop, Shopify, HubSpot and Salesforce are all supported.
Streaming
Kafka, RabbitMQ, AWS Kinesis. Real-time ingestion for high volume with at-least-once guarantees.
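The Webhooks card above mentions a secure, signed endpoint. A minimal sketch of how such a signature check typically works, using HMAC-SHA256 over the raw request body; the secret value and the exact header format are illustrative assumptions, not Integrafy-OS specifics:

```python
import hmac
import hashlib

# Illustrative shared secret; in practice this would come from the
# connector's configuration, never a hard-coded value.
WEBHOOK_SECRET = b"example-secret"

def verify_signature(payload: bytes, signature_header: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare it
    against the signature the sender attached to the request."""
    expected = hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing attacks
    return hmac.compare_digest(expected, signature_header)

# Simulated push event from an eCommerce platform
body = b'{"event": "order.created", "id": 42}'
sig = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
print(verify_signature(body, sig))        # True: signature matches
print(verify_signature(body, "bad-sig"))  # False: request is rejected
```

Events that fail the check are dropped before they ever reach the pipeline, which is what makes a push endpoint safe to expose publicly.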
Visual pipeline: from event to Data Lake
Frequently asked questions about Ingestion
What source types can I ingest?
Any source that exposes a REST, GraphQL or SOAP API; databases (PostgreSQL, MySQL, SQL Server, Oracle, DB2); files (CSV, JSON, Parquet, XML, EDI) over FTP/SFTP/S3; or inbound webhooks. If your system exposes data in some form, Integrafy-OS can read it.
How are changing schemas handled?
Connectors support schema-on-read (raw ingestion, schema applied later) and schema-on-write (schema validated at entry). When a field changes at the source, the lake keeps previous versions with explicit lineage, and Data Hub offers assisted reconciliation.
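The difference between the two modes can be sketched in a few lines; the schema, field names and version tag below are illustrative assumptions, not the product's actual data model:

```python
SCHEMA = {"order_id": int, "total": float}  # illustrative entry schema

def ingest_schema_on_write(record: dict) -> dict:
    """Schema-on-write: validate types at entry and reject
    records that do not conform to the declared schema."""
    for field, ftype in SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"field {field!r} failed schema check")
    return record

def ingest_schema_on_read(record: dict, version: int) -> dict:
    """Schema-on-read: store the record raw, tagged with the schema
    version seen at the source, so later readers can reconcile
    old and new shapes with explicit lineage."""
    return {"raw": record, "schema_version": version}
```

Schema-on-write keeps the lake clean at the cost of rejecting surprises; schema-on-read accepts everything and defers the reconciliation work to read time.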
Streaming or batch?
Both on the same engine. Real-time events via webhooks/Kafka/web services; scheduled batch for heavy sources (daily files, weekly full loads). The decision is per connector, not per product.
What happens if a source goes down?
Integrafy-OS keeps an event buffer and retries with exponential backoff. When the source returns, the buffer is drained respecting order. Insight alerts notify the team if the delay exceeds configurable thresholds.
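The buffer-and-retry behavior described above can be sketched as follows; the function name, delay values and use of `ConnectionError` are assumptions for illustration, not the engine's internals:

```python
import time
from collections import deque

def send_with_backoff(buffer: deque, deliver, base_delay: float = 1.0,
                      max_delay: float = 60.0) -> None:
    """Drain buffered events in order, retrying with exponential
    backoff while the destination keeps failing."""
    delay = base_delay
    while buffer:
        event = buffer[0]          # peek, so order is preserved on failure
        try:
            deliver(event)
            buffer.popleft()       # confirmed: advance to the next event
            delay = base_delay     # reset the backoff after a success
        except ConnectionError:
            time.sleep(delay)
            delay = min(delay * 2, max_delay)  # 1s, 2s, 4s, ... capped
```

Peeking instead of popping before delivery is what guarantees that, when the source comes back, events leave the buffer in the order they arrived. The alert thresholds mentioned above would sit outside this loop, watching the buffer's age.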
Can I validate data before it reaches the lake?
Yes. Every pipeline supports validation rules (type, range, regex, references to other tables) and transformations (cleanup, enrichment, deduplication). Records that fail validation go to a dead letter queue for review.
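A minimal sketch of rule-based validation routing failures to a dead letter queue; the rule set, field names and queue representation are illustrative assumptions:

```python
import re

DEAD_LETTER_QUEUE = []  # failed records land here for review

# Example rules: a type check, a range check and a regex check
RULES = [
    ("type",  lambda r: isinstance(r.get("qty"), int)),
    ("range", lambda r: r.get("qty", 0) > 0),
    ("regex", lambda r: re.fullmatch(r"[A-Z]{3}-\d+", r.get("sku", "")) is not None),
]

def validate(record: dict) -> bool:
    """Run every rule in order; on the first failure, park the record
    in the dead letter queue tagged with the rule that rejected it."""
    for name, check in RULES:
        if not check(record):
            DEAD_LETTER_QUEUE.append({"record": record, "failed_rule": name})
            return False
    return True

print(validate({"qty": 3, "sku": "ABC-123"}))   # True: reaches the lake
print(validate({"qty": -1, "sku": "ABC-123"}))  # False: goes to the DLQ
```

Tagging each dead-lettered record with the failing rule is what makes the later review step fast: the queue explains itself.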
Which source do you still need to connect?
Free 30-minute diagnostic.