Software Solutions for Data Lake Management

Zaloni Manages the Complete Big Data Pipeline

To obtain business value from big data and the powerful but ever-changing Hadoop ecosystem, a robust, enterprise-grade data lake management and governance platform is required. It must enable and automate key management and governance activities that span the entire data pipeline: ingestion, organization, enrichment, and engagement.

  • Ingest
  • Organize
  • Enrich
  • Engage

Ingest vast amounts and varieties of data from any source with ease

  • Single page to configure ingest, metadata and workflow
  • Support for streaming data
  • Automated cataloging of existing data
  • Complete visibility of data coming into the data lake
  • Repeatable and scalable process
  • Notifications in case of failures

Know what is in the data lake

  • Single place to view operational, technical and business metadata
  • Search, browse, and find the data you need for analytics, reducing your time to insight
  • Configure desensitization of PII (personally identifiable information) and change data capture
  • Set up data quality rules at both file and field levels
  • Integrated with the Hadoop ecosystem (HCatalog)
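
As an illustration of what data quality rules at the file and field levels can look like, here is a minimal Python sketch. It is not Zaloni's API: the rule names, the column set (id, email, amount), and the functions check_file and check_fields are all hypothetical, chosen only to show the distinction between a check on the file as a whole and a check on individual values.

```python
import csv
import io
import re

# Hypothetical file-level rules: expected header and a minimum row count.
EXPECTED_COLUMNS = ["id", "email", "amount"]
MIN_ROWS = 1


def is_amount(value):
    """Return True if the value parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical field-level rules: one predicate per column.
FIELD_RULES = {
    "id": lambda v: v.isdigit(),
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "amount": is_amount,
}


def check_file(text):
    """Apply file-level rules; return the parsed rows and a list of errors."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    errors = []
    if reader.fieldnames != EXPECTED_COLUMNS:
        errors.append("unexpected header")
    if len(rows) < MIN_ROWS:
        errors.append("too few rows")
    return rows, errors


def check_fields(rows):
    """Apply field-level rules; return (row number, field) for each failure."""
    failures = []
    for number, row in enumerate(rows, start=1):
        for field, rule in FIELD_RULES.items():
            value = row.get(field) or ""
            if not rule(value):
                failures.append((number, field))
    return failures
```

In this sketch a file passes the file-level check when its header matches and it has at least one row, while each failing value is reported separately by row number and field name.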

Orchestrate and automate data preparation

  • Drag-and-drop to orchestrate complex workflows
  • Drag-and-drop to create Spark transformations
  • Complete visibility of completed, queued, and running workflows
  • Built-in actions for watermarking, masking, and tokenization
  • Convert data formats
  • Notifications in case of failures
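
To make the difference between masking and tokenization concrete, here is a minimal Python sketch. It does not reflect how the platform's built-in actions are implemented: the function names mask and tokenize, the demo key, and the 16-character token length are assumptions for illustration. The tokenization shown is a keyed hash (HMAC-SHA256), which is deterministic but not reversible; vault-based tokenization, which can map a token back to the original value, works differently.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would load this from a secret store.
SECRET_KEY = b"demo-key-change-me"


def mask(value, visible=4, fill="*"):
    """Hide all but the last `visible` characters of a value."""
    visible = max(0, min(visible, len(value)))
    hidden = len(value) - visible
    return fill * hidden + value[hidden:]


def tokenize(value, length=16):
    """Replace a value with a deterministic keyed-hash token.

    The same input always yields the same token, so joins on the
    tokenized column still work, but the token does not reveal the input.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]
```

Masking keeps part of the value readable (for example, the last four characters of an identifier), while tokenization replaces the whole value with a stand-in that stays consistent across records.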

Democratize access to the data lake

  • Enterprise-wide data catalog via search & explore
  • Curation of metadata via popularity ratings and tags
  • Self-service interactive data preparation
  • Workspaces for collaboration
  • Saved smart searches
  • Management & monitoring of enrichments via Bedrock integration