
Apache Atlas

OVERVIEW

Agile enterprise compliance through metadata

Atlas is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack, thereby enabling platform-agnostic governance controls that effectively address compliance requirements.

 

What Atlas Does


Apache Atlas provides scalable, metadata-driven governance for Enterprise Hadoop. At its core, Atlas is designed to model new business processes and data assets with agility. Its flexible type system allows exchange of metadata with other tools and processes within and outside of the Hadoop stack, thereby enabling platform-agnostic governance controls that effectively address compliance requirements.

Apache Atlas is developed around two guiding principles:

  • Metadata Truth in Hadoop: Atlas provides true visibility in Hadoop. By using native connectors to Hadoop components, Atlas provides technical and operational tracking enriched by business taxonomical metadata. Atlas lets any metadata consumer share a common metadata store, which makes metadata exchange easy and keeps many metadata producers interoperable.
  • Developed in the Open: Engineers from Aetna, Merck, SAS, Schlumberger, and Target are working together to help ensure Atlas is purposely built to solve real data governance problems across a wide range of industries that use Hadoop. This approach is an example of open source community innovation that helps accelerate product maturity and time-to-value for the data-first enterprise.

Apache Atlas empowers enterprises to effectively and efficiently address their compliance requirements through a scalable set of core governance services. These services include:

  • Data Lineage: Captures lineage across Hadoop components at the platform level
  • Agile Data Modeling: The type system allows custom metadata structures in a hierarchical taxonomy
  • REST API: Modern, flexible access to Atlas services, HDP components, UI and external tools
  • Metadata Exchange: Leverage existing metadata and models by importing them from current tools, and export metadata to downstream systems
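As a rough illustration of the REST API item above, the following sketch builds a search request URL. It assumes the basic search endpoint of the Atlas v2 REST API (`/api/atlas/v2/search/basic` with `typeName`, `classification`, `query` and `limit` parameters) and a hypothetical server address, `atlas.example.com:21000`; neither is stated on this page, so check them against the Atlas version you actually run.

```python
from urllib.parse import urlencode

# Hypothetical Atlas server address (assumption); replace with your own.
ATLAS_BASE_URL = "http://atlas.example.com:21000"


def basic_search_url(type_name=None, classification=None, query=None, limit=25):
    """Build a URL for the (assumed) Atlas v2 basic search endpoint."""
    params = {}
    if type_name:
        params["typeName"] = type_name
    if classification:
        params["classification"] = classification
    if query:
        params["query"] = query
    params["limit"] = limit
    return f"{ATLAS_BASE_URL}/api/atlas/v2/search/basic?{urlencode(params)}"


# Example: look for Hive tables that carry a PII classification.
url = basic_search_url(type_name="hive_table", classification="PII")
print(url)
```

The function only constructs the URL; sending it (with authentication) is left to whichever HTTP client your environment uses.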

 

 

 

How Atlas Works

Apache Atlas is designed to effectively exchange metadata within Hadoop and the broader data ecosystem. Atlas’s adaptive model reduces enterprise time to compliance by leveraging existing metadata and industry-specific taxonomy. With Atlas, data administrators and stewards also have the ability to define, annotate and automate the capture of relationships between data sets and underlying elements including source, target and derivation processes.
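The relationship between sources, targets and derivation processes can be pictured as a small graph. The sketch below is a toy, in-memory model written for illustration only; the class `LineageGraph`, its methods and the data set names are hypothetical and are not part of Atlas.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: each process records its source and target data sets."""

    def __init__(self):
        # Maps a target data set to the (process, sources) pairs that produced it.
        self._producers = defaultdict(list)

    def record(self, process, sources, target):
        """Record that `process` derived `target` from `sources`."""
        self._producers[target].append((process, tuple(sources)))

    def upstream(self, dataset):
        """Return every data set that `dataset` was derived from, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for _process, sources in self._producers.get(current, []):
                for source in sources:
                    if source not in seen:
                        seen.add(source)
                        stack.append(source)
        return seen


graph = LineageGraph()
graph.record("ingest_job", ["raw_clicks"], "clicks_clean")
graph.record("join_job", ["clicks_clean", "customers"], "clicks_by_customer")

print(sorted(graph.upstream("clicks_by_customer")))
# ['clicks_clean', 'customers', 'raw_clicks']
```

Walking the graph upstream answers "where did this data come from?"; the same structure walked in the other direction would answer "what is affected if this source changes?".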

Atlas also ensures downstream metadata consistency across the ecosystem by enabling enterprises to easily export metadata to third-party systems.

 

[Figure: Atlas architecture]

Technical Preview

Business Taxonomy (Catalog)

Big data democratizes access to information and makes it easier to share information across the enterprise. However, unplanned growth can result in 'data floods', with content that is not properly tagged or cataloged. Business taxonomies can provide the missing link to close this gap. From the Greek 'taxis', meaning 'order' or 'arrangement', taxonomies use a hierarchy of terms to classify and organize concepts or physical/logical objects, which makes them the ideal vehicle for capturing the structure of the enterprise's entire content domain.

Consistent classification and tagging across the enterprise using taxonomies supports system and platform interoperability, and helps generate value from structured and unstructured data sources by mapping them to a common shared vocabulary. This authoritative reference taxonomy improves both data confidence and time to insight.

Requirements for a Big Data Business Catalog

  • Purpose-Built Platform Solution: In order to make sense of big data and provide users with the ability to find the right information, enterprises need a data governance solution that is designed for Hadoop and operates at the platform level, so that it consistently classifies data across all the engines used by the organization to move and analyze data.
  • A purpose-built platform solution can serve as the single source of metadata truth in Hadoop by automatically tracking multi-user, multi-application activity across Hadoop components with native connectors, whereas data management solutions that operate at the application level require a one-off proprietary solution path that ends up proliferating data silos.
  • Faster Data Discovery: The business catalog enables data officers and stewards to search for data and metadata quickly and in a number of different ways to reduce time to value. This includes the ability to search by:
    • Asset Type: Search for a Hive table, Storm Topology or any connected component.
    • Tags: Search for all columns or tables that have a specific tag such as PII
    • Business Language: Aligned with compliance standards & policies

The combination of these search capabilities enables data stewards to build a model of their organization and how it conducts business. This includes the ability to model a business by combining both logical and physical data entities to develop a more complete understanding.
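The search dimensions listed above (asset type, tags, business language) can be combined as simple filters. The following is a minimal in-memory sketch, not the Atlas catalog itself; the `Asset` class, the `search` function, the taxonomy paths and the sample assets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    asset_type: str                          # e.g. "hive_table", "storm_topology"
    tags: set = field(default_factory=set)   # e.g. {"PII"}
    term: str = ""                           # business taxonomy path, e.g. "Finance/Customer"


def search(assets, asset_type=None, tag=None, term_prefix=None):
    """Return names of assets matching every filter that was supplied."""
    results = []
    for asset in assets:
        if asset_type is not None and asset.asset_type != asset_type:
            continue
        if tag is not None and tag not in asset.tags:
            continue
        if term_prefix is not None:
            # Match the term itself or any term below it in the hierarchy.
            in_subtree = asset.term == term_prefix or asset.term.startswith(term_prefix + "/")
            if not in_subtree:
                continue
        results.append(asset.name)
    return results


catalog = [
    Asset("customers", "hive_table", {"PII"}, "Finance/Customer"),
    Asset("orders", "hive_table", set(), "Finance/Sales"),
    Asset("click_stream", "storm_topology", {"PII"}, "Marketing/Web"),
]

print(search(catalog, tag="PII"))
# ['customers', 'click_stream']
print(search(catalog, asset_type="hive_table", term_prefix="Finance"))
# ['customers', 'orders']
```

Treating a business term as a path makes a search on a parent term such as "Finance" pick up everything classified beneath it, which is the practical benefit of a hierarchical taxonomy.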

What's New in HDP 2.6

Cloud

  • Shared enterprise services for governance

Component Coverage

  • Tag-based policy support for HDFS, Kafka and HBase
  • Knox SSO for Atlas UI

Ease of Use

  • API revamp
  • Simplified UI for basic search
  • Manual entity creation – support for HDFS, HBase, Kafka and custom entity types
  • Performance and scalability improvements
  • SmartSense metrics

Recent Progress with Atlas

The Atlas/Ranger integration represents a paradigm shift for big data governance and security. By integrating Atlas with Ranger, enterprises can now implement dynamic classification-based security policies in addition to role-based security. Ranger's centralized platform empowers data administrators to define security policy based on Atlas metadata tags or attributes and apply this policy in real time to the entire hierarchy of data assets, including databases, tables and columns.
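To make the idea of classification-based policy concrete, here is a toy model in which tags attached at any level of a database/table/column hierarchy determine access. It is a sketch under stated assumptions, not Ranger's actual policy engine or evaluation semantics: the asset names, roles, tag inheritance rule and deny-by-default behaviour are all hypothetical choices made for this example.

```python
# Hypothetical asset hierarchy: child -> parent (column -> table -> database).
PARENT = {
    "sales.customers.ssn": "sales.customers",
    "sales.customers.name": "sales.customers",
    "sales.customers": "sales",
    "sales.orders": "sales",
}

# Hypothetical metadata tags attached directly to assets.
TAGS = {
    "sales.customers.ssn": {"PII"},
    "sales": {"FINANCE"},
}

# Hypothetical tag-based policies: tag -> roles allowed to read tagged data.
POLICIES = {
    "PII": {"compliance_officer"},
    "FINANCE": {"analyst", "compliance_officer"},
}


def effective_tags(asset):
    """Collect tags on the asset and on all of its ancestors (assumed inheritance)."""
    tags = set()
    while asset is not None:
        tags |= TAGS.get(asset, set())
        asset = PARENT.get(asset)
    return tags


def is_allowed(role, asset):
    """Allow access only if the role is permitted by every effective tag.

    Assets with no tags at all are denied by default in this sketch.
    """
    tags = effective_tags(asset)
    if not tags:
        return False
    return all(role in POLICIES.get(tag, set()) for tag in tags)


print(is_allowed("analyst", "sales.orders"))                    # True
print(is_allowed("analyst", "sales.customers.ssn"))             # False
print(is_allowed("compliance_officer", "sales.customers.ssn"))  # True
```

The point of the design is that tagging one column as PII changes who can read it without anyone editing a per-table or per-role rule.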

The latest release of Apache Atlas has focused on delivering scalable metadata services to model any business process enhanced with industry-specific terminology, as well as the ability to import and export metadata from other systems and tools.

Apache Atlas Version Progress
Apache Atlas 0.7
  • Enterprise deployment
    • Performance enhancements
    • HA, DR and BC support
    • AD integration
  • Component lineage
    • Kafka/Storm
    • Sqoop
    • Falcon
  • Security
    • Support for Kerberos
    • Atlas/Ranger integration for dynamic tag-based security
  • User interface
    • Improved GUI
    • Business catalog (Technical Preview)
  • Governance-ready partner ecosystem
Apache Atlas 0.6
  • Built-in types for HDFS
  • Metadata tag management
  • Expanded support for Apache Hive
Apache Atlas 0.5
  • Scalable metadata service
    • Enterprise/Business unit level modeling with industry-specific vocabulary
    • Extend visibility into HDFS Path, Hive DB, table, columns
    • Flexible access to Atlas services
  • Hive integration leverages existing metadata
    • Leverage existing metadata with import / export capability
    • Capture SQL runtime metrics directly
  • UI driven Hive table lineage and domain-specific search
    • Support for keyword, faceted and free text searches

Governance Ready Certification


To address enterprise requirements for Hadoop application integration, Atlas strives to foster a vibrant ecosystem based on a centralized metadata store. The Governance Ready program aims to create a curated group of partners that contribute a rich set of data management features focusing on data preparation, integration, cleansing, tagging, ETL visualization and collaboration areas.

 

Certified partners will help define a set of standards to exchange metadata and contribute conforming data integration features to the metadata store. Customers can then subscribe to desired features with low switching costs and faster deployment time.
